ZAPF

Version 0.1, July 2009

Introduction

ZAPF, the Z-machine Assembler Program of the Future, is an assembler for the Z-machine interactive fiction platform. It provides nearly complete control over the Z-machine's memory layout and supports two assembly syntaxes: the default, which is similar to the original ZAP used by Infocom, and an alternative similar to Inform's assembler.

ZAPF is a managed application and has been tested under Microsoft .NET (on Windows) as well as Mono (on Linux).

To use ZAPF, you should be familiar with the Z-machine architecture and instruction set. Refer to the Z-Machine Standards Document if not.

Usage

The simplest way to assemble a file called “foo.zap” is with the command:

zapf foo.zap

Or, if using Mono:

mono zapf.exe foo.zap

This will use the default (Infocom) syntax and generate an output file named according to the Z-machine version, for example “foo.z3”.

More options are available: start ZAPF with no parameters for details. In particular, you can change the output filename by specifying a new name after the input filename, and you can select the Inform syntax by specifying the “-i” switch before the input filename. You can change the Z-machine version with the “-v” switch, but the .NEW directive is preferred (see below).

Syntax

A ZAPF input file consists of comments, labels, directives, and instructions. One instruction or directive is allowed per line. Comments and labels may appear on any line, even lines with no instruction or directive. Blank lines are ignored.

Note: directives, instructions, labels, and all other names in ZAPF are case-sensitive.

Comments

Comments are ignored by the assembler. A comment begins with a semicolon and continues until the end of the line:

; This is a comment all by itself

ADD X,Y >Z; This is a comment after an instruction

Labels

Labels associate a name with a location in the output file. A label consists of a word followed by one or two colons. A label may appear before an instruction or directive, or by itself, but only one label may appear on a line.

A label with one colon is “local” and can only be referenced within the same routine (see the .FUNCT directive below). The name can be reused in other routines.

A label with two colons is “global” and can be referenced from anywhere in the program, so its name must be unique within the whole program. On Z-machine versions 3 and 4, certain global labels have special meaning and must be defined somewhere in the program: see “Version Considerations” below.
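
As a rough sketch of the difference, in the default syntax: the routine names IS-ZERO and IS-ONE and the label YES are made up for illustration. The same local label name is defined once in each routine without conflict:

```zap
        .FUNCT IS-ZERO,X
        ZERO? X /YES            ; branches to the YES defined in IS-ZERO
        RFALSE
YES:    RTRUE

        .FUNCT IS-ONE,X
        EQUAL? X,1 /YES         ; a different YES, local to IS-ONE
        RFALSE
YES:    RTRUE
```

Had YES been written with two colons, it would be global, and the second definition would clash with the first.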

Directives

Directives are special commands to the assembler. Some directives cause data to be written to the output file; others merely affect how other parts of the file are interpreted. Directive names must always be given in uppercase.

Some directives take one or more expressions as parameters. Such an expression can be a number (a positive or negative decimal integer), a global symbol (the name of a global label, object, constant, etc.), or the sum of two or more numbers or constant names connected by “+” signs.

Some directives take a string as a parameter. Strings are delimited by quotation marks and may contain line breaks. If a string contains a quotation mark, the quotation mark must be doubled.
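
For instance, a string containing quotation marks might be written as follows (the text itself is only an illustration):

```zap
        .STR "She said ""hello"" and left."     ; text: She said "hello" and left.
```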

Some directives take one or more names as parameters. Names must be words containing only A-Z (uppercase or lowercase), digits 0-9, and specific punctuation: hyphen (-), dollar sign ($), hash mark (#), ampersand (&), or period (.). In the default syntax mode, question mark is also allowed; in Inform syntax mode, question mark is forbidden but underscore is allowed instead.

= (equal sign)

<name>=<expression>

Defines the specified name as a global constant whose value is given by the expression. The name may then be used later in the file in place of the expression.
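
A small sketch of constants used in expressions. The names TAKEBIT, LIGHTBIT, and LAMP-FLAGS and their values are hypothetical. Note that the hyphen in LAMP-FLAGS is part of the name, not a subtraction:

```zap
TAKEBIT=2
LIGHTBIT=8
LAMP-FLAGS=TAKEBIT+LIGHTBIT     ; 2 + 8 = 10

        .BYTE LAMP-FLAGS        ; writes the byte 10
        .BYTE TAKEBIT+1         ; writes the byte 3
```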

.BYTE

.BYTE <expression> [,<expression>,…]

Writes one or more data bytes to the output file.

If a variable name is given as one of the expressions, the variable's number will be written, not its value.

.END

.END

Marks the end of the program.

.ENDI

.ENDI

Marks the end of an inserted file.

.ENDT

.ENDT

Marks the end of a table. If an expected size was supplied in the matching .TABLE directive, and the actual size of the table doesn't match, ZAPF will print a warning message.

.FSTR

.FSTR <name>,"string"

Writes an encoded string to the output file, and defines the specified name as a global symbol pointing to it (a word address, suitable for use in the WORDS table). If necessary, a zero byte will be written first to ensure that the string starts at an even address.

The string is also entered into the internal abbreviation table and automatically used to abbreviate game text. All abbreviations must be defined before any code or data that contains strings.

Note: this directive should not be used inside the WORDS table.
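
One plausible arrangement, with hypothetical names ABBR-THE and ABBR-YOU: the abbreviation strings are defined first, outside the WORDS table, and the table then lists their word addresses. A real table may hold up to 96 entries.

```zap
; Abbreviation strings, defined before any other text in the program
        .FSTR ABBR-THE,"the "
        .FSTR ABBR-YOU,"You "

; The abbreviation table holds the word address of each string
WORDS::
        .WORD ABBR-THE
        .WORD ABBR-YOU
```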

.FUNCT

.FUNCT <routine name> [,<local name> [=<expression>],...]

Writes a routine header to the output file, and defines the specified name as a global symbol pointing to it (a packed address, suitable for use with a CALL instruction). If necessary, one or more zero bytes will be written first to ensure that the routine starts at an address divisible by 2, 4, or 8 (depending on the Z-machine version).

This directive also clears any local symbols that were defined previously. If any additional names are specified after the routine name, they will be defined as local variables. On Z-machine versions 3 and 4, expressions may also be given to define the initial values for the local variables; on later versions, local variables are always initialized to zero, and ZAPF will print a warning if any default values are given.
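
A minimal sketch for version 3 or 4, where default values are allowed. The routine ADD-BONUS and its locals TOTAL and BONUS are invented for illustration:

```zap
; BONUS starts at 5 unless the caller supplies a second argument
        .FUNCT ADD-BONUS,TOTAL,BONUS=5
        ADD TOTAL,BONUS >TOTAL
        RETURN TOTAL
```

Called with a single argument of 10, this routine would return 15. On version 5 and later, the “=5” would draw a warning and BONUS would start at zero.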

.GSTR

.GSTR <name>,"string"

Writes an encoded string to the output file, and defines the specified name as a global symbol pointing to it (a packed address, suitable for use with a PRINT instruction). If necessary, one or more zero bytes will be written first to ensure that the string starts at an address divisible by 2, 4, or 8 (depending on the Z-machine version).
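
As an illustration, with STR-WELCOME and GREET as hypothetical names, a string defined with .GSTR can be printed by passing its name to PRINT:

```zap
        .GSTR STR-WELCOME,"Welcome to the game."

        .FUNCT GREET
        PRINT STR-WELCOME       ; PRINT takes a packed string address
        CRLF
        RTRUE
```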

.GVAR

.GVAR <name> [=<expression>]

Defines the specified name as a global symbol pointing to the next unused global variable slot, and writes the variable's initial value to the output file. If an expression is given, it will be used as the initial value; otherwise the initial value will be zero. This directive should be used in the GLOBAL table.
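
A sketch of a small GLOBAL table; the variable names and the value 200 are made up:

```zap
GLOBAL::
        .GVAR SCORE             ; initial value 0
        .GVAR LAMP-TURNS=200    ; initial value 200
```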

.INSERT

.INSERT "filename"

Assembles the specified file in place of this directive, then resumes at the next line of the current file. The inserted file should end with a .ENDI directive.

If a file with this exact name is not found, ZAPF will try adding a “.zap” or “.xzap” extension before finally giving up.
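
A sketch of how two files might fit together. The file names main.zap and data.zap and the constant COUNT are assumptions for illustration:

```zap
; --- main.zap ---
        .NEW 3
        .INSERT "data"          ; no file named "data", so an extension is tried
        .END

; --- data.zap ---
COUNT=3
        .ENDI
```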

.LEN

.LEN "string"

Encodes a string (without writing it to the output file), then writes a byte to the output file indicating the number of words taken up by the encoded form of the string.

.NEW

.NEW <expression>

Sets the Z-machine version number. Acceptable values range from 3 to 8.

.OBJECT

Z-machine version 3:

.OBJECT <name>,<flags1>,<flags2>,<parent>,
<sibling>,<child>,<properties>

Z-machine versions 4 and up:

.OBJECT <name>,<flags1>,<flags2>,<flags3>,
<parent>,<sibling>,<child>,<properties>

Writes an object record to the output file, and defines the specified name as a global symbol pointing to the next unused object number. This directive should be used in the OBJECT table.

All parameters after the name are expressions whose values are written into the object record. Typically, flags1, flags2, and flags3 are constants or sums of constants; parent, sibling, and child are object names; and properties is a global label pointing to a property table defined elsewhere.
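
The following version 3 sketch links three hypothetical objects into a small tree. It assumes that an object name may be referenced before the line that defines it, and it uses 0 for “no object”, as in the Z-machine object table. The property table labels are assumed to be defined elsewhere.

```zap
; name, flags1, flags2, parent, sibling, child, properties
        .OBJECT ROOM,0,0,0,0,LAMP,ROOM-PROPS    ; first child is LAMP
        .OBJECT LAMP,0,0,ROOM,KEY,0,LAMP-PROPS  ; in ROOM, next sibling is KEY
        .OBJECT KEY,0,0,ROOM,0,0,KEY-PROPS      ; in ROOM, last sibling
```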

.PROP

.PROP <length>,<number>

Writes a property header to the output file. The parameters are expressions giving the length (in bytes) of the property data which follows and the property number, respectively. This directive should be used in property tables referenced by the .OBJECT directive.

Note: this directive does not begin or end the property table. The property table must begin with a length-prefixed string (see .STRL) and end with .BYTE 0.
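
Putting these pieces together, a property table might look like the sketch below. The label LAMP-PROPS, the property numbers, and the data values are hypothetical. Properties are listed in descending numerical order, as the Z-machine expects.

```zap
LAMP-PROPS::
        .STRL "brass lamp"      ; length-prefixed short name
        .PROP 2,5               ; property 5, followed by 2 bytes of data
        .WORD 100
        .PROP 1,3               ; property 3, followed by 1 byte of data
        .BYTE 7
        .BYTE 0                 ; end of the property table
```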

.STR

.STR "string"

Writes an encoded string to the output file.

.STRL

.STRL "string"

Writes an encoded string to the output file, prefixed by a byte indicating the number of words taken up by the encoded string. This is equivalent to .LEN followed by .STR for the same string.
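
In other words, the two fragments below should produce the same bytes (the string is only an example):

```zap
        .STRL "brass lamp"

; ...is equivalent to:
        .LEN "brass lamp"
        .STR "brass lamp"
```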

.TABLE

.TABLE [<expression>]

Begins a table definition, which must be ended later with .ENDT. The expression, if specified, indicates the length of the table in bytes; .ENDT will print a warning if the table size is incorrect.

Table definitions may not be nested.
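
A short sketch of a table with a declared size; the label SCRATCH is hypothetical:

```zap
SCRATCH:: .TABLE 8
        .WORD 0,0,0             ; 6 bytes
        .BYTE 0,0               ; 2 more bytes, 8 in total
        .ENDT                   ; size matches, so no warning is printed
```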

.VOCBEG

.VOCBEG <record length>,<key length>

Begins a block of sorted records, which must be ended later with .VOCEND. Record length and key length are expressions giving the length (in bytes) of each record, and of the sort key, which must appear at the beginning of each record.

Records within the block will be rearranged in increasing order of their sort keys, treating the key as a big-endian number. Within the block, labels may only appear at the beginning of a record: that is, at a multiple of record length bytes after .VOCBEG. The labels will be updated as the records are moved.

Typically this directive is used in the VOCAB table to sort dictionary words. In this case, record length should be the length of an entire dictionary entry, and key length should be the length (in bytes!) of a dictionary word for the selected Z-machine version (4 in version 3, or 6 in all later versions).

Sorted blocks may not be nested.
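
As a sketch for version 3, the block below defines two 7-byte records, each with a 4-byte dictionary word as its key. The labels and data bytes are made up, and the rest of the VOCAB table is omitted.

```zap
        .VOCBEG 7,4
W-TAKE:: .ZWORD "take"          ; 4-byte sort key
        .BYTE 1,0,0             ; 3 bytes of data
W-LAMP:: .ZWORD "lamp"
        .BYTE 2,0,0
        .VOCEND
```

Although “take” is written first, the “lamp” record sorts lower and ends up first in the output; W-LAMP and W-TAKE follow their records to the new positions.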

.VOCEND

.VOCEND

Ends a block of sorted records started with .VOCBEG.

.WORD

[.WORD] <expression> [,<expression>,…]

Writes one or more data words to the output file.

Note: the .WORD directive itself is optional. If one or more expressions separated by commas are written on a line, without a directive or instruction name in front, ZAPF will write them to the output file as data words.
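
For example, these two lines write the same three words:

```zap
        .WORD 1,2,3
        1,2,3
```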

.ZWORD

.ZWORD "string"

Writes an encoded string to the output file as a dictionary word. The string will be padded or truncated to contain the correct number of Z-characters for the Z-machine version (6 in version 3, or 9 in all later versions).
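
An illustrative sketch for version 3, where each lowercase letter takes one Z-character and dictionary words hold six:

```zap
        .ZWORD "lantern"        ; 7 letters: truncated to "lanter"
        .ZWORD "go"             ; 2 letters: padded out to 6 Z-characters
```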

Instructions

Two Syntaxes

The “-i” switch affects instructions in two ways. First, it changes the general syntax of operands, stores, and branches, as shown in the following table.

                     Default syntax       Inform syntax
Plain instruction    MOVE x,y             insert_obj x y
Store                ADD x,y >r           add x y -> r
Branch               EQUAL? x,y /label    je x y ?label
Negated branch       EQUAL? x,y \label    je x y ?~label
Branch to return     ZERO? x /TRUE        jz x ?rtrue

Second, it changes the opcode names from Infocom's original names to the names used in the Z-Machine Standards Document, as shown in the following table. Note that opcode names are case-sensitive in both modes. (Also note that CHECKU and PRINTU were not in Infocom's original design.)

Default name  Inform name      Default name  Inform name     Default name  Inform name
ADD           add              ICALL1        call_1n         PUTB          storeb
ASHIFT        art_shift        ICALL2        call_2n         PUTP          put_prop
ASSIGNED?     check_arg_count  IGRTR?        inc_chk         QUIT          quit
BAND          and              IN?           jin             RANDOM        random
BCOM          not              INC           inc             READ          aread / sread
BOR           or               INPUT         read_char       REMOVE        remove_obj
BTST          test             INTBL?        scan_table      RESTART       restart
BUFOUT        buffer_mode      IRESTORE      restore_undo    RESTORE       restore
CALL          call_vs          ISAVE         save_undo       RETURN        ret
CALL1         call_1s          IXCALL        call_vn2        RFALSE        rfalse
CALL2         call_2s          JUMP          jump            RSTACK        ret_popped
CATCH         catch            LESS?         jl              RTRUE         rtrue
CHECKU        check_unicode    LEX           tokenise        SAVE          save
CLEAR         erase_window     LOC           get_parent      SCREEN        set_window
COLOR         set_colour       MARGIN        set_margins     SCROLL        scroll_window
COPYT         copy_table       MENU          make_menu       SET           store
CRLF          new_line         MOD           mod             SHIFT         log_shift
CURGET        get_cursor       MOUSE-INFO    read_mouse      SOUND         sound_effect
CURSET        set_cursor       MOUSE-LIMIT   mouse_window    SPLIT         split_window
DCLEAR        erase_picture    MOVE          insert_obj      SUB           sub
DEC           dec              MUL           mul             THROW         throw
DIRIN         input_stream     NEXT?         get_sibling     USL           show_status
DIROUT        output_stream    NEXTP         get_next_prop   VALUE         load
DISPLAY       draw_picture     NOOP          nop             VERIFY        verify
DIV           div              ORIGINAL?     piracy          WINATTR       window_style
DLESS?        dec_chk          PICINF        picture_data    WINGET        get_wind_prop
EQUAL?        je               PICSET        picture_table   WINPOS        move_window
ERASE         erase_line       POP           pull            WINPUT        put_wind_prop
FCLEAR        clear_attr       PRINT         print_paddr     WINSIZE       window_size
FIRST?        get_child        PRINTB        print_addr      XCALL         call_vs2
FONT          set_font         PRINTC        print_char      XPUSH         push_stack
FSET          set_attr         PRINTD        print_obj       ZERO?         jz
FSET?         test_attr        PRINTF        print_form      ZWSTR         encode_text
FSTACK        pop / pop_stack  PRINTI        print
GET           loadw            PRINTN        print_num
GETB          loadb            PRINTR        print_ret
GETP          get_prop         PRINTT        print_table
GETPT         get_prop_addr    PRINTU        print_unicode
GRTR?         jg               PTSIZE        get_prop_len
HLIGHT        set_text_style   PUSH          push
ICALL         call_vn          PUT           storew

Indirect Variable Operands

Some opcodes (SET, VALUE, INC, DEC, IGRTR?, DLESS?) take the number of a variable as their first parameter. However, a variable name is not treated specially in this position. This instruction stores 10 into the variable whose number is in the variable “X”:

SET X,10

To store 10 into “X” itself, prefix the variable name with an apostrophe:

SET 'X,10

Even in Inform mode, the apostrophe is still necessary (unlike in Inform's assembler):

store 'x 10

Default Store Target

If the target of a store instruction is omitted, the result will be stored to the stack by default.
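
A brief sketch of both forms; the routine names SUM and SUM-EXPLICIT and their locals are hypothetical:

```zap
        .FUNCT SUM,A,B
        ADD A,B                 ; no target: the sum is pushed onto the stack
        RSTACK                  ; pop it and return it

        .FUNCT SUM-EXPLICIT,A,B,R
        ADD A,B >R              ; explicit target: the sum goes into R
        RETURN R
```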

Version Considerations

Header

Version 3 and 4

In these versions, ZAPF automatically assembles the game header. Therefore, certain global labels must be defined:

ENDLOD

Marks the end of low memory and the beginning of high memory. Some interpreters might conserve RAM by leaving high memory on the disk, so frequently used constant data should be (and all mutable data must be) located before this label.

IMPURE

Marks the end of “impure” (dynamic) memory and the beginning of “pure” (static) memory. This must be defined before ENDLOD.

START

Marks the instruction where the program begins.

VOCAB

Marks the beginning of the dictionary (vocabulary) table. See the Z-Machine Standards Document for the format of this table.

OBJECT

Marks the beginning of the object table. See the Z-Machine Standards Document for the format of this table, and see the .OBJECT directive above. This must be defined before ENDLOD.

GLOBAL

Marks the beginning of the global variable table, which consists of up to 240 words corresponding to the Z-machine's global variables. See the .GVAR directive above. This must be defined before ENDLOD.

WORDS

Marks the beginning of the abbreviation table, which consists of up to 96 word addresses (byte addresses divided by 2) pointing to abbreviation strings. See the .FSTR directive above.

Optionally, the constant RELEASEID may be defined to set the release number of the output file. If it is omitted, the release number will be 0.
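
The skeleton below shows one ordering of these labels that satisfies the constraints above. It is an assumption-laden outline rather than a complete game: the tables are left empty, the routine name GO is made up, and the placement of OBJECT and GLOBAL before IMPURE reflects the Z-machine's requirement that they live in dynamic memory.

```zap
        .NEW 3
RELEASEID=1

; Dynamic ("impure") memory
OBJECT::                        ; object table (.OBJECT records) goes here
GLOBAL::                        ; global variables (.GVAR) go here
IMPURE::

; Static memory that is still kept loaded
VOCAB::                         ; dictionary goes here
WORDS::                         ; abbreviation table goes here
ENDLOD::

; High memory
        .FUNCT GO
START:: QUIT                    ; execution begins at this instruction
        .END
```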

Version 5 and up

In these versions, ZAPF does not automatically create a game header. The input file must start with data directives to assemble one; refer to the Z-Machine Standards Document for the format of the header. ZAPF will, however, fill in the length, checksum, and creator ID (a.k.a. “Inform version”) fields.

ZAPF History

0.1 – July 2, 2009