ZAPF
Version 0.1, July 2009
ZAPF, the Z-machine Assembler Program of the Future, is an assembler for the Z-machine interactive fiction platform. It provides nearly complete control over the Z-machine's memory layout and supports two assembly syntaxes: the default, which resembles the original ZAP used by Infocom, and an optional one that resembles Inform's assembler.
ZAPF is a managed application and has been tested under Microsoft .NET (on Windows) as well as Mono (on Linux).
To use ZAPF, you should be familiar with the Z-machine architecture and instruction set. Refer to the Z-Machine Standards Document if not.
The simplest way to assemble a file called “foo.zap” is with the command:
zapf foo.zap
Or, if using Mono:
mono zapf.exe foo.zap
This will use the default (Infocom) syntax and generate an output file named according to the Z-machine version, for example “foo.z3”.
More options are available: start ZAPF with no parameters for details. In particular, you can change the output filename by specifying a new name after the input filename, and you can select the Inform syntax by specifying the “-i” switch before the input filename. You can change the Z-machine version with the “-v” switch, but the .NEW directive is preferred (see below).
A ZAPF input file consists of comments, labels, directives, and instructions. One instruction or directive is allowed per line. Comments and labels may appear on any line, even lines with no instruction or directive. Blank lines are ignored.
Note: directives, instructions, labels, and all other names in ZAPF are case-sensitive.
Comments are ignored by the assembler. A comment begins with a semicolon and continues until the end of the line:
; This is a comment all by itself
ADD X,Y >Z; This is a comment after an instruction
Labels associate a name with a location in the output file. A label consists of a word followed by one or two colons. A label may appear before an instruction or directive, or by itself, but only one label may appear on a line.
A label with one colon is “local” and can only be referenced within the same routine (see the .FUNCT directive below). The name can be reused in other routines.
A label with two colons is “global” and can be referenced from anywhere else, so the name must be unique within the whole program. On Z-machine versions 3 and 4, certain global labels have special meaning and must be defined somewhere in the program: see “Version Considerations” below.
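As an illustration of the two kinds of label, here is a minimal sketch in the default syntax. The routine names COUNT-DOWN and COUNT-UP, the local label LOOP, and the global label SCORES are hypothetical, and the indentation is only a layout choice.

```zap
        .FUNCT COUNT-DOWN,N
LOOP:   PRINTN N                ; local label, visible only inside COUNT-DOWN
        CRLF
        DLESS? 'N,1 \LOOP       ; decrement N; loop again unless N is now below 1
        RTRUE

        .FUNCT COUNT-UP,N
LOOP:   PRINTN N                ; a different LOOP: local names may be reused
        CRLF
        IGRTR? 'N,9 \LOOP       ; increment N; loop again unless N is now above 9
        RTRUE

SCORES::                        ; global label on a line by itself
        .WORD 10,20,30
```

Because SCORES has two colons, no other label, object, or constant in the program may use that name.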
Directives are special commands to the assembler. Some directives cause data to be written to the output file; others merely affect how other parts of the file are interpreted. Directive names must always be given in uppercase.
Some directives take one or more expressions as parameters. Such an expression can be either a number (a positive or negative decimal integer), a global symbol (the name of a global label, object, constant, etc.), or the sum of two or more numbers or constant names connected by “+” signs.
Some directives take a string as a parameter. Strings are delimited by quotation marks and may contain line breaks. If a string contains a quotation mark, the quotation mark must be doubled.
Some directives take one or more names as parameters. Names must be words containing only A-Z (uppercase or lowercase), digits 0-9, and specific punctuation: hyphen (-), dollar sign ($), hash mark (#), ampersand (&), or period (.). In the default syntax mode, question mark is also allowed; in Inform syntax mode, question mark is forbidden but underscore is allowed instead.
<name>=<expression>
Defines the specified name as a global constant whose value is given by the expression. The name may then be used later in the file in place of the expression.
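The fragment below is a small sketch of constants, expressions, and string parameters working together. The constant names OPENBIT, LOCKBIT, and DOORFLAGS are hypothetical.

```zap
; Hypothetical flag values, defined as global constants.
OPENBIT=4
LOCKBIT=8
DOORFLAGS=OPENBIT+LOCKBIT       ; an expression may be a sum of constant names

        .WORD DOORFLAGS,-1,10+5 ; a constant, a negative number, and a sum of numbers
        .STR "The sign says ""KEEP OUT""."  ; quotation marks inside a string are doubled
```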
.BYTE <expression> [,<expression>,…]
Writes one or more data bytes to the output file.
If a variable name is given as one of the expressions, the variable's number will be written, not its value.
.END
Marks the end of the program.
.ENDI
Marks the end of an inserted file.
.ENDT
Marks the end of a table. If an expected size was supplied in the matching .TABLE directive, and the actual size of the table doesn't match, ZAPF will print a warning message.
.FSTR <name>,"string"
Writes an encoded string to the output file, and defines the specified name as a global symbol pointing to it (a word address, suitable for use in the WORDS table). If necessary, a zero byte will be written first to ensure that the string starts at an even address.
The string is also entered into the internal abbreviation table and automatically used to abbreviate game text. All abbreviations must be defined before any code or data that contains strings.
Note: this directive should not be used inside the WORDS table.
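A sketch of how abbreviations and the WORDS table might fit together, following the rules above. The names FSTR?1 and FSTR?2 are hypothetical; the question mark is allowed in names only in the default syntax.

```zap
; Abbreviation strings come first, before any code or data that contains strings.
        .FSTR FSTR?1,"the "
        .FSTR FSTR?2,"You "

; The WORDS table then lists their word addresses; .FSTR itself stays outside it.
WORDS::
        .WORD FSTR?1
        .WORD FSTR?2
```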
.FUNCT <routine name> [,<local name> [=<expression>],...]
Writes a routine header to the output file, and defines the specified name as a global symbol pointing to it (a packed address, suitable for use with a CALL instruction). If necessary, one or more zero bytes will be written first to ensure that the routine starts at an address divisible by 2, 4, or 8 (depending on the Z-machine version).
This directive also clears any local symbols that were defined previously. If any additional names are specified after the routine name, they will be defined as local variables. On Z-machine versions 3 and 4, expressions may also be given to define the initial values for the local variables; on later versions, local variables are always initialized to zero, and ZAPF will print a warning if any default values are given.
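For example, a routine with two local variables, one of which has an initial value, could be sketched as follows. The names ADD-BONUS, SCORE, and BONUS are hypothetical, and the fragment assumes Z-machine version 3 or 4.

```zap
; Assumes Z-machine version 3 or 4, where locals may have initial values.
        .FUNCT ADD-BONUS,SCORE,BONUS=5
        ADD SCORE,BONUS >SCORE  ; SCORE = SCORE + BONUS
        RETURN SCORE
```

Under the Z-machine's calling convention, a call that passes only one argument sets SCORE and leaves BONUS at its initial value of 5.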
.GSTR <name>,"string"
Writes an encoded string to the output file, and defines the specified name as a global symbol pointing to it (a packed address, suitable for use with a PRINT instruction). If necessary, one or more zero bytes will be written first to ensure that the string starts at an address divisible by 2, 4, or 8 (depending on the Z-machine version).
.GVAR <name> [=<expression>]
Defines the specified name as a global symbol pointing to the next unused global variable slot, and writes the variable's initial value to the output file. If an expression is given, it will be used as the initial value; otherwise the initial value will be zero. This directive should be used in the GLOBAL table.
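A minimal sketch of a GLOBAL table built with this directive; the variable names MOVES and LIVES are hypothetical.

```zap
GLOBAL::
        .GVAR MOVES             ; no expression given: initial value is 0
        .GVAR LIVES=3           ; initial value is 3
```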
.INSERT "filename"
Assembles the specified file in place of this directive, then resumes at the next line of the current file. The inserted file should end with a .ENDI directive.
If a file with this exact name is not found, ZAPF will try adding a “.zap” or “.xzap” extension before finally giving up.
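The sketch below shows two files in one listing to illustrate the pairing of .INSERT and .ENDI. The file names main.zap and strings.zap and the symbol GREETING are hypothetical.

```zap
; main.zap (hypothetical file name)
        .INSERT "strings"       ; no file called "strings" exists, so ZAPF tries adding an extension
        ; assembly resumes here after the inserted file's .ENDI

; strings.zap (hypothetical inserted file)
        .GSTR GREETING,"Hello, sailor!"
        .ENDI
```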
.LEN "string"
Encodes a string (without writing it to the output file), then writes a byte to the output file indicating the number of words taken up by the encoded form of the string.
.NEW <expression>
Sets the Z-machine version number. Acceptable values range from 3 to 8.
Z-machine version 3:
.OBJECT <name>,<flags1>,<flags2>,<parent>,
<sibling>,<child>,<properties>
Z-machine versions 4 and up:
.OBJECT <name>,<flags1>,<flags2>,<flags3>,
<parent>,<sibling>,<child>,<properties>
Writes an object record to the output file, and defines the specified name as a global symbol pointing to the next unused object number. This directive should be used in the OBJECT table.
All parameters after the name are expressions whose values are written into the object record. Typically, flags1, flags2, and flags3 are constants or sums of constants; parent, sibling, and child are object names; and properties is a global label pointing to a property table defined elsewhere.
.PROP <length>,<number>
Writes a property header to the output file. The parameters are expressions giving the length (in bytes) of the property data which follows and the property number, respectively. This directive should be used in property tables referenced by the .OBJECT directive.
Note: this directive does not begin or end the property table. The property table must begin with a length-prefixed string (see .STRL) and end with .BYTE 0.
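To show how .OBJECT, .STRL, .PROP, and the closing .BYTE 0 relate, here is a sketch of one object and its property table. The names LAMP and LAMP-PROPS, the property numbers, and the data values are hypothetical. The fragment assumes Z-machine version 3 and assumes that ZAPF accepts a reference to a label defined later in the file.

```zap
; Assumes Z-machine version 3 (six expressions after the name).
        .OBJECT LAMP,0,0,0,0,0,LAMP-PROPS   ; no flags, parent, sibling, or child

; Elsewhere in the file: the property table that LAMP points to.
LAMP-PROPS::
        .STRL "brass lamp"      ; the length-prefixed string that begins the table
        .PROP 2,5               ; property 5, followed by 2 bytes of data
        .WORD 100
        .PROP 1,3               ; property 3, followed by 1 byte of data
        .BYTE 7
        .BYTE 0                 ; ends the property table
```

The properties are listed in descending numerical order, as the Z-Machine Standards Document requires.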
.STR "string"
Writes an encoded string to the output file.
.STRL "string"
Writes an encoded string to the output file, prefixed by a byte indicating the number of words taken up by the encoded string. This is equivalent to .LEN followed by .STR for the same string.
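The equivalence can be written out as a short sketch; the string itself is only an example.

```zap
        .STRL "brass lamp"

; The same output, written with two directives:
        .LEN "brass lamp"
        .STR "brass lamp"
```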
.TABLE [<expression>]
Begins a table definition, which must be ended later with .ENDT. The expression, if specified, indicates the length of the table in bytes; .ENDT will print a warning if the table size is incorrect.
Table definitions may not be nested.
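A minimal sketch of a table with a declared size; the label PRIMES is hypothetical.

```zap
PRIMES::
        .TABLE 8                ; expect 8 bytes of data
        .WORD 2,3,5,7           ; four words = 8 bytes, so .ENDT has nothing to warn about
        .ENDT
```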
.VOCBEG <record length>,<key length>
Begins a block of sorted records, which must be ended later with .VOCEND. Record length and key length are expressions giving the length (in bytes) of each record, and of the sort key, which must appear at the beginning of each record.
Records within the block will be rearranged in increasing order of their sort keys, treating the key as a big-endian number. Within the block, labels may only appear at the beginning of a record: that is, at a multiple of record length bytes after .VOCBEG. The labels will be updated as the records are moved.
Typically this directive is used in the VOCAB table to sort dictionary words. In this case, record length should be the length of an entire dictionary entry, and key length should be the length (in bytes!) of a dictionary word for the selected Z-machine version (4 in version 3, or 6 in all later versions).
Sorted blocks may not be nested.
.VOCEND
Ends a block of sorted records started with .VOCBEG.
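The sketch below shows a sorted block holding two dictionary entries. It assumes Z-machine version 3, so each entry is a 4-byte dictionary word followed by 3 data bytes. The labels W?TAKE and W?DROP and the data byte values are hypothetical, and the dictionary header that would precede the block is omitted.

```zap
; Assumes version 3: 4-byte dictionary words plus 3 data bytes = 7-byte records.
        .VOCBEG 7,4
W?TAKE:: .ZWORD "take"          ; a label may appear only at the start of a record
        .BYTE 1,0,0
W?DROP:: .ZWORD "drop"
        .BYTE 2,0,0
        .VOCEND
```

Although "take" is written first, the encoded key for "drop" is numerically smaller, so the assembler should place that record first and move both labels along with their records.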
[.WORD] <expression> [,<expression>,…]
Writes one or more data words to the output file.
Note: the .WORD directive itself is optional. If one or more expressions separated by commas are written on a line, without a directive or instruction name in front, ZAPF will write them to the output file as data words.
.ZWORD "string"
Writes an encoded string to the output file as a dictionary word. The string will be padded or truncated to contain the correct number of Z-characters for the Z-machine version (6 in version 3, or 9 in all later versions).
The “-i” switch affects instructions in two ways. First, it changes the general syntax of operands, stores, and branches, as shown in the following table.
| Default syntax | Inform syntax |
Plain instruction | MOVE x,y | insert_obj x y |
Store | ADD x,y >r | add x y -> r |
Branch | EQUAL? x,y /label | je x y ?label |
Negated branch | EQUAL? x,y \label | je x y ?~label |
Branch to return | ZERO? x /TRUE | jz x ?rtrue |
Second, it changes the opcode names from Infocom's original names to the names used in the Z-Machine Standards Document, as shown in the following table. Note that opcode names are case-sensitive in both modes. (Also note that CHECKU and PRINTU were not in Infocom's original design.)
Default name | Inform name | Default name | Inform name | Default name | Inform name |
ADD | add | ICALL1 | call_1n | PUTB | storeb |
ASHIFT | art_shift | ICALL2 | call_2n | PUTP | put_prop |
ASSIGNED? | check_arg_count | IGRTR? | inc_chk | QUIT | quit |
BAND | and | IN? | jin | RANDOM | random |
BCOM | not | INC | inc | READ | aread / sread |
BOR | or | INPUT | read_char | REMOVE | remove_obj |
BTST | test | INTBL? | scan_table | RESTART | restart |
BUFOUT | buffer_mode | IRESTORE | restore_undo | RESTORE | restore |
CALL | call_vs | ISAVE | save_undo | RETURN | ret |
CALL1 | call_1s | IXCALL | call_vn2 | RFALSE | rfalse |
CALL2 | call_2s | JUMP | jump | RSTACK | ret_popped |
CATCH | catch | LESS? | jl | RTRUE | rtrue |
CHECKU | check_unicode | LEX | tokenise | SAVE | save |
CLEAR | erase_window | LOC | get_parent | SCREEN | set_window |
COLOR | set_colour | MARGIN | set_margins | SCROLL | scroll_window |
COPYT | copy_table | MENU | make_menu | SET | store |
CRLF | new_line | MOD | mod | SHIFT | log_shift |
CURGET | get_cursor | MOUSE-INFO | read_mouse | SOUND | sound_effect |
CURSET | set_cursor | MOUSE-LIMIT | mouse_window | SPLIT | split_window |
DCLEAR | erase_picture | MOVE | insert_obj | SUB | sub |
DEC | dec | MUL | mul | THROW | throw |
DIRIN | input_stream | NEXT? | get_sibling | USL | show_status |
DIROUT | output_stream | NEXTP | get_next_prop | VALUE | load |
DISPLAY | draw_picture | NOOP | nop | VERIFY | verify |
DIV | div | ORIGINAL? | piracy | WINATTR | window_style |
DLESS? | dec_chk | PICINF | picture_data | WINGET | get_wind_prop |
EQUAL? | je | PICSET | picture_table | WINPOS | move_window |
ERASE | erase_line | POP | pull | WINPUT | put_wind_prop |
FCLEAR | clear_attr | PRINT | print_paddr | WINSIZE | window_size |
FIRST? | get_child | PRINTB | print_addr | XCALL | call_vs2 |
FONT | set_font | PRINTC | print_char | XPUSH | push_stack |
FSET | set_attr | PRINTD | print_obj | ZERO? | jz |
FSET? | test_attr | PRINTF | print_form | ZWSTR | encode_text |
FSTACK | pop / pop_stack | PRINTI | print | | |
GET | loadw | PRINTN | print_num | | |
GETB | loadb | PRINTR | print_ret | | |
GETP | get_prop | PRINTT | print_table | | |
GETPT | get_prop_addr | PRINTU | print_unicode | | |
GRTR? | jg | PTSIZE | get_prop_len | | |
HLIGHT | set_text_style | PUSH | push | | |
ICALL | call_vn | PUT | storew | | |
Some opcodes (SET, VALUE, INC, DEC, IGRTR?, DLESS?) take the number of a variable as their first parameter. However, a variable name is not treated specially in this position. This instruction stores 10 into the variable whose number is in the variable “X”:
SET X,10
To store 10 into “X” itself, prefix the variable name with an apostrophe:
SET 'X,10
Even in Inform mode, the apostrophe is still necessary (unlike in Inform's assembler):
store 'x 10
If the target of a store instruction is omitted, the result will be stored to the stack by default.
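The following sketch combines the apostrophe prefix with the default store to the stack. The routine name DEMO, the local variable X, and the label SMALL are hypothetical.

```zap
        .FUNCT DEMO,X
        SET 'X,10               ; X = 10
        INC 'X                  ; X = 11
        DLESS? 'X,5 /SMALL      ; decrement X to 10; branch only if X is now below 5
        ADD X,1                 ; no ">" target, so the sum goes onto the stack
        RSTACK                  ; return the value popped from the stack
SMALL:  RFALSE
```

In the ADD instruction X has no apostrophe because its value is wanted, not its variable number.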
On Z-machine versions 3 and 4, ZAPF automatically assembles the game header. Therefore, certain global labels must be defined:
ENDLOD | Marks the end of low memory and the beginning of high memory. Some interpreters might conserve RAM by leaving high memory on the disk, so frequently used constant data should be (and all mutable data must be) located before this label. |
IMPURE | Marks the end of “impure” (dynamic) memory and the beginning of “pure” (static) memory. This must be defined before ENDLOD. |
START | Marks the instruction where the program begins. |
VOCAB | Marks the beginning of the dictionary (vocabulary) table. See the Z-Machine Standards Document for the format of this table. |
OBJECT | Marks the beginning of the object table. See the Z-Machine Standards Document for the format of this table, and see the .OBJECT directive above. This must be defined before ENDLOD. |
GLOBAL | Marks the beginning of the global variable table, which consists of up to 240 words corresponding to the Z-machine's global variables. See the .GVAR directive above. This must be defined before ENDLOD. |
WORDS | Marks the beginning of the abbreviation table, which consists of up to 96 word addresses (byte addresses divided by 2) pointing to abbreviation strings. See the .FSTR directive above. |
Optionally, the constant RELEASEID may be defined to set the release number of the output file. If it is omitted, the release number will be 0.
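One possible ordering of the required labels for version 3 is outlined below. This is a layout sketch, not a complete program: the table contents are left as comments, the names LIVES and GO are hypothetical, and placing GLOBAL and OBJECT before IMPURE is an assumption based on their holding mutable data.

```zap
        .NEW 3
RELEASEID=1                     ; optional release number

; Dynamic ("impure") memory
GLOBAL::
        .GVAR LIVES=3
OBJECT::
        ; property defaults and .OBJECT records go here
IMPURE::

; Static memory below ENDLOD
WORDS::
        ; abbreviation word addresses go here
VOCAB::
        ; dictionary header and entries go here
ENDLOD::

; High memory
        .FUNCT GO
START:: PRINTI "Hello, sailor!"
        CRLF
        QUIT
        .END
```

The outline keeps IMPURE, OBJECT, and GLOBAL ahead of ENDLOD, as the table above requires.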
On Z-machine versions 5 and up, ZAPF does not automatically create a game header. The input file must start with data directives to assemble one; refer to the Z-Machine Standards Document for the format of the header. ZAPF will, however, fill in the length, checksum, and creator ID (a.k.a. “Inform version”) fields.
• Initial release.
• Known issues: packed function and string addresses are not calculated correctly for V6 or V7 (workaround: set the string and code offsets in the header to 0). The status line “time” flag cannot be set for V3.