Typed Inform is an extension of Inform 6.31 with the goal of introducing a general, straightforward way to manipulate values larger than a single virtual machine word.
For more information, contact vaporware on ifMUD, or follow the email snake:
j ansp com m h r . c @ e e grew stig
Objects aren't suitable for some uses: there's a fixed number of them, and they're always referred to by their numbers—i.e. by reference. Structs can be passed by value, and they can be used as local variables.
Suppose you want to do some 32-bit math (on the Z-machine): each of those numbers is going to be made up of two 16-bit words. If you used an object for each number, the overhead would be huge.
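To make the two-word layout concrete, here is a minimal Python sketch of 32-bit addition built from 16-bit halves. Python is used only to model the arithmetic; the names add32, to_pair, and from_pair are made up for this illustration and are not part of Typed Inform.

```python
MASK16 = 0xFFFF  # a Z-machine word holds 16 bits


def to_pair(n):
    """Split a 32-bit value into a (high_word, low_word) pair."""
    return ((n >> 16) & MASK16, n & MASK16)


def from_pair(pair):
    """Join a (high_word, low_word) pair back into one integer."""
    return (pair[0] << 16) | pair[1]


def add32(a, b):
    """Add two 32-bit values, each stored as two 16-bit words."""
    a_hi, a_lo = a
    b_hi, b_lo = b
    lo_sum = a_lo + b_lo
    carry = lo_sum >> 16                  # 1 if the low words overflowed
    lo = lo_sum & MASK16
    hi = (a_hi + b_hi + carry) & MASK16   # wraps around at 32 bits
    return (hi, lo)
```

Every intermediate result in an expression needs its own pair of words, which is why the storage question matters so much.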
That's only slightly better than using objects, because either way you still only have a fixed number of long-integer variables. You need to define a separate array for each term in an expression. Those arrays are shared among all routines, so you have to be careful with reentrancy, and any changes you make will be visible everywhere.
Remember, the Z-machine imposes a limit on the number of local variables and an even tighter limit on the number of parameters to a routine. If you use two locals for each value, a routine can only accept three and a half values as parameters, and it only has enough local storage space for seven and a half values. (Oh, and how are you going to return one of those two-word values?)
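The figures above can be checked with a little arithmetic. This Python sketch assumes the usual Z-machine limits of 15 local variables per routine and 7 arguments per call (version 5 and later); the constant and function names are invented for the illustration.

```python
MAX_LOCALS = 15      # assumed Z-machine limit on locals per routine
MAX_ARGS = 7         # assumed Z-machine limit on call arguments (V5+)
WORDS_PER_VALUE = 2  # a 32-bit value occupies two 16-bit words


def values_that_fit(slots, words_per_value=WORDS_PER_VALUE):
    """How many multi-word values fit in a given number of word slots."""
    return slots / words_per_value


params = values_that_fit(MAX_ARGS)    # three and a half values
locals_ = values_that_fit(MAX_LOCALS) # seven and a half values
```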
You sure can. By the way, with Typed Inform you can also have as many local variables as you want, even if they're just regular words (up to the maximum size of the routine frame, which is configurable).
Only enough to store two stack frames, and the size of those frames is a configurable, fixed number of words.
That's the old kind of stack, baby. With Typed Inform, "stack" means "free memory you don't have to worry about". (See Virtual Stack Frames below for details.)
malloc() into Inform, are you?
Yes.
struct Point {
  int x, y;
};

struct Rect {
  Point topLeft, bottomRight;
};

! Clamp pt so that it lies inside bounds.
[ Constrain: in/out Point pt, in Rect bounds;
  if (pt.x < bounds.topLeft.x)
    pt.x = bounds.topLeft.x;
  else if (pt.x > bounds.bottomRight.x)
    pt.x = bounds.bottomRight.x;

  if (pt.y < bounds.topLeft.y)
    pt.y = bounds.topLeft.y;
  else if (pt.y > bounds.bottomRight.y)
    pt.y = bounds.bottomRight.y;
];

! Build a Point from two words and return it by value.
[ MakePoint: in int x, in int y, local Point pt, return Point;
  pt.x = x;
  pt.y = y;
  return pt;
];

[ Main: local Rect bounds, local Point pt;
  bounds.topLeft = MakePoint(0, 0);
  bounds.bottomRight = MakePoint(10, 10);
  pt = MakePoint(random(20), random(20));
  Constrain(pt, bounds);
];
Struct Directive
! Define a new type:
struct struct_name {
  ! One or more members can be declared at once:
  type member_name;
  type member_name, member_name;
};
In these definitions (and those below), the type of each member can be any of the following:
int or object, built-in types the size of a single word;
struct *, an untyped pointer.
Note that int * and object * are not valid types.
As in C, when more than one member of a type is defined at the same time, the asterisk meaning "pointer" only attaches to the member name immediately following it. That is, "Point *p, q;" defines p as a pointer-to-Point and q as an actual Point; "Point *p, *q;" defines them both as pointers-to-Point.
Global var_name : struct_name;
Global pointer_var_name : struct_name *;
Global variables of struct or pointer types may be declared by writing the type after a colon. Pointer variables count toward $MAX_GLOBAL_VARIABLES; struct variables count toward $MAX_ARRAYS. Struct variables may be initialized with a struct constant expression (see below).
Property prop_name : type;
Properties containing structs or pointers may be declared similarly. The structs, however, must be small enough to fit into a property: no more than 4 words for V3, 32 words for V4+, or 32,768 words for Glulx. Default values may not be specified for typed properties.
! as part of an expression:
mypoint = Point-->(3, 5);

! the "Point-->" part may be omitted when
! initializing a global variable or property:
Global origin : Point = (0, 0);
Property location : Point;
Object foo
  with location (100, 100);
A constant struct expression may be written by following a type name with a long arrow and a parenthesized list of values, one value for each word in the type. These expressions may not be nested: even if a struct is made up of smaller "sub-structs", the constant expression is written as if the members of those sub-structs were actually part of the larger struct, as in the following example.
struct Point {
  int x, y;
};

struct Rect {
  Point topLeft;
  Point bottomRight;
};

! to initialize topLeft to (0, 0) and bottomRight to (100, 100):
Global my_rect : Rect = (0, 0, 100, 100);
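One way to picture the flattening rule is with a small Python model that walks a struct definition and lists its word-sized members. It assumes members are laid out in declaration order, which matches the Rect example but is otherwise a guess; STRUCTS, flat_members, and assign_constant are hypothetical names, not compiler internals.

```python
# Struct definitions as (member_name, member_type) lists.
STRUCTS = {
    "Point": [("x", "int"), ("y", "int")],
    "Rect": [("topLeft", "Point"), ("bottomRight", "Point")],
}


def flat_members(type_name, prefix=""):
    """List the word-sized members of a struct in (assumed) layout order."""
    paths = []
    for name, member_type in STRUCTS[type_name]:
        path = prefix + name
        if member_type in STRUCTS:
            # A sub-struct contributes its own members in place.
            paths.extend(flat_members(member_type, path + "."))
        else:
            # int, object, or pointer: exactly one word.
            paths.append(path)
    return paths


def assign_constant(type_name, values):
    """Pair each value of a flat constant with the member it initializes."""
    paths = flat_members(type_name)
    if len(values) != len(paths):
        raise ValueError("expected one value per word in the type")
    return dict(zip(paths, values))
```

Under that assumption, the constant (0, 0, 100, 100) for a Rect fills topLeft.x, topLeft.y, bottomRight.x, and bottomRight.y in that order.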
! Old style routine header
[ routine_name local1 local2;
This still works but it isn't very exciting.
! New style routine header
[ routine_name: direction type var_name,
                direction type var_name,
                return type;
Here, only one variable can be declared at a time; the header is terminated by a semicolon. type is the same as above. The optional direction indicates how each variable is passed into or out of the routine as follows; if it's omitted, the default is local:
in: The variable is a parameter passed in ("by value"). Any changes made to it from within the routine will be purely local, invisible to the calling routine.
out: The variable is a parameter passed out, essentially an extra return value. An initial value will not be passed in from the caller, but whatever value the variable holds when the routine exits will be made available to the caller.
in/out: The variable is a parameter passed in and out ("by reference"). An initial value will be passed in, and any changes will be passed back out to the caller when the routine exits.
local: The variable is not passed in or out. The initial value is zero (for single-word variables) or a structure full of zeros.
The return declaration sets the return type of the routine. When the return type is omitted, the routine is presumed to return a single word (int or object); when present, the routine may return a struct or reference-counted pointer.
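The four directions amount to copy-in/copy-out parameter passing. Here is a rough Python model of that idea, not of the compiler's actual output: call_with_directions and the dictionary-based frame are invented for the illustration, and starting an "out" variable at zero is an assumption, since the text only says that no initial value is passed in.

```python
import copy


def call_with_directions(routine, args, directions):
    """Call routine with copy-in/copy-out semantics.

    args maps parameter names to the caller's values; directions maps
    each name to "in", "out", or "in/out". Returns the values that are
    copied back out to the caller.
    """
    frame = {}
    for name, direction in directions.items():
        if direction in ("in", "in/out"):
            frame[name] = copy.deepcopy(args[name])  # callee gets its own copy
        else:
            frame[name] = 0  # "out": nothing passed in (zero is an assumption)
    routine(frame)
    return {
        name: frame[name]
        for name, direction in directions.items()
        if direction in ("out", "in/out")
    }


def foo(frame):
    # Models: [ Foo: in int a, in/out int b, out int c;  c = b; b = a; ];
    frame["c"] = frame["b"]
    frame["b"] = frame["a"]
    frame["a"] = 999  # extra line: a change to an "in" parameter stays local


caller = {"a": 1, "b": 2, "c": 3}
copied_out = call_with_directions(
    foo, caller, {"a": "in", "b": "in/out", "c": "out"}
)
```

Only b and c come back; the caller's a is never touched, whatever the routine does to its copy.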
Declare Directive
Declare Constrain : in/out Point pt, in Rect bounds;
The Declare directive establishes a call signature for a routine that will be defined later. This is necessary when calling a routine that uses struct parameters or out parameters, or returns a struct, but is declared further down in the source code from the point where it's called.
When the routine is eventually defined, the return type must match, and so must the type, order, and direction of every parameter. (Local variables, however, do not need to be mentioned in the Declare directive.)
new Operator
[ Test: Point *pt;
  pt = new Point;
  pt->x = 123;
  pt->y = 456;
];
The new operator allocates a new block of memory, big enough to hold the specified type, and returns a reference-counted pointer to it.
x = malloc(100);
...
mfree(x);
The malloc system function can be used to allocate memory manually, bypassing the reference-counting mechanism used by the new operator and allowing the size to be specified as an arbitrary number of bytes. The return value is the address of the new block, or 0 if a block that large couldn't be allocated.
Since reference counting is not used on these blocks, the memory must be recycled with mfree when it is no longer needed.
struct DataWrapper {
  int datablock;
  destructor [;
    mfree(datablock);
  ];
};
A destructor routine may optionally be embedded in a struct definition. The destructor will be called when a struct value is about to be disposed of, either because its memory is being reclaimed (for reference counted pointers) or because it is a local variable of a routine which is about to return. If the struct contains manually managed pointers, as in the example, it is a good idea to free them here.
Virtual Stack Frames
First, some definitions: the VM stack is a feature built into the Z-machine and Glulx VMs, by which data can be stored for temporary use within a routine. A routine is not allowed to pop data off of the VM stack that was put there by a previous routine, which makes the VM stack unsuitable for passing parameters between routines (as a C compiler would do). The only way to access the VM stack is by pushing or popping a word at a time; the VM stack is not part of RAM, and therefore is exempt from the 64K limit on Z-machine RAM.
A routine frame is a block of memory laid out by the Typed Inform compiler that stores all parameters being passed into or out of a routine, as well as the routine's local variables. (Not quite all of them, actually: the VM local variables are used when possible.)
Typed Inform uses two virtual stack frames in RAM to simulate a stack. One is the local frame, which is used by the currently executing routine; the other is the remote frame, which is used by any subroutines it calls.
Before calling a subroutine, we first set up the remote frame by copying any "in" parameter values into it. We then push the contents of our local frame onto the VM stack, copy the remote frame into the local frame, and call the subroutine. Now the subroutine runs with the local frame we set up for it, and it's free to use the same trick in turn if it needs to call any other routines. After the subroutine returns, we copy its local frame back into the remote frame, pop our local frame back off of the VM stack, and copy any "out" parameters from the remote frame to their final destinations.
By ensuring that the virtual stack frames are always removed from the VM stack in the opposite order from the way they were stored there, and that each virtual frame is popped by the same routine that pushed it, we're able to use the VM stack as free storage instead of keeping several stack frames in RAM. (It isn't "free" from your computer's perspective—you still need physical RAM or virtual memory in your computer to hold all this, of course—but it is free from the perspective of the Z-machine's limited address space.)
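The call sequence described above can be modeled in a few lines of Python. This is only a sketch of the idea: FrameMachine and its methods are hypothetical names, the frame is a fixed four words, and none of the real frame layout (such as the size word) is represented.

```python
class FrameMachine:
    """Toy model: two fixed frames in "RAM" plus a word-at-a-time VM stack."""

    def __init__(self, frame_words=4):
        self.frame_words = frame_words
        self.local = [0] * frame_words   # frame of the running routine
        self.remote = [0] * frame_words  # frame being set up for a callee
        self.vm_stack = []               # lives outside the 64K address space

    def call(self, routine, in_values):
        # 1. Set up the remote frame with the "in" parameter values.
        self.remote = [0] * self.frame_words
        self.remote[:len(in_values)] = in_values
        # 2. Push our local frame onto the VM stack, one word at a time.
        for word in self.local:
            self.vm_stack.append(word)
        # 3. The remote frame becomes the callee's local frame.
        self.local = list(self.remote)
        routine(self)
        # 4. Copy the callee's local frame back into the remote frame.
        self.remote = list(self.local)
        # 5. Pop our own frame back off the VM stack (reverse order).
        words = [self.vm_stack.pop() for _ in range(self.frame_words)]
        self.local = words[::-1]
        # 6. The caller reads any "out" values from the remote frame.
        return list(self.remote)


def double(m):
    m.local[1] = m.local[0] * 2  # slot 1 plays the role of an "out" value


def quadruple(m):
    m.local[2] = 7                       # scratch value in our own frame
    first = m.call(double, [m.local[0]])
    assert m.local[2] == 7               # our frame survived the nested call
    second = m.call(double, [first[1]])
    m.local[1] = second[1]


machine = FrameMachine()
machine.local[3] = 42
result = machine.call(quadruple, [5])
```

Only two frames ever sit in RAM, no matter how deep the calls nest; everything else waits on the VM stack.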
What's the catch? Well, the VM stack doesn't have a specified minimum size, so we can't be sure how much free storage we actually get. On the other hand, it doesn't have a specified maximum size either—interpreter authors are free to choose a big stack size, or make it a setting that players can increase as needed, or even make the stack grow automatically. (The recommended size to run Inform 7 games is at least 16K, and ideally 64K or more. If your interpreter doesn't support a stack that big, pester the author.)
The other catch is that we can't take the address of anything in the local frame and pass that address to another routine, because by the time the other routine starts executing, whatever was in the local frame will have been moved onto the VM stack and replaced with something else. However, since parameters can be passed both in and out, we don't need to pass addresses to subroutines; we can pass the actual data.
For each routine you define that needs a frame of its own, Typed Inform generates a stub routine that handles the details of setting up the frame, allowing callers to simply call the stub instead of inserting a big pile of code at each call site. The stub routine's address is substituted for the original routine's address wherever it's called. For example, consider this routine:
[ Foo: in int a, in/out int b, out int c;
  c = b;
  b = a;
];
The actual code generated looks something like this:
[ Foo a;
  ! [begin frame setup section]
  ! set our preferred frame size
  (#local_frame_start-->0) = 2;
  ! [end frame setup section]

  (#local_frame_start-->2) = (#local_frame_start-->1);
  (#local_frame_start-->1) = a;

  ! [begin frame teardown section]
  ! empty in this example
  ! [end frame teardown section]
];

[ Foo__stub a b c __retval;
  ! copy parameters in
  @copy_table b (#remote_frame_start+2*WORDSIZE) 2;
  ! set the initial frame size
  (#remote_frame_start-->0) = 1;
  ! perform the call using a veneer routine
  __retval = TI__Call(Foo, a);
  ! copy parameters out
  @copy_table (#remote_frame_start+2*WORDSIZE) b 2;
  @copy_table (#remote_frame_start+4*WORDSIZE) c 2;
  return __retval;
];
Notes:
TI__Call, not the stub.
return, rtrue, and rfalse statements are compiled into jumps to the top of the teardown code, so that local variable reference counts can be updated before the routine exits.
Memory allocated with the new operator is managed by reference counting. As long as the pointer returned by that operator is always stored in a pointer-type variable or passed as a pointer-type parameter, the number of references to the memory will be counted as the pointer is copied around, and the memory will automatically be reclaimed when the number of references drops to zero. This will cascade to other references as well: if A is a structure containing the last pointer to B, then when A is reclaimed, B will also be reclaimed.
As with any reference counting system, this will fail to reclaim memory if the only remaining references form a circle. For example, if A contains a pointer to B, B contains a pointer to C, and C contains a pointer back to A, then each pointer will keep the other structures "alive", and the memory will not be reclaimed. To avoid this situation, break the circle by setting at least one of the pointers to zero.
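Both the cascade and the circle problem can be seen in a small Python model of reference counting. It is a sketch of the general technique, not of Typed Inform's allocator: Block, retain, release, and set_pointer are invented names, and a Python set stands in for the heap.

```python
class Block:
    """A reference-counted block; heap is the set of blocks still allocated."""

    def __init__(self, name, heap):
        self.name = name
        self.refs = 0
        self.pointers = {}  # member name -> Block or None
        self.heap = heap
        heap.add(name)


def retain(block):
    if block is not None:
        block.refs += 1


def release(block):
    if block is None:
        return
    block.refs -= 1
    if block.refs == 0:
        # Reclaiming a block drops every pointer it holds (the cascade).
        for target in list(block.pointers.values()):
            release(target)
        block.pointers.clear()
        block.heap.discard(block.name)


def set_pointer(owner, member, target):
    """Store target in owner's member, keeping the counts up to date."""
    old = owner.pointers.get(member)
    retain(target)
    owner.pointers[member] = target
    release(old)


# Cascade: A holds the last pointer to B.
heap = set()
a, b = Block("A", heap), Block("B", heap)
retain(a)                   # a variable refers to A
set_pointer(a, "next", b)
release(a)                  # the variable goes away; A and B are reclaimed

# Circle: A -> B -> C -> A keeps all three alive.
ring = set()
x, y, z = Block("A", ring), Block("B", ring), Block("C", ring)
retain(x)
set_pointer(x, "next", y)
set_pointer(y, "next", z)
set_pointer(z, "next", x)
release(x)                  # the variable goes away; nothing is reclaimed
leaked = set(ring)
set_pointer(z, "next", None)  # break the circle by zeroing one pointer
```

Once the pointer from C back to A is zeroed, A's count reaches zero and the cascade reclaims all three blocks.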
To facilitate reference counting when structures are involved, Typed Inform generates a "type routine" for each structure definition. The typeof system function returns the address of this routine (for example, typeof(Point)), which can be called like so:
typeroutine(0): Returns the size of a Point struct, in words.
typeroutine(1): Returns the string "Point".
typeroutine(2): Invokes Point's destructor for the structure at the address given by the variable "self". (No effect if there is no destructor.)
typeroutine(routine, param): Calls routine once for each member of the struct, with the parameters (param, type, offset, size, name), and finally returns the total size in words. type is 0 for ints, 1 for objects, 2 for any reference counted pointer, or a type routine address for structs. offset and size are given in words. size and name are omitted in version 3 games.
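As a rough picture of how a type routine behaves, here is a Python model of the four calling conventions listed above. Everything in it is an assumption made for illustration: make_type_routine is a hypothetical helper, the calls are told apart by argument count, the destructor takes no argument because there is no "self" variable to model, and the return value of the destructor call is a guess.

```python
def make_type_routine(name, members, destructor=None):
    """Build a model type routine.

    members is a list of (member_name, type_code, size_in_words), where
    type_code is 0 for ints, 1 for objects, 2 for reference counted
    pointers, or another type routine for a nested struct.
    """
    def type_routine(*args):
        total = sum(size for _, _, size in members)
        if args == (0,):
            return total                # size in words
        if args == (1,):
            return name                 # type name
        if args == (2,):
            if destructor is not None:
                destructor()
            return None                 # return value here is a guess
        routine, param = args           # enumerate the members
        offset = 0
        for member_name, type_code, size in members:
            routine(param, type_code, offset, size, member_name)
            offset += size
        return total
    return type_routine


point_type = make_type_routine("Point", [("x", 0, 1), ("y", 0, 1)])
rect_type = make_type_routine(
    "Rect", [("topLeft", point_type, 2), ("bottomRight", point_type, 2)]
)

visited = []


def record(param, type_code, offset, size, member_name):
    visited.append((param, member_name, offset, size))


rect_words = rect_type(record, "tag")
```

A walker like record is the kind of callback that could locate pointer members inside a struct when its reference counts need adjusting.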