@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
@c Copyright (C)  2008,2009,2010,2011,2013,2015
@c   Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.

@node A Virtual Machine for Guile
@section A Virtual Machine for Guile

Guile has both an interpreter and a compiler. To a user, the difference
is transparent---interpreted and compiled procedures can call each other
as they please.

The difference is that the compiler translates source code into bytecode
for a custom virtual machine, which then interprets that bytecode
instead of interpreting the S-expressions directly. Loading and running
compiled code is faster than loading and running source code.

The virtual machine that does the bytecode interpretation is a part of
Guile itself. This section describes the nature of Guile's virtual
machine.

@menu
* Why a VM?::
* VM Concepts::
* Stack Layout::
* Variables and the VM::
* VM Programs::
* Object File Format::
* Instruction Set::
@end menu

@node Why a VM?
@subsection Why a VM?

@cindex interpreter
For a long time, Guile only had an interpreter. Guile's interpreter
operated directly on the S-expression representation of Scheme source
code.

But while the interpreter was highly optimized and hand-tuned, it still
performed many needless computations during the course of evaluating an
expression. For example, application of a function to arguments
needlessly consed up the arguments in a list. Evaluation of an
expression always had to figure out what the car of the expression was:
a procedure, a memoized form, or something else. All values had to be
allocated on the heap. Et cetera.

The solution to this problem was to compile the higher-level language,
Scheme, into a lower-level language for which all of the checks and
dispatching have already been done---the code is instead stripped to
the bare minimum needed to ``do the job''.

The question becomes then, what low-level language to choose? There
are many options. We could compile to native code directly, but that
poses portability problems for Guile, as it is a highly cross-platform
project.

So we want the performance gains that compilation provides, but we
also want to maintain the portability benefits of a single code path.
The obvious solution is to compile to a virtual machine that is
present on all Guile installations.

The easiest (and most fun) way to depend on a virtual machine is to
implement the virtual machine within Guile itself. Guile contains a
bytecode interpreter (written in C) and a Scheme to bytecode compiler
(written in Scheme). This way the virtual machine provides what Scheme
needs (tail calls, multiple values, @code{call/cc}) and can provide
optimized inline instructions for Guile (@code{cons}, @code{struct-ref},
etc.).

So this is what Guile does. The rest of this section describes the VM
that Guile implements, and the compiled procedures that run on it.

Before moving on, though, we should note that though we spoke of the
interpreter in the past tense, Guile still has an interpreter. The
difference is that before, it was Guile's main evaluator, and so was
implemented in highly optimized C; now, it is actually implemented in
Scheme, and compiled down to VM bytecode, just like any other program.
(There is still a C interpreter around, used to bootstrap the compiler,
but it is not normally used at runtime.)

The upside of implementing the interpreter in Scheme is that we preserve
tail calls and multiple-value handling between interpreted and compiled
code. The downside is that the interpreter in Guile 2.2 is still about
twice as slow as the interpreter in 1.8.  Since Scheme users are mostly
running compiled code, the compiler's speed more than makes up for the
loss.  In any case, once we have native compilation for Scheme code, we
expect the self-hosted interpreter to handily beat the old hand-tuned C
implementation.

Also note that this decision to implement a bytecode compiler does not
preclude native compilation. We can compile from bytecode to native
code at runtime, or even do ahead of time compilation. More
possibilities are discussed in @ref{Extending the Compiler}.

@node VM Concepts
@subsection VM Concepts

Compiled code is run by a virtual machine (VM).  Each thread has its own
VM.  The virtual machine executes the sequence of instructions in a
procedure.

Each VM instruction starts by indicating which operation it is, and then
follows by encoding its source and destination operands.  Each procedure
declares that it has some number of local variables, including the
function arguments.  These local variables form the available operands
of the procedure, and are accessed by index.

The local variables for a procedure are stored on a stack.  Calling a
procedure typically enlarges the stack, and returning from a procedure
shrinks it.  Stack memory is exclusive to the virtual machine that owns
it.

In addition to their stacks, virtual machines also have access to the
global memory (modules, global bindings, etc) that is shared among other
parts of Guile, including other VMs.

The registers that a VM has are as follows:

@itemize
@item ip - Instruction pointer
@item sp - Stack pointer
@item fp - Frame pointer
@end itemize

In other architectures, the instruction pointer is sometimes called the
``program counter'' (pc). This set of registers is pretty typical for
virtual machines; their exact meanings in the context of Guile's VM are
described in the next section.

@node Stack Layout
@subsection Stack Layout

The stack of Guile's virtual machine is composed of @dfn{frames}. Each
frame corresponds to the application of one compiled procedure, and
contains storage space for arguments, local variables, and some
bookkeeping information (such as what to do after the frame is
finished).

While the compiler is free to do whatever it wants to, as long as the
semantics of a computation are preserved, in practice every time you
call a function, a new frame is created. (The notable exception of
course is the tail call case, @pxref{Tail Calls}.)

The structure of the top stack frame is as follows:

@example
   | ...              |
   +==================+ <- fp + 2 = SCM_FRAME_PREVIOUS_SP (fp)
   | Dynamic link     |
   +------------------+
   | Return address   |
   +==================+ <- fp
   | Local 0          |
   +------------------+
   | Local 1          |
   +------------------+
   | ...              |
   +------------------+
   | Local N-1        |
   \------------------/ <- sp
@end example

In the above drawing, the stack grows downward.  At the beginning of a
function call, the procedure being applied is in local 0, followed by
the arguments from local 1.  After the procedure checks that it is being
passed a compatible set of arguments, the procedure allocates some
additional space in the frame to hold variables local to the function.

Note that once a value in a local variable slot is no longer needed,
Guile is free to re-use that slot.  This applies to the slots that were
initially used for the callee and arguments, too.  For this reason,
backtraces in Guile aren't always able to show all of the arguments: it
could be that the slot corresponding to that argument was re-used by
some other variable.

The @dfn{return address} is the @code{ip} that was in effect before this
program was applied.  When we return from this activation frame, we will
jump back to this @code{ip}.  Likewise, the @dfn{dynamic link} is the
offset of the @code{fp} that was in effect before this program was
applied, relative to the current @code{fp}.

To prepare for a non-tail application, Guile's compiler will emit code
that shuffles the function to apply and its arguments into appropriate
stack slots, with two free slots below them.  The call then initializes
those free slots with the current @code{ip} and @code{fp}, and updates
@code{ip} to point to the function entry, and @code{fp} to point to the
new call frame.

In this way, the dynamic link links the current frame to the previous
frame.  Computing a stack trace involves traversing these frames.

Each stack local in Guile is 64 bits wide, even on 32-bit architectures.
This allows Guile to preserve its uniform treatment of stack locals
while allowing for unboxed arithmetic on 64-bit integers and
floating-point numbers.  @xref{Instruction Set}, for more on unboxed
arithmetic.

As an implementation detail, we actually store the dynamic link as an
offset and not an absolute value because the stack can move at runtime
as it expands or during partial continuation calls.  If it were an
absolute value, we would have to walk the frames, relocating frame
pointers.
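
As a rough illustration of how relative dynamic links support stack
walking, the following sketch models a stack as a Scheme vector.  The
layout, the names @code{frame-pointers} and @code{toy-stack}, and the
use of a zero link to mark the oldest frame are all assumptions made for
this example; they are not Guile's actual representation.

@example
;; Toy model, not Guile's real stack.  Larger indices hold older
;; frames.  For a frame pointer FP, slot FP holds the return address
;; and slot FP+1 holds the dynamic link: the offset to the previous FP.
(define (frame-pointers stack fp)
  (let lp ((fp fp) (acc '()))
    (let ((link (vector-ref stack (+ fp 1))))
      (if (zero? link)          ; assumption: 0 marks the oldest frame
          (reverse (cons fp acc))
          (lp (+ fp link) (cons fp acc))))))

(define toy-stack
  (vector 'local 'local         ; locals of the newest frame
          'ra-a 3               ; fp = 2: link to fp 5
          'local                ; local of the middle frame
          'ra-b 3               ; fp = 5: link to fp 8
          'local                ; local of the oldest frame
          'ra-c 0))             ; fp = 8: no previous frame

(frame-pointers toy-stack 2) @result{} (2 5 8)
@end example

Because each link is an offset, copying the whole vector to a different
location would leave the links valid, which is the property the relative
encoding is meant to provide.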

@node Variables and the VM
@subsection Variables and the VM

Consider the following Scheme code as an example:

@example
  (define (foo a)
    (lambda (b) (list foo a b)))
@end example

Within the lambda expression, @code{foo} is a top-level variable,
@code{a} is a lexically captured variable, and @code{b} is a local
variable.

Another way to refer to @code{a} and @code{b} is to say that @code{a} is
a ``free'' variable, since it is not defined within the lambda, and
@code{b} is a ``bound'' variable. These are the terms used in the
@dfn{lambda calculus}, a mathematical notation for describing functions.
The lambda calculus is useful because it is a language in which to
reason precisely about functions and variables.  It is especially good
at describing scope relations, and it is for that reason that we mention
it here.

Guile allocates all variables on the stack. When a lexically enclosed
procedure with free variables---a @dfn{closure}---is created, it copies
those variables into its free variable vector. References to free
variables are then redirected through the free variable vector.

If a variable is ever @code{set!}, however, it will need to be
heap-allocated instead of stack-allocated, so that different closures
that capture the same variable can see the same value. Also, this
allows continuations to capture a reference to the variable, instead
of to its value at one point in time. For these reasons, @code{set!}
variables are allocated in ``boxes''---actually, in variable cells.
@xref{Variables}, for more information. References to @code{set!}
variables are indirected through the boxes.

Thus perhaps counterintuitively, what would seem ``closer to the
metal'', viz @code{set!}, actually forces an extra memory allocation and
indirection.  Sometimes Guile's optimizer can remove this allocation,
but not always.

Going back to our example, @code{b} may be allocated on the stack, as
it is never mutated.

@code{a} may also be allocated on the stack, as it too is never
mutated. Within the enclosed lambda, its value will be copied into
(and referenced from) the free variables vector.

@code{foo} is a top-level variable, because @code{foo} is not
lexically bound in this example.
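
To see why assigned variables need a shared location, consider the
following small sketch.  The name @code{make-counter} is made up for
this example.  Two closures capture the same variable @code{n}, and one
of them mutates it; if each closure merely copied the value of @code{n},
the second closure would never observe the update.

@example
(define (make-counter)
  (let ((n 0))
    ;; Both closures capture N.  Because N is assigned with set!,
    ;; they must share one location instead of copying its value.
    (cons (lambda () (set! n (+ n 1)) n)
          (lambda () n))))

(define counter (make-counter))
((car counter)) @result{} 1
((car counter)) @result{} 2
((cdr counter)) @result{} 2
@end example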

@node VM Programs
@subsection Compiled Procedures are VM Programs

By default, when you enter in expressions at Guile's REPL, they are
first compiled to bytecode.  Then that bytecode is executed to produce a
value.  If the expression evaluates to a procedure, the result of this
process is a compiled procedure.

A compiled procedure is a compound object consisting of its bytecode and
a reference to any captured lexical variables.  In addition, when a
procedure is compiled, it has associated metadata written to side
tables, for instance a line number mapping, or its docstring.  You can
pick apart these pieces with the accessors in @code{(system vm
program)}.  @xref{Compiled Procedures}, for a full API reference.
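
As a small sketch of what picking apart a compiled procedure can look
like, the following uses a few accessors from @code{(system vm
program)}.  It assumes a Guile version in which @code{program?},
@code{program-num-free-variables}, and @code{program-free-variable-ref}
are available, and the variable name @code{closure} is only for
illustration.

@example
(use-modules (system vm program))

(define (foo a)
  (lambda (b) (list foo a b)))

(define closure (foo 'x))

(program? closure)                    ; is this a compiled procedure?
(program-num-free-variables closure)  ; how many captured variables?
(program-free-variable-ref closure 0) ; the first captured value
@end example

When the code has been compiled, one would expect the closure to report
one free variable holding the symbol @code{x}, but the exact results
depend on how the code was evaluated and on the Guile version in use.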

A procedure may reference data that was statically allocated when the
procedure was compiled.  For example, a pair of immediate objects
(@pxref{Immediate objects}) can be allocated directly in the memory
segment that contains the compiled bytecode, and accessed directly by
the bytecode.

Another use for statically allocated data is to serve as a cache for a
bytecode.  Top-level variable lookups are handled in this way.  If the
@code{toplevel-box} instruction finds that it does not have a cached
variable for a top-level reference, it accesses other static data to
resolve the reference, and fills in the cache slot.  Thereafter all
access to the variable goes through the cache cell.  The variable's
value may change in the future, but the variable itself will not.
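
The caching idea can be imitated in plain Scheme.  This is only a
sketch: the real cache lives in static data next to the bytecode, not in
a closure, and @code{make-cached-lookup} and @code{lookup-x} are names
invented for this example.  It relies on Guile's @code{module-variable}
and @code{variable-ref} procedures.

@example
;; Resolve SYM in MODULE on first use, then keep returning the same
;; variable object from the cache.
(define (make-cached-lookup module sym)
  (let ((cache #f))
    (lambda ()
      (unless cache
        (set! cache (module-variable module sym)))
      cache)))

(define x 10)
(define lookup-x (make-cached-lookup (current-module) 'x))

(variable-ref (lookup-x)) @result{} 10
(set! x 20)
(variable-ref (lookup-x)) @result{} 20
@end example

The second call sees the new value without repeating the lookup, because
the cached object is the variable itself and not its value.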

We can see how these concepts tie together by disassembling the
@code{foo} function we defined earlier to see what is going on:

@smallexample
scheme@@(guile-user)> (define (foo a) (lambda (b) (list foo a b)))
scheme@@(guile-user)> ,x foo
Disassembly of #<procedure foo (a)> at #xea4ce4:

   0    (assert-nargs-ee/locals 2 0)    ;; 2 slots (1 arg)    at (unknown file):1:0
   1    (make-closure 1 7 1)            ;; anonymous procedure at #xea4d04 (1 free var)
   4    (free-set! 1 0 0)               ;; free var 0
   6    (mov 0 1)
   7    (return-values 2)               ;; 1 value

----------------------------------------
Disassembly of anonymous procedure at #xea4d04:

   0    (assert-nargs-ee/locals 2 2)    ;; 4 slots (1 arg)    at (unknown file):1:16
   1    (toplevel-box 1 74 58 68 #t)    ;; `foo'
   6    (box-ref 1 1)
   7    (make-short-immediate 0 772)    ;; ()                 at (unknown file):1:28
   8    (cons 2 2 0)
   9    (free-ref 3 3 0)                ;; free var 0
  11    (cons 3 3 2)
  12    (cons 2 1 3)
  13    (return-values 2)               ;; 1 value
@end smallexample

First there's some prelude, where @code{foo} checks that it was called
with only 1 argument.  Then at @code{ip} 1, we allocate a new closure
and store it in slot 1, relative to the @code{sp}.

At run-time, local variables in Guile are usually addressed relative to
the stack pointer, which leads to a pleasantly efficient
@code{sp[@var{n}]} access.  However it can make the disassembly hard to
read, because the @code{sp} can change during the function, and because
incoming arguments are relative to the @code{fp}, not the @code{sp}.

To know what @code{fp}-relative slot corresponds to an
@code{sp}-relative reference, scan up in the disassembly until you get
to a ``@var{n} slots'' annotation; in our case, 2, indicating that the
frame has space for 2 slots.  Thus a zero-indexed @code{sp}-relative
slot of 1 corresponds to the @code{fp}-relative slot of 0, which
initially held the value of the closure being called.  This means that
Guile doesn't need the value of the closure to compute its result, and
so slot 0 was free for re-use, in this case for the result of making a
new closure.
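
The slot arithmetic can be written down as a one-line helper.  The name
@code{sp-slot->fp-slot} is made up for this sketch, and it assumes that
the @code{sp} has not moved since the most recent ``@var{n} slots''
annotation.

@example
(define (sp-slot->fp-slot nslots sp-slot)
  ;; The last slot of the frame is sp-relative slot 0.
  (- nslots 1 sp-slot))

(sp-slot->fp-slot 2 1) @result{} 0   ; foo: the closure slot, re-used
(sp-slot->fp-slot 2 0) @result{} 1   ; foo: the argument a
(sp-slot->fp-slot 4 2) @result{} 1   ; anonymous procedure: argument b
@end example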

A closure is code with data.  The @code{7} in the @code{(make-closure 1
7 1)} is a relative offset from the instruction pointer of the code for
the closure, and the final @code{1} indicates that the closure has space
for 1 free variable.  @code{Ip} 4 initializes free variable 0 in the new
closure with the value from @code{sp}-relative slot 0, which corresponds
to @code{fp}-relative slot 1, the first argument of @code{foo}:
@code{a}.  Finally we return the closure.

The second stanza disassembles the code for the closure.  After the
prelude, we load the variable for the toplevel variable @code{foo} into
slot 1.  This lookup occurs lazily, the first time the variable is
actually referenced, and the location of the lookup is cached so that
future references are very cheap.  @xref{Top-Level Environment
Instructions}, for more details.  The @code{box-ref} dereferences the
variable cell, replacing the contents of slot 1.

What follows is a sequence of conses to build up the result list.
@code{Ip} 7 makes the tail of the list.  @code{Ip} 8 conses on the value
in slot 2, corresponding to the first argument to the closure: @code{b}.
@code{Ip} 9 loads free variable 0 of slot 3 -- the procedure being
called, in @code{fp}-relative slot 0 -- into slot 3, then @code{ip} 11
conses it onto the list.  Finally we cons the value in slot 1,
containing the @code{foo} toplevel, onto the front of the list, and we
return it.


@node Object File Format
@subsection Object File Format

To compile a file to disk, we need a format in which to write the
compiled code to disk, and later load it into Guile.  A good @dfn{object
file format} has a number of characteristics:

@itemize
@item Above all else, it should be very cheap to load a compiled file.
@item It should be possible to statically allocate constants in the
file.  For example, a bytevector literal in source code can be emitted
directly into the object file.
@item The compiled file should enable maximum code and data sharing
between different processes.
@item The compiled file should contain debugging information, such as
line numbers, but that information should be separated from the code
itself.  It should be possible to strip debugging information if space
is tight.
@end itemize

These characteristics are not specific to Scheme.  Indeed, mainstream
languages like C and C++ have solved this issue many times in the past.
Guile builds on their work by adopting ELF, the object file format of
GNU and other Unix-like systems, as its object file format.  Although
Guile uses ELF on all platforms, we do not use platform support for ELF.
Guile implements its own linker and loader.  The advantage of using ELF
is not sharing code, but sharing ideas.  ELF is simply a well-designed
object file format.
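
Since compiled files are ELF files, they begin with the standard ELF
magic number.  The sketch below checks for it in a bytevector; the name
@code{elf-magic?} is invented for this example.

@example
(use-modules (rnrs bytevectors))

(define (elf-magic? bv)
  ;; ELF files begin with the bytes #x7f, E, L, F.
  (and (>= (bytevector-length bv) 4)
       (equal? (map (lambda (i) (bytevector-u8-ref bv i)) '(0 1 2 3))
               (list #x7f
                     (char->integer #\E)
                     (char->integer #\L)
                     (char->integer #\F)))))

(elf-magic? (u8-list->bytevector '(#x7f 69 76 70 2 1))) @result{} #t
(elf-magic? (u8-list->bytevector '(0 1 2 3)))           @result{} #f
@end example

To try it on a real @code{.go} file, one could read the first four bytes
from a binary port, for example with @code{get-bytevector-n} from
@code{(ice-9 binary-ports)}.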

An ELF file has two meta-tables describing its contents.  The first
meta-table is for the loader, and is called the @dfn{program table} or
sometimes the @dfn{segment table}.  The program table divides the file
into big chunks that should be treated differently by the loader.
Mostly the difference between these @dfn{segments} is their
permissions.

Typically all segments of an ELF file are marked as read-only, except
that part that represents modifiable static data or static data that
needs load-time initialization.  Loading an ELF file is as simple as
mmapping the thing into memory with read-only permissions, then using
the segment table to mark a small sub-region of the file as writable.
This writable section is typically added to the root set of the garbage
collector as well.

One ELF segment is marked as ``dynamic'', meaning that it has data of
interest to the loader.  Guile uses this segment to record the Guile
version corresponding to this file.  There is also an entry in the
dynamic segment that points to the address of an initialization thunk
that is run to perform any needed link-time initialization.  (This is
like dynamic relocations for normal ELF shared objects, except that we
compile the relocations as a procedure instead of having the loader
interpret a table of relocations.)  Finally, the dynamic segment marks
the location of the ``entry thunk'' of the object file.  This thunk is
returned to the caller of @code{load-thunk-from-memory} or
@code{load-thunk-from-file}.  When called, it will execute the ``body''
of the compiled expression.

The other meta-table in an ELF file is the @dfn{section table}.  Whereas
the program table divides an ELF file into big chunks for the loader,
the section table specifies small sections for use by introspective
tools like debuggers or the like.  One segment (program table entry)
typically contains many sections.  There may be sections outside of any
segment, as well.

Typical sections in a Guile @code{.go} file include:

@table @code
@item .rtl-text
Bytecode.
@item .data
Data that needs initialization, or which may be modified at runtime.
@item .rodata
Statically allocated data that needs no run-time initialization, and
which therefore can be shared between processes.
@item .dynamic
The dynamic section, discussed above.
@item .symtab
@itemx .strtab
A table mapping addresses in the @code{.rtl-text} to procedure names.
@code{.strtab} is used by @code{.symtab}.
@item .guile.procprops
@itemx .guile.arities
@itemx .guile.arities.strtab
@itemx .guile.docstrs
@itemx .guile.docstrs.strtab
Side tables of procedure properties, arities, and docstrings.
@item .guile.frame-maps
Side table of frame maps, describing the set of live slots for every
return point in the program text, and whether those slots are pointers
or not.  Used by the garbage collector.
@item .debug_info
@itemx .debug_abbrev
@itemx .debug_str
@itemx .debug_loc
@itemx .debug_line
Debugging information, in DWARF format.  See the DWARF specification,
for more information.
@item .shstrtab
Section name string table.
@end table

For more information, see @uref{http://linux.die.net/man/5/elf,,the
elf(5) man page}.  See @uref{http://dwarfstd.org/,the DWARF
specification} for more on the DWARF debugging format.  Or if you are an
adventurous explorer, try running @code{readelf} or @code{objdump} on
compiled @code{.go} files.  It's good times!


@node Instruction Set
@subsection Instruction Set

There are currently about 175 instructions in Guile's virtual machine.
These instructions represent atomic units of a program's execution.
Ideally, they perform one task without conditional branches, then
dispatch to the next instruction in the stream.

Instructions themselves are composed of 1 or more 32-bit units.  The low
8 bits of the first word indicate the opcode, and the rest of the
instruction describes the operands.  There are a number of different
ways operands can be encoded.

@table @code
@item s@var{n}
An unsigned @var{n}-bit integer, indicating the @code{sp}-relative index
of a local variable.
@item f@var{n}
An unsigned @var{n}-bit integer, indicating the @code{fp}-relative index
of a local variable.  Used when a continuation accepts a variable number
of values, to shuffle received values into known locations in the
frame.
@item c@var{n}
An unsigned @var{n}-bit integer, indicating a constant value.
@item l24
An offset from the current @code{ip}, in 32-bit units, as a signed
24-bit value.  Indicates a bytecode address, for a relative jump.
@item i16
@itemx i32
An immediate Scheme value (@pxref{Immediate objects}), encoded directly
in 16 or 32 bits.
@item a32
@itemx b32
An immediate Scheme value, encoded as a pair of 32-bit words.
@code{a32} and @code{b32} values always go together on the same opcode,
and indicate the high and low bits, respectively.  Normally only used on
64-bit systems.
@item n32
A statically allocated non-immediate.  The address of the non-immediate
is encoded as a signed 32-bit integer, and indicates a relative offset
in 32-bit units.  Think of it as @code{SCM x = ip + offset}.
@item r32
Indirect Scheme value, like @code{n32} but indirected.  Think of it as
@code{SCM *x = ip + offset}.
@item l32
@itemx lo32
An ip-relative address, as a signed 32-bit integer.  Could indicate a
bytecode address, as in @code{make-closure}, or a non-immediate address,
as with @code{static-patch!}.

@code{l32} and @code{lo32} are the same from the perspective of the
virtual machine.  The difference is that an assembler might want to
allow an @code{lo32} address to be specified as a label and then some
number of words offset from that label, for example when patching a
field of a statically allocated object.
@item b1
A boolean value: 1 for true, otherwise 0.
@item x@var{n}
An ignored sequence of @var{n} bits.
@end table

An instruction is specified by giving its name, then describing its
operands.  The operands are packed by 32-bit words, with earlier
operands occupying the lower bits.

For example, consider the following instruction specification:

@deftypefn Instruction {} free-set! s12:@var{dst} s12:@var{src} x8:@var{_} c24:@var{idx}
Set free variable @var{idx} from the closure @var{dst} to @var{src}.
@end deftypefn

The first word in the instruction will start with the 8-bit value
corresponding to the @code{free-set!} opcode in the low bits, followed by
@var{dst} and @var{src} as 12-bit values.  The second word starts with 8
dead bits, followed by the index as a 24-bit immediate value.
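
The packing described above can be sketched with Guile's bitwise
procedures.  The names @code{pack-free-set!} and
@code{unpack-free-set!} are invented for this example, and the opcode
value @code{#x10} is a placeholder, not the number that Guile actually
assigns to @code{free-set!}.

@example
;; OPCODE is a placeholder: the real opcode numbers are assigned by
;; Guile and are not given here.
(define (pack-free-set! opcode dst src idx)
  (list (logior opcode (ash dst 8) (ash src 20)) ; word 0
        (ash idx 8)))                            ; word 1: 8 dead bits

(define (unpack-free-set! w0 w1)
  (list (logand w0 #xff)              ; opcode
        (logand (ash w0 -8) #xfff)    ; dst
        (logand (ash w0 -20) #xfff)   ; src
        (ash w1 -8)))                 ; idx

(pack-free-set! #x10 1 2 3)     @result{} (2097424 768)
(unpack-free-set! 2097424 768)  @result{} (16 1 2 3)
@end example

In hexadecimal the two words are @code{#x00200110} and
@code{#x00000300}, which makes the 8-, 12-, and 24-bit fields easier to
pick out.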

Sometimes the compiler can figure out that it is compiling a special
case that can be run more efficiently. So, for example, while Guile
offers a generic test-and-branch instruction, it also offers specific
instructions for special cases, so that the following cases all have
their own test-and-branch instructions:

@example
(if pred then else)
(if (not pred) then else)
(if (null? l) then else)
(if (not (null? l)) then else)
@end example

In addition, some Scheme primitives have their own inline
implementations.  For example, in the previous section we saw
@code{cons}.

Finally, for instructions with operands that encode references to the
stack, the interpretation of those stack values is up to the instruction
itself.  Most instructions expect their operands to be tagged SCM values
(@code{scm} representation), but some instructions expect unboxed
integers (@code{u64} and @code{s64} representations) or floating-point
numbers (@code{f64} representation).  Instructions have static types:
they must receive their operands in the format they expect.  It's up to
the compiler to ensure this is the case.  Unless otherwise mentioned,
all operands and results are boxed as SCM values.

@menu
* Lexical Environment Instructions::
* Top-Level Environment Instructions::
* Procedure Call and Return Instructions::
* Function Prologue Instructions::
* Trampoline Instructions::
* Branch Instructions::
* Constant Instructions::
* Dynamic Environment Instructions::
* Miscellaneous Instructions::
* Inlined Scheme Instructions::
* Inlined Mathematical Instructions::
* Inlined Bytevector Instructions::
* Unboxed Integer Arithmetic::
* Unboxed Floating-Point Arithmetic::
@end menu


@node Lexical Environment Instructions
@subsubsection Lexical Environment Instructions

These instructions access and mutate the lexical environment of a
compiled procedure---its free and bound variables.  @xref{Stack Layout},
for more information on the format of stack frames.

@deftypefn Instruction {} mov s12:@var{dst} s12:@var{src}
@deftypefnx Instruction {} long-mov s24:@var{dst} x8:@var{_} s24:@var{src}
Copy a value from one local slot to another.

As discussed previously, procedure arguments and local variables are
allocated to local slots.  Guile's compiler tries to avoid shuffling
variables around to different slots, which often makes @code{mov}
instructions redundant.  However there are some cases in which shuffling
is necessary, and in those cases, @code{mov} is the thing to use.
@end deftypefn

@deftypefn Instruction {} long-fmov f24:@var{dst} x8:@var{_} f24:@var{src}
Copy a value from one local slot to another, but addressing slots
relative to the @code{fp} instead of the @code{sp}.  This is used when
shuffling values into place after multiple-value returns.
@end deftypefn

@deftypefn Instruction {} make-closure s24:@var{dst} l32:@var{offset} x8:@var{_} c24:@var{nfree}
Make a new closure, and write it to @var{dst}.  The code for the closure
will be found at @var{offset} words from the current @code{ip}.
@var{offset} is a signed 32-bit integer.  Space for @var{nfree} free
variables will be allocated.

The size of a closure is currently two words, plus one word per free
variable.
@end deftypefn

@deftypefn Instruction {} free-ref s12:@var{dst} s12:@var{src} x8:@var{_} c24:@var{idx}
Load free variable @var{idx} from the closure @var{src} into local slot
@var{dst}.
@end deftypefn

@deftypefn Instruction {} free-set! s12:@var{dst} s12:@var{src} x8:@var{_} c24:@var{idx}
Set free variable @var{idx} from the closure @var{dst} to @var{src}.

This instruction is usually used when initializing a closure's free
variables, but not to mutate free variables, as variables that are
assigned are boxed.
@end deftypefn

Recall that variables that are assigned are usually allocated in boxes,
so that continuations and closures can capture their identity and not
their value at one point in time.  Variables are also used in the
implementation of top-level bindings; see the next section for more
information.

@deftypefn Instruction {} box s12:@var{dst} s12:@var{src}
Create a new variable holding @var{src}, and place it in @var{dst}.
@end deftypefn

@deftypefn Instruction {} box-ref s12:@var{dst} s12:@var{src}
Unpack the variable at @var{src} into @var{dst}, asserting that the
variable is actually bound.
@end deftypefn

@deftypefn Instruction {} box-set! s12:@var{dst} s12:@var{src}
Set the contents of the variable at @var{dst} to @var{src}.
@end deftypefn


@node Top-Level Environment Instructions
@subsubsection Top-Level Environment Instructions

These instructions access values in the top-level environment: bindings
that were not lexically apparent at the time that the code in question
was compiled.

The location in which a toplevel binding is stored can be looked up once
and cached for later. The binding itself may change over time, but its
location will stay constant.

@deftypefn Instruction {} current-module s24:@var{dst}
Store the current module in @var{dst}.
@end deftypefn

@deftypefn Instruction {} resolve s24:@var{dst} b1:@var{bound?} x7:@var{_} s24:@var{sym}
Resolve @var{sym} in the current module, and place the resulting
variable in @var{dst}.  An error will be signalled if no variable is
found.  If @var{bound?} is true, an error will be signalled if the
variable is unbound.
@end deftypefn

@deftypefn Instruction {} define! s12:@var{dst} s12:@var{sym}
Look up a binding for @var{sym} in the current module, creating it if
necessary.  Store that variable to @var{dst}.
@end deftypefn

@deftypefn Instruction {} toplevel-box s24:@var{dst} r32:@var{var-offset} r32:@var{mod-offset} n32:@var{sym-offset} b1:@var{bound?} x31:@var{_}
Load a value.  The value will be fetched from memory, @var{var-offset}
32-bit words away from the current instruction pointer.
@var{var-offset} is a signed value.  Up to here, @code{toplevel-box} is
like @code{static-ref}.

Then, if the loaded value is a variable, it is placed in @var{dst}, and
control flow continues.

Otherwise, we have to resolve the variable.  In that case we load the
module from @var{mod-offset}, just as we loaded the variable.  Usually
the module gets set when the closure is created.  @var{sym-offset}
specifies the name, as an offset to a symbol.

We use the module and the symbol to resolve the variable, placing it in
@var{dst}, and caching the resolved variable so that we will hit the
cache next time.  If @var{bound?} is true, an error will be signalled if
the variable is unbound.
@end deftypefn

@deftypefn Instruction {} module-box s24:@var{dst} r32:@var{var-offset} n32:@var{mod-offset} n32:@var{sym-offset} b1:@var{bound?} x31:@var{_}
Like @code{toplevel-box}, except @var{mod-offset} points at a module
identifier instead of the module itself.  A module identifier is a
module name, as a list, prefixed by a boolean.  If the prefix is true,
then the variable is resolved relative to the module's public interface
instead of its private interface.
@end deftypefn


@node Procedure Call and Return Instructions
@subsubsection Procedure Call and Return Instructions

As described earlier (@pxref{Stack Layout}), Guile's calling convention
is that arguments are passed and values returned on the stack.

For calls, both in tail position and in non-tail position, we require
that the procedure and the arguments already be shuffled into place
before the call instruction.  ``Into place'' for a tail call means that
the procedure should be in slot 0, relative to the @code{fp}, and the
arguments should follow.  For a non-tail call, if the procedure is in
@code{fp}-relative slot @var{n}, the arguments should follow from slot
@var{n}+1, and there should be two free slots at @var{n}-1 and @var{n}-2
in which to save the @code{ip} and @code{fp}.

Returning values is similar.  Multiple-value returns should have values
already shuffled down to start from @code{fp}-relative slot 1 before
emitting @code{return-values}.  We start from slot 1 instead of slot 0
to make tail calls to @code{values} trivial.

In both calls and returns, the @code{sp} is used to indicate to the
callee or caller the number of arguments or return values, respectively.
After receiving return values, it is the caller's responsibility to
@dfn{restore the frame} by resetting the @code{sp} to its former value.

@deftypefn Instruction {} call f24:@var{proc} x8:@var{_} c24:@var{nlocals}
Call a procedure.  @var{proc} is the local corresponding to a procedure.
The two values below @var{proc} will be overwritten by the saved call
frame data.  The new frame will have space for @var{nlocals} locals: one
for the procedure, and the rest for the arguments which should already
have been pushed on.

When the call returns, execution proceeds with the next instruction.
There may be any number of values on the return stack; the precise
number can be had by subtracting the address of @var{proc} from the
post-call @code{sp}.
@end deftypefn

@deftypefn Instruction {} call-label f24:@var{proc} x8:@var{_} c24:@var{nlocals} l32:@var{label}
Call a procedure in the same compilation unit.

This instruction is just like @code{call}, except that instead of
dereferencing @var{proc} to find the call target, the call target is
known to be at @var{label}, a signed 32-bit offset in 32-bit units from
the current @code{ip}.  Since @var{proc} is not dereferenced, it may be
some other representation of the closure.
@end deftypefn

@deftypefn Instruction {} tail-call c24:@var{nlocals}
Tail-call a procedure.  Requires that the procedure and all of the
arguments have already been shuffled into position.  Will reset the
frame to @var{nlocals}.
@end deftypefn

@deftypefn Instruction {} tail-call-label c24:@var{nlocals} l32:@var{label}
Tail-call a known procedure.  As @code{call} is to @code{call-label},
@code{tail-call} is to @code{tail-call-label}.
@end deftypefn

@deftypefn Instruction {} tail-call/shuffle f24:@var{from}
Tail-call a procedure.  The procedure should already be set to slot 0.
The rest of the args are taken from the frame, starting at @var{from},
shuffled down to start at slot 1.  This is part of the implementation of
the @code{call-with-values} builtin.
@end deftypefn

@deftypefn Instruction {} receive f12:@var{dst} f12:@var{proc} x8:@var{_} c24:@var{nlocals}
Receive a single return value from a call whose procedure was in
@var{proc}, asserting that the call actually returned at least one
value.  Afterwards, resets the frame to @var{nlocals} locals.
@end deftypefn

@deftypefn Instruction {} receive-values f24:@var{proc} b1:@var{allow-extra?} x7:@var{_} c24:@var{nvalues}
Receive a return of multiple values from a call whose procedure was in
@var{proc}.  If fewer than @var{nvalues} values were returned, signal an
error.  Unless @var{allow-extra?} is true, require that the number of
return values equals @var{nvalues} exactly.  After @code{receive-values}
has run, the values can be copied down via @code{mov}, or used in place.
@end deftypefn

@deftypefn Instruction {} return-values c24:@var{nlocals}
Return a number of values from a call frame.  This opcode corresponds to
an application of @code{values} in tail position.  As with tail calls,
we expect that the values have already been shuffled down to a
contiguous array starting at slot 1.  If @var{nlocals} is nonzero, reset
the frame to hold that number of locals.  Note that a frame reset to 1
local returns 0 values.
@end deftypefn

@deftypefn Instruction {} call/cc x24:@var{_}
Capture the current continuation, and tail-apply the procedure in local
slot 1 to it.  This instruction is part of the implementation of
@code{call/cc}, and is not generated by the compiler.
@end deftypefn


@node Function Prologue Instructions
@subsubsection Function Prologue Instructions

A function call in Guile is very cheap: the VM simply hands control to
the procedure. The procedure itself is responsible for asserting that it
has been passed an appropriate number of arguments. This strategy allows
arbitrarily complex argument parsing idioms to be developed, without
harming the common case.

For example, only calls to keyword-argument procedures ``pay'' for the
cost of parsing keyword arguments. (At the time of this writing, calling
procedures with keyword arguments is typically two to four times as
costly as calling procedures with a fixed set of arguments.)

@deftypefn Instruction {} assert-nargs-ee c24:@var{expected}
@deftypefnx Instruction {} assert-nargs-ge c24:@var{expected}
@deftypefnx Instruction {} assert-nargs-le c24:@var{expected}
If the number of actual arguments is not @code{==}, @code{>=}, or
@code{<=} @var{expected}, respectively, signal an error.

The number of arguments is determined by subtracting the stack pointer
from the frame pointer (@code{fp - sp}).  @xref{Stack Layout}, for more
details on stack frames.  Note that @var{expected} includes the
procedure itself.
@end deftypefn
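
The three assertions can be modeled in Scheme as follows.  This is only
a sketch: @code{nargs-ok?} and its symbolic @var{kind} argument are
made up for illustration, and the real checks are performed by the VM on
the size of the frame.

@example
;; Hypothetical model of the arity assertions.  NARGS and EXPECTED
;; both count the procedure itself.
(define (nargs-ok? kind nargs expected)
  (case kind
    ((ee) (= nargs expected))
    ((ge) (>= nargs expected))
    ((le) (<= nargs expected))
    (else (error "unknown assertion kind" kind))))

;; A call (f 1 2) has nargs = 3: the procedure plus two arguments.
(nargs-ok? 'ee 3 3)   ;; @result{} #t
(nargs-ok? 'ge 2 3)   ;; @result{} #f
(nargs-ok? 'le 2 3)   ;; @result{} #t
@end example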

@deftypefn Instruction {} br-if-nargs-ne c24:@var{expected} x8:@var{_} l24:@var{offset}
@deftypefnx Instruction {} br-if-nargs-lt c24:@var{expected} x8:@var{_} l24:@var{offset}
@deftypefnx Instruction {} br-if-nargs-gt c24:@var{expected} x8:@var{_} l24:@var{offset}
If the number of actual arguments is not equal, less than, or greater
than @var{expected}, respectively, add @var{offset}, a signed 24-bit
number, to the current instruction pointer.  Note that @var{expected}
includes the procedure itself.

These instructions are used to implement multiple arities, as in
@code{case-lambda}. @xref{Case-lambda}, for more information.
@end deftypefn
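
For instance, a procedure such as the following has two arities.  A
compiler could select between the clauses by testing the argument count
with @code{br-if-nargs-ne} and branching to the next clause on a
mismatch.  The exact bytecode produced depends on the Guile version, so
treat this as an illustration rather than a specification.

@example
(define area
  (case-lambda
    ((side) (* side side))
    ((width height) (* width height))))

(area 3)     ;; @result{} 9
(area 2 5)   ;; @result{} 10
@end example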

@deftypefn Instruction {} alloc-frame c24:@var{nlocals}
Ensure that there is space on the stack for @var{nlocals} local
variables, setting them all to @code{SCM_UNDEFINED}, except those values
that are already on the stack.
@end deftypefn

@deftypefn Instruction {} reset-frame c24:@var{nlocals}
Like @code{alloc-frame}, but doesn't check that the stack is big enough,
and doesn't initialize values to @code{SCM_UNDEFINED}.  Used to reset
the frame size to something less than the size that was previously set
via @code{alloc-frame}.
@end deftypefn

@deftypefn Instruction {} assert-nargs-ee/locals c12:@var{expected} c12:@var{nlocals}
Equivalent to a sequence of @code{assert-nargs-ee} and
@code{alloc-frame}.  The number of locals reserved is @var{expected}
+ @var{nlocals}.
@end deftypefn

@deftypefn Instruction {} br-if-npos-gt c24:@var{nreq} x8:@var{_} c24:@var{npos} x8:@var{_} l24:@var{offset}
Find the first positional argument after @var{nreq}.  If it is greater
than @var{npos}, jump to @var{offset}.

This instruction is only emitted for functions with multiple clauses,
when an earlier clause has keywords and no rest arguments.
@xref{Case-lambda}, for more on how @code{case-lambda} chooses the
clause to apply.
@end deftypefn

@deftypefn Instruction {} bind-kwargs c24:@var{nreq} c8:@var{flags} c24:@var{nreq-and-opt} x8:@var{_} c24:@var{ntotal} n32:@var{kw-offset}
@var{flags} is a bitfield, whose lowest bit is @var{allow-other-keys},
second bit is @var{has-rest}, and whose following six bits are unused.

Find the last positional argument, and shuffle all the rest above
@var{ntotal}.  Initialize the intervening locals to
@code{SCM_UNDEFINED}.  Then load the constant at @var{kw-offset} words
from the current @code{ip}, and use it and the @var{allow-other-keys}
flag to bind keyword arguments.  If @var{has-rest}, collect all shuffled
arguments into a list, and store it in @var{nreq-and-opt}.  Finally,
clear the arguments that we shuffled up.

The parsing is driven by a keyword arguments association list, looked up
using @var{kw-offset}.  The alist is a list of pairs of the form
@code{(@var{kw} . @var{index})}, mapping keyword arguments to their
local slot indices.  Unless @code{allow-other-keys} is set, the parser
will signal an error if an unknown key is found.

A macro-mega-instruction.
@end deftypefn
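
The layout of @var{flags} and the shape of the keyword alist can be
sketched in Scheme.  The decoder name @code{decode-kwargs-flags} and the
slot numbers in the alist are assumptions made for this example only.

@example
;; Hypothetical decoder for the FLAGS byte of bind-kwargs: bit 0 is
;; allow-other-keys, bit 1 is has-rest.
(define (decode-kwargs-flags flags)
  (list (cons 'allow-other-keys? (logtest flags #b01))
        (cons 'has-rest? (logtest flags #b10))))

(decode-kwargs-flags #b10)
;; @result{} ((allow-other-keys? . #f) (has-rest? . #t))

;; A keyword alist mapping each keyword to the local slot that
;; receives its value.  The slot numbers are made up.
(define example-kw-alist '((#:width . 3) (#:height . 4)))

(assq-ref example-kw-alist #:height)   ;; @result{} 4
@end example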

@deftypefn Instruction {} bind-rest f24:@var{dst}
Collect any arguments at or above @var{dst} into a list, and store that
list at @var{dst}.
@end deftypefn
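
As a minimal model of @code{bind-rest}, the frame can be pictured as a
list of locals with the procedure in slot 0.  The helper name
@code{bind-rest-model} is hypothetical, and a real frame is of course
not a Scheme list.

@example
;; Collect every local at or above DST into a list stored at DST.
(define (bind-rest-model locals dst)
  (append (list-head locals dst)
          (list (list-tail locals dst))))

;; For (lambda (a . rest) ...) called with 1 2 3, assuming that the
;; rest argument lives in slot 2:
(bind-rest-model '(proc 1 2 3) 2)   ;; @result{} (proc 1 (2 3))

;; With no extra arguments, the rest list is empty.
(bind-rest-model '(proc 1) 2)       ;; @result{} (proc 1 ())
@end example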


@node Trampoline Instructions
@subsubsection Trampoline Instructions

Though most applicable objects in Guile are procedures implemented in
bytecode, not all are.  There are primitives, continuations, and other
procedure-like objects that have their own calling convention.  Instead
of adding special cases to the @code{call} instruction, Guile wraps
these other applicable objects in VM trampoline procedures, then
provides special support for these objects in bytecode.

Trampoline procedures are typically generated by Guile at runtime, for
example in response to a call to @code{scm_c_make_gsubr}.  As such, a
compiler probably shouldn't emit code with these instructions.  However,
it's still interesting to know how these things work, so we document
these trampoline instructions here.

@deftypefn Instruction {} subr-call x24:@var{_}
Call a subr, passing all locals in this frame as arguments.  Return from
the calling frame.
@end deftypefn

@deftypefn Instruction {} foreign-call c12:@var{cif-idx} c12:@var{ptr-idx}
Call a foreign function.  Fetch the CIF and foreign pointer from
@var{cif-idx} and @var{ptr-idx}, both free variables.  Return from the calling
frame.  Arguments are taken from the stack.
@end deftypefn

@deftypefn Instruction {} continuation-call c24:@var{contregs}
Return to a continuation, nonlocally.  The arguments to the continuation
are taken from the stack.  @var{contregs} is a free variable containing
the reified continuation.
@end deftypefn

@deftypefn Instruction {} compose-continuation c24:@var{cont}
Compose a partial continuation with the current continuation.  The
arguments to the continuation are taken from the stack.  @var{cont} is a
free variable containing the reified continuation.
@end deftypefn

@deftypefn Instruction {} tail-apply x24:@var{_}
Tail-apply the procedure in local slot 0 to the rest of the arguments.
This instruction is part of the implementation of @code{apply}, and is
not generated by the compiler.
@end deftypefn

@deftypefn Instruction {} builtin-ref s12:@var{dst} c12:@var{idx}
Load a builtin stub by index into @var{dst}.
@end deftypefn

@deftypefn Instruction {} apply-non-program x24:@var{_}
An instruction used only by a special trampoline that the VM uses to
apply non-programs.  Using that trampoline allows profilers and
backtrace utilities to avoid seeing the instruction pointer from the
calling frame.
@end deftypefn


@node Branch Instructions
@subsubsection Branch Instructions

All offsets to branch instructions are 24-bit signed numbers, which
count 32-bit units.  This gives Guile effectively a 26-bit address range
for relative jumps.
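
The arithmetic behind the 26-bit figure can be checked directly.  The
names below are illustrative only.

@example
;; Branch offsets are signed 24-bit counts of 32-bit (4-byte) units.
(define min-offset (- (expt 2 23)))
(define max-offset (- (expt 2 23) 1))

(define (offset->bytes offset)
  (* offset 4))

(offset->bytes min-offset)   ;; @result{} -33554432, that is -(2^25)
(offset->bytes max-offset)   ;; @result{} 33554428
@end example

The byte range thus spans just under 2^26 bytes in total, centered on
the current instruction pointer.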

@deftypefn Instruction {} br l24:@var{offset}
Add @var{offset} to the current instruction pointer.
@end deftypefn

All the conditional branch instructions described below have an
@var{invert} parameter, which if true reverses the test:
@code{br-if-true} becomes @code{br-if-false}, and so on.

@deftypefn Instruction {} br-if-true s24:@var{test} b1:@var{invert} x7:@var{_} l24:@var{offset}
If the value in @var{test} is true for the purposes of Scheme, add
@var{offset} to the current instruction pointer.
@end deftypefn

@deftypefn Instruction {} br-if-null s24:@var{test} b1:@var{invert} x7:@var{_} l24:@var{offset}
If the value in @var{test} is the end-of-list or Lisp nil, add
@var{offset} to the current instruction pointer.
@end deftypefn

@deftypefn Instruction {} br-if-nil s24:@var{test} b1:@var{invert} x7:@var{_} l24:@var{offset}
If the value in @var{test} is false to Lisp, add @var{offset} to the
current instruction pointer.
@end deftypefn

@deftypefn Instruction {} br-if-pair s24:@var{test} b1:@var{invert} x7:@var{_} l24:@var{offset}
If the value in @var{test} is a pair, add @var{offset} to the current
instruction pointer.
@end deftypefn

@deftypefn Instruction {} br-if-struct s24:@var{test} b1:@var{invert} x7:@var{_} l24:@var{offset}
If the value in @var{test} is a struct, add @var{offset} to the
current instruction pointer.
@end deftypefn

@deftypefn Instruction {} br-if-char s24:@var{test} b1:@var{invert} x7:@var{_} l24:@var{offset}
If the value in @var{test} is a char, add @var{offset} to the current
instruction pointer.
@end deftypefn

@deftypefn Instruction {} br-if-tc7 s24:@var{test} b1:@var{invert} u7:@var{tc7} l24:@var{offset}
If the value in @var{test} has the TC7 given in the second word, add
@var{offset} to the current instruction pointer.  TC7 codes are part of
the way Guile represents non-immediate objects, and are deep wizardry.
See @code{libguile/tags.h} for all the details.
@end deftypefn

@deftypefn Instruction {} br-if-eq s24:@var{a} x8:@var{_} s24:@var{b} b1:@var{invert} x7:@var{_} l24:@var{offset}
@deftypefnx Instruction {} br-if-eqv s24:@var{a} x8:@var{_} s24:@var{b} b1:@var{invert} x7:@var{_} l24:@var{offset}
If the value in @var{a} is @code{eq?} or @code{eqv?} to the value in
@var{b}, respectively, add @var{offset} to the current instruction
pointer.
@end deftypefn

@deftypefn Instruction {} br-if-= s24:@var{a} x8:@var{_} s24:@var{b} b1:@var{invert} x7:@var{_} l24:@var{offset}
@deftypefnx Instruction {} br-if-< s24:@var{a} x8:@var{_} s24:@var{b} b1:@var{invert} x7:@var{_} l24:@var{offset}
@deftypefnx Instruction {} br-if-<= s24:@var{a} x8:@var{_} s24:@var{b} b1:@var{invert} x7:@var{_} l24:@var{offset}
If the value in @var{a} is @code{=}, @code{<}, or @code{<=} to the value
in @var{b}, respectively, add @var{offset} to the current instruction
pointer.
@end deftypefn

@deftypefn Instruction {} br-if-logtest s24:@var{a} x8:@var{_} s24:@var{b} b1:@var{invert} x7:@var{_} l24:@var{offset}
If the bitwise intersection of the integers in @var{a} and @var{b} is
nonzero, add @var{offset} to the current instruction pointer.
@end deftypefn
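
The test performed by @code{br-if-logtest} matches the Scheme procedure
@code{logtest}, which can be used to see which operand pairs would take
the branch (with @var{invert} false):

@example
(logtest #b1010 #b0110)   ;; @result{} #t, bit 1 is set in both
(logtest #b1010 #b0101)   ;; @result{} #f, no bits in common
@end example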


@node Constant Instructions
@subsubsection Constant Instructions

The following instructions load literal data into a program.  There are
two kinds.

The first set of instructions loads immediate values.  These
instructions encode the immediate directly into the instruction stream.

@deftypefn Instruction {} make-short-immediate s8:@var{dst} i16:@var{low-bits}
Make an immediate whose low bits are @var{low-bits}, and whose top bits are
0.
@end deftypefn

@deftypefn Instruction {} make-long-immediate s24:@var{dst} i32:@var{low-bits}
Make an immediate whose low bits are @var{low-bits}, and whose top bits are
0.
@end deftypefn

@deftypefn Instruction {} make-long-long-immediate s24:@var{dst} a32:@var{high-bits} b32:@var{low-bits}
Make an immediate with @var{high-bits} and @var{low-bits}.
@end deftypefn

Non-immediate constant literals are referenced either directly or
indirectly.  For example, Guile knows at compile-time what the layout of
a string will be like, and arranges to embed that object directly in the
compiled image.  A reference to a string will use
@code{make-non-immediate} to treat a pointer into the compilation unit
as a @code{SCM} value directly.

@deftypefn Instruction {} make-non-immediate s24:@var{dst} n32:@var{offset}
Load a pointer to statically allocated memory into @var{dst}.  The
object's memory will be found @var{offset} 32-bit words away from the
current instruction pointer.  Whether the object is mutable or immutable
depends on where it was allocated by the compiler, and loaded by the
loader.
@end deftypefn

Some objects must be unique across the whole system.  This is the case
for symbols and keywords.  For these objects, Guile arranges to
initialize them when the compilation unit is loaded, storing them into a
slot in the image.  References go indirectly through that slot.
@code{static-ref} is used in this case.

@deftypefn Instruction {} static-ref s24:@var{dst} r32:@var{offset}
Load a @code{SCM} value into @var{dst}.  The @code{SCM} value will be
fetched from memory, @var{offset} 32-bit words away from the current
instruction pointer.  @var{offset} is a signed value.
@end deftypefn

Fields of non-immediates may need to be fixed up at load time, because
we do not know in advance at what address they will be loaded.  This is
the case, for example, for a pair containing a non-immediate in one of
its fields.  @code{static-set!} and @code{static-patch!} are used in
these situations.

@deftypefn Instruction {} static-set! s24:@var{src} lo32:@var{offset}
Store a @code{SCM} value into memory, @var{offset} 32-bit words away from
the current instruction pointer.  @var{offset} is a signed value.
@end deftypefn

@deftypefn Instruction {} static-patch! x24:@var{_} lo32:@var{dst-offset} l32:@var{src-offset}
Patch a pointer at @var{dst-offset} to point to @var{src-offset}.  Both offsets
are signed 32-bit values, indicating a memory address as a number
of 32-bit words away from the current instruction pointer.
@end deftypefn

Many kinds of literals can be loaded with the above instructions, once
the compiler has prepared the statically allocated data.  This is the
case for vectors, strings, uniform vectors, pairs, and procedures with
no free variables.  Other kinds of data might need special initializers;
those instructions follow.

@deftypefn Instruction {} string->number s12:@var{dst} s12:@var{src}
Parse a string in @var{src} to a number, and store it in @var{dst}.
@end deftypefn

@deftypefn Instruction {} string->symbol s12:@var{dst} s12:@var{src}
Intern the string in @var{src} as a symbol, and store it in @var{dst}.
@end deftypefn

@deftypefn Instruction {} symbol->keyword s12:@var{dst} s12:@var{src}
Make a keyword from the symbol in @var{src}, and store it in @var{dst}.
@end deftypefn
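
These three initializers share their names with ordinary Scheme
procedures, so their effect can be previewed at the REPL:

@example
(string->number "42")     ;; @result{} 42
(string->symbol "foo")    ;; @result{} foo
(symbol->keyword 'foo)    ;; @result{} #:foo
@end example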

@deftypefn Instruction {} load-typed-array s24:@var{dst} x8:@var{_} s24:@var{type} x8:@var{_} s24:@var{shape} n32:@var{offset} u32:@var{len}
Load the contiguous typed array located at @var{offset} 32-bit words away
from the instruction pointer, and store into @var{dst}.  @var{len} is a byte
length.  @var{offset} is signed.
@end deftypefn


@node Dynamic Environment Instructions
@subsubsection Dynamic Environment Instructions

Guile's virtual machine has low-level support for @code{dynamic-wind},
dynamic binding, and composable prompts and aborts.

@deftypefn Instruction {} abort x24:@var{_}
Abort to a prompt handler.  The tag is expected in slot 1, and the rest
of the values in the frame are returned to the prompt handler.  This
corresponds to a tail application of @code{abort-to-prompt}.

If no prompt can be found in the dynamic environment with the given tag,
an error is signalled.  Otherwise all arguments are passed to the
prompt's handler, along with the captured continuation, if necessary.

If the prompt's handler can be proven to not reference the captured
continuation, no continuation is allocated.  This decision happens
dynamically, at run-time; the general case is that the continuation may
be captured, and thus resumed.  A reinstated continuation will have its
arguments pushed on the stack from slot 1, as if from a multiple-value
return, and control resumes in the caller.  Thus to the calling
function, a call to @code{abort-to-prompt} looks like any other function
call.
@end deftypefn

@deftypefn Instruction {} prompt s24:@var{tag} b1:@var{escape-only?} x7:@var{_} f24:@var{proc-slot} x8:@var{_} l24:@var{handler-offset}
Push a new prompt on the dynamic stack, with a tag from @var{tag} and a
handler at @var{handler-offset} words from the current @code{ip}.

If an abort is made to this prompt, control will jump to the handler.
The handler will expect a multiple-value return as if from a call with
the procedure at @var{proc-slot}, with the reified partial continuation
as the first argument, followed by the values returned to the handler.
If control returns to the handler, the prompt is already popped off by
the abort mechanism.  (Guile's @code{prompt} implements Felleisen's
@dfn{--F--} operator.)

If @var{escape-only?} is nonzero, the prompt will be marked as
escape-only, which allows an abort to this prompt to avoid reifying the
continuation.

@xref{Prompts}, for more information on prompts.
@end deftypefn
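
At the Scheme level, prompts and aborts are reached through
@code{call-with-prompt} and @code{abort-to-prompt}.  The example below
shows the handler receiving the partial continuation first, followed by
the aborted values.  How a given compiler version maps these forms onto
@code{prompt} and @code{abort} is not shown here.

@example
(define tag (make-prompt-tag "example"))

;; The handler ignores the continuation K and uses only the value.
(call-with-prompt tag
  (lambda ()
    (+ 1 (abort-to-prompt tag 10)))
  (lambda (k value)
    (* value 2)))
;; @result{} 20

;; The handler resumes K, so control returns into the (+ 1 ...) form.
(call-with-prompt tag
  (lambda ()
    (+ 1 (abort-to-prompt tag 10)))
  (lambda (k value)
    (k (* value 2))))
;; @result{} 21
@end example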

@deftypefn Instruction {} wind s12:@var{winder} s12:@var{unwinder}
Push wind and unwind procedures onto the dynamic stack.  Note that
neither is actually called; the compiler should emit calls to wind and
unwind for the normal @code{dynamic-wind} control flow.  Also note that
the compiler should have inserted checks that the wind and unwind procs
are thunks, if it could not prove that to be the case.  @xref{Dynamic Wind}.
@end deftypefn

@deftypefn Instruction {} unwind x24:@var{_}
A normal exit from the dynamic extent of an expression. Pop the top
entry off of the dynamic stack.
@end deftypefn
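
The normal control flow that surrounds @code{wind} and @code{unwind} is
that of @code{dynamic-wind}.  The following example records the order in
which the three thunks run; the @code{trace} variable and @code{note!}
helper are names chosen for this illustration.

@example
(define trace '())

(define (note! what)
  (set! trace (cons what trace)))

(dynamic-wind
  (lambda () (note! 'wind))
  (lambda () (note! 'body))
  (lambda () (note! 'unwind)))

(reverse trace)   ;; @result{} (wind body unwind)
@end example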

@deftypefn Instruction {} push-fluid s12:@var{fluid} s12:@var{value}
Dynamically bind @var{value} to @var{fluid} by creating a with-fluids
object and pushing that object on the dynamic stack.  @xref{Fluids and
Dynamic States}.
@end deftypefn

@deftypefn Instruction {} pop-fluid x24:@var{_}
Leave the dynamic extent of a @code{with-fluid*} expression, restoring
the fluid to its previous value.  @code{push-fluid} should always be
balanced with @code{pop-fluid}.
@end deftypefn

@deftypefn Instruction {} fluid-ref s12:@var{dst} s12:@var{src}
Reference the fluid in @var{src}, and place the value in @var{dst}.
@end deftypefn

@deftypefn Instruction {} fluid-set s12:@var{fluid} s12:@var{val}
Set the value of the fluid in @var{fluid} to the value in @var{val}.
@end deftypefn
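
The Scheme forms that correspond to these fluid instructions are
@code{with-fluids}, @code{fluid-ref} and @code{fluid-set!}.  As a
sketch, a compiler could bracket the body of @code{with-fluids} with
@code{push-fluid} and @code{pop-fluid}; the fluid name @code{depth} is
arbitrary.

@example
(define depth (make-fluid 0))

(with-fluids ((depth 1))
  (fluid-ref depth))      ;; @result{} 1

;; Outside the dynamic extent, the previous value is visible again.
(fluid-ref depth)         ;; @result{} 0

(fluid-set! depth 5)
(fluid-ref depth)         ;; @result{} 5
@end example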

@deftypefn Instruction {} current-thread s24:@var{dst}
Write the value of the current thread to @var{dst}.
@end deftypefn


@node Miscellaneous Instructions
@subsubsection Miscellaneous Instructions

@deftypefn Instruction {} halt x24:@var{_}
Bring the VM to a halt, returning all the values from the stack.  Used
in the ``boot continuation'', which is used when entering the VM from C.
@end deftypefn

@deftypefn Instruction {} push s24:@var{src}
Bump the stack pointer by one word, and fill it with the value from slot
@var{src}.  The offset to @var{src} is calculated before the stack
pointer is adjusted.
@end deftypefn

The @code{push} instruction is used when another instruction is unable
to address an operand because the operand is encoded with fewer than 24
bits.  In that case, Guile's assembler will transparently emit code that
temporarily pushes any needed operands onto the stack, emits the
original instruction to address those now-near variables, then shuffles
the result (if any) back into place.
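
Whether an operand is ``near'' enough can be pictured with a small
predicate.  This sketch assumes that slot operands are encoded as
unsigned indices of the stated width; @code{operand-fits?} is a
hypothetical name, not part of Guile's assembler.

@example
;; Can SLOT be encoded in an unsigned operand field of WIDTH bits?
(define (operand-fits? slot width)
  (and (>= slot 0)
       (< slot (expt 2 width))))

(operand-fits? 300 8)    ;; @result{} #f, 8 bits reach slots 0 to 255
(operand-fits? 300 12)   ;; @result{} #t
(operand-fits? 300 24)   ;; @result{} #t
@end example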

@deftypefn Instruction {} pop s24:@var{dst}
Pop the stack pointer, storing the value that was there in slot
@var{dst}.  The offset to @var{dst} is calculated after the stack
pointer is adjusted.
@end deftypefn

@deftypefn Instruction {} drop c24:@var{count}
Pop the stack pointer by @var{count} words, discarding any values that
were stored there.
@end deftypefn


@node Inlined Scheme Instructions
@subsubsection Inlined Scheme Instructions

The Scheme compiler can recognize the application of standard Scheme
procedures.  It tries to inline these small operations to avoid the
overhead of creating new stack frames.  This allows the compiler to
optimize better.

@deftypefn Instruction {} make-vector s8:@var{dst} s8:@var{length} s8:@var{init}
Make a vector and write it to @var{dst}.  The vector will have space for
@var{length} slots.  They will be filled with the value in slot
@var{init}.
@end deftypefn

@deftypefn Instruction {} make-vector/immediate s8:@var{dst} c8:@var{length} s8:@var{init}
Make a short vector of known size and write it to @var{dst}.  The vector
will have space for @var{length} slots, an immediate value.  They will
be filled with the value in slot @var{init}.
@end deftypefn

@deftypefn Instruction {} vector-length s12:@var{dst} s12:@var{src}
Store the length of the vector in @var{src} in @var{dst}, as an unboxed
unsigned 64-bit integer.
@end deftypefn

@deftypefn Instruction {} vector-ref s8:@var{dst} s8:@var{src} s8:@var{idx}
Fetch the item at position @var{idx} in the vector in @var{src}, and
store it in @var{dst}.  The @var{idx} value should be an unboxed
unsigned 64-bit integer.
@end deftypefn

@deftypefn Instruction {} vector-ref/immediate s8:@var{dst} s8:@var{src} c8:@var{idx}
Fill @var{dst} with the item @var{idx} elements into the vector at
@var{src}.  Useful for building data types using vectors.
@end deftypefn

@deftypefn Instruction {} vector-set! s8:@var{dst} s8:@var{idx} s8:@var{src}
Store @var{src} into the vector @var{dst} at index @var{idx}.  The
@var{idx} value should be an unboxed unsigned 64-bit integer.
@end deftypefn

@deftypefn Instruction {} vector-set!/immediate s8:@var{dst} c8:@var{idx} s8:@var{src}
Store @var{src} into the vector @var{dst} at index @var{idx}.  Here
@var{idx} is an immediate value.
@end deftypefn

@deftypefn Instruction {} struct-vtable s12:@var{dst} s12:@var{src}
Store the vtable of @var{src} into @var{dst}.
@end deftypefn

@deftypefn Instruction {} allocate-struct s8:@var{dst} s8:@var{vtable} s8:@var{nfields}
Allocate a new struct with @var{vtable}, and place it in @var{dst}.  The
struct will be constructed with space for @var{nfields} fields, which
should correspond to the field count of the @var{vtable}.  The
@var{nfields} value should be an unboxed unsigned 64-bit integer.
@end deftypefn

@deftypefn Instruction {} struct-ref s8:@var{dst} s8:@var{src} s8:@var{idx}
Fetch the item at slot @var{idx} in the struct in @var{src}, and store
it in @var{dst}.  The @var{idx} value should be an unboxed unsigned
64-bit integer.
@end deftypefn

@deftypefn Instruction {} struct-set! s8:@var{dst} s8:@var{idx} s8:@var{src}
Store @var{src} into the struct @var{dst} at slot @var{idx}.  The
@var{idx} value should be an unboxed unsigned 64-bit integer.
@end deftypefn

@deftypefn Instruction {} allocate-struct/immediate s8:@var{dst} s8:@var{vtable} c8:@var{nfields}
@deftypefnx Instruction {} struct-ref/immediate s8:@var{dst} s8:@var{src} c8:@var{idx}
@deftypefnx Instruction {} struct-set!/immediate s8:@var{dst} c8:@var{idx} s8:@var{src}
Variants of the struct instructions, but in which the @var{nfields} or
@var{idx} fields are immediate values.
@end deftypefn

@deftypefn Instruction {} class-of s12:@var{dst} s12:@var{type}
Store the class of the value in @var{type} into @var{dst}.
@end deftypefn

@deftypefn Instruction {} make-array s24:@var{dst} x8:@var{_} s24:@var{type} x8:@var{_} s24:@var{fill} x8:@var{_} s24:@var{bounds}
Make a new array with @var{type}, @var{fill}, and @var{bounds}, storing it in @var{dst}.
@end deftypefn

@deftypefn Instruction {} string-length s12:@var{dst} s12:@var{src}
Store the length of the string in @var{src} in @var{dst}, as an unboxed
unsigned 64-bit integer.
@end deftypefn

@deftypefn Instruction {} string-ref s8:@var{dst} s8:@var{src} s8:@var{idx}
Fetch the character at position @var{idx} in the string in @var{src},
and store it in @var{dst}.  The @var{idx} value should be an unboxed
unsigned 64-bit integer.
@end deftypefn

@deftypefn Instruction {} cons s8:@var{dst} s8:@var{car} s8:@var{cdr}
Cons @var{car} and @var{cdr}, and store the result in @var{dst}.
@end deftypefn

@deftypefn Instruction {} car s12:@var{dst} s12:@var{src}
Place the car of @var{src} in @var{dst}.
@end deftypefn

@deftypefn Instruction {} cdr s12:@var{dst} s12:@var{src}
Place the cdr of @var{src} in @var{dst}.
@end deftypefn

@deftypefn Instruction {} set-car! s12:@var{pair} s12:@var{car}
Set the car of @var{pair} to @var{car}.
@end deftypefn

@deftypefn Instruction {} set-cdr! s12:@var{pair} s12:@var{cdr}
Set the cdr of @var{pair} to @var{cdr}.
@end deftypefn

Note that @code{caddr} and friends compile to a series of @code{car}
and @code{cdr} instructions.
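
For example, the following two expressions are equivalent, and the
second shows the sequence of primitive operations involved:

@example
(define lst '(a b c d))

(caddr lst)             ;; @result{} c
(car (cdr (cdr lst)))   ;; @result{} c
@end example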

@deftypefn Instruction {} integer->char s12:@var{dst} s12:@var{src}
Convert the @code{u64} value in @var{src} to a Scheme character, and
place it in @var{dst}.
@end deftypefn

@deftypefn Instruction {} char->integer s12:@var{dst} s12:@var{src}
Convert the Scheme character in @var{src} to an integer, and place it in
@var{dst} as an unboxed @code{u64} value.
@end deftypefn


@node Inlined Mathematical Instructions
@subsubsection Inlined Mathematical Instructions

Inlining mathematical operations has the obvious advantage of handling
fixnums without function calls or allocations. The trick, of course,
is knowing when the result of an operation will be a fixnum, and there
might be a couple of bugs here.

More instructions could be added here over time.

All of these operations place their result in their first operand,
@var{dst}.

@deftypefn Instruction {} add s8:@var{dst} s8:@var{a} s8:@var{b}
Add @var{a} to @var{b}.
@end deftypefn

@deftypefn Instruction {} add/immediate s8:@var{dst} s8:@var{src} c8:@var{imm}
Add the unsigned integer @var{imm} to the value in @var{src}.
@end deftypefn

@deftypefn Instruction {} sub s8:@var{dst} s8:@var{a} s8:@var{b}
Subtract @var{b} from @var{a}.
@end deftypefn

@deftypefn Instruction {} sub/immediate s8:@var{dst} s8:@var{src} c8:@var{imm}
Subtract the unsigned integer @var{imm} from the value in @var{src}.
@end deftypefn

@deftypefn Instruction {} mul s8:@var{dst} s8:@var{a} s8:@var{b}
Multiply @var{a} and @var{b}.
@end deftypefn

@deftypefn Instruction {} div s8:@var{dst} s8:@var{a} s8:@var{b}
Divide @var{a} by @var{b}.
@end deftypefn

@deftypefn Instruction {} quo s8:@var{dst} s8:@var{a} s8:@var{b}
Compute the quotient of @var{a} divided by @var{b}, as with
@code{quotient}.
@end deftypefn

@deftypefn Instruction {} rem s8:@var{dst} s8:@var{a} s8:@var{b}
Compute the remainder of @var{a} divided by @var{b}, as with
@code{remainder}.
@end deftypefn

@deftypefn Instruction {} mod s8:@var{dst} s8:@var{a} s8:@var{b}
Compute the modulo of @var{a} by @var{b}.
@end deftypefn
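
The four division-like instructions differ in the same way as the Scheme
procedures @code{/}, @code{quotient}, @code{remainder} and
@code{modulo}, to which they are assumed here to correspond.  A negative
dividend makes the difference visible:

@example
(/ -7 2)           ;; @result{} -7/2
(quotient -7 2)    ;; @result{} -3
(remainder -7 2)   ;; @result{} -1
(modulo -7 2)      ;; @result{} 1
@end example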

@deftypefn Instruction {} ash s8:@var{dst} s8:@var{a} s8:@var{b}
Shift @var{a} arithmetically by @var{b} bits.
@end deftypefn

@deftypefn Instruction {} logand s8:@var{dst} s8:@var{a} s8:@var{b}
Compute the bitwise @code{and} of @var{a} and @var{b}.
@end deftypefn

@deftypefn Instruction {} logior s8:@var{dst} s8:@var{a} s8:@var{b}
Compute the bitwise inclusive @code{or} of @var{a} with @var{b}.
@end deftypefn

@deftypefn Instruction {} logxor s8:@var{dst} s8:@var{a} s8:@var{b}
Compute the bitwise exclusive @code{or} of @var{a} with @var{b}.
@end deftypefn

@deftypefn Instruction {} logsub s8:@var{dst} s8:@var{a} s8:@var{b}
Place the bitwise @code{and} of @var{a} and the bitwise @code{not} of
@var{b} into @var{dst}.
@end deftypefn
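
The effect of @code{logsub} can be written in terms of @code{logand}
and @code{lognot}.  The name @code{logsub-model} is used only for this
sketch.

@example
;; Clear in A every bit that is set in B.
(define (logsub-model a b)
  (logand a (lognot b)))

(logsub-model #b1100 #b1010)   ;; @result{} 4, that is #b0100
@end example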

@node Inlined Bytevector Instructions
@subsubsection Inlined Bytevector Instructions

Bytevector operations correspond closely to what the current hardware
can do, so it makes sense to inline them to VM instructions, providing
a clear path for eventual native compilation. Without this, Scheme
programs would need other primitives for accessing raw bytes -- but
these primitives are as good as any.

@deftypefn Instruction {} bv-length s12:@var{dst} s12:@var{src}
Store the length of the bytevector in @var{src} in @var{dst}, as an
unboxed unsigned 64-bit integer.
@end deftypefn

@deftypefn Instruction {} bv-u8-ref s8:@var{dst} s8:@var{src} s8:@var{idx}
@deftypefnx Instruction {} bv-s8-ref s8:@var{dst} s8:@var{src} s8:@var{idx}
@deftypefnx Instruction {} bv-u16-ref s8:@var{dst} s8:@var{src} s8:@var{idx}
@deftypefnx Instruction {} bv-s16-ref s8:@var{dst} s8:@var{src} s8:@var{idx}
@deftypefnx Instruction {} bv-u32-ref s8:@var{dst} s8:@var{src} s8:@var{idx}
@deftypefnx Instruction {} bv-s32-ref s8:@var{dst} s8:@var{src} s8:@var{idx}
@deftypefnx Instruction {} bv-u64-ref s8:@var{dst} s8:@var{src} s8:@var{idx}
@deftypefnx Instruction {} bv-s64-ref s8:@var{dst} s8:@var{src} s8:@var{idx}
@deftypefnx Instruction {} bv-f32-ref s8:@var{dst} s8:@var{src} s8:@var{idx}
@deftypefnx Instruction {} bv-f64-ref s8:@var{dst} s8:@var{src} s8:@var{idx}

Fetch the item at byte offset @var{idx} in the bytevector @var{src}, and
store it in @var{dst}.  All accesses use native endianness.

The @var{idx} value should be an unboxed unsigned 64-bit integer.

The results are all written to the stack as unboxed values, either as
signed 64-bit integers, unsigned 64-bit integers, or IEEE double
floating point numbers.
@end deftypefn

@deftypefn Instruction {} bv-u8-set! s8:@var{dst} s8:@var{idx} s8:@var{src}
@deftypefnx Instruction {} bv-s8-set! s8:@var{dst} s8:@var{idx} s8:@var{src}
@deftypefnx Instruction {} bv-u16-set! s8:@var{dst} s8:@var{idx} s8:@var{src}
@deftypefnx Instruction {} bv-s16-set! s8:@var{dst} s8:@var{idx} s8:@var{src}
@deftypefnx Instruction {} bv-u32-set! s8:@var{dst} s8:@var{idx} s8:@var{src}
@deftypefnx Instruction {} bv-s32-set! s8:@var{dst} s8:@var{idx} s8:@var{src}
@deftypefnx Instruction {} bv-u64-set! s8:@var{dst} s8:@var{idx} s8:@var{src}
@deftypefnx Instruction {} bv-s64-set! s8:@var{dst} s8:@var{idx} s8:@var{src}
@deftypefnx Instruction {} bv-f32-set! s8:@var{dst} s8:@var{idx} s8:@var{src}
@deftypefnx Instruction {} bv-f64-set! s8:@var{dst} s8:@var{idx} s8:@var{src}

Store @var{src} into the bytevector @var{dst} at byte offset @var{idx}.
Multibyte values are written using native endianness.

The @var{idx} value should be an unboxed unsigned 64-bit integer.

The @var{src} values are all unboxed: signed 64-bit integers, unsigned
64-bit integers, or IEEE double floating-point numbers, depending on the
instruction.
@end deftypefn
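
The store side can be sketched the same way with the corresponding
procedures from @code{(rnrs bytevectors)}.  As above, this models the
behavior at the Scheme level under the assumption that such calls may be
compiled to the inlined instructions; it is not a statement about what
the compiler emits in any particular case.

@example
(use-modules (rnrs bytevectors))

(define bv (make-bytevector 16 0))

;; Write a double into bytes 8 through 15, then read it back.
(bytevector-ieee-double-native-set! bv 8 1.5)
(bytevector-ieee-double-native-ref bv 8) @result{} 1.5

;; Bytes 0 through 7 are untouched.
(bytevector-u64-native-ref bv 0) @result{} 0
@end example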


@node Unboxed Integer Arithmetic
@subsubsection Unboxed Integer Arithmetic

Guile supports two kinds of unboxed integers: unsigned 64-bit integers,
and signed 64-bit integers.  Guile prefers unsigned integers, in the
sense that Guile's compiler supports them better and the virtual machine
has more operations that work on them.  Still, signed integers are
supported at least to allow @code{bv-s64-ref} and related instructions
to avoid boxing their values.

@deftypefn Instruction {} scm->u64 s12:@var{dst} s12:@var{src}
Unbox the SCM value at @var{src} to an unsigned 64-bit integer, placing
the result in @var{dst}.  If the @var{src} value is not an exact integer
in the unsigned 64-bit range, signal an error.
@end deftypefn
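
The range check can be modeled in Scheme roughly as follows.  The name
@code{u64-range?} is hypothetical and is not part of Guile; it only
spells out which values @code{scm->u64} would accept without signalling
an error.

@example
;; Hypothetical predicate: true if X could be unboxed by scm->u64.
(define (u64-range? x)
  (and (exact-integer? x)
       (<= 0 x (1- (ash 1 64)))))

(u64-range? 42)         @result{} #t
(u64-range? (ash 1 64)) @result{} #f
(u64-range? -1)         @result{} #f
(u64-range? 1.0)        @result{} #f
@end example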

@deftypefn Instruction {} u64->scm s12:@var{dst} s12:@var{src}
Box the unsigned 64-bit integer at @var{src} to a SCM value and place
the result in @var{dst}.  The result will be a fixnum or a bignum.
@end deftypefn

@deftypefn Instruction {} load-u64 s24:@var{dst} au32:@var{high-bits} au32:@var{low-bits}
Load a 64-bit value formed by joining @var{high-bits} and
@var{low-bits}, and write it to @var{dst}.
@end deftypefn

@deftypefn Instruction {} scm->s64 s12:@var{dst} s12:@var{src}
@deftypefnx Instruction {} s64->scm s12:@var{dst} s12:@var{src}
@deftypefnx Instruction {} load-s64 s24:@var{dst} as32:@var{high-bits} as32:@var{low-bits}
Like @code{scm->u64}, @code{u64->scm}, and @code{load-u64}, but for
signed 64-bit integers.
@end deftypefn

Sometimes the compiler can prove that only a subset of the bits in an
integer will be needed.  In that case it can unbox the integer even if
its value might be out of range.

@deftypefn Instruction {} scm->u64/truncate s12:@var{dst} s12:@var{src}
Take the SCM value in @var{src} and @code{logand} it with @code{(1- (ash
1 64))}.  Place the unboxed result in @var{dst}.
@end deftypefn
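
The effect of the truncation can be seen by applying the same
@code{logand} in Scheme.  This is a sketch of the arithmetic only; the
variable name @code{u64-mask} is introduced here for illustration.

@example
(define u64-mask (1- (ash 1 64)))

;; Values already in range are unchanged.
(logand 42 u64-mask)            @result{} 42

;; 2^64 is one past the largest value, so it wraps to 0.
(logand (1+ u64-mask) u64-mask) @result{} 0

;; Negative numbers keep their low 64 two's-complement bits.
(logand -1 u64-mask)            @result{} 18446744073709551615
@end example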

@deftypefn Instruction {} br-if-u64-= s24:@var{a} x8:@var{_} s24:@var{b} b1:@var{invert} x7:@var{_} l24:@var{offset}
@deftypefnx Instruction {} br-if-u64-< s24:@var{a} x8:@var{_} s24:@var{b} b1:@var{invert} x7:@var{_} l24:@var{offset}
@deftypefnx Instruction {} br-if-u64-<= s24:@var{a} x8:@var{_} s24:@var{b} b1:@var{invert} x7:@var{_} l24:@var{offset}
If the unboxed unsigned 64-bit integer value in @var{a} compares
@code{=}, @code{<}, or @code{<=}, respectively, to the unboxed unsigned
64-bit integer value in @var{b}, add @var{offset} to the current
instruction pointer.
@end deftypefn

@deftypefn Instruction {} br-if-u64-=-scm s24:@var{a} x8:@var{_} s24:@var{b} b1:@var{invert} x7:@var{_} l24:@var{offset}
@deftypefnx Instruction {} br-if-u64-<-scm s24:@var{a} x8:@var{_} s24:@var{b} b1:@var{invert} x7:@var{_} l24:@var{offset}
@deftypefnx Instruction {} br-if-u64-<=-scm s24:@var{a} x8:@var{_} s24:@var{b} b1:@var{invert} x7:@var{_} l24:@var{offset}
If the unboxed unsigned 64-bit integer value in @var{a} compares
@code{=}, @code{<}, or @code{<=}, respectively, to the SCM value in
@var{b}, add @var{offset} to the current instruction pointer.
@end deftypefn

@deftypefn Instruction {} uadd s8:@var{dst} s8:@var{a} s8:@var{b}
@deftypefnx Instruction {} usub s8:@var{dst} s8:@var{a} s8:@var{b}
@deftypefnx Instruction {} umul s8:@var{dst} s8:@var{a} s8:@var{b}
Like @code{add}, @code{sub}, and @code{mul}, except taking
the operands as unboxed unsigned 64-bit integers, and producing the
same.  The result will be silently truncated to 64 bits.
@end deftypefn
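
Silent truncation to 64 bits amounts to arithmetic modulo 2^64.  The
following Scheme procedures are hypothetical models written for this
illustration, not Guile APIs; they show the wraparound that boxed
@code{add}, @code{sub}, and @code{mul} would avoid by producing a bignum
or a negative number.

@example
(define two^64 (ash 1 64))

;; Hypothetical models of the unboxed operations.
(define (model-uadd a b) (modulo (+ a b) two^64))
(define (model-usub a b) (modulo (- a b) two^64))
(define (model-umul a b) (modulo (* a b) two^64))

(model-uadd (1- two^64) 1) @result{} 0
(model-usub 0 1)           @result{} 18446744073709551615
(model-umul (ash 1 63) 2)  @result{} 0
@end example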

@deftypefn Instruction {} uadd/immediate s8:@var{dst} s8:@var{a} c8:@var{b}
@deftypefnx Instruction {} usub/immediate s8:@var{dst} s8:@var{a} c8:@var{b}
@deftypefnx Instruction {} umul/immediate s8:@var{dst} s8:@var{a} c8:@var{b}
Like @code{uadd}, @code{usub}, and @code{umul}, except the second
operand is an immediate unsigned 8-bit integer.
@end deftypefn

@deftypefn Instruction {} ulogand s8:@var{dst} s8:@var{a} s8:@var{b}
@deftypefnx Instruction {} ulogior s8:@var{dst} s8:@var{a} s8:@var{b}
@deftypefnx Instruction {} ulogxor s8:@var{dst} s8:@var{a} s8:@var{b}
@deftypefnx Instruction {} ulogsub s8:@var{dst} s8:@var{a} s8:@var{b}
Like @code{logand}, @code{logior}, @code{logxor}, and @code{logsub}, but
operating on unboxed unsigned 64-bit integers.
@end deftypefn

@deftypefn Instruction {} ulsh s8:@var{dst} s8:@var{a} s8:@var{b}
Shift the unboxed unsigned 64-bit integer in @var{a} left by the number
of bits given in @var{b}, which is also an unboxed unsigned 64-bit
integer.  Truncate the result to 64 bits and write it to @var{dst} as an
unboxed value.  Only the lower 6 bits of @var{b} are used.
@end deftypefn
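
Because only the lower 6 bits of the shift count are used, a count of 64
behaves like a count of 0.  The procedure @code{model-ulsh} below is a
hypothetical Scheme model of that description, written for this
illustration only.

@example
;; Hypothetical model: mask the count to 6 bits, shift, then
;; keep the low 64 bits of the result.
(define (model-ulsh a b)
  (logand (ash a (logand b 63))
          (1- (ash 1 64))))

(model-ulsh 1 63) @result{} 9223372036854775808
(model-ulsh 1 64) @result{} 1                    ; count 64 masks to 0
(model-ulsh 3 63) @result{} 9223372036854775808  ; bit 64 is dropped
@end example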

@deftypefn Instruction {} ursh s8:@var{dst} s8:@var{a} s8:@var{b}
Like @code{ulsh}, but shifting right.
@end deftypefn

@deftypefn Instruction {} ulsh/immediate s8:@var{dst} s8:@var{a} c8:@var{b}
@deftypefnx Instruction {} ursh/immediate s8:@var{dst} s8:@var{a} c8:@var{b}
Like @code{ulsh} and @code{ursh}, but encoding @var{b} as an immediate
8-bit unsigned integer.
@end deftypefn


@node Unboxed Floating-Point Arithmetic
@subsubsection Unboxed Floating-Point Arithmetic

@deftypefn Instruction {} scm->f64 s12:@var{dst} s12:@var{src}
Unbox the SCM value at @var{src} to an IEEE double, placing the result
in @var{dst}.  If the @var{src} value is not a real number, signal an
error.
@end deftypefn

@deftypefn Instruction {} f64->scm s12:@var{dst} s12:@var{src}
Box the IEEE double at @var{src} to a SCM value and place the result in
@var{dst}.
@end deftypefn

@deftypefn Instruction {} load-f64 s24:@var{dst} au32:@var{high-bits} au32:@var{low-bits}
Load a 64-bit value formed by joining @var{high-bits} and
@var{low-bits}, and write it to @var{dst}.
@end deftypefn

@deftypefn Instruction {} fadd s8:@var{dst} s8:@var{a} s8:@var{b}
@deftypefnx Instruction {} fsub s8:@var{dst} s8:@var{a} s8:@var{b}
@deftypefnx Instruction {} fmul s8:@var{dst} s8:@var{a} s8:@var{b}
@deftypefnx Instruction {} fdiv s8:@var{dst} s8:@var{a} s8:@var{b}
Like @code{add}, @code{sub}, @code{mul}, and @code{div}, except taking
the operands as unboxed IEEE double floating-point numbers, and producing
the same.
@end deftypefn
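
Since the operands and results are IEEE doubles, these instructions
follow ordinary double-precision behavior.  The expressions below show
that behavior using Scheme's generic arithmetic on flonums; whether a
given expression is compiled to @code{fadd} or @code{fdiv} is an
assumption that depends on what the compiler can infer about the operand
types.

@example
;; Rounding error is visible, as with any IEEE double arithmetic.
(= (+ 0.1 0.2) 0.3) @result{} #f

;; Division by a flonum zero yields an infinity, not an error.
(/ 1.0 0.0)         @result{} +inf.0

;; An exact rational converts to the nearest double.
(exact->inexact 1/4) @result{} 0.25
@end example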