mirror of
https://git.savannah.gnu.org/git/guile.git
synced 2025-04-30 11:50:28 +02:00
* doc/ref/vm.texi: Update for new instructions. * doc/ref/web.texi: Update for URI-reference support. * NEWS: Update.
1353 lines
57 KiB
Text
@c -*-texinfo-*-
|
|
@c This is part of the GNU Guile Reference Manual.
|
|
@c Copyright (C) 2008,2009,2010,2011,2013,2015
|
|
@c Free Software Foundation, Inc.
|
|
@c See the file guile.texi for copying conditions.
|
|
|
|
@node A Virtual Machine for Guile
|
|
@section A Virtual Machine for Guile
|
|
|
|
Guile has both an interpreter and a compiler. To a user, the difference
|
|
is transparent---interpreted and compiled procedures can call each other
|
|
as they please.
|
|
|
|
The difference is that the compiler creates and interprets bytecode
|
|
for a custom virtual machine, instead of interpreting the
|
|
S-expressions directly. Loading and running compiled code is faster
|
|
than loading and running source code.
|
|
|
|
The virtual machine that does the bytecode interpretation is a part of
|
|
Guile itself. This section describes the nature of Guile's virtual
|
|
machine.
|
|
|
|
@menu
|
|
* Why a VM?::
|
|
* VM Concepts::
|
|
* Stack Layout::
|
|
* Variables and the VM::
|
|
* VM Programs::
|
|
* Object File Format::
|
|
* Instruction Set::
|
|
@end menu
|
|
|
|
@node Why a VM?
|
|
@subsection Why a VM?
|
|
|
|
@cindex interpreter
|
|
For a long time, Guile only had an interpreter. Guile's interpreter
|
|
operated directly on the S-expression representation of Scheme source
|
|
code.
|
|
|
|
But while the interpreter was highly optimized and hand-tuned, it still
|
|
performed many needless computations during the course of evaluating an
|
|
expression. For example, application of a function to arguments
|
|
needlessly consed up the arguments in a list. Evaluation of an
|
|
expression always had to figure out what the car of the expression is --
|
|
a procedure, a memoized form, or something else. All values had to be
|
|
allocated on the heap. Et cetera.
|
|
|
|
The solution to this problem was to compile the higher-level language,
|
|
Scheme, into a lower-level language for which all of the checks and
|
|
dispatching have already been done---the code is instead stripped to
|
|
the bare minimum needed to ``do the job''.
|
|
|
|
The question becomes then, what low-level language to choose? There
|
|
are many options. We could compile to native code directly, but that
|
|
poses portability problems for Guile, as it is a highly cross-platform
|
|
project.
|
|
|
|
So we want the performance gains that compilation provides, but we
|
|
also want to maintain the portability benefits of a single code path.
|
|
The obvious solution is to compile to a virtual machine that is
|
|
present on all Guile installations.
|
|
|
|
The easiest (and most fun) way to depend on a virtual machine is to
|
|
implement the virtual machine within Guile itself. This way the
|
|
virtual machine provides what Scheme needs (tail calls, multiple
|
|
values, @code{call/cc}) and can provide optimized inline instructions
|
|
for Guile (@code{cons}, @code{struct-ref}, etc.).
|
|
|
|
So this is what Guile does. The rest of this section describes the VM
|
|
that Guile implements, and the compiled procedures that run on it.
|
|
|
|
Before moving on, though, we should note that though we spoke of the
|
|
interpreter in the past tense, Guile still has an interpreter. The
|
|
difference is that before, it was Guile's main evaluator, and so was
|
|
implemented in highly optimized C; now, it is actually implemented in
|
|
Scheme, and compiled down to VM bytecode, just like any other program.
|
|
(There is still a C interpreter around, used to bootstrap the compiler,
|
|
but it is not normally used at runtime.)
|
|
|
|
The upside of implementing the interpreter in Scheme is that we preserve
|
|
tail calls and multiple-value handling between interpreted and compiled
|
|
code. The downside is that the interpreter in Guile 2.2 is still slower
|
|
than the interpreter in 1.8. We hope that the compiler's speed makes
|
|
up for the loss. In any case, once we have native compilation for
|
|
Scheme code, we expect the new self-hosted interpreter to beat the old
|
|
hand-tuned C implementation.
|
|
|
|
Also note that this decision to implement a bytecode compiler does not
|
|
preclude native compilation. We can compile from bytecode to native
|
|
code at runtime, or even do ahead-of-time compilation. More
|
|
possibilities are discussed in @ref{Extending the Compiler}.
|
|
|
|
@node VM Concepts
|
|
@subsection VM Concepts
|
|
|
|
Compiled code is run by a virtual machine (VM). Each thread has its own
|
|
VM. The virtual machine executes the sequence of instructions in a
|
|
procedure.
|
|
|
|
Each VM instruction starts by indicating which operation it is, and then
|
|
follows by encoding its source and destination operands. Each procedure
|
|
declares that it has some number of local variables, including the
|
|
function arguments. These local variables form the available operands
|
|
of the procedure, and are accessed by index.
|
|
|
|
The local variables for a procedure are stored on a stack. Calling a
|
|
procedure typically enlarges the stack, and returning from a procedure
|
|
shrinks it. Stack memory is exclusive to the virtual machine that owns
|
|
it.
|
|
|
|
In addition to their stacks, virtual machines also have access to the
|
|
global memory (modules, global bindings, etc.) that is shared among other
|
|
parts of Guile, including other VMs.
|
|
|
|
The registers that a VM has are as follows:
|
|
|
|
@itemize
|
|
@item ip - Instruction pointer
|
|
@item sp - Stack pointer
|
|
@item fp - Frame pointer
|
|
@end itemize
|
|
|
|
In other architectures, the instruction pointer is sometimes called the
|
|
``program counter'' (pc). This set of registers is pretty typical for
|
|
virtual machines; their exact meanings in the context of Guile's VM are
|
|
described in the next section.
|
|
|
|
@node Stack Layout
|
|
@subsection Stack Layout
|
|
|
|
The stack of Guile's virtual machine is composed of @dfn{frames}. Each
|
|
frame corresponds to the application of one compiled procedure, and
|
|
contains storage space for arguments, local variables, and some
|
|
bookkeeping information (such as what to do after the frame is
|
|
finished).
|
|
|
|
While the compiler is free to do whatever it wants to, as long as the
|
|
semantics of a computation are preserved, in practice every time you
|
|
call a function, a new frame is created. (The notable exception of
|
|
course is the tail call case, @pxref{Tail Calls}.)
|
|
|
|
The structure of the top stack frame is as follows:
|
|
|
|
@example
|
|
   /------------------\ <- top of stack
   | Local N-1        | <- sp
   | ...              |
   | Local 1          |
   | Local 0          | <- fp = SCM_FRAME_LOCALS_ADDRESS (fp)
   +==================+
   | Return address   |
   | Dynamic link     | <- fp - 2 = SCM_FRAME_LOWER_ADDRESS (fp)
   +==================+
   |                  | <- fp - 3 = SCM_FRAME_PREVIOUS_SP (fp)
|
|
@end example
|
|
|
|
In the above drawing, the stack grows upward. Usually the procedure
|
|
being applied is in local 0, followed by the arguments from local 1.
|
|
After that are enough slots to store the various lexically-bound and
|
|
temporary values that are needed in the function's application.
|
|
|
|
The @dfn{return address} is the @code{ip} that was in effect before this
|
|
program was applied. When we return from this activation frame, we will
|
|
jump back to this @code{ip}. Likewise, the @dfn{dynamic link} is the
|
|
@code{fp} in effect before this program was applied.
|
|
|
|
To prepare for a non-tail application, Guile's compiler will emit code that
|
|
shuffles the function to apply and its arguments into appropriate stack
|
|
slots, with two free slots below them. The call then initializes those
|
|
free slots with the current @code{ip} and @code{fp}, and updates
|
|
@code{ip} to point to the function entry, and @code{fp} to point to the
|
|
new call frame.
|
|
|
|
In this way, the dynamic link links the current frame to the previous
|
|
frame. Computing a stack trace involves traversing these frames.
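
The frame linkage described above can be sketched in C. This is a toy
model under stated assumptions, not Guile's implementation: the names
@code{toy_vm}, @code{toy_push_frame}, @code{toy_pop_frame} and
@code{toy_frame_depth} are hypothetical, stack slots are plain integer
words, @code{fp} and @code{sp} are array indices rather than pointers,
and each new frame is simply placed above the caller's topmost local.

@verbatim
```c
#include <assert.h>
#include <stdint.h>

#define TOY_STACK_WORDS 64

/* Toy VM state.  The stack grows upward, as in the diagram. */
struct toy_vm {
  intptr_t stack[TOY_STACK_WORDS];
  long fp;      /* index of local 0 of the current frame, or -1 */
  long sp;      /* index of the topmost local in use, or -1 */
  intptr_t ip;  /* stand-in for the instruction pointer */
};

static void toy_init(struct toy_vm *vm) {
  vm->fp = -1;
  vm->sp = -1;
  vm->ip = 0;
}

/* Enter a procedure: leave two words free, save the caller's fp
   (dynamic link) and ip (return address) in them, then update
   fp, sp and ip for the new frame. */
static void toy_push_frame(struct toy_vm *vm, long nlocals, intptr_t entry) {
  long new_fp = vm->sp + 3;
  assert(nlocals >= 1);
  assert(new_fp + nlocals <= TOY_STACK_WORDS);
  vm->stack[new_fp - 2] = vm->fp;  /* dynamic link */
  vm->stack[new_fp - 1] = vm->ip;  /* return address */
  vm->fp = new_fp;
  vm->sp = new_fp + nlocals - 1;
  vm->ip = entry;
}

/* Return: jump back to the saved ip and restore the caller's frame. */
static void toy_pop_frame(struct toy_vm *vm) {
  long fp = vm->fp;
  assert(fp >= 2);
  vm->ip = vm->stack[fp - 1];
  vm->sp = fp - 3;                 /* the previous sp, as in the diagram */
  vm->fp = (long) vm->stack[fp - 2];
}

/* Walk the dynamic links, as a backtrace would, and count the frames. */
static long toy_frame_depth(const struct toy_vm *vm) {
  long depth = 0;
  for (long fp = vm->fp; fp >= 0; fp = (long) vm->stack[fp - 2])
    depth++;
  return depth;
}
```
@end verbatim

Pushing two frames and then asking for the depth follows two dynamic
links before reaching the @code{-1} sentinel that marks the bottom of
the toy stack.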
|
|
|
|
@node Variables and the VM
|
|
@subsection Variables and the VM
|
|
|
|
Consider the following Scheme code as an example:
|
|
|
|
@example
|
|
(define (foo a)
|
|
(lambda (b) (list foo a b)))
|
|
@end example
|
|
|
|
Within the lambda expression, @code{foo} is a top-level variable,
|
|
@code{a} is a lexically captured variable, and @code{b} is a local
|
|
variable.
|
|
|
|
Another way to refer to @code{a} and @code{b} is to say that @code{a} is
|
|
a ``free'' variable, since it is not defined within the lambda, and
|
|
@code{b} is a ``bound'' variable. These are the terms used in the
|
|
@dfn{lambda calculus}, a mathematical notation for describing functions.
|
|
The lambda calculus is useful because it is a language in which to
|
|
reason precisely about functions and variables. It is especially good
|
|
at describing scope relations, and it is for that reason that we mention
|
|
it here.
|
|
|
|
Guile allocates all variables on the stack. When a lexically enclosed
|
|
procedure with free variables---a @dfn{closure}---is created, it copies
|
|
those variables into its free variable vector. References to free
|
|
variables are then redirected through the free variable vector.
|
|
|
|
If a variable is ever @code{set!}, however, it will need to be
|
|
heap-allocated instead of stack-allocated, so that different closures
|
|
that capture the same variable can see the same value. Also, this
|
|
allows continuations to capture a reference to the variable, instead
|
|
of to its value at one point in time. For these reasons, @code{set!}
|
|
variables are allocated in ``boxes''---actually, in variable cells.
|
|
@xref{Variables}, for more information. References to @code{set!}
|
|
variables are indirected through the boxes.
|
|
|
|
Thus, perhaps counterintuitively, what would seem ``closer to the
|
|
metal'', viz @code{set!}, actually forces an extra memory allocation
|
|
and indirection.
|
|
|
|
Going back to our example, @code{b} may be allocated on the stack, as
|
|
it is never mutated.
|
|
|
|
@code{a} may also be allocated on the stack, as it too is never
|
|
mutated. Within the enclosed lambda, its value will be copied into
|
|
(and referenced from) the free variables vector.
|
|
|
|
@code{foo} is a top-level variable, because @code{foo} is not
|
|
lexically bound in this example.
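
The difference between capturing a value and capturing a box can be
sketched in C. The types and functions below (@code{toy_box},
@code{toy_closure_by_copy}, @code{toy_closure_by_box} and their helpers)
are hypothetical and only illustrate the idea; they do not reflect
Guile's actual object layout.

@verbatim
```c
#include <assert.h>
#include <stdlib.h>

/* A box: a heap cell holding one value.  Variables that are ever
   assigned live in a cell like this. */
struct toy_box { long value; };

static struct toy_box *toy_make_box(long initial) {
  struct toy_box *box = malloc(sizeof *box);
  assert(box != NULL);
  box->value = initial;
  return box;
}

/* Closure over a variable that is never assigned: the value itself
   is copied into the closure's free variable slot. */
struct toy_closure_by_copy { long free0; };

/* Closure over an assigned variable: the free variable slot holds
   the box, so every closure sharing the box sees each update. */
struct toy_closure_by_box { struct toy_box *free0; };

static void toy_box_set(struct toy_closure_by_box *c, long v) {
  c->free0->value = v;
}

static long toy_box_ref(const struct toy_closure_by_box *c) {
  return c->free0->value;
}
```
@end verbatim

Two @code{toy_closure_by_box} values built from the same box observe
each other's writes, at the cost of one allocation and one extra
indirection per access. A @code{toy_closure_by_copy} needs neither,
which is safe only because the captured variable never changes.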
|
|
|
|
@node VM Programs
|
|
@subsection Compiled Procedures are VM Programs
|
|
|
|
By default, when you enter in expressions at Guile's REPL, they are
|
|
first compiled to bytecode. Then that bytecode is executed to produce a
|
|
value. If the expression evaluates to a procedure, the result of this
|
|
process is a compiled procedure.
|
|
|
|
A compiled procedure is a compound object consisting of its bytecode and
|
|
a reference to any captured lexical variables. In addition, when a
|
|
procedure is compiled, it has associated metadata written to side
|
|
tables, for instance a line number mapping, or its docstring. You can
|
|
pick apart these pieces with the accessors in @code{(system vm
|
|
program)}. @xref{Compiled Procedures}, for a full API reference.
|
|
|
|
A procedure may reference data that was statically allocated when the
|
|
procedure was compiled. For example, a pair of immediate objects
|
|
(@pxref{Immediate objects}) can be allocated directly in the memory
|
|
segment that contains the compiled bytecode, and accessed directly by
|
|
the bytecode.
|
|
|
|
Another use for statically allocated data is to serve as a cache for a
bytecode instruction. Top-level variable lookups are handled in this way. If the
|
|
@code{toplevel-box} instruction finds that it does not have a cached
|
|
variable for a top-level reference, it accesses other static data to
|
|
resolve the reference, and fills in the cache slot. Thereafter all
|
|
access to the variable goes through the cache cell. The variable's
|
|
value may change in the future, but the variable itself will not.
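
The caching scheme can be sketched in C as follows. This is an
illustration only: @code{toy_variable}, @code{toy_module},
@code{toy_resolve} and @code{toy_toplevel_box} are hypothetical names,
and a fixed table searched by name stands in for a real module.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_variable { const char *name; long value; };

/* A stand-in for a module: a fixed table of variables. */
static struct toy_variable toy_module[] = {
  { "foo", 1 },
  { "bar", 2 },
};

/* Counts slow-path lookups, so the effect of the cache is visible. */
static int toy_lookup_count = 0;

/* Slow path: find the variable by name. */
static struct toy_variable *toy_resolve(const char *name) {
  toy_lookup_count++;
  for (size_t i = 0; i < sizeof toy_module / sizeof toy_module[0]; i++)
    if (strcmp(toy_module[i].name, name) == 0)
      return &toy_module[i];
  return NULL;
}

/* Fast path: use the cache slot belonging to this reference, and
   fill it on the first miss.  The slot caches the variable (the
   location), not the variable's value. */
static struct toy_variable *toy_toplevel_box(struct toy_variable **cache,
                                             const char *name) {
  if (*cache == NULL)
    *cache = toy_resolve(name);
  return *cache;
}
```
@end verbatim

Because the slot holds the location rather than the value, a later
assignment to the variable is visible through the cache without
another lookup.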
|
|
|
|
We can see how these concepts tie together by disassembling the
|
|
@code{foo} function we defined earlier to see what is going on:
|
|
|
|
@smallexample
|
|
scheme@@(guile-user)> (define (foo a) (lambda (b) (list foo a b)))
|
|
scheme@@(guile-user)> ,x foo
|
|
Disassembly of #<procedure foo (a)> at #x203be34:
|
|
|
|
0 (assert-nargs-ee/locals 2 1) ;; 1 arg, 1 local at (unknown file):1:0
|
|
1 (make-closure 2 6 1) ;; anonymous procedure at #x203be50 (1 free var)
|
|
4 (free-set! 2 1 0) ;; free var 0
|
|
6 (return 2)
|
|
|
|
----------------------------------------
|
|
Disassembly of anonymous procedure at #x203be50:
|
|
|
|
0 (assert-nargs-ee/locals 2 3) ;; 1 arg, 3 locals at (unknown file):1:0
|
|
1 (toplevel-box 2 73 57 71 #t) ;; `foo'
|
|
6 (box-ref 2 2)
|
|
7 (make-short-immediate 3 772) ;; ()
|
|
8 (cons 3 1 3)
|
|
9 (free-ref 4 0 0) ;; free var 0
|
|
11 (cons 3 4 3)
|
|
12 (cons 2 2 3)
|
|
13 (return 2)
|
|
@end smallexample
|
|
|
|
First there's some prelude, where @code{foo} checks that it was called
|
|
with only 1 argument. Then at @code{ip} 1, we allocate a new closure
|
|
and store it in slot 2. The `6' in the @code{(make-closure 2 6 1)} is a
|
|
relative offset from the instruction pointer of the code for the
|
|
closure.
|
|
|
|
A closure is code with data. We already have the code part initialized;
|
|
what remains is to set the data. @code{Ip} 4 initializes free variable
|
|
0 in the new closure with the value from local variable 1, which
|
|
corresponds to the first argument of @code{foo}: `a'. Finally we return
|
|
the closure.
|
|
|
|
The second stanza disassembles the code for the closure. After the
|
|
prelude, we load the variable for the toplevel variable @code{foo} into
|
|
local variable 2. This lookup occurs lazily, the first time the
|
|
variable is actually referenced, and the location of the lookup is
|
|
cached so that future references are very cheap. @xref{Top-Level
|
|
Environment Instructions}, for more details. The @code{box-ref}
|
|
dereferences the variable cell, replacing the contents of local 2.
|
|
|
|
What follows is a sequence of conses to build up the result list.
|
|
@code{Ip} 7 makes the tail of the list. @code{Ip} 8 conses on the value
|
|
in local 1, corresponding to the first argument to the closure: `b'.
|
|
@code{Ip} 9 loads free variable 0 of local 0 -- the procedure being
|
|
called -- into slot 4, then @code{ip} 11 conses it onto the list.
|
|
Finally we cons local 2, containing the @code{foo} toplevel, onto the
|
|
front of the list, and we return it.
|
|
|
|
|
|
@node Object File Format
|
|
@subsection Object File Format
|
|
|
|
To compile a file to disk, we need a format in which to write the
|
|
compiled code to disk, and later load it into Guile. A good @dfn{object
|
|
file format} has a number of characteristics:
|
|
|
|
@itemize
|
|
@item Above all else, it should be very cheap to load a compiled file.
|
|
@item It should be possible to statically allocate constants in the
|
|
file. For example, a bytevector literal in source code can be emitted
|
|
directly into the object file.
|
|
@item The compiled file should enable maximum code and data sharing
|
|
between different processes.
|
|
@item The compiled file should contain debugging information, such as
|
|
line numbers, but that information should be separated from the code
|
|
itself. It should be possible to strip debugging information if space
|
|
is tight.
|
|
@end itemize
|
|
|
|
These characteristics are not specific to Scheme. Indeed, mainstream
|
|
languages like C and C++ have solved this issue many times in the past.
|
|
Guile builds on their work by adopting ELF, the object file format of
|
|
GNU and other Unix-like systems, as its object file format. Although
|
|
Guile uses ELF on all platforms, we do not use platform support for ELF.
|
|
Guile implements its own linker and loader. The advantage of using ELF
|
|
is not sharing code, but sharing ideas. ELF is simply a well-designed
|
|
object file format.
|
|
|
|
An ELF file has two meta-tables describing its contents. The first
|
|
meta-table is for the loader, and is called the @dfn{program table} or
|
|
sometimes the @dfn{segment table}. The program table divides the file
|
|
into big chunks that should be treated differently by the loader.
|
|
Mostly the difference between these @dfn{segments} is their
|
|
permissions.
|
|
|
|
Typically all segments of an ELF file are marked as read-only, except
|
|
the part that represents modifiable static data or static data that
|
|
needs load-time initialization. Loading an ELF file is as simple as
|
|
mmapping the thing into memory with read-only permissions, then using
|
|
the segment table to mark a small sub-region of the file as writable.
|
|
This writable section is typically added to the root set of the garbage
|
|
collector as well.
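
The map-then-adjust-permissions step can be sketched with the POSIX
@code{mmap} and @code{mprotect} calls. This is not Guile's loader: the
helper names @code{toy_page_size} and @code{toy_map_image} are
hypothetical, an anonymous mapping stands in for the object file, and
the writable region is assumed to be exactly one page.

@verbatim
```c
#define _DEFAULT_SOURCE  /* exposes MAP_ANONYMOUS under strict glibc modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

static size_t toy_page_size(void) {
  long page = sysconf(_SC_PAGESIZE);
  assert(page > 0);
  return (size_t) page;
}

/* Map npages pages read-only, then mark the page with index
   writable_page as read-write.  Returns the base address, or NULL
   on failure. */
static unsigned char *toy_map_image(size_t npages, size_t writable_page) {
  size_t page = toy_page_size();
  assert(writable_page < npages);

  void *base = mmap(NULL, npages * page, PROT_READ,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (base == MAP_FAILED)
    return NULL;

  /* Only the sub-region holding modifiable data becomes writable. */
  if (mprotect((unsigned char *) base + writable_page * page, page,
               PROT_READ | PROT_WRITE) != 0) {
    munmap(base, npages * page);
    return NULL;
  }
  return base;
}
```
@end verbatim

A real loader would map the file itself and take the writable range
from the segment table instead of a page index.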
|
|
|
|
One ELF segment is marked as ``dynamic'', meaning that it has data of
|
|
interest to the loader. Guile uses this segment to record the Guile
|
|
version corresponding to this file. There is also an entry in the
|
|
dynamic segment that points to the address of an initialization thunk
|
|
that is run to perform any needed link-time initialization. (This is
|
|
like dynamic relocations for normal ELF shared objects, except that we
|
|
compile the relocations as a procedure instead of having the loader
|
|
interpret a table of relocations.) Finally, the dynamic segment marks
|
|
the location of the ``entry thunk'' of the object file. This thunk is
|
|
returned to the caller of @code{load-thunk-from-memory} or
|
|
@code{load-thunk-from-file}. When called, it will execute the ``body''
|
|
of the compiled expression.
|
|
|
|
The other meta-table in an ELF file is the @dfn{section table}. Whereas
|
|
the program table divides an ELF file into big chunks for the loader,
|
|
the section table specifies small sections for use by introspective
|
|
tools such as debuggers. One segment (program table entry)
|
|
typically contains many sections. There may be sections outside of any
|
|
segment, as well.
|
|
|
|
Typical sections in a Guile @code{.go} file include:
|
|
|
|
@table @code
|
|
@item .rtl-text
|
|
Bytecode.
|
|
@item .data
|
|
Data that needs initialization, or which may be modified at runtime.
|
|
@item .rodata
|
|
Statically allocated data that needs no run-time initialization, and
|
|
which therefore can be shared between processes.
|
|
@item .dynamic
|
|
The dynamic section, discussed above.
|
|
@item .symtab
|
|
@itemx .strtab
|
|
A table mapping addresses in the @code{.rtl-text} to procedure names.
|
|
@code{.strtab} is used by @code{.symtab}.
|
|
@item .guile.procprops
|
|
@itemx .guile.arities
|
|
@itemx .guile.arities.strtab
|
|
@itemx .guile.docstrs
|
|
@itemx .guile.docstrs.strtab
|
|
Side tables of procedure properties, arities, and docstrings.
|
|
@item .debug_info
|
|
@itemx .debug_abbrev
|
|
@itemx .debug_str
|
|
@itemx .debug_loc
|
|
@itemx .debug_line
|
|
Debugging information, in DWARF format. See the DWARF specification for
more information.
|
|
@item .shstrtab
|
|
Section name string table.
|
|
@end table
|
|
|
|
For more information, see @uref{http://linux.die.net/man/5/elf,,the
|
|
elf(5) man page}. See @uref{http://dwarfstd.org/,the DWARF
|
|
specification} for more on the DWARF debugging format. Or if you are an
|
|
adventurous explorer, try running @code{readelf} or @code{objdump} on
|
|
compiled @code{.go} files. It's good times!
|
|
|
|
|
|
@node Instruction Set
|
|
@subsection Instruction Set
|
|
|
|
There are currently about 130 instructions in Guile's virtual machine.
|
|
These instructions represent atomic units of a program's execution.
|
|
Ideally, they perform one task without conditional branches, then
|
|
dispatch to the next instruction in the stream.
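
A dispatch loop of this kind can be sketched in C. The three opcodes
below (@code{TOY_OP_HALT}, @code{TOY_OP_LOAD_IMM}, @code{TOY_OP_ADD})
and the function @code{toy_run} are invented for illustration and are
not Guile instructions. The sketch assumes that the opcode sits in the
low 8 bits of the first 32-bit word of each instruction.

@verbatim
```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opcodes for a toy machine. */
enum { TOY_OP_HALT = 0, TOY_OP_LOAD_IMM = 1, TOY_OP_ADD = 2 };

/* Execute instructions starting at ip, operating on locals[].
   Returns the number of instructions executed, including the halt. */
static int toy_run(const uint32_t *ip, int32_t *locals) {
  int executed = 0;
  for (;;) {
    uint32_t w = ip[0];
    executed++;
    switch (w & 0xff) {      /* opcode: low 8 bits of the first word */
    case TOY_OP_HALT:
      return executed;
    case TOY_OP_LOAD_IMM:    /* one word: u12 dst, u12 immediate */
      locals[(w >> 8) & 0xfff] = (int32_t) (w >> 20);
      ip += 1;
      break;
    case TOY_OP_ADD:         /* two words: u12 dst, u12 a; x8, u24 b */
      locals[(w >> 8) & 0xfff] =
        locals[(w >> 20) & 0xfff] + locals[ip[1] >> 8];
      ip += 2;
      break;
    default:
      assert(!"unknown toy opcode");
      return executed;
    }
  }
}
```
@end verbatim

Each case does its work and advances @code{ip} by the length of its
own instruction, so the loop always lands on the first word of the
next instruction.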
|
|
|
|
Instructions themselves are composed of 1 or more 32-bit units. The low
|
|
8 bits of the first word indicate the opcode, and the rest of the
instruction describes the operands. There are a number of different ways
|
|
operands can be encoded.
|
|
|
|
@table @code
|
|
@item u@var{n}
|
|
An unsigned @var{n}-bit integer. Usually indicates the index of a local
|
|
variable, but some instructions interpret these operands as immediate
|
|
values.
|
|
@item l24
|
|
An offset from the current @code{ip}, in 32-bit units, as a signed
|
|
24-bit value. Indicates a bytecode address, for a relative jump.
|
|
@item i16
|
|
@itemx i32
|
|
An immediate Scheme value (@pxref{Immediate objects}), encoded directly
|
|
in 16 or 32 bits.
|
|
@item a32
|
|
@itemx b32
|
|
An immediate Scheme value, encoded as a pair of 32-bit words.
|
|
@code{a32} and @code{b32} values always go together on the same opcode,
|
|
and indicate the high and low bits, respectively. Normally only used on
|
|
64-bit systems.
|
|
@item n32
|
|
A statically allocated non-immediate. The address of the non-immediate
|
|
is encoded as a signed 32-bit integer, and indicates a relative offset
|
|
in 32-bit units. Think of it as @code{SCM x = ip + offset}.
|
|
@item s32
|
|
Indirect Scheme value, like @code{n32} but indirected. Think of it as
|
|
@code{SCM *x = ip + offset}.
|
|
@item l32
|
|
@itemx lo32
|
|
An ip-relative address, as a signed 32-bit integer. Could indicate a
|
|
bytecode address, as in @code{make-closure}, or a non-immediate address,
|
|
as with @code{static-patch!}.
|
|
|
|
@code{l32} and @code{lo32} are the same from the perspective of the
|
|
virtual machine. The difference is that an assembler might want to
|
|
allow an @code{lo32} address to be specified as a label and then some
|
|
number of words offset from that label, for example when patching a
|
|
field of a statically allocated object.
|
|
@item b1
|
|
A boolean value: 1 for true, otherwise 0.
|
|
@item x@var{n}
|
|
An ignored sequence of @var{n} bits.
|
|
@end table
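
As an illustration of the @code{l24} encoding, the following C sketch
packs and unpacks a signed 24-bit offset. It assumes the offset
occupies the 24 bits above the opcode in a one-word instruction. The
helper names @code{toy_encode_l24}, @code{toy_decode_l24} and
@code{toy_branch_target} are hypothetical, and the opcode value passed
to the encoder is arbitrary.

@verbatim
```c
#include <assert.h>
#include <stdint.h>

/* Pack an opcode and a signed 24-bit offset into one 32-bit word. */
static uint32_t toy_encode_l24(uint8_t opcode, int32_t offset) {
  assert(offset >= -0x800000 && offset <= 0x7fffff);
  return ((uint32_t) offset << 8) | opcode;
}

/* Recover the signed offset: take the upper 24 bits, then
   sign-extend by hand so the result does not depend on how the
   compiler shifts negative numbers. */
static int32_t toy_decode_l24(uint32_t word) {
  uint32_t raw = word >> 8;
  return (raw & 0x800000u) ? (int32_t) raw - 0x1000000 : (int32_t) raw;
}

/* The jump target is relative to the current ip, in 32-bit units. */
static const uint32_t *toy_branch_target(const uint32_t *ip) {
  return ip + toy_decode_l24(*ip);
}
```
@end verbatim

Since @code{ip} points to 32-bit words, ordinary pointer arithmetic
already counts the offset in 32-bit units.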
|
|
|
|
An instruction is specified by giving its name, then describing its
|
|
operands. The operands are packed by 32-bit words, with earlier
|
|
operands occupying the lower bits.
|
|
|
|
For example, consider the following instruction specification:
|
|
|
|
@deftypefn Instruction {} free-set! u12:@var{dst} u12:@var{src} x8:@var{_} u24:@var{idx}
|
|
Set free variable @var{idx} of the closure @var{dst} to @var{src}.
|
|
@end deftypefn
|
|
|
|
The first word in the instruction will start with the 8-bit value
|
|
corresponding to the @code{free-set!} opcode in the low bits, followed by
|
|
@var{dst} and @var{src} as 12-bit values. The second word starts with 8
|
|
dead bits, followed by the index as a 24-bit immediate value.
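
The two-word layout just described can be written out as a C sketch.
The helper names are hypothetical, and the opcode number used with
them is a placeholder, since the text does not give the numeric
opcode of @code{free-set!}.

@verbatim
```c
#include <assert.h>
#include <stdint.h>

/* First word: opcode in bits 0-7, first u12 operand in bits 8-19,
   second u12 operand in bits 20-31. */
static uint32_t toy_pack_op_u12_u12(uint8_t opcode, uint32_t a, uint32_t b) {
  assert(a < (1u << 12) && b < (1u << 12));
  return (uint32_t) opcode | (a << 8) | (b << 20);
}

/* Second word: 8 ignored low bits, then a 24-bit operand. */
static uint32_t toy_pack_x8_u24(uint32_t value) {
  assert(value < (1u << 24));
  return value << 8;
}

static uint8_t toy_opcode(uint32_t word) { return (uint8_t) (word & 0xff); }
static uint32_t toy_u12_first(uint32_t word) { return (word >> 8) & 0xfff; }
static uint32_t toy_u12_second(uint32_t word) { return (word >> 20) & 0xfff; }
static uint32_t toy_u24(uint32_t word) { return word >> 8; }
```
@end verbatim

With a placeholder opcode of @code{0x2a}, the instruction
@code{(free-set! 2 1 0)} would occupy the two words
@code{toy_pack_op_u12_u12 (0x2a, 2, 1)} and @code{toy_pack_x8_u24 (0)}.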
|
|
|
|
Sometimes the compiler can figure out that it is compiling a special
|
|
case that can be run more efficiently. So, for example, while Guile
|
|
offers a generic test-and-branch instruction, it also offers specific
|
|
instructions for special cases, so that the following cases all have
|
|
their own test-and-branch instructions:
|
|
|
|
@example
|
|
(if pred then else)
|
|
(if (not pred) then else)
|
|
(if (null? l) then else)
|
|
(if (not (null? l)) then else)
|
|
@end example
|
|
|
|
In addition, some Scheme primitives have their own inline
|
|
implementations. For example, in the previous section we saw
|
|
@code{cons}.
|
|
|
|
Guile's instruction set is a @emph{complete} instruction set, in that it
|
|
provides the instructions that are suited to the problem, and is not
|
|
concerned with making a minimal, orthogonal set of instructions. More
|
|
instructions may be added over time.
|
|
|
|
@menu
|
|
* Lexical Environment Instructions::
|
|
* Top-Level Environment Instructions::
|
|
* Procedure Call and Return Instructions::
|
|
* Function Prologue Instructions::
|
|
* Trampoline Instructions::
|
|
* Branch Instructions::
|
|
* Constant Instructions::
|
|
* Dynamic Environment Instructions::
|
|
* Miscellaneous Instructions::
|
|
* Inlined Scheme Instructions::
|
|
* Inlined Mathematical Instructions::
|
|
* Inlined Bytevector Instructions::
|
|
@end menu
|
|
|
|
|
|
@node Lexical Environment Instructions
|
|
@subsubsection Lexical Environment Instructions
|
|
|
|
These instructions access and mutate the lexical environment of a
|
|
compiled procedure---its free and bound variables. @xref{Stack Layout},
|
|
for more information on the format of stack frames.
|
|
|
|
@deftypefn Instruction {} mov u12:@var{dst} u12:@var{src}
|
|
@deftypefnx Instruction {} long-mov u24:@var{dst} x8:@var{_} u24:@var{src}
|
|
Copy a value from one local slot to another.
|
|
|
|
As discussed previously, procedure arguments and local variables are
|
|
allocated to local slots. Guile's compiler tries to avoid shuffling
|
|
variables around to different slots, which often makes @code{mov}
|
|
instructions redundant. However, there are some cases in which shuffling
|
|
is necessary, and in those cases, @code{mov} is the thing to use.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} make-closure u24:@var{dst} l32:@var{offset} x8:@var{_} u24:@var{nfree}
|
|
Make a new closure, and write it to @var{dst}. The code for the closure
|
|
will be found at @var{offset} words from the current @code{ip}.
|
|
@var{offset} is a signed 32-bit integer. Space for @var{nfree} free
|
|
variables will be allocated.
|
|
|
|
The size of a closure is currently two words, plus one word per free
|
|
variable.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} free-ref u12:@var{dst} u12:@var{src} x8:@var{_} u24:@var{idx}
|
|
Load free variable @var{idx} from the closure @var{src} into local slot
|
|
@var{dst}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} free-set! u12:@var{dst} u12:@var{src} x8:@var{_} u24:@var{idx}
|
|
Set free variable @var{idx} of the closure @var{dst} to @var{src}.
|
|
|
|
This instruction is usually used when initializing a closure's free
|
|
variables, but not to mutate free variables, as variables that are
|
|
assigned are boxed.
|
|
@end deftypefn
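
The three closure instructions can be modelled in C as follows. The
struct layout is an assumption chosen to match the ``two words plus
one word per free variable'' size given above. It is not Guile's
actual representation, and every @code{toy_} name is hypothetical.

@verbatim
```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Two fixed words, then one word per free variable. */
struct toy_closure {
  uintptr_t nfree;        /* stand-in for a header word */
  const uint32_t *code;   /* entry point of the closure's bytecode */
  uintptr_t free_vars[];  /* the free variable slots */
};

static size_t toy_closure_size(size_t nfree) {
  return sizeof(struct toy_closure) + nfree * sizeof(uintptr_t);
}

/* Like make-closure: the code lives offset words away from ip. */
static struct toy_closure *toy_make_closure(const uint32_t *ip,
                                            int32_t offset, size_t nfree) {
  struct toy_closure *c = calloc(1, toy_closure_size(nfree));
  assert(c != NULL);
  c->nfree = nfree;
  c->code = ip + offset;
  return c;
}

/* Like free-set!: initialize free variable idx. */
static void toy_free_set(struct toy_closure *c, size_t idx, uintptr_t v) {
  assert(idx < c->nfree);
  c->free_vars[idx] = v;
}

/* Like free-ref: read free variable idx. */
static uintptr_t toy_free_ref(const struct toy_closure *c, size_t idx) {
  assert(idx < c->nfree);
  return c->free_vars[idx];
}
```
@end verbatim

In the disassembly of @code{foo} shown earlier,
@code{(make-closure 2 6 1)} sits at word 1, so the closure's code
starts at word 7. In this model that is
@code{toy_make_closure (&code[1], 6, 1)}.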
|
|
|
|
Recall that variables that are assigned are usually allocated in boxes,
|
|
so that continuations and closures can capture their identity and not
|
|
their value at one point in time. Variables are also used in the
|
|
implementation of top-level bindings; see the next section for more
|
|
information.
|
|
|
|
@deftypefn Instruction {} box u12:@var{dst} u12:@var{src}
|
|
Create a new variable holding @var{src}, and place it in @var{dst}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} box-ref u12:@var{dst} u12:@var{src}
|
|
Unpack the variable at @var{src} into @var{dst}, asserting that the
|
|
variable is actually bound.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} box-set! u12:@var{dst} u12:@var{src}
|
|
Set the contents of the variable at @var{dst} to @var{src}.
|
|
@end deftypefn
|
|
|
|
|
|
@node Top-Level Environment Instructions
|
|
@subsubsection Top-Level Environment Instructions
|
|
|
|
These instructions access values in the top-level environment: bindings
|
|
that were not lexically apparent at the time that the code in question
|
|
was compiled.
|
|
|
|
The location in which a toplevel binding is stored can be looked up once
|
|
and cached for later. The binding itself may change over time, but its
|
|
location will stay constant.
|
|
|
|
@deftypefn Instruction {} current-module u24:@var{dst}
|
|
Store the current module in @var{dst}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} resolve u24:@var{dst} b1:@var{bound?} x7:@var{_} u24:@var{sym}
|
|
Resolve @var{sym} in the current module, and place the resulting
|
|
variable in @var{dst}. An error will be signalled if no variable is
|
|
found. If @var{bound?} is true, an error will be signalled if the
|
|
variable is unbound.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} define! u12:@var{sym} u12:@var{val}
|
|
Look up a binding for @var{sym} in the current module, creating it if
|
|
necessary. Set its value to @var{val}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} toplevel-box u24:@var{dst} s32:@var{var-offset} s32:@var{mod-offset} n32:@var{sym-offset} b1:@var{bound?} x31:@var{_}
|
|
Load a value. The value will be fetched from memory, @var{var-offset}
|
|
32-bit words away from the current instruction pointer.
|
|
@var{var-offset} is a signed value. Up to here, @code{toplevel-box} is
|
|
like @code{static-ref}.
|
|
|
|
Then, if the loaded value is a variable, it is placed in @var{dst}, and
|
|
control flow continues.
|
|
|
|
Otherwise, we have to resolve the variable. In that case we load the
|
|
module from @var{mod-offset}, just as we loaded the variable. Usually
|
|
the module gets set when the closure is created. @var{sym-offset}
|
|
specifies the name, as an offset to a symbol.
|
|
|
|
We use the module and the symbol to resolve the variable, placing it in
|
|
@var{dst}, and caching the resolved variable so that we will hit the
|
|
cache next time. If @var{bound?} is true, an error will be signalled if
|
|
the variable is unbound.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} module-box u24:@var{dst} s32:@var{var-offset} n32:@var{mod-offset} n32:@var{sym-offset} b1:@var{bound?} x31:@var{_}
|
|
Like @code{toplevel-box}, except @var{mod-offset} points at a module
|
|
identifier instead of the module itself. A module identifier is a
|
|
module name, as a list, prefixed by a boolean. If the prefix is true,
|
|
then the variable is resolved relative to the module's public interface
|
|
instead of its private interface.
|
|
@end deftypefn
|
|
|
|
|
|
@node Procedure Call and Return Instructions
|
|
@subsubsection Procedure Call and Return Instructions
|
|
|
|
As described earlier (@pxref{Stack Layout}), Guile's calling convention
|
|
is that arguments are passed and values returned on the stack.
|
|
|
|
For calls, both in tail position and in non-tail position, we require
|
|
that the procedure and the arguments already be shuffled into place
|
|
before the call instruction. ``Into place'' for a tail call means that
|
|
the procedure should be in slot 0, and the arguments should follow. For
|
|
a non-tail call, if the procedure is in slot @var{n}, the arguments
|
|
should follow from slot @var{n}+1, and there should be two free slots at
|
|
@var{n}-1 and @var{n}-2 in which to save the @code{ip} and @code{fp}.
|
|
|
|
Returning values is similar. Multiple-value returns should have values
|
|
already shuffled down to start from slot 1 before emitting
|
|
@code{return-values}. There is a short-cut in the single-value case, in
|
|
that @code{return} handles the trivial shuffling itself. We start from
|
|
slot 1 instead of slot 0 to make tail calls to @code{values} trivial.
|
|
|
|
In both calls and returns, the @code{sp} is used to indicate to the
|
|
callee or caller the number of arguments or return values, respectively.
|
|
After receiving return values, it is the caller's responsibility to
|
|
@dfn{restore the frame} by resetting the @code{sp} to its former value.
|
|
|
|
@deftypefn Instruction {} call u24:@var{proc} x8:@var{_} u24:@var{nlocals}
|
|
Call a procedure. @var{proc} is the local corresponding to a procedure.
|
|
The two values below @var{proc} will be overwritten by the saved call
|
|
frame data. The new frame will have space for @var{nlocals} locals: one
|
|
for the procedure, and the rest for the arguments which should already
|
|
have been pushed on.
|
|
|
|
When the call returns, execution proceeds with the next instruction.
|
|
There may be any number of values on the return stack; the precise
|
|
number can be had by subtracting the address of @var{proc} from the
|
|
post-call @code{sp}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} call-label u24:@var{proc} x8:@var{_} u24:@var{nlocals} l32:@var{label}
|
|
Call a procedure in the same compilation unit.
|
|
|
|
This instruction is just like @code{call}, except that instead of
|
|
dereferencing @var{proc} to find the call target, the call target is
|
|
known to be at @var{label}, a signed 32-bit offset in 32-bit units from
|
|
the current @code{ip}. Since @var{proc} is not dereferenced, it may be
|
|
some other representation of the closure.
|
|
@end deftypefn
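
Because @var{label} counts 32-bit units rather than bytes, resolving it amounts to ordinary pointer arithmetic on a pointer to 32-bit words. The following C sketch shows this; @code{label_target} is a hypothetical helper, not a libguile function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Resolve the target of a label operand.  IP points at the current
   instruction; LABEL counts 32-bit words, not bytes, and is negative
   for a backward reference. */
static const uint32_t *
label_target (const uint32_t *ip, int32_t label)
{
  assert (ip != NULL);
  return ip + label;
}
```

A label of 4 therefore moves the instruction pointer forward by 16 bytes.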
|
|
|
|
@deftypefn Instruction {} tail-call u24:@var{nlocals}
|
|
Tail-call a procedure. Requires that the procedure and all of the
|
|
arguments have already been shuffled into position. Will reset the
|
|
frame to @var{nlocals}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} tail-call-label u24:@var{nlocals} l32:@var{label}
|
|
Tail-call a known procedure. As @code{call} is to @code{call-label},
|
|
@code{tail-call} is to @code{tail-call-label}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} tail-call/shuffle u24:@var{from}
|
|
Tail-call a procedure. The procedure should already be set to slot 0.
|
|
The rest of the args are taken from the frame, starting at @var{from},
|
|
shuffled down to start at slot 1. This is part of the implementation of
|
|
the @code{call-with-values} builtin.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} receive u12:@var{dst} u12:@var{proc} x8:@var{_} u24:@var{nlocals}
|
|
Receive a single return value from a call whose procedure was in
|
|
@var{proc}, asserting that the call actually returned at least one
|
|
value. Afterwards, resets the frame to @var{nlocals} locals.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} receive-values u24:@var{proc} b1:@var{allow-extra?} x7:@var{_} u24:@var{nvalues}
|
|
Receive a return of multiple values from a call whose procedure was in
|
|
@var{proc}. If fewer than @var{nvalues} values were returned, signal an
|
|
error. Unless @var{allow-extra?} is true, require that the number of
|
|
return values equals @var{nvalues} exactly. After @code{receive-values}
|
|
has run, the values can be copied down via @code{mov}, or used in place.
|
|
@end deftypefn
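
The arity rule for @code{receive-values} can be written as a small predicate. The C sketch below uses a hypothetical function name and only models the check itself, not the error signalling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true when NRETURNED values satisfy a receive-values
   instruction expecting NVALUES.  Too few values is always an error;
   extra values are tolerated only when ALLOW_EXTRA is set. */
static bool
receive_values_ok (uint32_t nreturned, uint32_t nvalues, bool allow_extra)
{
  assert (nvalues < (1u << 24));   /* the operand is 24 bits wide */
  if (nreturned < nvalues)
    return false;
  return allow_extra || nreturned == nvalues;
}
```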
|
|
|
|
@deftypefn Instruction {} return u24:@var{src}
|
|
Return a value.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} return-values x24:@var{_}
|
|
Return a number of values from a call frame. This opcode corresponds to
|
|
an application of @code{values} in tail position. As with tail calls,
|
|
we expect that the values have already been shuffled down to a
|
|
contiguous array starting at slot 1. We also expect the frame has
|
|
already been reset.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} call/cc x24:@var{_}
|
|
Capture the current continuation, and tail-apply the procedure in local
|
|
slot 1 to it. This instruction is part of the implementation of
|
|
@code{call/cc}, and is not generated by the compiler.
|
|
@end deftypefn
|
|
|
|
|
|
@node Function Prologue Instructions
|
|
@subsubsection Function Prologue Instructions
|
|
|
|
A function call in Guile is very cheap: the VM simply hands control to
|
|
the procedure. The procedure itself is responsible for asserting that it
|
|
has been passed an appropriate number of arguments. This strategy allows
|
|
arbitrarily complex argument parsing idioms to be developed, without
|
|
harming the common case.
|
|
|
|
For example, only calls to keyword-argument procedures ``pay'' for the
|
|
cost of parsing keyword arguments. (At the time of this writing, calling
|
|
procedures with keyword arguments is typically two to four times as
|
|
costly as calling procedures with a fixed set of arguments.)
|
|
|
|
@deftypefn Instruction {} assert-nargs-ee u24:@var{expected}
|
|
@deftypefnx Instruction {} assert-nargs-ge u24:@var{expected}
|
|
@deftypefnx Instruction {} assert-nargs-le u24:@var{expected}
|
|
If the number of actual arguments is not @code{==}, @code{>=}, or
|
|
@code{<=} @var{expected}, respectively, signal an error.
|
|
|
|
The number of arguments is determined by subtracting the frame pointer
|
|
from the stack pointer (@code{sp + 1 - fp}). @xref{Stack Layout}, for
|
|
more details on stack frames. Note that @var{expected} includes the
|
|
procedure itself.
|
|
@end deftypefn
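
As a rough C sketch of the three assertions, assuming the @code{sp + 1 - fp} formula given above: @code{slot_t} and the function names are hypothetical, and each predicate returns true when the corresponding instruction would not signal an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t slot_t;   /* hypothetical slot type, not Guile's SCM */

/* Number of slots in the frame, procedure included: sp + 1 - fp. */
static size_t
frame_nargs (const slot_t *fp, const slot_t *sp)
{
  assert (sp + 1 >= fp);
  return (size_t) (sp + 1 - fp);
}

/* One predicate per instruction. */
static bool nargs_ee (size_t n, uint32_t expected) { return n == expected; }
static bool nargs_ge (size_t n, uint32_t expected) { return n >= expected; }
static bool nargs_le (size_t n, uint32_t expected) { return n <= expected; }
```

A procedure called with two arguments has a count of 3, since the procedure itself occupies slot 0.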
|
|
|
|
@deftypefn Instruction {} br-if-nargs-ne u24:@var{expected} x8:@var{_} l24:@var{offset}
|
|
@deftypefnx Instruction {} br-if-nargs-lt u24:@var{expected} x8:@var{_} l24:@var{offset}
|
|
@deftypefnx Instruction {} br-if-nargs-gt u24:@var{expected} x8:@var{_} l24:@var{offset}
|
|
If the number of actual arguments is not equal, less than, or greater
|
|
than @var{expected}, respectively, add @var{offset}, a signed 24-bit
|
|
number, to the current instruction pointer. Note that @var{expected}
|
|
includes the procedure itself.
|
|
|
|
These instructions are used to implement multiple arities, as in
|
|
@code{case-lambda}. @xref{Case-lambda}, for more information.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} alloc-frame u24:@var{nlocals}
|
|
Ensure that there is space on the stack for @var{nlocals} local
|
|
variables, setting them all to @code{SCM_UNDEFINED}, except those values
|
|
that are already on the stack.
|
|
@end deftypefn
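
The effect of @code{alloc-frame} on the slots of a frame can be sketched in C as follows. @code{SLOT_UNDEFINED} is a stand-in for @code{SCM_UNDEFINED}, and the @var{capacity} argument is a simplification that models the stack-space check; neither is taken from libguile.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t slot_t;              /* hypothetical slot type */
#define SLOT_UNDEFINED ((slot_t) -1)   /* stand-in for SCM_UNDEFINED */

/* Grow a frame that currently holds NARGS live slots to NLOCALS slots.
   Values already on the stack are left alone; only the newly exposed
   slots are initialized.  Returns the new slot count. */
static size_t
alloc_frame (slot_t *fp, size_t nargs, size_t nlocals, size_t capacity)
{
  assert (nlocals <= capacity);
  for (size_t i = nargs; i < nlocals; i++)
    fp[i] = SLOT_UNDEFINED;
  return nlocals;
}
```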
|
|
|
|
@deftypefn Instruction {} reset-frame u24:@var{nlocals}
|
|
Like @code{alloc-frame}, but doesn't check that the stack is big enough,
|
|
and doesn't initialize values to @code{SCM_UNDEFINED}. Used to reset
|
|
the frame size to something less than the size that was previously set
|
|
via @code{alloc-frame}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} assert-nargs-ee/locals u12:@var{expected} u12:@var{nlocals}
|
|
Equivalent to a sequence of @code{assert-nargs-ee} and
|
|
@code{alloc-frame}. The number of locals reserved is @var{expected}
|
|
+ @var{nlocals}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} br-if-npos-gt u24:@var{nreq} x8:@var{_} u24:@var{npos} x8:@var{_} l24:@var{offset}
|
|
Find the first positional argument after @var{nreq}. If it is greater
|
|
than @var{npos}, jump to @var{offset}.
|
|
|
|
This instruction is only emitted for functions with multiple clauses,
when an earlier clause has keywords and no rest arguments.
|
|
@xref{Case-lambda}, for more on how @code{case-lambda} chooses the
|
|
clause to apply.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} bind-kwargs u24:@var{nreq} u8:@var{flags} u24:@var{nreq-and-opt} x8:@var{_} u24:@var{ntotal} n32:@var{kw-offset}
|
|
@var{flags} is a bitfield, whose lowest bit is @var{allow-other-keys},
|
|
second bit is @var{has-rest}, and whose following six bits are unused.
|
|
|
|
Find the last positional argument, and shuffle all the rest above
|
|
@var{ntotal}. Initialize the intervening locals to
|
|
@code{SCM_UNDEFINED}. Then load the constant at @var{kw-offset} words
|
|
from the current @code{ip}, and use it and the @var{allow-other-keys}
|
|
flag to bind keyword arguments. If @var{has-rest}, collect all shuffled
|
|
arguments into a list, and store it in @var{nreq-and-opt}. Finally,
|
|
clear the arguments that we shuffled up.
|
|
|
|
The parsing is driven by a keyword arguments association list, looked up
|
|
using @var{kw-offset}. The alist is a list of pairs of the form
|
|
@code{(@var{kw} . @var{index})}, mapping keyword arguments to their
|
|
local slot indices. Unless @code{allow-other-keys} is set, the parser
|
|
will signal an error if an unknown key is found.
|
|
|
|
A macro-mega-instruction.
|
|
@end deftypefn
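
The layout of the @var{flags} operand can be illustrated with a short C decoder. The enumerator and struct names here are hypothetical; only the bit positions come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum
{
  KW_ALLOW_OTHER_KEYS = 1u << 0,   /* lowest bit */
  KW_HAS_REST         = 1u << 1    /* second bit */
};

struct kw_flags
{
  bool allow_other_keys;
  bool has_rest;
};

/* Split the 8-bit flags operand into its two meaningful bits. */
static struct kw_flags
decode_kw_flags (uint8_t flags)
{
  /* The upper six bits are unused; this sketch insists they are clear. */
  assert ((flags & ~(KW_ALLOW_OTHER_KEYS | KW_HAS_REST)) == 0);
  struct kw_flags f = { (flags & KW_ALLOW_OTHER_KEYS) != 0,
                        (flags & KW_HAS_REST) != 0 };
  return f;
}
```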
|
|
|
|
@deftypefn Instruction {} bind-rest u24:@var{dst}
|
|
Collect any arguments at or above @var{dst} into a list, and store that
|
|
list at @var{dst}.
|
|
@end deftypefn
|
|
|
|
|
|
@node Trampoline Instructions
|
|
@subsubsection Trampoline Instructions
|
|
|
|
Though most applicable objects in Guile are procedures implemented in
|
|
bytecode, not all are. There are primitives, continuations, and other
|
|
procedure-like objects that have their own calling convention. Instead
|
|
of adding special cases to the @code{call} instruction, Guile wraps
|
|
these other applicable objects in VM trampoline procedures, then
|
|
provides special support for these objects in bytecode.
|
|
|
|
Trampoline procedures are typically generated by Guile at runtime, for
|
|
example in response to a call to @code{scm_c_make_gsubr}. As such, a
|
|
compiler probably shouldn't emit code with these instructions. However,
|
|
it's still interesting to know how these things work, so we document
|
|
these trampoline instructions here.
|
|
|
|
@deftypefn Instruction {} subr-call u24:@var{ptr-idx}
|
|
Call a subr, passing all locals in this frame as arguments. Fetch the
|
|
foreign pointer from @var{ptr-idx}, a free variable. Return from the
|
|
calling frame.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} foreign-call u12:@var{cif-idx} u12:@var{ptr-idx}
|
|
Call a foreign function. Fetch the @var{cif} and foreign pointer from
|
|
@var{cif-idx} and @var{ptr-idx}, both free variables. Return from the calling
|
|
frame. Arguments are taken from the stack.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} continuation-call u24:@var{contregs}
|
|
Return to a continuation, nonlocally. The arguments to the continuation
|
|
are taken from the stack. @var{contregs} is a free variable containing
|
|
the reified continuation.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} compose-continuation u24:@var{cont}
|
|
Compose a partial continuation with the current continuation. The
|
|
arguments to the continuation are taken from the stack. @var{cont} is a
|
|
free variable containing the reified continuation.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} tail-apply x24:@var{_}
|
|
Tail-apply the procedure in local slot 0 to the rest of the arguments.
|
|
This instruction is part of the implementation of @code{apply}, and is
|
|
not generated by the compiler.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} builtin-ref u12:@var{dst} u12:@var{idx}
|
|
Load a builtin stub by index into @var{dst}.
|
|
@end deftypefn
|
|
|
|
|
|
@node Branch Instructions
|
|
@subsubsection Branch Instructions
|
|
|
|
All offsets to branch instructions are 24-bit signed numbers, which
|
|
count 32-bit units. This gives Guile effectively a 26-bit address range
|
|
for relative jumps.
|
|
|
|
@deftypefn Instruction {} br l24:@var{offset}
|
|
Add @var{offset} to the current instruction pointer.
|
|
@end deftypefn
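
To make the offset arithmetic concrete, here is a C sketch that sign-extends a 24-bit field and applies it to an instruction pointer. It assumes that the opcode sits in the low 8 bits of the instruction word and the offset in the upper 24 bits; that placement and the helper names are assumptions of this example.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret the low 24 bits of FIELD as a two's-complement number. */
static int32_t
sign_extend_24 (uint32_t field)
{
  assert (field < (1u << 24));
  /* XOR-and-subtract maps 0x800000..0xffffff onto -0x800000..-1. */
  return (int32_t) (field ^ 0x800000u) - 0x800000;
}

/* Compute the target of an unconditional branch.  IP points to 32-bit
   words, so adding the offset advances by 4 bytes per unit. */
static const uint32_t *
branch_target (const uint32_t *ip, uint32_t word)
{
  /* Assumption: opcode in the low 8 bits, offset in the upper 24. */
  return ip + sign_extend_24 (word >> 8);
}
```

Offsets range from -8388608 to 8388607 words, that is roughly 32 MiB in either direction, which is where the 26-bit figure comes from.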
|
|
|
|
All the conditional branch instructions described below have an
|
|
@var{invert} parameter, which if true reverses the test:
|
|
@code{br-if-true} becomes @code{br-if-false}, and so on.
|
|
|
|
@deftypefn Instruction {} br-if-true u24:@var{test} b1:@var{invert} x7:@var{_} l24:@var{offset}
|
|
If the value in @var{test} is true for the purposes of Scheme, add
|
|
@var{offset} to the current instruction pointer.
|
|
@end deftypefn
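
The role of @var{invert} is the same for every conditional branch, and can be sketched in C as below. The helper names are hypothetical. The fall-through distance of two words is inferred from the operand widths of @code{br-if-true} and assumes an 8-bit opcode in the first word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* TEST_HOLDS is the outcome of the instruction's own predicate; INVERT
   is the b1 flag.  The branch is taken when exactly one is set. */
static bool
branch_taken (bool test_holds, bool invert)
{
  return test_holds != invert;
}

/* Pick the next instruction pointer for a two-word conditional branch. */
static const uint32_t *
next_ip (const uint32_t *ip, bool test_holds, bool invert, int32_t offset)
{
  assert (ip != NULL);
  /* Assumed instruction length: two 32-bit words. */
  return branch_taken (test_holds, invert) ? ip + offset : ip + 2;
}
```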
|
|
|
|
@deftypefn Instruction {} br-if-null u24:@var{test} b1:@var{invert} x7:@var{_} l24:@var{offset}
|
|
If the value in @var{test} is the end-of-list or Lisp nil, add
|
|
@var{offset} to the current instruction pointer.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} br-if-nil u24:@var{test} b1:@var{invert} x7:@var{_} l24:@var{offset}
|
|
If the value in @var{test} is false to Lisp, add @var{offset} to the
|
|
current instruction pointer.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} br-if-pair u24:@var{test} b1:@var{invert} x7:@var{_} l24:@var{offset}
|
|
If the value in @var{test} is a pair, add @var{offset} to the current
|
|
instruction pointer.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} br-if-struct u24:@var{test} b1:@var{invert} x7:@var{_} l24:@var{offset}
|
|
If the value in @var{test} is a struct, add @var{offset} to the
|
|
current instruction pointer.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} br-if-char u24:@var{test} b1:@var{invert} x7:@var{_} l24:@var{offset}
|
|
If the value in @var{test} is a char, add @var{offset} to the current
|
|
instruction pointer.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} br-if-tc7 u24:@var{test} b1:@var{invert} u7:@var{tc7} l24:@var{offset}
|
|
If the value in @var{test} has the TC7 given in the second word, add
|
|
@var{offset} to the current instruction pointer. TC7 codes are part of
|
|
the way Guile represents non-immediate objects, and are deep wizardry.
|
|
See @code{libguile/tags.h} for all the details.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} br-if-eq u12:@var{a} u12:@var{b} b1:@var{invert} x7:@var{_} l24:@var{offset}
|
|
@deftypefnx Instruction {} br-if-eqv u12:@var{a} u12:@var{b} b1:@var{invert} x7:@var{_} l24:@var{offset}
|
|
@deftypefnx Instruction {} br-if-equal u12:@var{a} u12:@var{b} b1:@var{invert} x7:@var{_} l24:@var{offset}
|
|
If the value in @var{a} is @code{eq?}, @code{eqv?}, or @code{equal?} to
|
|
the value in @var{b}, respectively, add @var{offset} to the current
|
|
instruction pointer.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} br-if-= u12:@var{a} u12:@var{b} b1:@var{invert} x7:@var{_} l24:@var{offset}
|
|
@deftypefnx Instruction {} br-if-< u12:@var{a} u12:@var{b} b1:@var{invert} x7:@var{_} l24:@var{offset}
|
|
@deftypefnx Instruction {} br-if-<= u12:@var{a} u12:@var{b} b1:@var{invert} x7:@var{_} l24:@var{offset}
|
|
If the value in @var{a} is @code{=}, @code{<}, or @code{<=} to the value
|
|
in @var{b}, respectively, add @var{offset} to the current instruction
|
|
pointer.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} br-if-logtest u12:@var{a} u12:@var{b} b1:@var{invert} x7:@var{_} l24:@var{offset}
|
|
If the bitwise intersection of the integers in @var{a} and @var{b} is
|
|
nonzero, add @var{offset} to the current instruction pointer.
|
|
@end deftypefn
|
|
|
|
|
|
@node Constant Instructions
|
|
@subsubsection Constant Instructions
|
|
|
|
The following instructions load literal data into a program. There are
|
|
two kinds.
|
|
|
|
The first set of instructions loads immediate values. These
|
|
instructions encode the immediate directly into the instruction stream.
|
|
|
|
@deftypefn Instruction {} make-short-immediate u8:@var{dst} i16:@var{low-bits}
|
|
Make an immediate whose low bits are @var{low-bits}, and whose top bits are
|
|
0.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} make-long-immediate u24:@var{dst} i32:@var{low-bits}
|
|
Make an immediate whose low bits are @var{low-bits}, and whose top bits are
|
|
0.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} make-long-long-immediate u24:@var{dst} a32:@var{high-bits} b32:@var{low-bits}
|
|
Make an immediate with @var{high-bits} and @var{low-bits}.
|
|
@end deftypefn
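
The relationship between the immediate-loading instructions can be sketched in C. This example treats the operands as raw unsigned bit patterns, which is an assumption, and the function names are hypothetical.

```c
#include <stdint.h>

/* make-long-long-immediate: join two 32-bit operand words into one
   64-bit immediate. */
static uint64_t
join_immediate (uint32_t high_bits, uint32_t low_bits)
{
  return ((uint64_t) high_bits << 32) | low_bits;
}

/* make-long-immediate: low bits from the operand, top bits zero. */
static uint64_t
long_immediate (uint32_t low_bits)
{
  return join_immediate (0, low_bits);
}
```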
|
|
|
|
Non-immediate constant literals are referenced either directly or
|
|
indirectly. For example, Guile knows at compile-time what the layout of
|
|
a string will be like, and arranges to embed that object directly in the
|
|
compiled image. A reference to a string will use
|
|
@code{make-non-immediate} to treat a pointer into the compilation unit
|
|
as a @code{SCM} value directly.
|
|
|
|
@deftypefn Instruction {} make-non-immediate u24:@var{dst} n32:@var{offset}
|
|
Load a pointer to statically allocated memory into @var{dst}. The
|
|
object's memory will be found @var{offset} 32-bit words away from the
|
|
current instruction pointer. Whether the object is mutable or immutable
|
|
depends on where it was allocated by the compiler, and loaded by the
|
|
loader.
|
|
@end deftypefn
|
|
|
|
Some objects must be unique across the whole system. This is the case
|
|
for symbols and keywords. For these objects, Guile arranges to
|
|
initialize them when the compilation unit is loaded, storing them into a
|
|
slot in the image. References go indirectly through that slot.
|
|
@code{static-ref} is used in this case.
|
|
|
|
@deftypefn Instruction {} static-ref u24:@var{dst} s32:@var{offset}
|
|
Load a @code{SCM} value into @var{dst}. The @code{SCM} value will be fetched from
|
|
memory, @var{offset} 32-bit words away from the current instruction
|
|
pointer. @var{offset} is a signed value.
|
|
@end deftypefn
|
|
|
|
Fields of non-immediates may need to be fixed up at load time, because
|
|
we do not know in advance at what address they will be loaded. This is
|
|
the case, for example, for a pair containing a non-immediate in one of
|
|
its fields. @code{static-set!} and @code{static-patch!} are used in
|
|
these situations.
|
|
|
|
@deftypefn Instruction {} static-set! u24:@var{src} lo32:@var{offset}
|
|
Store a @code{SCM} value into memory, @var{offset} 32-bit words away from the
|
|
current instruction pointer. @var{offset} is a signed value.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} static-patch! x24:@var{_} lo32:@var{dst-offset} l32:@var{src-offset}
|
|
Patch a pointer at @var{dst-offset} to point to @var{src-offset}. Both offsets
|
|
are signed 32-bit values, indicating a memory address as a number
|
|
of 32-bit words away from the current instruction pointer.
|
|
@end deftypefn
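
A load-time fix-up of this kind can be sketched in C as follows. The function name is hypothetical, the image is modelled as an array of 32-bit words, and the use of @code{memcpy} is a choice of this sketch to avoid assuming that the destination is aligned for a pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store the address IP + SRC_OFFSET into the pointer-sized field at
   IP + DST_OFFSET.  Both offsets count 32-bit words from IP. */
static void
static_patch (uint32_t *ip, int32_t dst_offset, int32_t src_offset)
{
  assert (ip != NULL);
  void *src = ip + src_offset;
  memcpy (ip + dst_offset, &src, sizeof src);
}
```

After the patch, the field holds an absolute address, which is why the work cannot be done before the image is loaded.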
|
|
|
|
Many kinds of literals can be loaded with the above instructions, once
|
|
the compiler has prepared the statically allocated data. This is the
|
|
case for vectors, strings, uniform vectors, pairs, and procedures with
|
|
no free variables. Other kinds of data might need special initializers;
|
|
those instructions follow.
|
|
|
|
@deftypefn Instruction {} string->number u12:@var{dst} u12:@var{src}
|
|
Parse the string in @var{src} as a number, and store it in @var{dst}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} string->symbol u12:@var{dst} u12:@var{src}
|
|
Convert the string in @var{src} to a symbol, and store it in @var{dst}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} symbol->keyword u12:@var{dst} u12:@var{src}
|
|
Make a keyword from the symbol in @var{src}, and store it in @var{dst}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} load-typed-array u8:@var{dst} u8:@var{type} u8:@var{shape} n32:@var{offset} u32:@var{len}
|
|
Load the contiguous typed array located at @var{offset} 32-bit words away
|
|
from the instruction pointer, and store into @var{dst}. @var{len} is a byte
|
|
length. @var{offset} is signed.
|
|
@end deftypefn
|
|
|
|
|
|
@node Dynamic Environment Instructions
|
|
@subsubsection Dynamic Environment Instructions
|
|
|
|
Guile's virtual machine has low-level support for @code{dynamic-wind},
|
|
dynamic binding, and composable prompts and aborts.
|
|
|
|
@deftypefn Instruction {} abort x24:@var{_}
|
|
Abort to a prompt handler. The tag is expected in slot 1, and the rest
|
|
of the values in the frame are returned to the prompt handler. This
|
|
corresponds to a tail application of @code{abort-to-prompt}.
|
|
|
|
If no prompt can be found in the dynamic environment with the given tag,
|
|
an error is signalled. Otherwise all arguments are passed to the
|
|
prompt's handler, along with the captured continuation, if necessary.
|
|
|
|
If the prompt's handler can be proven to not reference the captured
|
|
continuation, no continuation is allocated. This decision happens
|
|
dynamically, at run-time; the general case is that the continuation may
|
|
be captured, and thus resumed. A reinstated continuation will have its
|
|
arguments pushed on the stack from slot 1, as if from a multiple-value
|
|
return, and control resumes in the caller. Thus to the calling
|
|
function, a call to @code{abort-to-prompt} looks like any other function
|
|
call.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} prompt u24:@var{tag} b1:@var{escape-only?} x7:@var{_} u24:@var{proc-slot} x8:@var{_} l24:@var{handler-offset}
|
|
Push a new prompt on the dynamic stack, with a tag from @var{tag} and a
|
|
handler at @var{handler-offset} words from the current @code{ip}.
|
|
|
|
If an abort is made to this prompt, control will jump to the handler.
|
|
The handler will expect a multiple-value return as if from a call with
|
|
the procedure at @var{proc-slot}, with the reified partial continuation
|
|
as the first argument, followed by the values returned to the handler.
|
|
If control returns to the handler, the prompt is already popped off by
|
|
the abort mechanism. (Guile's @code{prompt} implements Felleisen's
|
|
@dfn{--F--} operator.)
|
|
|
|
If @var{escape-only?} is nonzero, the prompt will be marked as
|
|
escape-only, which allows an abort to this prompt to avoid reifying the
|
|
continuation.
|
|
|
|
@xref{Prompts}, for more information on prompts.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} wind u12:@var{winder} u12:@var{unwinder}
|
|
Push wind and unwind procedures onto the dynamic stack. Note that
neither is actually called; the compiler should emit calls to wind and
unwind for the normal dynamic-wind control flow. Also note that the
compiler should have inserted checks that the wind and unwind procs are
thunks, if it could not prove that to be the case. @xref{Dynamic Wind}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} unwind x24:@var{_}
|
|
A normal exit from the dynamic extent of an expression. Pop the top
|
|
entry off of the dynamic stack.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} push-fluid u12:@var{fluid} u12:@var{value}
|
|
Dynamically bind @var{value} to @var{fluid} by creating a with-fluids
|
|
object and pushing that object on the dynamic stack. @xref{Fluids and
|
|
Dynamic States}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} pop-fluid x24:@var{_}
|
|
Leave the dynamic extent of a @code{with-fluid*} expression, restoring
|
|
the fluid to its previous value. @code{push-fluid} should always be
|
|
balanced with @code{pop-fluid}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} fluid-ref u12:@var{dst} u12:@var{src}
|
|
Reference the fluid in @var{src}, and place the value in @var{dst}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} fluid-set u12:@var{fluid} u12:@var{val}
|
|
Set the value of the fluid in @var{fluid} to the value in @var{val}.
|
|
@end deftypefn
|
|
|
|
|
|
@node Miscellaneous Instructions
|
|
@subsubsection Miscellaneous Instructions
|
|
|
|
@deftypefn Instruction {} halt x24:@var{_}
|
|
Bring the VM to a halt, returning all the values from the stack. Used
|
|
in the ``boot continuation'', which is used when entering the VM from C.
|
|
@end deftypefn
|
|
|
|
|
|
@node Inlined Scheme Instructions
|
|
@subsubsection Inlined Scheme Instructions
|
|
|
|
The Scheme compiler can recognize the application of standard Scheme
|
|
procedures. It tries to inline these small operations to avoid the
|
|
overhead of creating new stack frames. This allows the compiler to
|
|
optimize better.
|
|
|
|
@deftypefn Instruction {} make-vector u8:@var{dst} u8:@var{length} u8:@var{init}
|
|
Make a vector and write it to @var{dst}. The vector will have space for
|
|
@var{length} slots. They will be filled with the value in slot
|
|
@var{init}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} make-vector/immediate u8:@var{dst} u8:@var{length} u8:@var{init}
|
|
Make a short vector of known size and write it to @var{dst}. The vector
|
|
will have space for @var{length} slots, an immediate value. They will
|
|
be filled with the value in slot @var{init}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} vector-length u12:@var{dst} u12:@var{src}
|
|
Store the length of the vector in @var{src} in @var{dst}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} vector-ref u8:@var{dst} u8:@var{src} u8:@var{idx}
|
|
Fetch the item at position @var{idx} in the vector in @var{src}, and
|
|
store it in @var{dst}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} vector-ref/immediate u8:@var{dst} u8:@var{src} u8:@var{idx}
|
|
Fill @var{dst} with the item @var{idx} elements into the vector at
|
|
@var{src}. Useful for building data types using vectors.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} vector-set! u8:@var{dst} u8:@var{idx} u8:@var{src}
|
|
Store @var{src} into the vector @var{dst} at index @var{idx}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} vector-set!/immediate u8:@var{dst} u8:@var{idx} u8:@var{src}
|
|
Store @var{src} into the vector @var{dst} at index @var{idx}. Here
|
|
@var{idx} is an immediate value.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} struct-vtable u12:@var{dst} u12:@var{src}
|
|
Store the vtable of @var{src} into @var{dst}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} allocate-struct u8:@var{dst} u8:@var{vtable} u8:@var{nfields}
|
|
Allocate a new struct with @var{vtable}, and place it in @var{dst}. The
|
|
struct will be constructed with space for @var{nfields} fields, which
|
|
should correspond to the field count of the @var{vtable}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} struct-ref u8:@var{dst} u8:@var{src} u8:@var{idx}
|
|
Fetch the item at slot @var{idx} in the struct in @var{src}, and store
|
|
it in @var{dst}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} struct-set! u8:@var{dst} u8:@var{idx} u8:@var{src}
|
|
Store @var{src} into the struct @var{dst} at slot @var{idx}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} allocate-struct/immediate u8:@var{dst} u8:@var{vtable} u8:@var{nfields}
|
|
@deftypefnx Instruction {} struct-ref/immediate u8:@var{dst} u8:@var{src} u8:@var{idx}
|
|
@deftypefnx Instruction {} struct-set!/immediate u8:@var{dst} u8:@var{idx} u8:@var{src}
|
|
Variants of the struct instructions, but in which the @var{nfields} or
|
|
@var{idx} fields are immediate values.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} class-of u12:@var{dst} u12:@var{type}
|
|
Store the class of the object in @var{type} into @var{dst}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} make-array u8:@var{dst} u8:@var{type} u8:@var{fill} x8:@var{_} u24:@var{bounds}
|
|
Make a new array with @var{type}, @var{fill}, and @var{bounds}, storing it in @var{dst}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} string-length u12:@var{dst} u12:@var{src}
|
|
Store the length of the string in @var{src} in @var{dst}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} string-ref u8:@var{dst} u8:@var{src} u8:@var{idx}
|
|
Fetch the character at position @var{idx} in the string in @var{src}, and store
|
|
it in @var{dst}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} cons u8:@var{dst} u8:@var{car} u8:@var{cdr}
|
|
Cons @var{car} and @var{cdr}, and store the result in @var{dst}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} car u12:@var{dst} u12:@var{src}
|
|
Place the car of @var{src} in @var{dst}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} cdr u12:@var{dst} u12:@var{src}
|
|
Place the cdr of @var{src} in @var{dst}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} set-car! u12:@var{pair} u12:@var{car}
|
|
Set the car of @var{pair} to @var{car}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} set-cdr! u12:@var{pair} u12:@var{cdr}
|
|
Set the cdr of @var{pair} to @var{cdr}.
|
|
@end deftypefn
|
|
|
|
Note that @code{caddr} and friends compile to a series of @code{car}
|
|
and @code{cdr} instructions.
|
|
|
|
|
|
@node Inlined Mathematical Instructions
|
|
@subsubsection Inlined Mathematical Instructions
|
|
|
|
Inlining mathematical operations has the obvious advantage of handling
|
|
fixnums without function calls or allocations. The trick, of course,
|
|
is knowing when the result of an operation will be a fixnum, and there
|
|
might be a couple of bugs here.
|
|
|
|
More instructions could be added here over time.
|
|
|
|
All of these operations place their result in their first operand,
|
|
@var{dst}.
|
|
|
|
@deftypefn Instruction {} add u8:@var{dst} u8:@var{a} u8:@var{b}
|
|
Add @var{a} to @var{b}.
|
|
@end deftypefn
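
The fixnum fast path can be sketched in C as an addition followed by a range check. The 62-bit fixnum width used here is an assumption of this example, and @code{fixnum_add} is a hypothetical name; when it returns false, a real implementation would fall back to the generic addition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed fixnum width for this sketch. */
#define FIXNUM_BITS 62
#define FIXNUM_MAX (((int64_t) 1 << (FIXNUM_BITS - 1)) - 1)
#define FIXNUM_MIN (-FIXNUM_MAX - 1)

/* Try the fast path.  Returns true and stores the sum when it is still
   a fixnum; returns false when the slow path is needed. */
static bool
fixnum_add (int64_t a, int64_t b, int64_t *sum)
{
  assert (a >= FIXNUM_MIN && a <= FIXNUM_MAX);
  assert (b >= FIXNUM_MIN && b <= FIXNUM_MAX);
  /* Cannot overflow int64_t: both inputs fit in 62 bits. */
  int64_t s = a + b;
  if (s < FIXNUM_MIN || s > FIXNUM_MAX)
    return false;
  *sum = s;
  return true;
}
```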
|
|
|
|
@deftypefn Instruction {} add1 u12:@var{dst} u12:@var{src}
|
|
Add 1 to the value in @var{src}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} sub u8:@var{dst} u8:@var{a} u8:@var{b}
|
|
Subtract @var{b} from @var{a}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} sub1 u12:@var{dst} u12:@var{src}
|
|
Subtract 1 from @var{src}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} mul u8:@var{dst} u8:@var{a} u8:@var{b}
|
|
Multiply @var{a} and @var{b}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} div u8:@var{dst} u8:@var{a} u8:@var{b}
|
|
Divide @var{a} by @var{b}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} quo u8:@var{dst} u8:@var{a} u8:@var{b}
|
|
Compute the quotient of @var{a} divided by @var{b}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} rem u8:@var{dst} u8:@var{a} u8:@var{b}
|
|
Compute the remainder of @var{a} divided by @var{b}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} mod u8:@var{dst} u8:@var{a} u8:@var{b}
|
|
Compute the modulo of @var{a} by @var{b}.
|
|
@end deftypefn
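
Assuming that @code{quo}, @code{rem} and @code{mod} mirror Scheme's @code{quotient}, @code{remainder} and @code{modulo} (an assumption based on their names), the difference between them can be shown in C for machine integers. The @code{vm_} prefixed names are hypothetical, and the sketch ignores bignums and the overflow case of dividing the most negative value by -1.

```c
#include <assert.h>
#include <stdint.h>

/* quotient: C division already truncates toward zero. */
static int64_t
vm_quo (int64_t a, int64_t b)
{
  assert (b != 0);
  return a / b;
}

/* remainder: takes the sign of the dividend, like C's % operator. */
static int64_t
vm_rem (int64_t a, int64_t b)
{
  assert (b != 0);
  return a % b;
}

/* modulo: takes the sign of the divisor, so adjust when signs differ. */
static int64_t
vm_mod (int64_t a, int64_t b)
{
  int64_t r = vm_rem (a, b);
  return (r != 0 && (r < 0) != (b < 0)) ? r + b : r;
}
```

For -7 and 2, the three functions give -3, -1 and 1 respectively.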
|
|
|
|
@deftypefn Instruction {} ash u8:@var{dst} u8:@var{a} u8:@var{b}
|
|
Shift @var{a} arithmetically by @var{b} bits.
|
|
@end deftypefn
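
The following C sketch shows one way to express an arithmetic shift for machine integers, assuming the convention of Scheme's @code{ash}: a positive count shifts left, a negative count shifts right and rounds toward negative infinity. @code{vm_ash} is a hypothetical name, and overflow on left shifts is not handled.

```c
#include <assert.h>
#include <stdint.h>

static int64_t
vm_ash (int64_t a, int b)
{
  assert (b > -63 && b < 63);
  if (b >= 0)
    /* Left shift written as a multiplication, which is well defined
       for negative A as long as the result fits. */
    return a * ((int64_t) 1 << b);

  /* Right shift written as a flooring division, to avoid relying on
     implementation-defined >> of negative values. */
  int64_t d = (int64_t) 1 << -b;
  int64_t q = a / d;
  return (a % d != 0 && a < 0) ? q - 1 : q;
}
```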
|
|
|
|
@deftypefn Instruction {} logand u8:@var{dst} u8:@var{a} u8:@var{b}
|
|
Compute the bitwise @code{and} of @var{a} and @var{b}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} logior u8:@var{dst} u8:@var{a} u8:@var{b}
|
|
Compute the bitwise inclusive @code{or} of @var{a} with @var{b}.
|
|
@end deftypefn
|
|
|
|
@deftypefn Instruction {} logxor u8:@var{dst} u8:@var{a} u8:@var{b}
|
|
Compute the bitwise exclusive @code{or} of @var{a} with @var{b}.
|
|
@end deftypefn
|
|
|
|
|
|
@node Inlined Bytevector Instructions
|
|
@subsubsection Inlined Bytevector Instructions
|
|
|
|
Bytevector operations correspond closely to what the current hardware
|
|
can do, so it makes sense to inline them to VM instructions, providing
|
|
a clear path for eventual native compilation. Without this, Scheme
|
|
programs would need other primitives for accessing raw bytes -- but
|
|
these primitives are as good as any.
|
|
|
|
@deftypefn Instruction {} bv-u8-ref u8:@var{dst} u8:@var{src} u8:@var{idx}
|
|
@deftypefnx Instruction {} bv-s8-ref u8:@var{dst} u8:@var{src} u8:@var{idx}
|
|
@deftypefnx Instruction {} bv-u16-ref u8:@var{dst} u8:@var{src} u8:@var{idx}
|
|
@deftypefnx Instruction {} bv-s16-ref u8:@var{dst} u8:@var{src} u8:@var{idx}
|
|
@deftypefnx Instruction {} bv-u32-ref u8:@var{dst} u8:@var{src} u8:@var{idx}
|
|
@deftypefnx Instruction {} bv-s32-ref u8:@var{dst} u8:@var{src} u8:@var{idx}
|
|
@deftypefnx Instruction {} bv-u64-ref u8:@var{dst} u8:@var{src} u8:@var{idx}
|
|
@deftypefnx Instruction {} bv-s64-ref u8:@var{dst} u8:@var{src} u8:@var{idx}
|
|
@deftypefnx Instruction {} bv-f32-ref u8:@var{dst} u8:@var{src} u8:@var{idx}
|
|
@deftypefnx Instruction {} bv-f64-ref u8:@var{dst} u8:@var{src} u8:@var{idx}
|
|
|
|
Fetch the item at byte offset @var{idx} in the bytevector @var{src}, and
|
|
store it in @var{dst}. All accesses use native endianness.
|
|
@end deftypefn
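
A native-endian read at an arbitrary byte offset can be sketched in C with @code{memcpy}, which stays valid even when the offset is not aligned for the element type. The function names and the explicit @var{len} argument are assumptions of this example; the bounds check is written as an assertion only for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read an unsigned 16-bit value at byte offset IDX, native byte order. */
static uint16_t
bv_u16_ref (const uint8_t *bv, size_t len, size_t idx)
{
  uint16_t v;
  assert (idx <= len && len - idx >= sizeof v);
  memcpy (&v, bv + idx, sizeof v);
  return v;
}

/* The signed variant reads the same bytes and reinterprets them. */
static int16_t
bv_s16_ref (const uint8_t *bv, size_t len, size_t idx)
{
  int16_t v;
  assert (idx <= len && len - idx >= sizeof v);
  memcpy (&v, bv + idx, sizeof v);
  return v;
}
```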
|
|
|
|
@deftypefn Instruction {} bv-u8-set! u8:@var{dst} u8:@var{idx} u8:@var{src}
|
|
@deftypefnx Instruction {} bv-s8-set! u8:@var{dst} u8:@var{idx} u8:@var{src}
|
|
@deftypefnx Instruction {} bv-u16-set! u8:@var{dst} u8:@var{idx} u8:@var{src}
|
|
@deftypefnx Instruction {} bv-s16-set! u8:@var{dst} u8:@var{idx} u8:@var{src}
|
|
@deftypefnx Instruction {} bv-u32-set! u8:@var{dst} u8:@var{idx} u8:@var{src}
|
|
@deftypefnx Instruction {} bv-s32-set! u8:@var{dst} u8:@var{idx} u8:@var{src}
|
|
@deftypefnx Instruction {} bv-u64-set! u8:@var{dst} u8:@var{idx} u8:@var{src}
|
|
@deftypefnx Instruction {} bv-s64-set! u8:@var{dst} u8:@var{idx} u8:@var{src}
|
|
@deftypefnx Instruction {} bv-f32-set! u8:@var{dst} u8:@var{idx} u8:@var{src}
|
|
@deftypefnx Instruction {} bv-f64-set! u8:@var{dst} u8:@var{idx} u8:@var{src}
|
|
|
|
Store @var{src} into the bytevector @var{dst} at byte offset @var{idx}.
|
|
Multibyte values are written using native endianness.
|
|
@end deftypefn
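
The store direction looks much the same. This C sketch writes a double at a byte offset and reports out-of-range offsets through its return value; how the VM itself reports such errors is not covered here, and @code{bv_f64_set} is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write VALUE at byte offset IDX in native byte order.  Returns false,
   leaving the bytevector untouched, when the write would not fit. */
static bool
bv_f64_set (uint8_t *bv, size_t len, size_t idx, double value)
{
  assert (bv != NULL);
  if (idx > len || len - idx < sizeof value)
    return false;
  memcpy (bv + idx, &value, sizeof value);
  return true;
}
```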
|