1
Fork 0
mirror of https://git.savannah.gnu.org/git/guile.git synced 2025-06-15 16:20:17 +02:00

update docs, clean up VM vestiges, macro docs, fix (/ a b c)

* doc/ref/api-procedures.texi (Compiled Procedures): Fix for API changes.

* doc/ref/compiler.texi (Compiling to the Virtual Machine): Replace GHIL
  docs with Tree-IL docs. Update the bits about the Scheme compiler to
  talk about Tree-IL and the expander instead of GHIL. Remove
  <glil-argument>. Add placeholder sections for assembly and bytecode.

* doc/ref/vm.texi: Update examples with what currently happens. Reword
  some things. Fix a couple errors.

* libguile/vm-i-system.c (externals): Remove this instruction, it's not
  used.

* module/ice-9/documentation.scm (object-documentation): If the object is
  a macro, try to return documentation on the macro transformer.

* module/language/assembly/disassemble.scm (disassemble-load-program):
  Fix problem in which we skipped the first element of the object vector,
  because of changes to procedure layouts a few months ago.

* module/language/scheme/spec.scm (read-file): Remove read-file
  definition.

* module/language/tree-il.scm: Reorder exports. Remove <lexical>, it was
  a compat shim to something that was never released. Fix `location'.

* module/language/tree-il/primitives.scm (/): Fix expander for more than
  two args to /.

* module/system/base/compile.scm (read-file-in): Remove unused
  definition.

* module/system/base/language.scm (system): Remove language-read-file.

* module/language/ecmascript/spec.scm (ecmascript): Remove read-file
  definition.
This commit is contained in:
Andy Wingo 2009-05-24 13:09:01 +02:00
parent a755136ba8
commit 81fd315299
12 changed files with 317 additions and 455 deletions

View file

@ -162,18 +162,10 @@ appropriate module first, though:
Returns @code{#t} iff @var{obj} is a compiled procedure.
@end deffn
@deffn {Scheme Procedure} program-bytecode program
@deffnx {C Function} scm_program_bytecode (program)
Returns the object code associated with this program, as a
@code{u8vector}.
@end deffn
@deffn {Scheme Procedure} program-base program
@deffnx {C Function} scm_program_base (program)
Returns the address in memory corresponding to the start of
@var{program}'s object code, as an integer. This is useful mostly when
you map the value of an instruction pointer from the VM to actual
instructions.
@deffn {Scheme Procedure} program-objcode program
@deffnx {C Function} scm_program_objcode (program)
Returns the object code associated with this program. @xref{Object
Code}, for more information.
@end deffn
@deffn {Scheme Procedure} program-objects program
@ -184,9 +176,9 @@ vector. @xref{VM Programs}, for more information.
@deffn {Scheme Procedure} program-module program
@deffnx {C Function} scm_program_module (program)
Returns the module that was current when this program was created.
Free variables in this program are looked up with respect to this
module.
Returns the module that was current when this program was created. Can
return @code{#f} if the compiler could determine that this information
was unnecessary.
@end deffn
@deffn {Scheme Procedure} program-external program
@ -250,9 +242,9 @@ REPL. The only tricky bit is that @var{extp} is a boolean, declaring
whether the binding is heap-allocated or not. @xref{VM Concepts}, for
more information.
Note that bindings information are stored in a program as part of its
metadata thunk, so including them in the generated object code does
not impose a runtime performance penalty.
Note that bindings information is stored in a program as part of its
metadata thunk, so including it in the generated object code does not
impose a runtime performance penalty.
@end deffn
@deffn {Scheme Procedure} program-sources program

View file

@ -22,8 +22,10 @@ know how to compile your .scm file.
@menu
* Compiler Tower::
* The Scheme Compiler::
* GHIL::
* Tree-IL::
* GLIL::
* Assembly::
* Bytecode::
* Object Code::
* Extending the Compiler::
@end menu
@ -52,7 +54,7 @@ They are registered with the @code{define-language} form.
@deffn {Scheme Syntax} define-language @
name title version reader printer @
[parser=#f] [read-file=#f] [compilers='()] [evaluator=#f]
[parser=#f] [compilers='()] [decompilers='()] [evaluator=#f]
Define a language.
This syntax defines a @code{#<language>} object, bound to @var{name}
@ -65,14 +67,12 @@ for Scheme:
#:title "Guile Scheme"
#:version "0.5"
#:reader read
#:read-file read-file
#:compilers `((,ghil . ,compile-ghil))
#:compilers `((tree-il . ,compile-tree-il)
(ghil . ,compile-ghil))
#:decompilers `((tree-il . ,decompile-tree-il))
#:evaluator (lambda (x module) (primitive-eval x))
#:printer write)
@end example
In this example, from @code{(language scheme spec)}, @code{read-file}
reads expressions from a port and wraps them in a @code{begin} block.
@end deffn
The interesting thing about having languages defined this way is that
@ -85,12 +85,12 @@ Guile Scheme interpreter 0.5 on Guile 1.9.0
Copyright (C) 2001-2008 Free Software Foundation, Inc.
Enter `,help' for help.
scheme@@(guile-user)> ,language ghil
Guile High Intermediate Language (GHIL) interpreter 0.3 on Guile 1.9.0
scheme@@(guile-user)> ,language tree-il
Tree Intermediate Language interpreter 1.0 on Guile 1.9.0
Copyright (C) 2001-2008 Free Software Foundation, Inc.
Enter `,help' for help.
ghil@@(guile-user)>
tree-il@@(guile-user)>
@end example
Languages can be looked up by name, as they were above.
@ -128,8 +128,10 @@ The normal tower of languages when compiling Scheme goes like this:
@itemize
@item Scheme, which we know and love
@item Guile High Intermediate Language (GHIL)
@item Tree Intermediate Language (Tree-IL)
@item Guile Low Intermediate Language (GLIL)
@item Assembly
@item Bytecode
@item Object code
@end itemize
@ -137,8 +139,14 @@ Object code may be serialized to disk directly, though it has a cookie
and version prepended to the front. But when compiling Scheme at
run time, you want a Scheme value, e.g. a compiled procedure. For this
reason, so as not to break the abstraction, Guile defines a fake
language, @code{value}. Compiling to @code{value} loads the object
code into a procedure, and wakes the sleeping giant.
language at the bottom of the tower:
@itemize
@item Value
@end itemize
Compiling to @code{value} loads the object code into a procedure, and
wakes the sleeping giant.
Perhaps this strangeness can be explained by example:
@code{compile-file} defaults to compiling to object code, because it
@ -156,340 +164,254 @@ different worlds indefinitely, as shown by the following quine:
@node The Scheme Compiler
@subsection The Scheme Compiler
The job of the Scheme compiler is to expand all macros and to resolve
all symbols to lexical variables. Its target language, GHIL, is fairly
close to Scheme itself, so this process is not very complicated.
The job of the Scheme compiler is to expand all macros and all of
Scheme to its most primitive expressions. The definition of
``primitive'' is given by the inventory of constructs provided by
Tree-IL, the target language of the Scheme compiler: procedure
applications, conditionals, lexical references, etc. This is described
more fully in the next section.
The Scheme compiler is driven by a table of @dfn{translators},
declared with the @code{define-scheme-translator} form, defined in the
module, @code{(language scheme compile-ghil)}.
The tricky and amusing thing about the Scheme-to-Tree-IL compiler is
that it is completely implemented by the macro expander. Since the
macro expander has to run over all of the source code already in order
to expand macros, it might as well do the analysis at the same time,
producing Tree-IL expressions directly.
@deffn {Scheme Syntax} define-scheme-translator head clause1 clause2...
The best documentation of this form is probably an example. Here is
the translator for @code{if}:
Because this compiler is actually the macro expander, it is
extensible. Any macro which the user writes becomes part of the
compiler.
@example
(define-scheme-translator if
;; (if TEST THEN [ELSE])
((,test ,then)
(make-ghil-if e l (retrans test) (retrans then) (retrans '(begin))))
((,test ,then ,else)
(make-ghil-if e l (retrans test) (retrans then) (retrans else))))
@end example
The Scheme-to-Tree-IL expander may be invoked using the generic
@code{compile} procedure:
The match syntax is from the @code{pmatch} macro, defined in
@code{(system base pmatch)}. The result of a clause should be a valid
GHIL value. If no clause matches, a syntax error is signalled.
@lisp
(compile '(+ 1 2) #:from 'scheme #:to 'tree-il)
@result{}
#<<application> src: #f
proc: #<<toplevel-ref> src: #f name: +>
args: (#<<const> src: #f exp: 1>
#<<const> src: #f exp: 2>)>
@end lisp
In the body of the clauses, the following bindings are introduced:
@itemize
@item @code{e}, the current environment
@item @code{l}, the current source location (or @code{#f})
@item @code{retrans}, a procedure that may be called to compile
subexpressions
@end itemize
Or, since Tree-IL is so close to Scheme, it is often useful to expand
Scheme to Tree-IL, then translate back to Scheme. For that reason the
expander provides two interfaces. The former is equivalent to calling
@code{(sc-expand '(+ 1 2) 'c)}, where the @code{'c} is for
``compile''. With @code{'e} (the default), the result is translated
back to Scheme:
Note that translators are looked up by @emph{value}, not by name. That
is to say, the translator is keyed under the @emph{value} of
@code{if}, which normally prints as @code{#<primitive-builtin-macro!
if>}.
@end deffn
@lisp
(sc-expand '(+ 1 2))
@result{} (+ 1 2)
(sc-expand '(let ((x 10)) (* x x)))
@result{} (let ((x84 10)) (* x84 x84))
@end lisp
Users can extend the compiler by defining new translators.
Additionally, some forms can be inlined directly to
instructions -- @xref{Inlined Scheme Instructions}, for a list. The
actual inliners are defined in @code{(language scheme inline)}:
The second example shows that as part of its job, the macro expander
renames lexically-bound variables. The original names are preserved
when compiling to Tree-IL, but can't be represented in Scheme: a
lexical binding only has one name. It is for this reason that the
@emph{native} output of the expander is @emph{not} Scheme. There's too
much information we would lose if we translated to Scheme directly:
lexical variable names, source locations, and module hygiene.
@deffn {Scheme Syntax} define-inline head arity1 result1 arity2 result2...
Defines an inliner for @code{head}. As in
@code{define-scheme-translator}, inliners are keyed by value and not
by name.
Note however that @code{sc-expand} does not have the same signature as
@code{compile-tree-il}. @code{compile-tree-il} is a small wrapper
around @code{sc-expand}, to make it conform to the general form of
compiler procedures in Guile's language tower.
Expressions are matched on their arities. For example:
Compiler procedures take two arguments, an expression and an
environment. They return three values: the compiled expression, the
corresponding environment for the target language, and a
``continuation environment''. The compiled expression and environment
will serve as input to the next language's compiler. The
``continuation environment'' can be used to compile another expression
from the same source language within the same module.
@example
(define-inline eq?
(x y) (eq? x y))
@end example
For example, you might compile the expression, @code{(define-module
(foo))}. This will result in a Tree-IL expression and environment. But
if you compiled a second expression, you would want to take into
account the compile-time effect of compiling the previous expression,
which puts the user in the @code{(foo)} module. That is purpose of the
``continuation environment''; you would pass it as the environment
when compiling the subsequent expression.
This inlines calls to the Scheme procedure, @code{eq?}, to the
instruction @code{eq?}.
A more complicated example would be:
@example
(define-inline +
() 0
(x) x
(x y) (add x y)
(x y . rest) (add x (+ y . rest)))
@end example
@end deffn
Compilers take two arguments, an expression and an environment, and
return two values as well: an expression in the target language, and
an environment suitable for the target language. The format of the
environment is language-dependent.
For Scheme, an environment may be one of three things:
For Scheme, an environment may be one of two things:
@itemize
@item @code{#f}, in which case compilation is performed in the context
of the current module;
@item a module, which specifies the context of the compilation; or
@item a @dfn{compile environment}, which specifies lexical variables
as well.
of the current module; or
@item a module, which specifies the context of the compilation.
@end itemize
The format of a compile environment for scheme is @code{(@var{module}
@var{lexicals} . @var{externals})}, though users are strongly
discouraged from constructing these environments themselves. Instead,
if you need this functionality -- as in GOOPS' dynamic method compiler
-- capture an environment with @code{compile-time-environment}, then
pass that environment to @code{compile}.
@node Tree-IL
@subsection Tree-IL
@deffn {Scheme Procedure} compile-time-environment
A special function known to the compiler that, when compiled, will
return a representation of the lexical environment in place at compile
time. Useful for supporting some forms of dynamic compilation. Returns
@code{#f} if called from the interpreter.
@end deffn
@node GHIL
@subsection GHIL
Guile High Intermediate Language (GHIL) is a structured intermediate
Tree Intermediate Language (Tree-IL) is a structured intermediate
language that is close in expressive power to Scheme. It is an
expanded, pre-analyzed Scheme.
GHIL is ``structured'' in the sense that its representation is based
on records, not S-expressions. This gives a rigidity to the language
that ensures that compiling to a lower-level language only requires a
limited set of transformations. Practically speaking, consider the
GHIL type, @code{<ghil-quote>}, which has fields named @code{env},
@code{loc}, and @code{exp}. Instances of this type are records created
via @code{make-ghil-quote}, and whose fields are accessed as
@code{ghil-quote-env}, @code{ghil-quote-loc}, and
@code{ghil-quote-exp}. There is also a predicate, @code{ghil-quote?}.
@xref{Records}, for more information on records.
Tree-IL is ``structured'' in the sense that its representation is
based on records, not S-expressions. This gives a rigidity to the
language that ensures that compiling to a lower-level language only
requires a limited set of transformations. Practically speaking,
consider the Tree-IL type, @code{<const>}, which has two fields,
@code{src} and @code{exp}. Instances of this type are records created
via @code{make-const}, and whose fields are accessed as
@code{const-src}, and @code{const-exp}. There is also a predicate,
@code{const?}. @xref{Records}, for more information on records.
Expressions of GHIL name their environments explicitly, and all
variables are referenced by identity in addition to by name.
@code{(language ghil)} defines a number of routines to deal explicitly
with variables and environments:
@c alpha renaming
@deftp {Scheme Variable} <ghil-toplevel-env> [table='()]
A toplevel environment. The @var{table} holds all toplevel variables
that have been resolved in this environment.
@end deftp
@deftp {Scheme Variable} <ghil-env> parent [table='()] [variables='()]
A lexical environment. @var{parent} will be the enclosing lexical
environment, or a toplevel environment. @var{table} holds an alist
mapping symbols to variables bound in this environment, while
@var{variables} holds a cumulative list of all variables ever defined
in this environment.
All Tree-IL types have a @code{src} slot, which holds source location
information for the expression. This information, if present, will be
residualized into the compiled object code, allowing backtraces to
show source information. The format of @code{src} is the same as that
returned by Guile's @code{source-properties} function. @xref{Source
Properties}, for more information.
Lexical environments correspond to procedures. Bindings introduced
e.g. by Scheme's @code{let} add to the bindings in a lexical
environment. An example of a case in which a variable might be in
@var{variables} but not in @var{table} would be a variable that is in
the same procedure, but is out of scope.
@end deftp
@deftp {Scheme Variable} <ghil-var> env name kind [index=#f]
A variable. @var{kind} is one of @code{argument}, @code{local},
@code{external}, @code{toplevel}, @code{public}, or @code{private};
see the procedures below for more information. @var{index} is used in
compilation.
@end deftp
@deffn {Scheme Procedure} ghil-var-is-bound? env sym
Recursively look up a variable named @var{sym} in @var{env}, and
return it or @code{#f} if none is found.
@end deffn
@deffn {Scheme Procedure} ghil-var-for-ref! env sym
Recursively look up a variable named @var{sym} in @var{env}, and
return it. If the symbol was not bound, return a new toplevel
variable.
@end deffn
@deffn {Scheme Procedure} ghil-var-for-set! env sym
Like @code{ghil-var-for-ref!}, except that the returned variable will
be marked as @code{external}. @xref{Variables and the VM}.
@end deffn
@deffn {Scheme Procedure} ghil-var-define! toplevel-env sym
Return an existing or new toplevel variable named @var{sym}.
@var{toplevel-env} must be a toplevel environment.
@end deffn
@deffn {Scheme Procedure} ghil-var-at-module! env modname sym interface?
Return a variable that will be resolved at run-time with respect to a
specific module named @var{modname}. If @var{interface?} is true, the
variable will be of type @code{public}, otherwise @code{private}.
@end deffn
@deffn {Scheme Procedure} call-with-ghil-environment env syms func
Bind @var{syms} to fresh variables within a new lexical environment
whose parent is @var{env}, and call @var{func} as @code{(@var{func}
@var{new-env} @var{new-vars})}.
@end deffn
@deffn {Scheme Procedure} call-with-ghil-bindings env syms func
Like @code{call-with-ghil-environment}, except the existing
environment @var{env} is re-used. For that reason, @var{func} is
invoked as @code{(@var{func} @var{new-vars})}
@end deffn
In the aforementioned @code{<ghil-quote>} type, the @var{env} slot
holds a pointer to the environment in which the expression occurs. The
@var{loc} slot holds source location information, so that errors
corresponding to this expression can be mapped back to the initial
expression in the higher-level language, e.g. Scheme. @xref{Compiled
Procedures}, for more information on source location objects.
GHIL also has a declarative serialization format, which makes writing
and reading it a tractable problem for the human mind. Since all GHIL
language constructs contain @code{env} and @code{loc} pointers, they
are left out of the serialization. (Serializing @code{env} structures
would be difficult, as they are often circular.) What is left is the
type of expression, and the remaining slots defined in the expression
type.
For example, an S-expression representation of the @code{<ghil-quote>}
expression would be:
Although Tree-IL objects are represented internally using records,
there is also an equivalent S-expression external representation for
each kind of Tree-IL. For example, an the S-expression representation
of @code{#<const src: #f exp: 3>} expression would be:
@example
(quote 3)
(const 3)
@end example
It's deceptively like Scheme. The general rule is, for a type defined
as @code{<ghil-@var{foo}> env loc @var{slot1} @var{slot2}...}, the
S-expression representation will be @code{(@var{foo} @var{slot1}
@var{slot2}...)}. Users may program with this format directly at the
REPL:
Users may program with this format directly at the REPL:
@example
scheme@@(guile-user)> ,language ghil
Guile High Intermediate Language (GHIL) interpreter 0.3 on Guile 1.9.0
scheme@@(guile-user)> ,language tree-il
Tree Intermediate Language interpreter 1.0 on Guile 1.9.0
Copyright (C) 2001-2008 Free Software Foundation, Inc.
Enter `,help' for help.
ghil@@(guile-user)> (call (ref +) (quote 32) (quote 10))
tree-il@@(guile-user)> (apply (primitive +) (const 32) (const 10))
@result{} 42
@end example
For convenience, some slots are serialized as rest arguments; those
are noted below. The other caveat is that variables are serialized as
their names only, and not their identities.
The @code{src} fields are left out of the external representation.
@deftp {Scheme Variable} <ghil-void> env loc
The unspecified value.
@deftp {Scheme Variable} <void> src
@deftpx {External Representation} (void)
An empty expression. In practice, equivalent to Scheme's @code{(if #f
#f)}.
@end deftp
@deftp {Scheme Variable} <ghil-quote> env loc exp
A quoted expression.
@deftp {Scheme Variable} <const> src exp
@deftpx {External Representation} (const @var{exp})
A constant.
@end deftp
@deftp {Scheme Variable} <primitive-ref> src name
@deftpx {External Representation} (primitive @var{name})
A reference to a ``primitive''. A primitive is a procedure that, when
compiled, may be open-coded. For example, @code{cons} is usually
recognized as a primitive, so that it compiles down to a single
instruction.
Note that unlike in Scheme, there are no self-quoting expressions; all
constants must come from @code{quote} expressions.
Compilation of Tree-IL usually begins with a pass that resolves some
@code{<module-ref>} and @code{<toplevel-ref>} expressions to
@code{<primitive-ref>} expressions. The actual compilation pass
has special cases for applications of certain primitives, like
@code{apply} or @code{cons}.
@end deftp
@deftp {Scheme Variable} <ghil-quasiquote> env loc exp
A quasiquoted expression. The expression is treated as a constant,
except for embedded @code{unquote} and @code{unquote-splicing} forms.
@deftp {Scheme Variable} <lexical-ref> src name gensym
@deftpx {External Representation} (lexical @var{name} @var{gensym})
A reference to a lexically-bound variable. The @var{name} is the
original name of the variable in the source program. @var{gensym} is a
unique identifier for this variable.
@end deftp
@deftp {Scheme Variable} <ghil-unquote> env loc exp
Like Scheme's @code{unquote}; only valid within a quasiquote.
@deftp {Scheme Variable} <lexical-set> src name gensym exp
@deftpx {External Representation} (set! (lexical @var{name} @var{gensym}) @var{exp})
Sets a lexically-bound variable.
@end deftp
@deftp {Scheme Variable} <ghil-unquote-splicing> env loc exp
Like Scheme's @code{unquote-splicing}; only valid within a quasiquote.
@deftp {Scheme Variable} <module-ref> src mod name public?
@deftpx {External Representation} (@@ @var{mod} @var{name})
@deftpx {External Representation} (@@@@ @var{mod} @var{name})
A reference to a variable in a specific module. @var{mod} should be
the name of the module, e.g. @code{(guile-user)}.
If @var{public?} is true, the variable named @var{name} will be looked
up in @var{mod}'s public interface, and serialized with @code{@@};
otherwise it will be looked up among the module's private bindings,
and is serialized with @code{@@@@}.
@end deftp
@deftp {Scheme Variable} <ghil-ref> env loc var
A variable reference. Note that for purposes of serialization,
@var{var} is serialized as its name, as a symbol.
@deftp {Scheme Variable} <module-set> src mod name public? exp
@deftpx {External Representation} (set! (@@ @var{mod} @var{name}) @var{exp})
@deftpx {External Representation} (set! (@@@@ @var{mod} @var{name}) @var{exp})
Sets a variable in a specific module.
@end deftp
@deftp {Scheme Variable} <ghil-set> env loc var val
A variable mutation. @var{var} is serialized as a symbol.
@deftp {Scheme Variable} <toplevel-ref> src name
@deftpx {External Representation} (toplevel @var{name})
References a variable from the current procedure's module.
@end deftp
@deftp {Scheme Variable} <ghil-define> env loc var val
A toplevel variable definition. See @code{ghil-var-define!}.
@deftp {Scheme Variable} <toplevel-set> src name exp
@deftpx {External Representation} (set! (toplevel @var{name}) @var{exp})
Sets a variable in the current procedure's module.
@end deftp
@deftp {Scheme Variable} <ghil-if> env loc test then else
@deftp {Scheme Variable} <toplevel-define> src name exp
@deftpx {External Representation} (define (toplevel @var{name}) @var{exp})
Defines a new top-level variable in the current procedure's module.
@end deftp
@deftp {Scheme Variable} <conditional> src test then else
@deftpx {External Representation} (if @var{test} @var{then} @var{else})
A conditional. Note that @var{else} is not optional.
@end deftp
@deftp {Scheme Variable} <ghil-and> env loc . exps
Like Scheme's @code{and}.
@end deftp
@deftp {Scheme Variable} <ghil-or> env loc . exps
Like Scheme's @code{or}.
@end deftp
@deftp {Scheme Variable} <ghil-begin> env loc . body
Like Scheme's @code{begin}.
@end deftp
@deftp {Scheme Variable} <ghil-bind> env loc vars exprs . body
Like a deconstructed @code{let}: each element of @var{vars} will be
bound to the corresponding GHIL expression in @var{exprs}.
Note that for purposes of the serialization format, @var{exprs} are
evaluated before the new bindings are added to the environment. For
@code{letrec} semantics, there also exists a @code{bindrec} parse
flavor. This is useful for writing GHIL at the REPL, but the
serializer does not currently have the cleverness needed to determine
whether a @code{<ghil-bind>} has @code{let} or @code{letrec}
semantics, and thus only serializes @code{<ghil-bind>} as @code{bind}.
@end deftp
@deftp {Scheme Variable} <ghil-mv-bind> env loc vars rest producer . body
Like Scheme's @code{receive} -- binds the values returned by
applying @code{producer}, which should be a thunk, to the
@code{lambda}-like bindings described by @var{vars} and @var{rest}.
@end deftp
@deftp {Scheme Variable} <ghil-lambda> env loc vars rest meta . body
A closure. @var{vars} is the argument list, serialized as a list of
symbols. @var{rest} is a boolean, which is @code{#t} iff the last
argument is a rest argument. @var{meta} is an association list of
properties. The actual @var{body} should be a list of GHIL
expressions.
@end deftp
@deftp {Scheme Variable} <ghil-call> env loc proc . args
@deftp {Scheme Variable} <application> src proc args
@deftpx {External Representation} (apply @var{proc} . @var{args})
A procedure call.
@end deftp
@deftp {Scheme Variable} <ghil-mv-call> env loc producer consumer
Like Scheme's @code{call-with-values}.
@deftp {Scheme Variable} <sequence> src exps
@deftpx {External Representation} (begin . @var{exps})
Like Scheme's @code{begin}.
@end deftp
@deftp {Scheme Variable} <ghil-inline> env loc op . args
An inlined VM instruction. @var{op} should be the instruction name as
a symbol, and @var{args} should be its arguments, as GHIL expressions.
@deftp {Scheme Variable} <lambda> src names vars meta body
@deftpx {External Representation} (lambda @var{names} @var{vars} @var{meta} @var{body})
A closure. @var{names} is original binding form, as given in the
source code, which may be an improper list. @var{vars} are gensyms
corresponding to the @var{names}. @var{meta} is an association list of
properties. The actual @var{body} is a single Tree-IL expression.
@end deftp
@deftp {Scheme Variable} <ghil-values> env loc . values
Like Scheme's @code{values}.
@deftp {Scheme Variable} <let> src names vars vals exp
@deftpx {External Representation} (let @var{names} @var{vars} @var{vals} @var{exp})
Lexical binding, like Scheme's @code{let}. @var{names} are the
original binding names, @var{vars} are gensyms corresponding to the
@var{names}, and @var{vals} are Tree-IL expressions for the values.
@var{exp} is a single Tree-IL expression.
@end deftp
@deftp {Scheme Variable} <ghil-values*> env loc . values
@var{values} are as in the Scheme expression, @code{(apply values .
@var{vals})}.
@end deftp
@deftp {Scheme Variable} <ghil-reified-env> env loc
Produces, at run-time, a reification of the environment at compile
time. Used in the implementation of Scheme's
@code{compile-time-environment}.
@deftp {Scheme Variable} <letrec> src names vars vals exp
@deftpx {External Representation} (letrec @var{names} @var{vars} @var{vals} @var{exp})
A version of @code{<let>} that creates recursive bindings, like
Scheme's @code{letrec}.
@end deftp
GHIL implements a compiler to GLIL that recursively traverses GHIL
expressions, writing out GLIL expressions into a linear list. The
compiler also keeps some state as to whether the current expression is
in tail context, and whether its value will be used in future
computations. This state allows the compiler not to emit code for
constant expressions that will not be used (e.g. docstrings), and to
perform tail calls when in tail position.
@c FIXME -- need to revive this one
@c @deftp {Scheme Variable} <ghil-mv-bind> src vars rest producer . body
@c Like Scheme's @code{receive} -- binds the values returned by
@c applying @code{producer}, which should be a thunk, to the
@c @code{lambda}-like bindings described by @var{vars} and @var{rest}.
@c @end deftp
Just as the Scheme to GHIL compiler introduced new hidden state---the
environment---the GHIL to GLIL compiler introduces more state, the
stack. While not represented explicitly, the stack is present in the
compilation of each GHIL expression: compiling a GHIL expression
should leave the run-time value stack in the same state. For example,
if the intermediate value stack has two elements before evaluating an
@code{if} expression, it should have two elements after that
expression.
Tree-IL implements a compiler to GLIL that recursively traverses
Tree-IL expressions, writing out GLIL expressions into a linear list.
The compiler also keeps some state as to whether the current
expression is in tail context, and whether its value will be used in
future computations. This state allows the compiler not to emit code
for constant expressions that will not be used (e.g. docstrings), and
to perform tail calls when in tail position.
In the future, there will be a pass at the beginning of the
Tree-IL->GLIL compilation step to perform inlining, copy propagation,
dead code elimination, and constant folding.
Interested readers are encouraged to read the implementation in
@code{(language ghil compile-glil)} for more details.
@code{(language tree-il compile-glil)} for more details.
@node GLIL
@subsection GLIL
Guile Low Intermediate Language (GLIL) is a structured intermediate
language whose expressions closely mirror the functionality of Guile's
VM instruction set.
language whose expressions more closely approximate Guile's VM
instruction set.
Its expression types are defined in @code{(language glil)}, and as
with GHIL, some of its fields parse as rest arguments.
@ -544,14 +466,11 @@ Pushes a constant value onto the stack. @var{obj} must be a number,
string, symbol, keyword, boolean, character, or a pair or vector or
list thereof, or the empty list.
@end deftp
@deftp {Scheme Variable} <glil-argument> op index
Accesses an argument on the stack. If @var{op} is @code{ref}, the
argument is pushed onto the stack; if it is @code{set}, the argument
is set from the top value on the stack, which is popped off.
@end deftp
@deftp {Scheme Variable} <glil-local> op index
Like @code{<glil-argument>}, but for local variables. @xref{Stack
Layout}, for more information.
Accesses a lexically variable from the stack. If @var{op} is
@code{ref}, the value is pushed onto the stack; if it is @code{set},
the variable is set from the top value on the stack, which is popped
off. @xref{Stack Layout}, for more information.
@end deftp
@deftp {Scheme Variable} <glil-external> op depth index
Accesses a heap-allocated variable, addressed by @var{depth}, the nth
@ -607,6 +526,12 @@ Just as in all of Guile's compilers, an environment is passed to the
GLIL-to-object code compiler, and one is returned as well, along with
the object code.
@node Assembly
@subsection Assembly
@node Bytecode
@subsection Bytecode
@node Object Code
@subsection Object Code

View file

@ -111,7 +111,7 @@ The registers that a VM has are as follows:
In other architectures, the instruction pointer is sometimes called
the ``program counter'' (pc). This set of registers is pretty typical
for stack machines; their exact meanings in the context of Guile's VM
is described in the next section.
are described in the next section.
A virtual machine executes by loading a compiled procedure, and
executing the object code associated with that procedure. Of course,
@ -119,14 +119,17 @@ that procedure may call other procedures, tail-call others, ad
infinitum---indeed, within a guile whose modules have all been
compiled to object code, one might never leave the virtual machine.
@c wingo: I wish the following were true, but currently we just use
@c the one engine. This kind of thing is possible tho.
@c wingo: The following is true, but I don't know in what context to
@c describe it. A documentation FIXME.
@c A VM may have one of three engines: reckless, regular, or debugging.
@c Reckless engine is fastest but dangerous. Regular engine is normally
@c fail-safe and reasonably fast. Debugging engine is safest and
@c functional but very slow.
@c (Actually we have just a regular and a debugging engine; normally
@c we use the latter, it's almost as fast as the ``regular'' engine.)
@node Stack Layout
@subsection Stack Layout
@ -174,7 +177,7 @@ The structure of the fixed part of an application frame is as follows:
In the above drawing, the stack grows upward. The intermediate values
stored in the application of this frame are stored above
@code{SCM_FRAME_UPPER_ADDRESS (fp)}. @code{bp} refers to the
@code{struct scm_program*} data associated with the program at
@code{struct scm_objcode} data associated with the program at
@code{fp - 1}. @code{nargs} and @code{nlocs} are properties of the
compiled procedure, which will be discussed later.
@ -226,7 +229,7 @@ programs are implemented, @xref{VM Programs}.
@node Variables and the VM
@subsection Variables and the VM
Let's think about the following Scheme code as an example:
Consider the following Scheme code as an example:
@example
(define (foo a)
@ -236,22 +239,15 @@ Let's think about the following Scheme code as an example:
Within the lambda expression, "foo" is a top-level variable, "a" is a
lexically captured variable, and "b" is a local variable.
That is to say: @code{b} may safely be allocated on the stack, as
there is no enclosed procedure that references it, nor is it ever
mutated.
@code{b} may safely be allocated on the stack, as there is no enclosed
procedure that references it, nor is it ever mutated.
@code{a}, on the other hand, is referenced by an enclosed procedure,
that of the lambda. Thus it must be allocated on the heap, as it may
(and will) outlive the dynamic extent of the invocation of @code{foo}.
@code{foo} is a toplevel variable, as mandated by Scheme's semantics:
@example
(define proc (foo 'bar)) ; assuming prev. definition of @code{foo}
(define foo 42) ; redefinition
(proc 'baz)
@result{} (42 bar baz)
@end example
@code{foo} is a top-level variable, because it names the procedure
@code{foo}, which is here defined at the top-level.
Note that variables that are mutated (via @code{set!}) must be
allocated on the heap, even if they are local variables. This is
@ -276,6 +272,7 @@ You can pick apart these pieces with the accessors in @code{(system vm
program)}. @xref{Compiled Procedures}, for a full API reference.
@cindex object table
@cindex object array
The object array of a compiled procedure, also known as the
@dfn{object table}, holds all Scheme objects whose values are known
not to change across invocations of the procedure: constant strings,
@ -293,31 +290,27 @@ instruction, which uses the object vector, and are almost as fast as
local variable references.
We can see how these concepts tie together by disassembling the
@code{foo} function to see what is going on:
@code{foo} function we defined earlier to see what is going on:
@smallexample
scheme@@(guile-user)> (define (foo a) (lambda (b) (list foo a b)))
scheme@@(guile-user)> ,x foo
Disassembly of #<program foo (a)>:
Bytecode:
0 (local-ref 0) ;; `a' (arg)
2 (external-set 0) ;; `a' (arg)
4 (object-ref 0) ;; #<program #(0 28 #f) (b)>
6 (make-closure) at (unknown file):0:16
4 (object-ref 1) ;; #<program b70d2910 at <unknown port>:0:16 (b)>
6 (make-closure)
7 (return)
----------------------------------------
Disassembly of #<program #(0 28 #f) (b)>:
Disassembly of #<program b70d2910 at <unknown port>:0:16 (b)>:
Bytecode:
0 (toplevel-ref 0) ;; `list'
2 (toplevel-ref 1) ;; `foo'
4 (external-ref 0) ;; (closure variable)
6 (local-ref 0) ;; `b' (arg)
8 (goto/args 3) at (unknown file):0:28
0 (toplevel-ref 1) ;; `foo'
2 (external-ref 0) ;; (closure variable)
4 (local-ref 0) ;; `b' (arg)
6 (list 0 3) ;; 3 elements at (unknown file):0:28
9 (return)
@end smallexample
At @code{ip} 0 and 2, we do the copy from argument to heap for
@ -336,8 +329,9 @@ Control Instructions}, for more details.
Then we see a reference to an external variable, corresponding to
@code{a}. The disassembler doesn't have enough information to give a
name to that variable, so it just marks it as being a ``closure
variable''. Finally we see the reference to @code{b}, then a tail call
(@code{goto/args}) with three arguments.
variable''. Finally we see the reference to @code{b}, then the
@code{list} opcode, an inline implementation of the @code{list} scheme
routine.
@node Instruction Set
@subsection Instruction Set
@ -365,7 +359,8 @@ their own test-and-branch instructions:
@end example
In addition, some Scheme primitives have their own inline
implementations, e.g. @code{cons}.
implementations, e.g. @code{cons}, and @code{list}, as we saw in the
previous section.
So Guile's instruction set is a @emph{complete} instruction set, in
that it provides the instructions that are suited to the problem, and
@ -421,12 +416,6 @@ efficient in the future via addressing by frame and index. Currently,
external variables are all consed onto a list, which results in O(N)
lookup time.
@deffn Instruction externals
Pushes the current list of external variables onto the stack. This
instruction is used in the implementation of
@code{compile-time-environment}. @xref{The Scheme Compiler}.
@end deffn
@deffn Instruction toplevel-ref index
Push the value of the toplevel binding whose location is stored in at
position @var{index} in the object table.
@ -440,11 +429,11 @@ created.
Alternately, the lookup may be performed relative to a particular
module, determined at compile-time (e.g. via @code{@@} or
@code{@@@@}). In that case, the cell in the object table holds a list:
@code{(@var{modname} @var{sym} @var{interface?})}. The symbol
@var{sym} will be looked up in the module named @var{modname} (a list
of symbols). The lookup will be performed against the module's public
interface, unless @var{interface?} is @code{#f}, which it is for
example when compiling @code{@@@@}.
@code{(@var{modname} @var{sym} @var{public?})}. The symbol @var{sym}
will be looked up in the module named @var{modname} (a list of
symbols). The lookup will be performed against the module's public
interface, unless @var{public?} is @code{#f}, which it is for example
when compiling @code{@@@@}.
In any case, if the symbol is unbound, an error is signalled.
Otherwise the initial form is replaced with the looked-up variable, an
@ -550,8 +539,9 @@ may be encoded in 1, 2, or 4 bytes.
@deffn Instruction load-integer length
@deffnx Instruction load-unsigned-integer length
Load a 32-bit integer (respectively unsigned integer) from the
instruction stream.
Load a 32-bit integer or unsigned integer from the instruction stream.
The bytes of the integer are read in order of decreasing significance
(i.e., big-endian).
@end deffn
@deffn Instruction load-number length
Load an arbitrary number from the instruction stream. The number is
@ -573,43 +563,23 @@ the current toplevel environment, creating the binding if necessary.
Push the variable corresponding to the binding.
@end deffn
@deffn Instruction load-program length
@deffn Instruction load-program
Load bytecode from the instruction stream, and push a compiled
procedure. This instruction pops the following values from the stack:
procedure.
@itemize
@item Optionally, a thunk, which when called should return metadata
associated with this program---for example its name, the names of its
arguments, its documentation string, debugging information, etc.
This instruction pops one value from the stack: the program's object
table, as a vector, or @code{#f} in the case that the program has no
object table. A program that does not reference toplevel bindings and
does not use @code{object-ref} does not need an object table.
Normally, this thunk its itself a compiled procedure (with no
metadata). Metadata is represented this way so that the initial load
of a procedure is fast: the VM just mmap's the thunk and goes. The
symbols and pairs associated with the metadata are only created if the
user asks for them.
For information on the format of the thunk's return value,
@xref{Compiled Procedures}.
@item Optionally, the program's object table, as a vector.
A program that does not reference toplevel bindings and does not use
@code{object-ref} does not need an object table.
@item Finally, either one immediate integer or four immediate integers
representing the arity of the program.
In the four-fixnum case, the values are respectively the number of
arguments taken by the function (@var{nargs}), the number of @dfn{rest
arguments} (@var{nrest}, 0 or 1), the number of local variables
(@var{nlocs}) and the number of external variables (@var{nexts})
(@pxref{Environment Control Instructions}).
The common single-fixnum case represents all of these values within a
16-bit bitmask.
@end itemize
This instruction is unlike the rest of the loading instructions,
because instead of parsing its data, it directly maps the instruction
stream onto a C structure, @code{struct scm_objcode}. @xref{Object
Code}, for more information.
The resulting compiled procedure will not have any ``external''
variables captured, so it will be loaded only once but may be used
many times to create closures.
variables captured, so it may be loaded only once but used many times
to create closures.
@end deffn
Finally, while this instruction is not strictly a ``loading''
@ -620,7 +590,10 @@ here:
Pop the program object from the stack, capture the current set of
``external'' variables, and assign those external variables to a copy
of the program. Push the new program object, which shares state with
the original program. Also captures the current module.
the original program.
At the time of this writing, the space overhead of closures is 4 words
per closure.
@end deffn
@node Procedural Instructions
@ -640,22 +613,24 @@ set to the returned value.
@deffn Instruction call nargs
Call the procedure located at @code{sp[-nargs]} with the @var{nargs}
arguments located from @code{sp[0]} to @code{sp[-nargs + 1]}.
arguments located from @code{sp[-nargs + 1]} to @code{sp[0]}.
For compiled procedures, this instruction sets up a new stack frame,
as described in @ref{Stack Layout}, and then dispatches to the first
instruction in the called procedure, relying on the called procedure
to return one value to the newly-created continuation. Because the new
frame pointer will point to sp[-nargs + 1], the arguments don't have
to be shuffled around -- they are already in place.
For non-compiled procedures (continuations, primitives, and
interpreted procedures), @code{call} will pop the procedure and
arguments off the stack, and push the result of calling
@code{scm_apply}.
For compiled procedures, this instruction sets up a new stack frame,
as described in @ref{Stack Layout}, and then dispatches to the first
instruction in the called procedure, relying on the called procedure
to return one value to the newly-created continuation.
@end deffn
@deffn Instruction goto/args nargs
Like @code{call}, but reusing the current continuation. This
instruction implements tail calling as required by RnRS.
instruction implements tail calls as required by RnRS.
For compiled procedures, that means that @code{goto/args} reuses the
current frame instead of building a new one. The @code{goto/*}
@ -726,14 +701,14 @@ values. This is an optimization for the common @code{(apply values
@deffn Instruction truncate-values nbinds nrest
Used in multiple-value continuations, this instruction takes the
values that are on the stack (including the number-of-value marker)
values that are on the stack (including the number-of-values marker)
and truncates them for a binding construct.
For example, a call to @code{(receive (x y . z) (foo) ...)} would,
logically speaking, pop off the values returned from @code{(foo)} and
push them as three values, corresponding to @code{x}, @code{y}, and
@code{z}. In that case, @var{nbinds} would be 3, and @var{nrest} would
be 1 (to indicate that one of the bindings was a rest arguments).
be 1 (to indicate that one of the bindings was a rest argument).
Signals an error if there is an insufficient number of values.
@end deffn
@ -779,12 +754,14 @@ Push @var{value}, an 8-bit character, onto the stack.
@deffn Instruction list n
Pops off the top @var{n} values off of the stack, consing them up into
a list, then pushes that list on the stack. What was the topmost value
will be the last element in the list.
will be the last element in the list. @var{n} is a two-byte value,
most significant byte first.
@end deffn
@deffn Instruction vector n
Create and fill a vector with the top @var{n} values from the stack,
popping off those values and pushing on the resulting vector.
popping off those values and pushing on the resulting vector. @var{n}
is a two-byte value, like in @code{vector}.
@end deffn
@deffn Instruction mark
@ -850,9 +827,8 @@ Pushes ``the unspecified value'' onto the stack.
@subsubsection Inlined Scheme Instructions
The Scheme compiler can recognize the application of standard Scheme
procedures, or unbound variables that look like they are bound to
standard Scheme procedures. It tries to inline these small operations
to avoid the overhead of creating new stack frames.
procedures. It tries to inline these small operations to avoid the
overhead of creating new stack frames.
Since most of these operations are historically implemented as C
primitives, not inlining them would entail constantly calling out from
@ -876,12 +852,12 @@ stream.
@deffnx Instruction eqv? x y
@deffnx Instruction equal? x y
@deffnx Instruction pair? x y
@deffnx Instruction list? x y
@deffnx Instruction list? x
@deffnx Instruction set-car! pair x
@deffnx Instruction set-cdr! pair x
@deffnx Instruction slot-ref struct n
@deffnx Instruction slot-set struct n x
@deffnx Instruction cons x
@deffnx Instruction cons x y
@deffnx Instruction car x
@deffnx Instruction cdr x
Inlined implementations of their Scheme equivalents.

View file

@ -410,12 +410,6 @@ VM_DEFINE_INSTRUCTION (29, toplevel_set, "toplevel-set", 1, 1, 0)
NEXT;
}
VM_DEFINE_INSTRUCTION (30, externals, "externals", 0, 0, 1)
{
PUSH (external);
NEXT;
}
/*
* branch and jump

View file

@ -198,6 +198,8 @@ OBJECT can be a procedure, macro or any object that has its
(object-property object 'documentation)
(and (program? object)
(program-documentation object))
(and (macro? object)
(object-documentation (macro-transformer object)))
(and (procedure? object)
(not (closure? object))
(procedure-name object)

View file

@ -82,7 +82,7 @@
(if (program? x)
(begin (display "----------------------------------------\n")
(disassemble x))))
(cddr (vector->list objs))))))
(cdr (vector->list objs))))))
(else
(error "bad load-program form" asm))))

View file

@ -33,7 +33,6 @@
#:title "Guile ECMAScript"
#:version "3.0"
#:reader (lambda () (read-ecmascript/1 (current-input-port)))
#:read-file read-ecmascript
#:compilers `((ghil . ,compile-ghil))
;; a pretty-printer would be interesting.
#:printer write

View file

@ -32,12 +32,6 @@
(read-enable 'positions)
(define (read-file port)
(do ((x (read port) (read port))
(l '() (cons x l)))
((eof-object? x)
(cons 'begin (reverse! l)))))
;;;
;;; Language definition
;;;
@ -46,7 +40,6 @@
#:title "Guile Scheme"
#:version "0.5"
#:reader read
#:read-file read-file
#:compilers `((tree-il . ,compile-tree-il)
(ghil . ,compile-ghil))
#:decompilers `((tree-il . ,decompile-tree-il))

View file

@ -21,12 +21,8 @@
#:use-module (system base syntax)
#:export (tree-il-src
<lexical> make-lexical
lexical-name lexical-gensym
<void> void? make-void void-src
<application> application? make-application application-src application-proc application-args
<conditional> conditional? make-conditional conditional-src conditional-test conditional-then conditional-else
<const> const? make-const const-src const-exp
<primitive-ref> primitive-ref? make-primitive-ref primitive-ref-src primitive-ref-name
<lexical-ref> lexical-ref? make-lexical-ref lexical-ref-src lexical-ref-name lexical-ref-gensym
<lexical-set> lexical-set? make-lexical-set lexical-set-src lexical-set-name lexical-set-gensym lexical-set-exp
@ -35,9 +31,10 @@
<toplevel-ref> toplevel-ref? make-toplevel-ref toplevel-ref-src toplevel-ref-name
<toplevel-set> toplevel-set? make-toplevel-set toplevel-set-src toplevel-set-name toplevel-set-exp
<toplevel-define> toplevel-define? make-toplevel-define toplevel-define-src toplevel-define-name toplevel-define-exp
<lambda> lambda? make-lambda lambda-src lambda-names lambda-vars lambda-meta lambda-body
<const> const? make-const const-src const-exp
<conditional> conditional? make-conditional conditional-src conditional-test conditional-then conditional-else
<application> application? make-application application-src application-proc application-args
<sequence> sequence? make-sequence sequence-src sequence-exps
<lambda> lambda? make-lambda lambda-src lambda-names lambda-vars lambda-meta lambda-body
<let> let? make-let let-src let-names let-vars let-vals let-exp
<letrec> letrec? make-letrec letrec-src letrec-names letrec-vars letrec-vals letrec-exp
@ -50,8 +47,7 @@
(define-type (<tree-il> #:common-slots (src))
(<void>)
(<application> proc args)
(<conditional> test then else)
(<const> exp)
(<primitive-ref> name)
(<lexical-ref> name gensym)
(<lexical-set> name gensym exp)
@ -60,28 +56,19 @@
(<toplevel-ref> name)
(<toplevel-set> name exp)
(<toplevel-define> name exp)
(<lambda> names vars meta body)
(<const> exp)
(<conditional> test then else)
(<application> proc args)
(<sequence> exps)
(<lambda> names vars meta body)
(<let> names vars vals exp)
(<letrec> names vars vals exp))
(define <lexical> <lexical-ref>)
(define lexical? lexical-ref?)
(define make-lexical make-lexical-ref)
(define lexical-name lexical-ref-name)
(define lexical-gensym lexical-ref-gensym)
;; FIXME: use this in psyntax
(define (location x)
(and (pair? x)
(let ((props (source-properties x)))
(and (not (null? props))
(vector (assq-ref props 'line)
(assq-ref props 'column)
(assq-ref props 'filename))))))
(and (pair? props) props))))
(define (parse-tree-il exp)
(let ((loc (location exp))

View file

@ -152,7 +152,7 @@
(define-primitive-expander /
(x) (/ 1 x)
(x y z . rest) (div x (* y z . rest)))
(x y z . rest) (/ x (* y z . rest)))
(define-primitive-expander caar (x) (car (car x)))
(define-primitive-expander cadr (x) (car (cdr x)))

View file

@ -135,11 +135,6 @@
;;; Compiler interface
;;;
(define (read-file-in file lang)
(call-with-input-file file
(or (language-read-file lang)
(error "language has no #:read-file" lang))))
(define (compile-passes from to opts)
(map cdr
(or (lookup-compilation-order from to)

View file

@ -23,7 +23,7 @@
#:use-module (system base syntax)
#:export (define-language language? lookup-language make-language
language-name language-title language-version language-reader
language-printer language-parser language-read-file
language-printer language-parser
language-compilers language-decompilers language-evaluator
language-joiner
@ -42,7 +42,6 @@
reader
printer
(parser #f)
(read-file #f)
(compilers '())
(decompilers '())
(evaluator #f)