mirror of https://git.savannah.gnu.org/git/guile.git synced 2025-04-30 03:40:34 +02:00
guile/doc/ref/compiler.texi
Andy Wingo ca445ba5ec rename translate.scm to compile-ghil.scm, and more work on compiler.texi
* doc/ref/api-evaluation.texi: Fix some typos and xrefs.

* doc/ref/compiler.texi (The Scheme Compiler): Document the scheme
  compiler, and start documenting the GHIL language.

* doc/ref/guile.texi (Guile Implementation): Whoops, put autoconf after
  the implementation foo. Unless we want it before?

* doc/ref/history.texi (The Emacs Thesis): Fix typo.

* doc/ref/vm.texi (Environment Control Instructions): Rename offset to
  index.

* module/language/ghil.scm (parse-ghil): Fix what I think was a bug --
  the consumer in a mv-call shouldn't be a rest arg.

* module/language/scheme/Makefile.am (SOURCES):
* module/language/scheme/compile-ghil.scm: Rename this file from
  translate.scm.

* module/oop/goops.scm:
* module/language/scheme/spec.scm: Deal with renaming.
2009-01-09 17:49:09 +01:00


@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
@c Copyright (C) 2008
@c Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.
@node Compiling to the Virtual Machine
@section Compiling to the Virtual Machine
Compilers have a mystique about them that is attractive and
off-putting at the same time. They are attractive because they are
magical -- they transform inert text into live results, like throwing
the switch on Frankenstein. However, this magic is perceived by many
to be impenetrable.

This section aims to pull back the veil from over Guile's compiler
implementation, and pay attention to the small man behind the curtain.

@xref{Read/Load/Eval/Compile}, if you're lost and you just wanted to
know how to compile your .scm file.
@menu
* Compiler Tower::
* The Scheme Compiler::
* GHIL::
* GLIL::
* Object Code::
* Extending the Compiler::
@end menu
FIXME: document the new REPL somewhere?
@node Compiler Tower
@subsection Compiler Tower
Guile's compiler is quite simple, actually -- or rather, its
@emph{compilers} are. Guile defines a tower of languages, starting
at Scheme and progressively simplifying down to languages that
resemble the VM instruction set (@pxref{Instruction Set}).

Each language knows how to compile to the next, so each step is simple
and understandable. Furthermore, this set of languages is not
hardcoded into Guile, so it is possible for the user to add new
high-level languages, new passes, or even different compilation
targets.

Languages are registered in the module, @code{(system base language)}:
@example
(use-modules (system base language))
@end example
They are registered with the @code{define-language} form.
@deffn {Scheme Syntax} define-language @
name title version reader printer @
[parser=#f] [read-file=#f] [compilers='()] [evaluator=#f]
Define a language.
This syntax defines a @code{#<language>} object, bound to @var{name}
in the current environment. In addition, the language will be added to
the global language set. For example, this is the language definition
for Scheme:
@example
(define-language scheme
  #:title "Guile Scheme"
  #:version "0.5"
  #:reader read
  #:read-file read-file
  #:compilers `((,ghil . ,compile-ghil))
  #:evaluator (lambda (x module) (primitive-eval x))
  #:printer write)
@end example
In this example, from @code{(language scheme spec)}, @code{read-file}
reads expressions from a port and wraps them in a @code{begin} block.
@end deffn
The interesting thing about having languages defined this way is that
they present a uniform interface to the read-eval-print loop. This
allows the user to change the current language of the REPL:
@example
$ guile
Guile Scheme interpreter 0.5 on Guile 1.9.0
Copyright (C) 2001-2008 Free Software Foundation, Inc.
Enter `,help' for help.
scheme@@(guile-user)> ,language ghil
Guile High Intermediate Language (GHIL) interpreter 0.3 on Guile 1.9.0
Copyright (C) 2001-2008 Free Software Foundation, Inc.
Enter `,help' for help.
ghil@@(guile-user)>
@end example
Languages can be looked up by name, as they were above.
@deffn {Scheme Procedure} lookup-language name
Looks up a language named @var{name}, autoloading it if necessary.
Languages are autoloaded by looking for a variable named @var{name} in
a module named @code{(language @var{name} spec)}.

The language object will be returned, or @code{#f} if no language
with that name exists.
@end deffn
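As a minimal sketch of the lookup described above, the following
fetches the Scheme language object and prints its title. The accessor
@code{language-title} is an assumption here: it is taken to mirror the
@code{title} field of @code{define-language}, and its name may differ
between Guile versions.

@example
(use-modules (system base language))

;; Look the language up by name; its spec module is autoloaded on demand.
(define scheme-language (lookup-language 'scheme))

;; Assumed accessor for the #:title field given to define-language.
(display (language-title scheme-language))
(newline)
@end example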
Defining languages this way allows us to programmatically determine
the necessary steps for compiling code from one language to another.
@deffn {Scheme Procedure} lookup-compilation-order from to
Recursively traverses the set of languages to which @var{from} can
compile, depth-first, and returns the first path that can transform
@var{from} to @var{to}. Returns @code{#f} if no path is found.

This function memoizes its results in a cache that is invalidated by
subsequent calls to @code{define-language}, so it should be quite
fast.
@end deffn
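The following sketch asks for a compilation path from Scheme to the
@code{value} language. It passes language objects obtained from
@code{lookup-language} rather than bare names, which is an assumption
about what @code{lookup-compilation-order} accepts. The exact shape of
the returned path is not relied upon, only that it is a list.

@example
(use-modules (system base language))

;; Find a chain of compilers leading from Scheme to `value'.
(define path
  (lookup-compilation-order (lookup-language 'scheme)
                            (lookup-language 'value)))

;; PATH is #f when no chain of compilers connects the two languages.
(if path
    (format #t "~a compilation step(s)\n" (length path))
    (display "no path found\n"))
@end example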
There is a notion of a ``current language'', which is maintained in
the @code{*current-language*} fluid. This language is normally Scheme,
and may be rebound by the user. The runtime compilation interfaces
(@pxref{Read/Load/Eval/Compile}) also allow you to choose other source
and target languages.
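For instance, the source and target languages can be named explicitly
when calling @code{compile}. This is only a sketch: it assumes that
@code{compile} is exported by @code{(system base compile)} and accepts
@code{#:from} and #:to keyword arguments, so check the signature in
your version of Guile.

@example
(use-modules (system base compile))

;; Compile a Scheme expression all the way down to a live value.
;; The #:from and #:to keywords are assumed, as noted above.
(define result
  (compile '(+ 1 2) #:from 'scheme #:to 'value))

(display result)
(newline)
@end example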
The normal tower of languages when compiling Scheme goes like this:
@itemize
@item Scheme, which we know and love
@item Guile High Intermediate Language (GHIL)
@item Guile Low Intermediate Language (GLIL)
@item Object code
@end itemize
Object code may be serialized to disk directly, though it has a cookie
and version prepended to the front. But when compiling Scheme at
runtime, you want a Scheme value, e.g. a compiled procedure. For this
reason, so as not to break the abstraction, Guile defines a fake
language, @code{value}. Compiling to @code{value} loads the object
code into a procedure, and wakes the sleeping giant.

Perhaps this strangeness can be explained by example:
@code{compile-file} defaults to compiling to object code, because it
produces object code that has to live in the barren world outside the
Guile runtime; but @code{compile} defaults to compiling to
@code{value}, as its product re-enters the Guile world.

Indeed, the process of compilation can circulate through these
different worlds indefinitely, as shown by the following quine:
@example
((lambda (x) ((compile x) x)) '(lambda (x) ((compile x) x)))
@end example
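To make the @code{value} language more concrete, here is a small
sketch in which the product of @code{compile} is an ordinary procedure
that can be called right away. It assumes that @code{compile} is
available from @code{(system base compile)} and that its default
target is @code{value}, as described above. The name @code{square} is
purely illustrative.

@example
(use-modules (system base compile))

;; Compiling a lambda expression to `value' yields a procedure.
(define square
  (compile '(lambda (x) (* x x))))

;; The compiled procedure is applied like any other.
(display (square 7))
(newline)
@end example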
@node The Scheme Compiler
@subsection The Scheme Compiler
The job of the Scheme compiler is to expand all macros and to resolve
all symbols to lexical variables. Its target language, GHIL, is fairly
close to Scheme itself, so this process is not very complicated.
The Scheme compiler is driven by a table of @dfn{translators},
declared with the @code{define-scheme-translator} form, defined in the
module, @code{(language scheme compile-ghil)}.
@deffn {Scheme Syntax} define-scheme-translator head clause1 clause2...
The best documentation of this form is probably an example. Here is
the translator for @code{if}:
@example
(define-scheme-translator if
  ;; (if TEST THEN [ELSE])
  ((,test ,then)
   (make-ghil-if e l (retrans test) (retrans then) (retrans '(begin))))
  ((,test ,then ,else)
   (make-ghil-if e l (retrans test) (retrans then) (retrans else))))
@end example
The match syntax is from the @code{pmatch} macro, defined in
@code{(system base pmatch)}. The result of a clause should be a valid
GHIL value. If no clause matches, a syntax error is signalled.
In the body of the clauses, the following bindings are introduced:
@itemize
@item @code{e}, the current environment
@item @code{l}, the current source location (or @code{#f})
@item @code{retrans}, a procedure that may be called to compile
subexpressions
@end itemize
Note that translators are looked up by @emph{value}, not by name. That
is to say, the translator is keyed under the @emph{value} of
@code{if}, which normally prints as @code{#<primitive-builtin-macro!
if>}.
@end deffn
Users can extend the compiler by defining new translators.
Additionally, some forms can be inlined directly to
instructions. @xref{Inlined Scheme Instructions}, for a list. The
actual inliners are defined in @code{(language scheme inline)}:
@deffn {Scheme Syntax} define-inline head arity1 result1 arity2 result2...
Defines an inliner for @code{head}. As in
@code{define-scheme-translator}, inliners are keyed by value and not
by name.
Expressions are matched on their arities. For example:
@example
(define-inline eq?
  (x y) (eq? x y))
@end example
This inlines calls to the Scheme procedure, @code{eq?}, to the
instruction @code{eq?}.
A more complicated example would be:
@example
(define-inline +
  () 0
  (x) x
  (x y) (add x y)
  (x y . rest) (add x (+ y . rest)))
@end example
@end deffn
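To see how the arity clauses for @code{+} play out, the rewriting can
be modelled in plain Scheme. The procedure @code{expand-plus} below is
hypothetical and is not part of Guile; it only mimics what repeated
application of the four clauses would produce for a given operand
list.

@example
;; A model of the `+' inliner; ARGS is the operand list of the call.
(define (expand-plus args)
  (cond
   ;; ()  =>  0
   ((null? args) 0)
   ;; (x)  =>  x
   ((null? (cdr args)) (car args))
   ;; (x y)  =>  (add x y)
   ((null? (cddr args))
    (list 'add (car args) (cadr args)))
   ;; (x y . rest)  =>  (add x (+ y . rest)), expanded again
   (else
    (list 'add (car args) (expand-plus (cdr args))))))

(expand-plus '(a b c))   ; evaluates to (add a (add b c))
@end example

In this model a call with three or more operands becomes a chain of
nested two-operand @code{add} forms.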
Compilers take two arguments, an expression and an environment, and
return two values as well: an expression in the target language, and
an environment suitable for the target language. The format of the
environment is language-dependent.
For Scheme, an environment may be one of three things:
@itemize
@item @code{#f}, in which case compilation is performed in the context
of the current module;
@item a module, which specifies the context of the compilation; or
@item a @dfn{compile environment}, which specifies lexical variables
as well.
@end itemize
The format of a compile environment for Scheme is @code{(@var{module}
@var{lexicals} . @var{externals})}, though users are strongly
discouraged from constructing these environments themselves. Instead,
if you need this functionality -- as in GOOPS' dynamic method compiler
-- capture an environment with @code{compile-time-environment}, then
pass that environment to @code{compile}.
@deffn {Scheme Procedure} compile-time-environment
A special function known to the compiler that, when compiled, will
return a representation of the lexical environment in place at compile
time. Useful for supporting some forms of dynamic compilation. Returns
@code{#f} if called from the interpreter.
@end deffn
@node GHIL
@subsection GHIL
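GHIL is a structured, typed intermediate language that stays close to
Scheme and has an s-expression representation. You can switch the REPL
to it with @code{,language ghil}.

FIXME: document the reified format, as it's more interesting, and
gives you an idea.

All of the expressions below also carry environment and location
pointers.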
@deffn {GHIL Expression} quote exp
A quoted expression.
@end deffn
@deffn {GHIL Expression} quasiquote exp
A quasiquoted expression. The parse format understands the normal
@code{unquote} and @code{unquote-splicing} forms as in normal Scheme.
When constructing @var{exp} programmatically, you will need to call
@code{make-ghil-unquote} and @code{make-ghil-unquote-splicing} as
appropriate.
@end deffn
@deffn {GHIL Expression} lambda syms rest meta . body
A closure. @var{syms} is the argument list, as a list of symbols.
@var{rest} is a boolean, which is @code{#t} iff the last argument is a
rest argument. @var{meta} is an association list of properties. The
actual @var{body} should be a list of GHIL expressions.
@end deffn
@deffn {GHIL Expression} void
The unspecified value.
@end deffn
@deffn {GHIL Expression} begin . body
Like Scheme's @code{begin}.
@end deffn
@deffn {GHIL Expression} bind syms exprs . body
Like a deconstructed @code{let}: each element of @var{syms} will be
bound to the corresponding GHIL expression in @var{exprs}.
@end deffn
@deffn {GHIL Expression} bindrec syms exprs . body
As @code{bind} is to @code{let}, so @code{bindrec} is to
@code{letrec}.
@end deffn
@deffn {GHIL Expression} set! sym val
Like Scheme's @code{set!}.
@end deffn
@deffn {GHIL Expression} define sym val
Like Scheme's @code{define}, but without the lambda sugar of course.
@end deffn
@deffn {GHIL Expression} if test then else
A conditional. Note that @var{else} is not optional.
@end deffn
@deffn {GHIL Expression} and . exps
Like Scheme's @code{and}.
@end deffn
@deffn {GHIL Expression} or . exps
Like Scheme's @code{or}.
@end deffn
@deffn {GHIL Expression} mv-bind syms rest producer . body
Like Scheme's @code{receive} -- binds the values returned by
applying @var{producer}, which should be a thunk, to the
@code{lambda}-like bindings described by @var{syms} and @var{rest}.
@end deffn
@deffn {GHIL Expression} call proc . args
A procedure call.
@end deffn
@deffn {GHIL Expression} mv-call producer consumer
Like Scheme's @code{call-with-values}.
@end deffn
@deffn {GHIL Expression} inline op . args
An inlined VM instruction. @var{op} should be the instruction name as
a symbol, and @var{args} should be its arguments, as GHIL expressions.
@end deffn
@deffn {GHIL Expression} values . values
Like Scheme's @code{values}.
@end deffn
@deffn {GHIL Expression} values* . values
@var{values} are as in the Scheme expression, @code{(apply values .
@var{values})}.
@end deffn
@deffn {GHIL Expression} compile-time-environment
Produces, at runtime, a reification of the environment at compile
time.
@end deffn
FIXME: document GHIL environments, including @code{ghil-var-for-ref!},
@code{ghil-var-for-set!}, @code{ghil-var-define!}, and
@code{ghil-var-at-module!}.

FIXME: discuss pre-optimization. The real name of the game is closure
elimination -- fixing @code{letrec}.
@node GLIL
@subsection GLIL
GLIL is a structured, typed intermediate language that stays close to
object code. The environment is simply passed through.

GLIL has no @code{let}, no @code{lambda}, and no closures, just labels
and branches and constants and code. Well, there's a bit more, but
that's the flavor of GLIL.

Compiled code will effectively be a thunk, of no arguments, but
optionally closing over some number of variables (which should be
captured via @code{make-closure}, @pxref{Loading Instructions}).
@node Object Code
@subsection Object Code
FIXME: describe the environment, which consists of a module plus
externals (the actual values!).

The environment is used when compiling to @code{value} -- effectively
calling the thunk from @code{objcode->program} with a certain current
module and with those externals. This lets you recompile a closure at
runtime, a trick that GOOPS uses.
@node Extending the Compiler
@subsection Extending the Compiler
FIXME: topics to cover here include JIT compilation, AOT compilation,
a link to what Dybvig did, profiling, and startup time.