mirror of
https://git.savannah.gnu.org/git/guile.git
synced 2025-05-02 13:00:26 +02:00
* doc/ref/compiler.texi (Compiler Tower): Reword a couple things. (Tree-IL): Add more vertical space, for readability in info.
600 lines
24 KiB
Text
@c -*-texinfo-*-
|
|
@c This is part of the GNU Guile Reference Manual.
|
|
@c Copyright (C) 2008, 2009, 2010, 2011, 2012, 2013
|
|
@c Free Software Foundation, Inc.
|
|
@c See the file guile.texi for copying conditions.
|
|
|
|
@node Compiling to the Virtual Machine
|
|
@section Compiling to the Virtual Machine
|
|
|
|
Compilers! The word itself inspires excitement and awe, even among
|
|
experienced practitioners. But a compiler is just a program: an
|
|
eminently hackable thing. This section aims to describe Guile's
|
|
compiler in such a way that interested Scheme hackers can feel
|
|
comfortable reading and extending it.
|
|
|
|
@xref{Read/Load/Eval/Compile}, if you're lost and you just wanted to
|
|
know how to compile your @code{.scm} file.
|
|
|
|
@menu
|
|
* Compiler Tower::
|
|
* The Scheme Compiler::
|
|
* Tree-IL::
|
|
* Continuation-Passing Style::
|
|
* Bytecode::
|
|
* Writing New High-Level Languages::
|
|
* Extending the Compiler::
|
|
@end menu
|
|
|
|
@node Compiler Tower
|
|
@subsection Compiler Tower
|
|
|
|
Guile's compiler is quite simple -- its @emph{compilers}, to put it more
|
|
accurately. Guile defines a tower of languages, starting at Scheme and
|
|
progressively simplifying down to languages that resemble the VM
|
|
instruction set (@pxref{Instruction Set}).
|
|
|
|
Each language knows how to compile to the next, so each step is simple
|
|
and understandable. Furthermore, this set of languages is not hardcoded
|
|
into Guile, so it is possible for the user to add new high-level
|
|
languages, new passes, or even different compilation targets.
|
|
|
|
Languages are registered in the module, @code{(system base language)}:
|
|
|
|
@example
|
|
(use-modules (system base language))
|
|
@end example
|
|
|
|
They are registered with the @code{define-language} form.
|
|
|
|
@deffn {Scheme Syntax} define-language @
|
|
[#:name] [#:title] [#:reader] [#:printer] @
|
|
[#:parser=#f] [#:compilers='()] @
|
|
[#:decompilers='()] [#:evaluator=#f] @
|
|
[#:joiner=#f] [#:for-humans?=#t] @
|
|
[#:make-default-environment=make-fresh-user-module]
|
|
Define a language.
|
|
|
|
This syntax defines a @code{<language>} object, bound to @var{name} in
|
|
the current environment. In addition, the language will be added to the
|
|
global language set. For example, this is the language definition for
|
|
Scheme:
|
|
|
|
@example
|
|
(define-language scheme
|
|
#:title "Scheme"
|
|
#:reader (lambda (port env) ...)
|
|
#:compilers `((tree-il . ,compile-tree-il))
|
|
#:decompilers `((tree-il . ,decompile-tree-il))
|
|
#:evaluator (lambda (x module) (primitive-eval x))
|
|
#:printer write
|
|
#:make-default-environment (lambda () ...))
|
|
@end example
|
|
@end deffn
|
|
|
|
The interesting thing about having languages defined this way is that
|
|
they present a uniform interface to the read-eval-print loop. This
|
|
allows the user to change the current language of the REPL:
|
|
|
|
@example
|
|
scheme@@(guile-user)> ,language tree-il
|
|
Happy hacking with Tree Intermediate Language! To switch back, type `,L scheme'.
|
|
tree-il@@(guile-user)> ,L scheme
|
|
Happy hacking with Scheme! To switch back, type `,L tree-il'.
|
|
scheme@@(guile-user)>
|
|
@end example
|
|
|
|
Languages can be looked up by name, as they were above.
|
|
|
|
@deffn {Scheme Procedure} lookup-language name
|
|
Looks up a language named @var{name}, autoloading it if necessary.
|
|
|
|
Languages are autoloaded by looking for a variable named @var{name} in
|
|
a module named @code{(language @var{name} spec)}.
|
|
|
|
The language object will be returned, or @code{#f} if there does not
|
|
exist a language with that name.
|
|
@end deffn
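
As a brief illustration of the lookup described above, the following
sketch fetches the Scheme language object and inspects it. It assumes
that the record accessors @code{language-name} and
@code{language-title} are exported by @code{(system base language)};
they are not described in this section.

@lisp
(use-modules (system base language))

;; Look up the language object by name, autoloading
;; (language scheme spec) if necessary.
(define scheme-language (lookup-language 'scheme))

;; The symbol under which the language was defined.
(language-name scheme-language)
;; @result{} scheme

;; The human-readable title given with #:title.
(language-title scheme-language)
@end lisp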
|
|
|
|
Defining languages this way allows us to programmatically determine
|
|
the necessary steps for compiling code from one language to another.
|
|
|
|
@deffn {Scheme Procedure} lookup-compilation-order from to
|
|
Recursively traverses the set of languages to which @var{from} can
|
|
compile, depth-first, and returns the first path that can transform
|
|
@var{from} to @var{to}. Returns @code{#f} if no path is found.
|
|
|
|
This function memoizes its results in a cache that is invalidated by
|
|
subsequent calls to @code{define-language}, so it should be quite
|
|
fast.
|
|
@end deffn
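
The path computed by @code{lookup-compilation-order} can be inspected
directly. The sketch below assumes that the result is a list of pairs
whose @code{car} is a language object and whose @code{cdr} is the
compiler procedure for that step, and that @code{language-name} is
available as an accessor. Neither detail is stated above, so treat both
as assumptions. The helper @code{compilation-path} is hypothetical.

@lisp
(use-modules (system base language))

;; Return the names of the languages traversed on the way from FROM
;; to TO, or #f if no path exists.
(define (compilation-path from to)
  (let ((order (lookup-compilation-order from to)))
    (and order
         (map (lambda (step) (language-name (car step)))
              order))))

(compilation-path 'scheme 'tree-il)
@end lisp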
|
|
|
|
There is a notion of a ``current language'', which is maintained in the
|
|
@code{current-language} parameter, defined in the core @code{(guile)}
|
|
module. This language is normally Scheme, and may be rebound by the
|
|
user. The run-time compilation interfaces
|
|
(@pxref{Read/Load/Eval/Compile}) also allow you to choose other source
|
|
and target languages.
|
|
|
|
The normal tower of languages when compiling Scheme goes like this:
|
|
|
|
@itemize
|
|
@item Scheme
|
|
@item Tree Intermediate Language (Tree-IL)
|
|
@item Continuation-Passing Style (CPS)
|
|
@item Bytecode
|
|
@end itemize
|
|
|
|
As discussed before (@pxref{Object File Format}), bytecode is in ELF
|
|
format, ready to be serialized to disk. But when compiling Scheme at
|
|
run time, you want a Scheme value: for example, a compiled procedure.
|
|
For this reason, so as not to break the abstraction, Guile defines a
|
|
fake language at the bottom of the tower:
|
|
|
|
@itemize
|
|
@item Value
|
|
@end itemize
|
|
|
|
Compiling to @code{value} loads the bytecode into a procedure, turning
|
|
cold bytes into warm code.
|
|
|
|
Perhaps this strangeness can be explained by example:
|
|
@code{compile-file} defaults to compiling to bytecode, because it
|
|
produces object code that has to live in the barren world outside the
|
|
Guile runtime; but @code{compile} defaults to compiling to @code{value},
|
|
as its product re-enters the Guile world.
|
|
|
|
@c FIXME: This doesn't work anymore :( Should we add some kind of
|
|
@c special GC pass, or disclaim this kind of code, or what?
|
|
|
|
Indeed, the process of compilation can circulate through these
|
|
different worlds indefinitely, as shown by the following quine:
|
|
|
|
@example
|
|
((lambda (x) ((compile x) x)) '(lambda (x) ((compile x) x)))
|
|
@end example
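
To make the distinction between the @code{value} and @code{bytecode}
targets concrete, here is a minimal sketch. It assumes that the
@code{bytecode} language name from the tower above is accepted as a
@code{#:to} argument.

@lisp
(use-modules (system base compile))

;; With the default target, value, the result is an ordinary Scheme
;; value.
(compile '(+ 1 2))
;; @result{} 3

;; A compiled procedure re-enters the Guile world and can be called
;; immediately.
(define square (compile '(lambda (x) (* x x))))
(square 7)
;; @result{} 49

;; Stopping at bytecode yields object code rather than a value.
(define code (compile '(+ 1 2) #:to 'bytecode))
@end lisp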
|
|
|
|
@node The Scheme Compiler
|
|
@subsection The Scheme Compiler
|
|
|
|
The job of the Scheme compiler is to expand all macros and all of Scheme
|
|
to its most primitive expressions. The definition of ``primitive
|
|
expression'' is given by the inventory of constructs provided by
|
|
Tree-IL, the target language of the Scheme compiler: procedure calls,
|
|
conditionals, lexical references, and so on. This is described more
|
|
fully in the next section.
|
|
|
|
The tricky and amusing thing about the Scheme-to-Tree-IL compiler is
|
|
that it is completely implemented by the macro expander. Since the
|
|
macro expander has to run over all of the source code already in order
|
|
to expand macros, it might as well do the analysis at the same time,
|
|
producing Tree-IL expressions directly.
|
|
|
|
Because this compiler is actually the macro expander, it is extensible.
|
|
Any macro which the user writes becomes part of the compiler.
|
|
|
|
The Scheme-to-Tree-IL expander may be invoked using the generic
|
|
@code{compile} procedure:
|
|
|
|
@lisp
|
|
(compile '(+ 1 2) #:from 'scheme #:to 'tree-il)
|
|
@result{}
|
|
#<tree-il (call (toplevel +) (const 1) (const 2))>
|
|
@end lisp
|
|
|
|
@code{(compile @var{foo} #:from 'scheme #:to 'tree-il)} is entirely
|
|
equivalent to calling the macro expander as @code{(macroexpand @var{foo}
|
|
'c '(compile load eval))}. @xref{Macro Expansion}.
|
|
@code{compile-tree-il}, the procedure dispatched by @code{compile} to
|
|
@code{'tree-il}, is a small wrapper around @code{macroexpand}, to make
|
|
it conform to the general form of compiler procedures in Guile's
|
|
language tower.
|
|
|
|
Compiler procedures take three arguments: an expression, an
|
|
environment, and a keyword list of options. They return three values:
|
|
the compiled expression, the corresponding environment for the target
|
|
language, and a ``continuation environment''. The compiled expression
|
|
and environment will serve as input to the next language's compiler.
|
|
The ``continuation environment'' can be used to compile another
|
|
expression from the same source language within the same module.
|
|
|
|
For example, you might compile the expression, @code{(define-module
|
|
(foo))}. This will result in a Tree-IL expression and environment. But
|
|
if you compiled a second expression, you would want to take into
|
|
account the compile-time effect of compiling the previous expression,
|
|
which puts the user in the @code{(foo)} module. That is the purpose of the
|
|
``continuation environment''; you would pass it as the environment
|
|
when compiling the subsequent expression.
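
The calling convention described above can be sketched with a trivial
compiler procedure. The names @code{compile-identity} and
@code{compile-sequence} are hypothetical and are not part of Guile. The
sketch only shows the three arguments, the three return values, and how
the continuation environment is threaded from one expression to the
next.

@lisp
;; A compiler procedure takes an expression, an environment, and a
;; keyword list of options.  This one returns its input unchanged.
(define (compile-identity exp env opts)
  (values exp     ; the compiled expression
          env     ; the environment for the target language
          env))   ; the continuation environment

;; Compile a list of expressions in order, passing the continuation
;; environment of each step as the environment of the next.
(define (compile-sequence compiler exps env opts)
  (let loop ((exps exps) (env env) (out '()))
    (if (null? exps)
        (reverse out)
        (call-with-values
            (lambda () (compiler (car exps) env opts))
          (lambda (compiled target-env cenv)
            (loop (cdr exps) cenv (cons compiled out)))))))

;; The symbol some-env stands in for a real environment.
(compile-sequence compile-identity '((define x 1) x) 'some-env '())
;; @result{} ((define x 1) x)
@end lisp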
|
|
|
|
For Scheme, an environment is a module. By default, the @code{compile}
|
|
and @code{compile-file} procedures compile in a fresh module, such
|
|
that bindings and macros introduced by the expression being compiled
|
|
are isolated:
|
|
|
|
@example
|
|
(eq? (current-module) (compile '(current-module)))
|
|
@result{} #f
|
|
|
|
(compile '(define hello 'world))
|
|
(defined? 'hello)
|
|
@result{} #f
|
|
|
|
(define / *)
|
|
(eq? (compile '/) /)
|
|
@result{} #f
|
|
@end example
|
|
|
|
Similarly, changes to the @code{current-reader} fluid (@pxref{Loading,
|
|
@code{current-reader}}) are isolated:
|
|
|
|
@example
|
|
(compile '(fluid-set! current-reader (lambda args 'fail)))
|
|
(fluid-ref current-reader)
|
|
@result{} #f
|
|
@end example
|
|
|
|
Nevertheless, having the compiler and @dfn{compilee} share the same name
|
|
space can be achieved by explicitly passing @code{(current-module)} as
|
|
the compilation environment:
|
|
|
|
@example
|
|
(define hello 'world)
|
|
(compile 'hello #:env (current-module))
|
|
@result{} world
|
|
@end example
|
|
|
|
@node Tree-IL
|
|
@subsection Tree-IL
|
|
|
|
Tree Intermediate Language (Tree-IL) is a structured intermediate
|
|
language that is close in expressive power to Scheme. It is an
|
|
expanded, pre-analyzed Scheme.
|
|
|
|
Tree-IL is ``structured'' in the sense that its representation is
|
|
based on records, not S-expressions. This gives a rigidity to the
|
|
language that ensures that compiling to a lower-level language only
|
|
requires a limited set of transformations. For example, the Tree-IL
|
|
type @code{<const>} is a record type with two fields, @code{src} and
|
|
@code{exp}. Instances of this type are created via @code{make-const}.
|
|
Fields of this type are accessed via the @code{const-src} and
|
|
@code{const-exp} procedures. There is also a predicate, @code{const?}.
|
|
@xref{Records}, for more information on records.
|
|
|
|
@c alpha renaming
|
|
|
|
All Tree-IL types have a @code{src} slot, which holds source location
|
|
information for the expression. This information, if present, will be
|
|
residualized into the compiled object code, allowing backtraces to
|
|
show source information. The format of @code{src} is the same as that
|
|
returned by Guile's @code{source-properties} function. @xref{Source
|
|
Properties}, for more information.
|
|
|
|
Although Tree-IL objects are represented internally using records,
|
|
there is also an equivalent S-expression external representation for
|
|
each kind of Tree-IL. For example, the S-expression representation
|
|
of the @code{#<const src: #f exp: 3>} expression would be:
|
|
|
|
@example
|
|
(const 3)
|
|
@end example
|
|
|
|
Users may program with this format directly at the REPL:
|
|
|
|
@example
|
|
scheme@@(guile-user)> ,language tree-il
|
|
Happy hacking with Tree Intermediate Language! To switch back, type `,L scheme'.
|
|
tree-il@@(guile-user)> (call (primitive +) (const 32) (const 10))
|
|
@result{} 42
|
|
@end example
|
|
|
|
The @code{src} fields are left out of the external representation.
|
|
|
|
One may create Tree-IL objects from their external representations by
calling @code{parse-tree-il}, the reader for Tree-IL. If any source
|
|
information is attached to the input S-expression, it will be
|
|
propagated to the resulting Tree-IL expressions. This is probably the
|
|
easiest way to compile to Tree-IL: just make the appropriate external
|
|
representations in S-expression format, and let @code{parse-tree-il}
|
|
take care of the rest.
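
A short sketch of this approach follows. It assumes that
@code{parse-tree-il} and @code{unparse-tree-il} are exported by the
@code{(language tree-il)} module along with the record accessors.

@lisp
(use-modules (language tree-il)
             (system base compile))

;; Build a <const> record from its external representation.
(define three (parse-tree-il '(const 3)))

(const? three)
;; @result{} #t
(const-exp three)
;; @result{} 3

;; Convert the record back to an S-expression.
(unparse-tree-il three)
;; @result{} (const 3)

;; Parse a call and hand it to the compiler, starting at Tree-IL.
(define sum
  (parse-tree-il '(call (toplevel +) (const 32) (const 10))))
(compile sum #:from 'tree-il)
@end lisp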
|
|
|
|
@deftp {Scheme Variable} <void> src
|
|
@deftpx {External Representation} (void)
|
|
An empty expression. In practice, equivalent to Scheme's @code{(if #f
|
|
#f)}.
|
|
@end deftp
|
|
|
|
@deftp {Scheme Variable} <const> src exp
|
|
@deftpx {External Representation} (const @var{exp})
|
|
A constant.
|
|
@end deftp
|
|
|
|
@deftp {Scheme Variable} <primitive-ref> src name
|
|
@deftpx {External Representation} (primitive @var{name})
|
|
A reference to a ``primitive''. A primitive is a procedure that, when
|
|
compiled, may be open-coded. For example, @code{cons} is usually
|
|
recognized as a primitive, so that it compiles down to a single
|
|
instruction.
|
|
|
|
Compilation of Tree-IL usually begins with a pass that resolves some
|
|
@code{<module-ref>} and @code{<toplevel-ref>} expressions to
|
|
@code{<primitive-ref>} expressions. The actual compilation pass has
|
|
special cases for calls to certain primitives, like @code{apply} or
|
|
@code{cons}.
|
|
@end deftp
|
|
|
|
@deftp {Scheme Variable} <lexical-ref> src name gensym
|
|
@deftpx {External Representation} (lexical @var{name} @var{gensym})
|
|
A reference to a lexically-bound variable. The @var{name} is the
|
|
original name of the variable in the source program. @var{gensym} is a
|
|
unique identifier for this variable.
|
|
@end deftp
|
|
|
|
@deftp {Scheme Variable} <lexical-set> src name gensym exp
|
|
@deftpx {External Representation} (set! (lexical @var{name} @var{gensym}) @var{exp})
|
|
Sets a lexically-bound variable.
|
|
@end deftp
|
|
|
|
@deftp {Scheme Variable} <module-ref> src mod name public?
|
|
@deftpx {External Representation} (@@ @var{mod} @var{name})
|
|
@deftpx {External Representation} (@@@@ @var{mod} @var{name})
|
|
A reference to a variable in a specific module. @var{mod} should be
|
|
the name of the module, e.g.@: @code{(guile-user)}.
|
|
|
|
If @var{public?} is true, the variable named @var{name} will be looked
|
|
up in @var{mod}'s public interface, and serialized with @code{@@};
|
|
otherwise it will be looked up among the module's private bindings,
|
|
and is serialized with @code{@@@@}.
|
|
@end deftp
|
|
|
|
@deftp {Scheme Variable} <module-set> src mod name public? exp
|
|
@deftpx {External Representation} (set! (@@ @var{mod} @var{name}) @var{exp})
|
|
@deftpx {External Representation} (set! (@@@@ @var{mod} @var{name}) @var{exp})
|
|
Sets a variable in a specific module.
|
|
@end deftp
|
|
|
|
@deftp {Scheme Variable} <toplevel-ref> src name
|
|
@deftpx {External Representation} (toplevel @var{name})
|
|
References a variable from the current procedure's module.
|
|
@end deftp
|
|
|
|
@deftp {Scheme Variable} <toplevel-set> src name exp
|
|
@deftpx {External Representation} (set! (toplevel @var{name}) @var{exp})
|
|
Sets a variable in the current procedure's module.
|
|
@end deftp
|
|
|
|
@deftp {Scheme Variable} <toplevel-define> src name exp
|
|
@deftpx {External Representation} (define (toplevel @var{name}) @var{exp})
|
|
Defines a new top-level variable in the current procedure's module.
|
|
@end deftp
|
|
|
|
@deftp {Scheme Variable} <conditional> src test then else
|
|
@deftpx {External Representation} (if @var{test} @var{then} @var{else})
|
|
A conditional. Note that @var{else} is not optional.
|
|
@end deftp
|
|
|
|
@deftp {Scheme Variable} <call> src proc args
|
|
@deftpx {External Representation} (call @var{proc} . @var{args})
|
|
A procedure call.
|
|
@end deftp
|
|
|
|
@deftp {Scheme Variable} <primcall> src name args
|
|
@deftpx {External Representation} (primcall @var{name} . @var{args})
|
|
A call to a primitive. Equivalent to @code{(call (primitive @var{name})
|
|
. @var{args})}. This construct is often more convenient to generate and
|
|
analyze than @code{<call>}.
|
|
|
|
As part of the compilation process, instances of @code{(call (primitive
|
|
@var{name}) . @var{args})} are transformed into primcalls.
|
|
@end deftp
|
|
|
|
@deftp {Scheme Variable} <seq> src head tail
|
|
@deftpx {External Representation} (seq @var{head} @var{tail})
|
|
A sequence. The semantics is that @var{head} is evaluated first, and
|
|
any resulting values are ignored. Then @var{tail} is evaluated, in tail
|
|
position.
|
|
@end deftp
|
|
|
|
@deftp {Scheme Variable} <lambda> src meta body
|
|
@deftpx {External Representation} (lambda @var{meta} @var{body})
|
|
A closure. @var{meta} is an association list of properties for the
|
|
procedure. @var{body} is a single Tree-IL expression of type
|
|
@code{<lambda-case>}. As the @code{<lambda-case>} clause can chain to
|
|
an alternate clause, this makes Tree-IL's @code{<lambda>} have the
|
|
expressiveness of Scheme's @code{case-lambda}.
|
|
@end deftp
|
|
|
|
@deftp {Scheme Variable} <lambda-case> src req opt rest kw inits gensyms body alternate
|
|
@deftpx {External Representation} @
|
|
(lambda-case ((@var{req} @var{opt} @var{rest} @var{kw} @var{inits} @var{gensyms})@
|
|
@var{body})@
|
|
[@var{alternate}])
|
|
One clause of a @code{case-lambda}. A @code{lambda} expression in
|
|
Scheme is treated as a @code{case-lambda} with one clause.
|
|
|
|
@var{req} is a list of the procedure's required arguments, as symbols.
|
|
@var{opt} is a list of the optional arguments, or @code{#f} if there
|
|
are no optional arguments. @var{rest} is the name of the rest
|
|
argument, or @code{#f}.
|
|
|
|
@var{kw} is a list of the form, @code{(@var{allow-other-keys?}
|
|
(@var{keyword} @var{name} @var{var}) ...)}, where @var{keyword} is the
|
|
keyword corresponding to the argument named @var{name}, and whose
|
|
corresponding gensym is @var{var}. @var{inits} are Tree-IL expressions
|
|
corresponding to all of the optional and keyword arguments, evaluated to
|
|
bind variables whose value is not supplied by the procedure caller.
|
|
Each @var{init} expression is evaluated in the lexical context of
|
|
previously bound variables, from left to right.
|
|
|
|
@var{gensyms} is a list of gensyms corresponding to all arguments:
|
|
first all of the required arguments, then the optional arguments if
|
|
any, then the rest argument if any, then all of the keyword arguments.
|
|
|
|
@var{body} is the body of the clause. If the procedure is called with
|
|
an appropriate number of arguments, @var{body} is evaluated in tail
|
|
position. Otherwise, if there is an @var{alternate}, it should be a
|
|
@code{<lambda-case>} expression, representing the next clause to try.
|
|
If there is no @var{alternate}, a wrong-number-of-arguments error is
|
|
signaled.
|
|
@end deftp
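
As an illustration of how these fields fit together, the sketch below
writes a one-argument identity procedure in the external representation
and reads some of its fields back. The gensym name @code{x-1} is made
up for the example, and the accessor names @code{lambda-body},
@code{lambda-case-req} and @code{lambda-case-gensyms} are assumed to
follow the same naming pattern as @code{const-exp}.

@lisp
(use-modules (language tree-il))

;; (lambda (x) x): one required argument, no optional, rest or
;; keyword arguments, no inits, and no alternate clause.
(define identity-proc
  (parse-tree-il
   '(lambda ()
      (lambda-case
       (((x) #f #f #f () (x-1))
        (lexical x x-1))))))

;; The body of a <lambda> is its first <lambda-case> clause.
(define clause (lambda-body identity-proc))

(lambda-case-req clause)
;; @result{} (x)
(lambda-case-gensyms clause)
;; @result{} (x-1)
@end lisp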
|
|
|
|
@deftp {Scheme Variable} <let> src names gensyms vals exp
|
|
@deftpx {External Representation} (let @var{names} @var{gensyms} @var{vals} @var{exp})
|
|
Lexical binding, like Scheme's @code{let}. @var{names} are the original
|
|
binding names, @var{gensyms} are gensyms corresponding to the
|
|
@var{names}, and @var{vals} are Tree-IL expressions for the values.
|
|
@var{exp} is a single Tree-IL expression.
|
|
@end deftp
|
|
|
|
@deftp {Scheme Variable} <letrec> src in-order? names gensyms vals exp
|
|
@deftpx {External Representation} (letrec @var{names} @var{gensyms} @var{vals} @var{exp})
|
|
@deftpx {External Representation} (letrec* @var{names} @var{gensyms} @var{vals} @var{exp})
|
|
A version of @code{<let>} that creates recursive bindings, like
|
|
Scheme's @code{letrec}, or @code{letrec*} if @var{in-order?} is true.
|
|
@end deftp
|
|
|
|
@deftp {Scheme Variable} <prompt> src escape-only? tag body handler
|
|
@deftpx {External Representation} (prompt @var{escape-only?} @var{tag} @var{body} @var{handler})
|
|
A dynamic prompt. Instates a prompt named @var{tag}, an expression,
|
|
during the dynamic extent of the execution of @var{body}, also an
|
|
expression. If an abort occurs to this prompt, control will be passed
|
|
to @var{handler}, also an expression, which should be a procedure. The
|
|
first argument to the handler procedure will be the captured
|
|
continuation, followed by all of the values passed to the abort. If
|
|
@var{escape-only?} is true, the handler should be a @code{<lambda>} with
|
|
a single @code{<lambda-case>} body expression with no optional or
|
|
keyword arguments, and no alternate, and whose first argument is
|
|
unreferenced. @xref{Prompts}, for more information.
|
|
@end deftp
|
|
|
|
@deftp {Scheme Variable} <abort> src tag args tail
|
|
@deftpx {External Representation} (abort @var{tag} @var{args} @var{tail})
|
|
An abort to the nearest prompt with the name @var{tag}, an expression.
|
|
@var{args} should be a list of expressions to pass to the prompt's
|
|
handler, and @var{tail} should be an expression that will evaluate to
|
|
a list of additional arguments. An abort will save the partial
|
|
continuation, which may later be reinstated, resulting in the
|
|
@code{<abort>} expression evaluating to some number of values.
|
|
@end deftp
|
|
|
|
There are two Tree-IL constructs that are not normally produced by
|
|
higher-level compilers, but instead are generated during the
|
|
source-to-source optimization and analysis passes that the Tree-IL
|
|
compiler does. Users should not generate these expressions directly,
|
|
unless they feel very clever, as the default analysis pass will generate
|
|
them as necessary.
|
|
|
|
@deftp {Scheme Variable} <let-values> src names gensyms exp body
|
|
@deftpx {External Representation} (let-values @var{names} @var{gensyms} @var{exp} @var{body})
|
|
Like Scheme's @code{receive} -- binds the values returned by
|
|
evaluating @var{exp} to the @code{lambda}-like bindings described by
|
|
@var{gensyms}. That is to say, @var{gensyms} may be an improper list.
|
|
|
|
@code{<let-values>} is an optimization of a @code{<call>} to the
|
|
primitive, @code{call-with-values}.
|
|
@end deftp
|
|
|
|
@deftp {Scheme Variable} <fix> src names gensyms vals body
|
|
@deftpx {External Representation} (fix @var{names} @var{gensyms} @var{vals} @var{body})
|
|
Like @code{<letrec>}, but only for @var{vals} that are unset
|
|
@code{lambda} expressions.
|
|
|
|
@code{fix} is an optimization of @code{letrec} (and @code{let}).
|
|
@end deftp
|
|
|
|
Tree-IL is a convenient compilation target from source languages. It
|
|
can be convenient as a medium for optimization, though CPS is usually
|
|
better. The strength of Tree-IL is that it does not fix order of
|
|
evaluation, so it makes some code motion a bit easier.
|
|
|
|
Optimization passes performed on Tree-IL currently include:
|
|
|
|
@itemize
|
|
@item Open-coding (turning toplevel-refs into primitive-refs,
|
|
and calls to primitives into primcalls)
|
|
@item Partial evaluation (comprising inlining, copy propagation, and
|
|
constant folding)
|
|
@item Common subexpression elimination (CSE)
|
|
@end itemize
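
To give a feel for what constant folding means at this level, here is a
toy folder that works on the S-expression external representation. It
is not Guile's partial evaluator, which operates on Tree-IL records and
does far more. The procedure @code{fold-constants} is hypothetical and
handles only binary @code{+} and @code{*} primcalls and conditionals.

@lisp
(use-modules (ice-9 match))

;; Replace (primcall OP (const X) (const Y)) by a single constant,
;; working from the leaves of the expression upwards.
(define (fold-constants exp)
  (match exp
    (('primcall (and op (or '+ '*)) a b)
     (let ((a (fold-constants a))
           (b (fold-constants b)))
       (match (list a b)
         ((('const (? number? x)) ('const (? number? y)))
          (list 'const ((if (eq? op '+) + *) x y)))
         (_ (list 'primcall op a b)))))
    (('if test then alt)
     (list 'if
           (fold-constants test)
           (fold-constants then)
           (fold-constants alt)))
    ;; Anything else is left untouched.
    (_ exp)))

(fold-constants
 '(primcall + (const 32) (primcall * (const 2) (const 5))))
;; @result{} (const 42)
@end lisp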
|
|
|
|
In the future, we will move the CSE pass to operate over the lower-level
|
|
CPS language.
|
|
|
|
@node Continuation-Passing Style
|
|
@subsection Continuation-Passing Style
|
|
|
|
@cindex CPS
|
|
Continuation-passing style (CPS) is ...
|
|
|
|
@node Bytecode
|
|
@subsection Bytecode
|
|
|
|
Blah blah ...
|
|
|
|
@xref{Object File Format}.
|
|
|
|
The procedures below are provided by the @code{(system vm loader)} module.
|
|
|
|
@deffn {Scheme Procedure} load-thunk-from-file file
|
|
@deffnx {C Function} scm_load_thunk_from_file (file)
|
|
Load object code from a file named @var{file}. The file will be mapped
|
|
into memory via @code{mmap}, so this is a very fast operation.
|
|
|
|
On disk, object code is embedded in ELF, a flexible container format
|
|
created for use in UNIX systems. Guile has its own ELF linker and
|
|
loader, so it uses the ELF format on all systems.
|
|
@end deffn
|
|
|
|
Likewise, @code{load-thunk-from-memory} loads object code from a
bytevector in memory rather than from a file.
|
|
|
|
Compiling object code to the fake language, @code{value}, is performed
by loading the object code into a thunk, then executing that thunk with
respect to the compilation environment. Normally the environment
|
|
propagates through the compiler transparently, but users may specify
|
|
the compilation environment manually as well, as a module.
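
The loading step can also be carried out by hand. The following is a
minimal sketch, assuming that @code{compile-file} accepts an
@code{#:output-file} keyword and that @code{load-thunk-from-file} is
exported by @code{(system vm loader)}. The file names are hypothetical.

@lisp
(use-modules (system base compile)
             (system vm loader))

;; Write a small source file to compile.
(call-with-output-file "hello.scm"
  (lambda (port)
    (write '(display "hello, world\n") port)))

;; Compile it to object code on disk.
(compile-file "hello.scm" #:output-file "hello.go")

;; Map the object code into memory and obtain a thunk.  Calling the
;; thunk runs the top-level forms of the file.
(define thunk (load-thunk-from-file "hello.go"))
(thunk)
@end lisp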
|
|
|
|
|
|
@node Writing New High-Level Languages
|
|
@subsection Writing New High-Level Languages
|
|
|
|
In order to integrate a new language @var{lang} into Guile's compiler
|
|
system, one has to create the module @code{(language @var{lang} spec)}
|
|
containing the language definition and referencing the parser,
|
|
compiler and other routines processing it. The module hierarchy in
|
|
@code{(language brainfuck)} defines a very basic Brainfuck
|
|
implementation meant to serve as an easy-to-understand example of how to
|
|
do this. See for instance @url{http://en.wikipedia.org/wiki/Brainfuck}
|
|
for more information about the Brainfuck language itself.
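
For orientation, a language specification module might look roughly
like the sketch below. The language @code{toy} is hypothetical: it reads
ordinary S-expressions and ``compiles'' to Scheme by returning its input
unchanged. The sketch assumes that a compiler entry may target
@code{scheme} and that the keywords behave as described for
@code{define-language} earlier in this section.

@lisp
(define-module (language toy spec)
  #:use-module (system base language)
  #:export (toy))

;; A compiler procedure in the usual form: expression, environment
;; and options in; compiled expression, target environment and
;; continuation environment out.
(define (compile-scheme exp env opts)
  (values exp env env))

(define-language toy
  #:title "Toy"
  #:reader (lambda (port env) (read port))
  #:compilers `((scheme . ,compile-scheme))
  #:printer write)
@end lisp

With such a module on the load path, @code{(lookup-language 'toy)}
would be expected to find the definition by the autoloading rule
described for @code{lookup-language}.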
|
|
|
|
|
|
@node Extending the Compiler
|
|
@subsection Extending the Compiler
|
|
|
|
At this point we take a detour from the impersonal tone of the rest of
|
|
the manual. Admit it: if you've read this far into the compiler
|
|
internals manual, you are a junkie. Perhaps a course at your university
|
|
left you unsated, or perhaps you've always harbored a desire to hack the
|
|
holy of computer science holies: a compiler. Well, you're in good
|
|
company, and in a good position. Guile's compiler needs your help.
|
|
|
|
There are many possible avenues for improving Guile's compiler.
|
|
Probably the most important improvement, speed-wise, will be some form
|
|
of native compilation, both just-in-time and ahead-of-time. This could
|
|
be done in many ways. Probably the easiest strategy would be to extend
|
|
the compiled procedure structure to include a pointer to a native code
|
|
vector, and compile from bytecode to native code at run-time after a
|
|
procedure is called a certain number of times.
|
|
|
|
The name of the game is a profiling-based harvest of the low-hanging
|
|
fruit, running programs of interest under a system-level profiler and
|
|
determining which improvements would give the most bang for the buck.
|
|
It's really getting to the point though that native compilation is the
|
|
next step.
|
|
|
|
The compiler also needs help at the top end, enhancing the Scheme that
|
|
it knows to also understand R6RS, and adding new high-level compilers.
|
|
We have JavaScript and Emacs Lisp mostly complete, but they could use
|
|
some love; Lua would be nice as well, but whatever language it is
|
|
that strikes your fancy would be welcome too.
|
|
|
|
Compilers are for hacking, not for admiring or for complaining about.
|
|
Get to it!
|