mirror of
https://git.savannah.gnu.org/git/guile.git
synced 2025-05-01 20:30:28 +02:00
Patches applied: * lcourtes@laas.fr--2005-libre/lightning--sparc-fixes--1.2--base-0 tag of lcourtes@laas.fr--2005-libre/lightning--stable--1.2--patch-18 * lcourtes@laas.fr--2005-libre/lightning--sparc-fixes--1.2--patch-1 tests/push-pop.c: Use more `pushr's. * lcourtes@laas.fr--2005-libre/lightning--sparc-fixes--1.2--patch-3 Added a test for `JIT_RET' (fails on SPARC). * lcourtes@laas.fr--2005-libre/lightning--sparc-fixes--1.2--patch-4 Fixed use of `JIT_RET': Move %o0 into %i0 after `calli' and `callr'. * lcourtes@laas.fr--2005-libre/lightning--stable--1.2--patch-19 Merge from `sparc-fixes': Fixed `pushr' and `popr', fixed `JIT_RET'. * lcourtes@laas.fr--2005-libre/lightning--stable--1.2--patch-20 Undoed `lightning--sparc-fixes--1.2--patch-4' (about `JIT_RET') which was wrong. * lcourtes@laas.fr--2005-libre/lightning--stable--1.2--patch-21 tests/ret.c: Use `jit_retval_i' to copy the function's return value. * lcourtes@laas.fr--2005-libre/lightning--stable--1.2--patch-22 Doc: Clarified the use of `JIT_RET' and documented `jit_retval'. git-archimport-id: bonzini@gnu.org--2004b/lightning--stable--1.2--patch-32
1222 lines
47 KiB
Text
@node Installation
|
|
@chapter Configuring and installing @lightning{}
|
|
|
|
To use @lightning{}, first configure it; this picks the set of
macros to be used on the host architecture. Configuration is
performed automatically by the @file{configure} shell script.
To run it, type:
|
|
@example
|
|
./configure
|
|
@end example
|
|
|
|
@lightning{} supports cross-compiling in that you can choose a
|
|
different set of macros from the one needed on the computer that
|
|
you are compiling @lightning{} on. For example,
|
|
@example
|
|
./configure --host=sparc-sun-linux
|
|
@end example
|
|
|
|
@noindent will select the SPARC set of runtime assemblers. You can use
|
|
configure's ability to make reasonable assumptions about the vendor
|
|
and operating system and simply type
|
|
@example
|
|
./configure --host=i386
|
|
./configure --host=ppc
|
|
./configure --host=sparc
|
|
@end example
|
|
|
|
Another option that @file{configure} accepts is
|
|
@code{--enable-assertions}, which enables several consistency checks in
|
|
the run-time assemblers. These checks are not usually needed, so you
can safely ignore this option; keep in mind that they also tend to
slow down your code generator.
|
|
|
|
After you've configured @lightning{}, you don't have to compile it
|
|
because it is nothing more than a set of include files. If you want to
|
|
compile the examples, run @file{make} as usual. The next important
|
|
step is:
|
|
@example
|
|
make install
|
|
@end example
|
|
|
|
This ends the process of installing @lightning{}.
|
|
|
|
@node The instruction set
|
|
@chapter @lightning{}'s instruction set
|
|
|
|
@lightning{}'s instruction set was designed by deriving instructions
|
|
that closely match those of most existing RISC architectures, or
|
|
that can be easily synthesized if absent. Each instruction is composed
|
|
of:
|
|
@itemize @bullet
|
|
@item
|
|
an operation, like @code{sub} or @code{mul}
|
|
|
|
@item
|
|
sometimes, a register/immediate flag (@code{r} or @code{i})
|
|
|
|
@item
|
|
a type identifier or, occasionally, two
|
|
@end itemize
|
|
|
|
The second and third fields are separated by an underscore; thus,
|
|
examples of legal mnemonics are @code{addr_i} (integer add, with three
|
|
register operands) and @code{muli_l} (long integer multiply, with two
|
|
register operands and an immediate operand). Each instruction takes
|
|
two or three operands; in most cases, one of them can be an immediate
|
|
value instead of a register.
|
|
|
|
@lightning{} supports a full range of integer types: operands can be 1,
|
|
2 or 4 bytes long (64-bit architectures might support 8 bytes long
|
|
operands), either signed or unsigned. The types are listed in the
|
|
following table together with the C types they represent:
|
|
|
|
@example
|
|
c @r{signed char}
|
|
uc @r{unsigned char}
|
|
s @r{short}
|
|
us @r{unsigned short}
|
|
i @r{int}
|
|
ui @r{unsigned int}
|
|
l @r{long}
|
|
ul @r{unsigned long}
|
|
f @r{float}
|
|
d @r{double}
|
|
p @r{void *}
|
|
@end example
|
|
|
|
Some of these types may not be distinct: for example, @code{l}
is equivalent to @code{i} on 32-bit machines, and @code{p} is
substantially equivalent to @code{ul}.
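
To check which of these types coincide on a particular machine, one can
compare the sizes of the corresponding C types. The following is a minimal
sketch in plain C; it does not use @lightning{} at all, and the helper name
@code{suffix_size} is made up for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of lightning): size in bytes of the C
   type that a type suffix stands for, or 0 for an unknown suffix. */
size_t suffix_size(const char *suffix)
{
    static const struct { const char *name; size_t size; } table[] = {
        { "c",  sizeof(signed char) },  { "uc", sizeof(unsigned char) },
        { "s",  sizeof(short) },        { "us", sizeof(unsigned short) },
        { "i",  sizeof(int) },          { "ui", sizeof(unsigned int) },
        { "l",  sizeof(long) },         { "ul", sizeof(unsigned long) },
        { "f",  sizeof(float) },        { "d",  sizeof(double) },
        { "p",  sizeof(void *) },
    };
    size_t k;

    /* Linear search is enough for eleven entries. */
    for (k = 0; k < sizeof table / sizeof table[0]; k++)
        if (strcmp(table[k].name, suffix) == 0)
            return table[k].size;
    return 0;
}
```

On an ILP32 machine @code{suffix_size("l")} and @code{suffix_size("i")}
return the same value, whereas on an LP64 machine they differ.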
|
|
|
|
There are at least seven integer registers, of which six are
|
|
general-purpose, while the last is used to contain the stack pointer
|
|
(@code{SP}). The stack pointer can be used to allocate and access local
|
|
variables on the stack (which is supposed to grow downwards in memory
|
|
on all architectures).
|
|
|
|
Of the general-purpose registers, at least three are guaranteed to be
|
|
preserved across function calls (@code{V0}, @code{V1} and
|
|
@code{V2}) and at least three are not (@code{R0}, @code{R1} and
|
|
@code{R2}). Six registers are not very many, but this
restriction was forced by the need to target CISC architectures
which, like the x86, are short of registers; in any case, backends can
specify the actual number of available caller- and callee-save
registers.
|
|
|
|
In addition, there is a special @code{RET} register which contains the
|
|
return value of the current function (@emph{not} the return value of
|
|
callees---use the @code{retval} instruction for this). You should
|
|
always remember, however, that writing this register could overwrite
|
|
either a general-purpose register or an incoming parameter, depending
|
|
on the architecture.
|
|
|
|
There are at least six floating-point registers, named @code{FPR0} to
|
|
@code{FPR5}. These are separate from the integer registers on
|
|
all the supported architectures; on Intel architectures, the
|
|
register stack is mapped to a flat register file.
|
|
|
|
The complete instruction set follows; as you can see, most non-memory
|
|
operations only take integers, long integers (either signed or
|
|
unsigned) and pointers as operands; this was done in order to reduce
|
|
the instruction set, and because most architectures only provide word
|
|
and long word operations on registers. There are instructions that
|
|
allow operands to be extended to fit a larger data type, both in a
|
|
signed and in an unsigned way.
|
|
|
|
@table @b
|
|
@item Binary ALU operations
|
|
These accept three operands; the last one can be an immediate
|
|
value for integer operands, or a register for all operand types.
|
|
@code{addx} operations must directly follow @code{addc}, and
|
|
@code{subx} must follow @code{subc}; otherwise, results are undefined.
|
|
@example
|
|
addr i ui l ul p f d O1 = O2 + O3
|
|
addi i ui l ul p O1 = O2 + O3
|
|
addxr i ui l ul O1 = O2 + (O3 + carry)
|
|
addxi i ui l ul O1 = O2 + (O3 + carry)
|
|
addcr i ui l ul O1 = O2 + O3, set carry
|
|
addci i ui l ul O1 = O2 + O3, set carry
|
|
subr i ui l ul p f d O1 = O2 - O3
|
|
subi i ui l ul p O1 = O2 - O3
|
|
subxr i ui l ul O1 = O2 - (O3 + carry)
|
|
subxi i ui l ul O1 = O2 - (O3 + carry)
|
|
subcr i ui l ul O1 = O2 - O3, set carry
|
|
subci i ui l ul O1 = O2 - O3, set carry
|
|
rsbr i ui l ul p f d O1 = O3 - O2
|
|
rsbi i ui l ul p O1 = O3 - O2
|
|
mulr i ui l ul f d O1 = O2 * O3
|
|
muli i ui l ul O1 = O2 * O3
|
|
hmulr i ui l ul O1 = @r{high bits of} O2 * O3
|
|
hmuli i ui l ul O1 = @r{high bits of} O2 * O3
|
|
divr i ui l ul f d O1 = O2 / O3
|
|
divi i ui l ul O1 = O2 / O3
|
|
modr i ui l ul O1 = O2 % O3
|
|
modi i ui l ul O1 = O2 % O3
|
|
andr i ui l ul O1 = O2 & O3
|
|
andi i ui l ul O1 = O2 & O3
|
|
orr i ui l ul O1 = O2 | O3
|
|
ori i ui l ul O1 = O2 | O3
|
|
xorr i ui l ul O1 = O2 ^ O3
|
|
xori i ui l ul O1 = O2 ^ O3
|
|
lshr i ui l ul O1 = O2 << O3
|
|
lshi i ui l ul O1 = O2 << O3
|
|
rshr i ui l ul O1 = O2 >> O3@footnote{The sign bit is propagated for signed types.}
|
|
rshi i ui l ul O1 = O2 >> O3@footnote{The sign bit is propagated for signed types.}
|
|
@end example
|
|
|
|
@item Unary ALU operations
|
|
These accept two operands, both of which must be registers.
|
|
@example
|
|
negr i l f d O1 = -O2
|
|
notr i ui l ul O1 = ~O2
|
|
@end example
|
|
|
|
@item Compare instructions
|
|
These accept three operands; again, the last can be an immediate
|
|
value for integer data types. The last two operands are compared,
and the first operand is set to 1 if the given condition is met
and to 0 otherwise.
|
|
|
|
The conditions given below are for the standard behavior of C,
|
|
where the ``unordered'' comparison result is mapped to false.
|
|
|
|
@example
|
|
ltr i ui l ul p f d O1 = (O2 < O3)
|
|
lti i ui l ul p O1 = (O2 < O3)
|
|
ler i ui l ul p f d O1 = (O2 <= O3)
|
|
lei i ui l ul p O1 = (O2 <= O3)
|
|
gtr i ui l ul p f d O1 = (O2 > O3)
|
|
gti i ui l ul p O1 = (O2 > O3)
|
|
ger i ui l ul p f d O1 = (O2 >= O3)
|
|
gei i ui l ul p O1 = (O2 >= O3)
|
|
eqr i ui l ul p f d O1 = (O2 == O3)
|
|
eqi i ui l ul p O1 = (O2 == O3)
|
|
ner i ui l ul p f d O1 = (O2 != O3)
|
|
nei i ui l ul p O1 = (O2 != O3)
|
|
unltr f d O1 = !(O2 >= O3)
|
|
unler f d O1 = !(O2 > O3)
|
|
ungtr f d O1 = !(O2 <= O3)
|
|
unger f d O1 = !(O2 < O3)
|
|
uneqr f d O1 = !(O2 < O3) && !(O2 > O3)
|
|
ltgtr f d O1 = !(O2 >= O3) || !(O2 <= O3)
|
|
ordr f d O1 = (O2 == O2) && (O3 == O3)
|
|
unordr f d O1 = (O2 != O2) || (O3 != O3)
|
|
@end example
|
|
|
|
@item Transfer operations
|
|
These accept two operands; for @code{ext} both of them must be
|
|
registers, while @code{mov} accepts an immediate value as the second
|
|
operand.
|
|
|
|
Unlike @code{movr} and @code{movi}, the other instructions are applied
|
|
between operands of different data types, and they need @strong{two}
|
|
data type specifications. You can use @code{extr} to convert between
|
|
integer data types, in which case the first must be smaller in size
|
|
than the second; for example @code{extr_c_ui} is correct while
|
|
@code{extr_ul_us} is not. You can also use @code{extr} to convert
|
|
an integer to a floating point value: the only available possibilities
|
|
are @code{extr_i_f} and @code{extr_i_d}. The other instructions
|
|
convert a floating point value to an integer, so the possible
|
|
suffixes are @code{_f_i} and @code{_d_i}.
|
|
|
|
@example
|
|
movr i ui l ul p f d O1 = O2
|
|
movi i ui l ul p f d O1 = O2
|
|
extr c uc s us i ui l ul f d O1 = O2
|
|
roundr i f d O1 = round(O2)
|
|
truncr i f d O1 = trunc(O2)
|
|
floorr i f d O1 = floor(O2)
|
|
ceilr i f d O1 = ceil(O2)
|
|
@end example
|
|
|
|
Note that the order of the arguments is @emph{destination first,
|
|
source second} as for all other @lightning{} instructions, but
|
|
the order of the types is always reversed with respect to that
|
|
of the arguments: @emph{shorter}---source---@emph{first,
|
|
longer}---destination---@emph{second}. This happens for historical
|
|
reasons.
|
|
|
|
@item Network extensions
|
|
These accept two operands, both of which must be registers; these
|
|
two instructions actually perform the same task, yet they are
|
|
assigned to two mnemonics for the sake of convenience and
|
|
completeness. As usual, the first operand is the destination and
|
|
the second is the source.
|
|
@example
|
|
hton us ui @r{Host-to-network (big endian) order}
|
|
ntoh us ui @r{Network-to-host order }
|
|
@end example
|
|
|
|
@item Load operations
|
|
@code{ld} accepts two operands while @code{ldx} accepts three;
|
|
in both cases, the last can be either a register or an immediate
|
|
value. Values are extended (with or without sign, according to
|
|
the data type specification) to fit a whole register.
|
|
@example
|
|
ldr c uc s us i ui l ul p f d O1 = *O2
|
|
ldi c uc s us i ui l ul p f d O1 = *O2
|
|
ldxr c uc s us i ui l ul p f d O1 = *(O2+O3)
|
|
ldxi c uc s us i ui l ul p f d O1 = *(O2+O3)
|
|
@end example
|
|
|
|
@item Store operations
|
|
@code{st} accepts two operands while @code{stx} accepts three; in
|
|
both cases, the first can be either a register or an immediate
|
|
value. Values are sign-extended to fit a whole register.
|
|
@example
|
|
str c uc s us i ui l ul p f d *O1 = O2
|
|
sti c uc s us i ui l ul p f d *O1 = O2
|
|
stxr c uc s us i ui l ul p f d *(O1+O2) = O3
|
|
stxi c uc s us i ui l ul p f d *(O1+O2) = O3
|
|
@end example
|
|
|
|
@item Stack management
|
|
These accept a single register parameter. These operations are not
|
|
guaranteed to be efficient on all architectures.
|
|
|
|
@example
|
|
pushr i ui l ul p @r{push }O1@r{ on the stack}
|
|
popr i ui l ul p @r{pop }O1@r{ off the stack}
|
|
@end example
|
|
|
|
@item Argument management
|
|
These are:
|
|
@example
|
|
prepare i f d
|
|
pusharg c uc s us i ui l ul p f d
|
|
getarg c uc s us i ui l ul p f d
|
|
arg c uc s us i ui l ul p f d
|
|
retval c uc s us i ui l ul p
|
|
@end example
|
|
|
|
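Of these, @code{prepare} and @code{pusharg} are used by the caller to
set up a call, @code{arg} and @code{getarg} are used by the callee, and
@code{retval} is used by the caller after the call returns. A code
snippet that wants to call another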
|
|
procedure and has to pass registers must, in order: use the
|
|
@code{prepare} instruction, giving the number of arguments to
|
|
be passed to the procedure (once for each data type); use
|
|
@code{pusharg} to push the arguments @strong{in reverse order};
|
|
and use @code{calli} or @code{finish} (explained below) to
|
|
perform the actual call.
|
|
|
|
@code{arg} and @code{getarg} are used by the callee.
|
|
@code{arg} is different from other instructions in that it does not
|
|
actually generate any code: instead, it is a function which returns
|
|
a value to be passed to @code{getarg}.@footnote{``Return a
|
|
value'' means that @lightning{} macros that compile these
|
|
instructions return a value when expanded.} You should call
|
|
@code{arg} as soon as possible, before any function call or, more
|
|
easily, right after the @code{prolog} or @code{leaf} instructions
|
|
(which are treated later).
|
|
|
|
@code{getarg} accepts a register argument and a value returned by
|
|
@code{arg}, and will move that argument to the register, extending
|
|
it (with or without sign, according to the data type specification)
|
|
to fit a whole register. These instructions are more intimately
|
|
related to the usage of the @lightning{} instruction set in code
|
|
that generates other code, so they will be treated more
|
|
specifically in @ref{GNU lightning macros, , Generating code at
|
|
run-time}.
|
|
|
|
Finally, the @code{retval} instruction fetches the return value of a
|
|
called function in a register. The @code{retval} instruction takes a
|
|
register argument and copies the return value of the previously called
|
|
function in that register. A function should put its own return value
|
|
in the @code{RET} register before returning. @xref{Fibonacci, the
|
|
Fibonacci numbers}, for an example.
|
|
|
|
You should observe a few rules when using these macros. First of
|
|
all, it is not allowed to call functions with more than six arguments;
|
|
this was done to simplify and speed up the implementation on
|
|
architectures that use registers for parameter passing.
|
|
|
|
You should not nest calls to @code{prepare}, nor call zero-argument
|
|
functions (which do not need a call to @code{prepare}) inside a
|
|
@code{prepare/calli} or @code{prepare/finish} block. Doing this
|
|
might corrupt already pushed arguments.
|
|
|
|
You @strong{cannot} pass parameters between subroutines using
|
|
the six general-purpose registers. This might work only when
|
|
targeting particular architectures.
|
|
|
|
On the other hand, it is possible to assume that caller-save registers
|
|
(@code{R0} through @code{R2}) are not clobbered by another dynamically
|
|
generated function which does not use them as operands in its code and
|
|
which does not return a value.
|
|
|
|
@item Branch instructions
|
|
Like @code{arg}, these also return a value which, in this case,
|
|
is to be used to compile forward branches as explained in
|
|
@ref{Fibonacci, , Fibonacci numbers}. They accept a pointer to the
|
|
destination of the branch and two operands to be compared; of these,
|
|
the last can be either a register or an immediate. They are:
|
|
@example
|
|
bltr i ui l ul p f d @r{if }(O2 < O3)@r{ goto }O1
|
|
blti i ui l ul p @r{if }(O2 < O3)@r{ goto }O1
|
|
bler i ui l ul p f d @r{if }(O2 <= O3)@r{ goto }O1
|
|
blei i ui l ul p @r{if }(O2 <= O3)@r{ goto }O1
|
|
bgtr i ui l ul p f d @r{if }(O2 > O3)@r{ goto }O1
|
|
bgti i ui l ul p @r{if }(O2 > O3)@r{ goto }O1
|
|
bger i ui l ul p f d @r{if }(O2 >= O3)@r{ goto }O1
|
|
bgei i ui l ul p @r{if }(O2 >= O3)@r{ goto }O1
|
|
beqr i ui l ul p f d @r{if }(O2 == O3)@r{ goto }O1
|
|
beqi i ui l ul p @r{if }(O2 == O3)@r{ goto }O1
|
|
bner i ui l ul p f d @r{if }(O2 != O3)@r{ goto }O1
|
|
bnei i ui l ul p @r{if }(O2 != O3)@r{ goto }O1
|
|
|
|
bunltr f d @r{if }!(O2 >= O3)@r{ goto }O1
|
|
bunler f d @r{if }!(O2 > O3)@r{ goto }O1
|
|
bungtr f d @r{if }!(O2 <= O3)@r{ goto }O1
|
|
bunger f d @r{if }!(O2 < O3)@r{ goto }O1
|
|
buneqr f d @r{if }!(O2 < O3) && !(O2 > O3)@r{ goto }O1
|
|
bltgtr f d @r{if }!(O2 >= O3) || !(O2 <= O3)@r{ goto }O1
|
|
bordr f d @r{if } (O2 == O2) && (O3 == O3)@r{ goto }O1
|
|
bunordr f d @r{if } (O2 != O2) || (O3 != O3)@r{ goto }O1
|
|
|
|
bmsr i ui l ul @r{if }O2 & O3@r{ goto }O1
|
|
bmsi i ui l ul @r{if }O2 & O3@r{ goto }O1
|
|
bmcr i ui l ul @r{if }!(O2 & O3)@r{ goto }O1
|
|
bmci i ui l ul @r{if }!(O2 & O3)@r{ goto }O1@footnote{These mnemonics mean, respectively, @dfn{branch if mask set} and @dfn{branch if mask cleared}.}
|
|
boaddr i ui l ul O2 += O3@r{, goto }O1@r{ on overflow}
|
|
boaddi i ui l ul O2 += O3@r{, goto }O1@r{ on overflow}
|
|
bosubr i ui l ul O2 -= O3@r{, goto }O1@r{ on overflow}
|
|
bosubi i ui l ul O2 -= O3@r{, goto }O1@r{ on overflow}
|
|
@end example
|
|
|
|
@item Jump and return operations
|
|
These accept one argument except @code{ret} which has none; the
|
|
difference between @code{finish} and @code{calli} is that the
|
|
latter does not clean the stack from pushed parameters (if any)
|
|
and the former must @strong{always} follow a @code{prepare}
|
|
instruction. Results are undefined when using function calls
|
|
in a leaf function.
|
|
@example
|
|
calli (not specified) @r{function call to O1}
|
|
callr (not specified) @r{function call to a register}
|
|
finish (not specified) @r{function call to O1}
|
|
finishr (not specified) @r{function call to a register}
|
|
jmpi/jmpr (not specified) @r{unconditional jump to O1}
|
|
prolog (not specified) @r{function prolog for O1 args}
|
|
leaf (not specified) @r{the same for leaf functions}
|
|
ret (not specified) @r{return from subroutine}
|
|
retval c uc s us i ui l ul p f d @r{move return value}
|
|
@r{to register}
|
|
@end example
|
|
|
|
Like branch instructions, @code{jmpi} also returns a value which is to
|
|
be used to compile forward branches. @xref{Fibonacci, , Fibonacci
|
|
numbers}.
|
|
|
|
@end table
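
The carry-chained pairs in the binary ALU group (@code{addc} followed
directly by @code{addx}) are typically used for multi-word arithmetic.
The sketch below models the two table entries in plain C to add two
64-bit values held as 32-bit halves. It is only an illustration of the
semantics given in the table; the function names are hypothetical and
nothing here is generated by @lightning{}.

```c
#include <stdint.h>

/* Model of addc: O1 = O2 + O3, and record the carry out of bit 31. */
static uint32_t model_addc(uint32_t o2, uint32_t o3, unsigned *carry)
{
    uint32_t sum = o2 + o3;
    *carry = sum < o2;          /* unsigned wrap-around means a carry */
    return sum;
}

/* Model of addx: O1 = O2 + (O3 + carry); consumes the carry from addc. */
static uint32_t model_addx(uint32_t o2, uint32_t o3, unsigned carry)
{
    return o2 + (o3 + carry);
}

/* Add two 64-bit values given as 32-bit halves, low word first. */
void add64(const uint32_t a[2], const uint32_t b[2], uint32_t out[2])
{
    unsigned carry;

    out[0] = model_addc(a[0], b[0], &carry);   /* low words, set carry  */
    out[1] = model_addx(a[1], b[1], carry);    /* high words, use carry */
}
```

The carry lives in an ordinary variable here, so nothing can disturb it
between the two steps; in generated code it lives in a processor flag,
which is why @code{addx} has to follow @code{addc} directly.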
|
|
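
The floating-point comparisons whose names contain @code{un} differ from
the ordinary ones only when an operand is a NaN. As a rough illustration,
the functions below transcribe the C expressions from the compare group
one for one; the @code{model_} names are invented for this sketch and
are not @lightning{} macros.

```c
#include <math.h>

/* Each function mirrors the C expression listed for the instruction of
   the same name; a and b stand for O2 and O3. */
int model_ltr(double a, double b)    { return a < b; }
int model_unltr(double a, double b)  { return !(a >= b); }
int model_uneqr(double a, double b)  { return !(a < b) && !(a > b); }
int model_ordr(double a, double b)   { return (a == a) && (b == b); }
int model_unordr(double a, double b) { return (a != a) || (b != b); }

/* Convenience for experiments: a quiet NaN from <math.h>. */
double model_nan(void) { return NAN; }
```

With @code{model_nan()} as one operand, @code{model_ltr} yields 0 while
@code{model_unltr} yields 1, because every ordered comparison involving
a NaN is false in C.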
|
|
As a small appetizer, here is a small function that adds 1 to the input
|
|
parameter (an @code{int}). I'm using an assembly-like syntax here which
|
|
is a bit different from the one used when writing real subroutines with
|
|
@lightning{}; the real syntax will be introduced in @ref{GNU lightning
macros, , Generating code at run-time}.
|
|
|
|
@example
|
|
incr:
     leaf      1
in = arg_i                   @rem{! We have an integer argument}
     getarg_i  R0, in        @rem{! Move it to R0}
     addi_i    RET, R0, 1    @rem{! Add 1\, put result in return value}
     ret                     @rem{! And return the result}
|
|
@end example
|
|
|
|
And here is another function which uses the @code{printf} function from
|
|
the standard C library to write a number in hexadecimal notation:
|
|
|
|
@example
|
|
printhex:
     prolog    1
in = arg_i                   @rem{! Same as above}
     getarg_i  R0, in
     prepare   2             @rem{! Begin call sequence for printf}
     pusharg_i R0            @rem{! Push second argument}
     pusharg_p "%x"          @rem{! Push format string}
     finish    printf        @rem{! Call printf}
     ret                     @rem{! Return to caller}
|
|
@end example
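
For comparison, the two listings correspond roughly to the following
ordinary C functions. This is only a reference sketch (the @code{_ref}
names are made up here); such functions can be handy as an oracle when
checking what dynamically generated code returns.

```c
#include <stdio.h>

/* Plain C counterpart of the incr listing: return the argument plus one. */
int incr_ref(int in)
{
    return in + 1;
}

/* Plain C counterpart of printhex: print the argument in hexadecimal.
   The cast is there because %x expects an unsigned int. */
void printhex_ref(int in)
{
    printf("%x", (unsigned int) in);
}
```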
|
|
|
|
@node GNU lightning macros
|
|
@chapter Generating code at run-time
|
|
|
|
To use @lightning{}, you should include the @file{lightning.h} file that
|
|
is put in your include directory by the @samp{make install} command.
|
|
That include file defines about four hundred public macros (plus
|
|
others that are private to @lightning{}), one for each opcode listed
|
|
above.
|
|
|
|
Each of the instructions above translates to a macro. All you have to
|
|
do is prepend @code{jit_} (lowercase) to opcode names and @code{JIT_}
|
|
(uppercase) to register names. Of course, parameters are to be put
|
|
between parentheses, just like with every other @sc{cpp} macro.
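
The naming rule is mechanical enough to be written down as code. The
helpers below are purely illustrative (they are not part of
@lightning{}, and their names are assumptions of this sketch); they
build the macro and register names from the mnemonics used in the
previous chapter.

```c
#include <stdio.h>

/* Illustrative helper: build the macro name for an opcode such as "addi"
   and a type suffix such as "i".  Returns 0 on success, -1 if the
   buffer is too small. */
int macro_name(char *out, size_t size, const char *opcode, const char *type)
{
    int n = snprintf(out, size, "jit_%s_%s", opcode, type);
    return (n < 0 || (size_t) n >= size) ? -1 : 0;
}

/* Same idea for register names: "R0" becomes "JIT_R0". */
int register_name(char *out, size_t size, const char *reg)
{
    int n = snprintf(out, size, "JIT_%s", reg);
    return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

For instance, @code{addi} with type @code{i} gives @code{jit_addi_i},
the macro used in the first example below. Opcodes without a type
suffix, such as @code{ret}, simply become @code{jit_ret}.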
|
|
|
|
This small tutorial presents four examples:
|
|
|
|
@iftex
|
|
@itemize @bullet
|
|
@item
|
|
The @code{incr} function found in @ref{The instruction set, ,
|
|
@lightning{}'s instruction set}:
|
|
|
|
@item
|
|
A simple function call to @code{printf}
|
|
|
|
@item
|
|
An RPN calculator.
|
|
|
|
@item
|
|
Fibonacci numbers
|
|
@end itemize
|
|
@end iftex
|
|
@ifnottex
|
|
@menu
|
|
* incr:: A function which increments a number by one
|
|
* printf:: A simple function call to printf
|
|
* RPN calculator:: A more complex example, an RPN calculator
|
|
* Fibonacci:: Calculating Fibonacci numbers
|
|
@end menu
|
|
@end ifnottex
|
|
|
|
@node incr
|
|
@section A function which increments a number by one
|
|
|
|
Let's see how to create and use the sample @code{incr} function created
|
|
in @ref{The instruction set, , @lightning{}'s instruction set}:
|
|
|
|
@example
|
|
#include <stdio.h>
#include "lightning.h"

static jit_insn codeBuffer[1024];

typedef int (*pifi)(int);    @rem{/* Pointer to Int Function of Int */}

int main()
@{
  pifi  incr = (pifi) (jit_set_ip(codeBuffer).iptr);
  int   in;

  jit_leaf(1);                      @rem{/* @t{     leaf  1} */}
  in = jit_arg_i();                 @rem{/* @t{in = arg_i} */}
  jit_getarg_i(JIT_R0, in);         @rem{/* @t{     getarg_i R0} */}
  jit_addi_i(JIT_RET, JIT_R0, 1);   @rem{/* @t{     addi_i   RET\, R0\, 1} */}
  jit_ret();                        @rem{/* @t{     ret} */}

  jit_flush_code(codeBuffer, jit_get_ip().ptr);

  @rem{/* call the generated code\, passing 5 as an argument */}
  printf("%d + 1 = %d\n", 5, incr(5));
  return 0;
@}
|
|
@end example
|
|
|
|
Let's examine the code line by line (well, almost@dots{}):
|
|
|
|
@table @t
|
|
@item #include "lightning.h"
|
|
You already know about this. It defines all of @lightning{}'s macros.
|
|
|
|
@item static jit_insn codeBuffer[1024];
|
|
You might wonder what @code{jit_insn} is. It is just a type that
|
|
is defined by @lightning{}. Its exact definition depends on the
|
|
architecture; in general, defining an array of 1024 @code{jit_insn}s
|
|
allows one to write 100 to 400 @lightning{} instructions (depending on
|
|
the architecture and exact instructions).
|
|
|
|
@item typedef int (*pifi)(int);
|
|
Just a handy typedef for a pointer to a function that takes an
|
|
@code{int} and returns another.
|
|
|
|
@item pifi incr = (pifi) (jit_set_ip(codeBuffer).iptr);
|
|
This is the first @lightning{} macro we encounter that does not map to
|
|
an instruction. It is @code{jit_set_ip}, which takes a pointer to an
|
|
area of memory where compiled code will be put and returns the same
|
|
value, cast to a @code{union} type whose members are pointers to
|
|
functions returning different C types. This union is called
|
|
@code{jit_code} and is defined as follows:
|
|
|
|
@example
|
|
typedef union jit_code @{
  char           *ptr;
  void           (*vptr)();
  char           (*cptr)();
  unsigned char  (*ucptr)();
  short          (*sptr)();
  unsigned short (*usptr)();
  int            (*iptr)();
  unsigned int   (*uiptr)();
  long           (*lptr)();
  unsigned long  (*ulptr)();
  void *         (*pptr)();
  float          (*fptr)();
  double         (*dptr)();
@} jit_code;
|
|
@end example
|
|
|
|
Any of the members could have been used, since the result is soon cast
|
|
to type @code{pifi} but, for the sake of clarity, the program uses
|
|
@code{iptr}, a pointer to a function with no prototype and returning an
|
|
@code{int}.
|
|
|
|
Analogous to @code{jit_set_ip} is @code{jit_get_ip}, which does not
|
|
modify the instruction pointer---it is nothing more than a cast of the
|
|
current @sc{ip} to @code{jit_code}.
|
|
|
|
@item int in;
|
|
A footnote in @ref{The instruction set, , @lightning{}'s instruction
|
|
set}, under the description of @code{arg}, says that macros implementing
|
|
@code{arg} return a value---we'll be using this variable to store the
|
|
result of @code{arg}.
|
|
|
|
@item jit_leaf(1);
|
|
Ok, so we start generating code for our beloved function@dots{} it will
|
|
accept one argument and won't call any other function.
|
|
|
|
@item in = jit_arg_i();
|
|
@itemx jit_getarg_i(JIT_R0, in);
|
|
We retrieve the first (and only) argument, an integer, and store it
|
|
into the general-purpose register @code{R0}.
|
|
|
|
@item jit_addi_i(JIT_RET, JIT_R0, 1);
|
|
We add one to the content of the register and store the result in the
|
|
return value.
|
|
|
|
@item jit_ret();
|
|
This instruction generates a standard function epilog that returns
|
|
the contents of the @code{RET} register.
|
|
|
|
@item jit_flush_code(codeBuffer, jit_get_ip().ptr);
|
|
This instruction is very important. It flushes the generated code
area out of the processor's instruction cache, so that the processor
does not execute stale data that it happens to find there. The
|
|
@code{jit_flush_code} function accepts the first and the last address
|
|
to flush; we use @code{jit_get_ip} to find out the latter.
|
|
|
|
@item printf("%d + 1 = %d\n", 5, incr(5));
|
|
Calling our function is this simple---it is not distinguishable from
|
|
a normal C function call, the only difference being that @code{incr}
|
|
is a variable.
|
|
@end table
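
The relationship between @code{jit_set_ip}, @code{jit_get_ip} and the
address range handed to @code{jit_flush_code} can be pictured with a toy
model in plain C. This is a sketch under simplifying assumptions, not how
@lightning{} is implemented: every @code{toy_} name is hypothetical, and
the ``instructions'' are just bytes stored through a cursor.

```c
#include <stddef.h>

/* Toy stand-in for the instruction pointer: the current emission point. */
static unsigned char *toy_ip;

/* Start emitting at buffer; the same address is handed back. */
unsigned char *toy_set_ip(unsigned char *buffer)
{
    toy_ip = buffer;
    return buffer;
}

/* Read the current emission point without changing it. */
unsigned char *toy_get_ip(void)
{
    return toy_ip;
}

/* Store one byte and advance, as each emitted instruction would. */
void toy_emit(unsigned char byte)
{
    *toy_ip++ = byte;
}

/* Number of bytes emitted since start. */
size_t toy_emitted(const unsigned char *start)
{
    return (size_t) (toy_get_ip() - start);
}
```

In this picture, the first argument of the flush call is the value
returned when the pointer was set, and the second is whatever the
pointer has advanced to once the last instruction has been emitted.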
|
|
|
|
@lightning{} abstracts two phases of dynamic code generation: selecting
|
|
instructions that map the standard representation, and emitting binary
|
|
code for these instructions. The client program has the responsibility
|
|
of describing the code to be generated using the standard @lightning{}
|
|
instruction set.
|
|
|
|
Let's examine the code generated for @code{incr} on the SPARC and x86
|
|
architectures (on the right is the code that an assembly-language
|
|
programmer would write):
|
|
|
|
@table @b
|
|
@item SPARC
|
|
@example
|
|
      save  %sp, -96, %sp
      mov   %i0, %l0                   retl
      add   %l0, 1, %i0                add   %o0, 1, %o0
      ret
      restore
|
|
@end example
|
|
In this case, @lightning{} introduces overhead to create a register
|
|
window (not knowing that the procedure is a leaf procedure) and to
|
|
move the argument to the general purpose register @code{R0} (which
|
|
maps to @code{%l0} on the SPARC). The former overhead could be
|
|
avoided by teaching @lightning{} about leaf procedures (@pxref{Future});
|
|
the latter could instead be avoided by rewriting the getarg instruction
|
|
as @code{jit_getarg_i(JIT_RET, in)}, which was not done in this
|
|
example.
|
|
|
|
@item x86
|
|
@example
|
|
      pushl   %ebp
      movl    %esp, %ebp
      pushl   %ebx
      pushl   %esi
      pushl   %edi
      movl    8(%ebp), %eax            movl 4(%esp), %eax
      addl    $1, %eax                 incl %eax
      popl    %edi
      popl    %esi
      popl    %ebx
      popl    %ebp
      ret                              ret
|
|
@end example
|
|
In this case, the main overhead is due to the function's prolog and
epilog, which together are nine instructions long on the x86; a
hand-written routine would not save unused callee-preserved registers
on the stack. This is less of a problem in more complicated uses,
however, because a more complex procedure would probably use
the @code{V0} through @code{V2} registers (@code{%ebx}, @code{%esi},
@code{%edi}); in this case, a hand-written routine would have included
the prolog too. Also, a ten-byte prolog would probably be a small
overhead in a more complex function.
|
|
@end table
|
|
|
|
In such a simple case, the macros that make up the back-end compile
|
|
reasonably efficient code, with the notable exception of prolog/epilog
|
|
code.
|
|
|
|
@node printf
|
|
@section A simple function call to @code{printf}
|
|
|
|
Again, here is the code for the example:
|
|
|
|
@example
|
|
#include <stdio.h>
#include "lightning.h"

static jit_insn codeBuffer[1024];

typedef void (*pvfi)(int);      @rem{/* Pointer to Void Function of Int */}

int main()
@{
  pvfi          myFunction;             @rem{/* ptr to generated code */}
  char          *start, *end;           @rem{/* a couple of labels */}
  int           in;                     @rem{/* to get the argument */}

  myFunction = (pvfi) (jit_set_ip(codeBuffer).vptr);
  start = jit_get_ip().ptr;
  jit_prolog(1);
  in = jit_arg_i();
  jit_movi_p(JIT_R0, "generated %d bytes\n");
  jit_getarg_i(JIT_R1, in);
  jit_prepare(2);
    jit_pusharg_i(JIT_R1);              @rem{/* push in reverse order */}
    jit_pusharg_p(JIT_R0);
  jit_finish(printf);
  jit_ret();
  end = jit_get_ip().ptr;

  @rem{/* call the generated code\, passing its size as argument */}
  jit_flush_code(start, end);
  myFunction(end - start);
  return 0;
@}
|
|
@end example
|
|
|
|
The function shows how many bytes were generated. Most of the code
|
|
is not very interesting, as it resembles very closely the program
|
|
presented in @ref{incr, , A function which increments a number by one}.
|
|
|
|
For this reason, we're going to concentrate on just a few statements.
|
|
|
|
@table @t
|
|
@item start = jit_get_ip().ptr;
|
|
@itemx @r{@dots{}}
|
|
@itemx end = jit_get_ip().ptr;
|
|
These two statements call the @code{jit_get_ip} macro which was
|
|
mentioned in @ref{incr, , A function which increments a number by one}
|
|
too. In this case we use the only field of @code{jit_code} that is
|
|
not a function pointer: @code{ptr}, which is a simple @code{char *}.
|
|
|
|
@item jit_movi_p(JIT_R0, "generated %d bytes\n");
|
|
Note the use of the @samp{p} type specifier, which automatically
|
|
casts the second parameter to an @code{unsigned long} to make the
|
|
code clearer and less cluttered by typecasts.
|
|
|
|
@item jit_prepare(2);
|
|
@itemx jit_pusharg_i(JIT_R1);
|
|
@itemx jit_pusharg_p(JIT_R0);
|
|
@itemx jit_finish(printf);
|
|
Once the arguments to @code{printf} have been put in general-purpose
|
|
registers, we can start a prepare/pusharg/finish sequence that
|
|
moves the arguments to either the stack or registers, then calls
@code{printf}, then cleans up the stack. Note how @lightning{}
abstracts the differences between different architectures and
ABIs: the client program does not know how parameter passing
works on the host architecture.
|
|
@end table
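
Why are the arguments pushed in reverse order? One way to picture it is a
toy model in which every argument goes on a downward-growing stack, as in
the usual 32-bit x86 C calling convention. This is only a sketch under
that assumption: the @code{toy_} names are hypothetical, and on
architectures that pass arguments in registers the mechanics differ,
which is exactly the detail that @lightning{} hides.

```c
#include <stddef.h>

#define TOY_STACK_WORDS 16

/* A toy downward-growing stack. */
struct toy_stack {
    long words[TOY_STACK_WORDS];
    size_t sp;                      /* index of the current top of stack */
};

/* Begin a call sequence with an empty stack. */
void toy_prepare(struct toy_stack *s)
{
    s->sp = TOY_STACK_WORDS;        /* one past the last slot */
}

/* Push one argument; the stack grows towards lower indices. */
void toy_pusharg(struct toy_stack *s, long value)
{
    s->words[--s->sp] = value;
}

/* What the callee sees: argument n, counted from the top of the stack. */
long toy_getarg(const struct toy_stack *s, size_t n)
{
    return s->words[s->sp + n];
}
```

Pushing the second argument first and the first argument last leaves the
first argument on top of the stack, so the callee finds argument 0 at the
lowest address.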
|
|
|
|
@node RPN calculator
|
|
@section A more complex example, an RPN calculator
|
|
|
|
We create a small stack-based RPN calculator which applies a series
|
|
of operators to a given parameter and to other numeric operands.
|
|
Unlike previous examples, the code generator is fully parameterized
|
|
and is able to compile different formulas to different functions.
|
|
Here is the code for the expression compiler; a sample usage will
|
|
follow.
|
|
|
|
@example
|
|
#include <stdio.h>
#include <stdlib.h>
#include "lightning.h"

typedef int (*pifi)(int);       @rem{/* Pointer to Int Function of Int */}

pifi compile_rpn(char *expr)
@{
  pifi fn;
  int in;
  fn = (pifi) (jit_get_ip().iptr);
  jit_leaf(1);
  in = jit_arg_i();
  jit_getarg_i(JIT_R0, in);

  while (*expr) @{
    char buf[32];
    int n;
    if (sscanf(expr, "%[0-9]%n", buf, &n)) @{
      expr += n - 1;
      jit_pushr_i(JIT_R0);
      jit_movi_i(JIT_R0, atoi(buf));
    @} else if (*expr == '+') @{
      jit_popr_i(JIT_R1);
      jit_addr_i(JIT_R0, JIT_R1, JIT_R0);
    @} else if (*expr == '-') @{
      jit_popr_i(JIT_R1);
      jit_subr_i(JIT_R0, JIT_R1, JIT_R0);
    @} else if (*expr == '*') @{
      jit_popr_i(JIT_R1);
      jit_mulr_i(JIT_R0, JIT_R1, JIT_R0);
    @} else if (*expr == '/') @{
      jit_popr_i(JIT_R1);
      jit_divr_i(JIT_R0, JIT_R1, JIT_R0);
    @} else @{
      fprintf(stderr, "cannot compile: %s\n", expr);
      abort();
    @}
    ++expr;
  @}
  jit_movr_i(JIT_RET, JIT_R0);
  jit_ret();
  return fn;
@}
|
|
@end example

The principle on which the calculator is based is easy: the stack
top is held in R0, while the remaining items of the stack are held
on the hardware stack. Compiling an operand pushes the old stack
top onto the stack and moves the operand into R0; compiling an
operator pops the second operand off the stack into R1, and
compiles the operation so that the result goes into R0, thus
becoming the new stack top.

Try to locate a call to @code{jit_set_ip} in the source code. You
will not find one; this means that the client has to manually set
the instruction pointer. This technique has one advantage and one
drawback. The advantage is that the client can simply set the
instruction pointer once and then generate code for multiple functions,
one after another, without caring about passing a different instruction
pointer each time; see @ref{Reentrancy, , Re-entrant usage of
@lightning{}} for the disadvantage.

Source code for the client (which lies in the same source file) follows:

@example
static jit_insn codeBuffer[1024];

int main()
@{
  pifi c2f, f2c;
  int i;

  jit_set_ip(codeBuffer);
  c2f = compile_rpn("9*5/32+");
  f2c = compile_rpn("32-5*9/");
  jit_flush_code(codeBuffer, jit_get_ip().ptr);

  printf("\nC:");
  for (i = 0; i <= 100; i += 10) printf("%3d ", i);
  printf("\nF:");
  for (i = 0; i <= 100; i += 10) printf("%3d ", c2f(i));
  printf("\n");

  printf("\nF:");
  for (i = 32; i <= 212; i += 10) printf("%3d ", i);
  printf("\nC:");
  for (i = 32; i <= 212; i += 10) printf("%3d ", f2c(i));
  printf("\n");
  return 0;
@}
@end example

The client displays a conversion table between Celsius and Fahrenheit
degrees (both Celsius-to-Fahrenheit and Fahrenheit-to-Celsius). The
formulas are @math{F(c) = c*9/5+32} and @math{C(f) = (f-32)*5/9},
respectively.

Providing the formula as an argument to @code{compile_rpn} effectively
parameterizes code generation, making it possible to use the same code
to compile different functions; this is what makes dynamic code
generation so powerful.

The @file{rpn.c} file in the @lightning{} distribution includes a more
complete (and more complex) implementation of @code{compile_rpn},
which does constant folding, allows the argument to the functions
to be used more than once, and is able to assemble instructions with
an immediate parameter.

@node Fibonacci
@section Fibonacci numbers

The code in this section calculates a variant of the Fibonacci sequence.
While the traditional Fibonacci sequence is modeled by the recurrence
relation:
@display
f(0) = f(1) = 1
f(n) = f(n-1) + f(n-2)
@end display

@noindent
the functions in this section calculate the following sequence, which
is more interesting as a benchmark@footnote{That's because, as is
easily seen, the sequence represents the number of activations of the
@code{nfibs} procedure that are needed to compute its value through
recursion.}:
@display
nfibs(0) = nfibs(1) = 1
nfibs(n) = nfibs(n-1) + nfibs(n-2) + 1
@end display

The purpose of this example is to introduce branches. There are two
kinds of branches: backward branches and forward branches. We'll
present the calculation in a recursive and an iterative form; the
former only uses forward branches, while the latter uses both.

@example
#include <stdio.h>
#include "lightning.h"

static jit_insn codeBuffer[1024];

typedef int (*pifi)(int);       @rem{/* Pointer to Int Function of Int */}

int main()
@{
  pifi nfibs = (pifi) (jit_set_ip(codeBuffer).iptr);
  int in;                       @rem{/* offset of the argument */}
  jit_insn *ref;                @rem{/* to patch the forward reference */}

  jit_prolog (1);
  in = jit_arg_ui ();
  jit_getarg_ui(JIT_V0, in);              @rem{/* V0 = n */}
  ref = jit_blti_ui (jit_forward(), JIT_V0, 2);
  jit_subi_ui (JIT_V1, JIT_V0, 1);        @rem{/* V1 = n-1 */}
  jit_subi_ui (JIT_V2, JIT_V0, 2);        @rem{/* V2 = n-2 */}
  jit_prepare(1);
  jit_pusharg_ui(JIT_V1);
  jit_finish(nfibs);
  jit_retval(JIT_V1);                     @rem{/* V1 = nfibs(n-1) */}
  jit_prepare(1);
  jit_pusharg_ui(JIT_V2);
  jit_finish(nfibs);
  jit_retval(JIT_V2);                     @rem{/* V2 = nfibs(n-2) */}
  jit_addi_ui(JIT_V1, JIT_V1, 1);
  jit_addr_ui(JIT_RET, JIT_V1, JIT_V2);   @rem{/* RET = V1 + V2 + 1 */}
  jit_ret();

  jit_patch(ref);                         @rem{/* patch jump */}
  jit_movi_i(JIT_RET, 1);                 @rem{/* RET = 1 */}
  jit_ret();

  @rem{/* call the generated code\, passing 32 as an argument */}
  jit_flush_code(codeBuffer, jit_get_ip().ptr);
  printf("nfibs(%d) = %d", 32, nfibs(32));
  return 0;
@}
@end example

As said above, this is the first example of dynamically compiling
branches. Branch instructions have three operands: two contain the
values to be compared, while the first is a @dfn{label}; @lightning{}
labels are represented as @code{jit_insn *} values. Unlike other
instructions (apart from @code{arg}, which is actually a directive
rather than an instruction), branch instructions also return a value
which, as we see in the example above, can be used to compile
forward references.

Compiling a forward reference is a two-step operation. First, a
branch is compiled with a dummy label, since the actual destination
of the jump is not yet known; the dummy label is returned by the
@code{jit_forward()} macro. The value returned by the branch
instruction is saved to be used later.

Then, when the destination of the jump is reached, another macro
is used, @code{jit_patch()}. This macro must be called once for
@strong{every} point in which the code had a forward branch to the
instruction following @code{jit_patch} (in this case a @code{movi_i}
instruction).

Now, here is the iterative version:

@example
#include <stdio.h>
#include "lightning.h"

static jit_insn codeBuffer[1024];

typedef int (*pifi)(int);       @rem{/* Pointer to Int Function of Int */}

int main()
@{
  pifi nfibs = (pifi) (jit_set_ip(codeBuffer).iptr);
  int in;                       @rem{/* offset of the argument */}
  jit_insn *ref;                @rem{/* to patch the forward reference */}
  jit_insn *loop;               @rem{/* start of the loop */}

  jit_leaf (1);
  in = jit_arg_ui ();
  jit_getarg_ui(JIT_R2, in);              @rem{/* R2 = n */}
  jit_movi_ui (JIT_R1, 1);
  ref = jit_blti_ui (jit_forward(), JIT_R2, 2);
  jit_subi_ui (JIT_R2, JIT_R2, 1);
  jit_movi_ui (JIT_R0, 1);

  loop= jit_get_label();
  jit_subi_ui (JIT_R2, JIT_R2, 1);        @rem{/* decr. counter */}
  jit_addr_ui (JIT_V0, JIT_R0, JIT_R1);   @rem{/* V0 = R0 + R1 */}
  jit_movr_ui (JIT_R0, JIT_R1);           @rem{/* R0 = R1 */}
  jit_addi_ui (JIT_R1, JIT_V0, 1);        @rem{/* R1 = V0 + 1 */}
  jit_bnei_ui (loop, JIT_R2, 0);          @rem{/* if (R2) goto loop; */}

  jit_patch(ref);                         @rem{/* patch forward jump */}
  jit_movr_ui (JIT_RET, JIT_R1);          @rem{/* RET = R1 */}
  jit_ret ();

  @rem{/* call the generated code\, passing 36 as an argument */}
  jit_flush_code(codeBuffer, jit_get_ip().ptr);
  printf("nfibs(%d) = %d", 36, nfibs(36));
  return 0;
@}
@end example

This code calculates the recurrence relation using iteration (a
@code{for} loop in high-level languages). There is still a forward
reference (indicated by the @code{jit_forward}/@code{jit_patch} pair);
there are no function calls anymore: instead, there is a backward
jump (the @code{bnei} at the end of the loop).

In this case, the destination address is already known, because the
jump lands on an instruction that has already been compiled.
However, the program must still remember the address where the jump
will land. This is achieved with @code{jit_get_label}, yet another
macro that is very similar to @code{jit_get_ip} but, instead of a
@code{jit_code} union, returns a @code{jit_insn *} that the branch
macros accept.

Now, let's make one more change: let's rewrite the loop like this:

@example
@r{@dots{}}

  jit_delay(
    jit_movi_ui (JIT_R1, 1),
    ref = jit_blti_ui (jit_forward(), JIT_R2, 2));
  jit_subi_ui (JIT_R2, JIT_R2, 1);
  jit_movi_ui (JIT_R0, 1);

  loop= jit_get_label();
  jit_subi_ui (JIT_R2, JIT_R2, 1);        @rem{/* decr. counter */}
  jit_addr_ui (JIT_V0, JIT_R0, JIT_R1);   @rem{/* V0 = R0 + R1 */}
  jit_movr_ui (JIT_R0, JIT_R1);           @rem{/* R0 = R1 */}
  jit_delay(
    jit_addi_ui (JIT_R1, JIT_V0, 1),      @rem{/* R1 = V0 + 1 */}
    jit_bnei_ui (loop, JIT_R2, 0));       @rem{/* if (R2) goto loop; */}

@r{@dots{}}
@end example

The @code{jit_delay} macro is used to schedule delay slots in jumps and
branches. This is optional, but might lead to performance improvements
in tight inner loops (of course not in a loop that is executed 35
times, but this is just an example).

@code{jit_delay} takes two @lightning{} instructions, a @dfn{delay
instruction} and a @dfn{branch instruction}. Note that the two
instructions must be written in execution order (first the delay
instruction, then the branch instruction), @strong{not} with the branch
first. If the current machine has a delay slot, the delay instruction
(or part of it) is placed in the delay slot after the branch
instruction; otherwise, the delay instruction is emitted before the
branch instruction. The delay instruction must not depend on being
executed before or after the branch.

Instead of @code{jit_patch}, you can use @code{jit_patch_at}, which
takes two arguments: the first is the same as for @code{jit_patch}, and
the second is the value to be patched in. In other words, these two
invocations have the same effect:

@example
jit_patch (jump_pc);
jit_patch_at (jump_pc, jit_get_label ());
@end example

Dual to branches and @code{jit_patch_at} are @code{jit_movi_p}
and @code{jit_patch_movi}, which can also be used to implement
forward references. @code{jit_movi_p} is carefully implemented
to use an encoding that is as long as possible, so that it can
always be patched; in addition, like branches, it will return
an address which is then passed to @code{jit_patch_movi}. The
usage of @code{jit_patch_movi} is similar to @code{jit_patch_at}.

@node Reentrancy
@chapter Re-entrant usage of @lightning{}

By default, @lightning{} is able to compile different functions at the
same time as long as it happens in different object files, and on the
other hand constrains code generation tasks to reside in a single
object file.

The reason for this is not apparent, but is easily explained:
the @file{lightning.h} header file defines its state as a
@code{static} variable, so calls to @code{jit_set_ip} and
@code{jit_get_ip} residing in different files access different
instruction pointers. This was not done without reason: it makes
the usage of @lightning{} much simpler, as it limits the initialization
tasks to the bare minimum and removes the need to link the program
with a separate library.

On the other hand, multi-threaded or otherwise concurrent programs
require reentrancy in the code generator, so this approach cannot be
the only one. In fact, it is possible to define your own copy of
@lightning{}'s instruction state by defining a variable of type
@code{struct jit_state} and @code{#define}-ing @code{_jit} to it:

@example
struct jit_state lightning;
#define _jit lightning
@end example

You are free to define the @code{jit_state} variable as you like:
@code{extern}, @code{static} to a function, @code{auto}, or global.

This feature takes advantage of an aspect of macros (@dfn{cascaded
macros}), which is documented thus in @acronym{CPP}'s reference manual:

@quotation
A cascade of macros is when one macro's body contains a reference to
another macro. This is very common practice. For example,
@example
#define BUFSIZE 1020
#define TABLESIZE BUFSIZE
@end example
This is not at all the same as defining @code{TABLESIZE} to be
@samp{1020}. The @code{#define} for @code{TABLESIZE} uses exactly the
body you specify---in this case, @code{BUFSIZE}---and does not check to
see whether it too is the name of a macro; it's only when you use
@code{TABLESIZE} that the result of its expansion is checked for more
macro names.

This makes a difference if you change the definition of @code{BUFSIZE}
at some point in the source file. @code{TABLESIZE}, defined as shown,
will always expand using the definition of @code{BUFSIZE} that is
currently in effect:
@example
#define BUFSIZE 1020
#define TABLESIZE BUFSIZE
#undef BUFSIZE
#define BUFSIZE 37
@end example

Now @code{TABLESIZE} expands (in two stages) to @samp{37}. (The @code{#undef}
is to prevent any warning about the nontrivial redefinition of
@code{BUFSIZE}.)
@end quotation

@noindent
In the same way, @code{jit_get_label} will adopt whatever definition of
@code{_jit} is in effect:
@example
#define jit_get_label() (_jit.pc)
@end example

Special care must be taken when functions residing in separate files
must access the same state. This could be the case, for example, if a
special library contained a function for strength reduction of
multiplications to adds and shifts, or maybe of divisions to
multiplications and shifts. The function would be compiled using a
single definition of @code{_jit}, and that definition would be used
whenever the function was called.

Since @lightning{} uses a feature of the preprocessor to obtain
re-entrancy, it makes sense to rely on the preprocessor in this case
too.

The idea is to pass the current @code{struct jit_state} to the
function:

@example
void
_opt_muli_i(struct jit_state *jit, int dest, int source, int n)
@{
#define _jit (*jit)
  @dots{}
#undef _jit
@}
@end example

@noindent
doing this unbeknownst to the client, using a macro in the header file:

@example
extern void _opt_muli_i(struct jit_state *, int, int, int);

#define opt_muli_i(rd, rs, n) _opt_muli_i(&_jit, (rd), (rs), (n))
@end example

@node Registers
@chapter Accessing the whole register file

As mentioned earlier, all @lightning{} back-ends
are guaranteed to have at least six integer registers and six
floating-point registers, but many back-ends will have more.

To access the entire register file, you can use the
@code{JIT_R}, @code{JIT_V} and @code{JIT_FPR} macros. They
accept a parameter that identifies the register number, which
must be strictly less than @code{JIT_R_NUM}, @code{JIT_V_NUM}
and @code{JIT_FPR_NUM} respectively; the number need not be
constant. Of course, expressions like @code{JIT_R0} and
@code{JIT_R(0)} denote the same register, and likewise for
integer callee-saved, or floating-point, registers.

@node Bundling GNU lightning
@chapter Using @lightning{} in your programs

It is very easy to include @lightning{}'s source code (without the
documentation and examples) into your program's distribution
so that people don't need to have it installed in order to use it.

Here is a step-by-step explanation of what to do:

@enumerate
@item Run @command{lightningize} from your package's main
distribution directory.
@example
lightningize
@end example

@noindent
This will copy the source code for the @lightning{} back ends
into the @file{lightning} directory of your package.

@item If you're using Automake, you might be pleased to know that
@file{Makefile.am} files will already be there.

If, instead, you're not using Automake and @code{aclocal},
you should delete the @file{Makefile.am} files (they are of no use
to you) and copy the contents of the @file{lightning.m4} file, found in
@command{aclocal}'s macro repository (usually @file{/usr/share/aclocal}),
to your @file{configure.in} or @file{acinclude.m4} or @file{aclocal.m4} file.

@item Include a call to the @code{LIGHTNING_CONFIGURE_IF_NOT_FOUND}
macro in your @file{configure.in} file.
@end enumerate

@code{LIGHTNING_CONFIGURE_IF_NOT_FOUND} will first look for a
pre-installed copy of @lightning{} and, if it can be found, it will
use it; otherwise, it will test if there is a back-end for the host
system. If @lightning{} is already installed, or if the system is
supported by @lightning{}, it will define the @code{HAVE_LIGHTNING}
symbol.

In addition, an Automake conditional named @code{HAVE_INSTALLED_LIGHTNING}
will be set if @lightning{} is already installed, which can be used to
set up include paths appropriately.

Finally, @code{LIGHTNING_CONFIGURE_IF_NOT_FOUND} accepts two
optional parameters: respectively, an action to be taken if @lightning{}
is available, and an action to be taken if it is not.