mirror of
https://git.savannah.gnu.org/git/guile.git
synced 2025-07-03 08:10:31 +02:00
update "data representation" part of guile internals doc
* doc/ref/api-control.texi (Handling Errors): Move the "Signalling Type
  Errors" section here.
* doc/ref/data-rep.texi (Data Representation): Refactor, lopping and
  cropping and stitching.
* doc/ref/libguile-concepts.texi (Dynamic Types):
* doc/ref/libguile-smobs.texi (Describing a New Type, Double Smobs):
* doc/ref/guile.texi (Guile Implementation, Programming in C): Adapt to
  refactorings.
* doc/ref/history.texi (A Scheme of Many Maintainers):
  (A Timeline of Selected Guile Releases, Status): Update.
parent
06dcb9dfb6
commit
0f7e6c56cd
6 changed files with 200 additions and 710 deletions
|
@ -1393,6 +1393,42 @@ which is the name of the procedure incorrectly invoked.
|
|||
@end deftypefn
|
||||
|
||||
|
||||
@subsubsection Signalling Type Errors
|
||||
|
||||
Every function visible at the Scheme level should aggressively check the
|
||||
types of its arguments, to avoid misinterpreting a value, and perhaps
|
||||
causing a segmentation fault. Guile provides some macros to make this
|
||||
easier.
|
||||
|
||||
@deftypefn Macro void SCM_ASSERT (int @var{test}, SCM @var{obj}, unsigned int @var{position}, const char *@var{subr})
|
||||
If @var{test} is zero, signal a ``wrong type argument'' error,
|
||||
attributed to the subroutine named @var{subr}, operating on the value
|
||||
@var{obj}, which is the @var{position}'th argument of @var{subr}.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro int SCM_ARG1
|
||||
@deftypefnx Macro int SCM_ARG2
|
||||
@deftypefnx Macro int SCM_ARG3
|
||||
@deftypefnx Macro int SCM_ARG4
|
||||
@deftypefnx Macro int SCM_ARG5
|
||||
@deftypefnx Macro int SCM_ARG6
|
||||
@deftypefnx Macro int SCM_ARG7
|
||||
One of the above values can be used for @var{position} to indicate the
|
||||
number of the argument of @var{subr} which is being checked.
|
||||
Alternatively, a positive integer number can be used, which allows
|
||||
checking arguments after the seventh. However, for parameter numbers up to
|
||||
seven it is preferable to use @code{SCM_ARGN} instead of the
|
||||
corresponding raw number, since it will make the code easier to
|
||||
understand.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro int SCM_ARGn
|
||||
Passing a value of zero or @code{SCM_ARGn} for @var{position} leaves
|
||||
unspecified which argument's type is incorrect. Again,
|
||||
@code{SCM_ARGn} should be preferred over a raw zero constant.
|
||||
@end deftypefn
|
||||
|
||||
|
||||
@node Continuation Barriers
|
||||
@subsection Continuation Barriers
|
||||
|
||||
|
|
|
@ -1,11 +1,11 @@
|
|||
@c -*-texinfo-*-
|
||||
@c This is part of the GNU Guile Reference Manual.
|
||||
@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004
|
||||
@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2010
|
||||
@c Free Software Foundation, Inc.
|
||||
@c See the file guile.texi for copying conditions.
|
||||
|
||||
@node Data Representation in Scheme
|
||||
@section Data Representation in Scheme
|
||||
@node Data Representation
|
||||
@section Data Representation
|
||||
|
||||
Scheme is a latently-typed language; this means that the system cannot,
|
||||
in general, determine the type of a given expression at compile time.
|
||||
|
@ -27,27 +27,25 @@ single type large enough to hold either a complete value or a pointer
|
|||
to a complete value, along with the necessary typing information.
|
||||
|
||||
The following sections will present a simple typing system, and then
|
||||
make some refinements to correct its major weaknesses. However, this is
|
||||
not a description of the system Guile actually uses. It is only an
|
||||
illustration of the issues Guile's system must address. We provide all
|
||||
the information one needs to work with Guile's data in @ref{The
|
||||
Libguile Runtime Environment}.
|
||||
|
||||
make some refinements to correct its major weaknesses. We then conclude
|
||||
with a discussion of specific choices that Guile has made regarding
|
||||
garbage collection and data representation.
|
||||
|
||||
@menu
|
||||
* A Simple Representation::
|
||||
* Faster Integers::
|
||||
* Cheaper Pairs::
|
||||
* Guile Is Hairier::
|
||||
* Conservative GC::
|
||||
* The SCM Type in Guile::
|
||||
@end menu
|
||||
|
||||
@node A Simple Representation
|
||||
@subsection A Simple Representation
|
||||
|
||||
The simplest way to meet the above requirements in C would be to
|
||||
represent each value as a pointer to a structure containing a type
|
||||
indicator, followed by a union carrying the real value. Assuming that
|
||||
@code{SCM} is the name of our universal type, we can write:
|
||||
The simplest way to represent Scheme values in C would be to represent
|
||||
each value as a pointer to a structure containing a type indicator,
|
||||
followed by a union carrying the real value. Assuming that @code{SCM} is
|
||||
the name of our universal type, we can write:
|
||||
|
||||
@example
|
||||
enum type @{ integer, pair, string, vector, ... @};
|
||||
|
@ -98,17 +96,17 @@ too costly, in both time and space. Integers should be very cheap to
|
|||
create and manipulate.
|
||||
|
||||
One possible solution comes from the observation that, on many
|
||||
architectures, structures must be aligned on a four-byte boundary.
|
||||
(Whether or not the machine actually requires it, we can write our own
|
||||
allocator for @code{struct value} objects that assures this is true.)
|
||||
In this case, the lower two bits of the structure's address are known to
|
||||
be zero.
|
||||
architectures, heap-allocated data (i.e., what you get when you call
|
||||
@code{malloc}) must be aligned on an eight-byte boundary. (Whether or
|
||||
not the machine actually requires it, we can write our own allocator for
|
||||
@code{struct value} objects that assures this is true.) In this case,
|
||||
the lower three bits of the structure's address are known to be zero.
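The alignment claim above can be checked directly. The following is a minimal sketch, not Guile code: @code{alloc_value} and @code{low_bits} are hypothetical helpers, the @code{struct value} shown is a stand-in for the essay's structure, and the allocator is assumed to be C11 @code{aligned_alloc}.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the essay's struct value. */
struct value
{
  int type;
  long payload;
};

/* Allocate a struct value on an eight-byte boundary.  aligned_alloc is
   C11; its size argument must be a multiple of the alignment, so the
   size is rounded up to the next multiple of eight.  */
static struct value *
alloc_value (void)
{
  size_t size = (sizeof (struct value) + 7) & ~(size_t) 7;
  return aligned_alloc (8, size);
}

/* Return the low three bits of a pointer's address.  */
static unsigned
low_bits (const void *p)
{
  return (unsigned) ((uintptr_t) p & 7);
}
```

For any pointer returned by @code{alloc_value}, @code{low_bits} yields zero, which is what frees those three bits to carry type information.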
|
||||
|
||||
This gives us the room we need to provide an improved representation
|
||||
for integers. We make the following rules:
|
||||
@itemize @bullet
|
||||
@item
|
||||
If the lower two bits of an @code{SCM} value are zero, then the SCM
|
||||
If the lower three bits of an @code{SCM} value are zero, then the SCM
|
||||
value is a pointer to a @code{struct value}, and everything proceeds as
|
||||
before.
|
||||
@item
|
||||
|
@ -132,11 +130,11 @@ struct value @{
|
|||
@} value;
|
||||
@};
|
||||
|
||||
#define POINTER_P(x) (((int) (x) & 3) == 0)
|
||||
#define POINTER_P(x) (((int) (x) & 7) == 0)
|
||||
#define INTEGER_P(x) (! POINTER_P (x))
|
||||
|
||||
#define GET_INTEGER(x) ((int) (x) >> 2)
|
||||
#define MAKE_INTEGER(x) ((SCM) (((x) << 2) | 1))
|
||||
#define GET_INTEGER(x) ((int) (x) >> 3)
|
||||
#define MAKE_INTEGER(x) ((SCM) (((x) << 3) | 1))
|
||||
@end example
|
||||
|
||||
Notice that @code{integer} no longer appears as an element of @code{enum
|
||||
|
@ -174,34 +172,36 @@ integers, we can compute their sum as follows:
|
|||
@example
|
||||
MAKE_INTEGER (GET_INTEGER (@var{x}) + GET_INTEGER (@var{y}))
|
||||
@end example
|
||||
Now, integer math requires no allocation or memory references. Most
|
||||
real Scheme systems actually use an even more efficient representation,
|
||||
but this essay isn't about bit-twiddling. (Hint: what if pointers had
|
||||
@code{01} in their least significant bits, and integers had @code{00}?)
|
||||
Now, integer math requires no allocation or memory references. Most real
|
||||
Scheme systems actually implement addition and other operations using an
|
||||
even more efficient algorithm, but this essay isn't about
|
||||
bit-twiddling. (Hint: how do you decide when to overflow to a bignum?
|
||||
How would you do it in assembly?)
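One way to approach the hint, sketched under stated assumptions: integers carry the tag @code{1} in their low three bits as above, @code{SCM} is taken to be @code{intptr_t} so the casts are lossless on 64-bit machines, and overflow is detected with the GCC/Clang builtin @code{__builtin_add_overflow}. The name @code{tagged_add} is hypothetical, and this is not how Guile itself implements addition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef intptr_t SCM;   /* assumption: a pointer-sized integer */

#define INTEGER_P(x)    (((x) & 7) == 1)
/* Right-shifting a negative value is implementation-defined in C; this
   assumes the usual arithmetic shift.  */
#define GET_INTEGER(x)  ((x) >> 3)
/* Shift as unsigned to avoid undefined behaviour on negative inputs.  */
#define MAKE_INTEGER(n) ((SCM) (((uintptr_t) (n) << 3) | 1))

/* Add two tagged integers without untagging them.  Each operand has
   the form 8*n + 1, so x + (y - 1) is 8*(nx + ny) + 1, which is already
   a correctly tagged result.  Return false if the machine addition
   overflowed; a real system would then fall back to a bignum.  */
static bool
tagged_add (SCM x, SCM y, SCM *sum)
{
  return !__builtin_add_overflow (x, y - 1, sum);
}
```

The overflow check costs one extra branch, and the common case needs no shifts at all.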
|
||||
|
||||
|
||||
@node Cheaper Pairs
|
||||
@subsection Cheaper Pairs
|
||||
|
||||
However, there is yet another issue to confront. Most Scheme heaps
|
||||
contain more pairs than any other type of object; Jonathan Rees says
|
||||
that pairs occupy 45% of the heap in his Scheme implementation, Scheme
|
||||
48. However, our representation above spends three @code{SCM}-sized
|
||||
words per pair --- one for the type, and two for the @sc{car} and
|
||||
@sc{cdr}. Is there any way to represent pairs using only two words?
|
||||
contain more pairs than any other type of object; Jonathan Rees said at
|
||||
one point that pairs occupy 45% of the heap in his Scheme
|
||||
implementation, Scheme 48. However, our representation above spends
|
||||
three @code{SCM}-sized words per pair --- one for the type, and two for
|
||||
the @sc{car} and @sc{cdr}. Is there any way to represent pairs using
|
||||
only two words?
|
||||
|
||||
Let us refine the convention we established earlier. Let us assert
|
||||
that:
|
||||
@itemize @bullet
|
||||
@item
|
||||
If the bottom two bits of an @code{SCM} value are @code{#b00}, then
|
||||
If the bottom three bits of an @code{SCM} value are @code{#b000}, then
|
||||
it is a pointer, as before.
|
||||
@item
|
||||
If the bottom two bits are @code{#b01}, then the upper bits are an
|
||||
If the bottom three bits are @code{#b001}, then the upper bits are an
|
||||
integer. This is a bit more restrictive than before.
|
||||
@item
|
||||
If the bottom two bits are @code{#b10}, then the value, with the bottom
|
||||
two bits masked out, is the address of a pair.
|
||||
If the bottom three bits are @code{#b010}, then the value, with the bottom
|
||||
three bits masked out, is the address of a pair.
|
||||
@end itemize
|
||||
|
||||
Here is the new C code:
|
||||
|
@ -223,14 +223,14 @@ struct pair @{
|
|||
SCM car, cdr;
|
||||
@};
|
||||
|
||||
#define POINTER_P(x) (((int) (x) & 3) == 0)
|
||||
#define POINTER_P(x) (((int) (x) & 7) == 0)
|
||||
|
||||
#define INTEGER_P(x) (((int) (x) & 3) == 1)
|
||||
#define GET_INTEGER(x) ((int) (x) >> 2)
|
||||
#define MAKE_INTEGER(x) ((SCM) (((x) << 2) | 1))
|
||||
#define INTEGER_P(x) (((int) (x) & 7) == 1)
|
||||
#define GET_INTEGER(x) ((int) (x) >> 3)
|
||||
#define MAKE_INTEGER(x) ((SCM) (((x) << 3) | 1))
|
||||
|
||||
#define PAIR_P(x) (((int) (x) & 3) == 2)
|
||||
#define GET_PAIR(x) ((struct pair *) ((int) (x) & ~3))
|
||||
#define PAIR_P(x) (((int) (x) & 7) == 2)
|
||||
#define GET_PAIR(x) ((struct pair *) ((int) (x) & ~7))
|
||||
@end example
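To see the three-bit pair tag working end to end, here is a small self-contained sketch. It assumes @code{SCM} is @code{intptr_t} (rather than casting through @code{int} as the fragments above do) and uses a hypothetical constructor, @code{make_pair}, which the essay does not define.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef intptr_t SCM;   /* assumption: a pointer-sized integer */

struct pair
{
  SCM car, cdr;
};

#define INTEGER_P(x)    (((x) & 7) == 1)
#define GET_INTEGER(x)  ((x) >> 3)
#define MAKE_INTEGER(n) ((SCM) (((uintptr_t) (n) << 3) | 1))

#define PAIR_P(x)       (((x) & 7) == 2)
#define GET_PAIR(x)     ((struct pair *) ((x) & ~(SCM) 7))

/* Hypothetical constructor: allocate an eight-byte-aligned pair, fill
   it in, and return its address with the pair tag (2) in the low bits.  */
static SCM
make_pair (SCM car, SCM cdr)
{
  struct pair *p = aligned_alloc (8, sizeof (struct pair));
  if (p == NULL)
    abort ();
  p->car = car;
  p->cdr = cdr;
  return (SCM) p | 2;
}
```

A value built with @code{make_pair} satisfies @code{PAIR_P} but not @code{INTEGER_P}, and @code{GET_PAIR} recovers the original address by masking the tag off.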
|
||||
|
||||
Notice that @code{enum type} and @code{struct value} now only contain
|
||||
|
@ -278,94 +278,32 @@ are referencing, making a modified pointer as fast to use as an
|
|||
unmodified pointer.
|
||||
|
||||
|
||||
@node Guile Is Hairier
|
||||
@subsection Guile Is Hairier
|
||||
|
||||
We originally started with a very simple typing system --- each object
|
||||
has a field that indicates its type. Then, for the sake of efficiency
|
||||
in both time and space, we moved some of the typing information directly
|
||||
into the @code{SCM} value, and left the rest in the @code{struct value}.
|
||||
Guile itself employs a more complex hierarchy, storing finer and finer
|
||||
gradations of type information in different places, depending on the
|
||||
object's coarser type.
|
||||
|
||||
In the author's opinion, Guile could be simplified greatly without
|
||||
significant loss of efficiency, but the simplified system would still be
|
||||
more complex than what we've presented above.
|
||||
|
||||
|
||||
@node The Libguile Runtime Environment
|
||||
@section The Libguile Runtime Environment
|
||||
|
||||
Here we present the specifics of how Guile represents its data. We
|
||||
don't go into complete detail; an exhaustive description of Guile's
|
||||
system would be boring, and we do not wish to encourage people to write
|
||||
code which depends on its details anyway. We do, however, present
|
||||
everything one need know to use Guile's data. It is assumed that the
|
||||
reader understands the concepts laid out in @ref{Data Representation
|
||||
in Scheme}.
|
||||
|
||||
FIXME: much of this is outdated as of 1.8, we don't provide many of
|
||||
these macros any more. Also here we're missing sections about the
|
||||
evaluator implementation, which is interesting, and notes about tail
|
||||
recursion between scheme and c.
|
||||
|
||||
@menu
|
||||
* General Rules::
|
||||
* Conservative GC::
|
||||
* Immediates vs Non-immediates::
|
||||
* Immediate Datatypes::
|
||||
* Non-immediate Datatypes::
|
||||
* Signalling Type Errors::
|
||||
* Unpacking the SCM type::
|
||||
@end menu
|
||||
|
||||
@node General Rules
|
||||
@subsection General Rules
|
||||
|
||||
Any code which operates on Guile datatypes must @code{#include} the
|
||||
header file @code{<libguile.h>}. This file contains a definition for
|
||||
the @code{SCM} typedef (Guile's universal type, as in the examples
|
||||
above), and definitions and declarations for a host of macros and
|
||||
functions that operate on @code{SCM} values.
|
||||
|
||||
All identifiers declared by @code{<libguile.h>} begin with @code{scm_}
|
||||
or @code{SCM_}.
|
||||
|
||||
@c [[I wish this were true, but I don't think it is at the moment. -JimB]]
|
||||
@c Macros do not evaluate their arguments more than once, unless documented
|
||||
@c to do so.
|
||||
|
||||
The functions described here generally check the types of their
|
||||
@code{SCM} arguments, and signal an error if their arguments are of an
|
||||
inappropriate type. Macros generally do not, unless that is their
|
||||
specified purpose. You must verify their argument types beforehand, as
|
||||
necessary.
|
||||
|
||||
Macros and functions that return a boolean value have names ending in
|
||||
@code{P} or @code{_p} (for ``predicate''). Those that return a negated
|
||||
boolean value have names starting with @code{SCM_N}. For example,
|
||||
@code{SCM_IMP (@var{x})} is a predicate which returns non-zero iff
|
||||
@var{x} is an immediate value (an @code{IM}). @code{SCM_NCONSP
|
||||
(@var{x})} is a predicate which returns non-zero iff @var{x} is
|
||||
@emph{not} a pair object (a @code{CONS}).
|
||||
|
||||
|
||||
@node Conservative GC
|
||||
@subsection Conservative Garbage Collection
|
||||
|
||||
Aside from the latent typing, the major source of constraints on a
|
||||
Scheme implementation's data representation is the garbage collector.
|
||||
The collector must be able to traverse every live object in the heap, to
|
||||
determine which objects are not live.
|
||||
determine which objects are not live, and thus collectable.
|
||||
|
||||
There are many ways to implement this, but Guile uses an algorithm
|
||||
called @dfn{mark and sweep}. The collector scans the system's global
|
||||
variables and the local variables on the stack to determine which
|
||||
objects are immediately accessible by the C code. It then scans those
|
||||
objects to find the objects they point to, @i{et cetera}. The collector
|
||||
sets a @dfn{mark bit} on each object it finds, so each object is
|
||||
traversed only once. This process is called @dfn{tracing}.
|
||||
There are many ways to implement this. Guile's garbage collection is
|
||||
built on a library, the Boehm-Demers-Weiser conservative garbage
|
||||
collector (BDW-GC). The BDW-GC ``just works'', for the most part. But
|
||||
since it is interesting to know how these things work, we include here a
|
||||
high-level description of what the BDW-GC does.
|
||||
|
||||
Garbage collection has two logical phases: a @dfn{mark} phase, in which
|
||||
the set of live objects is enumerated, and a @dfn{sweep} phase, in which
|
||||
objects not traversed in the mark phase are collected. Correct
|
||||
functioning of the collector depends on being able to traverse the
|
||||
entire set of live objects.
|
||||
|
||||
In the mark phase, the collector scans the system's global variables and
|
||||
the local variables on the stack to determine which objects are
|
||||
immediately accessible by the C code. It then scans those objects to
|
||||
find the objects they point to, and so on. The collector logically sets
|
||||
a @dfn{mark bit} on each object it finds, so each object is traversed
|
||||
only once.
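The mark phase described above can be sketched on a toy heap. This is an illustration only and not the BDW-GC's implementation: @code{struct object}, @code{mark}, and @code{count_garbage} are hypothetical names, and the roots are passed in explicitly instead of being found by scanning globals and the stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 2

/* Hypothetical heap object: a mark bit plus outgoing pointers.  */
struct object
{
  bool marked;
  struct object *children[MAX_CHILDREN];
};

/* Mark everything reachable from OBJ.  The mark bit doubles as a
   "visited" flag, so cycles terminate and each object is traced once.  */
static void
mark (struct object *obj)
{
  if (obj == NULL || obj->marked)
    return;
  obj->marked = true;
  for (size_t i = 0; i < MAX_CHILDREN; i++)
    mark (obj->children[i]);
}

/* Clear all mark bits, mark from each root, then count the objects in
   HEAP that stayed unmarked.  A sweep phase would reclaim exactly
   those objects.  */
static size_t
count_garbage (struct object *heap, size_t heap_len,
               struct object **roots, size_t n_roots)
{
  size_t garbage = 0;
  for (size_t i = 0; i < heap_len; i++)
    heap[i].marked = false;
  for (size_t i = 0; i < n_roots; i++)
    mark (roots[i]);
  for (size_t i = 0; i < heap_len; i++)
    if (!heap[i].marked)
      garbage++;
  return garbage;
}
```

Note that a cycle of two objects that point only at each other is still counted as garbage when no root reaches it, which is one advantage of tracing over reference counting.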
|
||||
|
||||
When the collector can find no unmarked objects pointed to by marked
|
||||
objects, it assumes that any objects that are still unmarked will never
|
||||
|
@ -382,7 +320,7 @@ for the collector's benefit.
|
|||
The list of global variables is usually not too difficult to maintain,
|
||||
since global variables are relatively rare. However, an explicitly
|
||||
maintained list of local variables (in the author's personal experience)
|
||||
is a nightmare to maintain. Thus, Guile uses a technique called
|
||||
is a nightmare to maintain. Thus, the BDW-GC uses a technique called
|
||||
@dfn{conservative garbage collection}, to make the local variable list
|
||||
unnecessary.
|
||||
|
||||
|
@ -392,50 +330,21 @@ is a pointer into the heap. Thus, the collector marks all objects whose
|
|||
addresses appear anywhere in the stack, without knowing for sure how
|
||||
that word is meant to be interpreted.
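The core test a conservative collector applies to each stack word can be sketched as follows. This is a simplified illustration with hypothetical names (@code{struct heap_range}, @code{maybe_pointer}, @code{count_candidates}) and a single contiguous heap; the BDW-GC's real checks are more elaborate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one contiguous heap region.  */
struct heap_range
{
  uintptr_t start;   /* address of the first byte */
  uintptr_t end;     /* one past the last byte */
};

/* Conservatively decide whether WORD might point to a heap object: it
   must lie inside the heap and be eight-byte aligned.  An integer that
   happens to pass both tests is treated as a pointer too, which is why
   some garbage may be retained.  */
static bool
maybe_pointer (uintptr_t word, const struct heap_range *heap)
{
  return word >= heap->start && word < heap->end && (word & 7) == 0;
}

/* Scan N_WORDS words (standing in for a stack) and count how many
   would be treated as pointers and traced.  */
static size_t
count_candidates (const uintptr_t *words, size_t n_words,
                  const struct heap_range *heap)
{
  size_t n = 0;
  for (size_t i = 0; i < n_words; i++)
    if (maybe_pointer (words[i], heap))
      n++;
  return n;
}
```

The test can only err in one direction: it may keep a dead object alive, but it never discards an object that a stack word really points to.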
|
||||
|
||||
In addition to the stack, the BDW-GC will also scan static data
|
||||
sections. This means that global variables are also scanned when looking
|
||||
for live Scheme objects.
|
||||
|
||||
Obviously, such a system will occasionally retain objects that are
|
||||
actually garbage, and should be freed. In practice, this is not a
|
||||
problem. The alternative, an explicitly maintained list of local
|
||||
variable addresses, is effectively much less reliable, due to programmer
|
||||
error.
|
||||
|
||||
To accommodate this technique, data must be represented so that the
|
||||
collector can accurately determine whether a given stack word is a
|
||||
pointer or not. Guile does this as follows:
|
||||
|
||||
@itemize @bullet
|
||||
@item
|
||||
Every heap object has a two-word header, called a @dfn{cell}. Some
|
||||
objects, like pairs, fit entirely in a cell's two words; others may
|
||||
store pointers to additional memory in either of the words. For
|
||||
example, strings and vectors store their length in the first word, and a
|
||||
pointer to their elements in the second.
|
||||
|
||||
@item
|
||||
Guile allocates whole arrays of cells at a time, called @dfn{heap
|
||||
segments}. These segments are always allocated so that the cells they
|
||||
contain fall on eight-byte boundaries, or whatever is appropriate for
|
||||
the machine's word size. Guile keeps all cells in a heap segment
|
||||
initialized, whether or not they are currently in use.
|
||||
|
||||
@item
|
||||
Guile maintains a sorted table of heap segments.
|
||||
@end itemize
|
||||
|
||||
Thus, given any random word @var{w} fetched from the stack, Guile's
|
||||
garbage collector can consult the table to see if @var{w} falls within a
|
||||
known heap segment, and check @var{w}'s alignment. If both tests pass,
|
||||
the collector knows that @var{w} is a valid pointer to a cell,
|
||||
intentional or not, and proceeds to trace the cell.
|
||||
|
||||
Note that heap segments do not contain all the data Guile uses; cells
|
||||
for objects like vectors and strings contain pointers to other memory
|
||||
areas. However, since those pointers are internal, and not shared among
|
||||
many pieces of code, it is enough for the collector to find the cell,
|
||||
and then use the cell's type to find more pointers to trace.
|
||||
error. Interested readers should see the BDW-GC web page at
|
||||
@uref{http://www.hpl.hp.com/personal/Hans_Boehm/gc}, for more
|
||||
information.
|
||||
|
||||
|
||||
@node Immediates vs Non-immediates
|
||||
@subsection Immediates vs Non-immediates
|
||||
@node The SCM Type in Guile
|
||||
@subsection The SCM Type in Guile
|
||||
|
||||
Guile classifies Scheme objects into two kinds: those that fit entirely
|
||||
within an @code{SCM}, and those that require heap storage.
|
||||
|
@ -446,481 +355,15 @@ mysterious end-of-file object, and some others.
|
|||
|
||||
The remaining types are called, not surprisingly, @dfn{non-immediates}.
|
||||
They include pairs, procedures, strings, vectors, and all other data
|
||||
types in Guile.
|
||||
types in Guile. For non-immediates, the @code{SCM} word contains a
|
||||
pointer to data on the heap, with further information about the object
|
||||
in question stored in that data.
|
||||
|
||||
@deftypefn Macro int SCM_IMP (SCM @var{x})
|
||||
Return non-zero iff @var{x} is an immediate object.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro int SCM_NIMP (SCM @var{x})
|
||||
Return non-zero iff @var{x} is a non-immediate object. This is the
|
||||
exact complement of @code{SCM_IMP}, above.
|
||||
@end deftypefn
|
||||
|
||||
Note that for versions of Guile prior to 1.4 it was necessary to use the
|
||||
@code{SCM_NIMP} macro before calling a finer-grained predicate to
|
||||
determine @var{x}'s type, such as @code{SCM_CONSP} or
|
||||
@code{SCM_VECTORP}. This is no longer required: the definitions of all
|
||||
Guile type predicates now include a call to @code{SCM_NIMP} where
|
||||
necessary.
|
||||
|
||||
|
||||
@node Immediate Datatypes
|
||||
@subsection Immediate Datatypes
|
||||
|
||||
The following datatypes are immediate values; that is, they fit entirely
|
||||
within an @code{SCM} value. The @code{SCM_IMP} and @code{SCM_NIMP}
|
||||
macros will distinguish these from non-immediates; see @ref{Immediates
|
||||
vs Non-immediates} for an explanation of the distinction.
|
||||
|
||||
Note that the type predicates for immediate values work correctly on any
|
||||
@code{SCM} value; you do not need to call @code{SCM_IMP} first, to
|
||||
establish that a value is immediate.
|
||||
|
||||
@menu
|
||||
* Integer Data::
|
||||
* Character Data::
|
||||
* Boolean Data::
|
||||
* Unique Values::
|
||||
@end menu
|
||||
|
||||
@node Integer Data
|
||||
@subsubsection Integers
|
||||
|
||||
Here are functions for operating on small integers, that fit within an
|
||||
@code{SCM}. Such integers are called @dfn{immediate numbers}, or
|
||||
@dfn{INUMs}. In general, INUMs occupy all but two bits of an
|
||||
@code{SCM}.
|
||||
|
||||
Bignums and floating-point numbers are non-immediate objects, and have
|
||||
their own, separate accessors. The functions here will not work on
|
||||
them. This is not as much of a problem as you might think, however,
|
||||
because the system never constructs bignums that could fit in an INUM,
|
||||
and never uses floating point values for exact integers.
|
||||
|
||||
@deftypefn Macro int SCM_INUMP (SCM @var{x})
|
||||
Return non-zero iff @var{x} is a small integer value.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro int SCM_NINUMP (SCM @var{x})
|
||||
The complement of SCM_INUMP.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro int SCM_INUM (SCM @var{x})
|
||||
Return the value of @var{x} as an ordinary, C integer. If @var{x}
|
||||
is not an INUM, the result is undefined.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro SCM SCM_MAKINUM (int @var{i})
|
||||
Given a C integer @var{i}, return its representation as an @code{SCM}.
|
||||
This function does not check for overflow.
|
||||
@end deftypefn
|
||||
|
||||
|
||||
@node Character Data
|
||||
@subsubsection Characters
|
||||
|
||||
Here are functions for operating on characters.
|
||||
|
||||
@deftypefn Macro int SCM_CHARP (SCM @var{x})
|
||||
Return non-zero iff @var{x} is a character value.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro {unsigned int} SCM_CHAR (SCM @var{x})
|
||||
Return the value of @var{x} as a C character. If @var{x} is not a
|
||||
Scheme character, the result is undefined.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro SCM SCM_MAKE_CHAR (int @var{c})
|
||||
Given a C character @var{c}, return its representation as a Scheme
|
||||
character value.
|
||||
@end deftypefn
|
||||
|
||||
|
||||
@node Boolean Data
|
||||
@subsubsection Booleans
|
||||
|
||||
Booleans are represented as two specific immediate SCM values,
|
||||
@code{SCM_BOOL_T} and @code{SCM_BOOL_F}. @xref{Booleans}, for more
|
||||
This section describes how the @code{SCM} type is actually represented
|
||||
and used at the C level. Interested readers should see
|
||||
@code{libguile/tags.h} for an exposition of how Guile stores type
|
||||
information.
|
||||
|
||||
@node Unique Values
|
||||
@subsubsection Unique Values
|
||||
|
||||
The immediate values that are neither small integers, characters, nor
|
||||
booleans are all unique values --- that is, datatypes with only one
|
||||
instance.
|
||||
|
||||
@deftypefn Macro SCM SCM_EOL
|
||||
The Scheme empty list object, or ``End Of List'' object, usually written
|
||||
in Scheme as @code{'()}.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro SCM SCM_EOF_VAL
|
||||
The Scheme end-of-file value. It has no standard written
|
||||
representation, for obvious reasons.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro SCM SCM_UNSPECIFIED
|
||||
The value returned by expressions which the Scheme standard says return
|
||||
an ``unspecified'' value.
|
||||
|
||||
This is sort of a weirdly literal way to take things, but the standard
|
||||
read-eval-print loop prints nothing when the expression returns this
|
||||
value, so it's not a bad idea to return this when you can't think of
|
||||
anything else helpful.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro SCM SCM_UNDEFINED
|
||||
The ``undefined'' value. Its most important property is that it is not
|
||||
equal to any valid Scheme value. This is put to various internal uses
|
||||
by C code interacting with Guile.
|
||||
|
||||
For example, when you write a C function that is callable from Scheme
|
||||
and which takes optional arguments, the interpreter passes
|
||||
@code{SCM_UNDEFINED} for any arguments you did not receive.
|
||||
|
||||
We also use this to mark unbound variables.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro int SCM_UNBNDP (SCM @var{x})
|
||||
Return true if @var{x} is @code{SCM_UNDEFINED}. Apply this to a
|
||||
symbol's value to see if it has a binding as a global variable.
|
||||
@end deftypefn
|
||||
|
||||
|
||||
@node Non-immediate Datatypes
|
||||
@subsection Non-immediate Datatypes
|
||||
|
||||
A non-immediate datatype is one which lives in the heap, either because
|
||||
it cannot fit entirely within a @code{SCM} word, or because it denotes a
|
||||
specific storage location (in the nomenclature of the Revised^5 Report
|
||||
on Scheme).
|
||||
|
||||
The @code{SCM_IMP} and @code{SCM_NIMP} macros will distinguish these
|
||||
from immediates; see @ref{Immediates vs Non-immediates}.
|
||||
|
||||
Given a cell, Guile distinguishes between pairs and other non-immediate
|
||||
types by storing special @dfn{tag} values in a non-pair cell's car, that
|
||||
cannot appear in normal pairs. A cell with a non-tag value in its car
|
||||
is an ordinary pair. The type of a cell with a tag in its car depends
|
||||
on the tag; the non-immediate type predicates test this value. If a tag
|
||||
value appears elsewhere (in a vector, for example), the heap may become
|
||||
corrupted.
|
||||
|
||||
Note how the type information for a non-immediate object is split
|
||||
between the @code{SCM} word and the cell that the @code{SCM} word points
|
||||
to. The @code{SCM} word itself only indicates that the object is
|
||||
non-immediate --- in other words stored in a heap cell. The tag stored
|
||||
in the first word of the heap cell indicates more precisely the type of
|
||||
that object.
|
||||
|
||||
The type predicates for non-immediate values work correctly on any
|
||||
@code{SCM} value; you do not need to call @code{SCM_NIMP} first, to
|
||||
establish that a value is non-immediate.
|
||||
|
||||
@menu
|
||||
* Pair Data::
|
||||
* Vector Data::
|
||||
* Procedures::
|
||||
* Closures::
|
||||
* Subrs::
|
||||
* Port Data::
|
||||
@end menu
|
||||
|
||||
|
||||
@node Pair Data
|
||||
@subsubsection Pairs
|
||||
|
||||
Pairs are the essential building block of list structure in Scheme. A
|
||||
pair object has two fields, called the @dfn{car} and the @dfn{cdr}.
|
||||
|
||||
It is conventional for a pair's @sc{car} to contain an element of a
|
||||
list, and the @sc{cdr} to point to the next pair in the list, or to
|
||||
contain @code{SCM_EOL}, indicating the end of the list. Thus, a set of
|
||||
pairs chained through their @sc{cdr}s constitutes a singly-linked list.
|
||||
Scheme and libguile define many functions which operate on lists
|
||||
constructed in this fashion, so although lists chained through the
|
||||
@sc{car}s of pairs will work fine too, they may be less convenient to
|
||||
manipulate, and receive less support from the community.
|
||||
|
||||
Guile implements pairs by mapping the @sc{car} and @sc{cdr} of a pair
|
||||
directly into the two words of the cell.
|
||||
|
||||
|
||||
@deftypefn Macro int SCM_CONSP (SCM @var{x})
|
||||
Return non-zero iff @var{x} is a Scheme pair object.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro int SCM_NCONSP (SCM @var{x})
|
||||
The complement of SCM_CONSP.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefun SCM scm_cons (SCM @var{car}, SCM @var{cdr})
|
||||
Allocate (``CONStruct'') a new pair, with @var{car} and @var{cdr} as its
|
||||
contents.
|
||||
@end deftypefun
|
||||
|
||||
The macros below perform no type checking. The results are undefined if
|
||||
@var{cell} is an immediate. However, since all non-immediate Guile
|
||||
objects are constructed from cells, and these macros simply return the
|
||||
first element of a cell, they actually can be useful on datatypes other
|
||||
than pairs. (Of course, it is not very modular to use them outside of
|
||||
the code which implements that datatype.)
|
||||
|
||||
@deftypefn Macro SCM SCM_CAR (SCM @var{cell})
|
||||
Return the @sc{car}, or first field, of @var{cell}.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro SCM SCM_CDR (SCM @var{cell})
|
||||
Return the @sc{cdr}, or second field, of @var{cell}.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro void SCM_SETCAR (SCM @var{cell}, SCM @var{x})
|
||||
Set the @sc{car} of @var{cell} to @var{x}.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro void SCM_SETCDR (SCM @var{cell}, SCM @var{x})
|
||||
Set the @sc{cdr} of @var{cell} to @var{x}.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro SCM SCM_CAAR (SCM @var{cell})
|
||||
@deftypefnx Macro SCM SCM_CADR (SCM @var{cell})
|
||||
@deftypefnx Macro SCM SCM_CDAR (SCM @var{cell}) @dots{}
|
||||
@deftypefnx Macro SCM SCM_CDDDDR (SCM @var{cell})
|
||||
Return the @sc{car} of the @sc{car} of @var{cell}, the @sc{car} of the
|
||||
@sc{cdr} of @var{cell}, @i{et cetera}.
|
||||
@end deftypefn
|
||||
|
||||
|
||||
@node Vector Data
|
||||
@subsubsection Vectors, Strings, and Symbols
|
||||
|
||||
Vectors, strings, and symbols have some properties in common. They all
|
||||
have a length, and they all have an array of elements. In the case of a
|
||||
vector, the elements are @code{SCM} values; in the case of a string or
|
||||
symbol, the elements are characters.
|
||||
|
||||
All these types store their length (along with some tagging bits) in the
|
||||
@sc{car} of their header cell, and store a pointer to the elements in
|
||||
their @sc{cdr}. Thus, the @code{SCM_CAR} and @code{SCM_CDR} macros
|
||||
are (somewhat) meaningful when applied to these datatypes.
|
||||
|
||||
@deftypefn Macro int SCM_VECTORP (SCM @var{x})
|
||||
Return non-zero iff @var{x} is a vector.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro int SCM_STRINGP (SCM @var{x})
|
||||
Return non-zero iff @var{x} is a string.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro int SCM_SYMBOLP (SCM @var{x})
|
||||
Return non-zero iff @var{x} is a symbol.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro int SCM_VECTOR_LENGTH (SCM @var{x})
|
||||
@deftypefnx Macro int SCM_STRING_LENGTH (SCM @var{x})
|
||||
@deftypefnx Macro int SCM_SYMBOL_LENGTH (SCM @var{x})
|
||||
Return the length of the object @var{x}. The result is undefined if
|
||||
@var{x} is not a vector, string, or symbol, respectively.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro {SCM *} SCM_VECTOR_BASE (SCM @var{x})
|
||||
Return a pointer to the array of elements of the vector @var{x}.
|
||||
The result is undefined if @var{x} is not a vector.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro {char *} SCM_STRING_CHARS (SCM @var{x})
|
||||
@deftypefnx Macro {char *} SCM_SYMBOL_CHARS (SCM @var{x})
|
||||
Return a pointer to the characters of @var{x}. The result is undefined
|
||||
if @var{x} is not a symbol or string, respectively.
|
||||
@end deftypefn
|
||||
|
||||
There are also a few magic values stuffed into memory before a symbol's
|
||||
characters, but you don't want to know about those. What cruft!
|
||||
|
||||
Note that @code{SCM_VECTOR_BASE}, @code{SCM_STRING_CHARS} and
|
||||
@code{SCM_SYMBOL_CHARS} return pointers to data within the respective
|
||||
object. Care must be taken that the object is not garbage collected
|
||||
while that data is still being accessed. This is the same as for a
|
||||
smob; see @ref{Remembering During Operations}.
|
||||
|
||||
|
||||
@node Procedures
|
||||
@subsubsection Procedures
|
||||
|
||||
Guile provides two kinds of procedures: @dfn{closures}, which are the
|
||||
result of evaluating a @code{lambda} expression, and @dfn{subrs}, which
|
||||
are C functions packaged up as Scheme objects, to make them available to
|
||||
Scheme programmers.
|
||||
|
||||
(There are actually other sorts of procedures: compiled closures, and
|
||||
continuations; see the source code for details about them.)
|
||||
|
||||
@deftypefun SCM scm_procedure_p (SCM @var{x})
|
||||
Return @code{SCM_BOOL_T} iff @var{x} is a Scheme procedure object, of
|
||||
any sort. Otherwise, return @code{SCM_BOOL_F}.
|
||||
@end deftypefun
|
||||
|
||||
|
||||
@node Closures
|
||||
@subsubsection Closures
|
||||
|
||||
[FIXME: this needs to be further subbed, but texinfo has no subsubsub]
|
||||
|
||||
A closure is a procedure object, generated as the value of a
|
||||
@code{lambda} expression in Scheme. The representation of a closure is
|
||||
straightforward --- it contains a pointer to the code of the lambda
|
||||
expression from which it was created, and a pointer to the environment
|
||||
it closes over.
|
||||
|
||||
In Guile, each closure also has a property list, allowing the system to
|
||||
store information about the closure. I'm not sure what this is used for
|
||||
at the moment --- the debugger, maybe?
|
||||
|
||||
@deftypefn Macro int SCM_CLOSUREP (SCM @var{x})
|
||||
Return non-zero iff @var{x} is a closure.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro SCM SCM_PROCPROPS (SCM @var{x})
|
||||
Return the property list of the closure @var{x}. The results are
|
||||
undefined if @var{x} is not a closure.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro void SCM_SETPROCPROPS (SCM @var{x}, SCM @var{p})
|
||||
Set the property list of the closure @var{x} to @var{p}. The results
|
||||
are undefined if @var{x} is not a closure.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro SCM SCM_CODE (SCM @var{x})
|
||||
Return the code of the closure @var{x}. The result is undefined if
|
||||
@var{x} is not a closure.
|
||||
|
||||
This function should probably only be used internally by the
|
||||
interpreter, since the representation of the code is intimately
|
||||
connected with the interpreter's implementation.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro SCM SCM_ENV (SCM @var{x})
|
||||
Return the environment enclosed by @var{x}.
|
||||
The result is undefined if @var{x} is not a closure.
|
||||
|
||||
This function should probably only be used internally by the
|
||||
interpreter, since the representation of the environment is intimately
|
||||
connected with the interpreter's implementation.
|
||||
@end deftypefn
|
||||
|
||||
|
||||
@node Subrs
|
||||
@subsubsection Subrs
|
||||
|
||||
[FIXME: this needs to be further subbed, but texinfo has no subsubsub]
|
||||
|
||||
A subr is a pointer to a C function, packaged up as a Scheme object to
|
||||
make it callable by Scheme code. In addition to the function pointer,
|
||||
the subr also contains a pointer to the name of the function, and
|
||||
information about the number of arguments accepted by the C function, for
|
||||
the sake of error checking.
|
||||
|
||||
There is no single type predicate macro that recognizes subrs, as
|
||||
distinct from other kinds of procedures. The closest thing is
|
||||
@code{scm_procedure_p}; see @ref{Procedures}.
|
||||
|
||||
@deftypefn Macro {char *} SCM_SNAME (SCM @var{x})
|
||||
Return the name of the subr @var{x}. The result is undefined if
|
||||
@var{x} is not a subr.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefun SCM scm_c_define_gsubr (char *@var{name}, int @var{req}, int @var{opt}, int @var{rest}, SCM (*@var{function})())
|
||||
Create a new subr object named @var{name}, based on the C function
|
||||
@var{function}, make it visible to Scheme as the value of a global
|
||||
variable named @var{name}, and return the subr object.
|
||||
|
||||
The subr object accepts @var{req} required arguments, @var{opt} optional
|
||||
arguments, and a @var{rest} argument iff @var{rest} is non-zero. The C
|
||||
function @var{function} should accept @code{@var{req} + @var{opt}}
|
||||
arguments, or @code{@var{req} + @var{opt} + 1} arguments if @var{rest}
|
||||
is non-zero.
|
||||
|
||||
When a subr object is applied, it must be applied to at least @var{req}
|
||||
arguments, or else Guile signals an error. @var{function} receives the
|
||||
subr's first @var{req} arguments as its first @var{req} arguments. If
|
||||
there are fewer than @var{opt} arguments remaining, then @var{function}
|
||||
receives the value @code{SCM_UNDEFINED} for any missing optional
|
||||
arguments.
|
||||
|
||||
If @var{rest} is non-zero, then any arguments after the first
|
||||
@code{@var{req} + @var{opt}} are packaged up as a list and passed as
|
||||
@var{function}'s last argument. @var{function} must not modify that
|
||||
list.  (When a subr is called through @code{apply}, the list comes
|
||||
directly from the @code{apply} argument, which the caller will expect
|
||||
to be unchanged.)
|
||||
|
||||
Note that subrs can actually only accept a predefined set of
|
||||
combinations of required, optional, and rest arguments. For example, a
|
||||
subr can take one required argument, or one required and one optional
|
||||
argument, but a subr can't take one required and two optional arguments.
|
||||
It's bizarre, but that's the way the interpreter was written. If the
|
||||
arguments to @code{scm_c_define_gsubr} do not fit one of the predefined
|
||||
patterns, then @code{scm_c_define_gsubr} will return a compiled closure
|
||||
object instead of a subr object.
|
||||
@end deftypefun
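
The way required, optional, and rest arguments are handed to
@var{function} can be sketched with a small standalone model.  This is
not Guile code: @code{toy_bind_args} and @code{TOY_UNDEFINED} are
hypothetical names, arguments are plain non-negative @code{int}s rather
than @code{SCM} values, and treating surplus arguments without a rest
parameter as an error is an assumption of the sketch.

```c
#include <assert.h>

/* Hypothetical stand-in for SCM_UNDEFINED.  The sketch assumes real
   arguments are non-negative, so -1 can never be a real argument.  */
#define TOY_UNDEFINED (-1)

/* Distribute NARGS actual arguments over REQ required and OPT optional
   slots in OUT, which must have room for REQ + OPT ints.  Missing
   optional slots are filled with TOY_UNDEFINED.

   Returns the number of leftover arguments that would be collected
   into the rest list, or -1 on an arity error: too few arguments, or
   (an assumption of this sketch) surplus arguments when REST is 0.  */
int
toy_bind_args (const int *args, int nargs,
               int req, int opt, int rest, int *out)
{
  int i;
  assert (nargs >= 0 && req >= 0 && opt >= 0);
  if (nargs < req)
    return -1;
  if (!rest && nargs > req + opt)
    return -1;
  for (i = 0; i < req + opt; i++)
    out[i] = (i < nargs) ? args[i] : TOY_UNDEFINED;
  return (nargs > req + opt) ? nargs - (req + opt) : 0;
}
```

With one required and two optional slots, a call with a single argument
fills both optional slots with @code{TOY_UNDEFINED}; the C function is
then expected to test for that marker before using the value.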
|
||||
|
||||
|
||||
@node Port Data
|
||||
@subsubsection Ports
|
||||
|
||||
Haven't written this yet, 'cos I don't understand ports yet.
|
||||
|
||||
|
||||
@node Signalling Type Errors
|
||||
@subsection Signalling Type Errors
|
||||
|
||||
Every function visible at the Scheme level should aggressively check the
|
||||
types of its arguments, to avoid misinterpreting a value, and perhaps
|
||||
causing a segmentation fault. Guile provides some macros to make this
|
||||
easier.
|
||||
|
||||
@deftypefn Macro void SCM_ASSERT (int @var{test}, SCM @var{obj}, unsigned int @var{position}, const char *@var{subr})
|
||||
If @var{test} is zero, signal a ``wrong type argument'' error,
|
||||
attributed to the subroutine named @var{subr}, operating on the value
|
||||
@var{obj}, which is the @var{position}'th argument of @var{subr}.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro int SCM_ARG1
|
||||
@deftypefnx Macro int SCM_ARG2
|
||||
@deftypefnx Macro int SCM_ARG3
|
||||
@deftypefnx Macro int SCM_ARG4
|
||||
@deftypefnx Macro int SCM_ARG5
|
||||
@deftypefnx Macro int SCM_ARG6
|
||||
@deftypefnx Macro int SCM_ARG7
|
||||
One of the above values can be used for @var{position} to indicate the
|
||||
number of the argument of @var{subr} which is being checked.
|
||||
Alternatively, a positive integer can be used, which makes it possible to
check arguments after the seventh.  However, for parameter numbers up to
|
||||
seven it is preferable to use @code{SCM_ARGN} instead of the
|
||||
corresponding raw number, since it will make the code easier to
|
||||
understand.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro int SCM_ARGn
|
||||
Passing a value of zero or @code{SCM_ARGn} for @var{position} leaves
unspecified which argument's type is incorrect.  Again,
|
||||
@code{SCM_ARGn} should be preferred over a raw zero constant.
|
||||
@end deftypefn
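
The checking pattern described above can be sketched without Guile as
follows.  Everything here is a hypothetical stand-in: @code{toy_assert},
@code{toy_error}, @code{toy_quotient} and the @code{TOY_ARG} constants
are invented for the example, and where the real @code{SCM_ASSERT}
signals a Scheme error, this model merely records a message.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical position constants, mirroring the idea that zero means
   "position unspecified" and positive numbers name an argument.  */
#define TOY_ARGn 0u
#define TOY_ARG1 1u
#define TOY_ARG2 2u

/* Records the first failed check instead of signalling an error.  */
struct toy_error
{
  int raised;
  char message[96];
};

/* If TEST is zero, record a "wrong type argument" error attributed to
   SUBR, about value OBJ in argument POSITION.  Later failures are
   ignored once an error has been recorded.  */
void
toy_assert (int test, long obj, unsigned int position,
            const char *subr, struct toy_error *err)
{
  assert (subr != NULL && err != NULL);
  if (test || err->raised)
    return;
  err->raised = 1;
  if (position == TOY_ARGn)
    snprintf (err->message, sizeof err->message,
              "%s: wrong type argument: %ld", subr, obj);
  else
    snprintf (err->message, sizeof err->message,
              "%s: wrong type argument in position %u: %ld",
              subr, position, obj);
}

/* A toy "subr" that checks both arguments before using them.  */
long
toy_quotient (long n, long d, struct toy_error *err)
{
  toy_assert (n >= 0, n, TOY_ARG1, "toy-quotient", err);
  toy_assert (d != 0, d, TOY_ARG2, "toy-quotient", err);
  return err->raised ? 0 : n / d;
}
```

The point of the pattern is that every argument is validated, with its
position named, before the function body touches it.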
|
||||
|
||||
|
||||
@node Unpacking the SCM type
|
||||
@subsection Unpacking the SCM Type
|
||||
|
||||
The previous sections have explained how @code{SCM} values can refer to
|
||||
immediate and non-immediate Scheme objects. For immediate objects, the
|
||||
complete object value is stored in the @code{SCM} word itself, while for
|
||||
non-immediates, the @code{SCM} word contains a pointer to a heap cell,
|
||||
and further information about the object in question is stored in that
|
||||
cell. This section describes how the @code{SCM} type is actually
|
||||
represented and used at the C level.
|
||||
|
||||
In fact, there are two basic C data types to represent objects in
|
||||
Guile: @code{SCM} and @code{scm_t_bits}.
|
||||
|
||||
|
@ -931,7 +374,6 @@ Guile: @code{SCM} and @code{scm_t_bits}.
|
|||
* Allocating Cells::
|
||||
* Heap Cell Type Information::
|
||||
* Accessing Cell Entries::
|
||||
* Basic Rules for Accessing Cell Entries::
|
||||
@end menu
|
||||
|
||||
|
||||
|
@ -986,6 +428,48 @@ If so, all of the type and value information can be determined from the
|
|||
(@var{x})}.
|
||||
@end itemize
|
||||
|
||||
There are a number of special values in Scheme, most of them documented
|
||||
elsewhere in this manual. It's not quite the right place to put them,
|
||||
but for now, here's a list of the C names given to some of these values:
|
||||
|
||||
@deftypefn Macro SCM SCM_EOL
|
||||
The Scheme empty list object, or ``End Of List'' object, usually written
|
||||
in Scheme as @code{'()}.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro SCM SCM_EOF_VAL
|
||||
The Scheme end-of-file value. It has no standard written
|
||||
representation, for obvious reasons.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro SCM SCM_UNSPECIFIED
|
||||
The value returned by expressions which the Scheme standard says return
|
||||
an ``unspecified'' value.
|
||||
|
||||
This is sort of a weirdly literal way to take things, but the standard
|
||||
read-eval-print loop prints nothing when the expression returns this
|
||||
value, so it's not a bad idea to return this when you can't think of
|
||||
anything else helpful.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro SCM SCM_UNDEFINED
|
||||
The ``undefined'' value.  Its most important property is that it is not
|
||||
equal to any valid Scheme value. This is put to various internal uses
|
||||
by C code interacting with Guile.
|
||||
|
||||
For example, when you write a C function that is callable from Scheme
|
||||
and which takes optional arguments, the interpreter passes
|
||||
@code{SCM_UNDEFINED} for any arguments you did not receive.
|
||||
|
||||
We also use this to mark unbound variables.
|
||||
@end deftypefn
|
||||
|
||||
@deftypefn Macro int SCM_UNBNDP (SCM @var{x})
|
||||
Return true if @var{x} is @code{SCM_UNDEFINED}. Note that this is not a
|
||||
check to see if @var{x} is @code{SCM_UNBOUND}. History will not be kind
|
||||
to us.
|
||||
@end deftypefn
|
||||
|
||||
|
||||
@node Non-immediate objects
|
||||
@subsubsection Non-immediate objects
|
||||
|
@ -1187,31 +671,6 @@ entries.
|
|||
@end itemize
|
||||
|
||||
|
||||
@node Basic Rules for Accessing Cell Entries
|
||||
@subsubsection Basic Rules for Accessing Cell Entries
|
||||
|
||||
For each cell type it is generally up to the implementation of that type
|
||||
which of the corresponding cell entries hold Scheme objects and which
|
||||
hold raw C values. However, there is one basic rule that has to be
|
||||
followed: Scheme pairs consist of exactly two cell entries, which both
|
||||
contain Scheme objects. Further, a cell which contains a Scheme object
|
||||
in its first entry has to be a Scheme pair.  In other words, it is not
|
||||
allowed to store a Scheme object in the first cell entry and a
non-Scheme object in the second cell entry.
|
||||
|
||||
@c Fixme:shouldn't this rather be SCM_PAIRP / SCM_PAIR_P ?
|
||||
@deftypefn Macro int SCM_CONSP (SCM @var{x})
|
||||
Determine whether the Scheme object @var{x} is a Scheme pair,
|
||||
i.e. whether @var{x} references a heap cell consisting of exactly two
|
||||
entries, where both entries contain a Scheme object. In this case, both
|
||||
entries will have to be accessed using the @code{SCM_CELL_OBJECT}
|
||||
macros. On the contrary, if the @code{SCM_CONSP} predicate is not
|
||||
fulfilled, the first entry of the Scheme cell is guaranteed not to be a
|
||||
Scheme value and thus the first cell entry must be accessed using the
|
||||
@code{SCM_CELL_WORD_0} macro.
|
||||
@end deftypefn
|
||||
|
||||
|
||||
@c Local Variables:
|
||||
@c TeX-master: "guile.texi"
|
||||
@c End:
|
||||
|
|
|
@ -13,7 +13,7 @@
|
|||
@copying
|
||||
This manual documents Guile version @value{VERSION}.
|
||||
|
||||
Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2005, 2009 Free
|
||||
Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2005, 2009, 2010 Free
|
||||
Software Foundation.
|
||||
|
||||
Permission is granted to copy, distribute and/or modify this document
|
||||
|
@ -254,13 +254,10 @@ musings and guidelines about programming with Guile. It explores
|
|||
different ways to design a program around Guile, or how to embed Guile
|
||||
into existing programs.
|
||||
|
||||
There is also a pedagogical yet detailed explanation of how the data
|
||||
representation of Guile is implemented, see @ref{Data Representation in
|
||||
Scheme} and @ref{The Libguile Runtime Environment}.
|
||||
|
||||
You don't need to know the details given there to use Guile from C,
|
||||
but they are useful when you want to modify Guile itself or when you
|
||||
are just curious about how it is all done.
|
||||
For a pedagogical yet detailed explanation of how the data representation of
|
||||
Guile is implemented, see @ref{Data Representation}.  You don't need to know the
|
||||
details given there to use Guile from C, but they are useful when you want to
|
||||
modify Guile itself or when you are just curious about how it is all done.
|
||||
|
||||
For detailed reference information on the variables, functions
|
||||
etc. that make up Guile's application programming interface (API),
|
||||
|
@ -400,10 +397,7 @@ merely familiar with Scheme to being a real hacker.
|
|||
|
||||
@menu
|
||||
* History:: A brief history of Guile.
|
||||
* Data Representation in Scheme:: Why things aren't just totally
|
||||
straightforward, in general terms.
|
||||
* The Libguile Runtime Environment:: Low-level details on Guile's C
|
||||
runtime library.
|
||||
* Data Representation:: How Guile represents Scheme data.
|
||||
* A Virtual Machine for Guile:: How compiled procedures work.
|
||||
* Compiling to the Virtual Machine:: Not as hard as you might think.
|
||||
@end menu
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
@c -*-texinfo-*-
|
||||
@c This is part of the GNU Guile Reference Manual.
|
||||
@c Copyright (C) 2008
|
||||
@c Copyright (C) 2008, 2010
|
||||
@c Free Software Foundation, Inc.
|
||||
@c See the file guile.texi for copying conditions.
|
||||
|
||||
|
@ -134,7 +134,8 @@ Since then, Guile has had a group maintainership. The first group was
|
|||
Maciej Stachowiak, Mikael Djurfeldt, and Marius Vollmer, with Vollmer
|
||||
staying on the longest. By late 2007, Vollmer had mostly moved on to
|
||||
other things, so Neil Jerram and Ludovic Courtès stepped up to take on
|
||||
the primary maintenance responsibility.
|
||||
the primary maintenance responsibility. Jerram and Courtès were joined
|
||||
by Andy Wingo in late 2009.
|
||||
|
||||
Of course, a large part of the actual work on Guile has come from
|
||||
other contributors too numerous to mention, but without whom the world
|
||||
|
@ -167,18 +168,17 @@ less the same form.
|
|||
@itemx 1.2 --- 24 June 1997
|
||||
Support for Tcl/Tk and ctax were split off as separate packages, and
|
||||
have remained there since. Guile became more compatible with SCSH, and
|
||||
more useful as a UNIX scripting language. Libguile can now be built as
|
||||
more useful as a UNIX scripting language. Libguile could now be built as
|
||||
a shared library, and third-party extensions written in C became
|
||||
loadable via dynamic linking.
|
||||
|
||||
@item 1.3.0 --- 19 October 1998
|
||||
Command-line editing became much more pleasant through the use of the
|
||||
readline library. The initial support for internationalization via
|
||||
multi-byte strings was removed, and has yet to be added back, though
|
||||
UTF-8 hacks are common. Modules gained the ability to have custom
|
||||
expanders, which is still used for syntax-case macros. Initial Emacs
|
||||
Lisp support landed, ports gained better support for file descriptors,
|
||||
and fluids were added.
|
||||
multi-byte strings was removed; 10 years were to pass before proper
|
||||
internationalization would land again. Initial Emacs Lisp support
|
||||
landed, ports gained better support for file descriptors, and fluids
|
||||
were added.
|
||||
|
||||
@item 1.3.2 --- 20 August 1999
|
||||
@itemx 1.3.4 --- 25 September 1999
|
||||
|
@ -186,8 +186,8 @@ and fluids were added.
|
|||
A long list of lispy features were added: hooks, Common Lisp's
|
||||
@code{format}, optional and keyword procedure arguments,
|
||||
@code{getopt-long}, sorting, random numbers, and many other fixes and
|
||||
enhancements. Guile now has an interactive debugger, interactive help,
|
||||
and gives better backtraces.
|
||||
enhancements. Guile also gained an interactive debugger, interactive
|
||||
help, and better backtraces.
|
||||
|
||||
@item 1.6 --- 6 September 2002
|
||||
Guile gained support for the R5RS standard, and added a number of SRFI
|
||||
|
@ -202,12 +202,15 @@ user-space threading was removed in favor of POSIX pre-emptive
|
|||
threads, providing true multiprocessing. Gettext support was added,
|
||||
and Guile's C API was cleaned up and orthogonalized in a massive way.
|
||||
|
||||
@item 2.0 --- thus far, only unstable snapshots available
|
||||
A virtual machine was added to Guile, along with the associated
|
||||
compiler and toolchain. Support for internationalization was added.
|
||||
Running Guile instances became controllable and debuggable from within
|
||||
Emacs, via GDS, which was also backported to 1.8.5. An SRFI-18
|
||||
interface to multithreading was added, including thread cancellation.
|
||||
@item 2.0 --- March 2010
|
||||
A virtual machine was added to Guile, along with the associated compiler
|
||||
and toolchain. Support for internationalization was finally
|
||||
reimplemented, in terms of Unicode, locales, and libunistring.  Running
|
||||
Guile instances became controllable and debuggable from within Emacs,
|
||||
via GDS and Geiser. Guile caught up to features found in a number of
|
||||
other Schemes: SRFI-18 threads, including thread cancellation,
|
||||
module-hygienic macros, a profiler, tracer, and debugger, SSAX XML
|
||||
integration, bytevectors, module versions, and partial support for R6RS.
|
||||
@end table
|
||||
|
||||
@node Status
|
||||
|
@ -267,12 +270,12 @@ language with a syntax that is closer to C, or to Python. Another
|
|||
interesting idea to consider is compiling e.g. Python to Guile. It's
|
||||
not that far-fetched of an idea: see for example IronPython or JRuby.
|
||||
|
||||
And then there's Emacs itself. Though there is a somewhat-working
|
||||
Emacs Lisp translator for Guile, it cannot yet execute all of Emacs
|
||||
Lisp. A serious integration of Guile with Emacs would replace the
|
||||
Elisp virtual machine with Guile, and provide the necessary C shims so
|
||||
that Guile could emulate Emacs' C API. This would give lots of
|
||||
exciting things to Emacs: native threads, a real object system, more
|
||||
And then there's Emacs itself. Though there is a somewhat-working Emacs
|
||||
Lisp language frontend for Guile, it cannot yet execute all of Emacs
|
||||
Lisp. A serious integration of Guile with Emacs would replace the Elisp
|
||||
virtual machine with Guile, and provide the necessary C shims so that
|
||||
Guile could emulate Emacs' C API. This would give lots of exciting
|
||||
things to Emacs: native threads, a real object system, more
|
||||
sophisticated types, cleaner syntax, and access to all of the Guile
|
||||
extensions.
|
||||
|
||||
|
|
|
@ -153,8 +153,8 @@ that have been added to Guile by third-party libraries.
|
|||
|
||||
Also, computing with @code{SCM} is not necessarily inefficient. Small
|
||||
integers will be encoded directly in the @code{SCM} value, for example,
|
||||
and do not need any additional memory on the heap. See @ref{The
|
||||
Libguile Runtime Environment} to find out the details.
|
||||
and do not need any additional memory on the heap. See @ref{Data
|
||||
Representation} to find out the details.
|
||||
|
||||
Some special @code{SCM} values are available to C code without needing
|
||||
to convert them from C values:
|
||||
|
@ -170,9 +170,8 @@ In addition to @code{SCM}, Guile also defines the related type
|
|||
@code{scm_t_bits}. This is an unsigned integral type of sufficient
|
||||
size to hold all information that is directly contained in a
|
||||
@code{SCM} value. The @code{scm_t_bits} type is used internally by
|
||||
Guile to do all the bit twiddling explained in @ref{The Libguile
|
||||
Runtime Environment}, but you will encounter it occasionally in low-level
|
||||
user code as well.
|
||||
Guile to do all the bit twiddling explained in @ref{Data Representation}, but
|
||||
you will encounter it occasionally in low-level user code as well.
|
||||
|
||||
|
||||
@node Garbage Collection
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
@c -*-texinfo-*-
|
||||
@c This is part of the GNU Guile Reference Manual.
|
||||
@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2005
|
||||
@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2005, 2010
|
||||
@c Free Software Foundation, Inc.
|
||||
@c See the file guile.texi for copying conditions.
|
||||
|
||||
|
@ -69,8 +69,7 @@ function is allowed to do.
|
|||
Guile will apply this function to each instance of the new type to print
|
||||
the value, as for @code{display} or @code{write}. The default print
|
||||
function prints @code{#<NAME ADDRESS>} where @code{NAME} is the first
|
||||
argument passed to @code{scm_make_smob_type}. For more information on
|
||||
printing, see @ref{Port Data}.
|
||||
argument passed to @code{scm_make_smob_type}.
|
||||
|
||||
@item equalp
|
||||
If Scheme code asks the @code{equal?} function to compare two instances
|
||||
|
@ -521,7 +520,7 @@ Smobs are called smob because they are small: they normally have only
|
|||
room for one @code{void*} or @code{SCM} value plus 16 bits. The
|
||||
reason for this is that smobs are directly implemented by using the
|
||||
low-level, two-word cells of Guile that are also used to implement
|
||||
pairs, for example. (@pxref{The Libguile Runtime Environment} for the
|
||||
pairs, for example. (@pxref{Data Representation} for the
|
||||
details.) One word of the two-word cells is used for
|
||||
@code{SCM_SMOB_DATA} (or @code{SCM_SMOB_OBJECT}), the other contains
|
||||
the 16-bit type tag and the 16 extra bits.
|
||||
|
|