mirror of https://git.savannah.gnu.org/git/guile.git synced 2025-05-20 11:40:18 +02:00

Update data representation chapter in the manual

* doc/ref/data-rep.texi (The SCM Type in Guile): Refer to scm.h.
(Relationship Between SCM and scm_t_bits): Better title-case.
(Immediate Objects): Better title-case.  Prefer "heap object" over
"cell".
(Non-Immediate Objects): Better title-case.  Deprecate the concept of
cells.
(Allocating Heap Objects): Rename from Allocating Cells.
(Heap Object Type Information): Rename from Heap Cell Type Information.
(Accessing Heap Object Fields): Rename from Accessing Cell Entries.
* doc/ref/vm.texi: Update references.
Andy Wingo 2018-09-27 13:43:48 +02:00
parent a691540703
commit 6be54f4526
2 changed files with 132 additions and 149 deletions


@ -365,24 +365,24 @@ in question is stored in that data.
This section describes how the @code{SCM} type is actually represented This section describes how the @code{SCM} type is actually represented
and used at the C level. Interested readers should see and used at the C level. Interested readers should see
@code{libguile/tags.h} for an exposition of how Guile stores type @code{libguile/scm.h} for an exposition of how Guile stores type
information. information.
In fact, there are two basic C data types to represent objects in In fact, there are two basic C data types to represent objects in
Guile: @code{SCM} and @code{scm_t_bits}. Guile: @code{SCM} and @code{scm_t_bits}.
@menu @menu
* Relationship between SCM and scm_t_bits:: * Relationship Between SCM and scm_t_bits::
* Immediate objects:: * Immediate Objects::
* Non-immediate objects:: * Non-Immediate Objects::
* Allocating Cells:: * Allocating Heap Objects::
* Heap Cell Type Information:: * Heap Object Type Information::
* Accessing Cell Entries:: * Accessing Heap Object Fields::
@end menu @end menu
@node Relationship between SCM and scm_t_bits @node Relationship Between SCM and scm_t_bits
@subsubsection Relationship between @code{SCM} and @code{scm_t_bits} @subsubsection Relationship Between @code{SCM} and @code{scm_t_bits}
A variable of type @code{SCM} is guaranteed to hold a valid Scheme A variable of type @code{SCM} is guaranteed to hold a valid Scheme
object. A variable of type @code{scm_t_bits}, on the other hand, may object. A variable of type @code{scm_t_bits}, on the other hand, may
@ -402,19 +402,20 @@ chapter (@pxref{Cheaper Pairs}). Conversely, a valid bit encoding of a
Scheme value as a @code{scm_t_bits} variable can be transformed into the Scheme value as a @code{scm_t_bits} variable can be transformed into the
corresponding @code{SCM} value using the @code{SCM_PACK} macro. corresponding @code{SCM} value using the @code{SCM_PACK} macro.
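The relationship between the two types can be illustrated with a toy model. This is only a sketch: the names `toy_scm`, `toy_bits`, `toy_pack`, `toy_unpack`, and `toy_eq` are hypothetical stand-ins and are not Guile's definitions. Wrapping the raw word in a struct is one way to keep C code from doing arithmetic or comparisons on an opaque value by accident.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT Guile's definitions.  */
typedef uintptr_t toy_bits;                /* raw word, like scm_t_bits */
typedef struct { toy_bits raw; } toy_scm;  /* opaque value, like SCM */

/* Analogue of SCM_UNPACK: expose the raw bits of a value.  */
static inline toy_bits toy_unpack (toy_scm x) { return x.raw; }

/* Analogue of SCM_PACK: wrap raw bits that encode a valid value.  */
static inline toy_scm toy_pack (toy_bits bits)
{
  toy_scm x = { bits };
  return x;
}

/* C does not allow == on struct types, so identity comparison has to
   go through the raw bits.  */
static inline int toy_eq (toy_scm a, toy_scm b)
{
  return toy_unpack (a) == toy_unpack (b);
}

/* Packing and unpacking are inverses for any bit pattern.  */
static void toy_roundtrip_demo (void)
{
  toy_bits bits = 0x2a;
  assert (toy_unpack (toy_pack (bits)) == bits);
}
```

In this model, writing `a == b` on two `toy_scm` values is a compile error, which is the kind of mistake a distinct opaque type is meant to catch.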
@node Immediate objects @node Immediate Objects
@subsubsection Immediate objects @subsubsection Immediate Objects
A Scheme object may either be an immediate, i.e.@: carrying all necessary A Scheme object may either be an immediate, i.e.@: carrying all
information by itself, or it may contain a reference to a @dfn{cell} necessary information by itself, or it may contain a reference to a
with additional information on the heap. Although in general it should @dfn{heap object} which is, as the name implies, data on the heap.
be irrelevant for user code whether an object is an immediate or not, Although in general it should be irrelevant for user code whether an
within Guile's own code the distinction is sometimes of importance. object is an immediate or not, within Guile's own code the distinction
Thus, the following low level macro is provided: is sometimes of importance. Thus, the following low level macro is
provided:
@deftypefn Macro int SCM_IMP (SCM @var{x}) @deftypefn Macro int SCM_IMP (SCM @var{x})
A Scheme object is an immediate if it fulfills the @code{SCM_IMP} A Scheme object is an immediate if it fulfills the @code{SCM_IMP}
predicate, otherwise it holds an encoded reference to a heap cell. The predicate, otherwise it holds an encoded reference to a heap object. The
result of the predicate is delivered as a C style boolean value. User result of the predicate is delivered as a C style boolean value. User
code and code that extends Guile should normally not be required to use code and code that extends Guile should normally not be required to use
this macro. this macro.
@ -475,67 +476,88 @@ to us.
@end deftypefn @end deftypefn
@node Non-immediate objects @node Non-Immediate Objects
@subsubsection Non-immediate objects @subsubsection Non-Immediate Objects
A Scheme object of type @code{SCM} that does not fulfill the A Scheme object of type @code{SCM} that does not fulfill the
@code{SCM_IMP} predicate holds an encoded reference to a heap cell. @code{SCM_IMP} predicate holds an encoded reference to a heap object.
This reference can be decoded to a C pointer to a heap cell using the This reference can be decoded to a C pointer to a heap object using the
@code{SCM2PTR} macro. The encoding of a pointer to a heap cell into a @code{SCM_UNPACK_POINTER} macro. The encoding of a pointer to a heap
@code{SCM} value is done using the @code{PTR2SCM} macro. object into a @code{SCM} value is done using the @code{SCM_PACK_POINTER}
macro.
@c (FIXME:: this name should be changed) @cindex cells, deprecated concept
@deftypefn Macro {scm_t_cell *} SCM2PTR (SCM @var{x}) Before Guile 2.0, Guile had a custom garbage collector that allocated
Extract and return the heap cell pointer from a non-immediate @code{SCM} heap objects in units of 2-word @dfn{cells}. With the move to the
object @var{x}. BDW-GC collector in Guile 2.0, Guile can allocate heap objects of any
size, and the concept of a cell is now obsolete. Still, we mention
it here as the name still appears in various low-level interfaces.
@deftypefn Macro {scm_t_bits *} SCM_UNPACK_POINTER (SCM @var{x})
@deftypefnx Macro {scm_t_cell *} SCM2PTR (SCM @var{x})
Extract and return the heap object pointer from a non-immediate
@code{SCM} object @var{x}. The name @code{SCM2PTR} is deprecated but
still common.
@end deftypefn @end deftypefn
@c (FIXME:: this name should be changed) @deftypefn Macro SCM SCM_PACK_POINTER (scm_t_bits * @var{x})
@deftypefn Macro SCM PTR2SCM (scm_t_cell * @var{x}) @deftypefnx Macro SCM PTR2SCM (scm_t_cell * @var{x})
Return a @code{SCM} value that encodes a reference to the heap cell Return a @code{SCM} value that encodes a reference to the heap object
pointer @var{x}. pointer @var{x}. The name @code{PTR2SCM} is deprecated but still
common.
@end deftypefn @end deftypefn
Note that it is also possible to transform a non-immediate @code{SCM} Note that it is also possible to transform a non-immediate @code{SCM}
value by using @code{SCM_UNPACK} into a @code{scm_t_bits} variable. value by using @code{SCM_UNPACK} into a @code{scm_t_bits} variable.
However, the result of @code{SCM_UNPACK} may not be used as a pointer to However, the result of @code{SCM_UNPACK} may not be used as a pointer to
a @code{scm_t_cell}: only @code{SCM2PTR} is guaranteed to transform a a heap object: only @code{SCM_UNPACK_POINTER} is guaranteed to transform
@code{SCM} object into a valid pointer to a heap cell. Also, it is not a @code{SCM} object into a valid pointer to a heap object. Also, it is
allowed to apply @code{PTR2SCM} to anything that is not a valid pointer not allowed to apply @code{SCM_PACK_POINTER} to anything that is not a
to a heap cell. valid pointer to a heap object.
@noindent @noindent
Summary: Summary:
@itemize @bullet @itemize @bullet
@item @item
Only use @code{SCM2PTR} on @code{SCM} values for which @code{SCM_IMP} is Only use @code{SCM_UNPACK_POINTER} on @code{SCM} values for which
false! @code{SCM_IMP} is false!
@item @item
Don't use @code{(scm_t_cell *) SCM_UNPACK (@var{x})}! Use @code{SCM2PTR Don't use @code{(scm_t_cell *) SCM_UNPACK (@var{x})}! Use
(@var{x})} instead! @code{SCM_UNPACK_POINTER (@var{x})} instead!
@item @item
Don't use @code{PTR2SCM} for anything but a cell pointer! Don't use @code{SCM_PACK_POINTER} for anything but a heap object pointer!
@end itemize @end itemize
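The three rules in the summary can be sketched with a toy encoding. Everything here is an assumption made for illustration: the `toy_` names are hypothetical, and the one-bit tag (low bit 1 for an immediate integer, low bit 0 for a heap pointer) is far simpler than the tagging scheme Guile actually uses.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t toy_bits;
typedef struct { toy_bits raw; } toy_scm;

/* Toy encoding (an assumption): low bit 1 marks an immediate integer,
   low bit 0 marks a pointer to a word-aligned heap object.  */
static inline int toy_imp (toy_scm x) { return (x.raw & 1) != 0; }

/* Build an immediate from a small non-negative integer.  */
static inline toy_scm toy_make_int (toy_bits n)
{
  toy_scm x = { (n << 1) | 1 };
  return x;
}

static inline toy_bits toy_int_value (toy_scm x)
{
  assert (toy_imp (x));
  return x.raw >> 1;
}

/* Analogue of SCM_PACK_POINTER: only valid for heap object pointers.  */
static inline toy_scm toy_pack_pointer (toy_bits *p)
{
  toy_scm x = { (toy_bits) p };
  assert (!toy_imp (x));  /* word-aligned storage has a zero low bit */
  return x;
}

/* Analogue of SCM_UNPACK_POINTER: only valid when toy_imp is false.  */
static inline toy_bits *toy_unpack_pointer (toy_scm x)
{
  assert (!toy_imp (x));
  return (toy_bits *) x.raw;
}
```

The assertions make the preconditions explicit: decoding a pointer from an immediate, or encoding something that is not an aligned heap pointer, trips a check instead of silently producing a bad value.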
@node Allocating Cells @node Allocating Heap Objects
@subsubsection Allocating Cells @subsubsection Allocating Heap Objects
Guile provides both ordinary cells with two slots, and double cells Heap objects are heap-allocated data pointed to by non-immediate
with four slots. The following two function are the most primitive @code{SCM} values. The first word of the heap object should contain a
way to allocate such cells. type code. The object may be any number of words in length, and is
generally scanned by the garbage collector for additional pointers unless the
object was allocated using a ``pointerless'' allocation function.
If the caller intends to use it as a header for some other type, she You should generally not need these functions, unless you are
must pass an appropriate magic value in @var{word_0}, to mark it as a implementing a new data type, and thoroughly understand the code in
member of that type, and pass whatever value as @var{word_1}, etc that @code{<libguile/scm.h>}.
the type expects. You should generally not need these functions,
unless you are implementing a new datatype, and thoroughly understand
the code in @code{<libguile/tags.h>}.
If you just want to allocate pairs, use @code{scm_cons}. If you just want to allocate pairs, use @code{scm_cons}.
@deftypefn Function SCM scm_words (scm_t_bits word_0, uint32_t n_words)
Allocate a new heap object containing @var{n_words} words, and initialize the
first slot to @var{word_0}, and return a non-immediate @code{SCM} value
encoding a pointer to the object. Typically @var{word_0} will contain
the type tag.
@end deftypefn
There are also deprecated but common variants of @code{scm_words} that
use the term ``cell'' to indicate 2-word objects.
@deftypefn Function SCM scm_cell (scm_t_bits word_0, scm_t_bits word_1) @deftypefn Function SCM scm_cell (scm_t_bits word_0, scm_t_bits word_1)
Allocate a new cell, initialize the two slots with @var{word_0} and Allocate a new 2-word heap object, initialize the two slots with
@var{word_1}, and return it. @var{word_0} and @var{word_1}, and return it. Just like calling
@code{scm_words (@var{word_0}, 2)}, then initializing the second slot to
@var{word_1}.
Note that @var{word_0} and @var{word_1} are of type @code{scm_t_bits}. Note that @var{word_0} and @var{word_1} are of type @code{scm_t_bits}.
If you want to pass a @code{SCM} object, you need to use If you want to pass a @code{SCM} object, you need to use
@ -543,123 +565,80 @@ If you want to pass a @code{SCM} object, you need to use
@end deftypefn @end deftypefn
@deftypefn Function SCM scm_double_cell (scm_t_bits word_0, scm_t_bits word_1, scm_t_bits word_2, scm_t_bits word_3) @deftypefn Function SCM scm_double_cell (scm_t_bits word_0, scm_t_bits word_1, scm_t_bits word_2, scm_t_bits word_3)
Like @code{scm_cell}, but allocates a double cell with four Like @code{scm_cell}, but allocates a 4-word heap object.
slots.
@end deftypefn @end deftypefn
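The description of `scm_cell` as `scm_words` followed by a store into the second slot can be sketched as follows. This is a toy model under stated assumptions: the `toy_` names are hypothetical, and plain `calloc` stands in for the garbage-collected allocator, so the objects here must be freed by hand.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uintptr_t toy_bits;
typedef struct { toy_bits raw; } toy_scm;

/* Analogue of scm_words: allocate n_words words and store word_0
   (typically the type tag) in the first one.  Zeroing the remaining
   words is a choice made by this toy, via calloc.  */
static toy_scm toy_words (toy_bits word_0, uint32_t n_words)
{
  assert (n_words >= 1);
  toy_bits *p = calloc (n_words, sizeof *p);
  if (p == NULL)
    abort ();
  p[0] = word_0;
  toy_scm x = { (toy_bits) p };
  return x;
}

/* Analogue of scm_cell: a 2-word object built on top of toy_words.  */
static toy_scm toy_cell (toy_bits word_0, toy_bits word_1)
{
  toy_scm x = toy_words (word_0, 2);
  ((toy_bits *) x.raw)[1] = word_1;
  return x;
}

/* Manual release; a real collector would make this unnecessary.  */
static void toy_free (toy_scm x) { free ((void *) x.raw); }
```

A 4-word variant in the style of `scm_double_cell` would follow the same pattern: call `toy_words (word_0, 4)` and then fill in slots 1 through 3.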
@node Heap Cell Type Information @node Heap Object Type Information
@subsubsection Heap Cell Type Information @subsubsection Heap Object Type Information
Heap cells contain a number of entries, each of which is either a scheme Heap objects contain a type tag and are followed by a number of
object of type @code{SCM} or a raw C value of type @code{scm_t_bits}. word-sized slots. The interpretation of the object contents depends on
Which of the cell entries contain Scheme objects and which contain raw C the type of the object.
values is determined by the first entry of the cell, which holds the
cell type information.
@deftypefn Macro scm_t_bits SCM_CELL_TYPE (SCM @var{x}) @deftypefn Macro scm_t_bits SCM_CELL_TYPE (SCM @var{x})
For a non-immediate Scheme object @var{x}, deliver the content of the Extract the first word of the heap object pointed to by @var{x}. This
first entry of the heap cell referenced by @var{x}. This value holds value holds the information about the cell type.
the information about the cell type.
@end deftypefn @end deftypefn
@deftypefn Macro void SCM_SET_CELL_TYPE (SCM @var{x}, scm_t_bits @var{t}) @deftypefn Macro void SCM_SET_CELL_TYPE (SCM @var{x}, scm_t_bits @var{t})
For a non-immediate Scheme object @var{x}, write the value @var{t} into For a non-immediate Scheme object @var{x}, write the value @var{t} into
the first entry of the heap cell referenced by @var{x}. The value the first word of the heap object referenced by @var{x}. The value
@var{t} must hold a valid cell type. @var{t} must hold a valid cell type.
@end deftypefn @end deftypefn
@node Accessing Cell Entries @node Accessing Heap Object Fields
@subsubsection Accessing Cell Entries @subsubsection Accessing Heap Object Fields
For a non-immediate Scheme object @var{x}, the object type can be For a non-immediate Scheme object @var{x}, the object type can be
determined by reading the cell type entry using the @code{SCM_CELL_TYPE} determined by using the @code{SCM_CELL_TYPE} macro described in the
macro. For each different type of cell it is known which cell entries previous section. For each different type of heap object it is known
hold Scheme objects and which cell entries hold raw C data. To access which fields hold tagged Scheme objects and which fields hold untagged
the different cell entries appropriately, the following macros are raw data. To access the different fields appropriately, the following
provided. macros are provided.
@deftypefn Macro scm_t_bits SCM_CELL_WORD (SCM @var{x}, unsigned int @var{n}) @deftypefn Macro scm_t_bits SCM_CELL_WORD (SCM @var{x}, unsigned int @var{n})
Deliver the cell entry @var{n} of the heap cell referenced by the @deftypefnx Macro scm_t_bits SCM_CELL_WORD_0 (@var{x})
non-immediate Scheme object @var{x} as raw data. It is illegal, to @deftypefnx Macro scm_t_bits SCM_CELL_WORD_1 (@var{x})
access cell entries that hold Scheme objects by using these macros. For @deftypefnx Macro scm_t_bits SCM_CELL_WORD_2 (@var{x})
convenience, the following macros are also provided. @deftypefnx Macro scm_t_bits SCM_CELL_WORD_3 (@var{x})
@itemize @bullet Deliver the field @var{n} of the heap object referenced by the
@item non-immediate Scheme object @var{x} as raw untagged data. Only use this
SCM_CELL_WORD_0 (@var{x}) @result{} SCM_CELL_WORD (@var{x}, 0) macro for fields containing untagged data; don't use it for fields
@item containing tagged @code{SCM} objects.
SCM_CELL_WORD_1 (@var{x}) @result{} SCM_CELL_WORD (@var{x}, 1)
@item
@dots{}
@item
SCM_CELL_WORD_@var{n} (@var{x}) @result{} SCM_CELL_WORD (@var{x}, @var{n})
@end itemize
@end deftypefn @end deftypefn
@deftypefn Macro SCM SCM_CELL_OBJECT (SCM @var{x}, unsigned int @var{n}) @deftypefn Macro SCM SCM_CELL_OBJECT (SCM @var{x}, unsigned int @var{n})
Deliver the cell entry @var{n} of the heap cell referenced by the @deftypefnx Macro SCM SCM_CELL_OBJECT_0 (SCM @var{x})
non-immediate Scheme object @var{x} as a Scheme object. It is illegal, @deftypefnx Macro SCM SCM_CELL_OBJECT_1 (SCM @var{x})
to access cell entries that do not hold Scheme objects by using these @deftypefnx Macro SCM SCM_CELL_OBJECT_2 (SCM @var{x})
macros. For convenience, the following macros are also provided. @deftypefnx Macro SCM SCM_CELL_OBJECT_3 (SCM @var{x})
@itemize @bullet Deliver the field @var{n} of the heap object referenced by the
@item non-immediate Scheme object @var{x} as a Scheme object. Only use this
SCM_CELL_OBJECT_0 (@var{x}) @result{} SCM_CELL_OBJECT (@var{x}, 0) macro for fields containing tagged @code{SCM} objects; don't use it for
@item fields containing untagged data.
SCM_CELL_OBJECT_1 (@var{x}) @result{} SCM_CELL_OBJECT (@var{x}, 1)
@item
@dots{}
@item
SCM_CELL_OBJECT_@var{n} (@var{x}) @result{} SCM_CELL_OBJECT (@var{x},
@var{n})
@end itemize
@end deftypefn @end deftypefn
@deftypefn Macro void SCM_SET_CELL_WORD (SCM @var{x}, unsigned int @var{n}, scm_t_bits @var{w}) @deftypefn Macro void SCM_SET_CELL_WORD (SCM @var{x}, unsigned int @var{n}, scm_t_bits @var{w})
Write the raw C value @var{w} into entry number @var{n} of the heap cell @deftypefnx Macro void SCM_SET_CELL_WORD_0 (@var{x}, @var{w})
@deftypefnx Macro void SCM_SET_CELL_WORD_1 (@var{x}, @var{w})
@deftypefnx Macro void SCM_SET_CELL_WORD_2 (@var{x}, @var{w})
@deftypefnx Macro void SCM_SET_CELL_WORD_3 (@var{x}, @var{w})
Write the raw value @var{w} into field number @var{n} of the heap object
referenced by the non-immediate Scheme value @var{x}. Values that are referenced by the non-immediate Scheme value @var{x}. Values that are
written into cells this way may only be read from the cells using the written into heap objects as raw values should only be read later using
@code{SCM_CELL_WORD} macros or, in case cell entry 0 is written, using the @code{SCM_CELL_WORD} macros.
the @code{SCM_CELL_TYPE} macro. For the special case of cell entry 0 it
has to be made sure that @var{w} contains a cell type information which
does not describe a Scheme object. For convenience, the following
macros are also provided.
@itemize @bullet
@item
SCM_SET_CELL_WORD_0 (@var{x}, @var{w}) @result{} SCM_SET_CELL_WORD
(@var{x}, 0, @var{w})
@item
SCM_SET_CELL_WORD_1 (@var{x}, @var{w}) @result{} SCM_SET_CELL_WORD
(@var{x}, 1, @var{w})
@item
@dots{}
@item
SCM_SET_CELL_WORD_@var{n} (@var{x}, @var{w}) @result{} SCM_SET_CELL_WORD
(@var{x}, @var{n}, @var{w})
@end itemize
@end deftypefn @end deftypefn
@deftypefn Macro void SCM_SET_CELL_OBJECT (SCM @var{x}, unsigned int @var{n}, SCM @var{o}) @deftypefn Macro void SCM_SET_CELL_OBJECT (SCM @var{x}, unsigned int @var{n}, SCM @var{o})
Write the Scheme object @var{o} into entry number @var{n} of the heap @deftypefnx Macro void SCM_SET_CELL_OBJECT_0 (SCM @var{x}, SCM @var{o})
cell referenced by the non-immediate Scheme value @var{x}. Values that @deftypefnx Macro void SCM_SET_CELL_OBJECT_1 (SCM @var{x}, SCM @var{o})
are written into cells this way may only be read from the cells using @deftypefnx Macro void SCM_SET_CELL_OBJECT_2 (SCM @var{x}, SCM @var{o})
the @code{SCM_CELL_OBJECT} macros or, in case cell entry 0 is written, @deftypefnx Macro void SCM_SET_CELL_OBJECT_3 (SCM @var{x}, SCM @var{o})
using the @code{SCM_CELL_TYPE} macro. For the special case of cell Write the Scheme object @var{o} into field number @var{n} of the heap
entry 0 the writing of a Scheme object into this cell is only allowed object referenced by the non-immediate Scheme value @var{x}. Values
if the cell forms a Scheme pair. For convenience, the following macros that are written into heap objects as objects should only be read using
are also provided. the @code{SCM_CELL_OBJECT} macros.
@itemize @bullet
@item
SCM_SET_CELL_OBJECT_0 (@var{x}, @var{o}) @result{} SCM_SET_CELL_OBJECT
(@var{x}, 0, @var{o})
@item
SCM_SET_CELL_OBJECT_1 (@var{x}, @var{o}) @result{} SCM_SET_CELL_OBJECT
(@var{x}, 1, @var{o})
@item
@dots{}
@item
SCM_SET_CELL_OBJECT_@var{n} (@var{x}, @var{o}) @result{}
SCM_SET_CELL_OBJECT (@var{x}, @var{n}, @var{o})
@end itemize
@end deftypefn @end deftypefn
@noindent @noindent
@ -669,9 +648,13 @@ Summary:
For a non-immediate Scheme object @var{x} of unknown type, get the type For a non-immediate Scheme object @var{x} of unknown type, get the type
information by using @code{SCM_CELL_TYPE (@var{x})}. information by using @code{SCM_CELL_TYPE (@var{x})}.
@item @item
As soon as the cell type information is available, only use the As soon as the type information is available, only use the appropriate
appropriate access methods to read and write data to the different cell access methods to read and write data to the different heap object
entries. fields.
@item
Note that field 0 stores the cell type information. Generally speaking,
other data associated with a heap object is stored starting from field
1.
@end itemize @end itemize
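The summary above can be sketched as a small set of accessors. This is a toy model, not Guile's macros: the `toy_` names, the `TOY_TYPE_BOX` tag value, and the one-field "box" type are all hypothetical. The point is the discipline: read the type from field 0 first, then use the word accessors for untagged fields and the object accessors for tagged ones.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t toy_bits;
typedef struct { toy_bits raw; } toy_scm;

/* Hypothetical type tag for a box holding one tagged value in field 1.  */
enum { TOY_TYPE_BOX = 0x17 };

static inline toy_bits *toy_fields (toy_scm x) { return (toy_bits *) x.raw; }

/* Analogue of SCM_CELL_TYPE: field 0 holds the type information.  */
static inline toy_bits toy_cell_type (toy_scm x) { return toy_fields (x)[0]; }

/* Analogues of SCM_CELL_WORD and SCM_SET_CELL_WORD: untagged raw data.  */
static inline toy_bits toy_cell_word (toy_scm x, unsigned int n)
{
  return toy_fields (x)[n];
}

static inline void toy_set_cell_word (toy_scm x, unsigned int n, toy_bits w)
{
  toy_fields (x)[n] = w;
}

/* Analogues of SCM_CELL_OBJECT and SCM_SET_CELL_OBJECT: tagged values.  */
static inline toy_scm toy_cell_object (toy_scm x, unsigned int n)
{
  toy_scm o = { toy_fields (x)[n] };
  return o;
}

static inline void toy_set_cell_object (toy_scm x, unsigned int n, toy_scm o)
{
  toy_fields (x)[n] = o.raw;
}

/* Typed accessor: check the type in field 0, then read field 1 with
   the accessor that matches what the type stores there.  */
static inline toy_scm toy_box_ref (toy_scm box)
{
  assert (toy_cell_type (box) == TOY_TYPE_BOX);
  return toy_cell_object (box, 1);
}
```

In this toy both accessor families read the same word, so mixing them up is not caught; keeping them separate in the API is what lets a stricter build or a reviewer notice a field being treated as the wrong kind of data.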


@ -272,7 +272,7 @@ program)}. @xref{Compiled Procedures}, for a full API reference.
A procedure may reference data that was statically allocated when the A procedure may reference data that was statically allocated when the
procedure was compiled. For example, a pair of immediate objects procedure was compiled. For example, a pair of immediate objects
(@pxref{Immediate objects}) can be allocated directly in the memory (@pxref{Immediate Objects}) can be allocated directly in the memory
segment that contains the compiled bytecode, and accessed directly by segment that contains the compiled bytecode, and accessed directly by
the bytecode. the bytecode.
@ -495,7 +495,7 @@ An offset from the current @code{ip}, in 32-bit units, as a signed
24-bit value. Indicates a bytecode address, for a relative jump. 24-bit value. Indicates a bytecode address, for a relative jump.
@item i16 @item i16
@itemx i32 @itemx i32
An immediate Scheme value (@pxref{Immediate objects}), encoded directly An immediate Scheme value (@pxref{Immediate Objects}), encoded directly
in 16 or 32 bits. in 16 or 32 bits.
@item a32 @item a32
@itemx b32 @itemx b32