mirror of
https://git.savannah.gnu.org/git/guile.git
synced 2025-06-10 14:00:21 +02:00
Merge remote-tracking branch 'origin/stable-2.0'
This commit is contained in:
commit
e690a3cbf2
20 changed files with 1726 additions and 359 deletions
|
@ -1,7 +1,7 @@
|
|||
;;; -*- mode: scheme; coding: utf-8; -*-
|
||||
;;; SRFI-1.
|
||||
;;;
|
||||
;;; Copyright 2010 Free Software Foundation, Inc.
|
||||
;;; Copyright 2010, 2011 Free Software Foundation, Inc.
|
||||
;;;
|
||||
;;; This program is free software; you can redistribute it and/or
|
||||
;;; modify it under the terms of the GNU Lesser General Public License
|
||||
|
@ -45,3 +45,20 @@
|
|||
|
||||
(benchmark "small" 2000000
|
||||
(drop-while (lambda (n) #t) %small-list)))
|
||||
|
||||
(with-benchmark-prefix "map"
|
||||
|
||||
(benchmark "big" 30
|
||||
(map (lambda (x) x) %big-list))
|
||||
|
||||
(benchmark "small" 2000000
|
||||
(map (lambda (x) x) %small-list)))
|
||||
|
||||
(with-benchmark-prefix "for-each"
|
||||
|
||||
(benchmark "big" 30
|
||||
(for-each (lambda (x) #f) %big-list))
|
||||
|
||||
(benchmark "small" 2000000
|
||||
(for-each (lambda (x) #f) %small-list)))
|
||||
|
||||
|
|
|
@ -1,7 +1,7 @@
|
|||
@c -*-texinfo-*-
|
||||
@c This is part of the GNU Guile Reference Manual.
|
||||
@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2009, 2010
|
||||
@c Free Software Foundation, Inc.
|
||||
@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2005, 2006,
|
||||
@c 2007, 2009, 2010, 2011 Free Software Foundation, Inc.
|
||||
@c See the file guile.texi for copying conditions.
|
||||
|
||||
@node Compound Data Types
|
||||
|
@ -3294,8 +3294,9 @@ Again the choice of @var{hash-proc} must be consistent with previous calls to
|
|||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} vhash-fold proc vhash
|
||||
Fold over the key/pair elements of @var{vhash}. For each pair call @var{proc}
|
||||
as @code{(@var{proc} key value result)}.
|
||||
@deffnx {Scheme Procedure} vhash-fold-right proc vhash
|
||||
Fold over the key/value elements of @var{vhash} in the given direction.
|
||||
For each pair call @var{proc} as @code{(@var{proc} key value result)}.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} vhash-fold* proc init key vhash [equal? [hash]]
|
||||
|
|
|
@ -1152,22 +1152,364 @@ The I/O port API of the @uref{http://www.r6rs.org/, Revised Report^6 on
|
|||
the Algorithmic Language Scheme (R6RS)} is provided by the @code{(rnrs
|
||||
io ports)} module. It provides features, such as binary I/O and Unicode
|
||||
string I/O, that complement or refine Guile's historical port API
|
||||
presented above (@pxref{Input and Output}).
|
||||
presented above (@pxref{Input and Output}). Note that R6RS ports are not
|
||||
disjoint from Guile's native ports, so Guile-specific procedures will
|
||||
work on ports created using the R6RS API, and vice versa.
|
||||
|
||||
The text in this section is taken from the R6RS standard libraries
|
||||
document, with only minor adaptions for inclusion in this manual. The
|
||||
Guile developers offer their thanks to the R6RS editors for having
|
||||
provided the report's text under permissive conditions making this
|
||||
possible.
|
||||
|
||||
@c FIXME: Update description when implemented.
|
||||
@emph{Note}: The implementation of this R6RS API is not complete yet.
|
||||
|
||||
@menu
|
||||
* R6RS File Names:: File names.
|
||||
* R6RS File Options:: Options for opening files.
|
||||
* R6RS Buffer Modes:: Influencing buffering behavior.
|
||||
* R6RS Transcoders:: Influencing port encoding.
|
||||
* R6RS End-of-File:: The end-of-file object.
|
||||
* R6RS Port Manipulation:: Manipulating R6RS ports.
|
||||
* R6RS Input Ports:: Input Ports.
|
||||
* R6RS Binary Input:: Binary input.
|
||||
* R6RS Textual Input:: Textual input.
|
||||
* R6RS Output Ports:: Output Ports.
|
||||
* R6RS Binary Output:: Binary output.
|
||||
* R6RS Textual Output:: Textual output.
|
||||
@end menu
|
||||
|
||||
A subset of the @code{(rnrs io ports)} module is provided by the
|
||||
@code{(ice-9 binary-ports)} module. It contains binary input/output
|
||||
procedures and does not rely on R6RS support.
|
||||
|
||||
@node R6RS File Names
|
||||
@subsubsection File Names
|
||||
|
||||
Some of the procedures described in this chapter accept a file name as an
|
||||
argument. Valid values for such a file name include strings that name a file
|
||||
using the native notation of filesystem paths on an implementation's
|
||||
underlying operating system, and may include implementation-dependent
|
||||
values as well.
|
||||
|
||||
A @var{filename} parameter name means that the
|
||||
corresponding argument must be a file name.
|
||||
|
||||
@node R6RS File Options
|
||||
@subsubsection File Options
|
||||
@cindex file options
|
||||
|
||||
When opening a file, the various procedures in this library accept a
|
||||
@code{file-options} object that encapsulates flags to specify how the
|
||||
file is to be opened. A @code{file-options} object is an enum-set
|
||||
(@pxref{rnrs enums}) over the symbols constituting valid file options.
|
||||
|
||||
A @var{file-options} parameter name means that the corresponding
|
||||
argument must be a file-options object.
|
||||
|
||||
@deffn {Scheme Syntax} file-options @var{file-options-symbol} ...
|
||||
|
||||
Each @var{file-options-symbol} must be a symbol.
|
||||
|
||||
The @code{file-options} syntax returns a file-options object that
|
||||
encapsulates the specified options.
|
||||
|
||||
When supplied to an operation that opens a file for output, the
|
||||
file-options object returned by @code{(file-options)} specifies that the
|
||||
file is created if it does not exist and an exception with condition
|
||||
type @code{&i/o-file-already-exists} is raised if it does exist. The
|
||||
following standard options can be included to modify the default
|
||||
behavior.
|
||||
|
||||
@table @code
|
||||
@item no-create
|
||||
If the file does not already exist, it is not created;
|
||||
instead, an exception with condition type @code{&i/o-file-does-not-exist}
|
||||
is raised.
|
||||
If the file already exists, the exception with condition type
|
||||
@code{&i/o-file-already-exists} is not raised
|
||||
and the file is truncated to zero length.
|
||||
@item no-fail
|
||||
If the file already exists, the exception with condition type
|
||||
@code{&i/o-file-already-exists} is not raised,
|
||||
even if @code{no-create} is not included,
|
||||
and the file is truncated to zero length.
|
||||
@item no-truncate
|
||||
If the file already exists and the exception with condition type
|
||||
@code{&i/o-file-already-exists} has been inhibited by inclusion of
|
||||
@code{no-create} or @code{no-fail}, the file is not truncated, but
|
||||
the port's current position is still set to the beginning of the
|
||||
file.
|
||||
@end table
|
||||
|
||||
These options have no effect when a file is opened only for input.
|
||||
Symbols other than those listed above may be used as
|
||||
@var{file-options-symbol}s; they have implementation-specific meaning,
|
||||
if any.
|
||||
|
||||
@quotation Note
|
||||
Only the name of @var{file-options-symbol} is significant.
|
||||
@end quotation
|
||||
@end deffn
|
||||
|
||||
@node R6RS Buffer Modes
|
||||
@subsubsection Buffer Modes
|
||||
|
||||
Each port has an associated buffer mode. For an output port, the
|
||||
buffer mode defines when an output operation flushes the buffer
|
||||
associated with the output port. For an input port, the buffer mode
|
||||
defines how much data will be read to satisfy read operations. The
|
||||
possible buffer modes are the symbols @code{none} for no buffering,
|
||||
@code{line} for flushing upon line endings and reading up to line
|
||||
endings, or other implementation-dependent behavior,
|
||||
and @code{block} for arbitrary buffering. This section uses
|
||||
the parameter name @var{buffer-mode} for arguments that must be
|
||||
buffer-mode symbols.
|
||||
|
||||
If two ports are connected to the same mutable source, both ports
|
||||
are unbuffered, and reading a byte or character from that shared
|
||||
source via one of the two ports would change the bytes or characters
|
||||
seen via the other port, a lookahead operation on one port will
|
||||
render the peeked byte or character inaccessible via the other port,
|
||||
while a subsequent read operation on the peeked port will see the
|
||||
peeked byte or character even though the port is otherwise unbuffered.
|
||||
|
||||
In other words, the semantics of buffering is defined in terms of side
|
||||
effects on shared mutable sources, and a lookahead operation has the
|
||||
same side effect on the shared source as a read operation.
|
||||
|
||||
@deffn {Scheme Syntax} buffer-mode @var{buffer-mode-symbol}
|
||||
|
||||
@var{buffer-mode-symbol} must be a symbol whose name is one of
|
||||
@code{none}, @code{line}, and @code{block}. The result is the
|
||||
corresponding symbol, and specifies the associated buffer mode.
|
||||
|
||||
@quotation Note
|
||||
Only the name of @var{buffer-mode-symbol} is significant.
|
||||
@end quotation
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} buffer-mode? obj
|
||||
Returns @code{#t} if the argument is a valid buffer-mode symbol, and
|
||||
returns @code{#f} otherwise.
|
||||
@end deffn
|
||||
|
||||
@node R6RS Transcoders
|
||||
@subsubsection Transcoders
|
||||
@cindex codec
|
||||
@cindex end-of-line style
|
||||
@cindex transcoder
|
||||
@cindex binary port
|
||||
@cindex textual port
|
||||
|
||||
Several different Unicode encoding schemes describe standard ways to
|
||||
encode characters and strings as byte sequences and to decode those
|
||||
sequences. Within this document, a @dfn{codec} is an immutable Scheme
|
||||
object that represents a Unicode or similar encoding scheme.
|
||||
|
||||
An @dfn{end-of-line style} is a symbol that, if it is not @code{none},
|
||||
describes how a textual port transcodes representations of line endings.
|
||||
|
||||
A @dfn{transcoder} is an immutable Scheme object that combines a codec
|
||||
with an end-of-line style and a method for handling decoding errors.
|
||||
Each transcoder represents some specific bidirectional (but not
|
||||
necessarily lossless), possibly stateful translation between byte
|
||||
sequences and Unicode characters and strings. Every transcoder can
|
||||
operate in the input direction (bytes to characters) or in the output
|
||||
direction (characters to bytes). A @var{transcoder} parameter name
|
||||
means that the corresponding argument must be a transcoder.
|
||||
|
||||
A @dfn{binary port} is a port that supports binary I/O, does not have an
|
||||
associated transcoder and does not support textual I/O. A @dfn{textual
|
||||
port} is a port that supports textual I/O, and does not support binary
|
||||
I/O. A textual port may or may not have an associated transcoder.
|
||||
|
||||
@deffn {Scheme Procedure} latin-1-codec
|
||||
@deffnx {Scheme Procedure} utf-8-codec
|
||||
@deffnx {Scheme Procedure} utf-16-codec
|
||||
|
||||
These are predefined codecs for the ISO 8859-1, UTF-8, and UTF-16
|
||||
encoding schemes.
|
||||
|
||||
A call to any of these procedures returns a value that is equal in the
|
||||
sense of @code{eqv?} to the result of any other call to the same
|
||||
procedure.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Syntax} eol-style @var{eol-style-symbol}
|
||||
|
||||
@var{eol-style-symbol} should be a symbol whose name is one of
|
||||
@code{lf}, @code{cr}, @code{crlf}, @code{nel}, @code{crnel}, @code{ls},
|
||||
and @code{none}.
|
||||
|
||||
The form evaluates to the corresponding symbol. If the name of
|
||||
@var{eol-style-symbol} is not one of these symbols, the effect and
|
||||
result are implementation-dependent; in particular, the result may be an
|
||||
eol-style symbol acceptable as an @var{eol-style} argument to
|
||||
@code{make-transcoder}. Otherwise, an exception is raised.
|
||||
|
||||
All eol-style symbols except @code{none} describe a specific
|
||||
line-ending encoding:
|
||||
|
||||
@table @code
|
||||
@item lf
|
||||
linefeed
|
||||
@item cr
|
||||
carriage return
|
||||
@item crlf
|
||||
carriage return, linefeed
|
||||
@item nel
|
||||
next line
|
||||
@item crnel
|
||||
carriage return, next line
|
||||
@item ls
|
||||
line separator
|
||||
@end table
|
||||
|
||||
For a textual port with a transcoder, and whose transcoder has an
|
||||
eol-style symbol @code{none}, no conversion occurs. For a textual input
|
||||
port, any eol-style symbol other than @code{none} means that all of the
|
||||
above line-ending encodings are recognized and are translated into a
|
||||
single linefeed. For a textual output port, @code{none} and @code{lf}
|
||||
are equivalent. Linefeed characters are encoded according to the
|
||||
specified eol-style symbol, and all other characters that participate in
|
||||
possible line endings are encoded as is.
|
||||
|
||||
@quotation Note
|
||||
Only the name of @var{eol-style-symbol} is significant.
|
||||
@end quotation
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} native-eol-style
|
||||
Returns the default end-of-line style of the underlying platform, e.g.,
|
||||
@code{lf} on Unix and @code{crlf} on Windows.
|
||||
@end deffn
|
||||
|
||||
@deffn {Condition Type} &i/o-decoding
|
||||
@deffnx {Scheme Procedure} make-i/o-decoding-error port
|
||||
@deffnx {Scheme Procedure} i/o-decoding-error? obj
|
||||
|
||||
This condition type could be defined by
|
||||
|
||||
@lisp
|
||||
(define-condition-type &i/o-decoding &i/o-port
|
||||
make-i/o-decoding-error i/o-decoding-error?)
|
||||
@end lisp
|
||||
|
||||
An exception with this type is raised when one of the operations for
|
||||
textual input from a port encounters a sequence of bytes that cannot be
|
||||
translated into a character or string by the input direction of the
|
||||
port's transcoder.
|
||||
|
||||
When such an exception is raised, the port's position is past the
|
||||
invalid encoding.
|
||||
@end deffn
|
||||
|
||||
@deffn {Condition Type} &i/o-encoding
|
||||
@deffnx {Scheme Procedure} make-i/o-encoding-error port char
|
||||
@deffnx {Scheme Procedure} i/o-encoding-error? obj
|
||||
@deffnx {Scheme Procedure} i/o-encoding-error-char condition
|
||||
|
||||
This condition type could be defined by
|
||||
|
||||
@lisp
|
||||
(define-condition-type &i/o-encoding &i/o-port
|
||||
make-i/o-encoding-error i/o-encoding-error?
|
||||
(char i/o-encoding-error-char))
|
||||
@end lisp
|
||||
|
||||
An exception with this type is raised when one of the operations for
|
||||
textual output to a port encounters a character that cannot be
|
||||
translated into bytes by the output direction of the port's transcoder.
|
||||
@var{Char} is the character that could not be encoded.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Syntax} error-handling-mode @var{error-handling-mode-symbol}
|
||||
|
||||
@var{error-handling-mode-symbol} should be a symbol whose name is one of
|
||||
@code{ignore}, @code{raise}, and @code{replace}. The form evaluates to
|
||||
the corresponding symbol. If @var{error-handling-mode-symbol} is not
|
||||
one of these identifiers, effect and result are
|
||||
implementation-dependent: The result may be an error-handling-mode
|
||||
symbol acceptable as a @var{handling-mode} argument to
|
||||
@code{make-transcoder}. If it is not acceptable as a
|
||||
@var{handling-mode} argument to @code{make-transcoder}, an exception is
|
||||
raised.
|
||||
|
||||
@quotation Note
|
||||
Only the name of @var{error-handling-style-symbol} is significant.
|
||||
@end quotation
|
||||
|
||||
The error-handling mode of a transcoder specifies the behavior
|
||||
of textual I/O operations in the presence of encoding or decoding
|
||||
errors.
|
||||
|
||||
If a textual input operation encounters an invalid or incomplete
|
||||
character encoding, and the error-handling mode is @code{ignore}, an
|
||||
appropriate number of bytes of the invalid encoding are ignored and
|
||||
decoding continues with the following bytes.
|
||||
|
||||
If the error-handling mode is @code{replace}, the replacement
|
||||
character U+FFFD is injected into the data stream, an appropriate
|
||||
number of bytes are ignored, and decoding
|
||||
continues with the following bytes.
|
||||
|
||||
If the error-handling mode is @code{raise}, an exception with condition
|
||||
type @code{&i/o-decoding} is raised.
|
||||
|
||||
If a textual output operation encounters a character it cannot encode,
|
||||
and the error-handling mode is @code{ignore}, the character is ignored
|
||||
and encoding continues with the next character. If the error-handling
|
||||
mode is @code{replace}, a codec-specific replacement character is
|
||||
emitted by the transcoder, and encoding continues with the next
|
||||
character. The replacement character is U+FFFD for transcoders whose
|
||||
codec is one of the Unicode encodings, but is the @code{?} character
|
||||
for the Latin-1 encoding. If the error-handling mode is @code{raise},
|
||||
an exception with condition type @code{&i/o-encoding} is raised.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} make-transcoder codec
|
||||
@deffnx {Scheme Procedure} make-transcoder codec eol-style
|
||||
@deffnx {Scheme Procedure} make-transcoder codec eol-style handling-mode
|
||||
|
||||
@var{codec} must be a codec; @var{eol-style}, if present, an eol-style
|
||||
symbol; and @var{handling-mode}, if present, an error-handling-mode
|
||||
symbol.
|
||||
|
||||
@var{eol-style} may be omitted, in which case it defaults to the native
|
||||
end-of-line style of the underlying platform. @var{Handling-mode} may
|
||||
be omitted, in which case it defaults to @code{replace}. The result is
|
||||
a transcoder with the behavior specified by its arguments.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme procedure} native-transcoder
|
||||
Returns an implementation-dependent transcoder that represents a
|
||||
possibly locale-dependent ``native'' transcoding.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} transcoder-codec transcoder
|
||||
@deffnx {Scheme Procedure} transcoder-eol-style transcoder
|
||||
@deffnx {Scheme Procedure} transcoder-error-handling-mode transcoder
|
||||
|
||||
These are accessors for transcoder objects; when applied to a
|
||||
transcoder returned by @code{make-transcoder}, they return the
|
||||
@var{codec}, @var{eol-style}, and @var{handling-mode} arguments,
|
||||
respectively.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} bytevector->string bytevector transcoder
|
||||
|
||||
Returns the string that results from transcoding the
|
||||
@var{bytevector} according to the input direction of the transcoder.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} string->bytevector string transcoder
|
||||
|
||||
Returns the bytevector that results from transcoding the
|
||||
@var{string} according to the output direction of the transcoder.
|
||||
@end deffn
|
||||
|
||||
@node R6RS End-of-File
|
||||
@subsubsection The End-of-File Object
|
||||
|
||||
|
@ -1200,6 +1542,65 @@ Return the end-of-file (EOF) object.
|
|||
|
||||
The procedures listed below operate on any kind of R6RS I/O port.
|
||||
|
||||
@deffn {Scheme Procedure} port? obj
|
||||
Returns @code{#t} if the argument is a port, and returns @code{#f}
|
||||
otherwise.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} port-transcoder port
|
||||
Returns the transcoder associated with @var{port} if @var{port} is
|
||||
textual and has an associated transcoder, and returns @code{#f} if
|
||||
@var{port} is binary or does not have an associated transcoder.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} binary-port? port
|
||||
Return @code{#t} if @var{port} is a @dfn{binary port}, suitable for
|
||||
binary data input/output.
|
||||
|
||||
Note that internally Guile does not differentiate between binary and
|
||||
textual ports, unlike the R6RS. Thus, this procedure returns true when
|
||||
@var{port} does not have an associated encoding---i.e., when
|
||||
@code{(port-encoding @var{port})} is @code{#f} (@pxref{Ports,
|
||||
port-encoding}). This is the case for ports returned by R6RS procedures
|
||||
such as @code{open-bytevector-input-port} and
|
||||
@code{make-custom-binary-output-port}.
|
||||
|
||||
However, Guile currently does not prevent use of textual I/O procedures
|
||||
such as @code{display} or @code{read-char} with binary ports. Doing so
|
||||
``upgrades'' the port from binary to textual, under the ISO-8859-1
|
||||
encoding. Likewise, Guile does not prevent use of
|
||||
@code{set-port-encoding!} on a binary port, which also turns it into a
|
||||
``textual'' port.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} textual-port? port
|
||||
Always return @var{#t}, as all ports can be used for textual I/O in
|
||||
Guile.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} transcoded-port obj
|
||||
The @code{transcoded-port} procedure
|
||||
returns a new textual port with the specified @var{transcoder}.
|
||||
Otherwise the new textual port's state is largely the same as
|
||||
that of @var{binary-port}.
|
||||
If @var{binary-port} is an input port, the new textual
|
||||
port will be an input port and
|
||||
will transcode the bytes that have not yet been read from
|
||||
@var{binary-port}.
|
||||
If @var{binary-port} is an output port, the new textual
|
||||
port will be an output port and
|
||||
will transcode output characters into bytes that are
|
||||
written to the byte sink represented by @var{binary-port}.
|
||||
|
||||
As a side effect, however, @code{transcoded-port}
|
||||
closes @var{binary-port} in
|
||||
a special way that allows the new textual port to continue to
|
||||
use the byte source or sink represented by @var{binary-port},
|
||||
even though @var{binary-port} itself is closed and cannot
|
||||
be used by the input and output operations described in this
|
||||
chapter.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} port-position port
|
||||
If @var{port} supports it (see below), return the offset (an integer)
|
||||
indicating where the next octet will be read from/written to in
|
||||
|
@ -1233,31 +1634,67 @@ Call @var{proc}, passing it @var{port} and closing @var{port} upon exit
|
|||
of @var{proc}. Return the return values of @var{proc}.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} binary-port? port
|
||||
Return @code{#t} if @var{port} is a @dfn{binary port}, suitable for
|
||||
binary data input/output.
|
||||
@node R6RS Input Ports
|
||||
@subsubsection Input Ports
|
||||
|
||||
Note that internally Guile does not differentiate between binary and
|
||||
textual ports, unlike the R6RS. Thus, this procedure returns true when
|
||||
@var{port} does not have an associated encoding---i.e., when
|
||||
@code{(port-encoding @var{port})} is @code{#f} (@pxref{Ports,
|
||||
port-encoding}). This is the case for ports returned by R6RS procedures
|
||||
such as @code{open-bytevector-input-port} and
|
||||
@code{make-custom-binary-output-port}.
|
||||
|
||||
However, Guile currently does not prevent use of textual I/O procedures
|
||||
such as @code{display} or @code{read-char} with binary ports. Doing so
|
||||
``upgrades'' the port from binary to textual, under the ISO-8859-1
|
||||
encoding. Likewise, Guile does not prevent use of
|
||||
@code{set-port-encoding!} on a binary port, which also turns it into a
|
||||
``textual'' port.
|
||||
@deffn {Scheme Procedure} input-port? obj@
|
||||
Returns @code{#t} if the argument is an input port (or a combined input
|
||||
and output port), and returns @code{#f} otherwise.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} textual-port? port
|
||||
Always return @var{#t}, as all ports can be used for textual I/O in
|
||||
Guile.
|
||||
@deffn {Scheme Procedure} port-eof? port
|
||||
Returns @code{#t}
|
||||
if the @code{lookahead-u8} procedure (if @var{input-port} is a binary port)
|
||||
or the @code{lookahead-char} procedure (if @var{input-port} is a textual port)
|
||||
would return
|
||||
the end-of-file object, and @code{#f} otherwise.
|
||||
The operation may block indefinitely if no data is available
|
||||
but the port cannot be determined to be at end of file.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} open-file-input-port filename
|
||||
@deffnx {Scheme Procedure} open-file-input-port filename file-options
|
||||
@deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode
|
||||
@deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode maybe-transcoder
|
||||
@var{Maybe-transcoder} must be either a transcoder or @code{#f}.
|
||||
|
||||
The @code{open-file-input-port} procedure returns an
|
||||
input port for the named file. The @var{file-options} and
|
||||
@var{maybe-transcoder} arguments are optional.
|
||||
|
||||
The @var{file-options} argument, which may determine
|
||||
various aspects of the returned port (@pxref{R6RS File Options}),
|
||||
defaults to the value of @code{(file-options)}.
|
||||
|
||||
The @var{buffer-mode} argument, if supplied,
|
||||
must be one of the symbols that name a buffer mode.
|
||||
The @var{buffer-mode} argument defaults to @code{block}.
|
||||
|
||||
If @var{maybe-transcoder} is a transcoder, it becomes the transcoder associated
|
||||
with the returned port.
|
||||
|
||||
If @var{maybe-transcoder} is @code{#f} or absent,
|
||||
the port will be a binary port and will support the
|
||||
@code{port-position} and @code{set-port-position!} operations.
|
||||
Otherwise the port will be a textual port, and whether it supports
|
||||
the @code{port-position} and @code{set-port-position!} operations
|
||||
is implementation-dependent (and possibly transcoder-dependent).
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} standard-input-port
|
||||
Returns a fresh binary input port connected to standard input. Whether
|
||||
the port supports the @code{port-position} and @code{set-port-position!}
|
||||
operations is implementation-dependent.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} current-input-port
|
||||
This returns a default textual port for input. Normally, this default
|
||||
port is associated with standard input, but can be dynamically
|
||||
re-assigned using the @code{with-input-from-file} procedure from the
|
||||
@code{io simple (6)} library (@pxref{rnrs io simple}). The port may or
|
||||
may not have an associated transcoder; if it does, the transcoder is
|
||||
implementation-dependent.
|
||||
@end deffn
|
||||
|
||||
@node R6RS Binary Input
|
||||
@subsubsection Binary Input
|
||||
|
@ -1374,6 +1811,173 @@ reached. Return either a new bytevector containing the data read or the
|
|||
end-of-file object (if no data were available).
|
||||
@end deffn
|
||||
|
||||
@node R6RS Textual Input
|
||||
@subsubsection Textual Input
|
||||
|
||||
@deffn {Scheme Procedure} get-char port
|
||||
Reads from @var{textual-input-port}, blocking as necessary, until a
|
||||
complete character is available from @var{textual-input-port},
|
||||
or until an end of file is reached.
|
||||
|
||||
If a complete character is available before the next end of file,
|
||||
@code{get-char} returns that character and updates the input port to
|
||||
point past the character. If an end of file is reached before any
|
||||
character is read, @code{get-char} returns the end-of-file object.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} lookahead-char port
|
||||
The @code{lookahead-char} procedure is like @code{get-char}, but it does
|
||||
not update @var{textual-input-port} to point past the character.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} get-string-n port count
|
||||
|
||||
@var{Count} must be an exact, non-negative integer object, representing
|
||||
the number of characters to be read.
|
||||
|
||||
The @code{get-string-n} procedure reads from @var{textual-input-port},
|
||||
blocking as necessary, until @var{count} characters are available, or
|
||||
until an end of file is reached.
|
||||
|
||||
If @var{count} characters are available before end of file,
|
||||
@code{get-string-n} returns a string consisting of those @var{count}
|
||||
characters. If fewer characters are available before an end of file, but
|
||||
one or more characters can be read, @code{get-string-n} returns a string
|
||||
containing those characters. In either case, the input port is updated
|
||||
to point just past the characters read. If no characters can be read
|
||||
before an end of file, the end-of-file object is returned.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} get-string-n! port string start count
|
||||
|
||||
@var{Start} and @var{count} must be exact, non-negative integer objects,
|
||||
with @var{count} representing the number of characters to be read.
|
||||
@var{String} must be a string with at least $@var{start} + @var{count}$
|
||||
characters.
|
||||
|
||||
The @code{get-string-n!} procedure reads from @var{textual-input-port}
|
||||
in the same manner as @code{get-string-n}. If @var{count} characters
|
||||
are available before an end of file, they are written into @var{string}
|
||||
starting at index @var{start}, and @var{count} is returned. If fewer
|
||||
characters are available before an end of file, but one or more can be
|
||||
read, those characters are written into @var{string} starting at index
|
||||
@var{start} and the number of characters actually read is returned as an
|
||||
exact integer object. If no characters can be read before an end of
|
||||
file, the end-of-file object is returned.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} get-string-all port count
|
||||
Reads from @var{textual-input-port} until an end of file, decoding
|
||||
characters in the same manner as @code{get-string-n} and
|
||||
@code{get-string-n!}.
|
||||
|
||||
If characters are available before the end of file, a string containing
|
||||
all the characters decoded from that data are returned. If no character
|
||||
precedes the end of file, the end-of-file object is returned.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} get-line port
|
||||
Reads from @var{textual-input-port} up to and including the linefeed
|
||||
character or end of file, decoding characters in the same manner as
|
||||
@code{get-string-n} and @code{get-string-n!}.
|
||||
|
||||
If a linefeed character is read, a string containing all of the text up
|
||||
to (but not including) the linefeed character is returned, and the port
|
||||
is updated to point just past the linefeed character. If an end of file
|
||||
is encountered before any linefeed character is read, but some
|
||||
characters have been read and decoded as characters, a string containing
|
||||
those characters is returned. If an end of file is encountered before
|
||||
any characters are read, the end-of-file object is returned.
|
||||
|
||||
@quotation Note
|
||||
The end-of-line style, if not @code{none}, will cause all line endings
|
||||
to be read as linefeed characters. @xref{R6RS Transcoders}.
|
||||
@end quotation
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} get-datum port count
|
||||
Reads an external representation from @var{textual-input-port} and returns the
|
||||
datum it represents. The @code{get-datum} procedure returns the next
|
||||
datum that can be parsed from the given @var{textual-input-port}, updating
|
||||
@var{textual-input-port} to point exactly past the end of the external
|
||||
representation of the object.
|
||||
|
||||
Any @emph{interlexeme space} (comment or whitespace, @pxref{Scheme
|
||||
Syntax}) in the input is first skipped. If an end of file occurs after
|
||||
the interlexeme space, the end-of-file object (@pxref{R6RS End-of-File})
|
||||
is returned.
|
||||
|
||||
If a character inconsistent with an external representation is
|
||||
encountered in the input, an exception with condition types
|
||||
@code{&lexical} and @code{&i/o-read} is raised. Also, if the end of
|
||||
file is encountered after the beginning of an external representation,
|
||||
but the external representation is incomplete and therefore cannot be
|
||||
parsed, an exception with condition types @code{&lexical} and
|
||||
@code{&i/o-read} is raised.
|
||||
@end deffn
|
||||
|
||||
@node R6RS Output Ports
|
||||
@subsubsection Output Ports
|
||||
|
||||
@deffn {Scheme Procedure} output-port? obj
|
||||
Returns @code{#t} if the argument is an output port (or a
|
||||
combined input and output port), @code{#f} otherwise.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} flush-output-port port
|
||||
Flushes any buffered output from the buffer of @var{output-port} to the
|
||||
underlying file, device, or object. The @code{flush-output-port}
|
||||
procedure returns an unspecified values.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} open-file-output-port filename
|
||||
@deffnx {Scheme Procedure} open-file-output-port filename file-options
|
||||
@deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode
|
||||
@deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode maybe-transcoder
|
||||
|
||||
@var{maybe-transcoder} must be either a transcoder or @code{#f}.
|
||||
|
||||
The @code{open-file-output-port} procedure returns an output port for the named file.
|
||||
|
||||
The @var{file-options} argument, which may determine various aspects of
|
||||
the returned port (@pxref{R6RS File Options}), defaults to the value of
|
||||
@code{(file-options)}.
|
||||
|
||||
The @var{buffer-mode} argument, if supplied,
|
||||
must be one of the symbols that name a buffer mode.
|
||||
The @var{buffer-mode} argument defaults to @code{block}.
|
||||
|
||||
If @var{maybe-transcoder} is a transcoder, it becomes the transcoder
|
||||
associated with the port.
|
||||
|
||||
If @var{maybe-transcoder} is @code{#f} or absent,
|
||||
the port will be a binary port and will support the
|
||||
@code{port-position} and @code{set-port-position!} operations.
|
||||
Otherwise the port will be a textual port, and whether it supports
|
||||
the @code{port-position} and @code{set-port-position!} operations
|
||||
is implementation-dependent (and possibly transcoder-dependent).
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} standard-output-port
|
||||
@deffnx {Scheme Procedure} standard-error-port
|
||||
Returns a fresh binary output port connected to the standard output or
|
||||
standard error respectively. Whether the port supports the
|
||||
@code{port-position} and @code{set-port-position!} operations is
|
||||
implementation-dependent.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} current-output-port
|
||||
@deffnx {Scheme Procedure} current-error-port
|
||||
These return default textual ports for regular output and error output.
|
||||
Normally, these default ports are associated with standard output, and
|
||||
standard error, respectively. The return value of
|
||||
@code{current-output-port} can be dynamically re-assigned using the
|
||||
@code{with-output-to-file} procedure from the @code{io simple (6)}
|
||||
library (@pxref{rnrs io simple}). A port returned by one of these
|
||||
procedures may or may not have an associated transcoder; if it does, the
|
||||
transcoder is implementation-dependent.
|
||||
@end deffn
|
||||
|
||||
@node R6RS Binary Output
|
||||
@subsubsection Binary Output
|
||||
|
||||
|
@ -1432,6 +2036,50 @@ Write the contents of @var{bv} to @var{port}, optionally starting at
|
|||
index @var{start} and limiting to @var{count} octets.
|
||||
@end deffn
|
||||
|
||||
@node R6RS Textual Output
|
||||
@subsubsection Textual Output
|
||||
|
||||
@deffn {Scheme Procedure} put-char port char
|
||||
Writes @var{char} to the port. The @code{put-char} procedure returns
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} put-string port string
|
||||
@deffnx {Scheme Procedure} put-string port string start
|
||||
@deffnx {Scheme Procedure} put-string port string start count
|
||||
|
||||
@var{start} and @var{count} must be non-negative exact integer objects.
|
||||
@var{string} must have a length of at least @math{@var{start} +
|
||||
@var{count}}. @var{start} defaults to 0. @var{count} defaults to
|
||||
@math{@code{(string-length @var{string})} - @var{start}}$. The
|
||||
@code{put-string} procedure writes the @var{count} characters of
|
||||
@var{string} starting at index @var{start} to the port. The
|
||||
@code{put-string} procedure returns an unspecified value.
|
||||
@end deffn
|
||||
|
||||
@deffn {Scheme Procedure} put-datum port datum
|
||||
@var{datum} should be a datum value. The @code{put-datum} procedure
|
||||
writes an external representation of @var{datum} to
|
||||
@var{textual-output-port}. The specific external representation is
|
||||
implementation-dependent. However, whenever possible, an implementation
|
||||
should produce a representation for which @code{get-datum}, when reading
|
||||
the representation, will return an object equal (in the sense of
|
||||
@code{equal?}) to @var{datum}.
|
||||
|
||||
@quotation Note
|
||||
Not all datums may allow producing an external representation for which
|
||||
@code{get-datum} will produce an object that is equal to the
|
||||
original. Specifically, NaNs contained in @var{datum} may make
|
||||
this impossible.
|
||||
@end quotation
|
||||
|
||||
@quotation Note
|
||||
The @code{put-datum} procedure merely writes the external
|
||||
representation, but no trailing delimiter. If @code{put-datum} is
|
||||
used to write several subsequent external representations to an
|
||||
output port, care should be taken to delimit them properly so they can
|
||||
be read back in by subsequent calls to @code{get-datum}.
|
||||
@end quotation
|
||||
@end deffn
|
||||
|
||||
@node I/O Extensions
|
||||
@subsection Using and Extending Ports in C
|
||||
|
@ -1690,7 +2338,6 @@ Set using
|
|||
|
||||
@end table
|
||||
|
||||
|
||||
@c Local Variables:
|
||||
@c TeX-master: "guile.texi"
|
||||
@c End:
|
||||
|
|
|
@ -1428,8 +1428,21 @@ functionality is documented in its own section of the manual;
|
|||
|
||||
The @code{(rnrs io simple (6))} library provides convenience functions
|
||||
for performing textual I/O on ports. This library also exports all of
|
||||
the condition types and associated procedures described in
|
||||
(@pxref{I/O Conditions}).
|
||||
the condition types and associated procedures described in (@pxref{I/O
|
||||
Conditions}). In the context of this section, when stating that a
|
||||
procedure behaves ``identically'' to the corresponding procedure in
|
||||
Guile's core library, this is modulo the behavior wrt. conditions: such
|
||||
procedures raise the appropriate R6RS conditions in case of error, but
|
||||
otherwise behave identically.
|
||||
|
||||
@c FIXME: remove the following note when proper condition behavior has
|
||||
@c been verified.
|
||||
|
||||
@quotation Note
|
||||
There are still known issues regarding condition-correctness; some
|
||||
errors may still be thrown as native Guile exceptions instead of the
|
||||
appropriate R6RS conditions.
|
||||
@end quotation
|
||||
|
||||
@deffn {Scheme Procedure} eof-object
|
||||
@deffnx {Scheme Procedure} eof-object? obj
|
||||
|
|
|
@ -965,9 +965,12 @@ unpack (const ffi_type *type, void *loc, SCM x)
|
|||
}
|
||||
#undef FUNC_NAME
|
||||
|
||||
/* Return a Scheme representation of the foreign value at LOC of type TYPE. */
|
||||
/* Return a Scheme representation of the foreign value at LOC of type
|
||||
TYPE. When RETURN_VALUE_P is true, LOC is assumed to point to a
|
||||
return value buffer; otherwise LOC is assumed to point to an
|
||||
argument buffer. */
|
||||
static SCM
|
||||
pack (const ffi_type * type, const void *loc)
|
||||
pack (const ffi_type * type, const void *loc, int return_value_p)
|
||||
{
|
||||
switch (type->type)
|
||||
{
|
||||
|
@ -977,22 +980,48 @@ pack (const ffi_type * type, const void *loc)
|
|||
return scm_from_double (*(float *) loc);
|
||||
case FFI_TYPE_DOUBLE:
|
||||
return scm_from_double (*(double *) loc);
|
||||
|
||||
/* For integer return values smaller than `int', libffi stores the
|
||||
result in an `ffi_arg'-long buffer, of which only the
|
||||
significant bits must be kept---hence the pair of casts below.
|
||||
See <http://thread.gmane.org/gmane.comp.lib.ffi.general/406>
|
||||
for details. */
|
||||
|
||||
case FFI_TYPE_UINT8:
|
||||
return scm_from_uint8 (*(scm_t_uint8 *) loc);
|
||||
if (return_value_p)
|
||||
return scm_from_uint8 ((scm_t_uint8) *(ffi_arg *) loc);
|
||||
else
|
||||
return scm_from_uint8 (* (scm_t_uint8 *) loc);
|
||||
case FFI_TYPE_SINT8:
|
||||
return scm_from_int8 (*(scm_t_int8 *) loc);
|
||||
if (return_value_p)
|
||||
return scm_from_int8 ((scm_t_int8) *(ffi_arg *) loc);
|
||||
else
|
||||
return scm_from_int8 (* (scm_t_int8 *) loc);
|
||||
case FFI_TYPE_UINT16:
|
||||
return scm_from_uint16 (*(scm_t_uint16 *) loc);
|
||||
if (return_value_p)
|
||||
return scm_from_uint16 ((scm_t_uint16) *(ffi_arg *) loc);
|
||||
else
|
||||
return scm_from_uint16 (* (scm_t_uint16 *) loc);
|
||||
case FFI_TYPE_SINT16:
|
||||
return scm_from_int16 (*(scm_t_int16 *) loc);
|
||||
if (return_value_p)
|
||||
return scm_from_int16 ((scm_t_int16) *(ffi_arg *) loc);
|
||||
else
|
||||
return scm_from_int16 (* (scm_t_int16 *) loc);
|
||||
case FFI_TYPE_UINT32:
|
||||
return scm_from_uint32 (*(scm_t_uint32 *) loc);
|
||||
if (return_value_p)
|
||||
return scm_from_uint32 ((scm_t_uint32) *(ffi_arg *) loc);
|
||||
else
|
||||
return scm_from_uint32 (* (scm_t_uint32 *) loc);
|
||||
case FFI_TYPE_SINT32:
|
||||
return scm_from_int32 (*(scm_t_int32 *) loc);
|
||||
if (return_value_p)
|
||||
return scm_from_int32 ((scm_t_int32) *(ffi_arg *) loc);
|
||||
else
|
||||
return scm_from_int32 (* (scm_t_int32 *) loc);
|
||||
case FFI_TYPE_UINT64:
|
||||
return scm_from_uint64 (*(scm_t_uint64 *) loc);
|
||||
case FFI_TYPE_SINT64:
|
||||
return scm_from_int64 (*(scm_t_int64 *) loc);
|
||||
|
||||
case FFI_TYPE_STRUCT:
|
||||
{
|
||||
void *mem = scm_gc_malloc_pointerless (type->size, "foreign");
|
||||
|
@ -1060,7 +1089,7 @@ scm_i_foreign_call (SCM foreign, const SCM *argv)
|
|||
/* off we go! */
|
||||
ffi_call (cif, func, rvalue, args);
|
||||
|
||||
return pack (cif->rtype, rvalue);
|
||||
return pack (cif->rtype, rvalue, 1);
|
||||
}
|
||||
|
||||
|
||||
|
@ -1082,7 +1111,7 @@ invoke_closure (ffi_cif *cif, void *ret, void **args, void *data)
|
|||
|
||||
/* Pack ARGS to SCM values, setting ARGV pointers. */
|
||||
for (i = 0; i < cif->nargs; i++)
|
||||
argv[i] = pack (cif->arg_types[i], args[i]);
|
||||
argv[i] = pack (cif->arg_types[i], args[i], 0);
|
||||
|
||||
result = scm_call_n (proc, argv, cif->nargs);
|
||||
|
||||
|
|
|
@ -26,6 +26,7 @@
|
|||
#include <wchar.h>
|
||||
#endif
|
||||
|
||||
#include <math.h>
|
||||
#include <unistr.h>
|
||||
|
||||
#include "libguile/_scm.h"
|
||||
|
@ -192,7 +193,7 @@ scm_hasher(SCM obj, unsigned long n, size_t d)
|
|||
case scm_tc16_real:
|
||||
{
|
||||
double r = SCM_REAL_VALUE (obj);
|
||||
if (floor (r) == r)
|
||||
if (floor (r) == r && !isinf (r) && !isnan (r))
|
||||
{
|
||||
obj = scm_inexact_to_exact (obj);
|
||||
return scm_to_ulong (scm_modulo (obj, scm_from_ulong (n)));
|
||||
|
|
|
@ -3,7 +3,8 @@
|
|||
#ifndef SCM_INLINE_H
|
||||
#define SCM_INLINE_H
|
||||
|
||||
/* Copyright (C) 2001, 2002, 2003, 2004, 2006, 2008, 2009, 2010 Free Software Foundation, Inc.
|
||||
/* Copyright (C) 2001, 2002, 2003, 2004, 2006, 2008, 2009, 2010,
|
||||
* 2011 Free Software Foundation, Inc.
|
||||
*
|
||||
* This library is free software; you can redistribute it and/or
|
||||
* modify it under the terms of the GNU Lesser General Public License
|
||||
|
@ -98,6 +99,7 @@ SCM_API int scm_is_pair (SCM x);
|
|||
SCM_API int scm_is_string (SCM x);
|
||||
|
||||
SCM_API int scm_get_byte_or_eof (SCM port);
|
||||
SCM_API int scm_peek_byte_or_eof (SCM port);
|
||||
SCM_API void scm_putc (char c, SCM port);
|
||||
SCM_API void scm_puts (const char *str_data, SCM port);
|
||||
|
||||
|
@ -362,7 +364,7 @@ scm_get_byte_or_eof (SCM port)
|
|||
|
||||
if (pt->read_pos >= pt->read_end)
|
||||
{
|
||||
if (scm_fill_input (port) == EOF)
|
||||
if (SCM_UNLIKELY (scm_fill_input (port) == EOF))
|
||||
return EOF;
|
||||
}
|
||||
|
||||
|
@ -371,6 +373,34 @@ scm_get_byte_or_eof (SCM port)
|
|||
return c;
|
||||
}
|
||||
|
||||
/* Like `scm_get_byte_or_eof' but does not change PORT's `read_pos'. */
|
||||
#ifndef SCM_INLINE_C_INCLUDING_INLINE_H
|
||||
SCM_C_EXTERN_INLINE
|
||||
#endif
|
||||
int
|
||||
scm_peek_byte_or_eof (SCM port)
|
||||
{
|
||||
int c;
|
||||
scm_t_port *pt = SCM_PTAB_ENTRY (port);
|
||||
|
||||
if (pt->rw_active == SCM_PORT_WRITE)
|
||||
/* may be marginally faster than calling scm_flush. */
|
||||
scm_ptobs[SCM_PTOBNUM (port)].flush (port);
|
||||
|
||||
if (pt->rw_random)
|
||||
pt->rw_active = SCM_PORT_READ;
|
||||
|
||||
if (pt->read_pos >= pt->read_end)
|
||||
{
|
||||
if (SCM_UNLIKELY (scm_fill_input (port) == EOF))
|
||||
return EOF;
|
||||
}
|
||||
|
||||
c = *pt->read_pos;
|
||||
|
||||
return c;
|
||||
}
|
||||
|
||||
#ifndef SCM_INLINE_C_INCLUDING_INLINE_H
|
||||
SCM_C_EXTERN_INLINE
|
||||
#endif
|
||||
|
|
250
libguile/ports.c
250
libguile/ports.c
|
@ -1057,6 +1057,7 @@ update_port_lf (scm_t_wchar c, SCM port)
|
|||
switch (c)
|
||||
{
|
||||
case '\a':
|
||||
case EOF:
|
||||
break;
|
||||
case '\b':
|
||||
SCM_DECCOL (port);
|
||||
|
@ -1115,23 +1116,154 @@ utf8_to_codepoint (const scm_t_uint8 *utf8_buf, size_t size)
|
|||
return codepoint;
|
||||
}
|
||||
|
||||
/* Read a codepoint from PORT and return it in *CODEPOINT. Fill BUF
|
||||
with the byte representation of the codepoint in PORT's encoding, and
|
||||
set *LEN to the length in bytes of that representation. Return 0 on
|
||||
success and an errno value on error. */
|
||||
/* Read a UTF-8 sequence from PORT. On success, return 0 and set
|
||||
*CODEPOINT to the codepoint that was read, fill BUF with its UTF-8
|
||||
representation, and set *LEN to the length in bytes. Return
|
||||
`EILSEQ' on error. */
|
||||
static int
|
||||
get_codepoint (SCM port, scm_t_wchar *codepoint,
|
||||
char buf[SCM_MBCHAR_BUF_SIZE], size_t *len)
|
||||
get_utf8_codepoint (SCM port, scm_t_wchar *codepoint,
|
||||
scm_t_uint8 buf[SCM_MBCHAR_BUF_SIZE], size_t *len)
|
||||
{
|
||||
#define ASSERT_NOT_EOF(b) \
|
||||
if (SCM_UNLIKELY ((b) == EOF)) \
|
||||
goto invalid_seq
|
||||
#define CONSUME_PEEKED_BYTE() \
|
||||
pt->read_pos++
|
||||
|
||||
int byte;
|
||||
scm_t_port *pt;
|
||||
|
||||
*len = 0;
|
||||
pt = SCM_PTAB_ENTRY (port);
|
||||
|
||||
byte = scm_get_byte_or_eof (port);
|
||||
if (byte == EOF)
|
||||
{
|
||||
*codepoint = EOF;
|
||||
return 0;
|
||||
}
|
||||
|
||||
buf[0] = (scm_t_uint8) byte;
|
||||
*len = 1;
|
||||
|
||||
if (buf[0] <= 0x7f)
|
||||
/* 1-byte form. */
|
||||
*codepoint = buf[0];
|
||||
else if (buf[0] >= 0xc2 && buf[0] <= 0xdf)
|
||||
{
|
||||
/* 2-byte form. */
|
||||
byte = scm_peek_byte_or_eof (port);
|
||||
ASSERT_NOT_EOF (byte);
|
||||
|
||||
if (SCM_UNLIKELY ((byte & 0xc0) != 0x80))
|
||||
goto invalid_seq;
|
||||
|
||||
CONSUME_PEEKED_BYTE ();
|
||||
buf[1] = (scm_t_uint8) byte;
|
||||
*len = 2;
|
||||
|
||||
*codepoint = ((scm_t_wchar) buf[0] & 0x1f) << 6UL
|
||||
| (buf[1] & 0x3f);
|
||||
}
|
||||
else if ((buf[0] & 0xf0) == 0xe0)
|
||||
{
|
||||
/* 3-byte form. */
|
||||
byte = scm_peek_byte_or_eof (port);
|
||||
ASSERT_NOT_EOF (byte);
|
||||
|
||||
if (SCM_UNLIKELY ((byte & 0xc0) != 0x80
|
||||
|| (buf[0] == 0xe0 && byte < 0xa0)
|
||||
|| (buf[0] == 0xed && byte > 0x9f)))
|
||||
goto invalid_seq;
|
||||
|
||||
CONSUME_PEEKED_BYTE ();
|
||||
buf[1] = (scm_t_uint8) byte;
|
||||
*len = 2;
|
||||
|
||||
byte = scm_peek_byte_or_eof (port);
|
||||
ASSERT_NOT_EOF (byte);
|
||||
|
||||
if (SCM_UNLIKELY ((byte & 0xc0) != 0x80))
|
||||
goto invalid_seq;
|
||||
|
||||
CONSUME_PEEKED_BYTE ();
|
||||
buf[2] = (scm_t_uint8) byte;
|
||||
*len = 3;
|
||||
|
||||
*codepoint = ((scm_t_wchar) buf[0] & 0x0f) << 12UL
|
||||
| ((scm_t_wchar) buf[1] & 0x3f) << 6UL
|
||||
| (buf[2] & 0x3f);
|
||||
}
|
||||
else if (buf[0] >= 0xf0 && buf[0] <= 0xf4)
|
||||
{
|
||||
/* 4-byte form. */
|
||||
byte = scm_peek_byte_or_eof (port);
|
||||
ASSERT_NOT_EOF (byte);
|
||||
|
||||
if (SCM_UNLIKELY (((byte & 0xc0) != 0x80)
|
||||
|| (buf[0] == 0xf0 && byte < 0x90)
|
||||
|| (buf[0] == 0xf4 && byte > 0x8f)))
|
||||
goto invalid_seq;
|
||||
|
||||
CONSUME_PEEKED_BYTE ();
|
||||
buf[1] = (scm_t_uint8) byte;
|
||||
*len = 2;
|
||||
|
||||
byte = scm_peek_byte_or_eof (port);
|
||||
ASSERT_NOT_EOF (byte);
|
||||
|
||||
if (SCM_UNLIKELY ((byte & 0xc0) != 0x80))
|
||||
goto invalid_seq;
|
||||
|
||||
CONSUME_PEEKED_BYTE ();
|
||||
buf[2] = (scm_t_uint8) byte;
|
||||
*len = 3;
|
||||
|
||||
byte = scm_peek_byte_or_eof (port);
|
||||
ASSERT_NOT_EOF (byte);
|
||||
|
||||
if (SCM_UNLIKELY ((byte & 0xc0) != 0x80))
|
||||
goto invalid_seq;
|
||||
|
||||
CONSUME_PEEKED_BYTE ();
|
||||
buf[3] = (scm_t_uint8) byte;
|
||||
*len = 4;
|
||||
|
||||
*codepoint = ((scm_t_wchar) buf[0] & 0x07) << 18UL
|
||||
| ((scm_t_wchar) buf[1] & 0x3f) << 12UL
|
||||
| ((scm_t_wchar) buf[2] & 0x3f) << 6UL
|
||||
| (buf[3] & 0x3f);
|
||||
}
|
||||
else
|
||||
goto invalid_seq;
|
||||
|
||||
return 0;
|
||||
|
||||
invalid_seq:
|
||||
/* Here we could choose the consume the faulty byte when it's not a
|
||||
valid starting byte, but it's not a requirement. What Section 3.9
|
||||
of Unicode 6.0.0 mandates, though, is to not consume a byte that
|
||||
would otherwise be a valid starting byte. */
|
||||
|
||||
return EILSEQ;
|
||||
|
||||
#undef CONSUME_PEEKED_BYTE
|
||||
#undef ASSERT_NOT_EOF
|
||||
}
|
||||
|
||||
/* Likewise, read a byte sequence from PORT, passing it through its
|
||||
input conversion descriptor. */
|
||||
static int
|
||||
get_iconv_codepoint (SCM port, scm_t_wchar *codepoint,
|
||||
char buf[SCM_MBCHAR_BUF_SIZE], size_t *len)
|
||||
{
|
||||
scm_t_port *pt;
|
||||
int err, byte_read;
|
||||
size_t bytes_consumed, output_size;
|
||||
char *output;
|
||||
scm_t_uint8 utf8_buf[SCM_MBCHAR_BUF_SIZE];
|
||||
scm_t_port *pt = SCM_PTAB_ENTRY (port);
|
||||
|
||||
if (SCM_UNLIKELY (pt->input_cd == (iconv_t) -1))
|
||||
/* Initialize the conversion descriptors. */
|
||||
scm_i_set_port_encoding_x (port, pt->encoding);
|
||||
pt = SCM_PTAB_ENTRY (port);
|
||||
|
||||
for (output_size = 0, output = (char *) utf8_buf,
|
||||
bytes_consumed = 0, err = 0;
|
||||
|
@ -1177,30 +1309,45 @@ get_codepoint (SCM port, scm_t_wchar *codepoint,
|
|||
if (SCM_UNLIKELY (output_size == 0))
|
||||
/* An unterminated sequence. */
|
||||
err = EILSEQ;
|
||||
|
||||
if (SCM_UNLIKELY (err != 0))
|
||||
{
|
||||
/* Reset the `iconv' state. */
|
||||
iconv (pt->input_cd, NULL, NULL, NULL, NULL);
|
||||
|
||||
if (pt->ilseq_handler == SCM_ICONVEH_QUESTION_MARK)
|
||||
{
|
||||
*codepoint = '?';
|
||||
err = 0;
|
||||
}
|
||||
|
||||
/* Fail when the strategy is SCM_ICONVEH_ERROR or
|
||||
SCM_ICONVEH_ESCAPE_SEQUENCE (the latter doesn't make sense for
|
||||
input encoding errors.) */
|
||||
}
|
||||
else
|
||||
else if (SCM_LIKELY (err == 0))
|
||||
{
|
||||
/* Convert the UTF8_BUF sequence to a Unicode code point. */
|
||||
*codepoint = utf8_to_codepoint (utf8_buf, output_size);
|
||||
update_port_lf (*codepoint, port);
|
||||
*len = bytes_consumed;
|
||||
}
|
||||
|
||||
*len = bytes_consumed;
|
||||
return err;
|
||||
}
|
||||
|
||||
/* Read a codepoint from PORT and return it in *CODEPOINT. Fill BUF
|
||||
with the byte representation of the codepoint in PORT's encoding, and
|
||||
set *LEN to the length in bytes of that representation. Return 0 on
|
||||
success and an errno value on error. */
|
||||
static int
|
||||
get_codepoint (SCM port, scm_t_wchar *codepoint,
|
||||
char buf[SCM_MBCHAR_BUF_SIZE], size_t *len)
|
||||
{
|
||||
int err;
|
||||
scm_t_port *pt = SCM_PTAB_ENTRY (port);
|
||||
|
||||
if (pt->input_cd == (iconv_t) -1)
|
||||
/* Initialize the conversion descriptors, if needed. */
|
||||
scm_i_set_port_encoding_x (port, pt->encoding);
|
||||
|
||||
/* FIXME: In 2.1, add a flag to determine whether a port is UTF-8. */
|
||||
if (pt->input_cd == (iconv_t) -1)
|
||||
err = get_utf8_codepoint (port, codepoint, (scm_t_uint8 *) buf, len);
|
||||
else
|
||||
err = get_iconv_codepoint (port, codepoint, buf, len);
|
||||
|
||||
if (SCM_LIKELY (err == 0))
|
||||
update_port_lf (*codepoint, port);
|
||||
else if (pt->ilseq_handler == SCM_ICONVEH_QUESTION_MARK)
|
||||
{
|
||||
*codepoint = '?';
|
||||
err = 0;
|
||||
update_port_lf (*codepoint, port);
|
||||
}
|
||||
|
||||
return err;
|
||||
}
|
||||
|
@ -2031,28 +2178,35 @@ scm_i_set_port_encoding_x (SCM port, const char *encoding)
|
|||
if (encoding == NULL)
|
||||
encoding = "ISO-8859-1";
|
||||
|
||||
pt->encoding = scm_gc_strdup (encoding, "port");
|
||||
if (pt->encoding != encoding)
|
||||
pt->encoding = scm_gc_strdup (encoding, "port");
|
||||
|
||||
if (SCM_CELL_WORD_0 (port) & SCM_RDNG)
|
||||
/* If ENCODING is UTF-8, then no conversion descriptor is opened
|
||||
because we do I/O ourselves. This saves 100+ KiB for each
|
||||
descriptor. */
|
||||
if (strcmp (encoding, "UTF-8"))
|
||||
{
|
||||
/* Open an input iconv conversion descriptor, from ENCODING
|
||||
to UTF-8. We choose UTF-8, not UTF-32, because iconv
|
||||
implementations can typically convert from anything to
|
||||
UTF-8, but not to UTF-32 (see
|
||||
<http://lists.gnu.org/archive/html/bug-libunistring/2010-09/msg00007.html>). */
|
||||
new_input_cd = iconv_open ("UTF-8", encoding);
|
||||
if (new_input_cd == (iconv_t) -1)
|
||||
goto invalid_encoding;
|
||||
}
|
||||
|
||||
if (SCM_CELL_WORD_0 (port) & SCM_WRTNG)
|
||||
{
|
||||
new_output_cd = iconv_open (encoding, "UTF-8");
|
||||
if (new_output_cd == (iconv_t) -1)
|
||||
if (SCM_CELL_WORD_0 (port) & SCM_RDNG)
|
||||
{
|
||||
if (new_input_cd != (iconv_t) -1)
|
||||
iconv_close (new_input_cd);
|
||||
goto invalid_encoding;
|
||||
/* Open an input iconv conversion descriptor, from ENCODING
|
||||
to UTF-8. We choose UTF-8, not UTF-32, because iconv
|
||||
implementations can typically convert from anything to
|
||||
UTF-8, but not to UTF-32 (see
|
||||
<http://lists.gnu.org/archive/html/bug-libunistring/2010-09/msg00007.html>). */
|
||||
new_input_cd = iconv_open ("UTF-8", encoding);
|
||||
if (new_input_cd == (iconv_t) -1)
|
||||
goto invalid_encoding;
|
||||
}
|
||||
|
||||
if (SCM_CELL_WORD_0 (port) & SCM_WRTNG)
|
||||
{
|
||||
new_output_cd = iconv_open (encoding, "UTF-8");
|
||||
if (new_output_cd == (iconv_t) -1)
|
||||
{
|
||||
if (new_input_cd != (iconv_t) -1)
|
||||
iconv_close (new_input_cd);
|
||||
goto invalid_encoding;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
|
|
@ -821,31 +821,57 @@ codepoint_to_utf8 (scm_t_wchar ch, scm_t_uint8 utf8[4])
|
|||
return len;
|
||||
}
|
||||
|
||||
/* Display the LEN codepoints in STR to PORT according to STRATEGY;
|
||||
return the number of codepoints successfully displayed. If NARROW_P,
|
||||
then STR is interpreted as a sequence of `char', denoting a Latin-1
|
||||
string; otherwise it's interpreted as a sequence of
|
||||
`scm_t_wchar'. */
|
||||
static size_t
|
||||
display_string (const void *str, int narrow_p,
|
||||
size_t len, SCM port,
|
||||
scm_t_string_failed_conversion_handler strategy)
|
||||
|
||||
{
|
||||
#define STR_REF(s, x) \
|
||||
(narrow_p \
|
||||
? (scm_t_wchar) ((unsigned char *) (s))[x] \
|
||||
: ((scm_t_wchar *) (s))[x])
|
||||
|
||||
/* Write STR to PORT as UTF-8. STR is a LEN-codepoint string; it is
|
||||
narrow if NARROW_P is true, wide otherwise. Return LEN. */
|
||||
static size_t
|
||||
display_string_as_utf8 (const void *str, int narrow_p, size_t len,
|
||||
SCM port)
|
||||
{
|
||||
size_t printed = 0;
|
||||
|
||||
while (len > printed)
|
||||
{
|
||||
size_t utf8_len, i;
|
||||
char *input, utf8_buf[256];
|
||||
|
||||
/* Convert STR to UTF-8. */
|
||||
for (i = printed, utf8_len = 0, input = utf8_buf;
|
||||
i < len && utf8_len + 4 < sizeof (utf8_buf);
|
||||
i++)
|
||||
{
|
||||
utf8_len += codepoint_to_utf8 (STR_REF (str, i),
|
||||
(scm_t_uint8 *) input);
|
||||
input = utf8_buf + utf8_len;
|
||||
}
|
||||
|
||||
/* INPUT was successfully converted, entirely; print the
|
||||
result. */
|
||||
scm_lfwrite (utf8_buf, utf8_len, port);
|
||||
printed += i - printed;
|
||||
}
|
||||
|
||||
assert (printed == len);
|
||||
|
||||
return len;
|
||||
}
|
||||
|
||||
/* Convert STR through PORT's output conversion descriptor and write the
|
||||
output to PORT. Return the number of codepoints written. */
|
||||
static size_t
|
||||
display_string_using_iconv (const void *str, int narrow_p, size_t len,
|
||||
SCM port,
|
||||
scm_t_string_failed_conversion_handler strategy)
|
||||
{
|
||||
size_t printed;
|
||||
scm_t_port *pt;
|
||||
|
||||
pt = SCM_PTAB_ENTRY (port);
|
||||
|
||||
if (SCM_UNLIKELY (pt->output_cd == (iconv_t) -1))
|
||||
/* Initialize the conversion descriptors. */
|
||||
scm_i_set_port_encoding_x (port, pt->encoding);
|
||||
|
||||
printed = 0;
|
||||
|
||||
while (len > printed)
|
||||
|
@ -928,7 +954,35 @@ display_string (const void *str, int narrow_p,
|
|||
}
|
||||
|
||||
return printed;
|
||||
}
|
||||
|
||||
#undef STR_REF
|
||||
|
||||
/* Display the LEN codepoints in STR to PORT according to STRATEGY;
|
||||
return the number of codepoints successfully displayed. If NARROW_P,
|
||||
then STR is interpreted as a sequence of `char', denoting a Latin-1
|
||||
string; otherwise it's interpreted as a sequence of
|
||||
`scm_t_wchar'. */
|
||||
static size_t
|
||||
display_string (const void *str, int narrow_p,
|
||||
size_t len, SCM port,
|
||||
scm_t_string_failed_conversion_handler strategy)
|
||||
|
||||
{
|
||||
scm_t_port *pt;
|
||||
|
||||
pt = SCM_PTAB_ENTRY (port);
|
||||
|
||||
if (pt->output_cd == (iconv_t) -1)
|
||||
/* Initialize the conversion descriptors, if needed. */
|
||||
scm_i_set_port_encoding_x (port, pt->encoding);
|
||||
|
||||
/* FIXME: In 2.1, add a flag to determine whether a port is UTF-8. */
|
||||
if (pt->output_cd == (iconv_t) -1)
|
||||
return display_string_as_utf8 (str, narrow_p, len, port);
|
||||
else
|
||||
return display_string_using_iconv (str, narrow_p, len,
|
||||
port, strategy);
|
||||
}
|
||||
|
||||
/* Attempt to display CH to PORT according to STRATEGY. Return non-zero
|
||||
|
|
|
@ -460,14 +460,11 @@ SCM_DEFINE (scm_lookahead_u8, "lookahead-u8", 1, 0, 0,
|
|||
|
||||
SCM_VALIDATE_BINARY_INPUT_PORT (1, port);
|
||||
|
||||
u8 = scm_get_byte_or_eof (port);
|
||||
u8 = scm_peek_byte_or_eof (port);
|
||||
if (u8 == EOF)
|
||||
result = SCM_EOF_VAL;
|
||||
else
|
||||
{
|
||||
scm_unget_byte (u8, port);
|
||||
result = SCM_I_MAKINUM ((scm_t_uint8) u8);
|
||||
}
|
||||
result = SCM_I_MAKINUM ((scm_t_uint8) u8);
|
||||
|
||||
return result;
|
||||
}
|
||||
|
|
|
@ -1135,7 +1135,7 @@ scm_read_scsh_block_comment (scm_t_wchar chr, SCM port)
|
|||
return SCM_UNSPECIFIED;
|
||||
}
|
||||
|
||||
static inline SCM
|
||||
static SCM
|
||||
scm_read_shebang (scm_t_wchar chr, SCM port)
|
||||
{
|
||||
int c = 0;
|
||||
|
|
|
@ -33,7 +33,7 @@
|
|||
vhash? vhash-cons vhash-consq vhash-consv
|
||||
vhash-assoc vhash-assq vhash-assv
|
||||
vhash-delete vhash-delq vhash-delv
|
||||
vhash-fold
|
||||
vhash-fold vhash-fold-right
|
||||
vhash-fold* vhash-foldq* vhash-foldv*
|
||||
alist->vhash))
|
||||
|
||||
|
@ -245,7 +245,14 @@ tail."
|
|||
(define (vlist-fold-right proc init vlist)
|
||||
"Fold over @var{vlist}, calling @var{proc} for each element, starting from
|
||||
the last element."
|
||||
(vlist-fold proc init (vlist-reverse vlist)))
|
||||
(define len (vlist-length vlist))
|
||||
|
||||
(let loop ((index (1- len))
|
||||
(result init))
|
||||
(if (< index 0)
|
||||
result
|
||||
(loop (1- index)
|
||||
(proc (vlist-ref vlist index) result)))))
|
||||
|
||||
(define (vlist-reverse vlist)
|
||||
"Return a new @var{vlist} whose content are those of @var{vlist} in reverse
|
||||
|
@ -553,6 +560,16 @@ with @var{equal?}."
|
|||
seed
|
||||
vhash))
|
||||
|
||||
(define (vhash-fold-right proc seed vhash)
|
||||
"Fold over the key/pair elements of @var{vhash}, starting from the 0th
|
||||
element. For each pair call @var{proc} as @code{(@var{proc} key value
|
||||
result)}."
|
||||
(vlist-fold-right (lambda (key+value result)
|
||||
(proc (car key+value) (cdr key+value)
|
||||
result))
|
||||
seed
|
||||
vhash))
|
||||
|
||||
(define* (alist->vhash alist #:optional (hash hash))
|
||||
"Return the vhash corresponding to @var{alist}, an association list."
|
||||
(fold-right (lambda (pair result)
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
;;; Guile VM assembler
|
||||
|
||||
;; Copyright (C) 2001, 2009, 2010 Free Software Foundation, Inc.
|
||||
;; Copyright (C) 2001, 2009, 2010, 2011 Free Software Foundation, Inc.
|
||||
|
||||
;;;; This library is free software; you can redistribute it and/or
|
||||
;;;; modify it under the terms of the GNU Lesser General Public
|
||||
|
@ -26,10 +26,36 @@
|
|||
#:use-module (system vm instruction)
|
||||
#:use-module ((system vm program) #:select (make-binding))
|
||||
#:use-module (ice-9 receive)
|
||||
#:use-module (ice-9 vlist)
|
||||
#:use-module ((srfi srfi-1) #:select (fold))
|
||||
#:use-module (rnrs bytevectors)
|
||||
#:export (compile-assembly))
|
||||
|
||||
;; Traversal helpers
|
||||
;;
|
||||
(define (vhash-fold-right2 proc vhash s0 s1)
|
||||
(let lp ((i (vlist-length vhash)) (s0 s0) (s1 s1))
|
||||
(if (zero? i)
|
||||
(values s0 s1)
|
||||
(receive (s0 s1) (let ((pair (vlist-ref vhash (1- i))))
|
||||
(proc (car pair) (cdr pair) s0 s1))
|
||||
(lp (1- i) s0 s1)))))
|
||||
|
||||
(define (fold2 proc ls s0 s1)
|
||||
(let lp ((ls ls) (s0 s0) (s1 s1))
|
||||
(if (null? ls)
|
||||
(values s0 s1)
|
||||
(receive (s0 s1) (proc (car ls) s0 s1)
|
||||
(lp (cdr ls) s0 s1)))))
|
||||
|
||||
(define (vector-fold2 proc vect s0 s1)
|
||||
(let ((len (vector-length vect)))
|
||||
(let lp ((i 0) (s0 s0) (s1 s1))
|
||||
(if (< i len)
|
||||
(receive (s0 s1) (proc (vector-ref vect i) s0 s1)
|
||||
(lp (1+ i) s0 s1))
|
||||
(values s0 s1)))))
|
||||
|
||||
;; Variable cache cells go in the object table, and serialize as their
|
||||
;; keys. The reason we wrap the keys in these records is so they don't
|
||||
;; compare as `equal?' to other objects in the object table.
|
||||
|
@ -38,13 +64,6 @@
|
|||
|
||||
(define-record <variable-cache-cell> key)
|
||||
|
||||
;; Subprograms can be loaded into an object table as well. We need a
|
||||
;; disjoint type here too. (Subprograms have their own object tables --
|
||||
;; though probably we should just make one table per compilation unit.)
|
||||
|
||||
(define-record <subprogram> table prog)
|
||||
|
||||
|
||||
(define (limn-sources sources)
|
||||
(let lp ((in sources) (out '()) (filename #f))
|
||||
(if (null? in)
|
||||
|
@ -68,16 +87,132 @@
|
|||
(else
|
||||
(lp (cdr in) out filename)))))))
|
||||
|
||||
|
||||
;; Avoid going through the compiler so as to avoid adding to the
|
||||
;; constant store.
|
||||
(define (make-meta bindings sources arities tail)
|
||||
;; sounds silly, but the only case in which we have no arities is when
|
||||
;; compiling a meta procedure.
|
||||
(if (and (null? bindings) (null? sources) (null? arities) (null? tail))
|
||||
#f
|
||||
(compile-assembly
|
||||
(make-glil-program '()
|
||||
(list
|
||||
(make-glil-const `(,bindings ,sources ,arities ,@tail))
|
||||
(make-glil-call 'return 1))))))
|
||||
(let ((body `(,@(dump-object `(,bindings ,sources ,arities ,@tail) 0)
|
||||
(return))))
|
||||
`(load-program ()
|
||||
,(addr+ 0 body)
|
||||
#f
|
||||
,@body)))
|
||||
|
||||
;; If this is true, the object doesn't need to go in a constant table.
|
||||
;;
|
||||
(define (immediate? x)
|
||||
(object->assembly x))
|
||||
|
||||
;; Note: in all of these procedures that build up constant tables, the
|
||||
;; first (zeroth) index is reserved. At runtime it is replaced with the
|
||||
;; procedure's module. Hence all of this 1+ length business.
|
||||
|
||||
;; Build up a vhash of constant -> index, allowing us to build up a
|
||||
;; constant table for a whole compilation unit.
|
||||
;;
|
||||
(define (build-constant-store x)
|
||||
(define (add-to-store store x)
|
||||
(define (add-to-end store x)
|
||||
(vhash-cons x (1+ (vlist-length store)) store))
|
||||
(cond
|
||||
((vhash-assoc x store)
|
||||
;; Already in the store.
|
||||
store)
|
||||
((immediate? x)
|
||||
;; Immediates don't need to go in the constant table.
|
||||
store)
|
||||
((or (number? x)
|
||||
(string? x)
|
||||
(symbol? x)
|
||||
(keyword? x))
|
||||
;; Atoms.
|
||||
(add-to-end store x))
|
||||
((variable-cache-cell? x)
|
||||
;; Variable cache cells (see below).
|
||||
(add-to-end (add-to-store store (variable-cache-cell-key x))
|
||||
x))
|
||||
((list? x)
|
||||
;; Add the elements to the store, then the list itself. We could
|
||||
;; try hashing the cdrs as well, but that seems a bit overkill, and
|
||||
;; this way we do compress the bytecode a bit by allowing the use of
|
||||
;; the `list' opcode.
|
||||
(let ((store (fold (lambda (x store)
|
||||
(add-to-store store x))
|
||||
store
|
||||
x)))
|
||||
(add-to-end store x)))
|
||||
((pair? x)
|
||||
;; Non-lists get caching on both fields.
|
||||
(let ((store (add-to-store (add-to-store store (car x))
|
||||
(cdr x))))
|
||||
(add-to-end store x)))
|
||||
((and (vector? x)
|
||||
(equal? (array-shape x) (list (list 0 (1- (vector-length x))))))
|
||||
;; Likewise, add the elements to the store, then the vector itself.
|
||||
;; Important for the vectors produced by the psyntax expansion
|
||||
;; process.
|
||||
(let ((store (fold (lambda (x store)
|
||||
(add-to-store store x))
|
||||
store
|
||||
(vector->list x))))
|
||||
(add-to-end store x)))
|
||||
((array? x)
|
||||
;; Naive assumption that if folks are using arrays, that perhaps
|
||||
;; there's not much more duplication.
|
||||
(add-to-end store x))
|
||||
(else
|
||||
(error "build-constant-store: unrecognized object" x))))
|
||||
|
||||
(let walk ((x x) (store vlist-null))
|
||||
(record-case x
|
||||
((<glil-program> meta body)
|
||||
(fold walk store body))
|
||||
((<glil-const> obj)
|
||||
(add-to-store store obj))
|
||||
((<glil-kw-prelude> kw)
|
||||
(add-to-store store kw))
|
||||
((<glil-toplevel> op name)
|
||||
;; We don't add toplevel variable cache cells to the global
|
||||
;; constant table, because they are sensitive to changes in
|
||||
;; modules as the toplevel expressions are evaluated. So we just
|
||||
;; add the name.
|
||||
(add-to-store store name))
|
||||
((<glil-module> op mod name public?)
|
||||
;; However, it is fine add module variable cache cells to the
|
||||
;; global table, as their bindings are not dependent on the
|
||||
;; current module.
|
||||
(add-to-store store
|
||||
(make-variable-cache-cell (list mod name public?))))
|
||||
(else store))))
|
||||
|
||||
;; Analyze one <glil-program> to determine its object table. Produces a
|
||||
;; vhash of constant to index.
|
||||
;;
|
||||
(define (build-object-table x)
|
||||
(define (add store x)
|
||||
(if (vhash-assoc x store)
|
||||
store
|
||||
(vhash-cons x (1+ (vlist-length store)) store)))
|
||||
(record-case x
|
||||
((<glil-program> meta body)
|
||||
(fold (lambda (x table)
|
||||
(record-case x
|
||||
((<glil-program> meta body)
|
||||
;; Add the GLIL itself to the table.
|
||||
(add table x))
|
||||
((<glil-const> obj)
|
||||
(if (immediate? obj)
|
||||
table
|
||||
(add table obj)))
|
||||
((<glil-kw-prelude> kw)
|
||||
(add table kw))
|
||||
((<glil-toplevel> op name)
|
||||
(add table (make-variable-cache-cell name)))
|
||||
((<glil-module> op mod name public?)
|
||||
(add table (make-variable-cache-cell (list mod name public?))))
|
||||
(else table)))
|
||||
vlist-null
|
||||
body))))
|
||||
|
||||
;; A functional stack of names of live variables.
|
||||
(define (make-open-binding name boxed? index)
|
||||
|
@ -115,21 +250,6 @@
|
|||
(lambda (x y) (< (car x) (car y)))))
|
||||
(close-all-bindings (close-binding bindings end) end)))
|
||||
|
||||
;; A functional object table.
|
||||
(define *module* 1)
|
||||
(define (assoc-ref-or-acons alist x make-y)
|
||||
(cond ((assoc-ref alist x)
|
||||
=> (lambda (y) (values y alist)))
|
||||
(else
|
||||
(let ((y (make-y x alist)))
|
||||
(values y (acons x y alist))))))
|
||||
(define (object-index-and-alist x alist)
|
||||
(assoc-ref-or-acons alist x
|
||||
(lambda (x alist)
|
||||
(+ (length alist) *module*))))
|
||||
(define (make-object-table objects)
|
||||
(and (not (null? objects))
|
||||
(list->vector (cons #f objects))))
|
||||
|
||||
;; A functional arities thingamajiggy.
|
||||
;; arities := ((ip nreq [[nopt] [[rest] [kw]]]]) ...)
|
||||
|
@ -152,82 +272,151 @@
|
|||
(open-arity start nreq nopt rest kw (close-arity end arities)))
|
||||
|
||||
(define (compile-assembly glil)
|
||||
(receive (code . _)
|
||||
(glil->assembly glil #t '(()) '() '() #f '() -1)
|
||||
(car code)))
|
||||
(let* ((all-constants (build-constant-store glil))
|
||||
(prog (compile-program glil all-constants))
|
||||
(len (byte-length prog)))
|
||||
;; The top objcode thunk. We're going to wrap this thunk in
|
||||
;; a thunk -- yo dawgs -- with the goal being to lift all
|
||||
;; constants up to the top level. The store forms a DAG, so
|
||||
;; we can actually build up later elements in terms of
|
||||
;; earlier ones.
|
||||
;;
|
||||
(cond
|
||||
((vlist-null? all-constants)
|
||||
;; No constants: just emit the inner thunk.
|
||||
prog)
|
||||
(else
|
||||
;; We have an object store, so write it out, attach it
|
||||
;; to the inner thunk, and tail call.
|
||||
(receive (tablecode addr) (dump-constants all-constants)
|
||||
(let ((prog (align-program prog addr)))
|
||||
;; Outer thunk.
|
||||
`(load-program ()
|
||||
,(+ (addr+ addr prog)
|
||||
2 ; for (tail-call 0)
|
||||
)
|
||||
#f
|
||||
;; Load the table, build the inner
|
||||
;; thunk, then tail call.
|
||||
,@tablecode
|
||||
,@prog
|
||||
(tail-call 0))))))))
|
||||
|
||||
(define (glil->assembly glil toplevel? bindings
|
||||
source-alist label-alist object-alist arities addr)
|
||||
(define (compile-program glil constants)
|
||||
(record-case glil
|
||||
((<glil-program> meta body)
|
||||
(let lp ((body body) (code '()) (bindings '(())) (source-alist '())
|
||||
(label-alist '()) (arities '()) (addr 0))
|
||||
(cond
|
||||
((null? body)
|
||||
(let ((code (fold append '() code))
|
||||
(bindings (close-all-bindings bindings addr))
|
||||
(sources (limn-sources (reverse! source-alist)))
|
||||
(labels (reverse label-alist))
|
||||
(arities (reverse (close-arity addr arities)))
|
||||
(len addr))
|
||||
(let* ((meta (make-meta bindings sources arities meta))
|
||||
(meta-pad (if meta (modulo (- 8 (modulo len 8)) 8) 0)))
|
||||
`(load-program ,labels
|
||||
,(+ len meta-pad)
|
||||
,meta
|
||||
,@code
|
||||
,@(if meta
|
||||
(make-list meta-pad '(nop))
|
||||
'())))))
|
||||
(else
|
||||
(receive (subcode bindings source-alist label-alist arities)
|
||||
(glil->assembly (car body) bindings
|
||||
source-alist label-alist
|
||||
constants arities addr)
|
||||
(lp (cdr body) (cons subcode code)
|
||||
bindings source-alist label-alist arities
|
||||
(addr+ addr subcode)))))))))
|
||||
|
||||
(define (compile-objtable constants table addr)
|
||||
(define (load-constant idx)
|
||||
(if (< idx 256)
|
||||
(values `((object-ref ,idx))
|
||||
2)
|
||||
(values `((long-object-ref
|
||||
,(quotient idx 256) ,(modulo idx 256)))
|
||||
3)))
|
||||
(cond
|
||||
((vlist-null? table)
|
||||
;; Empty table; just return #f.
|
||||
(values '((make-false))
|
||||
(1+ addr)))
|
||||
(else
|
||||
(call-with-values
|
||||
(lambda ()
|
||||
(vhash-fold-right2
|
||||
(lambda (obj idx codes addr)
|
||||
(cond
|
||||
((vhash-assoc obj constants)
|
||||
=> (lambda (pair)
|
||||
(receive (load len) (load-constant (cdr pair))
|
||||
(values (cons load codes)
|
||||
(+ addr len)))))
|
||||
((variable-cache-cell? obj)
|
||||
(cond
|
||||
((vhash-assoc (variable-cache-cell-key obj) constants)
|
||||
=> (lambda (pair)
|
||||
(receive (load len) (load-constant (cdr pair))
|
||||
(values (cons load codes)
|
||||
(+ addr len)))))
|
||||
(else (error "vcache cell key not in table" obj))))
|
||||
((glil-program? obj)
|
||||
;; Programs are not cached in the global constants
|
||||
;; table because when a program is loaded, its module
|
||||
;; is bound, and we want to do that only after any
|
||||
;; preceding effectful statements.
|
||||
(let* ((table (build-object-table obj))
|
||||
(prog (compile-program obj table)))
|
||||
(receive (tablecode addr)
|
||||
(compile-objtable constants table addr)
|
||||
(let ((prog (align-program prog addr)))
|
||||
(values (cons `(,@tablecode ,@prog)
|
||||
codes)
|
||||
(addr+ addr prog))))))
|
||||
(else
|
||||
(error "unrecognized constant" obj))))
|
||||
table
|
||||
'(((make-false))) (1+ addr)))
|
||||
(lambda (elts addr)
|
||||
(let ((len (1+ (vlist-length table))))
|
||||
(values
|
||||
(fold append
|
||||
`((vector ,(quotient len 256) ,(modulo len 256)))
|
||||
elts)
|
||||
(+ addr 3))))))))
|
||||
|
||||
(define (glil->assembly glil bindings source-alist label-alist
|
||||
constants arities addr)
|
||||
(define (emit-code x)
|
||||
(values x bindings source-alist label-alist object-alist arities))
|
||||
(define (emit-code/object x object-alist)
|
||||
(values x bindings source-alist label-alist object-alist arities))
|
||||
(values x bindings source-alist label-alist arities))
|
||||
(define (emit-object-ref i)
|
||||
(values (if (< i 256)
|
||||
`((object-ref ,i))
|
||||
`((long-object-ref ,(quotient i 256) ,(modulo i 256))))
|
||||
bindings source-alist label-alist arities))
|
||||
(define (emit-code/arity x nreq nopt rest kw)
|
||||
(values x bindings source-alist label-alist object-alist
|
||||
(values x bindings source-alist label-alist
|
||||
(begin-arity addr (addr+ addr x) nreq nopt rest kw arities)))
|
||||
|
||||
(record-case glil
|
||||
((<glil-program> meta body)
|
||||
(define (process-body)
|
||||
(let lp ((body body) (code '()) (bindings '(())) (source-alist '())
|
||||
(label-alist '()) (object-alist (if toplevel? #f '()))
|
||||
(arities '()) (addr 0))
|
||||
(cond
|
||||
((null? body)
|
||||
(values (reverse code)
|
||||
(close-all-bindings bindings addr)
|
||||
(limn-sources (reverse! source-alist))
|
||||
(reverse label-alist)
|
||||
(and object-alist (map car (reverse object-alist)))
|
||||
(reverse (close-arity addr arities))
|
||||
addr))
|
||||
(else
|
||||
(receive (subcode bindings source-alist label-alist object-alist
|
||||
arities)
|
||||
(glil->assembly (car body) #f bindings
|
||||
source-alist label-alist object-alist
|
||||
arities addr)
|
||||
(lp (cdr body) (append (reverse subcode) code)
|
||||
bindings source-alist label-alist object-alist arities
|
||||
(addr+ addr subcode)))))))
|
||||
|
||||
(receive (code bindings sources labels objects arities len)
|
||||
(process-body)
|
||||
(let* ((meta (make-meta bindings sources arities meta))
|
||||
(meta-pad (if meta (modulo (- 8 (modulo len 8)) 8) 0))
|
||||
(prog `(load-program ,labels
|
||||
,(+ len meta-pad)
|
||||
,meta
|
||||
,@code
|
||||
,@(if meta
|
||||
(make-list meta-pad '(nop))
|
||||
'()))))
|
||||
(cond
|
||||
(toplevel?
|
||||
;; toplevel bytecode isn't loaded by the vm, no way to do
|
||||
;; object table or closure capture (not in the bytecode,
|
||||
;; anyway)
|
||||
(emit-code (align-program prog addr)))
|
||||
(else
|
||||
(let ((table (make-object-table objects)))
|
||||
(cond
|
||||
(object-alist
|
||||
;; if we are being compiled from something with an object
|
||||
;; table, cache the program there
|
||||
(receive (i object-alist)
|
||||
(object-index-and-alist (make-subprogram table prog)
|
||||
object-alist)
|
||||
(emit-code/object `(,(if (< i 256)
|
||||
`(object-ref ,i)
|
||||
`(long-object-ref ,(quotient i 256)
|
||||
,(modulo i 256))))
|
||||
object-alist)))
|
||||
(else
|
||||
;; otherwise emit a load directly
|
||||
(let ((table-code (dump-object table addr)))
|
||||
(emit-code
|
||||
`(,@table-code
|
||||
,@(align-program prog (addr+ addr table-code)))))))))))))
|
||||
(cond
|
||||
((vhash-assoc glil constants)
|
||||
;; We are cached in someone's objtable; just emit a load.
|
||||
=> (lambda (pair)
|
||||
(emit-object-ref (cdr pair))))
|
||||
(else
|
||||
;; Otherwise, build an objtable for the program, compile it, and
|
||||
;; emit a load-program.
|
||||
(let* ((table (build-object-table glil))
|
||||
(prog (compile-program glil table)))
|
||||
(receive (tablecode addr) (compile-objtable constants table addr)
|
||||
(emit-code `(,@tablecode ,@(align-program prog addr))))))))
|
||||
|
||||
((<glil-std-prelude> nreq nlocs else-label)
|
||||
(emit-code/arity
|
||||
|
@ -277,61 +466,60 @@
|
|||
nreq nopt rest #f)))
|
||||
|
||||
((<glil-kw-prelude> nreq nopt rest kw allow-other-keys? nlocs else-label)
|
||||
(receive (kw-idx object-alist)
|
||||
(object-index-and-alist kw object-alist)
|
||||
(let* ((bind-required
|
||||
(if else-label
|
||||
`((br-if-nargs-lt ,(quotient nreq 256)
|
||||
,(modulo nreq 256)
|
||||
,else-label))
|
||||
`((assert-nargs-ge ,(quotient nreq 256)
|
||||
,(modulo nreq 256)))))
|
||||
(ntotal (apply max (+ nreq nopt) (map 1+ (map cdr kw))))
|
||||
(bind-optionals-and-shuffle
|
||||
`((bind-optionals/shuffle
|
||||
,(quotient nreq 256)
|
||||
,(modulo nreq 256)
|
||||
,(quotient (+ nreq nopt) 256)
|
||||
,(modulo (+ nreq nopt) 256)
|
||||
,(quotient ntotal 256)
|
||||
,(modulo ntotal 256))))
|
||||
(bind-kw
|
||||
;; when this code gets called, all optionals are filled
|
||||
;; in, space has been made for kwargs, and the kwargs
|
||||
;; themselves have been shuffled above the slots for all
|
||||
;; req/opt/kwargs locals.
|
||||
`((bind-kwargs
|
||||
,(quotient kw-idx 256)
|
||||
,(modulo kw-idx 256)
|
||||
,(quotient ntotal 256)
|
||||
,(modulo ntotal 256)
|
||||
,(logior (if rest 2 0)
|
||||
(if allow-other-keys? 1 0)))))
|
||||
(bind-rest
|
||||
(if rest
|
||||
`((bind-rest ,(quotient ntotal 256)
|
||||
,(modulo ntotal 256)
|
||||
,(quotient rest 256)
|
||||
,(modulo rest 256)))
|
||||
'())))
|
||||
(let* ((kw-idx (or (and=> (vhash-assoc kw constants) cdr)
|
||||
(error "kw not in objtable")))
|
||||
(bind-required
|
||||
(if else-label
|
||||
`((br-if-nargs-lt ,(quotient nreq 256)
|
||||
,(modulo nreq 256)
|
||||
,else-label))
|
||||
`((assert-nargs-ge ,(quotient nreq 256)
|
||||
,(modulo nreq 256)))))
|
||||
(ntotal (apply max (+ nreq nopt) (map 1+ (map cdr kw))))
|
||||
(bind-optionals-and-shuffle
|
||||
`((bind-optionals/shuffle
|
||||
,(quotient nreq 256)
|
||||
,(modulo nreq 256)
|
||||
,(quotient (+ nreq nopt) 256)
|
||||
,(modulo (+ nreq nopt) 256)
|
||||
,(quotient ntotal 256)
|
||||
,(modulo ntotal 256))))
|
||||
(bind-kw
|
||||
;; when this code gets called, all optionals are filled
|
||||
;; in, space has been made for kwargs, and the kwargs
|
||||
;; themselves have been shuffled above the slots for all
|
||||
;; req/opt/kwargs locals.
|
||||
`((bind-kwargs
|
||||
,(quotient kw-idx 256)
|
||||
,(modulo kw-idx 256)
|
||||
,(quotient ntotal 256)
|
||||
,(modulo ntotal 256)
|
||||
,(logior (if rest 2 0)
|
||||
(if allow-other-keys? 1 0)))))
|
||||
(bind-rest
|
||||
(if rest
|
||||
`((bind-rest ,(quotient ntotal 256)
|
||||
,(modulo ntotal 256)
|
||||
,(quotient rest 256)
|
||||
,(modulo rest 256)))
|
||||
'())))
|
||||
|
||||
(let ((code `(,@bind-required
|
||||
,@bind-optionals-and-shuffle
|
||||
,@bind-kw
|
||||
,@bind-rest
|
||||
(reserve-locals ,(quotient nlocs 256)
|
||||
,(modulo nlocs 256)))))
|
||||
(values code bindings source-alist label-alist object-alist
|
||||
(begin-arity addr (addr+ addr code) nreq nopt rest
|
||||
(and kw (cons allow-other-keys? kw))
|
||||
arities))))))
|
||||
(let ((code `(,@bind-required
|
||||
,@bind-optionals-and-shuffle
|
||||
,@bind-kw
|
||||
,@bind-rest
|
||||
(reserve-locals ,(quotient nlocs 256)
|
||||
,(modulo nlocs 256)))))
|
||||
(values code bindings source-alist label-alist
|
||||
(begin-arity addr (addr+ addr code) nreq nopt rest
|
||||
(and kw (cons allow-other-keys? kw))
|
||||
arities)))))
|
||||
|
||||
((<glil-bind> vars)
|
||||
(values '()
|
||||
(open-binding bindings vars addr)
|
||||
source-alist
|
||||
label-alist
|
||||
object-alist
|
||||
arities))
|
||||
|
||||
((<glil-mv-bind> vars rest)
|
||||
|
@ -340,13 +528,11 @@
|
|||
bindings
|
||||
source-alist
|
||||
label-alist
|
||||
object-alist
|
||||
arities)
|
||||
(values `((truncate-values ,(length vars) ,(if rest 1 0)))
|
||||
(open-binding bindings vars addr)
|
||||
source-alist
|
||||
label-alist
|
||||
object-alist
|
||||
arities)))
|
||||
|
||||
((<glil-unbind>)
|
||||
|
@ -354,7 +540,6 @@
|
|||
(close-binding bindings addr)
|
||||
source-alist
|
||||
label-alist
|
||||
object-alist
|
||||
arities))
|
||||
|
||||
((<glil-source> props)
|
||||
|
@ -362,7 +547,6 @@
|
|||
bindings
|
||||
(acons addr props source-alist)
|
||||
label-alist
|
||||
object-alist
|
||||
arities))
|
||||
|
||||
((<glil-void>)
|
||||
|
@ -373,16 +557,10 @@
|
|||
((object->assembly obj)
|
||||
=> (lambda (code)
|
||||
(emit-code (list code))))
|
||||
((not object-alist)
|
||||
(emit-code (dump-object obj addr)))
|
||||
(else
|
||||
(receive (i object-alist)
|
||||
(object-index-and-alist obj object-alist)
|
||||
(emit-code/object (if (< i 256)
|
||||
`((object-ref ,i))
|
||||
`((long-object-ref ,(quotient i 256)
|
||||
,(modulo i 256))))
|
||||
object-alist)))))
|
||||
((vhash-assoc obj constants)
|
||||
=> (lambda (pair)
|
||||
(emit-object-ref (cdr pair))))
|
||||
(else (error "const not in table" obj))))
|
||||
|
||||
((<glil-lexical> local? boxed? op index)
|
||||
(emit-code
|
||||
|
@ -442,30 +620,38 @@
|
|||
(case op
|
||||
((ref set)
|
||||
(cond
|
||||
((not object-alist)
|
||||
(emit-code `(,@(dump-object name addr)
|
||||
(link-now)
|
||||
,(case op
|
||||
((ref) '(variable-ref))
|
||||
((set) '(variable-set))))))
|
||||
((and=> (vhash-assoc (make-variable-cache-cell name) constants)
|
||||
cdr)
|
||||
=> (lambda (i)
|
||||
(emit-code (if (< i 256)
|
||||
`((,(case op
|
||||
((ref) 'toplevel-ref)
|
||||
((set) 'toplevel-set))
|
||||
,i))
|
||||
`((,(case op
|
||||
((ref) 'long-toplevel-ref)
|
||||
((set) 'long-toplevel-set))
|
||||
,(quotient i 256)
|
||||
,(modulo i 256)))))))
|
||||
(else
|
||||
(receive (i object-alist)
|
||||
(object-index-and-alist (make-variable-cache-cell name)
|
||||
object-alist)
|
||||
(emit-code/object (if (< i 256)
|
||||
`((,(case op
|
||||
((ref) 'toplevel-ref)
|
||||
((set) 'toplevel-set))
|
||||
,i))
|
||||
`((,(case op
|
||||
((ref) 'long-toplevel-ref)
|
||||
((set) 'long-toplevel-set))
|
||||
,(quotient i 256)
|
||||
,(modulo i 256))))
|
||||
object-alist)))))
|
||||
(let ((i (or (and=> (vhash-assoc name constants) cdr)
|
||||
(error "toplevel name not in objtable" name))))
|
||||
(emit-code `(,(if (< i 256)
|
||||
`(object-ref ,i)
|
||||
`(long-object-ref ,(quotient i 256)
|
||||
,(modulo i 256)))
|
||||
(link-now)
|
||||
,(case op
|
||||
((ref) '(variable-ref))
|
||||
((set) '(variable-set)))))))))
|
||||
((define)
|
||||
(emit-code `(,@(dump-object name addr)
|
||||
(define))))
|
||||
(let ((i (or (and=> (vhash-assoc name constants) cdr)
|
||||
(error "toplevel name not in objtable" name))))
|
||||
(emit-code `(,(if (< i 256)
|
||||
`(object-ref ,i)
|
||||
`(long-object-ref ,(quotient i 256)
|
||||
,(modulo i 256)))
|
||||
(define)))))
|
||||
(else
|
||||
(error "unknown toplevel var kind" op name))))
|
||||
|
||||
|
@ -473,21 +659,19 @@
|
|||
(let ((key (list mod name public?)))
|
||||
(case op
|
||||
((ref set)
|
||||
(cond
|
||||
((not object-alist)
|
||||
(emit-code `(,@(dump-object key addr)
|
||||
(link-now)
|
||||
,(case op
|
||||
((ref) '(variable-ref))
|
||||
((set) '(variable-set))))))
|
||||
(else
|
||||
(receive (i object-alist)
|
||||
(object-index-and-alist (make-variable-cache-cell key)
|
||||
object-alist)
|
||||
(emit-code/object (case op
|
||||
((ref) `((toplevel-ref ,i)))
|
||||
((set) `((toplevel-set ,i))))
|
||||
object-alist)))))
|
||||
(let ((i (or (and=> (vhash-assoc (make-variable-cache-cell key)
|
||||
constants) cdr)
|
||||
(error "module vcache not in objtable" key))))
|
||||
(emit-code (if (< i 256)
|
||||
`((,(case op
|
||||
((ref) 'toplevel-ref)
|
||||
((set) 'toplevel-set))
|
||||
,i))
|
||||
`((,(case op
|
||||
((ref) 'long-toplevel-ref)
|
||||
((set) 'long-toplevel-set))
|
||||
,(quotient i 256)
|
||||
,(modulo i 256)))))))
|
||||
(else
|
||||
(error "unknown module var kind" op key)))))
|
||||
|
||||
|
@ -497,7 +681,6 @@
|
|||
bindings
|
||||
source-alist
|
||||
(acons label (addr+ addr code) label-alist)
|
||||
object-alist
|
||||
arities)))
|
||||
|
||||
((<glil-branch> inst label)
|
||||
|
@ -533,11 +716,6 @@
|
|||
(cond
|
||||
((object->assembly x) => list)
|
||||
((variable-cache-cell? x) (dump-object (variable-cache-cell-key x) addr))
|
||||
((subprogram? x)
|
||||
(let ((table-code (dump-object (subprogram-table x) addr)))
|
||||
`(,@table-code
|
||||
,@(align-program (subprogram-prog x)
|
||||
(addr+ addr table-code)))))
|
||||
((number? x)
|
||||
`((load-number ,(number->string x))))
|
||||
((string? x)
|
||||
|
@ -608,5 +786,153 @@
|
|||
,(logand #xff len)))
|
||||
codes)))))
|
||||
(else
|
||||
(error "assemble: unrecognized object" x))))
|
||||
(error "dump-object: unrecognized object" x))))
|
||||
|
||||
(define (dump-constants constants)
|
||||
(define (ref-or-dump x i addr)
|
||||
(let ((pair (vhash-assoc x constants)))
|
||||
(if (and pair (< (cdr pair) i))
|
||||
(let ((idx (cdr pair)))
|
||||
(if (< idx 256)
|
||||
(values `((object-ref ,idx))
|
||||
(+ addr 2))
|
||||
(values `((long-object-ref ,(quotient idx 256)
|
||||
,(modulo idx 256)))
|
||||
(+ addr 3))))
|
||||
(dump1 x i addr))))
|
||||
(define (dump1 x i addr)
|
||||
(cond
|
||||
((object->assembly x)
|
||||
=> (lambda (code)
|
||||
(values (list code)
|
||||
(+ (byte-length code) addr))))
|
||||
((or (number? x)
|
||||
(string? x)
|
||||
(symbol? x)
|
||||
(keyword? x))
|
||||
;; Atoms.
|
||||
(let ((code (dump-object x addr)))
|
||||
(values code (addr+ addr code))))
|
||||
((variable-cache-cell? x)
|
||||
(dump1 (variable-cache-cell-key x) i addr))
|
||||
((list? x)
|
||||
(receive (codes addr)
|
||||
(fold2 (lambda (x codes addr)
|
||||
(receive (subcode addr) (ref-or-dump x i addr)
|
||||
(values (cons subcode codes) addr)))
|
||||
x '() addr)
|
||||
(values (fold append
|
||||
(let ((len (length x)))
|
||||
`((list ,(quotient len 256) ,(modulo len 256))))
|
||||
codes)
|
||||
(+ addr 3))))
|
||||
((pair? x)
|
||||
(receive (car-code addr) (ref-or-dump (car x) i addr)
|
||||
(receive (cdr-code addr) (ref-or-dump (cdr x) i addr)
|
||||
(values `(,@car-code ,@cdr-code (cons))
|
||||
(1+ addr)))))
|
||||
((and (vector? x)
|
||||
(equal? (array-shape x) (list (list 0 (1- (vector-length x))))))
|
||||
(receive (codes addr)
|
||||
(vector-fold2 (lambda (x codes addr)
|
||||
(receive (subcode addr) (ref-or-dump x i addr)
|
||||
(values (cons subcode codes) addr)))
|
||||
x '() addr)
|
||||
(values (fold append
|
||||
(let ((len (vector-length x)))
|
||||
`((vector ,(quotient len 256) ,(modulo len 256))))
|
||||
codes)
|
||||
(+ addr 3))))
|
||||
((and (array? x) (symbol? (array-type x)))
|
||||
(receive (type addr) (ref-or-dump (array-type x) i addr)
|
||||
(receive (shape addr) (ref-or-dump (array-shape x) i addr)
|
||||
(let ((bv (align-code `(load-array ,(uniform-array->bytevector x))
|
||||
addr 8 4)))
|
||||
(values `(,@type ,@shape ,@bv)
|
||||
(addr+ addr bv))))))
|
||||
((array? x)
|
||||
(let ((contents (array-contents x)))
|
||||
(receive (codes addr)
|
||||
(vector-fold2 (lambda (x codes addr)
|
||||
(receive (subcode addr) (ref-or-dump x i addr)
|
||||
(values (cons subcode codes) addr)))
|
||||
x '() addr)
|
||||
(receive (shape addr) (ref-or-dump (array-shape x) i addr)
|
||||
(values (fold append
|
||||
(let ((len (vector-length contents)))
|
||||
`(,@shape
|
||||
(make-array ,(quotient (ash len -16) 256)
|
||||
,(logand #xff (ash len -8))
|
||||
,(logand #xff len))))
|
||||
codes)
|
||||
(+ addr 4))))))
|
||||
(else
|
||||
(error "write-table: unrecognized object" x))))
|
||||
|
||||
(receive (codes addr)
|
||||
(vhash-fold-right2 (lambda (obj idx code addr)
|
||||
;; The vector is on the stack. Dup it, push
|
||||
;; the index, push the val, then vector-set.
|
||||
(let ((pre `((dup)
|
||||
,(object->assembly idx))))
|
||||
(receive (valcode addr) (dump1 obj idx
|
||||
(addr+ addr pre))
|
||||
(values (cons* '((vector-set))
|
||||
valcode
|
||||
pre
|
||||
code)
|
||||
(1+ addr)))))
|
||||
constants
|
||||
'(((assert-nargs-ee/locals 1)
|
||||
;; Push the vector.
|
||||
(local-ref 0)))
|
||||
4)
|
||||
(let* ((len (1+ (vlist-length constants)))
|
||||
(pre-prog-addr (+ 2 ; reserve-locals
|
||||
len 3 ; empty vector
|
||||
2 ; local-set
|
||||
1 ; new-frame
|
||||
2 ; local-ref
|
||||
))
|
||||
(prog (align-program
|
||||
`(load-program ()
|
||||
,(+ addr 1)
|
||||
#f
|
||||
;; The `return' will be at the tail of the
|
||||
;; program. The vector is already pushed
|
||||
;; on the stack.
|
||||
. ,(fold append '((return)) codes))
|
||||
pre-prog-addr)))
|
||||
(values `(;; Reserve storage for the vector.
|
||||
(assert-nargs-ee/locals ,(logior 0 (ash 1 3)))
|
||||
;; Push the vector, and store it in slot 0.
|
||||
,@(make-list len '(make-false))
|
||||
(vector ,(quotient len 256) ,(modulo len 256))
|
||||
(local-set 0)
|
||||
;; Now we open the call frame.
|
||||
;;
|
||||
(new-frame)
|
||||
;; Now build a thunk to init the constants. It will
|
||||
;; have the unfinished constant table both as its
|
||||
;; argument and as its objtable. The former allows it
|
||||
;; to update the objtable, with vector-set!, and the
|
||||
;; latter allows init code to refer to previously set
|
||||
;; values.
|
||||
;;
|
||||
;; Grab the vector, to be the objtable.
|
||||
(local-ref 0)
|
||||
;; Now the load-program, properly aligned. Pops the vector.
|
||||
,@prog
|
||||
;; Grab the vector, as an argument this time.
|
||||
(local-ref 0)
|
||||
;; Call the init thunk with the vector as an arg.
|
||||
(call 1)
|
||||
;; The thunk also returns the vector. Leave it on the
|
||||
;; stack for compile-assembly to use.
|
||||
)
|
||||
;; The byte length of the init code, which we can
|
||||
;; determine without folding over the code again.
|
||||
(+ (addr+ pre-prog-addr prog) ; aligned program
|
||||
2 ; local-ref
|
||||
2 ; call
|
||||
)))))
|
||||
|
|
|
@ -110,7 +110,9 @@
|
|||
(rnrs files) ;for the condition types
|
||||
(srfi srfi-8)
|
||||
(ice-9 rdelim)
|
||||
(except (guile) raise))
|
||||
(except (guile) raise display)
|
||||
(prefix (only (guile) display)
|
||||
guile:))
|
||||
|
||||
|
||||
|
||||
|
@ -377,6 +379,12 @@ return the characters accumulated in that port."
|
|||
(else
|
||||
(display s port)))))
|
||||
|
||||
;; Defined here to be able to make use of `with-i/o-encoding-error', but
|
||||
;; not exported from here, but from `(rnrs io simple)'.
|
||||
(define* (display object #:optional (port (current-output-port)))
|
||||
(with-i/o-encoding-error
|
||||
(guile:display object port)))
|
||||
|
||||
|
||||
;;;
|
||||
;;; Textual input.
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
;;; simple.scm --- The R6RS simple I/O library
|
||||
|
||||
;; Copyright (C) 2010 Free Software Foundation, Inc.
|
||||
;; Copyright (C) 2010, 2011 Free Software Foundation, Inc.
|
||||
;;
|
||||
;; This library is free software; you can redistribute it and/or
|
||||
;; modify it under the terms of the GNU Lesser General Public
|
||||
|
@ -85,42 +85,76 @@
|
|||
|
||||
(import (only (rnrs io ports)
|
||||
call-with-port
|
||||
close-port
|
||||
open-file-input-port
|
||||
open-file-output-port
|
||||
eof-object
|
||||
eof-object?
|
||||
|
||||
eof-object?
|
||||
file-options
|
||||
native-transcoder
|
||||
get-char
|
||||
lookahead-char
|
||||
get-datum
|
||||
put-char
|
||||
put-datum
|
||||
|
||||
input-port?
|
||||
output-port?)
|
||||
(only (guile) @@
|
||||
current-input-port
|
||||
current-output-port
|
||||
current-error-port
|
||||
(only (guile)
|
||||
@@
|
||||
current-input-port
|
||||
current-output-port
|
||||
current-error-port
|
||||
|
||||
with-input-from-file
|
||||
with-output-to-file
|
||||
define*
|
||||
|
||||
open-input-file
|
||||
open-output-file
|
||||
|
||||
close-input-port
|
||||
close-output-port
|
||||
|
||||
read-char
|
||||
peek-char
|
||||
read
|
||||
write-char
|
||||
newline
|
||||
display
|
||||
write)
|
||||
with-input-from-port
|
||||
with-output-to-port)
|
||||
(rnrs base (6))
|
||||
(rnrs files (6)) ;for the condition types
|
||||
)
|
||||
|
||||
(define display (@@ (rnrs io ports) display))
|
||||
|
||||
(define (call-with-input-file filename proc)
|
||||
(call-with-port (open-file-input-port filename) proc))
|
||||
|
||||
(define (call-with-output-file filename proc)
|
||||
(call-with-port (open-file-output-port filename) proc))
|
||||
|
||||
)
|
||||
|
||||
(define (with-input-from-file filename thunk)
|
||||
(call-with-input-file filename
|
||||
(lambda (port) (with-input-from-port port thunk))))
|
||||
|
||||
(define (with-output-to-file filename thunk)
|
||||
(call-with-output-file filename
|
||||
(lambda (port) (with-output-to-port port thunk))))
|
||||
|
||||
(define (open-input-file filename)
|
||||
(open-file-input-port filename (file-options) (native-transcoder)))
|
||||
|
||||
(define (open-output-file filename)
|
||||
(open-file-output-port filename (file-options) (native-transcoder)))
|
||||
|
||||
(define close-input-port close-port)
|
||||
(define close-output-port close-port)
|
||||
|
||||
(define* (read-char #:optional (port (current-input-port)))
|
||||
(get-char port))
|
||||
|
||||
(define* (peek-char #:optional (port (current-input-port)))
|
||||
(lookahead-char port))
|
||||
|
||||
(define* (read #:optional (port (current-input-port)))
|
||||
(get-datum port))
|
||||
|
||||
(define* (write-char char #:optional (port (current-output-port)))
|
||||
(put-char port char))
|
||||
|
||||
(define* (newline #:optional (port (current-output-port)))
|
||||
(put-char port #\newline))
|
||||
|
||||
(define* (write object #:optional (port (current-output-port)))
|
||||
(put-datum port object))
|
||||
|
||||
)
|
||||
|
|
|
@ -566,6 +566,8 @@ has just one element then that's the return value."
|
|||
(mapn (cdr l1) (map cdr rest) (1- len)
|
||||
(cons (apply f (car l1) (map car rest)) out))))))))
|
||||
|
||||
(define map-in-order map)
|
||||
|
||||
(define for-each
|
||||
(case-lambda
|
||||
((f l)
|
||||
|
|
|
@ -124,24 +124,32 @@
|
|||
|
||||
(pass-if "pointer from bits"
|
||||
(let* ((bytes (iota (sizeof '*)))
|
||||
(bv (u8-list->bytevector bytes)))
|
||||
(bv (u8-list->bytevector bytes))
|
||||
(fold (case (native-endianness)
|
||||
((little) fold-right)
|
||||
((big) fold)
|
||||
(else (error "unsupported endianness")))))
|
||||
(= (pointer-address
|
||||
(make-pointer (bytevector-uint-ref bv 0 (native-endianness)
|
||||
(sizeof '*))))
|
||||
(fold-right (lambda (byte address)
|
||||
(+ byte (* 256 address)))
|
||||
0
|
||||
bytes))))
|
||||
(fold (lambda (byte address)
|
||||
(+ byte (* 256 address)))
|
||||
0
|
||||
bytes))))
|
||||
|
||||
(pass-if "dereference-pointer"
|
||||
(let* ((bytes (iota (sizeof '*)))
|
||||
(bv (u8-list->bytevector bytes)))
|
||||
(bv (u8-list->bytevector bytes))
|
||||
(fold (case (native-endianness)
|
||||
((little) fold-right)
|
||||
((big) fold)
|
||||
(else (error "unsupported endianness")))))
|
||||
(= (pointer-address
|
||||
(dereference-pointer (bytevector->pointer bv)))
|
||||
(fold-right (lambda (byte address)
|
||||
(+ byte (* 256 address)))
|
||||
0
|
||||
bytes)))))
|
||||
(fold (lambda (byte address)
|
||||
(+ byte (* 256 address)))
|
||||
0
|
||||
bytes)))))
|
||||
|
||||
|
||||
(with-test-prefix "pointer<->string"
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
;;;; hash.test --- test guile hashing -*- scheme -*-
|
||||
;;;;
|
||||
;;;; Copyright (C) 2004, 2005, 2006, 2008 Free Software Foundation, Inc.
|
||||
;;;; Copyright (C) 2004, 2005, 2006, 2008, 2011 Free Software Foundation, Inc.
|
||||
;;;;
|
||||
;;;; This library is free software; you can redistribute it and/or
|
||||
;;;; modify it under the terms of the GNU Lesser General Public
|
||||
|
@ -32,7 +32,10 @@
|
|||
(hash #t 0))
|
||||
(pass-if (= 0 (hash #t 1)))
|
||||
(pass-if (= 0 (hash #f 1)))
|
||||
(pass-if (= 0 (hash noop 1))))
|
||||
(pass-if (= 0 (hash noop 1)))
|
||||
(pass-if (= 0 (hash +inf.0 1)))
|
||||
(pass-if (= 0 (hash -inf.0 1)))
|
||||
(pass-if (= 0 (hash +nan.0 1))))
|
||||
|
||||
;;;
|
||||
;;; hashv
|
||||
|
|
|
@ -572,21 +572,40 @@
|
|||
eof))
|
||||
|
||||
(test-decoding-error (#xc2 #x41 #x42) "UTF-8"
|
||||
(error ;; 41: should be in the 80..BF range
|
||||
;; Section 3.9 of Unicode 6.0.0 reads:
|
||||
;; "If the converter encounters an ill-formed UTF-8 code unit
|
||||
;; sequence which starts with a valid first byte, but which does
|
||||
;; not continue with valid successor bytes (see Table 3-7), it
|
||||
;; must not consume the successor bytes".
|
||||
;; Glibc/libiconv do not conform to it and instead swallow the
|
||||
;; #x41. This example appears literally in Section 3.9.
|
||||
(error ;; 41: invalid successor
|
||||
#\A ;; 41: valid starting byte
|
||||
#\B
|
||||
eof))
|
||||
|
||||
(test-decoding-error (#xe0 #x88 #x88) "UTF-8"
|
||||
(test-decoding-error (#xf0 #x80 #x80 #x41) "UTF-8"
|
||||
;; According to Unicode 6.0.0, Section 3.9, "the only formal
|
||||
;; requirement mandated by Unicode conformance for a converter is
|
||||
;; that the <41> be processed and correctly interpreted as
|
||||
;; <U+0041>".
|
||||
(error ;; 2nd byte should be in the A0..BF range
|
||||
error ;; 80: not a valid starting byte
|
||||
error ;; 80: not a valid starting byte
|
||||
#\A
|
||||
eof))
|
||||
|
||||
(test-decoding-error (#xe0 #xa0 #x41 #x42) "UTF-8"
|
||||
(error ;; 3rd byte should be in the 80..BF range
|
||||
#\A
|
||||
#\B
|
||||
eof))
|
||||
|
||||
(test-decoding-error (#xf0 #x88 #x88 #x88) "UTF-8"
|
||||
(error ;; 2nd byte should be in the 90..BF range
|
||||
error ;; 88: not a valid starting byte
|
||||
error ;; 88: not a valid starting byte
|
||||
error ;; 88: not a valid starting byte
|
||||
eof))))
|
||||
|
||||
(with-test-prefix "call-with-output-string"
|
||||
|
|
|
@ -301,6 +301,13 @@
|
|||
(alist (fold alist-cons '() keys values)))
|
||||
(equal? alist (reverse (vhash-fold alist-cons '() vh)))))
|
||||
|
||||
(pass-if "vhash-fold-right"
|
||||
(let* ((keys '(a b c d e f g d h i))
|
||||
(values '(1 2 3 4 5 6 7 0 8 9))
|
||||
(vh (fold vhash-cons vlist-null keys values))
|
||||
(alist (fold alist-cons '() keys values)))
|
||||
(equal? alist (vhash-fold-right alist-cons '() vh))))
|
||||
|
||||
(pass-if "alist->vhash"
|
||||
(let* ((keys '(a b c d e f g d h i))
|
||||
(values '(1 2 3 4 5 6 7 0 8 9))
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue