1
Fork 0
mirror of https://git.savannah.gnu.org/git/guile.git synced 2025-06-10 14:00:21 +02:00

Merge remote-tracking branch 'origin/stable-2.0'

This commit is contained in:
Andy Wingo 2011-05-09 00:13:04 +02:00
commit e690a3cbf2
20 changed files with 1726 additions and 359 deletions

View file

@ -1,7 +1,7 @@
;;; -*- mode: scheme; coding: utf-8; -*-
;;; SRFI-1.
;;;
;;; Copyright 2010 Free Software Foundation, Inc.
;;; Copyright 2010, 2011 Free Software Foundation, Inc.
;;;
;;; This program is free software; you can redistribute it and/or
;;; modify it under the terms of the GNU Lesser General Public License
@ -45,3 +45,20 @@
(benchmark "small" 2000000
(drop-while (lambda (n) #t) %small-list)))
(with-benchmark-prefix "map"
(benchmark "big" 30
(map (lambda (x) x) %big-list))
(benchmark "small" 2000000
(map (lambda (x) x) %small-list)))
(with-benchmark-prefix "for-each"
(benchmark "big" 30
(for-each (lambda (x) #f) %big-list))
(benchmark "small" 2000000
(for-each (lambda (x) #f) %small-list)))

View file

@ -1,7 +1,7 @@
@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2009, 2010
@c Free Software Foundation, Inc.
@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2005, 2006,
@c 2007, 2009, 2010, 2011 Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.
@node Compound Data Types
@ -3294,8 +3294,9 @@ Again the choice of @var{hash-proc} must be consistent with previous calls to
@end deffn
@deffn {Scheme Procedure} vhash-fold proc vhash
Fold over the key/pair elements of @var{vhash}. For each pair call @var{proc}
as @code{(@var{proc} key value result)}.
@deffnx {Scheme Procedure} vhash-fold-right proc vhash
Fold over the key/value elements of @var{vhash} in the given direction.
For each pair call @var{proc} as @code{(@var{proc} key value result)}.
@end deffn
@deffn {Scheme Procedure} vhash-fold* proc init key vhash [equal? [hash]]

View file

@ -1152,22 +1152,364 @@ The I/O port API of the @uref{http://www.r6rs.org/, Revised Report^6 on
the Algorithmic Language Scheme (R6RS)} is provided by the @code{(rnrs
io ports)} module. It provides features, such as binary I/O and Unicode
string I/O, that complement or refine Guile's historical port API
presented above (@pxref{Input and Output}).
presented above (@pxref{Input and Output}). Note that R6RS ports are not
disjoint from Guile's native ports, so Guile-specific procedures will
work on ports created using the R6RS API, and vice versa.
The text in this section is taken from the R6RS standard libraries
document, with only minor adaptions for inclusion in this manual. The
Guile developers offer their thanks to the R6RS editors for having
provided the report's text under permissive conditions making this
possible.
@c FIXME: Update description when implemented.
@emph{Note}: The implementation of this R6RS API is not complete yet.
@menu
* R6RS File Names:: File names.
* R6RS File Options:: Options for opening files.
* R6RS Buffer Modes:: Influencing buffering behavior.
* R6RS Transcoders:: Influencing port encoding.
* R6RS End-of-File:: The end-of-file object.
* R6RS Port Manipulation:: Manipulating R6RS ports.
* R6RS Input Ports:: Input Ports.
* R6RS Binary Input:: Binary input.
* R6RS Textual Input:: Textual input.
* R6RS Output Ports:: Output Ports.
* R6RS Binary Output:: Binary output.
* R6RS Textual Output:: Textual output.
@end menu
A subset of the @code{(rnrs io ports)} module is provided by the
@code{(ice-9 binary-ports)} module. It contains binary input/output
procedures and does not rely on R6RS support.
@node R6RS File Names
@subsubsection File Names
Some of the procedures described in this chapter accept a file name as an
argument. Valid values for such a file name include strings that name a file
using the native notation of filesystem paths on an implementation's
underlying operating system, and may include implementation-dependent
values as well.
A @var{filename} parameter name means that the
corresponding argument must be a file name.
@node R6RS File Options
@subsubsection File Options
@cindex file options
When opening a file, the various procedures in this library accept a
@code{file-options} object that encapsulates flags to specify how the
file is to be opened. A @code{file-options} object is an enum-set
(@pxref{rnrs enums}) over the symbols constituting valid file options.
A @var{file-options} parameter name means that the corresponding
argument must be a file-options object.
@deffn {Scheme Syntax} file-options @var{file-options-symbol} ...
Each @var{file-options-symbol} must be a symbol.
The @code{file-options} syntax returns a file-options object that
encapsulates the specified options.
When supplied to an operation that opens a file for output, the
file-options object returned by @code{(file-options)} specifies that the
file is created if it does not exist and an exception with condition
type @code{&i/o-file-already-exists} is raised if it does exist. The
following standard options can be included to modify the default
behavior.
@table @code
@item no-create
If the file does not already exist, it is not created;
instead, an exception with condition type @code{&i/o-file-does-not-exist}
is raised.
If the file already exists, the exception with condition type
@code{&i/o-file-already-exists} is not raised
and the file is truncated to zero length.
@item no-fail
If the file already exists, the exception with condition type
@code{&i/o-file-already-exists} is not raised,
even if @code{no-create} is not included,
and the file is truncated to zero length.
@item no-truncate
If the file already exists and the exception with condition type
@code{&i/o-file-already-exists} has been inhibited by inclusion of
@code{no-create} or @code{no-fail}, the file is not truncated, but
the port's current position is still set to the beginning of the
file.
@end table
These options have no effect when a file is opened only for input.
Symbols other than those listed above may be used as
@var{file-options-symbol}s; they have implementation-specific meaning,
if any.
@quotation Note
Only the name of @var{file-options-symbol} is significant.
@end quotation
@end deffn
@node R6RS Buffer Modes
@subsubsection Buffer Modes
Each port has an associated buffer mode. For an output port, the
buffer mode defines when an output operation flushes the buffer
associated with the output port. For an input port, the buffer mode
defines how much data will be read to satisfy read operations. The
possible buffer modes are the symbols @code{none} for no buffering,
@code{line} for flushing upon line endings and reading up to line
endings, or other implementation-dependent behavior,
and @code{block} for arbitrary buffering. This section uses
the parameter name @var{buffer-mode} for arguments that must be
buffer-mode symbols.
If two ports are connected to the same mutable source, both ports
are unbuffered, and reading a byte or character from that shared
source via one of the two ports would change the bytes or characters
seen via the other port, a lookahead operation on one port will
render the peeked byte or character inaccessible via the other port,
while a subsequent read operation on the peeked port will see the
peeked byte or character even though the port is otherwise unbuffered.
In other words, the semantics of buffering is defined in terms of side
effects on shared mutable sources, and a lookahead operation has the
same side effect on the shared source as a read operation.
@deffn {Scheme Syntax} buffer-mode @var{buffer-mode-symbol}
@var{buffer-mode-symbol} must be a symbol whose name is one of
@code{none}, @code{line}, and @code{block}. The result is the
corresponding symbol, and specifies the associated buffer mode.
@quotation Note
Only the name of @var{buffer-mode-symbol} is significant.
@end quotation
@end deffn
@deffn {Scheme Procedure} buffer-mode? obj
Returns @code{#t} if the argument is a valid buffer-mode symbol, and
returns @code{#f} otherwise.
@end deffn
@node R6RS Transcoders
@subsubsection Transcoders
@cindex codec
@cindex end-of-line style
@cindex transcoder
@cindex binary port
@cindex textual port
Several different Unicode encoding schemes describe standard ways to
encode characters and strings as byte sequences and to decode those
sequences. Within this document, a @dfn{codec} is an immutable Scheme
object that represents a Unicode or similar encoding scheme.
An @dfn{end-of-line style} is a symbol that, if it is not @code{none},
describes how a textual port transcodes representations of line endings.
A @dfn{transcoder} is an immutable Scheme object that combines a codec
with an end-of-line style and a method for handling decoding errors.
Each transcoder represents some specific bidirectional (but not
necessarily lossless), possibly stateful translation between byte
sequences and Unicode characters and strings. Every transcoder can
operate in the input direction (bytes to characters) or in the output
direction (characters to bytes). A @var{transcoder} parameter name
means that the corresponding argument must be a transcoder.
A @dfn{binary port} is a port that supports binary I/O, does not have an
associated transcoder and does not support textual I/O. A @dfn{textual
port} is a port that supports textual I/O, and does not support binary
I/O. A textual port may or may not have an associated transcoder.
@deffn {Scheme Procedure} latin-1-codec
@deffnx {Scheme Procedure} utf-8-codec
@deffnx {Scheme Procedure} utf-16-codec
These are predefined codecs for the ISO 8859-1, UTF-8, and UTF-16
encoding schemes.
A call to any of these procedures returns a value that is equal in the
sense of @code{eqv?} to the result of any other call to the same
procedure.
@end deffn
@deffn {Scheme Syntax} eol-style @var{eol-style-symbol}
@var{eol-style-symbol} should be a symbol whose name is one of
@code{lf}, @code{cr}, @code{crlf}, @code{nel}, @code{crnel}, @code{ls},
and @code{none}.
The form evaluates to the corresponding symbol. If the name of
@var{eol-style-symbol} is not one of these symbols, the effect and
result are implementation-dependent; in particular, the result may be an
eol-style symbol acceptable as an @var{eol-style} argument to
@code{make-transcoder}. Otherwise, an exception is raised.
All eol-style symbols except @code{none} describe a specific
line-ending encoding:
@table @code
@item lf
linefeed
@item cr
carriage return
@item crlf
carriage return, linefeed
@item nel
next line
@item crnel
carriage return, next line
@item ls
line separator
@end table
For a textual port with a transcoder, and whose transcoder has an
eol-style symbol @code{none}, no conversion occurs. For a textual input
port, any eol-style symbol other than @code{none} means that all of the
above line-ending encodings are recognized and are translated into a
single linefeed. For a textual output port, @code{none} and @code{lf}
are equivalent. Linefeed characters are encoded according to the
specified eol-style symbol, and all other characters that participate in
possible line endings are encoded as is.
@quotation Note
Only the name of @var{eol-style-symbol} is significant.
@end quotation
@end deffn
@deffn {Scheme Procedure} native-eol-style
Returns the default end-of-line style of the underlying platform, e.g.,
@code{lf} on Unix and @code{crlf} on Windows.
@end deffn
@deffn {Condition Type} &i/o-decoding
@deffnx {Scheme Procedure} make-i/o-decoding-error port
@deffnx {Scheme Procedure} i/o-decoding-error? obj
This condition type could be defined by
@lisp
(define-condition-type &i/o-decoding &i/o-port
make-i/o-decoding-error i/o-decoding-error?)
@end lisp
An exception with this type is raised when one of the operations for
textual input from a port encounters a sequence of bytes that cannot be
translated into a character or string by the input direction of the
port's transcoder.
When such an exception is raised, the port's position is past the
invalid encoding.
@end deffn
@deffn {Condition Type} &i/o-encoding
@deffnx {Scheme Procedure} make-i/o-encoding-error port char
@deffnx {Scheme Procedure} i/o-encoding-error? obj
@deffnx {Scheme Procedure} i/o-encoding-error-char condition
This condition type could be defined by
@lisp
(define-condition-type &i/o-encoding &i/o-port
make-i/o-encoding-error i/o-encoding-error?
(char i/o-encoding-error-char))
@end lisp
An exception with this type is raised when one of the operations for
textual output to a port encounters a character that cannot be
translated into bytes by the output direction of the port's transcoder.
@var{Char} is the character that could not be encoded.
@end deffn
@deffn {Scheme Syntax} error-handling-mode @var{error-handling-mode-symbol}
@var{error-handling-mode-symbol} should be a symbol whose name is one of
@code{ignore}, @code{raise}, and @code{replace}. The form evaluates to
the corresponding symbol. If @var{error-handling-mode-symbol} is not
one of these identifiers, effect and result are
implementation-dependent: The result may be an error-handling-mode
symbol acceptable as a @var{handling-mode} argument to
@code{make-transcoder}. If it is not acceptable as a
@var{handling-mode} argument to @code{make-transcoder}, an exception is
raised.
@quotation Note
Only the name of @var{error-handling-style-symbol} is significant.
@end quotation
The error-handling mode of a transcoder specifies the behavior
of textual I/O operations in the presence of encoding or decoding
errors.
If a textual input operation encounters an invalid or incomplete
character encoding, and the error-handling mode is @code{ignore}, an
appropriate number of bytes of the invalid encoding are ignored and
decoding continues with the following bytes.
If the error-handling mode is @code{replace}, the replacement
character U+FFFD is injected into the data stream, an appropriate
number of bytes are ignored, and decoding
continues with the following bytes.
If the error-handling mode is @code{raise}, an exception with condition
type @code{&i/o-decoding} is raised.
If a textual output operation encounters a character it cannot encode,
and the error-handling mode is @code{ignore}, the character is ignored
and encoding continues with the next character. If the error-handling
mode is @code{replace}, a codec-specific replacement character is
emitted by the transcoder, and encoding continues with the next
character. The replacement character is U+FFFD for transcoders whose
codec is one of the Unicode encodings, but is the @code{?} character
for the Latin-1 encoding. If the error-handling mode is @code{raise},
an exception with condition type @code{&i/o-encoding} is raised.
@end deffn
@deffn {Scheme Procedure} make-transcoder codec
@deffnx {Scheme Procedure} make-transcoder codec eol-style
@deffnx {Scheme Procedure} make-transcoder codec eol-style handling-mode
@var{codec} must be a codec; @var{eol-style}, if present, an eol-style
symbol; and @var{handling-mode}, if present, an error-handling-mode
symbol.
@var{eol-style} may be omitted, in which case it defaults to the native
end-of-line style of the underlying platform. @var{Handling-mode} may
be omitted, in which case it defaults to @code{replace}. The result is
a transcoder with the behavior specified by its arguments.
@end deffn
@deffn {Scheme procedure} native-transcoder
Returns an implementation-dependent transcoder that represents a
possibly locale-dependent ``native'' transcoding.
@end deffn
@deffn {Scheme Procedure} transcoder-codec transcoder
@deffnx {Scheme Procedure} transcoder-eol-style transcoder
@deffnx {Scheme Procedure} transcoder-error-handling-mode transcoder
These are accessors for transcoder objects; when applied to a
transcoder returned by @code{make-transcoder}, they return the
@var{codec}, @var{eol-style}, and @var{handling-mode} arguments,
respectively.
@end deffn
@deffn {Scheme Procedure} bytevector->string bytevector transcoder
Returns the string that results from transcoding the
@var{bytevector} according to the input direction of the transcoder.
@end deffn
@deffn {Scheme Procedure} string->bytevector string transcoder
Returns the bytevector that results from transcoding the
@var{string} according to the output direction of the transcoder.
@end deffn
@node R6RS End-of-File
@subsubsection The End-of-File Object
@ -1200,6 +1542,65 @@ Return the end-of-file (EOF) object.
The procedures listed below operate on any kind of R6RS I/O port.
@deffn {Scheme Procedure} port? obj
Returns @code{#t} if the argument is a port, and returns @code{#f}
otherwise.
@end deffn
@deffn {Scheme Procedure} port-transcoder port
Returns the transcoder associated with @var{port} if @var{port} is
textual and has an associated transcoder, and returns @code{#f} if
@var{port} is binary or does not have an associated transcoder.
@end deffn
@deffn {Scheme Procedure} binary-port? port
Return @code{#t} if @var{port} is a @dfn{binary port}, suitable for
binary data input/output.
Note that internally Guile does not differentiate between binary and
textual ports, unlike the R6RS. Thus, this procedure returns true when
@var{port} does not have an associated encoding---i.e., when
@code{(port-encoding @var{port})} is @code{#f} (@pxref{Ports,
port-encoding}). This is the case for ports returned by R6RS procedures
such as @code{open-bytevector-input-port} and
@code{make-custom-binary-output-port}.
However, Guile currently does not prevent use of textual I/O procedures
such as @code{display} or @code{read-char} with binary ports. Doing so
``upgrades'' the port from binary to textual, under the ISO-8859-1
encoding. Likewise, Guile does not prevent use of
@code{set-port-encoding!} on a binary port, which also turns it into a
``textual'' port.
@end deffn
@deffn {Scheme Procedure} textual-port? port
Always return @var{#t}, as all ports can be used for textual I/O in
Guile.
@end deffn
@deffn {Scheme Procedure} transcoded-port obj
The @code{transcoded-port} procedure
returns a new textual port with the specified @var{transcoder}.
Otherwise the new textual port's state is largely the same as
that of @var{binary-port}.
If @var{binary-port} is an input port, the new textual
port will be an input port and
will transcode the bytes that have not yet been read from
@var{binary-port}.
If @var{binary-port} is an output port, the new textual
port will be an output port and
will transcode output characters into bytes that are
written to the byte sink represented by @var{binary-port}.
As a side effect, however, @code{transcoded-port}
closes @var{binary-port} in
a special way that allows the new textual port to continue to
use the byte source or sink represented by @var{binary-port},
even though @var{binary-port} itself is closed and cannot
be used by the input and output operations described in this
chapter.
@end deffn
@deffn {Scheme Procedure} port-position port
If @var{port} supports it (see below), return the offset (an integer)
indicating where the next octet will be read from/written to in
@ -1233,31 +1634,67 @@ Call @var{proc}, passing it @var{port} and closing @var{port} upon exit
of @var{proc}. Return the return values of @var{proc}.
@end deffn
@deffn {Scheme Procedure} binary-port? port
Return @code{#t} if @var{port} is a @dfn{binary port}, suitable for
binary data input/output.
@node R6RS Input Ports
@subsubsection Input Ports
Note that internally Guile does not differentiate between binary and
textual ports, unlike the R6RS. Thus, this procedure returns true when
@var{port} does not have an associated encoding---i.e., when
@code{(port-encoding @var{port})} is @code{#f} (@pxref{Ports,
port-encoding}). This is the case for ports returned by R6RS procedures
such as @code{open-bytevector-input-port} and
@code{make-custom-binary-output-port}.
However, Guile currently does not prevent use of textual I/O procedures
such as @code{display} or @code{read-char} with binary ports. Doing so
``upgrades'' the port from binary to textual, under the ISO-8859-1
encoding. Likewise, Guile does not prevent use of
@code{set-port-encoding!} on a binary port, which also turns it into a
``textual'' port.
@deffn {Scheme Procedure} input-port? obj@
Returns @code{#t} if the argument is an input port (or a combined input
and output port), and returns @code{#f} otherwise.
@end deffn
@deffn {Scheme Procedure} textual-port? port
Always return @var{#t}, as all ports can be used for textual I/O in
Guile.
@deffn {Scheme Procedure} port-eof? port
Returns @code{#t}
if the @code{lookahead-u8} procedure (if @var{input-port} is a binary port)
or the @code{lookahead-char} procedure (if @var{input-port} is a textual port)
would return
the end-of-file object, and @code{#f} otherwise.
The operation may block indefinitely if no data is available
but the port cannot be determined to be at end of file.
@end deffn
@deffn {Scheme Procedure} open-file-input-port filename
@deffnx {Scheme Procedure} open-file-input-port filename file-options
@deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode
@deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode maybe-transcoder
@var{Maybe-transcoder} must be either a transcoder or @code{#f}.
The @code{open-file-input-port} procedure returns an
input port for the named file. The @var{file-options} and
@var{maybe-transcoder} arguments are optional.
The @var{file-options} argument, which may determine
various aspects of the returned port (@pxref{R6RS File Options}),
defaults to the value of @code{(file-options)}.
The @var{buffer-mode} argument, if supplied,
must be one of the symbols that name a buffer mode.
The @var{buffer-mode} argument defaults to @code{block}.
If @var{maybe-transcoder} is a transcoder, it becomes the transcoder associated
with the returned port.
If @var{maybe-transcoder} is @code{#f} or absent,
the port will be a binary port and will support the
@code{port-position} and @code{set-port-position!} operations.
Otherwise the port will be a textual port, and whether it supports
the @code{port-position} and @code{set-port-position!} operations
is implementation-dependent (and possibly transcoder-dependent).
@end deffn
@deffn {Scheme Procedure} standard-input-port
Returns a fresh binary input port connected to standard input. Whether
the port supports the @code{port-position} and @code{set-port-position!}
operations is implementation-dependent.
@end deffn
@deffn {Scheme Procedure} current-input-port
This returns a default textual port for input. Normally, this default
port is associated with standard input, but can be dynamically
re-assigned using the @code{with-input-from-file} procedure from the
@code{io simple (6)} library (@pxref{rnrs io simple}). The port may or
may not have an associated transcoder; if it does, the transcoder is
implementation-dependent.
@end deffn
@node R6RS Binary Input
@subsubsection Binary Input
@ -1374,6 +1811,173 @@ reached. Return either a new bytevector containing the data read or the
end-of-file object (if no data were available).
@end deffn
@node R6RS Textual Input
@subsubsection Textual Input
@deffn {Scheme Procedure} get-char port
Reads from @var{textual-input-port}, blocking as necessary, until a
complete character is available from @var{textual-input-port},
or until an end of file is reached.
If a complete character is available before the next end of file,
@code{get-char} returns that character and updates the input port to
point past the character. If an end of file is reached before any
character is read, @code{get-char} returns the end-of-file object.
@end deffn
@deffn {Scheme Procedure} lookahead-char port
The @code{lookahead-char} procedure is like @code{get-char}, but it does
not update @var{textual-input-port} to point past the character.
@end deffn
@deffn {Scheme Procedure} get-string-n port count
@var{Count} must be an exact, non-negative integer object, representing
the number of characters to be read.
The @code{get-string-n} procedure reads from @var{textual-input-port},
blocking as necessary, until @var{count} characters are available, or
until an end of file is reached.
If @var{count} characters are available before end of file,
@code{get-string-n} returns a string consisting of those @var{count}
characters. If fewer characters are available before an end of file, but
one or more characters can be read, @code{get-string-n} returns a string
containing those characters. In either case, the input port is updated
to point just past the characters read. If no characters can be read
before an end of file, the end-of-file object is returned.
@end deffn
@deffn {Scheme Procedure} get-string-n! port string start count
@var{Start} and @var{count} must be exact, non-negative integer objects,
with @var{count} representing the number of characters to be read.
@var{String} must be a string with at least $@var{start} + @var{count}$
characters.
The @code{get-string-n!} procedure reads from @var{textual-input-port}
in the same manner as @code{get-string-n}. If @var{count} characters
are available before an end of file, they are written into @var{string}
starting at index @var{start}, and @var{count} is returned. If fewer
characters are available before an end of file, but one or more can be
read, those characters are written into @var{string} starting at index
@var{start} and the number of characters actually read is returned as an
exact integer object. If no characters can be read before an end of
file, the end-of-file object is returned.
@end deffn
@deffn {Scheme Procedure} get-string-all port count
Reads from @var{textual-input-port} until an end of file, decoding
characters in the same manner as @code{get-string-n} and
@code{get-string-n!}.
If characters are available before the end of file, a string containing
all the characters decoded from that data are returned. If no character
precedes the end of file, the end-of-file object is returned.
@end deffn
@deffn {Scheme Procedure} get-line port
Reads from @var{textual-input-port} up to and including the linefeed
character or end of file, decoding characters in the same manner as
@code{get-string-n} and @code{get-string-n!}.
If a linefeed character is read, a string containing all of the text up
to (but not including) the linefeed character is returned, and the port
is updated to point just past the linefeed character. If an end of file
is encountered before any linefeed character is read, but some
characters have been read and decoded as characters, a string containing
those characters is returned. If an end of file is encountered before
any characters are read, the end-of-file object is returned.
@quotation Note
The end-of-line style, if not @code{none}, will cause all line endings
to be read as linefeed characters. @xref{R6RS Transcoders}.
@end quotation
@end deffn
@deffn {Scheme Procedure} get-datum port count
Reads an external representation from @var{textual-input-port} and returns the
datum it represents. The @code{get-datum} procedure returns the next
datum that can be parsed from the given @var{textual-input-port}, updating
@var{textual-input-port} to point exactly past the end of the external
representation of the object.
Any @emph{interlexeme space} (comment or whitespace, @pxref{Scheme
Syntax}) in the input is first skipped. If an end of file occurs after
the interlexeme space, the end-of-file object (@pxref{R6RS End-of-File})
is returned.
If a character inconsistent with an external representation is
encountered in the input, an exception with condition types
@code{&lexical} and @code{&i/o-read} is raised. Also, if the end of
file is encountered after the beginning of an external representation,
but the external representation is incomplete and therefore cannot be
parsed, an exception with condition types @code{&lexical} and
@code{&i/o-read} is raised.
@end deffn
@node R6RS Output Ports
@subsubsection Output Ports
@deffn {Scheme Procedure} output-port? obj
Returns @code{#t} if the argument is an output port (or a
combined input and output port), @code{#f} otherwise.
@end deffn
@deffn {Scheme Procedure} flush-output-port port
Flushes any buffered output from the buffer of @var{output-port} to the
underlying file, device, or object. The @code{flush-output-port}
procedure returns an unspecified values.
@end deffn
@deffn {Scheme Procedure} open-file-output-port filename
@deffnx {Scheme Procedure} open-file-output-port filename file-options
@deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode
@deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode maybe-transcoder
@var{maybe-transcoder} must be either a transcoder or @code{#f}.
The @code{open-file-output-port} procedure returns an output port for the named file.
The @var{file-options} argument, which may determine various aspects of
the returned port (@pxref{R6RS File Options}), defaults to the value of
@code{(file-options)}.
The @var{buffer-mode} argument, if supplied,
must be one of the symbols that name a buffer mode.
The @var{buffer-mode} argument defaults to @code{block}.
If @var{maybe-transcoder} is a transcoder, it becomes the transcoder
associated with the port.
If @var{maybe-transcoder} is @code{#f} or absent,
the port will be a binary port and will support the
@code{port-position} and @code{set-port-position!} operations.
Otherwise the port will be a textual port, and whether it supports
the @code{port-position} and @code{set-port-position!} operations
is implementation-dependent (and possibly transcoder-dependent).
@end deffn
@deffn {Scheme Procedure} standard-output-port
@deffnx {Scheme Procedure} standard-error-port
Returns a fresh binary output port connected to the standard output or
standard error respectively. Whether the port supports the
@code{port-position} and @code{set-port-position!} operations is
implementation-dependent.
@end deffn
@deffn {Scheme Procedure} current-output-port
@deffnx {Scheme Procedure} current-error-port
These return default textual ports for regular output and error output.
Normally, these default ports are associated with standard output, and
standard error, respectively. The return value of
@code{current-output-port} can be dynamically re-assigned using the
@code{with-output-to-file} procedure from the @code{io simple (6)}
library (@pxref{rnrs io simple}). A port returned by one of these
procedures may or may not have an associated transcoder; if it does, the
transcoder is implementation-dependent.
@end deffn
@node R6RS Binary Output
@subsubsection Binary Output
@ -1432,6 +2036,50 @@ Write the contents of @var{bv} to @var{port}, optionally starting at
index @var{start} and limiting to @var{count} octets.
@end deffn
@node R6RS Textual Output
@subsubsection Textual Output
@deffn {Scheme Procedure} put-char port char
Writes @var{char} to the port. The @code{put-char} procedure returns
@end deffn
@deffn {Scheme Procedure} put-string port string
@deffnx {Scheme Procedure} put-string port string start
@deffnx {Scheme Procedure} put-string port string start count
@var{start} and @var{count} must be non-negative exact integer objects.
@var{string} must have a length of at least @math{@var{start} +
@var{count}}. @var{start} defaults to 0. @var{count} defaults to
@math{@code{(string-length @var{string})} - @var{start}}$. The
@code{put-string} procedure writes the @var{count} characters of
@var{string} starting at index @var{start} to the port. The
@code{put-string} procedure returns an unspecified value.
@end deffn
@deffn {Scheme Procedure} put-datum port datum
@var{datum} should be a datum value. The @code{put-datum} procedure
writes an external representation of @var{datum} to
@var{textual-output-port}. The specific external representation is
implementation-dependent. However, whenever possible, an implementation
should produce a representation for which @code{get-datum}, when reading
the representation, will return an object equal (in the sense of
@code{equal?}) to @var{datum}.
@quotation Note
Not all datums may allow producing an external representation for which
@code{get-datum} will produce an object that is equal to the
original. Specifically, NaNs contained in @var{datum} may make
this impossible.
@end quotation
@quotation Note
The @code{put-datum} procedure merely writes the external
representation, but no trailing delimiter. If @code{put-datum} is
used to write several subsequent external representations to an
output port, care should be taken to delimit them properly so they can
be read back in by subsequent calls to @code{get-datum}.
@end quotation
@end deffn
@node I/O Extensions
@subsection Using and Extending Ports in C
@ -1690,7 +2338,6 @@ Set using
@end table
@c Local Variables:
@c TeX-master: "guile.texi"
@c End:

View file

@ -1428,8 +1428,21 @@ functionality is documented in its own section of the manual;
The @code{(rnrs io simple (6))} library provides convenience functions
for performing textual I/O on ports. This library also exports all of
the condition types and associated procedures described in
(@pxref{I/O Conditions}).
the condition types and associated procedures described in (@pxref{I/O
Conditions}). In the context of this section, when stating that a
procedure behaves ``identically'' to the corresponding procedure in
Guile's core library, this is modulo the behavior wrt. conditions: such
procedures raise the appropriate R6RS conditions in case of error, but
otherwise behave identically.
@c FIXME: remove the following note when proper condition behavior has
@c been verified.
@quotation Note
There are still known issues regarding condition-correctness; some
errors may still be thrown as native Guile exceptions instead of the
appropriate R6RS conditions.
@end quotation
@deffn {Scheme Procedure} eof-object
@deffnx {Scheme Procedure} eof-object? obj

View file

@ -965,9 +965,12 @@ unpack (const ffi_type *type, void *loc, SCM x)
}
#undef FUNC_NAME
/* Return a Scheme representation of the foreign value at LOC of type TYPE. */
/* Return a Scheme representation of the foreign value at LOC of type
TYPE. When RETURN_VALUE_P is true, LOC is assumed to point to a
return value buffer; otherwise LOC is assumed to point to an
argument buffer. */
static SCM
pack (const ffi_type * type, const void *loc)
pack (const ffi_type * type, const void *loc, int return_value_p)
{
switch (type->type)
{
@ -977,22 +980,48 @@ pack (const ffi_type * type, const void *loc)
return scm_from_double (*(float *) loc);
case FFI_TYPE_DOUBLE:
return scm_from_double (*(double *) loc);
/* For integer return values smaller than `int', libffi stores the
result in an `ffi_arg'-long buffer, of which only the
significant bits must be kept---hence the pair of casts below.
See <http://thread.gmane.org/gmane.comp.lib.ffi.general/406>
for details. */
case FFI_TYPE_UINT8:
return scm_from_uint8 (*(scm_t_uint8 *) loc);
if (return_value_p)
return scm_from_uint8 ((scm_t_uint8) *(ffi_arg *) loc);
else
return scm_from_uint8 (* (scm_t_uint8 *) loc);
case FFI_TYPE_SINT8:
return scm_from_int8 (*(scm_t_int8 *) loc);
if (return_value_p)
return scm_from_int8 ((scm_t_int8) *(ffi_arg *) loc);
else
return scm_from_int8 (* (scm_t_int8 *) loc);
case FFI_TYPE_UINT16:
return scm_from_uint16 (*(scm_t_uint16 *) loc);
if (return_value_p)
return scm_from_uint16 ((scm_t_uint16) *(ffi_arg *) loc);
else
return scm_from_uint16 (* (scm_t_uint16 *) loc);
case FFI_TYPE_SINT16:
return scm_from_int16 (*(scm_t_int16 *) loc);
if (return_value_p)
return scm_from_int16 ((scm_t_int16) *(ffi_arg *) loc);
else
return scm_from_int16 (* (scm_t_int16 *) loc);
case FFI_TYPE_UINT32:
return scm_from_uint32 (*(scm_t_uint32 *) loc);
if (return_value_p)
return scm_from_uint32 ((scm_t_uint32) *(ffi_arg *) loc);
else
return scm_from_uint32 (* (scm_t_uint32 *) loc);
case FFI_TYPE_SINT32:
return scm_from_int32 (*(scm_t_int32 *) loc);
if (return_value_p)
return scm_from_int32 ((scm_t_int32) *(ffi_arg *) loc);
else
return scm_from_int32 (* (scm_t_int32 *) loc);
case FFI_TYPE_UINT64:
return scm_from_uint64 (*(scm_t_uint64 *) loc);
case FFI_TYPE_SINT64:
return scm_from_int64 (*(scm_t_int64 *) loc);
case FFI_TYPE_STRUCT:
{
void *mem = scm_gc_malloc_pointerless (type->size, "foreign");
@ -1060,7 +1089,7 @@ scm_i_foreign_call (SCM foreign, const SCM *argv)
/* off we go! */
ffi_call (cif, func, rvalue, args);
return pack (cif->rtype, rvalue);
return pack (cif->rtype, rvalue, 1);
}
@ -1082,7 +1111,7 @@ invoke_closure (ffi_cif *cif, void *ret, void **args, void *data)
/* Pack ARGS to SCM values, setting ARGV pointers. */
for (i = 0; i < cif->nargs; i++)
argv[i] = pack (cif->arg_types[i], args[i]);
argv[i] = pack (cif->arg_types[i], args[i], 0);
result = scm_call_n (proc, argv, cif->nargs);

View file

@ -26,6 +26,7 @@
#include <wchar.h>
#endif
#include <math.h>
#include <unistr.h>
#include "libguile/_scm.h"
@ -192,7 +193,7 @@ scm_hasher(SCM obj, unsigned long n, size_t d)
case scm_tc16_real:
{
double r = SCM_REAL_VALUE (obj);
if (floor (r) == r)
if (floor (r) == r && !isinf (r) && !isnan (r))
{
obj = scm_inexact_to_exact (obj);
return scm_to_ulong (scm_modulo (obj, scm_from_ulong (n)));

View file

@ -3,7 +3,8 @@
#ifndef SCM_INLINE_H
#define SCM_INLINE_H
/* Copyright (C) 2001, 2002, 2003, 2004, 2006, 2008, 2009, 2010 Free Software Foundation, Inc.
/* Copyright (C) 2001, 2002, 2003, 2004, 2006, 2008, 2009, 2010,
* 2011 Free Software Foundation, Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public License
@ -98,6 +99,7 @@ SCM_API int scm_is_pair (SCM x);
SCM_API int scm_is_string (SCM x);
SCM_API int scm_get_byte_or_eof (SCM port);
SCM_API int scm_peek_byte_or_eof (SCM port);
SCM_API void scm_putc (char c, SCM port);
SCM_API void scm_puts (const char *str_data, SCM port);
@ -362,7 +364,7 @@ scm_get_byte_or_eof (SCM port)
if (pt->read_pos >= pt->read_end)
{
if (scm_fill_input (port) == EOF)
if (SCM_UNLIKELY (scm_fill_input (port) == EOF))
return EOF;
}
@ -371,6 +373,34 @@ scm_get_byte_or_eof (SCM port)
return c;
}
/* Like `scm_get_byte_or_eof' but does not change PORT's `read_pos'. */
#ifndef SCM_INLINE_C_INCLUDING_INLINE_H
SCM_C_EXTERN_INLINE
#endif
int
scm_peek_byte_or_eof (SCM port)
{
int c;
scm_t_port *pt = SCM_PTAB_ENTRY (port);
if (pt->rw_active == SCM_PORT_WRITE)
/* may be marginally faster than calling scm_flush. */
scm_ptobs[SCM_PTOBNUM (port)].flush (port);
if (pt->rw_random)
pt->rw_active = SCM_PORT_READ;
if (pt->read_pos >= pt->read_end)
{
if (SCM_UNLIKELY (scm_fill_input (port) == EOF))
return EOF;
}
c = *pt->read_pos;
return c;
}
#ifndef SCM_INLINE_C_INCLUDING_INLINE_H
SCM_C_EXTERN_INLINE
#endif

View file

@ -1057,6 +1057,7 @@ update_port_lf (scm_t_wchar c, SCM port)
switch (c)
{
case '\a':
case EOF:
break;
case '\b':
SCM_DECCOL (port);
@ -1115,23 +1116,154 @@ utf8_to_codepoint (const scm_t_uint8 *utf8_buf, size_t size)
return codepoint;
}
/* Read a codepoint from PORT and return it in *CODEPOINT. Fill BUF
with the byte representation of the codepoint in PORT's encoding, and
set *LEN to the length in bytes of that representation. Return 0 on
success and an errno value on error. */
/* Read a UTF-8 sequence from PORT. On success, return 0 and set
*CODEPOINT to the codepoint that was read, fill BUF with its UTF-8
representation, and set *LEN to the length in bytes. Return
`EILSEQ' on error. */
static int
get_codepoint (SCM port, scm_t_wchar *codepoint,
char buf[SCM_MBCHAR_BUF_SIZE], size_t *len)
get_utf8_codepoint (SCM port, scm_t_wchar *codepoint,
scm_t_uint8 buf[SCM_MBCHAR_BUF_SIZE], size_t *len)
{
#define ASSERT_NOT_EOF(b) \
if (SCM_UNLIKELY ((b) == EOF)) \
goto invalid_seq
#define CONSUME_PEEKED_BYTE() \
pt->read_pos++
int byte;
scm_t_port *pt;
*len = 0;
pt = SCM_PTAB_ENTRY (port);
byte = scm_get_byte_or_eof (port);
if (byte == EOF)
{
*codepoint = EOF;
return 0;
}
buf[0] = (scm_t_uint8) byte;
*len = 1;
if (buf[0] <= 0x7f)
/* 1-byte form. */
*codepoint = buf[0];
else if (buf[0] >= 0xc2 && buf[0] <= 0xdf)
{
/* 2-byte form. */
byte = scm_peek_byte_or_eof (port);
ASSERT_NOT_EOF (byte);
if (SCM_UNLIKELY ((byte & 0xc0) != 0x80))
goto invalid_seq;
CONSUME_PEEKED_BYTE ();
buf[1] = (scm_t_uint8) byte;
*len = 2;
*codepoint = ((scm_t_wchar) buf[0] & 0x1f) << 6UL
| (buf[1] & 0x3f);
}
else if ((buf[0] & 0xf0) == 0xe0)
{
/* 3-byte form. */
byte = scm_peek_byte_or_eof (port);
ASSERT_NOT_EOF (byte);
if (SCM_UNLIKELY ((byte & 0xc0) != 0x80
|| (buf[0] == 0xe0 && byte < 0xa0)
|| (buf[0] == 0xed && byte > 0x9f)))
goto invalid_seq;
CONSUME_PEEKED_BYTE ();
buf[1] = (scm_t_uint8) byte;
*len = 2;
byte = scm_peek_byte_or_eof (port);
ASSERT_NOT_EOF (byte);
if (SCM_UNLIKELY ((byte & 0xc0) != 0x80))
goto invalid_seq;
CONSUME_PEEKED_BYTE ();
buf[2] = (scm_t_uint8) byte;
*len = 3;
*codepoint = ((scm_t_wchar) buf[0] & 0x0f) << 12UL
| ((scm_t_wchar) buf[1] & 0x3f) << 6UL
| (buf[2] & 0x3f);
}
else if (buf[0] >= 0xf0 && buf[0] <= 0xf4)
{
/* 4-byte form. */
byte = scm_peek_byte_or_eof (port);
ASSERT_NOT_EOF (byte);
if (SCM_UNLIKELY (((byte & 0xc0) != 0x80)
|| (buf[0] == 0xf0 && byte < 0x90)
|| (buf[0] == 0xf4 && byte > 0x8f)))
goto invalid_seq;
CONSUME_PEEKED_BYTE ();
buf[1] = (scm_t_uint8) byte;
*len = 2;
byte = scm_peek_byte_or_eof (port);
ASSERT_NOT_EOF (byte);
if (SCM_UNLIKELY ((byte & 0xc0) != 0x80))
goto invalid_seq;
CONSUME_PEEKED_BYTE ();
buf[2] = (scm_t_uint8) byte;
*len = 3;
byte = scm_peek_byte_or_eof (port);
ASSERT_NOT_EOF (byte);
if (SCM_UNLIKELY ((byte & 0xc0) != 0x80))
goto invalid_seq;
CONSUME_PEEKED_BYTE ();
buf[3] = (scm_t_uint8) byte;
*len = 4;
*codepoint = ((scm_t_wchar) buf[0] & 0x07) << 18UL
| ((scm_t_wchar) buf[1] & 0x3f) << 12UL
| ((scm_t_wchar) buf[2] & 0x3f) << 6UL
| (buf[3] & 0x3f);
}
else
goto invalid_seq;
return 0;
invalid_seq:
/* Here we could choose the consume the faulty byte when it's not a
valid starting byte, but it's not a requirement. What Section 3.9
of Unicode 6.0.0 mandates, though, is to not consume a byte that
would otherwise be a valid starting byte. */
return EILSEQ;
#undef CONSUME_PEEKED_BYTE
#undef ASSERT_NOT_EOF
}
/* Likewise, read a byte sequence from PORT, passing it through its
input conversion descriptor. */
static int
get_iconv_codepoint (SCM port, scm_t_wchar *codepoint,
char buf[SCM_MBCHAR_BUF_SIZE], size_t *len)
{
scm_t_port *pt;
int err, byte_read;
size_t bytes_consumed, output_size;
char *output;
scm_t_uint8 utf8_buf[SCM_MBCHAR_BUF_SIZE];
scm_t_port *pt = SCM_PTAB_ENTRY (port);
if (SCM_UNLIKELY (pt->input_cd == (iconv_t) -1))
/* Initialize the conversion descriptors. */
scm_i_set_port_encoding_x (port, pt->encoding);
pt = SCM_PTAB_ENTRY (port);
for (output_size = 0, output = (char *) utf8_buf,
bytes_consumed = 0, err = 0;
@ -1177,30 +1309,45 @@ get_codepoint (SCM port, scm_t_wchar *codepoint,
if (SCM_UNLIKELY (output_size == 0))
/* An unterminated sequence. */
err = EILSEQ;
if (SCM_UNLIKELY (err != 0))
{
/* Reset the `iconv' state. */
iconv (pt->input_cd, NULL, NULL, NULL, NULL);
if (pt->ilseq_handler == SCM_ICONVEH_QUESTION_MARK)
{
*codepoint = '?';
err = 0;
}
/* Fail when the strategy is SCM_ICONVEH_ERROR or
SCM_ICONVEH_ESCAPE_SEQUENCE (the latter doesn't make sense for
input encoding errors.) */
}
else
else if (SCM_LIKELY (err == 0))
{
/* Convert the UTF8_BUF sequence to a Unicode code point. */
*codepoint = utf8_to_codepoint (utf8_buf, output_size);
update_port_lf (*codepoint, port);
*len = bytes_consumed;
}
*len = bytes_consumed;
return err;
}
/* Read a codepoint from PORT and return it in *CODEPOINT. Fill BUF
with the byte representation of the codepoint in PORT's encoding, and
set *LEN to the length in bytes of that representation. Return 0 on
success and an errno value on error. */
static int
get_codepoint (SCM port, scm_t_wchar *codepoint,
char buf[SCM_MBCHAR_BUF_SIZE], size_t *len)
{
int err;
scm_t_port *pt = SCM_PTAB_ENTRY (port);
if (pt->input_cd == (iconv_t) -1)
/* Initialize the conversion descriptors, if needed. */
scm_i_set_port_encoding_x (port, pt->encoding);
/* FIXME: In 2.1, add a flag to determine whether a port is UTF-8. */
if (pt->input_cd == (iconv_t) -1)
err = get_utf8_codepoint (port, codepoint, (scm_t_uint8 *) buf, len);
else
err = get_iconv_codepoint (port, codepoint, buf, len);
if (SCM_LIKELY (err == 0))
update_port_lf (*codepoint, port);
else if (pt->ilseq_handler == SCM_ICONVEH_QUESTION_MARK)
{
*codepoint = '?';
err = 0;
update_port_lf (*codepoint, port);
}
return err;
}
@ -2031,28 +2178,35 @@ scm_i_set_port_encoding_x (SCM port, const char *encoding)
if (encoding == NULL)
encoding = "ISO-8859-1";
pt->encoding = scm_gc_strdup (encoding, "port");
if (pt->encoding != encoding)
pt->encoding = scm_gc_strdup (encoding, "port");
if (SCM_CELL_WORD_0 (port) & SCM_RDNG)
/* If ENCODING is UTF-8, then no conversion descriptor is opened
because we do I/O ourselves. This saves 100+ KiB for each
descriptor. */
if (strcmp (encoding, "UTF-8"))
{
/* Open an input iconv conversion descriptor, from ENCODING
to UTF-8. We choose UTF-8, not UTF-32, because iconv
implementations can typically convert from anything to
UTF-8, but not to UTF-32 (see
<http://lists.gnu.org/archive/html/bug-libunistring/2010-09/msg00007.html>). */
new_input_cd = iconv_open ("UTF-8", encoding);
if (new_input_cd == (iconv_t) -1)
goto invalid_encoding;
}
if (SCM_CELL_WORD_0 (port) & SCM_WRTNG)
{
new_output_cd = iconv_open (encoding, "UTF-8");
if (new_output_cd == (iconv_t) -1)
if (SCM_CELL_WORD_0 (port) & SCM_RDNG)
{
if (new_input_cd != (iconv_t) -1)
iconv_close (new_input_cd);
goto invalid_encoding;
/* Open an input iconv conversion descriptor, from ENCODING
to UTF-8. We choose UTF-8, not UTF-32, because iconv
implementations can typically convert from anything to
UTF-8, but not to UTF-32 (see
<http://lists.gnu.org/archive/html/bug-libunistring/2010-09/msg00007.html>). */
new_input_cd = iconv_open ("UTF-8", encoding);
if (new_input_cd == (iconv_t) -1)
goto invalid_encoding;
}
if (SCM_CELL_WORD_0 (port) & SCM_WRTNG)
{
new_output_cd = iconv_open (encoding, "UTF-8");
if (new_output_cd == (iconv_t) -1)
{
if (new_input_cd != (iconv_t) -1)
iconv_close (new_input_cd);
goto invalid_encoding;
}
}
}

View file

@ -821,31 +821,57 @@ codepoint_to_utf8 (scm_t_wchar ch, scm_t_uint8 utf8[4])
return len;
}
/* Display the LEN codepoints in STR to PORT according to STRATEGY;
return the number of codepoints successfully displayed. If NARROW_P,
then STR is interpreted as a sequence of `char', denoting a Latin-1
string; otherwise it's interpreted as a sequence of
`scm_t_wchar'. */
static size_t
display_string (const void *str, int narrow_p,
size_t len, SCM port,
scm_t_string_failed_conversion_handler strategy)
{
#define STR_REF(s, x) \
(narrow_p \
? (scm_t_wchar) ((unsigned char *) (s))[x] \
: ((scm_t_wchar *) (s))[x])
/* Write STR to PORT as UTF-8. STR is a LEN-codepoint string; it is
narrow if NARROW_P is true, wide otherwise. Return LEN. */
static size_t
display_string_as_utf8 (const void *str, int narrow_p, size_t len,
SCM port)
{
size_t printed = 0;
while (len > printed)
{
size_t utf8_len, i;
char *input, utf8_buf[256];
/* Convert STR to UTF-8. */
for (i = printed, utf8_len = 0, input = utf8_buf;
i < len && utf8_len + 4 < sizeof (utf8_buf);
i++)
{
utf8_len += codepoint_to_utf8 (STR_REF (str, i),
(scm_t_uint8 *) input);
input = utf8_buf + utf8_len;
}
/* INPUT was successfully converted, entirely; print the
result. */
scm_lfwrite (utf8_buf, utf8_len, port);
printed += i - printed;
}
assert (printed == len);
return len;
}
/* Convert STR through PORT's output conversion descriptor and write the
output to PORT. Return the number of codepoints written. */
static size_t
display_string_using_iconv (const void *str, int narrow_p, size_t len,
SCM port,
scm_t_string_failed_conversion_handler strategy)
{
size_t printed;
scm_t_port *pt;
pt = SCM_PTAB_ENTRY (port);
if (SCM_UNLIKELY (pt->output_cd == (iconv_t) -1))
/* Initialize the conversion descriptors. */
scm_i_set_port_encoding_x (port, pt->encoding);
printed = 0;
while (len > printed)
@ -928,7 +954,35 @@ display_string (const void *str, int narrow_p,
}
return printed;
}
#undef STR_REF
/* Display the LEN codepoints in STR to PORT according to STRATEGY;
return the number of codepoints successfully displayed. If NARROW_P,
then STR is interpreted as a sequence of `char', denoting a Latin-1
string; otherwise it's interpreted as a sequence of
`scm_t_wchar'. */
static size_t
display_string (const void *str, int narrow_p,
size_t len, SCM port,
scm_t_string_failed_conversion_handler strategy)
{
scm_t_port *pt;
pt = SCM_PTAB_ENTRY (port);
if (pt->output_cd == (iconv_t) -1)
/* Initialize the conversion descriptors, if needed. */
scm_i_set_port_encoding_x (port, pt->encoding);
/* FIXME: In 2.1, add a flag to determine whether a port is UTF-8. */
if (pt->output_cd == (iconv_t) -1)
return display_string_as_utf8 (str, narrow_p, len, port);
else
return display_string_using_iconv (str, narrow_p, len,
port, strategy);
}
/* Attempt to display CH to PORT according to STRATEGY. Return non-zero

View file

@ -460,14 +460,11 @@ SCM_DEFINE (scm_lookahead_u8, "lookahead-u8", 1, 0, 0,
SCM_VALIDATE_BINARY_INPUT_PORT (1, port);
u8 = scm_get_byte_or_eof (port);
u8 = scm_peek_byte_or_eof (port);
if (u8 == EOF)
result = SCM_EOF_VAL;
else
{
scm_unget_byte (u8, port);
result = SCM_I_MAKINUM ((scm_t_uint8) u8);
}
result = SCM_I_MAKINUM ((scm_t_uint8) u8);
return result;
}

View file

@ -1135,7 +1135,7 @@ scm_read_scsh_block_comment (scm_t_wchar chr, SCM port)
return SCM_UNSPECIFIED;
}
static inline SCM
static SCM
scm_read_shebang (scm_t_wchar chr, SCM port)
{
int c = 0;

View file

@ -33,7 +33,7 @@
vhash? vhash-cons vhash-consq vhash-consv
vhash-assoc vhash-assq vhash-assv
vhash-delete vhash-delq vhash-delv
vhash-fold
vhash-fold vhash-fold-right
vhash-fold* vhash-foldq* vhash-foldv*
alist->vhash))
@ -245,7 +245,14 @@ tail."
(define (vlist-fold-right proc init vlist)
"Fold over @var{vlist}, calling @var{proc} for each element, starting from
the last element."
(vlist-fold proc init (vlist-reverse vlist)))
(define len (vlist-length vlist))
(let loop ((index (1- len))
(result init))
(if (< index 0)
result
(loop (1- index)
(proc (vlist-ref vlist index) result)))))
(define (vlist-reverse vlist)
"Return a new @var{vlist} whose content are those of @var{vlist} in reverse
@ -553,6 +560,16 @@ with @var{equal?}."
seed
vhash))
(define (vhash-fold-right proc seed vhash)
"Fold over the key/pair elements of @var{vhash}, starting from the 0th
element. For each pair call @var{proc} as @code{(@var{proc} key value
result)}."
(vlist-fold-right (lambda (key+value result)
(proc (car key+value) (cdr key+value)
result))
seed
vhash))
(define* (alist->vhash alist #:optional (hash hash))
"Return the vhash corresponding to @var{alist}, an association list."
(fold-right (lambda (pair result)

View file

@ -1,6 +1,6 @@
;;; Guile VM assembler
;; Copyright (C) 2001, 2009, 2010 Free Software Foundation, Inc.
;; Copyright (C) 2001, 2009, 2010, 2011 Free Software Foundation, Inc.
;;;; This library is free software; you can redistribute it and/or
;;;; modify it under the terms of the GNU Lesser General Public
@ -26,10 +26,36 @@
#:use-module (system vm instruction)
#:use-module ((system vm program) #:select (make-binding))
#:use-module (ice-9 receive)
#:use-module (ice-9 vlist)
#:use-module ((srfi srfi-1) #:select (fold))
#:use-module (rnrs bytevectors)
#:export (compile-assembly))
;; Traversal helpers
;;
(define (vhash-fold-right2 proc vhash s0 s1)
(let lp ((i (vlist-length vhash)) (s0 s0) (s1 s1))
(if (zero? i)
(values s0 s1)
(receive (s0 s1) (let ((pair (vlist-ref vhash (1- i))))
(proc (car pair) (cdr pair) s0 s1))
(lp (1- i) s0 s1)))))
(define (fold2 proc ls s0 s1)
(let lp ((ls ls) (s0 s0) (s1 s1))
(if (null? ls)
(values s0 s1)
(receive (s0 s1) (proc (car ls) s0 s1)
(lp (cdr ls) s0 s1)))))
(define (vector-fold2 proc vect s0 s1)
(let ((len (vector-length vect)))
(let lp ((i 0) (s0 s0) (s1 s1))
(if (< i len)
(receive (s0 s1) (proc (vector-ref vect i) s0 s1)
(lp (1+ i) s0 s1))
(values s0 s1)))))
;; Variable cache cells go in the object table, and serialize as their
;; keys. The reason we wrap the keys in these records is so they don't
;; compare as `equal?' to other objects in the object table.
@ -38,13 +64,6 @@
(define-record <variable-cache-cell> key)
;; Subprograms can be loaded into an object table as well. We need a
;; disjoint type here too. (Subprograms have their own object tables --
;; though probably we should just make one table per compilation unit.)
(define-record <subprogram> table prog)
(define (limn-sources sources)
(let lp ((in sources) (out '()) (filename #f))
(if (null? in)
@ -68,16 +87,132 @@
(else
(lp (cdr in) out filename)))))))
;; Avoid going through the compiler so as to avoid adding to the
;; constant store.
(define (make-meta bindings sources arities tail)
;; sounds silly, but the only case in which we have no arities is when
;; compiling a meta procedure.
(if (and (null? bindings) (null? sources) (null? arities) (null? tail))
#f
(compile-assembly
(make-glil-program '()
(list
(make-glil-const `(,bindings ,sources ,arities ,@tail))
(make-glil-call 'return 1))))))
(let ((body `(,@(dump-object `(,bindings ,sources ,arities ,@tail) 0)
(return))))
`(load-program ()
,(addr+ 0 body)
#f
,@body)))
;; If this is true, the object doesn't need to go in a constant table.
;;
(define (immediate? x)
(object->assembly x))
;; Note: in all of these procedures that build up constant tables, the
;; first (zeroth) index is reserved. At runtime it is replaced with the
;; procedure's module. Hence all of this 1+ length business.
;; Build up a vhash of constant -> index, allowing us to build up a
;; constant table for a whole compilation unit.
;;
(define (build-constant-store x)
(define (add-to-store store x)
(define (add-to-end store x)
(vhash-cons x (1+ (vlist-length store)) store))
(cond
((vhash-assoc x store)
;; Already in the store.
store)
((immediate? x)
;; Immediates don't need to go in the constant table.
store)
((or (number? x)
(string? x)
(symbol? x)
(keyword? x))
;; Atoms.
(add-to-end store x))
((variable-cache-cell? x)
;; Variable cache cells (see below).
(add-to-end (add-to-store store (variable-cache-cell-key x))
x))
((list? x)
;; Add the elements to the store, then the list itself. We could
;; try hashing the cdrs as well, but that seems a bit overkill, and
;; this way we do compress the bytecode a bit by allowing the use of
;; the `list' opcode.
(let ((store (fold (lambda (x store)
(add-to-store store x))
store
x)))
(add-to-end store x)))
((pair? x)
;; Non-lists get caching on both fields.
(let ((store (add-to-store (add-to-store store (car x))
(cdr x))))
(add-to-end store x)))
((and (vector? x)
(equal? (array-shape x) (list (list 0 (1- (vector-length x))))))
;; Likewise, add the elements to the store, then the vector itself.
;; Important for the vectors produced by the psyntax expansion
;; process.
(let ((store (fold (lambda (x store)
(add-to-store store x))
store
(vector->list x))))
(add-to-end store x)))
((array? x)
;; Naive assumption that if folks are using arrays, that perhaps
;; there's not much more duplication.
(add-to-end store x))
(else
(error "build-constant-store: unrecognized object" x))))
(let walk ((x x) (store vlist-null))
(record-case x
((<glil-program> meta body)
(fold walk store body))
((<glil-const> obj)
(add-to-store store obj))
((<glil-kw-prelude> kw)
(add-to-store store kw))
((<glil-toplevel> op name)
;; We don't add toplevel variable cache cells to the global
;; constant table, because they are sensitive to changes in
;; modules as the toplevel expressions are evaluated. So we just
;; add the name.
(add-to-store store name))
((<glil-module> op mod name public?)
;; However, it is fine add module variable cache cells to the
;; global table, as their bindings are not dependent on the
;; current module.
(add-to-store store
(make-variable-cache-cell (list mod name public?))))
(else store))))
;; Analyze one <glil-program> to determine its object table. Produces a
;; vhash of constant to index.
;;
(define (build-object-table x)
(define (add store x)
(if (vhash-assoc x store)
store
(vhash-cons x (1+ (vlist-length store)) store)))
(record-case x
((<glil-program> meta body)
(fold (lambda (x table)
(record-case x
((<glil-program> meta body)
;; Add the GLIL itself to the table.
(add table x))
((<glil-const> obj)
(if (immediate? obj)
table
(add table obj)))
((<glil-kw-prelude> kw)
(add table kw))
((<glil-toplevel> op name)
(add table (make-variable-cache-cell name)))
((<glil-module> op mod name public?)
(add table (make-variable-cache-cell (list mod name public?))))
(else table)))
vlist-null
body))))
;; A functional stack of names of live variables.
(define (make-open-binding name boxed? index)
@ -115,21 +250,6 @@
(lambda (x y) (< (car x) (car y)))))
(close-all-bindings (close-binding bindings end) end)))
;; A functional object table.
(define *module* 1)
(define (assoc-ref-or-acons alist x make-y)
(cond ((assoc-ref alist x)
=> (lambda (y) (values y alist)))
(else
(let ((y (make-y x alist)))
(values y (acons x y alist))))))
(define (object-index-and-alist x alist)
(assoc-ref-or-acons alist x
(lambda (x alist)
(+ (length alist) *module*))))
(define (make-object-table objects)
(and (not (null? objects))
(list->vector (cons #f objects))))
;; A functional arities thingamajiggy.
;; arities := ((ip nreq [[nopt] [[rest] [kw]]]]) ...)
@ -152,82 +272,151 @@
(open-arity start nreq nopt rest kw (close-arity end arities)))
(define (compile-assembly glil)
(receive (code . _)
(glil->assembly glil #t '(()) '() '() #f '() -1)
(car code)))
(let* ((all-constants (build-constant-store glil))
(prog (compile-program glil all-constants))
(len (byte-length prog)))
;; The top objcode thunk. We're going to wrap this thunk in
;; a thunk -- yo dawgs -- with the goal being to lift all
;; constants up to the top level. The store forms a DAG, so
;; we can actually build up later elements in terms of
;; earlier ones.
;;
(cond
((vlist-null? all-constants)
;; No constants: just emit the inner thunk.
prog)
(else
;; We have an object store, so write it out, attach it
;; to the inner thunk, and tail call.
(receive (tablecode addr) (dump-constants all-constants)
(let ((prog (align-program prog addr)))
;; Outer thunk.
`(load-program ()
,(+ (addr+ addr prog)
2 ; for (tail-call 0)
)
#f
;; Load the table, build the inner
;; thunk, then tail call.
,@tablecode
,@prog
(tail-call 0))))))))
(define (glil->assembly glil toplevel? bindings
source-alist label-alist object-alist arities addr)
(define (compile-program glil constants)
(record-case glil
((<glil-program> meta body)
(let lp ((body body) (code '()) (bindings '(())) (source-alist '())
(label-alist '()) (arities '()) (addr 0))
(cond
((null? body)
(let ((code (fold append '() code))
(bindings (close-all-bindings bindings addr))
(sources (limn-sources (reverse! source-alist)))
(labels (reverse label-alist))
(arities (reverse (close-arity addr arities)))
(len addr))
(let* ((meta (make-meta bindings sources arities meta))
(meta-pad (if meta (modulo (- 8 (modulo len 8)) 8) 0)))
`(load-program ,labels
,(+ len meta-pad)
,meta
,@code
,@(if meta
(make-list meta-pad '(nop))
'())))))
(else
(receive (subcode bindings source-alist label-alist arities)
(glil->assembly (car body) bindings
source-alist label-alist
constants arities addr)
(lp (cdr body) (cons subcode code)
bindings source-alist label-alist arities
(addr+ addr subcode)))))))))
(define (compile-objtable constants table addr)
(define (load-constant idx)
(if (< idx 256)
(values `((object-ref ,idx))
2)
(values `((long-object-ref
,(quotient idx 256) ,(modulo idx 256)))
3)))
(cond
((vlist-null? table)
;; Empty table; just return #f.
(values '((make-false))
(1+ addr)))
(else
(call-with-values
(lambda ()
(vhash-fold-right2
(lambda (obj idx codes addr)
(cond
((vhash-assoc obj constants)
=> (lambda (pair)
(receive (load len) (load-constant (cdr pair))
(values (cons load codes)
(+ addr len)))))
((variable-cache-cell? obj)
(cond
((vhash-assoc (variable-cache-cell-key obj) constants)
=> (lambda (pair)
(receive (load len) (load-constant (cdr pair))
(values (cons load codes)
(+ addr len)))))
(else (error "vcache cell key not in table" obj))))
((glil-program? obj)
;; Programs are not cached in the global constants
;; table because when a program is loaded, its module
;; is bound, and we want to do that only after any
;; preceding effectful statements.
(let* ((table (build-object-table obj))
(prog (compile-program obj table)))
(receive (tablecode addr)
(compile-objtable constants table addr)
(let ((prog (align-program prog addr)))
(values (cons `(,@tablecode ,@prog)
codes)
(addr+ addr prog))))))
(else
(error "unrecognized constant" obj))))
table
'(((make-false))) (1+ addr)))
(lambda (elts addr)
(let ((len (1+ (vlist-length table))))
(values
(fold append
`((vector ,(quotient len 256) ,(modulo len 256)))
elts)
(+ addr 3))))))))
(define (glil->assembly glil bindings source-alist label-alist
constants arities addr)
(define (emit-code x)
(values x bindings source-alist label-alist object-alist arities))
(define (emit-code/object x object-alist)
(values x bindings source-alist label-alist object-alist arities))
(values x bindings source-alist label-alist arities))
(define (emit-object-ref i)
(values (if (< i 256)
`((object-ref ,i))
`((long-object-ref ,(quotient i 256) ,(modulo i 256))))
bindings source-alist label-alist arities))
(define (emit-code/arity x nreq nopt rest kw)
(values x bindings source-alist label-alist object-alist
(values x bindings source-alist label-alist
(begin-arity addr (addr+ addr x) nreq nopt rest kw arities)))
(record-case glil
((<glil-program> meta body)
(define (process-body)
(let lp ((body body) (code '()) (bindings '(())) (source-alist '())
(label-alist '()) (object-alist (if toplevel? #f '()))
(arities '()) (addr 0))
(cond
((null? body)
(values (reverse code)
(close-all-bindings bindings addr)
(limn-sources (reverse! source-alist))
(reverse label-alist)
(and object-alist (map car (reverse object-alist)))
(reverse (close-arity addr arities))
addr))
(else
(receive (subcode bindings source-alist label-alist object-alist
arities)
(glil->assembly (car body) #f bindings
source-alist label-alist object-alist
arities addr)
(lp (cdr body) (append (reverse subcode) code)
bindings source-alist label-alist object-alist arities
(addr+ addr subcode)))))))
(receive (code bindings sources labels objects arities len)
(process-body)
(let* ((meta (make-meta bindings sources arities meta))
(meta-pad (if meta (modulo (- 8 (modulo len 8)) 8) 0))
(prog `(load-program ,labels
,(+ len meta-pad)
,meta
,@code
,@(if meta
(make-list meta-pad '(nop))
'()))))
(cond
(toplevel?
;; toplevel bytecode isn't loaded by the vm, no way to do
;; object table or closure capture (not in the bytecode,
;; anyway)
(emit-code (align-program prog addr)))
(else
(let ((table (make-object-table objects)))
(cond
(object-alist
;; if we are being compiled from something with an object
;; table, cache the program there
(receive (i object-alist)
(object-index-and-alist (make-subprogram table prog)
object-alist)
(emit-code/object `(,(if (< i 256)
`(object-ref ,i)
`(long-object-ref ,(quotient i 256)
,(modulo i 256))))
object-alist)))
(else
;; otherwise emit a load directly
(let ((table-code (dump-object table addr)))
(emit-code
`(,@table-code
,@(align-program prog (addr+ addr table-code)))))))))))))
(cond
((vhash-assoc glil constants)
;; We are cached in someone's objtable; just emit a load.
=> (lambda (pair)
(emit-object-ref (cdr pair))))
(else
;; Otherwise, build an objtable for the program, compile it, and
;; emit a load-program.
(let* ((table (build-object-table glil))
(prog (compile-program glil table)))
(receive (tablecode addr) (compile-objtable constants table addr)
(emit-code `(,@tablecode ,@(align-program prog addr))))))))
((<glil-std-prelude> nreq nlocs else-label)
(emit-code/arity
@ -277,61 +466,60 @@
nreq nopt rest #f)))
((<glil-kw-prelude> nreq nopt rest kw allow-other-keys? nlocs else-label)
(receive (kw-idx object-alist)
(object-index-and-alist kw object-alist)
(let* ((bind-required
(if else-label
`((br-if-nargs-lt ,(quotient nreq 256)
,(modulo nreq 256)
,else-label))
`((assert-nargs-ge ,(quotient nreq 256)
,(modulo nreq 256)))))
(ntotal (apply max (+ nreq nopt) (map 1+ (map cdr kw))))
(bind-optionals-and-shuffle
`((bind-optionals/shuffle
,(quotient nreq 256)
,(modulo nreq 256)
,(quotient (+ nreq nopt) 256)
,(modulo (+ nreq nopt) 256)
,(quotient ntotal 256)
,(modulo ntotal 256))))
(bind-kw
;; when this code gets called, all optionals are filled
;; in, space has been made for kwargs, and the kwargs
;; themselves have been shuffled above the slots for all
;; req/opt/kwargs locals.
`((bind-kwargs
,(quotient kw-idx 256)
,(modulo kw-idx 256)
,(quotient ntotal 256)
,(modulo ntotal 256)
,(logior (if rest 2 0)
(if allow-other-keys? 1 0)))))
(bind-rest
(if rest
`((bind-rest ,(quotient ntotal 256)
,(modulo ntotal 256)
,(quotient rest 256)
,(modulo rest 256)))
'())))
(let* ((kw-idx (or (and=> (vhash-assoc kw constants) cdr)
(error "kw not in objtable")))
(bind-required
(if else-label
`((br-if-nargs-lt ,(quotient nreq 256)
,(modulo nreq 256)
,else-label))
`((assert-nargs-ge ,(quotient nreq 256)
,(modulo nreq 256)))))
(ntotal (apply max (+ nreq nopt) (map 1+ (map cdr kw))))
(bind-optionals-and-shuffle
`((bind-optionals/shuffle
,(quotient nreq 256)
,(modulo nreq 256)
,(quotient (+ nreq nopt) 256)
,(modulo (+ nreq nopt) 256)
,(quotient ntotal 256)
,(modulo ntotal 256))))
(bind-kw
;; when this code gets called, all optionals are filled
;; in, space has been made for kwargs, and the kwargs
;; themselves have been shuffled above the slots for all
;; req/opt/kwargs locals.
`((bind-kwargs
,(quotient kw-idx 256)
,(modulo kw-idx 256)
,(quotient ntotal 256)
,(modulo ntotal 256)
,(logior (if rest 2 0)
(if allow-other-keys? 1 0)))))
(bind-rest
(if rest
`((bind-rest ,(quotient ntotal 256)
,(modulo ntotal 256)
,(quotient rest 256)
,(modulo rest 256)))
'())))
(let ((code `(,@bind-required
,@bind-optionals-and-shuffle
,@bind-kw
,@bind-rest
(reserve-locals ,(quotient nlocs 256)
,(modulo nlocs 256)))))
(values code bindings source-alist label-alist object-alist
(begin-arity addr (addr+ addr code) nreq nopt rest
(and kw (cons allow-other-keys? kw))
arities))))))
(let ((code `(,@bind-required
,@bind-optionals-and-shuffle
,@bind-kw
,@bind-rest
(reserve-locals ,(quotient nlocs 256)
,(modulo nlocs 256)))))
(values code bindings source-alist label-alist
(begin-arity addr (addr+ addr code) nreq nopt rest
(and kw (cons allow-other-keys? kw))
arities)))))
((<glil-bind> vars)
(values '()
(open-binding bindings vars addr)
source-alist
label-alist
object-alist
arities))
((<glil-mv-bind> vars rest)
@ -340,13 +528,11 @@
bindings
source-alist
label-alist
object-alist
arities)
(values `((truncate-values ,(length vars) ,(if rest 1 0)))
(open-binding bindings vars addr)
source-alist
label-alist
object-alist
arities)))
((<glil-unbind>)
@ -354,7 +540,6 @@
(close-binding bindings addr)
source-alist
label-alist
object-alist
arities))
((<glil-source> props)
@ -362,7 +547,6 @@
bindings
(acons addr props source-alist)
label-alist
object-alist
arities))
((<glil-void>)
@ -373,16 +557,10 @@
((object->assembly obj)
=> (lambda (code)
(emit-code (list code))))
((not object-alist)
(emit-code (dump-object obj addr)))
(else
(receive (i object-alist)
(object-index-and-alist obj object-alist)
(emit-code/object (if (< i 256)
`((object-ref ,i))
`((long-object-ref ,(quotient i 256)
,(modulo i 256))))
object-alist)))))
((vhash-assoc obj constants)
=> (lambda (pair)
(emit-object-ref (cdr pair))))
(else (error "const not in table" obj))))
((<glil-lexical> local? boxed? op index)
(emit-code
@ -442,30 +620,38 @@
(case op
((ref set)
(cond
((not object-alist)
(emit-code `(,@(dump-object name addr)
(link-now)
,(case op
((ref) '(variable-ref))
((set) '(variable-set))))))
((and=> (vhash-assoc (make-variable-cache-cell name) constants)
cdr)
=> (lambda (i)
(emit-code (if (< i 256)
`((,(case op
((ref) 'toplevel-ref)
((set) 'toplevel-set))
,i))
`((,(case op
((ref) 'long-toplevel-ref)
((set) 'long-toplevel-set))
,(quotient i 256)
,(modulo i 256)))))))
(else
(receive (i object-alist)
(object-index-and-alist (make-variable-cache-cell name)
object-alist)
(emit-code/object (if (< i 256)
`((,(case op
((ref) 'toplevel-ref)
((set) 'toplevel-set))
,i))
`((,(case op
((ref) 'long-toplevel-ref)
((set) 'long-toplevel-set))
,(quotient i 256)
,(modulo i 256))))
object-alist)))))
(let ((i (or (and=> (vhash-assoc name constants) cdr)
(error "toplevel name not in objtable" name))))
(emit-code `(,(if (< i 256)
`(object-ref ,i)
`(long-object-ref ,(quotient i 256)
,(modulo i 256)))
(link-now)
,(case op
((ref) '(variable-ref))
((set) '(variable-set)))))))))
((define)
(emit-code `(,@(dump-object name addr)
(define))))
(let ((i (or (and=> (vhash-assoc name constants) cdr)
(error "toplevel name not in objtable" name))))
(emit-code `(,(if (< i 256)
`(object-ref ,i)
`(long-object-ref ,(quotient i 256)
,(modulo i 256)))
(define)))))
(else
(error "unknown toplevel var kind" op name))))
@ -473,21 +659,19 @@
(let ((key (list mod name public?)))
(case op
((ref set)
(cond
((not object-alist)
(emit-code `(,@(dump-object key addr)
(link-now)
,(case op
((ref) '(variable-ref))
((set) '(variable-set))))))
(else
(receive (i object-alist)
(object-index-and-alist (make-variable-cache-cell key)
object-alist)
(emit-code/object (case op
((ref) `((toplevel-ref ,i)))
((set) `((toplevel-set ,i))))
object-alist)))))
(let ((i (or (and=> (vhash-assoc (make-variable-cache-cell key)
constants) cdr)
(error "module vcache not in objtable" key))))
(emit-code (if (< i 256)
`((,(case op
((ref) 'toplevel-ref)
((set) 'toplevel-set))
,i))
`((,(case op
((ref) 'long-toplevel-ref)
((set) 'long-toplevel-set))
,(quotient i 256)
,(modulo i 256)))))))
(else
(error "unknown module var kind" op key)))))
@ -497,7 +681,6 @@
bindings
source-alist
(acons label (addr+ addr code) label-alist)
object-alist
arities)))
((<glil-branch> inst label)
@ -533,11 +716,6 @@
(cond
((object->assembly x) => list)
((variable-cache-cell? x) (dump-object (variable-cache-cell-key x) addr))
((subprogram? x)
(let ((table-code (dump-object (subprogram-table x) addr)))
`(,@table-code
,@(align-program (subprogram-prog x)
(addr+ addr table-code)))))
((number? x)
`((load-number ,(number->string x))))
((string? x)
@ -608,5 +786,153 @@
,(logand #xff len)))
codes)))))
(else
(error "assemble: unrecognized object" x))))
(error "dump-object: unrecognized object" x))))
(define (dump-constants constants)
(define (ref-or-dump x i addr)
(let ((pair (vhash-assoc x constants)))
(if (and pair (< (cdr pair) i))
(let ((idx (cdr pair)))
(if (< idx 256)
(values `((object-ref ,idx))
(+ addr 2))
(values `((long-object-ref ,(quotient idx 256)
,(modulo idx 256)))
(+ addr 3))))
(dump1 x i addr))))
(define (dump1 x i addr)
(cond
((object->assembly x)
=> (lambda (code)
(values (list code)
(+ (byte-length code) addr))))
((or (number? x)
(string? x)
(symbol? x)
(keyword? x))
;; Atoms.
(let ((code (dump-object x addr)))
(values code (addr+ addr code))))
((variable-cache-cell? x)
(dump1 (variable-cache-cell-key x) i addr))
((list? x)
(receive (codes addr)
(fold2 (lambda (x codes addr)
(receive (subcode addr) (ref-or-dump x i addr)
(values (cons subcode codes) addr)))
x '() addr)
(values (fold append
(let ((len (length x)))
`((list ,(quotient len 256) ,(modulo len 256))))
codes)
(+ addr 3))))
((pair? x)
(receive (car-code addr) (ref-or-dump (car x) i addr)
(receive (cdr-code addr) (ref-or-dump (cdr x) i addr)
(values `(,@car-code ,@cdr-code (cons))
(1+ addr)))))
((and (vector? x)
(equal? (array-shape x) (list (list 0 (1- (vector-length x))))))
(receive (codes addr)
(vector-fold2 (lambda (x codes addr)
(receive (subcode addr) (ref-or-dump x i addr)
(values (cons subcode codes) addr)))
x '() addr)
(values (fold append
(let ((len (vector-length x)))
`((vector ,(quotient len 256) ,(modulo len 256))))
codes)
(+ addr 3))))
((and (array? x) (symbol? (array-type x)))
(receive (type addr) (ref-or-dump (array-type x) i addr)
(receive (shape addr) (ref-or-dump (array-shape x) i addr)
(let ((bv (align-code `(load-array ,(uniform-array->bytevector x))
addr 8 4)))
(values `(,@type ,@shape ,@bv)
(addr+ addr bv))))))
((array? x)
(let ((contents (array-contents x)))
(receive (codes addr)
(vector-fold2 (lambda (x codes addr)
(receive (subcode addr) (ref-or-dump x i addr)
(values (cons subcode codes) addr)))
x '() addr)
(receive (shape addr) (ref-or-dump (array-shape x) i addr)
(values (fold append
(let ((len (vector-length contents)))
`(,@shape
(make-array ,(quotient (ash len -16) 256)
,(logand #xff (ash len -8))
,(logand #xff len))))
codes)
(+ addr 4))))))
(else
(error "write-table: unrecognized object" x))))
(receive (codes addr)
(vhash-fold-right2 (lambda (obj idx code addr)
;; The vector is on the stack. Dup it, push
;; the index, push the val, then vector-set.
(let ((pre `((dup)
,(object->assembly idx))))
(receive (valcode addr) (dump1 obj idx
(addr+ addr pre))
(values (cons* '((vector-set))
valcode
pre
code)
(1+ addr)))))
constants
'(((assert-nargs-ee/locals 1)
;; Push the vector.
(local-ref 0)))
4)
(let* ((len (1+ (vlist-length constants)))
(pre-prog-addr (+ 2 ; reserve-locals
len 3 ; empty vector
2 ; local-set
1 ; new-frame
2 ; local-ref
))
(prog (align-program
`(load-program ()
,(+ addr 1)
#f
;; The `return' will be at the tail of the
;; program. The vector is already pushed
;; on the stack.
. ,(fold append '((return)) codes))
pre-prog-addr)))
(values `(;; Reserve storage for the vector.
(assert-nargs-ee/locals ,(logior 0 (ash 1 3)))
;; Push the vector, and store it in slot 0.
,@(make-list len '(make-false))
(vector ,(quotient len 256) ,(modulo len 256))
(local-set 0)
;; Now we open the call frame.
;;
(new-frame)
;; Now build a thunk to init the constants. It will
;; have the unfinished constant table both as its
;; argument and as its objtable. The former allows it
;; to update the objtable, with vector-set!, and the
;; latter allows init code to refer to previously set
;; values.
;;
;; Grab the vector, to be the objtable.
(local-ref 0)
;; Now the load-program, properly aligned. Pops the vector.
,@prog
;; Grab the vector, as an argument this time.
(local-ref 0)
;; Call the init thunk with the vector as an arg.
(call 1)
;; The thunk also returns the vector. Leave it on the
;; stack for compile-assembly to use.
)
;; The byte length of the init code, which we can
;; determine without folding over the code again.
(+ (addr+ pre-prog-addr prog) ; aligned program
2 ; local-ref
2 ; call
)))))

View file

@ -110,7 +110,9 @@
(rnrs files) ;for the condition types
(srfi srfi-8)
(ice-9 rdelim)
(except (guile) raise))
(except (guile) raise display)
(prefix (only (guile) display)
guile:))
@ -377,6 +379,12 @@ return the characters accumulated in that port."
(else
(display s port)))))
;; Defined here to be able to make use of `with-i/o-encoding-error', but
;; not exported from here, but from `(rnrs io simple)'.
(define* (display object #:optional (port (current-output-port)))
(with-i/o-encoding-error
(guile:display object port)))
;;;
;;; Textual input.

View file

@ -1,6 +1,6 @@
;;; simple.scm --- The R6RS simple I/O library
;; Copyright (C) 2010 Free Software Foundation, Inc.
;; Copyright (C) 2010, 2011 Free Software Foundation, Inc.
;;
;; This library is free software; you can redistribute it and/or
;; modify it under the terms of the GNU Lesser General Public
@ -85,42 +85,76 @@
(import (only (rnrs io ports)
call-with-port
close-port
open-file-input-port
open-file-output-port
eof-object
eof-object?
eof-object?
file-options
native-transcoder
get-char
lookahead-char
get-datum
put-char
put-datum
input-port?
output-port?)
(only (guile) @@
current-input-port
current-output-port
current-error-port
(only (guile)
@@
current-input-port
current-output-port
current-error-port
with-input-from-file
with-output-to-file
define*
open-input-file
open-output-file
close-input-port
close-output-port
read-char
peek-char
read
write-char
newline
display
write)
with-input-from-port
with-output-to-port)
(rnrs base (6))
(rnrs files (6)) ;for the condition types
)
(define display (@@ (rnrs io ports) display))
(define (call-with-input-file filename proc)
(call-with-port (open-file-input-port filename) proc))
(define (call-with-output-file filename proc)
(call-with-port (open-file-output-port filename) proc))
)
(define (with-input-from-file filename thunk)
(call-with-input-file filename
(lambda (port) (with-input-from-port port thunk))))
(define (with-output-to-file filename thunk)
(call-with-output-file filename
(lambda (port) (with-output-to-port port thunk))))
(define (open-input-file filename)
(open-file-input-port filename (file-options) (native-transcoder)))
(define (open-output-file filename)
(open-file-output-port filename (file-options) (native-transcoder)))
(define close-input-port close-port)
(define close-output-port close-port)
(define* (read-char #:optional (port (current-input-port)))
(get-char port))
(define* (peek-char #:optional (port (current-input-port)))
(lookahead-char port))
(define* (read #:optional (port (current-input-port)))
(get-datum port))
(define* (write-char char #:optional (port (current-output-port)))
(put-char port char))
(define* (newline #:optional (port (current-output-port)))
(put-char port #\newline))
(define* (write object #:optional (port (current-output-port)))
(put-datum port object))
)

View file

@ -566,6 +566,8 @@ has just one element then that's the return value."
(mapn (cdr l1) (map cdr rest) (1- len)
(cons (apply f (car l1) (map car rest)) out))))))))
(define map-in-order map)
(define for-each
(case-lambda
((f l)

View file

@ -124,24 +124,32 @@
(pass-if "pointer from bits"
(let* ((bytes (iota (sizeof '*)))
(bv (u8-list->bytevector bytes)))
(bv (u8-list->bytevector bytes))
(fold (case (native-endianness)
((little) fold-right)
((big) fold)
(else (error "unsupported endianness")))))
(= (pointer-address
(make-pointer (bytevector-uint-ref bv 0 (native-endianness)
(sizeof '*))))
(fold-right (lambda (byte address)
(+ byte (* 256 address)))
0
bytes))))
(fold (lambda (byte address)
(+ byte (* 256 address)))
0
bytes))))
(pass-if "dereference-pointer"
(let* ((bytes (iota (sizeof '*)))
(bv (u8-list->bytevector bytes)))
(bv (u8-list->bytevector bytes))
(fold (case (native-endianness)
((little) fold-right)
((big) fold)
(else (error "unsupported endianness")))))
(= (pointer-address
(dereference-pointer (bytevector->pointer bv)))
(fold-right (lambda (byte address)
(+ byte (* 256 address)))
0
bytes)))))
(fold (lambda (byte address)
(+ byte (* 256 address)))
0
bytes)))))
(with-test-prefix "pointer<->string"

View file

@ -1,6 +1,6 @@
;;;; hash.test --- test guile hashing -*- scheme -*-
;;;;
;;;; Copyright (C) 2004, 2005, 2006, 2008 Free Software Foundation, Inc.
;;;; Copyright (C) 2004, 2005, 2006, 2008, 2011 Free Software Foundation, Inc.
;;;;
;;;; This library is free software; you can redistribute it and/or
;;;; modify it under the terms of the GNU Lesser General Public
@ -32,7 +32,10 @@
(hash #t 0))
(pass-if (= 0 (hash #t 1)))
(pass-if (= 0 (hash #f 1)))
(pass-if (= 0 (hash noop 1))))
(pass-if (= 0 (hash noop 1)))
(pass-if (= 0 (hash +inf.0 1)))
(pass-if (= 0 (hash -inf.0 1)))
(pass-if (= 0 (hash +nan.0 1))))
;;;
;;; hashv

View file

@ -572,21 +572,40 @@
eof))
(test-decoding-error (#xc2 #x41 #x42) "UTF-8"
(error ;; 41: should be in the 80..BF range
;; Section 3.9 of Unicode 6.0.0 reads:
;; "If the converter encounters an ill-formed UTF-8 code unit
;; sequence which starts with a valid first byte, but which does
;; not continue with valid successor bytes (see Table 3-7), it
;; must not consume the successor bytes".
;; Glibc/libiconv do not conform to it and instead swallow the
;; #x41. This example appears literally in Section 3.9.
(error ;; 41: invalid successor
#\A ;; 41: valid starting byte
#\B
eof))
(test-decoding-error (#xe0 #x88 #x88) "UTF-8"
(test-decoding-error (#xf0 #x80 #x80 #x41) "UTF-8"
;; According to Unicode 6.0.0, Section 3.9, "the only formal
;; requirement mandated by Unicode conformance for a converter is
;; that the <41> be processed and correctly interpreted as
;; <U+0041>".
(error ;; 2nd byte should be in the A0..BF range
error ;; 80: not a valid starting byte
error ;; 80: not a valid starting byte
#\A
eof))
(test-decoding-error (#xe0 #xa0 #x41 #x42) "UTF-8"
(error ;; 3rd byte should be in the 80..BF range
#\A
#\B
eof))
(test-decoding-error (#xf0 #x88 #x88 #x88) "UTF-8"
(error ;; 2nd byte should be in the 90..BF range
error ;; 88: not a valid starting byte
error ;; 88: not a valid starting byte
error ;; 88: not a valid starting byte
eof))))
(with-test-prefix "call-with-output-string"

View file

@ -301,6 +301,13 @@
(alist (fold alist-cons '() keys values)))
(equal? alist (reverse (vhash-fold alist-cons '() vh)))))
(pass-if "vhash-fold-right"
(let* ((keys '(a b c d e f g d h i))
(values '(1 2 3 4 5 6 7 0 8 9))
(vh (fold vhash-cons vlist-null keys values))
(alist (fold alist-cons '() keys values)))
(equal? alist (vhash-fold-right alist-cons '() vh))))
(pass-if "alist->vhash"
(let* ((keys '(a b c d e f g d h i))
(values '(1 2 3 4 5 6 7 0 8 9))