mirror of
https://git.savannah.gnu.org/git/guile.git
synced 2025-04-30 03:40:34 +02:00
* doc/ref/api-io.texi (I/O Extensions): Document read_wait_fd / write_wait_fd members. (Non-Blocking I/O): New section. * libguile/fports.c (fport_read, fport_write): Return -1 if the operation would block. (fport_wait_fd, scm_make_fptob): Add read/write wait-fd implementation. * libguile/ports-internal.h (scm_t_port_type): Add read_wait_fd / write_wait_fd. * libguile/ports.c (default_read_wait_fd, default_write_wait_fd): New functions. (scm_make_port_type): Initialize default read/write wait fd impls. (trampoline_to_c_read, trampoline_to_scm_read) (trampoline_to_c_write, trampoline_to_scm_write): To Scheme, a return of #f indicates EWOULDBLOCK. (scm_set_port_read_wait_fd, scm_set_port_write_wait_fd): New functions. (port_read_wait_fd, port_write_wait_fd, scm_port_read_wait_fd) (scm_port_write_wait_fd, port_poll, scm_port_poll): New functions. (scm_i_read_bytes, scm_i_write_bytes): Poll if the read or write would block. * libguile/ports.h (scm_set_port_read_wait_fd) (scm_set_port_write_wait_fd): Add declarations. * module/ice-9/ports.scm: Shunt port-poll and port-{read,write}-wait-fd to the internals module. * module/ice-9/sports.scm (current-write-waiter): (current-read-waiter): Implement. * test-suite/tests/ports.test: Adapt non-blocking test to new behavior. * NEWS: Add entry.
2554 lines
103 KiB
Text
@c -*-texinfo-*-
|
|
@c This is part of the GNU Guile Reference Manual.
|
|
@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2007, 2009,
|
|
@c 2010, 2011, 2013 Free Software Foundation, Inc.
|
|
@c See the file guile.texi for copying conditions.
|
|
|
|
@node Input and Output
|
|
@section Input and Output
|
|
|
|
@menu
|
|
* Ports:: The idea of the port abstraction.
|
|
* Reading:: Procedures for reading from a port.
|
|
* Writing:: Procedures for writing to a port.
|
|
* Closing:: Procedures to close a port.
|
|
* Buffering:: Controlling when data is written to ports.
|
|
* Random Access:: Moving around a random access port.
|
|
* Line/Delimited:: Read and write lines or delimited text.
|
|
* Block Reading and Writing:: Reading and writing blocks of text.
|
|
* Default Ports:: Defaults for input, output and errors.
|
|
* Port Types:: Types of port and how to make them.
|
|
* R6RS I/O Ports:: The R6RS port API.
|
|
* I/O Extensions:: Implementing new port types in C.
|
|
* Non-Blocking I/O:: How Guile deals with EWOULDBLOCK.
|
|
* BOM Handling:: Handling of Unicode byte order marks.
|
|
@end menu
|
|
|
|
|
|
@node Ports
|
|
@subsection Ports
|
|
@cindex Port
|
|
|
|
Sequential input/output in Scheme is represented by operations on a
|
|
@dfn{port}. This chapter explains the operations that Guile provides
|
|
for working with ports.
|
|
|
|
Ports are created by opening, for instance @code{open-file} for a file
|
|
(@pxref{File Ports}). Other kinds of ports include @dfn{soft ports} and
|
|
@dfn{string ports} (@pxref{Soft Ports}, and @ref{String Ports}).
|
|
Characters or bytes can be read from an input port and written to an
|
|
output port, or both on an input/output port. A port can be closed
|
|
(@pxref{Closing}) when no longer required, after which any attempt to
|
|
read or write is an error.
|
|
|
|
Ports are garbage collected in the usual way (@pxref{Memory
|
|
Management}), and will be closed at that time if not already closed. In
|
|
this case any errors occurring in the close will not be reported.
|
|
Usually a program will want to explicitly close so as to be sure all its
|
|
operations have been successful, including any buffered writes
|
|
(@pxref{Buffering}). Of course if a program has abandoned something due
|
|
to an error or other condition then closing problems are probably not of
|
|
interest.
|
|
|
|
It is strongly recommended that file ports be closed explicitly when
|
|
no longer required. Most systems have limits on how many files can be
|
|
open, both on a per-process and a system-wide basis. A program that
|
|
uses many files should take care not to hit those limits. The same
|
|
applies to similar system resources such as pipes and sockets.
|
|
|
|
Note that automatic garbage collection is triggered only by memory
|
|
consumption, not by file or other resource usage, so a program cannot
|
|
rely on that to keep it away from system limits. An explicit call to
|
|
@code{gc} can of course be relied on to pick up unreferenced ports.
|
|
If program flow makes it hard to be certain when to close then this
|
|
may be an acceptable way to control resource usage.
|
|
|
|
All file access uses the ``LFS'' large file support functions when
|
|
available, so files bigger than 2 Gbytes (@math{2^31} bytes) can be
|
|
read and written on a 32-bit system.
|
|
|
|
Each port has an associated character encoding that controls how bytes
|
|
read from the port are converted to characters and controls how
|
|
characters written to the port are converted to bytes. When ports are
|
|
created, they inherit their character encoding from the current locale,
|
|
but that can be changed after the port is created.
|
|
|
|
Currently, the ports only work with @emph{non-modal} encodings. Most
|
|
encodings are non-modal, meaning that the conversion of bytes to a
|
|
string doesn't depend on its context: the same byte sequence will always
|
|
return the same string. A couple of modal encodings are in common use,
|
|
like ISO-2022-JP and ISO-2022-KR, and they are not yet supported.
|
|
|
|
@cindex port conversion strategy
|
|
@cindex conversion strategy, port
|
|
@cindex decoding error
|
|
@cindex encoding error
|
|
Each port also has an associated conversion strategy, which determines
|
|
what to do when a Guile character can't be converted to the port's
|
|
encoded character representation for output. There are three possible
|
|
strategies: to raise an error, to replace the character with a hex
|
|
escape, or to replace the character with a substitute character. Port
|
|
conversion strategies are also used when decoding characters from an
|
|
input port.
|
|
|
|
Finally, all ports have associated input and output buffers, as
|
|
appropriate. Buffering is a common strategy to limit the overhead of
|
|
small reads and writes: without buffering, each character fetched from a
|
|
file would involve at least one call into the kernel, and maybe more
|
|
depending on the character and the encoding. Instead, Guile will batch
|
|
reads and writes into internal buffers. However, sometimes you want to
|
|
make output on a port show up immediately. @xref{Buffering}, for more
|
|
on interfaces to control port buffering.
|
|
|
|
@rnindex input-port?
|
|
@deffn {Scheme Procedure} input-port? x
|
|
@deffnx {C Function} scm_input_port_p (x)
|
|
Return @code{#t} if @var{x} is an input port, otherwise return
|
|
@code{#f}. Any object satisfying this predicate also satisfies
|
|
@code{port?}.
|
|
@end deffn
|
|
|
|
@rnindex output-port?
|
|
@deffn {Scheme Procedure} output-port? x
|
|
@deffnx {C Function} scm_output_port_p (x)
|
|
Return @code{#t} if @var{x} is an output port, otherwise return
|
|
@code{#f}. Any object satisfying this predicate also satisfies
|
|
@code{port?}.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} port? x
|
|
@deffnx {C Function} scm_port_p (x)
|
|
Return a boolean indicating whether @var{x} is a port.
|
|
Equivalent to @code{(or (input-port? @var{x}) (output-port?
|
|
@var{x}))}.
|
|
@end deffn
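As a quick illustration of how the three predicates relate, the
following sketch uses string ports (@pxref{String Ports}). The names
@code{in} and @code{out} exist only for this example.

@lisp
(define in (open-input-string "abc"))   ; input-only string port
(define out (open-output-string))       ; output-only string port

(input-port? in)     ; => #t
(output-port? in)    ; => #f
(output-port? out)   ; => #t
(port? out)          ; => #t
(port? "abc")        ; => #f, a string is not a port
@end lisp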
|
|
|
|
@deffn {Scheme Procedure} set-port-encoding! port enc
|
|
@deffnx {C Function} scm_set_port_encoding_x (port, enc)
|
|
Sets the character encoding that will be used to interpret all port I/O.
|
|
@var{enc} is a string containing the name of an encoding. Valid
|
|
encoding names are those
|
|
@url{http://www.iana.org/assignments/character-sets, defined by IANA}.
|
|
@end deffn
|
|
|
|
@defvr {Scheme Variable} %default-port-encoding
|
|
A fluid containing @code{#f} or the name of the encoding to
|
|
be used by default for newly created ports (@pxref{Fluids and Dynamic
|
|
States}). The value @code{#f} is equivalent to @code{"ISO-8859-1"}.
|
|
|
|
New ports are created with the encoding appropriate for the current
|
|
locale if @code{setlocale} has been called or the value specified by
|
|
this fluid otherwise.
|
|
@end defvr
|
|
|
|
@deffn {Scheme Procedure} port-encoding port
|
|
@deffnx {C Function} scm_port_encoding (port)
|
|
Returns, as a string, the character encoding that @var{port} uses to interpret
|
|
its input and output. The value @code{#f} is equivalent to @code{"ISO-8859-1"}.
|
|
@end deffn
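The following is a minimal sketch of setting and querying a port's
encoding, and of changing the default for a limited dynamic extent.
It assumes a string port holding only ASCII text, so the choice of
encoding does not change what is read. Which encoding a freshly
created port of a given type picks up may depend on the Guile version
and the locale, so no result is shown for the last expression.

@lisp
(define p (open-input-string "hello"))

;; Choose the encoding used to interpret this port's bytes.
(set-port-encoding! p "ISO-8859-1")
(port-encoding p)    ; => "ISO-8859-1"

;; Change the default only for ports created inside the dynamic
;; extent of with-fluids.
(with-fluids ((%default-port-encoding "UTF-8"))
  (let ((q (open-input-string "x")))
    (port-encoding q)))
@end lisp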
|
|
|
|
@deffn {Scheme Procedure} set-port-conversion-strategy! port sym
|
|
@deffnx {C Function} scm_set_port_conversion_strategy_x (port, sym)
|
|
Sets the behavior of Guile when outputting a character that is not
|
|
representable in the port's current encoding, or when Guile encounters a
|
|
decoding error when trying to read a character. @var{sym} can be either
|
|
@code{error}, @code{substitute}, or @code{escape}.
|
|
|
|
If @var{port} is an open port, the conversion error behavior
|
|
is set for that port. If it is @code{#f}, it is set as the
|
|
default behavior for any future ports that get created in
|
|
this thread.
|
|
@end deffn
|
|
|
|
For an output port, there are three possible port conversion
|
|
strategies. The @code{error} strategy will throw an error when a
|
|
nonconvertible character is encountered. The @code{substitute} strategy
|
|
will replace nonconvertible characters with a question mark (@samp{?}).
|
|
Finally the @code{escape} strategy will print nonconvertible characters
|
|
as a hex escape, using the escaping that is recognized by Guile's string
|
|
syntax. Note that if the port's encoding is a Unicode encoding, like
|
|
@code{UTF-8}, then encoding errors are impossible.
|
|
|
|
For an input port, the @code{error} strategy will cause Guile to throw
|
|
an error if it encounters an invalid encoding, such as might happen if
|
|
you tried to read @code{ISO-8859-1} as @code{UTF-8}. The error is
|
|
thrown before advancing the read position. The @code{substitute}
|
|
strategy will replace the bad bytes with a U+FFFD replacement character,
|
|
in accordance with Unicode recommendations. When reading from an input
|
|
port, the @code{escape} strategy is treated as if it were @code{error}.
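
The output strategies described above can be tried with a sketch like
the following. It assumes that a string output port honors an encoding
set after creation, and the helper name @code{encode-lambda} is
hypothetical. The character written, U+03BB, has no representation in
ISO-8859-1.

@lisp
;; Write a character that ISO-8859-1 cannot represent, using the
;; given conversion strategy, and return what ended up in the port.
(define (encode-lambda strategy)
  (call-with-output-string
    (lambda (port)
      (set-port-encoding! port "ISO-8859-1")
      (set-port-conversion-strategy! port strategy)
      (write-char (integer->char #x3bb) port))))

(encode-lambda 'substitute)   ; a question mark, per the text above
(encode-lambda 'escape)       ; a hex escape in Guile string syntax

;; With 'error, the write throws instead of producing output;
;; here the handler simply returns the key that was thrown.
(catch #t
  (lambda () (encode-lambda 'error))
  (lambda (key . args) key))
@end lisp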
|
|
|
|
@deffn {Scheme Procedure} port-conversion-strategy port
|
|
@deffnx {C Function} scm_port_conversion_strategy (port)
|
|
Returns the behavior of the port when outputting a character that is not
|
|
representable in the port's current encoding.
|
|
|
|
If @var{port} is @code{#f}, then the current default behavior will be
|
|
returned. New ports will have this default behavior when they are
|
|
created.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Variable} %default-port-conversion-strategy
|
|
The fluid that defines the conversion strategy for newly created ports,
|
|
and for other conversion routines such as @code{scm_to_stringn},
|
|
@code{scm_from_stringn}, @code{string->pointer}, and
|
|
@code{pointer->string}.
|
|
|
|
Its value must be one of the symbols described above, with the same
|
|
semantics: @code{error}, @code{substitute}, or @code{escape}.
|
|
|
|
When Guile starts, its value is @code{substitute}.
|
|
|
|
Note that @code{(set-port-conversion-strategy! #f @var{sym})} is
|
|
equivalent to @code{(fluid-set! %default-port-conversion-strategy
|
|
@var{sym})}.
|
|
@end deffn
|
|
|
|
|
|
@node Reading
|
|
@subsection Reading
|
|
@cindex Reading
|
|
|
|
These procedures pertain to reading characters and strings from
|
|
ports. To read general S-expressions from ports, see @ref{Scheme Read}.
|
|
|
|
@rnindex eof-object?
|
|
@cindex End of file object
|
|
@deffn {Scheme Procedure} eof-object? x
|
|
@deffnx {C Function} scm_eof_object_p (x)
|
|
Return @code{#t} if @var{x} is an end-of-file object; otherwise
|
|
return @code{#f}.
|
|
@end deffn
|
|
|
|
@rnindex char-ready?
|
|
@deffn {Scheme Procedure} char-ready? [port]
|
|
@deffnx {C Function} scm_char_ready_p (port)
|
|
Return @code{#t} if a character is ready on input @var{port}
|
|
and return @code{#f} otherwise. If @code{char-ready?} returns
|
|
@code{#t} then the next @code{read-char} operation on
|
|
@var{port} is guaranteed not to hang. If @var{port} is a file
|
|
port at end of file then @code{char-ready?} returns @code{#t}.
|
|
|
|
@code{char-ready?} exists to make it possible for a
|
|
program to accept characters from interactive ports without
|
|
getting stuck waiting for input. Any input editors associated
|
|
with such ports must make sure that characters whose existence
|
|
has been asserted by @code{char-ready?} cannot be rubbed out.
|
|
If @code{char-ready?} were to return @code{#f} at end of file,
|
|
a port at end of file would be indistinguishable from an
|
|
interactive port that has no ready characters.
|
|
@end deffn
|
|
|
|
@rnindex read-char
|
|
@deffn {Scheme Procedure} read-char [port]
|
|
@deffnx {C Function} scm_read_char (port)
|
|
Return the next character available from @var{port}, updating @var{port}
|
|
to point to the following character. If no more characters are
|
|
available, the end-of-file object is returned. A decoding error, if
|
|
any, is handled in accordance with the port's conversion strategy.
|
|
@end deffn
|
|
|
|
@deftypefn {C Function} size_t scm_c_read (SCM port, void *buffer, size_t size)
|
|
Read up to @var{size} bytes from @var{port} and store them in
|
|
@var{buffer}. The return value is the number of bytes actually read,
|
|
which can be less than @var{size} if end-of-file has been reached.
|
|
|
|
Note that this function does not update @code{port-line} and
|
|
@code{port-column} below.
|
|
@end deftypefn
|
|
|
|
@rnindex peek-char
|
|
@deffn {Scheme Procedure} peek-char [port]
|
|
@deffnx {C Function} scm_peek_char (port)
|
|
Return the next character available from @var{port},
|
|
@emph{without} updating @var{port} to point to the following
|
|
character. If no more characters are available, the
|
|
end-of-file object is returned.
|
|
|
|
The value returned by
|
|
a call to @code{peek-char} is the same as the value that would
|
|
have been returned by a call to @code{read-char} on the same
|
|
port. The only difference is that the very next call to
|
|
@code{read-char} or @code{peek-char} on that @var{port} will
|
|
return the value returned by the preceding call to
|
|
@code{peek-char}. In particular, a call to @code{peek-char} on
|
|
an interactive port will hang waiting for input whenever a call
|
|
to @code{read-char} would have hung.
|
|
|
|
As for @code{read-char}, decoding errors are handled in accordance with
|
|
the port's conversion strategy.
|
|
@end deffn
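A short sketch of how @code{peek-char}, @code{read-char} and the
end-of-file object interact, using a string port. The helper
@code{port->char-list} is a hypothetical name introduced only for this
example.

@lisp
(define p (open-input-string "ab"))

(peek-char p)                ; => #\a, read position unchanged
(read-char p)                ; => #\a
(read-char p)                ; => #\b
(eof-object? (peek-char p))  ; => #t, nothing left
(eof-object? (read-char p))  ; => #t

;; Collect every remaining character of PORT into a list,
;; stopping at the end-of-file object.
(define (port->char-list port)
  (let loop ((acc '()))
    (let ((c (read-char port)))
      (if (eof-object? c)
          (reverse acc)
          (loop (cons c acc))))))

(port->char-list (open-input-string "hi"))   ; => (#\h #\i)
@end lisp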
|
|
|
|
@deffn {Scheme Procedure} unread-char cobj [port]
|
|
@deffnx {C Function} scm_unread_char (cobj, port)
|
|
Place character @var{cobj} in @var{port} so that it will be read by the
|
|
next read operation. If called multiple times, the unread characters
|
|
will be read again in last-in first-out order. If @var{port} is
|
|
not supplied, the current input port is used.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} unread-string str port
|
|
@deffnx {C Function} scm_unread_string (str, port)
|
|
Place the string @var{str} in @var{port} so that its characters will
|
|
be read from left-to-right as the next characters from @var{port}
|
|
during subsequent read operations. If called multiple times, the
|
|
unread characters will be read again in last-in first-out order. If
|
|
@var{port} is not supplied, the @code{current-input-port} is used.
|
|
@end deffn
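The ordering rules for unread characters can be seen in a small
sketch like this one, which pushes data back onto a string port.

@lisp
(define p (open-input-string "cd"))

;; The characters of an unread string come back left to right.
(unread-string "ab" p)
(read-char p)        ; => #\a
(read-char p)        ; => #\b
(read-char p)        ; => #\c

;; Separate unread-char calls come back last-in first-out.
(unread-char #\y p)
(unread-char #\x p)
(read-char p)        ; => #\x
(read-char p)        ; => #\y
(read-char p)        ; => #\d
@end lisp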
|
|
|
|
@deffn {Scheme Procedure} drain-input port
|
|
@deffnx {C Function} scm_drain_input (port)
|
|
This procedure clears a port's input buffers, similar
|
|
to the way that @code{force-output} clears the output buffer. The
|
|
contents of the buffers are returned as a single string, e.g.,
|
|
|
|
@lisp
|
|
(define p (open-input-file ...))
|
|
(drain-input p) => empty string, nothing buffered yet.
|
|
(unread-char (read-char p) p)
|
|
(drain-input p) => initial chars from p, up to the buffer size.
|
|
@end lisp
|
|
|
|
Draining the buffers may be useful for cleanly finishing
|
|
buffered I/O so that the file descriptor can be used directly
|
|
for further input.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} port-column port
|
|
@deffnx {Scheme Procedure} port-line port
|
|
@deffnx {C Function} scm_port_column (port)
|
|
@deffnx {C Function} scm_port_line (port)
|
|
Return the current column number or line number of @var{port}.
|
|
If the number is
|
|
unknown, the result is @code{#f}. Otherwise, the result is a 0-origin integer
|
|
- i.e.@: the first character of the first line is line 0, column 0.
|
|
(However, when you display a file position, for example in an error
|
|
message, we recommend you add 1 to get 1-origin integers. This is
|
|
because lines and column numbers traditionally start with 1, and that is
|
|
what non-programmers will find most natural.)
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} set-port-column! port column
|
|
@deffnx {Scheme Procedure} set-port-line! port line
|
|
@deffnx {C Function} scm_set_port_column_x (port, column)
|
|
@deffnx {C Function} scm_set_port_line_x (port, line)
|
|
Set the current column or line number of @var{port}.
|
|
@end deffn
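As a sketch of the recommendation above, the hypothetical helper
@code{port-position-string} converts the 0-origin values to the
1-origin form usually shown to users. It assumes both values are
known, that is, neither is @code{#f}.

@lisp
;; Format a port's position as LINE:COLUMN with 1-origin numbers.
(define (port-position-string port)
  (simple-format #f "~A:~A"
                 (+ 1 (port-line port))
                 (+ 1 (port-column port))))

(define p (open-input-string "ab\ncd"))
(port-position-string p)   ; => "1:1"

;; Consume "ab" and the newline.
(read-char p)
(read-char p)
(read-char p)

(port-line p)              ; => 1
(port-column p)            ; => 0
(port-position-string p)   ; => "2:1"
@end lisp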
|
|
|
|
@node Writing
|
|
@subsection Writing
|
|
@cindex Writing
|
|
|
|
These procedures are for writing characters and strings to
|
|
ports. For more information on writing arbitrary Scheme objects to
|
|
ports, see @ref{Scheme Write}.
|
|
|
|
@deffn {Scheme Procedure} get-print-state port
|
|
@deffnx {C Function} scm_get_print_state (port)
|
|
Return the print state of the port @var{port}. If @var{port}
|
|
has no associated print state, @code{#f} is returned.
|
|
@end deffn
|
|
|
|
@rnindex newline
|
|
@deffn {Scheme Procedure} newline [port]
|
|
@deffnx {C Function} scm_newline (port)
|
|
Send a newline to @var{port}.
|
|
If @var{port} is omitted, send to the current output port.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} port-with-print-state port [pstate]
|
|
@deffnx {C Function} scm_port_with_print_state (port, pstate)
|
|
Create a new port which behaves like @var{port}, but with an
|
|
included print state @var{pstate}. @var{pstate} is optional.
|
|
If @var{pstate} isn't supplied and @var{port} already has
|
|
a print state, the old print state is reused.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} simple-format destination message . args
|
|
@deffnx {C Function} scm_simple_format (destination, message, args)
|
|
Write @var{message} to @var{destination}, defaulting to the current
|
|
output port. @var{message} can contain @code{~A} and @code{~S} escapes.
|
|
When printed, the escapes are replaced with corresponding members of
|
|
@var{args}: @code{~A} formats using @code{display} and @code{~S} formats
|
|
using @code{write}. If @var{destination} is @code{#t}, then use the
|
|
current output port; if @var{destination} is @code{#f}, then return a
|
|
string containing the formatted text. Does not add a trailing newline.
|
|
@end deffn
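For instance, the difference between the two escapes shows up when
formatting a string, since @code{write} adds quotes and
@code{display} does not:

@lisp
;; ~A uses display, ~S uses write, so the second copy is quoted.
(simple-format #f "~A vs. ~S" "text" "text")
;; => "text vs. \"text\""

;; #t sends the output to the current output port.  No newline is
;; added, so one is written explicitly here.
(simple-format #t "read ~A bytes" 42)
(newline)
@end lisp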
|
|
|
|
@rnindex write-char
|
|
@deffn {Scheme Procedure} write-char chr [port]
|
|
@deffnx {C Function} scm_write_char (chr, port)
|
|
Send character @var{chr} to @var{port}.
|
|
@end deffn
|
|
|
|
@deftypefn {C Function} void scm_c_write (SCM port, const void *buffer, size_t size)
|
|
Write @var{size} bytes at @var{buffer} to @var{port}.
|
|
|
|
Note that this function does not update @code{port-line} and
|
|
@code{port-column} (@pxref{Reading}).
|
|
@end deftypefn
|
|
|
|
@deftypefn {C Function} void scm_lfwrite (const char *buffer, size_t size, SCM port)
|
|
Write @var{size} bytes at @var{buffer} to @var{port}. The @code{lf}
|
|
indicates that unlike @code{scm_c_write}, this function updates the
|
|
port's @code{port-line} and @code{port-column}, and also flushes the
|
|
port if the data contains a newline (@code{\n}) and the port is
|
|
line-buffered.
|
|
@end deftypefn
|
|
|
|
@findex fflush
|
|
@deffn {Scheme Procedure} force-output [port]
|
|
@deffnx {C Function} scm_force_output (port)
|
|
Flush the specified output port, or the current output port if @var{port}
|
|
is omitted. The current output buffer contents are passed to the
|
|
underlying port implementation (e.g., in the case of fports, the
|
|
data will be written to the file and the output buffer will be cleared).
|
|
It has no effect on an unbuffered port.
|
|
|
|
The return value is unspecified.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} flush-all-ports
|
|
@deffnx {C Function} scm_flush_all_ports ()
|
|
Equivalent to calling @code{force-output} on
|
|
all open output ports. The return value is unspecified.
|
|
@end deffn
|
|
|
|
|
|
@node Closing
|
|
@subsection Closing
|
|
@cindex Closing ports
|
|
@cindex Port, close
|
|
|
|
@deffn {Scheme Procedure} close-port port
|
|
@deffnx {C Function} scm_close_port (port)
|
|
Close the specified port object. Return @code{#t} if it successfully
|
|
closes a port or @code{#f} if it was already closed. An exception may
|
|
be raised if an error occurs, for example when flushing buffered output.
|
|
@xref{Buffering}, for more on buffered output. See also @ref{Ports and
|
|
File Descriptors, close}, for a procedure which can close file
|
|
descriptors.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} close-input-port port
|
|
@deffnx {Scheme Procedure} close-output-port port
|
|
@deffnx {C Function} scm_close_input_port (port)
|
|
@deffnx {C Function} scm_close_output_port (port)
|
|
@rnindex close-input-port
|
|
@rnindex close-output-port
|
|
Close the specified input or output @var{port}. An exception may be
|
|
raised if an error occurs while closing. If @var{port} is already
|
|
closed, nothing is done. The return value is unspecified.
|
|
|
|
See also @ref{Ports and File Descriptors, close}, for a procedure
|
|
which can close file descriptors.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} port-closed? port
|
|
@deffnx {C Function} scm_port_closed_p (port)
|
|
Return @code{#t} if @var{port} is closed or @code{#f} if it is
|
|
open.
|
|
@end deffn
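The return values described in this section can be observed with a
string port, as in this small sketch:

@lisp
(define p (open-input-string "data"))

(port-closed? p)   ; => #f
(close-port p)     ; => #t, the port was open and is now closed
(port-closed? p)   ; => #t
(close-port p)     ; => #f, it was already closed
@end lisp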
|
|
|
|
|
|
@node Buffering
|
|
@subsection Buffering
|
|
@cindex Port, buffering
|
|
|
|
Every port has associated input and output buffers. You can think of
|
|
ports as being backed by some mutable store, and that store might be far
|
|
away. For example, ports backed by file descriptors have to go all the
|
|
way to the kernel to read and write their data. To avoid this
|
|
round-trip cost, Guile usually reads in data from the mutable store in
|
|
chunks, and then services small requests like @code{get-char} out of
|
|
that intermediate buffer. Similarly, small writes like
|
|
@code{write-char} first go to a buffer, and are sent to the store when
|
|
the buffer is full (or when the port is flushed). Buffered ports speed up
|
|
your program by reducing the number of round-trips to the mutable store,
|
|
and they do so in a way that is mostly transparent to the user.
|
|
|
|
There are two major ways, however, in which buffering affects program
|
|
semantics. Building correct, performant programs requires understanding
|
|
these situations.
|
|
|
|
The first case is in random-access read/write ports (@pxref{Random
|
|
Access}). These ports, usually backed by a file, logically operate over
|
|
the same mutable store when both reading and writing. So, if you read a
|
|
character, causing the buffer to fill, then write a character, the bytes
|
|
you filled in your read buffer are now invalid. Every time you switch
|
|
between reading and writing, Guile has to flush any pending buffer. If
|
|
this happens frequently, the cost can be high. In that case you should
|
|
reduce the amount that you buffer, in both directions. Similarly, Guile
|
|
has to flush buffers before seeking. None of these considerations apply
|
|
to sockets, which don't logically read from and write to the same
|
|
mutable store, and are not seekable. Note also that sockets are
|
|
unbuffered by default. @xref{Network Sockets and Communication}.
|
|
|
|
The second case is the more pernicious one. If you write data to a
|
|
buffered port, it probably hasn't gone out to the mutable store yet.
|
|
(This ``probably'' introduces some indeterminism in your program: what
|
|
goes to the store, and when, depends on how full the buffer is. It is
|
|
something that the user needs to explicitly be aware of.) The data is
|
|
written to the store later -- when the buffer fills up due to another
|
|
write, or when @code{force-output} is called, or when @code{close-port}
|
|
is called, or when the program exits, or even when the garbage collector
|
|
runs. The salient point is, @emph{the errors are signalled then too}.
|
|
Buffered writes defer error detection (and defer the side effects to the
|
|
mutable store), perhaps indefinitely if the port type does not need to
|
|
be closed at GC.
|
|
|
|
One common heuristic that works well for textual ports is to flush
|
|
output when a newline (@code{\n}) is written. This @dfn{line buffering}
|
|
mode is on by default for TTY ports. Most other ports are @dfn{block
|
|
buffered}, meaning that once the output buffer reaches the block size,
|
|
which depends on the port and its configuration, the output is flushed
|
|
as a block, without regard to what is in the block. Likewise reads are
|
|
read in at the block size, though if there are fewer bytes available to
|
|
read, the buffer may not be entirely filled.
|
|
|
|
Note that binary reads or writes that are larger than the buffer size go
|
|
directly to the mutable store without passing through the buffers. If
|
|
your access pattern involves many big reads or writes, buffering might
|
|
not matter so much to you.
|
|
|
|
To control the buffering behavior of a port, use @code{setvbuf}.
|
|
|
|
@deffn {Scheme Procedure} setvbuf port mode [size]
|
|
@deffnx {C Function} scm_setvbuf (port, mode, size)
|
|
@cindex port buffering
|
|
Set the buffering mode for @var{port}. @var{mode} can be one of the
|
|
following symbols:
|
|
|
|
@table @code
|
|
@item none
|
|
non-buffered
|
|
@item line
|
|
line buffered
|
|
@item block
|
|
block buffered, using a newly allocated buffer of @var{size} bytes.
|
|
If @var{size} is omitted, a default size will be used.
|
|
@end table
|
|
@end deffn
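The deferred effect of buffered writes can be made visible with a
sketch like the one below. It assumes a POSIX system with a writable
@file{/tmp} directory and a Guile version in which @code{setvbuf}
accepts the symbols listed above; the file name template and the
variable names are arbitrary choices for this example.

@lisp
;; mkstemp! creates and opens a fresh temporary file.
(define out (mkstemp! (string-copy "/tmp/guile-buffering-XXXXXX")))
(define name (port-filename out))

(setvbuf out 'block 4096)     ; block buffering, 4096-byte buffer
(display "hello" out)         ; lands in the buffer first

;; Typically still 0 here: nothing has been flushed yet.
(define size-before (stat:size (stat name)))

(force-output out)            ; push the buffer out to the file

;; The five bytes are now in the file.
(define size-after (stat:size (stat name)))

(close-port out)
(delete-file name)
@end lisp

Comparing @code{size-before} with @code{size-after} shows when the
data actually reached the mutable store.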
|
|
|
|
Another way to set the buffering, for file ports, is to open the file
|
|
with @code{0} or @code{l} as part of the mode string, for unbuffered or
|
|
line-buffered ports, respectively. @xref{File Ports}, for more.
|
|
|
|
All of these considerations are very similar to those of streams in the
|
|
C library, although Guile's ports are not built on top of C streams.
|
|
Still, it is useful to read what other systems do.
|
|
@xref{Streams,,,libc,The GNU C Library Reference Manual}, for more
|
|
discussion on C streams.
|
|
|
|
|
|
@node Random Access
|
|
@subsection Random Access
|
|
@cindex Random access, ports
|
|
@cindex Port, random access
|
|
|
|
@deffn {Scheme Procedure} seek fd_port offset whence
|
|
@deffnx {C Function} scm_seek (fd_port, offset, whence)
|
|
Sets the current position of @var{fd_port} to the integer
|
|
@var{offset}. For a file port, @var{offset} is expressed
|
|
as a number of bytes; for other types of ports, such as string
|
|
ports, @var{offset} is an abstract representation of the
|
|
position within the port's data, not necessarily expressed
|
|
as a number of bytes. @var{offset} is interpreted according to
|
|
the value of @var{whence}.
|
|
|
|
One of the following variables should be supplied for
|
|
@var{whence}:
|
|
@defvar SEEK_SET
|
|
Seek from the beginning of the file.
|
|
@end defvar
|
|
@defvar SEEK_CUR
|
|
Seek from the current position.
|
|
@end defvar
|
|
@defvar SEEK_END
|
|
Seek from the end of the file.
|
|
@end defvar
|
|
If @var{fd_port} is a file descriptor, the underlying system
|
|
call is @code{lseek}. @var{fd_port} may also be a string port.
|
|
|
|
The value returned is the new position in @var{fd_port}. This means
|
|
that the current position of a port can be obtained using:
|
|
@lisp
|
|
(seek port 0 SEEK_CUR)
|
|
@end lisp
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} ftell fd_port
|
|
@deffnx {C Function} scm_ftell (fd_port)
|
|
Return an integer representing the current position of
|
|
@var{fd_port}, measured from the beginning. Equivalent to:
|
|
|
|
@lisp
|
|
(seek port 0 SEEK_CUR)
|
|
@end lisp
|
|
@end deffn
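A brief sketch of @code{seek} and @code{ftell} on a string port. The
text is plain ASCII so that positions are unambiguous; as noted above,
offsets on non-file ports are an abstract representation of the
position.

@lisp
(define p (open-input-string "hello world"))

(seek p 6 SEEK_SET)   ; => 6, the new position
(read-char p)         ; => #\w
(ftell p)             ; => 7
(seek p 0 SEEK_CUR)   ; => 7, the same query spelled with seek
@end lisp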
|
|
|
|
@findex truncate
|
|
@findex ftruncate
|
|
@deffn {Scheme Procedure} truncate-file file [length]
|
|
@deffnx {C Function} scm_truncate_file (file, length)
|
|
Truncate @var{file} to @var{length} bytes. @var{file} can be a
|
|
filename string, a port object, or an integer file descriptor. The
|
|
return value is unspecified.
|
|
|
|
For a port or file descriptor @var{length} can be omitted, in which
|
|
case the file is truncated at the current position (per @code{ftell}
|
|
above).
|
|
|
|
On most systems a file can be extended by giving a length greater than
|
|
the current size, but this is not mandatory in the POSIX standard.
|
|
@end deffn
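The following sketch truncates a file through its port. It assumes a
POSIX system with a writable @file{/tmp} directory; the file name
template and variable names are arbitrary.

@lisp
(define port (mkstemp! (string-copy "/tmp/guile-truncate-XXXXXX")))
(define name (port-filename port))

(display "0123456789" port)
(force-output port)            ; make sure all 10 bytes reach the file
(truncate-file port 4)         ; keep only the first 4 bytes
(close-port port)

(define final-size (stat:size (stat name)))   ; 4 bytes remain
(delete-file name)
@end lisp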
|
|
|
|
@node Line/Delimited
|
|
@subsection Line Oriented and Delimited Text
|
|
@cindex Line input/output
|
|
@cindex Port, line input/output
|
|
|
|
The delimited-I/O module can be accessed with:
|
|
|
|
@lisp
|
|
(use-modules (ice-9 rdelim))
|
|
@end lisp
|
|
|
|
It can be used to read or write lines of text, or read text delimited by
|
|
a specified set of characters. It's similar to the @code{(scsh rdelim)}
|
|
module from guile-scsh, but does not use multiple values or character
|
|
sets and has an extra procedure @code{write-line}.
|
|
|
|
@c begin (scm-doc-string "rdelim.scm" "read-line")
|
|
@deffn {Scheme Procedure} read-line [port] [handle-delim]
|
|
Return a line of text from @var{port} if specified, otherwise from the
|
|
value returned by @code{(current-input-port)}. Under Unix, a line of text
|
|
is terminated by the first end-of-line character or by end-of-file.
|
|
|
|
If @var{handle-delim} is specified, it should be one of the following
|
|
symbols:
|
|
@table @code
|
|
@item trim
|
|
Discard the terminating delimiter. This is the default, but it will
|
|
be impossible to tell whether the read terminated with a delimiter or
|
|
end-of-file.
|
|
@item concat
|
|
Append the terminating delimiter (if any) to the returned string.
|
|
@item peek
|
|
Push the terminating delimiter (if any) back on to the port.
|
|
@item split
|
|
Return a pair containing the string read from the port and the
|
|
terminating delimiter or end-of-file object.
|
|
@end table
|
|
@end deffn
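The effect of each @var{handle-delim} value can be compared with a
small sketch; @code{first-line} is a hypothetical helper defined only
for this example.

@lisp
(use-modules (ice-9 rdelim))

;; Read the first line of TEXT using the given handle-delim symbol.
(define (first-line text handle-delim)
  (call-with-input-string text
    (lambda (port) (read-line port handle-delim))))

(first-line "one\ntwo" 'trim)     ; => "one"
(first-line "one\ntwo" 'concat)   ; => "one\n"
(first-line "one\ntwo" 'split)    ; => ("one" . #\newline)
(first-line "one" 'split)         ; => ("one" . #<eof>)
@end lisp

Only @code{concat} and @code{split} let the caller tell a line ended
by a delimiter apart from one ended by end-of-file.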
|
|
|
|
@c begin (scm-doc-string "rdelim.scm" "read-line!")
|
|
@deffn {Scheme Procedure} read-line! buf [port]
|
|
Read a line of text into the supplied string @var{buf} and return the
|
|
number of characters added to @var{buf}. If @var{buf} is filled, then
|
|
@code{#f} is returned.
|
|
Read from @var{port} if
|
|
specified, otherwise from the value returned by @code{(current-input-port)}.
|
|
@end deffn
|
|
|
|
@c begin (scm-doc-string "rdelim.scm" "read-delimited")
|
|
@deffn {Scheme Procedure} read-delimited delims [port] [handle-delim]
|
|
Read text until one of the characters in the string @var{delims} is found
|
|
or end-of-file is reached. Read from @var{port} if supplied, otherwise
|
|
from the value returned by @code{(current-input-port)}.
|
|
@var{handle-delim} takes the same values as described for @code{read-line}.
|
|
@end deffn
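As an illustration, successive calls with different delimiter sets can
pick apart simple structured text. The helper @code{read-pair} and the
input format are made up for this sketch.

@lisp
(use-modules (ice-9 rdelim))

;; Split text of the form "key=value;..." into a (key . value) pair.
;; With the default handle-delim, the delimiters are discarded.
(define (read-pair port)
  (let* ((key   (read-delimited "=" port))
         (value (read-delimited ";" port)))
    (cons key value)))

(call-with-input-string "name=guile;rest" read-pair)
;; => ("name" . "guile")
@end lisp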
|
|
|
|
@c begin (scm-doc-string "rdelim.scm" "read-delimited!")
|
|
@deffn {Scheme Procedure} read-delimited! delims buf [port] [handle-delim] [start] [end]
|
|
Read text into the supplied string @var{buf}.
|
|
|
|
If a delimiter was found, return the number of characters written,
|
|
except if @var{handle-delim} is @code{split}, in which case the return
|
|
value is a pair, as noted above.
|
|
|
|
As a special case, if @var{port} was already at end-of-stream, the EOF
|
|
object is returned. Also, if no characters were written because the
|
|
buffer was full, @code{#f} is returned.
|
|
|
|
It's something of a wacky interface, to be honest.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} write-line obj [port]
|
|
@deffnx {C Function} scm_write_line (obj, port)
|
|
Display @var{obj} and a newline character to @var{port}. If
|
|
@var{port} is not specified, @code{(current-output-port)} is
|
|
used. This function is equivalent to:
|
|
@lisp
|
|
(display obj [port])
|
|
(newline [port])
|
|
@end lisp
|
|
@end deffn
|
|
|
|
In the past, Guile did not have a procedure that would just read out all
|
|
of the characters from a port. As a workaround, many people just called
|
|
@code{read-delimited} with no delimiters, knowing that would produce the
|
|
behavior they wanted. This prompted Guile developers to add some
|
|
routines that would read all characters from a port. So it is that
|
|
@code{(ice-9 rdelim)} is also the home for procedures that can read
|
|
undelimited text:
|
|
|
|
@deffn {Scheme Procedure} read-string [port] [count]
|
|
Read all of the characters out of @var{port} and return them as a
|
|
string. If the @var{count} is present, treat it as a limit to the
|
|
number of characters to read.
|
|
|
|
By default, read from the current input port, with no size limit on the
|
|
result. This procedure always returns a string, even if no characters
|
|
were read.
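
For instance, a sketch using a string port as the source (the name
@code{parts} is illustrative):

@lisp
(use-modules (ice-9 rdelim))

(define parts
  (call-with-input-string "hello world"
    (lambda (port)
      (let* ((head (read-string port 5))   ; at most 5 characters
             (tail (read-string port))     ; everything that is left
             (none (read-string port)))    ; already at end-of-file
        (list head tail none)))))

parts  ; @result{} ("hello" " world" "")
@end lisp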
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} read-string! buf [port] [start] [end]
|
|
Fill @var{buf} with characters read from @var{port}, defaulting to the
|
|
current input port. Return the number of characters read.
|
|
|
|
If @var{start} or @var{end} are specified, store data only into the
|
|
substring of @var{buf} bounded by @var{start} and @var{end} (which
|
|
default to the beginning and end of the string, respectively).
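
The effect of @var{start} and @var{end} can be sketched like this,
assuming a five-character buffer pre-filled with dots; the names
@code{buf} and @code{counts} are only for illustration.

@lisp
(use-modules (ice-9 rdelim))

(define buf (make-string 5 #\.))

(define counts
  (call-with-input-string "abcdefgh"
    (lambda (port)
      (let* ((first  (read-string! buf port 1 3))  ; fills positions 1 and 2
             (second (read-string! buf port 3)))   ; fills positions 3 and 4
        (list first second)))))

counts  ; @result{} (2 2)
buf     ; @result{} ".abcd"
@end lisp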
|
|
@end deffn
|
|
|
|
Some of the aforementioned I/O functions rely on the following C
|
|
primitives. These will mainly be of interest to people hacking Guile
|
|
internals.
|
|
|
|
@deffn {Scheme Procedure} %read-delimited! delims str gobble [port [start [end]]]
|
|
@deffnx {C Function} scm_read_delimited_x (delims, str, gobble, port, start, end)
|
|
Read characters from @var{port} into @var{str} until one of the
|
|
characters in the @var{delims} string is encountered. If
|
|
@var{gobble} is true, discard the delimiter character;
|
|
otherwise, leave it in the input stream for the next read. If
|
|
@var{port} is not specified, use the value of
|
|
@code{(current-input-port)}. If @var{start} or @var{end} are
|
|
specified, store data only into the substring of @var{str}
|
|
bounded by @var{start} and @var{end} (which default to the
|
|
beginning and end of the string, respectively).
|
|
|
|
Return a pair consisting of the delimiter that terminated the
|
|
string and the number of characters read. If reading stopped
|
|
at the end of file, the delimiter returned is the
|
|
@var{eof-object}; if the string was filled without encountering
|
|
a delimiter, this value is @code{#f}.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} %read-line [port]
|
|
@deffnx {C Function} scm_read_line (port)
|
|
Read a newline-terminated line from @var{port}, allocating storage as
|
|
necessary. The newline terminator (if any) is removed from the string,
|
|
and a pair consisting of the line and its delimiter is returned. The
|
|
delimiter may be either a newline or the @var{eof-object}; if
|
|
@code{%read-line} is called at the end of file, it returns the pair
|
|
@code{(#<eof> . #<eof>)}.
|
|
@end deffn
|
|
|
|
@node Block Reading and Writing
|
|
@subsection Block Reading and Writing
|
|
@cindex Block read/write
|
|
@cindex Port, block read/write
|
|
|
|
The Block-string-I/O module can be accessed with:
|
|
|
|
@lisp
|
|
(use-modules (ice-9 rw))
|
|
@end lisp
|
|
|
|
It currently contains procedures that help to implement the
|
|
@code{(scsh rw)} module in guile-scsh.
|
|
|
|
@deffn {Scheme Procedure} read-string!/partial str [port_or_fdes [start [end]]]
|
|
@deffnx {C Function} scm_read_string_x_partial (str, port_or_fdes, start, end)
|
|
Read characters from a port or file descriptor into a
|
|
string @var{str}. A port must have an underlying file
|
|
descriptor --- a so-called fport. This procedure is
|
|
scsh-compatible and can efficiently read large strings.
|
|
It will:
|
|
|
|
@itemize
|
|
@item
|
|
attempt to fill the entire string, unless the @var{start}
|
|
and/or @var{end} arguments are supplied; that is, @var{start}
|
|
defaults to 0 and @var{end} defaults to
|
|
@code{(string-length str)}
|
|
@item
|
|
use the current input port if @var{port_or_fdes} is not
|
|
supplied.
|
|
@item
|
|
return fewer than the requested number of characters in some
|
|
cases, e.g., on end of file, if interrupted by a signal, or if
|
|
not all the characters are immediately available.
|
|
@item
|
|
wait indefinitely for some input if no characters are
|
|
currently available,
|
|
unless the port is in non-blocking mode.
|
|
@item
|
|
read characters from the port's input buffers if available,
|
|
instead of from the underlying file descriptor.
|
|
@item
|
|
return @code{#f} if end-of-file is encountered before reading
|
|
any characters, otherwise return the number of characters
|
|
read.
|
|
@item
|
|
return 0 if the port is in non-blocking mode and no characters
|
|
are immediately available.
|
|
@item
|
|
return 0 if the request is for 0 bytes, with no
|
|
end-of-file check.
|
|
@end itemize
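
The return conventions listed above might be wrapped as in the
following sketch. @code{read-chunk} is a hypothetical helper, not part
of @code{(ice-9 rw)}, and it assumes @var{port} is a blocking file port.

@lisp
(use-modules (ice-9 rw))

;; Hypothetical helper: return up to SIZE characters from PORT as a
;; fresh string, or #f at end-of-file.
(define (read-chunk port size)
  (let* ((buf (make-string size))
         (n (read-string!/partial buf port)))
    (and n (substring buf 0 n))))
@end lisp

On a non-blocking port the result may be the empty string, which a
caller would need to distinguish from end-of-file.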
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} write-string/partial str [port_or_fdes [start [end]]]
|
|
@deffnx {C Function} scm_write_string_partial (str, port_or_fdes, start, end)
|
|
Write characters from a string @var{str} to a port or file
|
|
descriptor. A port must have an underlying file descriptor
|
|
--- a so-called fport. This procedure is
|
|
scsh-compatible and can efficiently write large strings.
|
|
It will:
|
|
|
|
@itemize
|
|
@item
|
|
attempt to write the entire string, unless the @var{start}
|
|
and/or @var{end} arguments are supplied; that is, @var{start}
|
|
defaults to 0 and @var{end} defaults to
|
|
@code{(string-length str)}
|
|
@item
|
|
use the current output port if @var{port_or_fdes} is not
|
|
supplied.
|
|
@item
|
|
in the case of a buffered port, store the characters in the
|
|
port's output buffer, if all will fit. If they will not fit
|
|
then any existing buffered characters will be flushed
|
|
before attempting
|
|
to write the new characters directly to the underlying file
|
|
descriptor. If the port is in non-blocking mode and
|
|
buffered characters cannot be flushed immediately, then an
@code{EAGAIN} system-error exception will be raised. (Note:
scsh does not support the use of non-blocking buffered ports.)
|
|
@item
|
|
write fewer than the requested number of
|
|
characters in some cases, e.g., if interrupted by a signal or
|
|
if not all of the output can be accepted immediately.
|
|
@item
|
|
wait indefinitely for at least one character
|
|
from @var{str} to be accepted by the port, unless the port is
|
|
in non-blocking mode.
|
|
@item
|
|
return the number of characters accepted by the port.
|
|
@item
|
|
return 0 if the port is in non-blocking mode and cannot accept
at least one character from @var{str} immediately.
|
|
@item
|
|
return 0 immediately if the request size is 0 bytes.
|
|
@end itemize
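
Because a single call may accept only part of the string, callers that
need the whole string written typically loop. A minimal sketch, assuming
a blocking file port; @code{write-string/all} is a hypothetical name.

@lisp
(use-modules (ice-9 rw))

;; Hypothetical helper: call write-string/partial until all of STR
;; has been accepted by PORT.  Assumes PORT is a blocking fport, so
;; that each call accepts at least one character.
(define (write-string/all str port)
  (let loop ((start 0))
    (when (< start (string-length str))
      (loop (+ start (write-string/partial str port start))))))
@end lisp

On a buffered port the characters may still sit in the port's output
buffer afterwards, so the caller is expected to flush or close the port.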
|
|
@end deffn
|
|
|
|
@node Default Ports
|
|
@subsection Default Ports for Input, Output and Errors
|
|
@cindex Default ports
|
|
@cindex Port, default
|
|
|
|
@rnindex current-input-port
|
|
@deffn {Scheme Procedure} current-input-port
|
|
@deffnx {C Function} scm_current_input_port ()
|
|
@cindex standard input
|
|
Return the current input port. This is the default port used
|
|
by many input procedures.
|
|
|
|
Initially this is the @dfn{standard input} in Unix and C terminology.
|
|
When the standard input is a tty the port is unbuffered, otherwise
|
|
it's fully buffered.
|
|
|
|
Unbuffered input is good if an application runs an interactive
|
|
subprocess, since any type-ahead input won't go into Guile's buffer
|
|
and be unavailable to the subprocess.
|
|
|
|
Note that Guile buffering is completely separate from the tty ``line
|
|
discipline''. In the usual cooked mode on a tty Guile only sees a
|
|
line of input once the user presses @key{Return}.
|
|
@end deffn
|
|
|
|
@rnindex current-output-port
|
|
@deffn {Scheme Procedure} current-output-port
|
|
@deffnx {C Function} scm_current_output_port ()
|
|
@cindex standard output
|
|
Return the current output port. This is the default port used
|
|
by many output procedures.
|
|
|
|
Initially this is the @dfn{standard output} in Unix and C terminology.
|
|
When the standard output is a tty this port is unbuffered, otherwise
|
|
it's fully buffered.
|
|
|
|
Unbuffered output to a tty is good for ensuring progress output or a
|
|
prompt is seen. But an application which always prints whole lines
|
|
could change to line buffered, or an application with a lot of output
|
|
could go fully buffered and perhaps make explicit @code{force-output}
|
|
calls (@pxref{Writing}) at selected points.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} current-error-port
|
|
@deffnx {C Function} scm_current_error_port ()
|
|
@cindex standard error output
|
|
Return the port to which errors and warnings should be sent.
|
|
|
|
Initially this is the @dfn{standard error} in Unix and C terminology.
|
|
When the standard error is a tty this port is unbuffered, otherwise
|
|
it's fully buffered.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} set-current-input-port port
|
|
@deffnx {Scheme Procedure} set-current-output-port port
|
|
@deffnx {Scheme Procedure} set-current-error-port port
|
|
@deffnx {C Function} scm_set_current_input_port (port)
|
|
@deffnx {C Function} scm_set_current_output_port (port)
|
|
@deffnx {C Function} scm_set_current_error_port (port)
|
|
Change the ports returned by @code{current-input-port},
|
|
@code{current-output-port} and @code{current-error-port}, respectively,
|
|
so that they use the supplied @var{port} for input or output.
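
As a small sketch, output can be redirected to a string port
(@pxref{String Ports}) and then restored by hand. The variable names are
illustrative.

@lisp
(define saved-output (current-output-port))
(define capture (open-output-string))

(set-current-output-port capture)
(display "captured")                      ; goes to the string port
(set-current-output-port saved-output)    ; restore the previous port

(get-output-string capture)  ; @result{} "captured"
@end lisp

Unlike @code{with-output-to-string}, this form does not restore the
previous port if an error occurs between the two calls.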
|
|
@end deffn
|
|
|
|
@deftypefn {C Function} void scm_dynwind_current_input_port (SCM port)
|
|
@deftypefnx {C Function} void scm_dynwind_current_output_port (SCM port)
|
|
@deftypefnx {C Function} void scm_dynwind_current_error_port (SCM port)
|
|
These functions must be used inside a pair of calls to
|
|
@code{scm_dynwind_begin} and @code{scm_dynwind_end} (@pxref{Dynamic
|
|
Wind}). During the dynwind context, the indicated port is set to
|
|
@var{port}.
|
|
|
|
More precisely, the current port is swapped with a `backup' value
|
|
whenever the dynwind context is entered or left. The backup value is
|
|
initialized with the @var{port} argument.
|
|
@end deftypefn
|
|
|
|
@node Port Types
|
|
@subsection Types of Port
|
|
@cindex Types of ports
|
|
@cindex Port, types
|
|
|
|
@menu
|
|
* File Ports:: Ports on an operating system file.
|
|
* String Ports:: Ports on a Scheme string.
|
|
* Soft Ports:: Ports on arbitrary Scheme procedures.
|
|
* Void Ports:: Ports on nothing at all.
|
|
@end menu
|
|
|
|
|
|
@node File Ports
|
|
@subsubsection File Ports
|
|
@cindex File port
|
|
@cindex Port, file
|
|
|
|
The following procedures are used to open file ports.
|
|
See also @ref{Ports and File Descriptors, open}, for an interface
|
|
to the Unix @code{open} system call.
|
|
|
|
Most systems have limits on how many files can be open, so it's
|
|
strongly recommended that file ports be closed explicitly when no
|
|
longer required (@pxref{Ports}).
|
|
|
|
@deffn {Scheme Procedure} open-file filename mode @
|
|
[#:guess-encoding=#f] [#:encoding=#f]
|
|
@deffnx {C Function} scm_open_file_with_encoding @
|
|
(filename, mode, guess_encoding, encoding)
|
|
@deffnx {C Function} scm_open_file (filename, mode)
|
|
Open the file whose name is @var{filename}, and return a port
|
|
representing that file. The attributes of the port are
|
|
determined by the @var{mode} string. The way in which this is
|
|
interpreted is similar to C stdio. The first character must be
|
|
one of the following:
|
|
|
|
@table @samp
|
|
@item r
|
|
Open an existing file for input.
|
|
@item w
|
|
Open a file for output, creating it if it doesn't already exist
|
|
or removing its contents if it does.
|
|
@item a
|
|
Open a file for output, creating it if it doesn't already
|
|
exist. All writes to the port will go to the end of the file.
|
|
The "append mode" can be turned off while the port is in use
|
|
(@pxref{Ports and File Descriptors, fcntl}).
|
|
@end table
|
|
|
|
The following additional characters can be appended:
|
|
|
|
@table @samp
|
|
@item +
|
|
Open the port for both input and output. E.g., @code{r+}: open
|
|
an existing file for both input and output.
|
|
@item 0
|
|
Create an "unbuffered" port. In this case input and output
|
|
operations are passed directly to the underlying port
|
|
implementation without additional buffering. This is likely to
|
|
slow down I/O operations. The buffering mode can be changed
|
|
while a port is in use (@pxref{Buffering}).
|
|
@item l
|
|
Add line-buffering to the port. The port output buffer will be
|
|
automatically flushed whenever a newline character is written.
|
|
@item b
|
|
Use binary mode, ensuring that each byte in the file will be read as one
|
|
Scheme character.
|
|
|
|
To provide this property, the file will be opened with the 8-bit
|
|
character encoding "ISO-8859-1", ignoring the default port encoding.
|
|
@xref{Ports}, for more information on port encodings.
|
|
|
|
Note that while it is possible to read and write binary data as
|
|
characters or strings, it is usually better to treat bytes as octets,
|
|
and byte sequences as bytevectors. @xref{R6RS Binary Input}, and
|
|
@ref{R6RS Binary Output}, for more.
|
|
|
|
This option had another historical meaning, for DOS compatibility: in
|
|
the default (textual) mode, DOS reads a CR-LF sequence as one LF byte.
|
|
The @code{b} flag prevents this from happening, adding @code{O_BINARY}
|
|
to the underlying @code{open} call. Still, the flag is generally useful
|
|
because of its port encoding ramifications.
|
|
@end table
|
|
|
|
Unless binary mode is requested, the character encoding of the new port
|
|
is determined as follows: First, if @var{guess-encoding} is true, the
|
|
@code{file-encoding} procedure is used to guess the encoding of the file
|
|
(@pxref{Character Encoding of Source Files}). If @var{guess-encoding}
|
|
is false or if @code{file-encoding} fails, @var{encoding} is used unless
|
|
it is also false. As a last resort, the default port encoding is used.
|
|
@xref{Ports}, for more information on port encodings. It is an error to
|
|
pass a non-false @var{guess-encoding} or @var{encoding} if binary mode
|
|
is requested.
|
|
|
|
If a file cannot be opened with the access requested, @code{open-file}
|
|
throws an exception.
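
The following sketch exercises the @samp{w}, @samp{a} and @samp{r}
modes in turn. The file name is a hypothetical scratch path, and
@code{read-string} comes from @code{(ice-9 rdelim)}.

@lisp
(use-modules (ice-9 rdelim))   ; for read-string

(define file "/tmp/guile-open-file-demo.txt")   ; hypothetical path

;; "w" creates the file, or truncates it if it already exists.
(let ((out (open-file file "w" #:encoding "UTF-8")))
  (display "first line\n" out)
  (close-port out))

;; "a" appends to the existing contents.
(let ((out (open-file file "a" #:encoding "UTF-8")))
  (display "second line\n" out)
  (close-port out))

(define contents
  (let* ((in (open-file file "r" #:encoding "UTF-8"))
         (text (read-string in)))
    (close-port in)
    text))

contents  ; @result{} "first line\nsecond line\n"
@end lisp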
|
|
@end deffn
|
|
|
|
@rnindex open-input-file
|
|
@deffn {Scheme Procedure} open-input-file filename @
|
|
[#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
|
|
|
|
Open @var{filename} for input. If @var{binary} is true, open the port
|
|
in binary mode, otherwise use text mode. @var{encoding} and
|
|
@var{guess-encoding} determine the character encoding as described above
|
|
for @code{open-file}. Equivalent to
|
|
@lisp
|
|
(open-file @var{filename}
|
|
(if @var{binary} "rb" "r")
|
|
#:guess-encoding @var{guess-encoding}
|
|
#:encoding @var{encoding})
|
|
@end lisp
|
|
@end deffn
|
|
|
|
@rnindex open-output-file
|
|
@deffn {Scheme Procedure} open-output-file filename @
|
|
[#:encoding=#f] [#:binary=#f]
|
|
|
|
Open @var{filename} for output. If @var{binary} is true, open the port
|
|
in binary mode, otherwise use text mode. @var{encoding} specifies the
|
|
character encoding as described above for @code{open-file}. Equivalent
|
|
to
|
|
@lisp
|
|
(open-file @var{filename}
|
|
(if @var{binary} "wb" "w")
|
|
#:encoding @var{encoding})
|
|
@end lisp
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} call-with-input-file filename proc @
|
|
[#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
|
|
@deffnx {Scheme Procedure} call-with-output-file filename proc @
|
|
[#:encoding=#f] [#:binary=#f]
|
|
@rnindex call-with-input-file
|
|
@rnindex call-with-output-file
|
|
Open @var{filename} for input or output, and call @code{(@var{proc}
|
|
port)} with the resulting port. Return the value returned by
|
|
@var{proc}. @var{filename} is opened as per @code{open-input-file} or
|
|
@code{open-output-file} respectively, and an error is signaled if it
|
|
cannot be opened.
|
|
|
|
When @var{proc} returns, the port is closed. If @var{proc} does not
|
|
return (e.g.@: if it throws an error), then the port might not be
|
|
closed automatically, though it will be garbage collected in the usual
|
|
way if not otherwise referenced.
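
For example, a datum can be written to a file and read back. This is
only a sketch; the file name is a hypothetical scratch path.

@lisp
(define file "/tmp/guile-call-with-file-demo.scm")   ; hypothetical path

(call-with-output-file file
  (lambda (port)
    (write '(a b c) port)))        ; port is closed when the lambda returns

(define datum (call-with-input-file file read))

datum  ; @result{} (a b c)
@end lisp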
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} with-input-from-file filename thunk @
|
|
[#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
|
|
@deffnx {Scheme Procedure} with-output-to-file filename thunk @
|
|
[#:encoding=#f] [#:binary=#f]
|
|
@deffnx {Scheme Procedure} with-error-to-file filename thunk @
|
|
[#:encoding=#f] [#:binary=#f]
|
|
@rnindex with-input-from-file
|
|
@rnindex with-output-to-file
|
|
Open @var{filename} and call @code{(@var{thunk})} with the new port
|
|
set up as, respectively, the @code{current-input-port},
|
|
@code{current-output-port}, or @code{current-error-port}. Return the
|
|
value returned by @var{thunk}. @var{filename} is opened as per
|
|
@code{open-input-file} or @code{open-output-file} respectively, and an
|
|
error is signaled if it cannot be opened.
|
|
|
|
When @var{thunk} returns, the port is closed and the previous setting
|
|
of the respective current port is restored.
|
|
|
|
The current port setting is managed with @code{dynamic-wind}, so the
|
|
previous value is restored no matter how @var{thunk} exits (e.g.@: via an
|
|
exception), and if @var{thunk} is re-entered (via a captured
|
|
continuation) then it's set again to the @var{filename} port.
|
|
|
|
The port is closed when @var{thunk} returns normally, but not when
|
|
exited via an exception or new continuation. This ensures it's still
|
|
ready for use if @var{thunk} is re-entered by a captured continuation.
|
|
Of course the port is always garbage collected and closed in the usual
|
|
way when no longer referenced anywhere.
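
A minimal sketch of the same round trip using the current ports; the
file name is again a hypothetical scratch path.

@lisp
(define file "/tmp/guile-with-file-demo.scm")   ; hypothetical path

(with-output-to-file file
  (lambda ()
    (write "hello")))              ; written to FILE, not the terminal

;; `read' with no arguments reads from the current input port.
(define datum (with-input-from-file file read))

datum  ; @result{} "hello"
@end lisp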
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} port-mode port
|
|
@deffnx {C Function} scm_port_mode (port)
|
|
Return the port modes associated with the open port @var{port}.
|
|
These will not necessarily be identical to the modes used when
|
|
the port was opened, since modes such as "append" which are
|
|
used only during port creation are not retained.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} port-filename port
|
|
@deffnx {C Function} scm_port_filename (port)
|
|
Return the filename associated with @var{port}, or @code{#f} if no
|
|
filename is associated with the port.
|
|
|
|
@var{port} must be open; @code{port-filename} cannot be used once the
|
|
port is closed.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} set-port-filename! port filename
|
|
@deffnx {C Function} scm_set_port_filename_x (port, filename)
|
|
Change the filename associated with @var{port}. Note that this does
not change the port's
|
|
source of data, but only the value that is returned by
|
|
@code{port-filename} and reported in diagnostic output.
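
For instance, a string port has no file name until one is assigned. The
name @code{"demo.scm"} below is arbitrary.

@lisp
(define port (open-input-string "(define x 1)"))

(port-filename port)                   ; @result{} #f
(set-port-filename! port "demo.scm")   ; only changes the reported name
(port-filename port)                   ; @result{} "demo.scm"
@end lisp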
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} file-port? obj
|
|
@deffnx {C Function} scm_file_port_p (obj)
|
|
Determine whether @var{obj} is a port that is related to a file.
|
|
@end deffn
|
|
|
|
|
|
@node String Ports
|
|
@subsubsection String Ports
|
|
@cindex String port
|
|
@cindex Port, string
|
|
|
|
The following allow string ports to be opened by analogy to R4RS
|
|
file port facilities:
|
|
|
|
With string ports, the port encoding is treated differently than for other
|
|
types of ports. When string ports are created, they do not inherit a
|
|
character encoding from the current locale. They are given a
|
|
default encoding that allows them to handle all valid string characters.
|
|
Typically one should not modify a string port's character encoding
|
|
away from its default.
|
|
|
|
@deffn {Scheme Procedure} call-with-output-string proc
|
|
@deffnx {C Function} scm_call_with_output_string (proc)
|
|
Calls the one-argument procedure @var{proc} with a newly created output
|
|
port. When the function returns, the string composed of the characters
|
|
written into the port is returned. @var{proc} should not close the port.
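
A short sketch (the name @code{result} is illustrative):

@lisp
(define result
  (call-with-output-string
    (lambda (port)
      (write '(1 2) port)
      (display " ok" port))))

result  ; @result{} "(1 2) ok"
@end lisp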
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} call-with-input-string string proc
|
|
@deffnx {C Function} scm_call_with_input_string (string, proc)
|
|
Calls the one-argument procedure @var{proc} with a newly
|
|
created input port from which @var{string}'s contents may be
|
|
read. The value yielded by the @var{proc} is returned.
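
For example, passing @code{read} as @var{proc} parses the first datum in
the string (the name @code{form} is illustrative):

@lisp
(define form (call-with-input-string "(+ 1 2) ignored" read))

form  ; @result{} (+ 1 2)
@end lisp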
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} with-output-to-string thunk
|
|
Calls the zero-argument procedure @var{thunk} with the current output
|
|
port set temporarily to a new string port. It returns a string
|
|
composed of the characters written to the current output.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} with-input-from-string string thunk
|
|
Calls the zero-argument procedure @var{thunk} with the current input
|
|
port set temporarily to a string port opened on the specified
|
|
@var{string}. The value yielded by @var{thunk} is returned.
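
Both @code{with-output-to-string} and @code{with-input-from-string} can
be sketched briefly; the variable names are illustrative.

@lisp
(define text
  (with-output-to-string
    (lambda ()
      (display "sum: ")
      (write (+ 1 2)))))

text  ; @result{} "sum: 3"

;; `read' with no arguments reads from the current input port.
(define datum (with-input-from-string "42 43" read))

datum  ; @result{} 42
@end lisp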
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} open-input-string str
|
|
@deffnx {C Function} scm_open_input_string (str)
|
|
Take a string and return an input port that delivers characters
|
|
from the string. The port can be closed by
|
|
@code{close-input-port}, though its storage will be reclaimed
|
|
by the garbage collector if it becomes inaccessible.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} open-output-string
|
|
@deffnx {C Function} scm_open_output_string ()
|
|
Return an output port that will accumulate characters for
|
|
retrieval by @code{get-output-string}. The port can be closed
|
|
by the procedure @code{close-output-port}, though its storage
|
|
will be reclaimed by the garbage collector if it becomes
|
|
inaccessible.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} get-output-string port
|
|
@deffnx {C Function} scm_get_output_string (port)
|
|
Given an output port created by @code{open-output-string},
|
|
return a string consisting of the characters that have been
|
|
output to the port so far.
|
|
|
|
@code{get-output-string} must be used before closing @var{port}; once
|
|
closed the string cannot be obtained.
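
A sketch showing that the accumulated string can be fetched more than
once while the port stays open (variable names are illustrative):

@lisp
(define out (open-output-string))

(write 'hello out)
(define so-far (get-output-string out))

(display " world" out)
(define all (get-output-string out))

(close-port out)

so-far  ; @result{} "hello"
all     ; @result{} "hello world"
@end lisp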
|
|
@end deffn
|
|
|
|
A string port can be used in many procedures which accept a port
|
|
but which are not dependent on implementation details of fports.
|
|
E.g., seeking and truncating will work on a string port,
|
|
but trying to extract the file descriptor number will fail.
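
A rough sketch of that difference, assuming an ASCII-only string so that
positions and characters coincide:

@lisp
(define port (open-input-string "abcdef"))

(seek port 3 SEEK_SET)               ; skip the first three characters
(read-char port)                     ; @result{} #\d

;; A string port has no file descriptor, so `fileno' signals an
;; error; false-if-exception turns that into #f.
(false-if-exception (fileno port))   ; @result{} #f
@end lisp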
|
|
|
|
|
|
@node Soft Ports
|
|
@subsubsection Soft Ports
|
|
@cindex Soft port
|
|
@cindex Port, soft
|
|
|
|
A @dfn{soft-port} is a port based on a vector of procedures capable of
|
|
accepting or delivering characters. It allows emulation of I/O ports.
|
|
|
|
@deffn {Scheme Procedure} make-soft-port pv modes
|
|
@deffnx {C Function} scm_make_soft_port (pv, modes)
|
|
Return a port capable of receiving or delivering characters as
|
|
specified by the @var{modes} string (@pxref{File Ports,
|
|
open-file}). @var{pv} must be a vector of length 5 or 6. Its
|
|
components are as follows:
|
|
|
|
@enumerate 0
|
|
@item
|
|
procedure accepting one character for output
|
|
@item
|
|
procedure accepting a string for output
|
|
@item
|
|
thunk for flushing output
|
|
@item
|
|
thunk for getting one character
|
|
@item
|
|
thunk for closing port (not by garbage collection)
|
|
@item
|
|
(if present and not @code{#f}) thunk for computing the number of
|
|
characters that can be read from the port without blocking.
|
|
@end enumerate
|
|
|
|
For an output-only port only elements 0, 1, 2, and 4 need be
|
|
procedures. For an input-only port only elements 3 and 4 need
|
|
be procedures. Thunks 2 and 4 can instead be @code{#f} if
|
|
there is no useful operation for them to perform.
|
|
|
|
If thunk 3 returns @code{#f} or an @code{eof-object}
|
|
(@pxref{Input, eof-object?, ,r5rs, The Revised^5 Report on
|
|
Scheme}) it indicates that the port has reached end-of-file.
|
|
For example:
|
|
|
|
@lisp
|
|
(define stdout (current-output-port))
|
|
(define p (make-soft-port
|
|
(vector
|
|
(lambda (c) (write c stdout))
|
|
(lambda (s) (display s stdout))
|
|
(lambda () (display "." stdout))
|
|
(lambda () (char-upcase (read-char)))
|
|
(lambda () (display "@@" stdout)))
|
|
"rw"))
|
|
|
|
(write p p) @result{} #<input-output: soft 8081e20>
|
|
@end lisp
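
As a further sketch, an output-only soft port can collect whatever is
written to it. The names @code{chunks} and @code{collector} are
assumptions for the example. Element 3 is @code{#f} because the port is
never read from.

@lisp
(define chunks '())

(define collector
  (make-soft-port
   (vector
    (lambda (c) (set! chunks (cons (string c) chunks)))  ; one character
    (lambda (s) (set! chunks (cons s chunks)))           ; a string
    (lambda () #t)                                       ; flush: nothing to do
    #f                                                   ; no input
    (lambda () #t))                                      ; close: nothing to do
   "w"))

(display "hello" collector)
(write-char #\! collector)
(force-output collector)

(apply string-append (reverse chunks))  ; @result{} "hello!"
@end lisp

How the output is split between the character and string procedures may
vary, which is why the sketch only relies on the concatenated result.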
|
|
@end deffn
|
|
|
|
|
|
@node Void Ports
|
|
@subsubsection Void Ports
|
|
@cindex Void port
|
|
@cindex Port, void
|
|
|
|
This kind of port causes any data to be discarded when written to, and
|
|
always returns the end-of-file object when read from.
|
|
|
|
@deffn {Scheme Procedure} %make-void-port mode
|
|
@deffnx {C Function} scm_sys_make_void_port (mode)
|
|
Create and return a new void port. A void port acts like
|
|
@file{/dev/null}. The @var{mode} argument
|
|
specifies the input/output modes for this port: see the
|
|
documentation for @code{open-file} in @ref{File Ports}.
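
For example (the name @code{sink} is illustrative):

@lisp
(define sink (%make-void-port "rw"))

(write '(any data at all) sink)   ; silently discarded
(read-char sink)                  ; @result{} #<eof>
@end lisp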
|
|
@end deffn
|
|
|
|
|
|
@node R6RS I/O Ports
|
|
@subsection R6RS I/O Ports
|
|
|
|
@cindex R6RS
|
|
@cindex R6RS ports
|
|
|
|
The I/O port API of the @uref{http://www.r6rs.org/, Revised^6 Report on
|
|
the Algorithmic Language Scheme (R6RS)} is provided by the @code{(rnrs
|
|
io ports)} module. It provides features, such as binary I/O and Unicode
|
|
string I/O, that complement or refine Guile's historical port API
|
|
presented above (@pxref{Input and Output}). Note that R6RS ports are not
|
|
disjoint from Guile's native ports, so Guile-specific procedures will
|
|
work on ports created using the R6RS API, and vice versa.
|
|
|
|
The text in this section is taken from the R6RS standard libraries
|
|
document, with only minor adaptations for inclusion in this manual. The
|
|
Guile developers offer their thanks to the R6RS editors for having
|
|
provided the report's text under permissive conditions making this
|
|
possible.
|
|
|
|
@c FIXME: Update description when implemented.
|
|
@emph{Note}: The implementation of this R6RS API is not complete yet.
|
|
|
|
@menu
|
|
* R6RS File Names:: File names.
|
|
* R6RS File Options:: Options for opening files.
|
|
* R6RS Buffer Modes:: Influencing buffering behavior.
|
|
* R6RS Transcoders:: Influencing port encoding.
|
|
* R6RS End-of-File:: The end-of-file object.
|
|
* R6RS Port Manipulation:: Manipulating R6RS ports.
|
|
* R6RS Input Ports:: Input Ports.
|
|
* R6RS Binary Input:: Binary input.
|
|
* R6RS Textual Input:: Textual input.
|
|
* R6RS Output Ports:: Output Ports.
|
|
* R6RS Binary Output:: Binary output.
|
|
* R6RS Textual Output:: Textual output.
|
|
@end menu
|
|
|
|
A subset of the @code{(rnrs io ports)} module, plus one non-standard
|
|
procedure @code{unget-bytevector} (@pxref{R6RS Binary Input}), is
|
|
provided by the @code{(ice-9 binary-ports)} module. It contains binary
|
|
input/output procedures and does not rely on R6RS support.
|
|
|
|
@node R6RS File Names
|
|
@subsubsection File Names
|
|
|
|
Some of the procedures described in this chapter accept a file name as an
|
|
argument. Valid values for such a file name include strings that name a file
|
|
using the native notation of file system paths on an implementation's
|
|
underlying operating system, and may include implementation-dependent
|
|
values as well.
|
|
|
|
A @var{filename} parameter name means that the
|
|
corresponding argument must be a file name.
|
|
|
|
@node R6RS File Options
|
|
@subsubsection File Options
|
|
@cindex file options
|
|
|
|
When opening a file, the various procedures in this library accept a
|
|
@code{file-options} object that encapsulates flags to specify how the
|
|
file is to be opened. A @code{file-options} object is an enum-set
|
|
(@pxref{rnrs enums}) over the symbols constituting valid file options.
|
|
|
|
A @var{file-options} parameter name means that the corresponding
|
|
argument must be a file-options object.
|
|
|
|
@deffn {Scheme Syntax} file-options @var{file-options-symbol} ...
|
|
|
|
Each @var{file-options-symbol} must be a symbol.
|
|
|
|
The @code{file-options} syntax returns a file-options object that
|
|
encapsulates the specified options.
|
|
|
|
When supplied to an operation that opens a file for output, the
|
|
file-options object returned by @code{(file-options)} specifies that the
|
|
file is created if it does not exist and an exception with condition
|
|
type @code{&i/o-file-already-exists} is raised if it does exist. The
|
|
following standard options can be included to modify the default
|
|
behavior.
|
|
|
|
@table @code
|
|
@item no-create
|
|
If the file does not already exist, it is not created;
|
|
instead, an exception with condition type @code{&i/o-file-does-not-exist}
|
|
is raised.
|
|
If the file already exists, the exception with condition type
|
|
@code{&i/o-file-already-exists} is not raised
|
|
and the file is truncated to zero length.
|
|
@item no-fail
|
|
If the file already exists, the exception with condition type
|
|
@code{&i/o-file-already-exists} is not raised,
|
|
even if @code{no-create} is not included,
|
|
and the file is truncated to zero length.
|
|
@item no-truncate
|
|
If the file already exists and the exception with condition type
|
|
@code{&i/o-file-already-exists} has been inhibited by inclusion of
|
|
@code{no-create} or @code{no-fail}, the file is not truncated, but
|
|
the port's current position is still set to the beginning of the
|
|
file.
|
|
@end table
|
|
|
|
These options have no effect when a file is opened only for input.
|
|
Symbols other than those listed above may be used as
|
|
@var{file-options-symbol}s; they have implementation-specific meaning,
|
|
if any.
|
|
|
|
@quotation Note
|
|
Only the name of @var{file-options-symbol} is significant.
|
|
@end quotation
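
A minimal sketch of passing a file-options object when opening a file
for output. It assumes that @code{open-file-output-port},
@code{open-file-input-port}, @code{put-bytevector},
@code{get-bytevector-all} and @code{call-with-port} from @code{(rnrs io
ports)} are available in the Guile in use; the file name is a
hypothetical scratch path.

@lisp
(use-modules (rnrs io ports))

(define file "/tmp/guile-file-options-demo.bin")   ; hypothetical path

;; no-fail: do not raise if FILE already exists; truncate it instead.
(let ((out (open-file-output-port file (file-options no-fail))))
  (put-bytevector out #vu8(1 2 3))
  (close-port out))

(define bytes
  (call-with-port (open-file-input-port file) get-bytevector-all))

bytes  ; @result{} #vu8(1 2 3)
@end lisp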
|
|
@end deffn
|
|
|
|
@node R6RS Buffer Modes
|
|
@subsubsection Buffer Modes
|
|
|
|
Each port has an associated buffer mode. For an output port, the
|
|
buffer mode defines when an output operation flushes the buffer
|
|
associated with the output port. For an input port, the buffer mode
|
|
defines how much data will be read to satisfy read operations. The
|
|
possible buffer modes are the symbols @code{none} for no buffering,
|
|
@code{line} for flushing upon line endings and reading up to line
|
|
endings, or other implementation-dependent behavior,
|
|
and @code{block} for arbitrary buffering. This section uses
|
|
the parameter name @var{buffer-mode} for arguments that must be
|
|
buffer-mode symbols.
|
|
|
|
If two ports are connected to the same mutable source, both ports
|
|
are unbuffered, and reading a byte or character from that shared
|
|
source via one of the two ports would change the bytes or characters
|
|
seen via the other port, a lookahead operation on one port will
|
|
render the peeked byte or character inaccessible via the other port,
|
|
while a subsequent read operation on the peeked port will see the
|
|
peeked byte or character even though the port is otherwise unbuffered.
|
|
|
|
In other words, the semantics of buffering is defined in terms of side
|
|
effects on shared mutable sources, and a lookahead operation has the
|
|
same side effect on the shared source as a read operation.
|
|
|
|
@deffn {Scheme Syntax} buffer-mode @var{buffer-mode-symbol}
|
|
|
|
@var{buffer-mode-symbol} must be a symbol whose name is one of
|
|
@code{none}, @code{line}, and @code{block}. The result is the
|
|
corresponding symbol, and specifies the associated buffer mode.
|
|
|
|
@quotation Note
|
|
Only the name of @var{buffer-mode-symbol} is significant.
|
|
@end quotation
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} buffer-mode? obj
|
|
Returns @code{#t} if the argument is a valid buffer-mode symbol, and
|
|
returns @code{#f} otherwise.
|
|
@end deffn
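
As a brief illustration, and assuming the @code{(rnrs io ports)} module
has been loaded, buffer-mode symbols can be produced and checked as
sketched below. The results shown are those expected from the
descriptions above.

@lisp
(use-modules (rnrs io ports))

(buffer-mode block)        ; @result{} block
(buffer-mode? 'line)       ; @result{} #t
(buffer-mode? 'unbuffered) ; @result{} #f, not a buffer-mode symbol
@end lisp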
|
|
|
|
@node R6RS Transcoders
|
|
@subsubsection Transcoders
|
|
@cindex codec
|
|
@cindex end-of-line style
|
|
@cindex transcoder
|
|
@cindex binary port
|
|
@cindex textual port
|
|
|
|
Several different Unicode encoding schemes describe standard ways to
|
|
encode characters and strings as byte sequences and to decode those
|
|
sequences. Within this document, a @dfn{codec} is an immutable Scheme
|
|
object that represents a Unicode or similar encoding scheme.
|
|
|
|
An @dfn{end-of-line style} is a symbol that, if it is not @code{none},
|
|
describes how a textual port transcodes representations of line endings.
|
|
|
|
A @dfn{transcoder} is an immutable Scheme object that combines a codec
|
|
with an end-of-line style and a method for handling decoding errors.
|
|
Each transcoder represents some specific bidirectional (but not
|
|
necessarily lossless), possibly stateful translation between byte
|
|
sequences and Unicode characters and strings. Every transcoder can
|
|
operate in the input direction (bytes to characters) or in the output
|
|
direction (characters to bytes). A @var{transcoder} parameter name
|
|
means that the corresponding argument must be a transcoder.
|
|
|
|
A @dfn{binary port} is a port that supports binary I/O, does not have an
|
|
associated transcoder and does not support textual I/O. A @dfn{textual
|
|
port} is a port that supports textual I/O, and does not support binary
|
|
I/O. A textual port may or may not have an associated transcoder.
|
|
|
|
@deffn {Scheme Procedure} latin-1-codec
|
|
@deffnx {Scheme Procedure} utf-8-codec
|
|
@deffnx {Scheme Procedure} utf-16-codec
|
|
|
|
These are predefined codecs for the ISO 8859-1, UTF-8, and UTF-16
|
|
encoding schemes.
|
|
|
|
A call to any of these procedures returns a value that is equal in the
|
|
sense of @code{eqv?} to the result of any other call to the same
|
|
procedure.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Syntax} eol-style @var{eol-style-symbol}
|
|
|
|
@var{eol-style-symbol} should be a symbol whose name is one of
|
|
@code{lf}, @code{cr}, @code{crlf}, @code{nel}, @code{crnel}, @code{ls},
|
|
and @code{none}.
|
|
|
|
The form evaluates to the corresponding symbol. If the name of
|
|
@var{eol-style-symbol} is not one of these symbols, the effect and
|
|
result are implementation-dependent; in particular, the result may be an
|
|
eol-style symbol acceptable as an @var{eol-style} argument to
|
|
@code{make-transcoder}. Otherwise, an exception is raised.
|
|
|
|
All eol-style symbols except @code{none} describe a specific
|
|
line-ending encoding:
|
|
|
|
@table @code
|
|
@item lf
|
|
linefeed
|
|
@item cr
|
|
carriage return
|
|
@item crlf
|
|
carriage return, linefeed
|
|
@item nel
|
|
next line
|
|
@item crnel
|
|
carriage return, next line
|
|
@item ls
|
|
line separator
|
|
@end table
|
|
|
|
For a textual port with a transcoder, and whose transcoder has an
|
|
eol-style symbol @code{none}, no conversion occurs. For a textual input
|
|
port, any eol-style symbol other than @code{none} means that all of the
|
|
above line-ending encodings are recognized and are translated into a
|
|
single linefeed. For a textual output port, @code{none} and @code{lf}
|
|
are equivalent. Linefeed characters are encoded according to the
|
|
specified eol-style symbol, and all other characters that participate in
|
|
possible line endings are encoded as is.
|
|
|
|
@quotation Note
|
|
Only the name of @var{eol-style-symbol} is significant.
|
|
@end quotation
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} native-eol-style
|
|
Returns the default end-of-line style of the underlying platform, e.g.,
|
|
@code{lf} on Unix and @code{crlf} on Windows.
|
|
@end deffn
|
|
|
|
@deffn {Condition Type} &i/o-decoding
|
|
@deffnx {Scheme Procedure} make-i/o-decoding-error port
|
|
@deffnx {Scheme Procedure} i/o-decoding-error? obj
|
|
|
|
This condition type could be defined by
|
|
|
|
@lisp
|
|
(define-condition-type &i/o-decoding &i/o-port
  make-i/o-decoding-error i/o-decoding-error?)
|
|
@end lisp
|
|
|
|
An exception with this type is raised when one of the operations for
|
|
textual input from a port encounters a sequence of bytes that cannot be
|
|
translated into a character or string by the input direction of the
|
|
port's transcoder.
|
|
|
|
When such an exception is raised, the port's position is past the
|
|
invalid encoding.
|
|
@end deffn
|
|
|
|
@deffn {Condition Type} &i/o-encoding
|
|
@deffnx {Scheme Procedure} make-i/o-encoding-error port char
|
|
@deffnx {Scheme Procedure} i/o-encoding-error? obj
|
|
@deffnx {Scheme Procedure} i/o-encoding-error-char condition
|
|
|
|
This condition type could be defined by
|
|
|
|
@lisp
|
|
(define-condition-type &i/o-encoding &i/o-port
  make-i/o-encoding-error i/o-encoding-error?
  (char i/o-encoding-error-char))
|
|
@end lisp
|
|
|
|
An exception with this type is raised when one of the operations for
|
|
textual output to a port encounters a character that cannot be
|
|
translated into bytes by the output direction of the port's transcoder.
|
|
@var{char} is the character that could not be encoded.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Syntax} error-handling-mode @var{error-handling-mode-symbol}
|
|
|
|
@var{error-handling-mode-symbol} should be a symbol whose name is one of
|
|
@code{ignore}, @code{raise}, and @code{replace}. The form evaluates to
|
|
the corresponding symbol. If @var{error-handling-mode-symbol} is not
|
|
one of these identifiers, effect and result are
|
|
implementation-dependent: The result may be an error-handling-mode
|
|
symbol acceptable as a @var{handling-mode} argument to
|
|
@code{make-transcoder}. If it is not acceptable as a
|
|
@var{handling-mode} argument to @code{make-transcoder}, an exception is
|
|
raised.
|
|
|
|
@quotation Note
|
|
Only the name of @var{error-handling-mode-symbol} is significant.
|
|
@end quotation
|
|
|
|
The error-handling mode of a transcoder specifies the behavior
|
|
of textual I/O operations in the presence of encoding or decoding
|
|
errors.
|
|
|
|
If a textual input operation encounters an invalid or incomplete
|
|
character encoding, and the error-handling mode is @code{ignore}, an
|
|
appropriate number of bytes of the invalid encoding are ignored and
|
|
decoding continues with the following bytes.
|
|
|
|
If the error-handling mode is @code{replace}, the replacement
|
|
character U+FFFD is injected into the data stream, an appropriate
|
|
number of bytes are ignored, and decoding
|
|
continues with the following bytes.
|
|
|
|
If the error-handling mode is @code{raise}, an exception with condition
|
|
type @code{&i/o-decoding} is raised.
|
|
|
|
If a textual output operation encounters a character it cannot encode,
|
|
and the error-handling mode is @code{ignore}, the character is ignored
|
|
and encoding continues with the next character. If the error-handling
|
|
mode is @code{replace}, a codec-specific replacement character is
|
|
emitted by the transcoder, and encoding continues with the next
|
|
character. The replacement character is U+FFFD for transcoders whose
|
|
codec is one of the Unicode encodings, but is the @code{?} character
|
|
for the Latin-1 encoding. If the error-handling mode is @code{raise},
|
|
an exception with condition type @code{&i/o-encoding} is raised.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} make-transcoder codec
|
|
@deffnx {Scheme Procedure} make-transcoder codec eol-style
|
|
@deffnx {Scheme Procedure} make-transcoder codec eol-style handling-mode
|
|
|
|
@var{codec} must be a codec; @var{eol-style}, if present, an eol-style
|
|
symbol; and @var{handling-mode}, if present, an error-handling-mode
|
|
symbol.
|
|
|
|
@var{eol-style} may be omitted, in which case it defaults to the native
|
|
end-of-line style of the underlying platform. @var{handling-mode} may
|
|
be omitted, in which case it defaults to @code{replace}. The result is
|
|
a transcoder with the behavior specified by its arguments.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} native-transcoder
|
|
Returns an implementation-dependent transcoder that represents a
|
|
possibly locale-dependent ``native'' transcoding.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} transcoder-codec transcoder
|
|
@deffnx {Scheme Procedure} transcoder-eol-style transcoder
|
|
@deffnx {Scheme Procedure} transcoder-error-handling-mode transcoder
|
|
|
|
These are accessors for transcoder objects; when applied to a
|
|
transcoder returned by @code{make-transcoder}, they return the
|
|
@var{codec}, @var{eol-style}, and @var{handling-mode} arguments,
|
|
respectively.
|
|
@end deffn
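
The following sketch shows how a transcoder might be built and
inspected. The variable name @code{utf8/lf} is only illustrative; the
last result assumes the default error-handling mode documented for
@code{make-transcoder} above.

@lisp
(use-modules (rnrs io ports))

;; Hypothetical name: UTF-8 codec, linefeed line endings, and
;; exceptions on encoding or decoding errors.
(define utf8/lf
  (make-transcoder (utf-8-codec)
                   (eol-style lf)
                   (error-handling-mode raise)))

(transcoder-eol-style utf8/lf)           ; @result{} lf
(transcoder-error-handling-mode utf8/lf) ; @result{} raise

;; Omitted arguments fall back to the documented defaults.
(transcoder-error-handling-mode
 (make-transcoder (latin-1-codec)))      ; @result{} replace
@end lisp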
|
|
|
|
@deffn {Scheme Procedure} bytevector->string bytevector transcoder
|
|
|
|
Returns the string that results from transcoding the
|
|
@var{bytevector} according to the input direction of the transcoder.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} string->bytevector string transcoder
|
|
|
|
Returns the bytevector that results from transcoding the
|
|
@var{string} according to the output direction of the transcoder.
|
|
@end deffn
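
For instance, a round trip through a UTF-8 transcoder might look as
follows. This is a minimal sketch; the name @code{transcoder} is chosen
for the example only, and the string contains no line endings so that
the end-of-line style plays no role.

@lisp
(use-modules (rnrs io ports))

;; Illustrative transcoder: UTF-8 with linefeed line endings.
(define transcoder (make-transcoder (utf-8-codec) (eol-style lf)))

(string->bytevector "hello" transcoder)
;; @result{} #vu8(104 101 108 108 111)

(bytevector->string #vu8(104 101 108 108 111) transcoder)
;; @result{} "hello"
@end lisp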
|
|
|
|
@node R6RS End-of-File
|
|
@subsubsection The End-of-File Object
|
|
|
|
@cindex EOF
|
|
@cindex end-of-file
|
|
|
|
R5RS' @code{eof-object?} procedure is provided by the @code{(rnrs io
|
|
ports)} module:
|
|
|
|
@deffn {Scheme Procedure} eof-object? obj
|
|
@deffnx {C Function} scm_eof_object_p (obj)
|
|
Return true if @var{obj} is the end-of-file (EOF) object.
|
|
@end deffn
|
|
|
|
In addition, the following procedure is provided:
|
|
|
|
@deffn {Scheme Procedure} eof-object
|
|
@deffnx {C Function} scm_eof_object ()
|
|
Return the end-of-file (EOF) object.
|
|
|
|
@lisp
|
|
(eof-object? (eof-object))
|
|
@result{} #t
|
|
@end lisp
|
|
@end deffn
|
|
|
|
|
|
@node R6RS Port Manipulation
|
|
@subsubsection Port Manipulation
|
|
|
|
The procedures listed below operate on any kind of R6RS I/O port.
|
|
|
|
@deffn {Scheme Procedure} port? obj
|
|
Returns @code{#t} if the argument is a port, and returns @code{#f}
|
|
otherwise.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} port-transcoder port
|
|
Returns the transcoder associated with @var{port} if @var{port} is
|
|
textual and has an associated transcoder, and returns @code{#f} if
|
|
@var{port} is binary or does not have an associated transcoder.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} binary-port? port
|
|
Return @code{#t} if @var{port} is a @dfn{binary port}, suitable for
|
|
binary data input/output.
|
|
|
|
Note that internally Guile does not differentiate between binary and
|
|
textual ports, unlike the R6RS. Thus, this procedure returns true when
|
|
@var{port} does not have an associated encoding---i.e., when
|
|
@code{(port-encoding @var{port})} is @code{#f} (@pxref{Ports,
|
|
port-encoding}). This is the case for ports returned by R6RS procedures
|
|
such as @code{open-bytevector-input-port} and
|
|
@code{make-custom-binary-output-port}.
|
|
|
|
However, Guile currently does not prevent use of textual I/O procedures
|
|
such as @code{display} or @code{read-char} with binary ports. Doing so
|
|
``upgrades'' the port from binary to textual, under the ISO-8859-1
|
|
encoding. Likewise, Guile does not prevent use of
|
|
@code{set-port-encoding!} on a binary port, which also turns it into a
|
|
``textual'' port.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} textual-port? port
|
|
Always return @code{#t}, as all ports can be used for textual I/O in
|
|
Guile.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} transcoded-port binary-port transcoder
|
|
The @code{transcoded-port} procedure
|
|
returns a new textual port with the specified @var{transcoder}.
|
|
Otherwise the new textual port's state is largely the same as
|
|
that of @var{binary-port}.
|
|
If @var{binary-port} is an input port, the new textual
|
|
port will be an input port and
|
|
will transcode the bytes that have not yet been read from
|
|
@var{binary-port}.
|
|
If @var{binary-port} is an output port, the new textual
|
|
port will be an output port and
|
|
will transcode output characters into bytes that are
|
|
written to the byte sink represented by @var{binary-port}.
|
|
|
|
As a side effect, however, @code{transcoded-port}
|
|
closes @var{binary-port} in
|
|
a special way that allows the new textual port to continue to
|
|
use the byte source or sink represented by @var{binary-port},
|
|
even though @var{binary-port} itself is closed and cannot
|
|
be used by the input and output operations described in this
|
|
chapter.
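
A minimal sketch of this, using a bytevector port as the byte source
(the names @code{binary} and @code{textual} are illustrative):

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(define binary (open-bytevector-input-port (string->utf8 "hello")))
(define textual
  (transcoded-port binary (make-transcoder (utf-8-codec))))

;; From here on, only the textual port should be used.
(get-string-all textual) ; @result{} "hello"
@end lisp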
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} port-position port
|
|
If @var{port} supports it (see below), return the offset (an integer)
|
|
indicating where the next octet will be read from/written to in
|
|
@var{port}. If @var{port} does not support this operation, an error
|
|
condition is raised.
|
|
|
|
This is similar to Guile's @code{seek} procedure with the
|
|
@code{SEEK_CUR} argument (@pxref{Random Access}).
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} port-has-port-position? port
|
|
Return @code{#t} if @var{port} supports @code{port-position}.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} set-port-position! port offset
|
|
If @var{port} supports it (see below), set the position where the next
|
|
octet will be read from/written to @var{port} to @var{offset} (an
|
|
integer). If @var{port} does not support this operation, an error
|
|
condition is raised.
|
|
|
|
This is similar to Guile's @code{seek} procedure with the
|
|
@code{SEEK_SET} argument (@pxref{Random Access}).
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} port-has-set-port-position!? port
|
|
Return @code{#t} if @var{port} supports @code{set-port-position!}.
|
|
@end deffn
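
The position procedures above can be tried on a bytevector input port,
which is assumed here to support both operations. This is a sketch
rather than a definitive recipe:

@lisp
(use-modules (rnrs io ports))

(define port (open-bytevector-input-port #vu8(10 20 30 40)))

(port-has-port-position? port) ; @result{} #t
(get-u8 port)                  ; @result{} 10
(port-position port)           ; @result{} 1
(set-port-position! port 3)    ; jump to the last octet
(get-u8 port)                  ; @result{} 40
@end lisp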
|
|
|
|
@deffn {Scheme Procedure} call-with-port port proc
|
|
Call @var{proc}, passing it @var{port} and closing @var{port} upon exit
|
|
of @var{proc}. Return the return values of @var{proc}.
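
For example, the following sketch reads a whole bytevector port and lets
@code{call-with-port} close it afterwards:

@lisp
(use-modules (rnrs io ports))

(call-with-port (open-bytevector-input-port #vu8(1 2 3))
  (lambda (port)
    (get-bytevector-all port)))
;; @result{} #vu8(1 2 3)
@end lisp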
|
|
@end deffn
|
|
|
|
@node R6RS Input Ports
|
|
@subsubsection Input Ports
|
|
|
|
@deffn {Scheme Procedure} input-port? obj
|
|
Returns @code{#t} if the argument is an input port (or a combined input
|
|
and output port), and returns @code{#f} otherwise.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} port-eof? input-port
|
|
Returns @code{#t}
|
|
if the @code{lookahead-u8} procedure (if @var{input-port} is a binary port)
|
|
or the @code{lookahead-char} procedure (if @var{input-port} is a textual port)
|
|
would return
|
|
the end-of-file object, and @code{#f} otherwise.
|
|
The operation may block indefinitely if no data is available
|
|
but the port cannot be determined to be at end of file.
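
A small sketch with a one-octet bytevector port, which never blocks:

@lisp
(use-modules (rnrs io ports))

(define port (open-bytevector-input-port #vu8(1)))

(port-eof? port) ; @result{} #f, one octet is still pending
(get-u8 port)    ; @result{} 1
(port-eof? port) ; @result{} #t
@end lisp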
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} open-file-input-port filename
|
|
@deffnx {Scheme Procedure} open-file-input-port filename file-options
|
|
@deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode
|
|
@deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode maybe-transcoder
|
|
@var{maybe-transcoder} must be either a transcoder or @code{#f}.
|
|
|
|
The @code{open-file-input-port} procedure returns an
|
|
input port for the named file. The @var{file-options} and
|
|
@var{maybe-transcoder} arguments are optional.
|
|
|
|
The @var{file-options} argument, which may determine
|
|
various aspects of the returned port (@pxref{R6RS File Options}),
|
|
defaults to the value of @code{(file-options)}.
|
|
|
|
The @var{buffer-mode} argument, if supplied,
|
|
must be one of the symbols that name a buffer mode.
|
|
The @var{buffer-mode} argument defaults to @code{block}.
|
|
|
|
If @var{maybe-transcoder} is a transcoder, it becomes the transcoder associated
|
|
with the returned port.
|
|
|
|
If @var{maybe-transcoder} is @code{#f} or absent,
|
|
the port will be a binary port and will support the
|
|
@code{port-position} and @code{set-port-position!} operations.
|
|
Otherwise the port will be a textual port, and whether it supports
|
|
the @code{port-position} and @code{set-port-position!} operations
|
|
is implementation-dependent (and possibly transcoder-dependent).
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} standard-input-port
|
|
Returns a fresh binary input port connected to standard input. Whether
|
|
the port supports the @code{port-position} and @code{set-port-position!}
|
|
operations is implementation-dependent.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} current-input-port
|
|
This returns a default textual port for input. Normally, this default
|
|
port is associated with standard input, but can be dynamically
|
|
re-assigned using the @code{with-input-from-file} procedure from the
|
|
@code{io simple (6)} library (@pxref{rnrs io simple}). The port may or
|
|
may not have an associated transcoder; if it does, the transcoder is
|
|
implementation-dependent.
|
|
@end deffn
|
|
|
|
@node R6RS Binary Input
|
|
@subsubsection Binary Input
|
|
|
|
@cindex binary input
|
|
|
|
R6RS binary input ports can be created with the procedures described
|
|
below.
|
|
|
|
@deffn {Scheme Procedure} open-bytevector-input-port bv [transcoder]
|
|
@deffnx {C Function} scm_open_bytevector_input_port (bv, transcoder)
|
|
Return an input port whose contents are drawn from bytevector @var{bv}
|
|
(@pxref{Bytevectors}).
|
|
|
|
@c FIXME: Update description when implemented.
|
|
The @var{transcoder} argument is currently not supported.
|
|
@end deffn
|
|
|
|
@cindex custom binary input ports
|
|
|
|
@deffn {Scheme Procedure} make-custom-binary-input-port id read! get-position set-position! close
|
|
@deffnx {C Function} scm_make_custom_binary_input_port (id, read!, get-position, set-position!, close)
|
|
Return a new custom binary input port@footnote{This is similar in spirit
|
|
to Guile's @dfn{soft ports} (@pxref{Soft Ports}).} named @var{id} (a
|
|
string) whose input is drained by invoking @var{read!} and passing it a
|
|
bytevector, an index where bytes should be written, and the number of
|
|
bytes to read. The @code{read!} procedure must return an integer
|
|
indicating the number of bytes read, or @code{0} to indicate the
|
|
end-of-file.
|
|
|
|
Optionally, if @var{get-position} is not @code{#f}, it must be a thunk
|
|
that will be called when @code{port-position} is invoked on the custom
|
|
binary port and should return an integer indicating the position within
|
|
the underlying data stream; if @var{get-position} was not supplied, the
|
|
returned port does not support @code{port-position}.
|
|
|
|
Likewise, if @var{set-position!} is not @code{#f}, it should be a
|
|
one-argument procedure. When @code{set-port-position!} is invoked on the
|
|
custom binary input port, @var{set-position!} is passed an integer
|
|
indicating the position of the next byte to read.
|
|
|
|
Finally, if @var{close} is not @code{#f}, it must be a thunk. It is
|
|
invoked when the custom binary input port is closed.
|
|
|
|
The returned port is fully buffered by default, but its buffering mode
|
|
can be changed using @code{setvbuf} (@pxref{Buffering}).
|
|
|
|
Using a custom binary input port, the @code{open-bytevector-input-port}
|
|
procedure could be implemented as follows:
|
|
|
|
@lisp
|
|
(define (open-bytevector-input-port source)
  (define position 0)
  (define length (bytevector-length source))

  ;; Copy at most COUNT bytes from SOURCE into BV, starting at START.
  (define (read! bv start count)
    (let ((count (min count (- length position))))
      (bytevector-copy! source position
                        bv start count)
      (set! position (+ position count))
      count))

  (define (get-position) position)

  (define (set-position! new-position)
    (set! position new-position))

  ;; The final #f means that no close procedure is supplied.
  (make-custom-binary-input-port "the port" read!
                                 get-position
                                 set-position!
                                 #f))
|
|
|
|
(read (open-bytevector-input-port (string->utf8 "hello")))
|
|
@result{} hello
|
|
@end lisp
|
|
@end deffn
|
|
|
|
@cindex binary input
|
|
Binary input is achieved using the procedures below:
|
|
|
|
@deffn {Scheme Procedure} get-u8 port
|
|
@deffnx {C Function} scm_get_u8 (port)
|
|
Return an octet read from @var{port}, a binary input port, blocking as
|
|
necessary, or the end-of-file object.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} lookahead-u8 port
|
|
@deffnx {C Function} scm_lookahead_u8 (port)
|
|
Like @code{get-u8} but does not update @var{port}'s position to point
|
|
past the octet.
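
The difference from @code{get-u8} can be sketched as follows:

@lisp
(use-modules (rnrs io ports))

(define port (open-bytevector-input-port #vu8(7 8)))

(lookahead-u8 port)            ; @result{} 7, position unchanged
(get-u8 port)                  ; @result{} 7
(get-u8 port)                  ; @result{} 8
(eof-object? (get-u8 port))    ; @result{} #t
@end lisp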
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} get-bytevector-n port count
|
|
@deffnx {C Function} scm_get_bytevector_n (port, count)
|
|
Read @var{count} octets from @var{port}, blocking as necessary, and
|
|
return a bytevector containing the octets read. If fewer bytes are
|
|
available, a bytevector smaller than @var{count} is returned.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} get-bytevector-n! port bv start count
|
|
@deffnx {C Function} scm_get_bytevector_n_x (port, bv, start, count)
|
|
Read @var{count} bytes from @var{port} and store them in @var{bv}
|
|
starting at index @var{start}. Return either the number of bytes
|
|
actually read or the end-of-file object.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} get-bytevector-some port
|
|
@deffnx {C Function} scm_get_bytevector_some (port)
|
|
Read from @var{port}, blocking as necessary, until bytes are available
|
|
or an end-of-file is reached. Return either the end-of-file object or a
|
|
new bytevector containing some of the available bytes (at least one),
|
|
and update the port position to point just past these bytes.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} get-bytevector-all port
|
|
@deffnx {C Function} scm_get_bytevector_all (port)
|
|
Read from @var{port}, blocking as necessary, until the end-of-file is
|
|
reached. Return either a new bytevector containing the data read or the
|
|
end-of-file object (if no data were available).
|
|
@end deffn
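
The bulk-reading procedures above can be combined as in the following
sketch, which consumes a five-octet bytevector port in three steps (the
names @code{port} and @code{bv} are illustrative):

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(define port (open-bytevector-input-port #vu8(1 2 3 4 5)))
(define bv (make-bytevector 4 0))

(get-bytevector-n port 2)        ; @result{} #vu8(1 2)
(get-bytevector-n! port bv 1 2)  ; @result{} 2
bv                               ; @result{} #vu8(0 3 4 0)
(get-bytevector-all port)        ; @result{} #vu8(5)
(eof-object? (get-bytevector-all port)) ; @result{} #t
@end lisp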
|
|
|
|
The @code{(ice-9 binary-ports)} module provides the following procedure
|
|
as an extension to @code{(rnrs io ports)}:
|
|
|
|
@deffn {Scheme Procedure} unget-bytevector port bv [start [count]]
|
|
@deffnx {C Function} scm_unget_bytevector (port, bv, start, count)
|
|
Place the contents of @var{bv} in @var{port}, optionally starting at
|
|
index @var{start} and limiting to @var{count} octets, so that its bytes
|
|
will be read from left-to-right as the next bytes from @var{port} during
|
|
subsequent read operations. If called multiple times, the unread bytes
|
|
will be read again in last-in first-out order.
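
For example, pushing two octets back in front of the remaining input
might look like this (a sketch, assuming a bytevector input port):

@lisp
(use-modules (rnrs io ports) (ice-9 binary-ports))

(define port (open-bytevector-input-port #vu8(3 4)))

(unget-bytevector port #vu8(1 2))
(get-bytevector-all port) ; @result{} #vu8(1 2 3 4)
@end lisp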
|
|
@end deffn
|
|
|
|
@node R6RS Textual Input
|
|
@subsubsection Textual Input
|
|
|
|
@deffn {Scheme Procedure} get-char textual-input-port
|
|
Reads from @var{textual-input-port}, blocking as necessary, until a
|
|
complete character is available from @var{textual-input-port},
|
|
or until an end of file is reached.
|
|
|
|
If a complete character is available before the next end of file,
|
|
@code{get-char} returns that character and updates the input port to
|
|
point past the character. If an end of file is reached before any
|
|
character is read, @code{get-char} returns the end-of-file object.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} lookahead-char textual-input-port
|
|
The @code{lookahead-char} procedure is like @code{get-char}, but it does
|
|
not update @var{textual-input-port} to point past the character.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} get-string-n textual-input-port count
|
|
|
|
@var{count} must be an exact, non-negative integer object, representing
|
|
the number of characters to be read.
|
|
|
|
The @code{get-string-n} procedure reads from @var{textual-input-port},
|
|
blocking as necessary, until @var{count} characters are available, or
|
|
until an end of file is reached.
|
|
|
|
If @var{count} characters are available before end of file,
|
|
@code{get-string-n} returns a string consisting of those @var{count}
|
|
characters. If fewer characters are available before an end of file, but
|
|
one or more characters can be read, @code{get-string-n} returns a string
|
|
containing those characters. In either case, the input port is updated
|
|
to point just past the characters read. If no characters can be read
|
|
before an end of file, the end-of-file object is returned.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} get-string-n! textual-input-port string start count
|
|
|
|
@var{start} and @var{count} must be exact, non-negative integer objects,
|
|
with @var{count} representing the number of characters to be read.
|
|
@var{string} must be a string with at least @math{@var{start} + @var{count}}
|
|
characters.
|
|
|
|
The @code{get-string-n!} procedure reads from @var{textual-input-port}
|
|
in the same manner as @code{get-string-n}. If @var{count} characters
|
|
are available before an end of file, they are written into @var{string}
|
|
starting at index @var{start}, and @var{count} is returned. If fewer
|
|
characters are available before an end of file, but one or more can be
|
|
read, those characters are written into @var{string} starting at index
|
|
@var{start} and the number of characters actually read is returned as an
|
|
exact integer object. If no characters can be read before an end of
|
|
file, the end-of-file object is returned.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} get-string-all textual-input-port
|
|
Reads from @var{textual-input-port} until an end of file, decoding
|
|
characters in the same manner as @code{get-string-n} and
|
|
@code{get-string-n!}.
|
|
|
|
If characters are available before the end of file, a string containing
|
|
all the characters decoded from that data is returned. If no character
|
|
precedes the end of file, the end-of-file object is returned.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} get-line textual-input-port
|
|
Reads from @var{textual-input-port} up to and including the linefeed
|
|
character or end of file, decoding characters in the same manner as
|
|
@code{get-string-n} and @code{get-string-n!}.
|
|
|
|
If a linefeed character is read, a string containing all of the text up
|
|
to (but not including) the linefeed character is returned, and the port
|
|
is updated to point just past the linefeed character. If an end of file
|
|
is encountered before any linefeed character is read, but some
|
|
characters have been read and decoded as characters, a string containing
|
|
those characters is returned. If an end of file is encountered before
|
|
any characters are read, the end-of-file object is returned.
|
|
|
|
@quotation Note
|
|
The end-of-line style, if not @code{none}, will cause all line endings
|
|
to be read as linefeed characters. @xref{R6RS Transcoders}.
|
|
@end quotation
|
|
@end deffn
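
The character and string readers above can be sketched on a string
port. The example uses Guile's @code{open-input-string} as a convenient
textual input port; this choice is an assumption of the sketch, not a
requirement of the procedures.

@lisp
(use-modules (rnrs io ports))

(define port (open-input-string "ab\ncd"))

(lookahead-char port)          ; @result{} #\a, position unchanged
(get-char port)                ; @result{} #\a
(get-line port)                ; @result{} "b"
(get-string-n port 5)          ; @result{} "cd", fewer than 5 available
(eof-object? (get-char port))  ; @result{} #t
@end lisp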
|
|
|
|
@deffn {Scheme Procedure} get-datum textual-input-port
|
|
Reads an external representation from @var{textual-input-port} and returns the
|
|
datum it represents. The @code{get-datum} procedure returns the next
|
|
datum that can be parsed from the given @var{textual-input-port}, updating
|
|
@var{textual-input-port} to point exactly past the end of the external
|
|
representation of the object.
|
|
|
|
Any @emph{interlexeme space} (comment or whitespace, @pxref{Scheme
|
|
Syntax}) in the input is first skipped. If an end of file occurs after
|
|
the interlexeme space, the end-of-file object (@pxref{R6RS End-of-File})
|
|
is returned.
|
|
|
|
If a character inconsistent with an external representation is
|
|
encountered in the input, an exception with condition types
|
|
@code{&lexical} and @code{&i/o-read} is raised. Also, if the end of
|
|
file is encountered after the beginning of an external representation,
|
|
but the external representation is incomplete and therefore cannot be
|
|
parsed, an exception with condition types @code{&lexical} and
|
|
@code{&i/o-read} is raised.
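
As a sketch of the behavior described above, the following reads two
data from a string port (opened with Guile's @code{open-input-string},
an assumption of this example) and then reaches the end of file:

@lisp
(use-modules (rnrs io ports))

(define port (open-input-string "  ; a comment\n(a b) 42"))

(get-datum port)               ; @result{} (a b)
(get-datum port)               ; @result{} 42
(eof-object? (get-datum port)) ; @result{} #t
@end lisp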
|
|
@end deffn
|
|
|
|
@node R6RS Output Ports
|
|
@subsubsection Output Ports
|
|
|
|
@deffn {Scheme Procedure} output-port? obj
|
|
Returns @code{#t} if the argument is an output port (or a
|
|
combined input and output port), @code{#f} otherwise.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} flush-output-port port
|
|
Flushes any buffered output from the buffer of @var{port} to the
|
|
underlying file, device, or object. The @code{flush-output-port}
|
|
procedure returns unspecified values.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} open-file-output-port filename
|
|
@deffnx {Scheme Procedure} open-file-output-port filename file-options
|
|
@deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode
|
|
@deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode maybe-transcoder
|
|
|
|
@var{maybe-transcoder} must be either a transcoder or @code{#f}.
|
|
|
|
The @code{open-file-output-port} procedure returns an output port for the named file.
|
|
|
|
The @var{file-options} argument, which may determine various aspects of
|
|
the returned port (@pxref{R6RS File Options}), defaults to the value of
|
|
@code{(file-options)}.
|
|
|
|
The @var{buffer-mode} argument, if supplied,
|
|
must be one of the symbols that name a buffer mode.
|
|
The @var{buffer-mode} argument defaults to @code{block}.
|
|
|
|
If @var{maybe-transcoder} is a transcoder, it becomes the transcoder
|
|
associated with the port.
|
|
|
|
If @var{maybe-transcoder} is @code{#f} or absent,
|
|
the port will be a binary port and will support the
|
|
@code{port-position} and @code{set-port-position!} operations.
|
|
Otherwise the port will be a textual port, and whether it supports
|
|
the @code{port-position} and @code{set-port-position!} operations
|
|
is implementation-dependent (and possibly transcoder-dependent).
|
|
@end deffn
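
The following sketch writes a few octets to a file and reads them back
through binary file ports. The file name is hypothetical, and the
@code{no-fail} file option is assumed to allow overwriting an existing
file (@pxref{R6RS File Options}).

@lisp
(use-modules (rnrs io ports))

;; Hypothetical scratch file used only for this example.
(define file "/tmp/r6rs-ports-example.bin")

(call-with-port (open-file-output-port file (file-options no-fail))
  (lambda (out)
    (put-bytevector out #vu8(1 2 3))))

(call-with-port (open-file-input-port file)
  (lambda (in)
    (get-bytevector-all in)))
;; @result{} #vu8(1 2 3)
@end lisp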
|
|
|
|
@deffn {Scheme Procedure} standard-output-port
|
|
@deffnx {Scheme Procedure} standard-error-port
|
|
Returns a fresh binary output port connected to the standard output or
|
|
standard error respectively. Whether the port supports the
|
|
@code{port-position} and @code{set-port-position!} operations is
|
|
implementation-dependent.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} current-output-port
|
|
@deffnx {Scheme Procedure} current-error-port
|
|
These return default textual ports for regular output and error output.
|
|
Normally, these default ports are associated with standard output and
|
|
standard error, respectively. The return value of
|
|
@code{current-output-port} can be dynamically re-assigned using the
|
|
@code{with-output-to-file} procedure from the @code{io simple (6)}
|
|
library (@pxref{rnrs io simple}). A port returned by one of these
|
|
procedures may or may not have an associated transcoder; if it does, the
|
|
transcoder is implementation-dependent.
|
|
@end deffn
|
|
|
|
@node R6RS Binary Output
|
|
@subsubsection Binary Output
|
|
|
|
Binary output ports can be created with the procedures below.
|
|
|
|
@deffn {Scheme Procedure} open-bytevector-output-port [transcoder]
|
|
@deffnx {C Function} scm_open_bytevector_output_port (transcoder)
|
|
Return two values: a binary output port and a procedure. The latter
|
|
should be called with zero arguments to obtain a bytevector containing
|
|
the data accumulated by the port, as illustrated below.
|
|
|
|
@lisp
|
|
(call-with-values
  (lambda ()
    (open-bytevector-output-port))
  (lambda (port get-bytevector)
    (display "hello" port)
    (get-bytevector)))
|
|
|
|
@result{} #vu8(104 101 108 108 111)
|
|
@end lisp
|
|
|
|
@c FIXME: Update description when implemented.
|
|
The @var{transcoder} argument is currently not supported.
|
|
@end deffn
|
|
|
|
@cindex custom binary output ports
|
|
|
|
@deffn {Scheme Procedure} make-custom-binary-output-port id write! get-position set-position! close
|
|
@deffnx {C Function} scm_make_custom_binary_output_port (id, write!, get-position, set-position!, close)
|
|
Return a new custom binary output port named @var{id} (a string) whose
|
|
output is sunk by invoking @var{write!} and passing it a bytevector, an
|
|
index where bytes should be read from this bytevector, and the number of
|
|
bytes to be ``written''. The @code{write!} procedure must return an
|
|
integer indicating the number of bytes actually written; when it is
|
|
passed @code{0} as the number of bytes to write, it should behave as
|
|
though an end-of-file was sent to the byte sink.
|
|
|
|
The other arguments are as for @code{make-custom-binary-input-port}
|
|
(@pxref{R6RS Binary Input, @code{make-custom-binary-input-port}}).
|
|
@end deffn
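
The following sketch shows one way a @code{write!} procedure might be
written.  The names @code{chunks}, @code{collect!} and
@code{collector} are hypothetical, and passing @code{#f} for the
position and close arguments assumes that they are optional in the same
way as for @code{make-custom-binary-input-port}.

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

;; Accumulates copies of everything the port hands to write!.
(define chunks '())

(define (collect! bv start count)
  ;; Copy the COUNT bytes at START; the port may reuse BV later.
  (let ((copy (make-bytevector count)))
    (bytevector-copy! bv start copy 0 count)
    (set! chunks (cons copy chunks))
    ;; Report that every byte was consumed.
    count))

(define collector
  (make-custom-binary-output-port "collector" collect! #f #f #f))

(put-bytevector collector #vu8(1 2 3))
(force-output collector)

;; Join the chunks, however the port chose to split the data.
(apply append (map bytevector->u8-list (reverse chunks)))

@result{} (1 2 3)
@end lisp

Copying the bytes inside @code{collect!} is a defensive choice: the
sketch does not assume that the bytevector passed to @code{write!}
remains untouched after the call returns.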
|
|
|
|
@cindex binary output
|
|
Writing to a binary output port can be done using the following
|
|
procedures:
|
|
|
|
@deffn {Scheme Procedure} put-u8 port octet
|
|
@deffnx {C Function} scm_put_u8 (port, octet)
|
|
Write @var{octet}, an integer in the 0--255 range, to @var{port}, a
|
|
binary output port.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} put-bytevector port bv [start [count]]
|
|
@deffnx {C Function} scm_put_bytevector (port, bv, start, count)
|
|
Write the contents of @var{bv} to @var{port}, optionally starting at
|
|
index @var{start} and limiting to @var{count} octets.
|
|
@end deffn
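
As a short illustration, the sketch below combines @code{put-u8} and
@code{put-bytevector} on a bytevector output port; the optional
@var{start} and @var{count} arguments select the three octets beginning
at index 1.

@lisp
(use-modules (rnrs io ports))

(call-with-values open-bytevector-output-port
  (lambda (port get-bytevector)
    (put-u8 port 255)
    ;; Write 3 octets of the bytevector, starting at index 1.
    (put-bytevector port #vu8(1 2 3 4 5) 1 3)
    (get-bytevector)))

@result{} #vu8(255 2 3 4)
@end lisp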
|
|
|
|
@node R6RS Textual Output
|
|
@subsubsection Textual Output
|
|
|
|
@deffn {Scheme Procedure} put-char port char
|
|
Writes @var{char} to the port. The @code{put-char} procedure returns
|
|
an unspecified value.
|
|
@end deffn
|
|
|
|
@deffn {Scheme Procedure} put-string port string
|
|
@deffnx {Scheme Procedure} put-string port string start
|
|
@deffnx {Scheme Procedure} put-string port string start count
|
|
|
|
@var{start} and @var{count} must be non-negative exact integer objects.
|
|
@var{string} must have a length of at least @math{@var{start} +
|
|
@var{count}}. @var{start} defaults to 0. @var{count} defaults to
|
|
@math{@code{(string-length @var{string})} - @var{start}}. The
|
|
@code{put-string} procedure writes the @var{count} characters of
|
|
@var{string} starting at index @var{start} to the port. The
|
|
@code{put-string} procedure returns an unspecified value.
|
|
@end deffn
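
The defaults for @var{start} and @var{count} can be seen in the
following sketch, which writes to a string port created with Guile's
@code{call-with-output-string}.

@lisp
(use-modules (rnrs io ports))

(call-with-output-string
  (lambda (port)
    (put-char port #\>)
    ;; start = 7, count defaults to the rest of the string.
    (put-string port "hello, world" 7)
    ;; start = 0, count = 5.
    (put-string port "hello, world" 0 5)))

@result{} ">worldhello"
@end lisp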
|
|
|
|
@deffn {Scheme Procedure} put-datum textual-output-port datum
|
|
@var{datum} should be a datum value. The @code{put-datum} procedure
|
|
writes an external representation of @var{datum} to
|
|
@var{textual-output-port}. The specific external representation is
|
|
implementation-dependent. However, whenever possible, an implementation
|
|
should produce a representation for which @code{get-datum}, when reading
|
|
the representation, will return an object equal (in the sense of
|
|
@code{equal?}) to @var{datum}.
|
|
|
|
@quotation Note
|
|
Not all datums may allow producing an external representation for which
|
|
@code{get-datum} will produce an object that is equal to the
|
|
original. Specifically, NaNs contained in @var{datum} may make
|
|
this impossible.
|
|
@end quotation
|
|
|
|
@quotation Note
|
|
The @code{put-datum} procedure merely writes the external
|
|
representation, but no trailing delimiter. If @code{put-datum} is
|
|
used to write several subsequent external representations to an
|
|
output port, care should be taken to delimit them properly so they can
|
|
be read back in by subsequent calls to @code{get-datum}.
|
|
@end quotation
|
|
@end deffn
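
The need for a delimiter can be illustrated as follows.  This is only a
sketch: it writes a symbol and a number separated by an explicit space,
then reads both back with @code{get-datum}.  Without the
@code{put-char} call, the two representations would run together as
@code{foo42} and be read back as a single symbol.

@lisp
(use-modules (rnrs io ports))

(define text
  (call-with-output-string
    (lambda (port)
      (put-datum port 'foo)
      (put-char port #\space)     ; explicit delimiter
      (put-datum port 42))))

(call-with-input-string text
  (lambda (port)
    (let* ((first (get-datum port))
           (second (get-datum port)))
      (list first second))))

@result{} (foo 42)
@end lisp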
|
|
|
|
@node I/O Extensions
|
|
@subsection Implementing New Port Types in C
|
|
|
|
This section describes how to implement a new port type in C. Although
|
|
ports support many operations, as a data structure they present an
|
|
opaque interface to the user. As the port implementor, you have two
|
|
additional pieces of information: the port type, which is an opaque
|
|
pointer allocated when defining your port type; and a port's ``stream'',
|
|
which you allocate when you create a port.
|
|
|
|
The port type helps you identify which ports are actually yours. The
|
|
``stream'' is the private data associated with that port which you and
|
|
only you control. Get a stream from a port using the @code{SCM_STREAM}
|
|
macro. Note that your port methods are only ever called with ports of
|
|
your type.
|
|
|
|
A port type is created by calling @code{scm_make_port_type}. Once you
|
|
have your port type, you can create ports with @code{scm_c_make_port},
|
|
or @code{scm_c_make_port_with_encoding}.
|
|
|
|
@deftypefun scm_t_port_type* scm_make_port_type (char *name, size_t (*read) (SCM port, SCM dst, size_t start, size_t count), size_t (*write) (SCM port, SCM src, size_t start, size_t count))
|
|
Define a new port type. The @var{name}, @var{read} and @var{write}
|
|
parameters are initial values for those port type fields, as described
|
|
below. The other fields are initialized with default values and can be
|
|
changed later.
|
|
@end deftypefun
|
|
|
|
@deftypefun SCM scm_c_make_port_with_encoding (scm_t_port_type *type, unsigned long mode_bits, SCM encoding, SCM conversion_strategy, scm_t_bits stream)
|
|
@deftypefunx SCM scm_c_make_port (scm_t_port_type *type, unsigned long mode_bits, scm_t_bits stream)
|
|
Make a port with the given @var{type}. The @var{stream} indicates the
|
|
private data associated with the port, which your port implementation
|
|
may later retrieve with @code{SCM_STREAM}. The mode bits should include
|
|
one or more of the flags @code{SCM_RDNG} or @code{SCM_WRTNG}, indicating
|
|
that the port is an input and/or an output port, respectively. The mode
|
|
bits may also include @code{SCM_BUF0} or @code{SCM_BUFLINE}, indicating
|
|
that the port should be unbuffered or line-buffered, respectively. The
|
|
default is that the port will be block-buffered. @xref{Buffering}.
|
|
|
|
As you would imagine, @var{encoding} and @var{conversion_strategy}
|
|
specify the port's initial textual encoding and conversion strategy.
|
|
Both are symbols. @code{scm_c_make_port} is the same as
|
|
@code{scm_c_make_port_with_encoding}, except it uses the default port
|
|
encoding and conversion strategy.
|
|
@end deftypefun
|
|
|
|
The port type has a number of associated procedures and properties which
|
|
collectively implement the port's behavior. Creating a new port type
|
|
mostly involves writing these procedures.
|
|
|
|
@table @code
|
|
@item name
|
|
A pointer to a NUL terminated string: the name of the port type. This
|
|
property is initialized via the first argument to
|
|
@code{scm_make_port_type}.
|
|
|
|
@item read
|
|
A port's @code{read} implementation fills read buffers. It should copy
|
|
bytes to the supplied bytevector @code{dst}, starting at offset
|
|
@code{start} and continuing for @code{count} bytes, returning the number
|
|
of bytes read.
|
|
|
|
@item write
|
|
A port's @code{write} implementation flushes write buffers to the
mutable store. It should write out bytes from the supplied bytevector
@code{src}, starting at offset @code{start} and continuing for
@code{count} bytes, and return the number of bytes that were written.
|
|
|
|
@item read_wait_fd
|
|
@itemx write_wait_fd
|
|
If a port's @code{read} or @code{write} function returns @code{(size_t)
|
|
-1}, that indicates that reading or writing would block. In that case,
|
|
to preserve the illusion of a blocking read or write operation, Guile's
|
|
C port run-time will @code{poll} on the file descriptor returned by
|
|
either the port's @code{read_wait_fd} or @code{write_wait_fd} function.
|
|
Set using
|
|
|
|
@deftypefun void scm_set_port_read_wait_fd (scm_t_port_type *type, int (*wait_fd) (SCM port))
|
|
@deftypefunx void scm_set_port_write_wait_fd (scm_t_port_type *type, int (*wait_fd) (SCM port))
|
|
@end deftypefun
|
|
|
|
Only a port type which implements the @code{read_wait_fd} or
|
|
@code{write_wait_fd} port methods can usefully return @code{(size_t) -1}
|
|
from a read or write function. @xref{Non-Blocking I/O}, for more on
|
|
non-blocking I/O in Guile.
|
|
|
|
@item print
|
|
Called when @code{write} is called on the port, to print a port
|
|
description. For example, for a file port it may produce something
|
|
like: @code{#<input: /etc/passwd 3>}. Set using
|
|
|
|
@deftypefun void scm_set_port_print (scm_t_port_type *type, int (*print) (SCM port, SCM dest_port, scm_print_state *pstate))
|
|
The first argument @var{port} is the port being printed, the second
|
|
argument @var{dest_port} is where its description should go.
|
|
@end deftypefun
|
|
|
|
@item close
|
|
Called when the port is closed. It should free any resources used by
|
|
the port. Set using
|
|
|
|
@deftypefun void scm_set_port_close (scm_t_port_type *type, void (*close) (SCM port))
|
|
@end deftypefun
|
|
|
|
By default, ports that are garbage collected just go away without
|
|
closing. If your port type needs to release some external resource like
|
|
a file descriptor, or needs to make sure that its internal buffers are
|
|
flushed even if the port is collected while it was open, then mark the
|
|
port type as needing a close on GC.
|
|
|
|
@deftypefun void scm_set_port_needs_close_on_gc (scm_t_port_type *type, int needs_close_p)
|
|
@end deftypefun
|
|
|
|
@item seek
|
|
Set the current position of the port. Guile will flush read and/or
|
|
write buffers before seeking, as appropriate. Set using
|
|
|
|
@deftypefun void scm_set_port_seek (scm_t_port_type *type, scm_t_off (*seek) (SCM port, scm_t_off offset, int whence))
|
|
@end deftypefun
|
|
|
|
@item truncate
|
|
Truncate the port data to the specified length. Guile will flush buffers
beforehand, as appropriate. Set using
|
|
|
|
@deftypefun void scm_set_port_truncate (scm_t_port_type *type, void (*truncate) (SCM port, scm_t_off length))
|
|
@end deftypefun
|
|
|
|
@item random_access_p
|
|
Determine whether this port is a random-access port.
|
|
|
|
@cindex random access
|
|
Seeking on a random-access port with buffered input, or switching to
|
|
writing after reading, will cause the buffered input to be discarded and
|
|
Guile will seek the port back the buffered number of bytes. Likewise
|
|
seeking on a random-access port with buffered output, or switching to
|
|
reading after writing, will flush pending bytes with a call to the
|
|
@code{write} procedure. @xref{Buffering}.
|
|
|
|
Indicate to Guile that your port needs this behavior by returning a
|
|
nonzero value from your @code{random_access_p} function. The default
|
|
implementation of this function returns nonzero if the port type
|
|
supplies a seek implementation.
|
|
|
|
@deftypefun void scm_set_port_random_access_p (scm_t_port_type *type, int (*random_access_p) (SCM port))
|
|
@end deftypefun
|
|
|
|
@item get_natural_buffer_sizes
|
|
Guile will internally attach buffers to ports. An input port always has
|
|
a read buffer and an output port always has a write buffer.
|
|
@xref{Buffering}. A port buffer consists of a bytevector, along with
|
|
some cursors into that bytevector denoting where to get and put data.
|
|
|
|
Port implementations generally don't have to be concerned with
|
|
buffering: a port type's @code{read} or @code{write} function will
|
|
receive the buffer's bytevector as an argument, along with an offset and
|
|
a length into that bytevector, and should then either fill or empty that
|
|
bytevector. However in some cases, port implementations may be able to
|
|
provide an appropriate default buffer size to Guile.
|
|
|
|
@deftypefun void scm_set_port_get_natural_buffer_sizes @
|
|
(scm_t_port_type *type, void (*get_natural_buffer_sizes) (SCM, size_t *read_buf_size, size_t *write_buf_size))
|
|
Fill in @var{read_buf_size} and @var{write_buf_size} with an appropriate buffer size for this port, if one is known.
|
|
@end deftypefun
|
|
|
|
File ports implement a @code{get_natural_buffer_sizes} to let the
|
|
operating system inform Guile about the appropriate buffer sizes for the
|
|
particular file opened by the port.
|
|
@end table
|
|
|
|
@node Non-Blocking I/O
|
|
@subsection Non-Blocking I/O
|
|
|
|
Most ports in Guile are @dfn{blocking}: when you try to read a character
|
|
from a port, Guile will block on the read until a character is ready, or
|
|
end-of-stream is detected. Likewise whenever Guile goes to write
|
|
(possibly buffered) data to an output port, Guile will block until all
|
|
the data is written.
|
|
|
|
Interacting with ports in blocking mode is very convenient: you can
|
|
write straightforward, sequential algorithms whose code flow reflects
|
|
the flow of data. However, blocking I/O has two main limitations.
|
|
|
|
The first is that it's easy to get into a situation where code is
|
|
waiting on data. Time spent waiting on data when code could be doing
|
|
something else is wasteful and prevents your program from reaching its
|
|
peak throughput. If you implement a web server that sequentially
|
|
handles requests from clients, it's very easy for the server to end up
|
|
waiting on a client to finish its HTTP request, or waiting on it to
|
|
consume the response. The end result is that you are able to serve
|
|
fewer requests per second than you'd like to serve.
|
|
|
|
The second limitation is related: a blocking parser over user-controlled
|
|
input is a denial-of-service vulnerability. Indeed the so-called ``slow
|
|
loris'' attack of the early 2010s was just that: an attack that drip-fed
HTTP requests to common web servers, one character at a time. All it
|
|
took was a handful of slow loris connections to occupy an entire web
|
|
server.
|
|
|
|
In Guile we would like to preserve the ability to write straightforward
|
|
blocking networking processes of all kinds, but under the hood to allow
|
|
those processes to suspend their requests if they would block.
|
|
|
|
To do this, the first piece is to allow Guile ports to declare
|
|
themselves as being nonblocking. This is currently supported only for
|
|
file ports, a category that includes sockets, terminals, and any other port
|
|
that is backed by a file descriptor. To do that, we use an arcane UNIX
|
|
incantation:
|
|
|
|
@example
|
|
(let ((flags (fcntl socket F_GETFL)))
|
|
(fcntl socket F_SETFL (logior O_NONBLOCK flags)))
|
|
@end example
|
|
|
|
Now the file descriptor is open in non-blocking mode. If Guile tries to
read from or write to this file descriptor in C, it will block by polling
on the port's @code{read_wait_fd} or @code{write_wait_fd}, to preserve
the illusion of a blocking read or write. @xref{I/O Extensions}, for
more on that internal interface.
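
The incantation can be wrapped in a small helper.  The sketch below is
illustrative only: @code{make-nonblocking!} is a hypothetical name, and
a pipe stands in for a socket so that the example needs no network
setup.

@lisp
(define (make-nonblocking! port)
  ;; Add O_NONBLOCK to whatever status flags are already set.
  (let ((flags (fcntl port F_GETFL)))
    (fcntl port F_SETFL (logior O_NONBLOCK flags))
    port))

;; `pipe' returns a pair of ports: (read-end . write-end).
(define ends (pipe))
(make-nonblocking! (car ends))

;; Check that the flag is now set on the read end.
(logtest O_NONBLOCK (fcntl (car ends) F_GETFL))

@result{} #t
@end lisp

Reading the existing flags first and combining them with
@code{logior} avoids clearing any other status flags that were already
set on the descriptor.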
|
|
|
|
However if a user uses the new and experimental Scheme implementation of
|
|
ports in @code{(ice-9 sports)}, Guile instead calls the value of the
|
|
@code{current-read-waiter} or @code{current-write-waiter} parameters on
|
|
the port before re-trying the read or write. The default value of these
|
|
parameters does the same thing as the C port runtime: it blocks.
|
|
However it's possible to dynamically bind these parameters to handlers
|
|
that can suspend the current coroutine to a scheduler, to be later
|
|
re-animated once the port becomes readable or writable in the future.
|
|
In the meantime the scheduler can run other code, for example servicing
|
|
other web requests.
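
As a rough sketch of how such a handler might be installed, the
following wraps the default read waiter in one that logs before
waiting.  It assumes that @code{(ice-9 sports)} exports
@code{current-read-waiter} as a parameter whose value is a procedure of
one argument, the port, and that the Scheme port implementation has
already been activated.  The name @code{call-with-logging-waiter} is
hypothetical.  A real scheduler would suspend the current coroutine in
the handler instead of calling the default waiter.

@lisp
(use-modules (ice-9 sports))

(define (call-with-logging-waiter thunk)
  ;; Capture the default waiter so the handler can still block.
  (let ((default-waiter (current-read-waiter)))
    (parameterize ((current-read-waiter
                    (lambda (port)
                      (format (current-error-port)
                              "read would block on ~a\n" port)
                      ;; A scheduler would suspend here instead.
                      (default-waiter port))))
      (thunk))))
@end lisp

Because the handler is bound with @code{parameterize}, it applies only
within the dynamic extent of @var{thunk}; code running elsewhere keeps
the default blocking behavior.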
|
|
|
|
Guile does not currently include such a scheduler. For now, we want to
|
|
make sure that we're providing the right primitives that can be used to
|
|
build schedulers and other user-space concurrency patterns. In the
|
|
meantime, have a look at 8sync (@url{https://gnu.org/software/8sync})
|
|
for a prototype of an asynchronous I/O and concurrency facility.
|
|
|
|
|
|
@node BOM Handling
|
|
@subsection Handling of Unicode Byte Order Marks
|
|
@cindex BOM
|
|
@cindex byte order mark
|
|
|
|
This section documents the finer points of Guile's handling of Unicode
|
|
byte order marks (BOMs). A byte order mark (U+FEFF) is typically found
|
|
at the start of a UTF-16 or UTF-32 stream, to allow readers to reliably
|
|
determine the byte order. Occasionally, a BOM is found at the start of
|
|
a UTF-8 stream, but this is much less common and not generally
|
|
recommended.
|
|
|
|
Guile attempts to handle BOMs automatically, and in accordance with the
|
|
recommendations of the Unicode Standard, when the port encoding is set
|
|
to @code{UTF-8}, @code{UTF-16}, or @code{UTF-32}. In brief, Guile
|
|
automatically writes a BOM at the start of a UTF-16 or UTF-32 stream,
|
|
and automatically consumes one from the start of a UTF-8, UTF-16, or
|
|
UTF-32 stream.
|
|
|
|
As specified in the Unicode Standard, a BOM is only handled specially at
|
|
the start of a stream, and only if the port encoding is set to
|
|
@code{UTF-8}, @code{UTF-16} or @code{UTF-32}. If the port encoding is
|
|
set to @code{UTF-16BE}, @code{UTF-16LE}, @code{UTF-32BE}, or
|
|
@code{UTF-32LE}, then BOMs are @emph{not} handled specially, and none of
|
|
the special handling described in this section applies.
|
|
|
|
@itemize @bullet
|
|
@item
|
|
To ensure that Guile will properly detect the byte order of a UTF-16 or
|
|
UTF-32 stream, you must perform a textual read before any writes, seeks,
|
|
or binary I/O. Guile will not attempt to read a BOM unless a read is
|
|
explicitly requested at the start of the stream.
|
|
|
|
@item
|
|
If a textual write is performed before the first read, then an arbitrary
|
|
byte order will be chosen. Currently, big endian is the default on all
|
|
platforms, but that may change in the future. If you wish to explicitly
|
|
control the byte order of an output stream, set the port encoding to
|
|
@code{UTF-16BE}, @code{UTF-16LE}, @code{UTF-32BE}, or @code{UTF-32LE},
|
|
and explicitly write a BOM (@code{#\xFEFF}) if desired.
|
|
|
|
@item
|
|
If @code{set-port-encoding!} is called in the middle of a stream, Guile
|
|
treats this as a new logical ``start of stream'' for purposes of BOM
|
|
handling, and will forget about any BOMs that had previously been seen.
|
|
Therefore, it may choose a different byte order than had been used
|
|
previously. This is intended to support multiple logical text streams
|
|
embedded within a larger binary stream.
|
|
|
|
@item
|
|
Binary I/O operations are not guaranteed to update Guile's notion of
|
|
whether the port is at the ``start of the stream'', nor are they
|
|
guaranteed to produce or consume BOMs.
|
|
|
|
@item
|
|
For ports that support seeking (e.g. normal files), the input and output
|
|
streams are considered linked: if the user reads first, then a BOM will
|
|
be consumed (if appropriate), but later writes will @emph{not} produce a
|
|
BOM. Similarly, if the user writes first, then later reads will
|
|
@emph{not} consume a BOM.
|
|
|
|
@item
|
|
For ports that are not random access (e.g. pipes, sockets, and
|
|
terminals), the input and output streams are considered
|
|
@emph{independent} for purposes of BOM handling: the first read will
|
|
consume a BOM (if appropriate), and the first write will @emph{also}
|
|
produce a BOM (if appropriate). However, the input and output streams
|
|
will always use the same byte order.
|
|
|
|
@item
|
|
Seeks to the beginning of a file will set the ``start of stream'' flags.
|
|
Therefore, a subsequent textual read or write will consume or produce a
|
|
BOM. However, unlike @code{set-port-encoding!}, if a byte order had
|
|
already been chosen for the port, it will remain in effect after a seek,
|
|
and cannot be changed by the presence of a BOM. Seeks anywhere other
|
|
than the beginning of a file clear the ``start of stream'' flags.
|
|
@end itemize
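
The two ways of producing a BOM on output can be compared with the
following sketch.  The helper @code{encode-string} is hypothetical, and
the byte sequences mentioned afterwards are what the rules above lead
one to expect; they are not verified output.

@lisp
(use-modules (rnrs io ports))

(define (encode-string str encoding explicit-bom?)
  (call-with-values open-bytevector-output-port
    (lambda (port get-bytevector)
      (set-port-encoding! port encoding)
      ;; With an explicit byte order, write the BOM by hand.
      (when explicit-bom?
        (write-char #\xFEFF port))
      (display str port)
      (force-output port)
      (get-bytevector))))

;; Guile chooses the byte order and writes the BOM itself.
(encode-string "a" "UTF-16" #f)

;; The byte order is fixed, so the BOM is written explicitly.
(encode-string "a" "UTF-16LE" #t)
@end lisp

According to the description above, the first call should produce a
big-endian stream that starts with the octets 254 255, and the second a
little-endian stream that starts with 255 254.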
|
|
|
|
@c Local Variables:
|
|
@c TeX-master: "guile.texi"
|
|
@c End:
|