@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2007, 2009,
@c 2010, 2011, 2013 Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.

@node Input and Output
@section Input and Output

@menu
* Ports::                       The idea of the port abstraction.
* Reading::                     Procedures for reading from a port.
* Writing::                     Procedures for writing to a port.
* Closing::                     Procedures to close a port.
* Buffering::                   Controlling when data is written to ports.
* Random Access::               Moving around a random access port.
* Line/Delimited::              Read and write lines or delimited text.
* Block Reading and Writing::   Reading and writing blocks of text.
* Default Ports::               Defaults for input, output and errors.
* Port Types::                  Types of port and how to make them.
* R6RS I/O Ports::              The R6RS port API.
* I/O Extensions::              Implementing new port types in C.
* Non-Blocking I/O::            How Guile deals with EWOULDBLOCK.
* BOM Handling::                Handling of Unicode byte order marks.
@end menu

@node Ports
@subsection Ports
@cindex Port

Sequential input/output in Scheme is represented by operations on a
@dfn{port}. This chapter explains the operations that Guile provides
for working with ports.

Ports are created by opening, for instance @code{open-file} for a file
(@pxref{File Ports}). Other kinds of ports include @dfn{soft ports} and
@dfn{string ports} (@pxref{Soft Ports}, and @ref{String Ports}).
Characters or bytes can be read from an input port and written to an
output port, or both on an input/output port. A port can be closed
(@pxref{Closing}) when no longer required, after which any attempt to
read or write is an error.

Ports are garbage collected in the usual way (@pxref{Memory
Management}), and will be closed at that time if not already closed. In
this case any errors occurring in the close will not be reported.
Usually a program will want to explicitly close so as to be sure all its
operations have been successful, including any buffered writes
(@pxref{Buffering}). Of course if a program has abandoned something due
to an error or other condition then closing problems are probably not of
interest.

It is strongly recommended that file ports be closed explicitly when no
longer required. Most systems have limits on how many files can be
open, both on a per-process and a system-wide basis. A program that
uses many files should take care not to hit those limits. The same
applies to similar system resources such as pipes and sockets.

Note that automatic garbage collection is triggered only by memory
consumption, not by file or other resource usage, so a program cannot
rely on that to keep it away from system limits. An explicit call to
@code{gc} can of course be relied on to pick up unreferenced ports. If
program flow makes it hard to be certain when to close then this may be
an acceptable way to control resource usage.

All file access uses the ``LFS'' large file support functions when
available, so files bigger than 2 Gbytes (@math{2^31} bytes) can be
read and written on a 32-bit system.

Each port has an associated character encoding that controls how bytes
read from the port are converted to characters and how characters
written to the port are converted to bytes. When ports are created,
they inherit their character encoding from the current locale, but that
can be modified after the port is created.

Currently, ports only work with @emph{non-modal} encodings. Most
encodings are non-modal, meaning that the conversion of bytes to a
string doesn't depend on its context: the same byte sequence will always
return the same string. A couple of modal encodings are in common use,
like ISO-2022-JP and ISO-2022-KR, and they are not yet supported.
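
As a minimal sketch of the points above, the following opens a file
port, overrides the encoding it inherited, and closes the port
explicitly rather than leaving that to the garbage collector. The file
name and the variable @code{demo-port} are hypothetical, chosen only for
illustration.

@lisp
;; Hypothetical scratch file, used only for this illustration.
(define demo-port (open-output-file "/tmp/guile-port-demo.txt"))

;; The port starts with an inherited encoding; override it.
(set-port-encoding! demo-port "UTF-8")
(display "caf\u00e9\n" demo-port)

;; Closing explicitly flushes buffered output and reports any
;; error here, instead of silently at garbage collection time.
(close-port demo-port)
(port-closed? demo-port)
@end lisp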
@cindex port conversion strategy
@cindex conversion strategy, port
@cindex decoding error
@cindex encoding error
Each port also has an associated conversion strategy, which determines
what to do when a Guile character can't be converted to the port's
encoded character representation for output. There are three possible
strategies: to raise an error, to replace the character with a hex
escape, or to replace the character with a substitute character. Port
conversion strategies are also used when decoding characters from an
input port.

Finally, all ports have associated input and output buffers, as
appropriate. Buffering is a common strategy to limit the overhead of
small reads and writes: without buffering, each character fetched from a
file would involve at least one call into the kernel, and maybe more
depending on the character and the encoding. Instead, Guile will batch
reads and writes into internal buffers. However, sometimes you want to
make output on a port show up immediately. @xref{Buffering}, for more
on interfaces to control port buffering.

@rnindex input-port?
@deffn {Scheme Procedure} input-port? x
@deffnx {C Function} scm_input_port_p (x)
Return @code{#t} if @var{x} is an input port, otherwise return
@code{#f}. Any object satisfying this predicate also satisfies
@code{port?}.
@end deffn

@rnindex output-port?
@deffn {Scheme Procedure} output-port? x
@deffnx {C Function} scm_output_port_p (x)
Return @code{#t} if @var{x} is an output port, otherwise return
@code{#f}. Any object satisfying this predicate also satisfies
@code{port?}.
@end deffn

@deffn {Scheme Procedure} port? x
@deffnx {C Function} scm_port_p (x)
Return a boolean indicating whether @var{x} is a port.
Equivalent to @code{(or (input-port? @var{x}) (output-port?
@var{x}))}.
@end deffn

@deffn {Scheme Procedure} set-port-encoding! port enc
@deffnx {C Function} scm_set_port_encoding_x (port, enc)
Sets the character encoding that will be used to interpret all port I/O.
@var{enc} is a string containing the name of an encoding. Valid
encoding names are those
@url{http://www.iana.org/assignments/character-sets, defined by IANA}.
@end deffn

@defvr {Scheme Variable} %default-port-encoding
A fluid containing @code{#f} or the name of the encoding to
be used by default for newly created ports (@pxref{Fluids and Dynamic
States}). The value @code{#f} is equivalent to @code{"ISO-8859-1"}.

New ports are created with the encoding appropriate for the current
locale if @code{setlocale} has been called or the value specified by
this fluid otherwise.
@end defvr

@deffn {Scheme Procedure} port-encoding port
@deffnx {C Function} scm_port_encoding (port)
Returns, as a string, the character encoding that @var{port}
uses to interpret its input and output. The value @code{#f} is
equivalent to @code{"ISO-8859-1"}.
@end deffn

@deffn {Scheme Procedure} set-port-conversion-strategy! port sym
@deffnx {C Function} scm_set_port_conversion_strategy_x (port, sym)
Sets the behavior of Guile when outputting a character that is not
representable in the port's current encoding, or when Guile encounters a
decoding error when trying to read a character. @var{sym} can be either
@code{error}, @code{substitute}, or @code{escape}.

If @var{port} is an open port, the conversion error behavior
is set for that port. If it is @code{#f}, it is set as the
default behavior for any future ports that get created in
this thread.
@end deffn

For an output port, there are three possible port conversion strategies.
The @code{error} strategy will throw an error when a nonconvertible
character is encountered. The @code{substitute} strategy will replace
nonconvertible characters with a question mark (@samp{?}). Finally the
@code{escape} strategy will print nonconvertible characters as a hex
escape, using the escaping that is recognized by Guile's string syntax.
Note that if the port's encoding is a Unicode encoding, like
@code{UTF-8}, then encoding errors are impossible.
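
The three output strategies can be compared with a short sketch like
the one below. The helper @code{write-latin1} and the scratch file name
are hypothetical, introduced only for this illustration. The example
assumes a file port whose encoding is set to @code{ISO-8859-1}, which
cannot represent U+03BB (Greek small letter lambda).

@lisp
;; Hypothetical helper: write STR to FILENAME as ISO-8859-1
;; using the given conversion STRATEGY, then close the port.
(define (write-latin1 filename str strategy)
  (let ((port (open-output-file filename)))
    (set-port-encoding! port "ISO-8859-1")
    (set-port-conversion-strategy! port strategy)
    (display str port)
    (close-port port)))

;; Per the description above, the lambda should come out as `?'.
(write-latin1 "/tmp/strategy-demo.txt" "a\u03bbb" 'substitute)

;; Here it should come out as a hex escape instead.
(write-latin1 "/tmp/strategy-demo.txt" "a\u03bbb" 'escape)

;; With `error', an exception is expected; catch it and report
;; the key rather than assuming a particular one.
(catch #t
  (lambda ()
    (write-latin1 "/tmp/strategy-demo.txt" "a\u03bbb" 'error))
  (lambda (key . args)
    (simple-format #t "conversion failed: ~A~%" key)))
@end lisp

Note that in the @code{error} case this sketch does not close the port
before the exception propagates; a real program would want to do so in
its handler.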
For an input port, the @code{error} strategy will cause Guile to throw
an error if it encounters an invalid encoding, such as might happen if
you tried to read @code{ISO-8859-1} as @code{UTF-8}. The error is
thrown before advancing the read position. The @code{substitute}
strategy will replace the bad bytes with a U+FFFD replacement character,
in accordance with Unicode recommendations. When reading from an input
port, the @code{escape} strategy is treated as if it were @code{error}.

@deffn {Scheme Procedure} port-conversion-strategy port
@deffnx {C Function} scm_port_conversion_strategy (port)
Returns the behavior of the port when outputting a character that is not
representable in the port's current encoding.

If @var{port} is @code{#f}, then the current default behavior will be
returned. New ports will have this default behavior when they are
created.
@end deffn

@deffn {Scheme Variable} %default-port-conversion-strategy
The fluid that defines the conversion strategy for newly created ports,
and for other conversion routines such as @code{scm_to_stringn},
@code{scm_from_stringn}, @code{string->pointer}, and
@code{pointer->string}.

Its value must be one of the symbols described above, with the same
semantics: @code{error}, @code{substitute}, or @code{escape}.

When Guile starts, its value is @code{substitute}.

Note that @code{(set-port-conversion-strategy! #f @var{sym})} is
equivalent to @code{(fluid-set! %default-port-conversion-strategy
@var{sym})}.
@end deffn

@node Reading
@subsection Reading
@cindex Reading

These procedures pertain to reading characters and strings from
ports. To read general S-expressions from ports, see @ref{Scheme Read}.

@rnindex eof-object?
@cindex End of file object
@deffn {Scheme Procedure} eof-object? x
@deffnx {C Function} scm_eof_object_p (x)
Return @code{#t} if @var{x} is an end-of-file object; otherwise
return @code{#f}.
@end deffn

@rnindex char-ready?
@deffn {Scheme Procedure} char-ready? [port]
@deffnx {C Function} scm_char_ready_p (port)
Return @code{#t} if a character is ready on input @var{port}
and return @code{#f} otherwise. If @code{char-ready?} returns
@code{#t} then the next @code{read-char} operation on
@var{port} is guaranteed not to hang. If @var{port} is a file
port at end of file then @code{char-ready?} returns @code{#t}.

@code{char-ready?} exists to make it possible for a
program to accept characters from interactive ports without
getting stuck waiting for input. Any input editors associated
with such ports must make sure that characters whose existence
has been asserted by @code{char-ready?} cannot be rubbed out.
If @code{char-ready?} were to return @code{#f} at end of file,
a port at end of file would be indistinguishable from an
interactive port that has no ready characters.
@end deffn

@rnindex read-char
@deffn {Scheme Procedure} read-char [port]
@deffnx {C Function} scm_read_char (port)
Return the next character available from @var{port}, updating
@var{port} to point to the following character. If no more
characters are available, the end-of-file object is returned.
A decoding error, if any, is handled in accordance with the port's
conversion strategy.
@end deffn

@deftypefn {C Function} size_t scm_c_read (SCM port, void *buffer, size_t size)
Read up to @var{size} bytes from @var{port} and store them in
@var{buffer}. The return value is the number of bytes actually read,
which can be less than @var{size} if end-of-file has been reached.

Note that this function does not update @code{port-line} and
@code{port-column} below.
@end deftypefn

@rnindex peek-char
@deffn {Scheme Procedure} peek-char [port]
@deffnx {C Function} scm_peek_char (port)
Return the next character available from @var{port},
@emph{without} updating @var{port} to point to the following
character. If no more characters are available, the
end-of-file object is returned.

The value returned by a call to @code{peek-char} is the same as the
value that would have been returned by a call to @code{read-char} on the
same port. The only difference is that the very next call to
@code{read-char} or @code{peek-char} on that @var{port} will return the
value returned by the preceding call to @code{peek-char}. In
particular, a call to @code{peek-char} on an interactive port will hang
waiting for input whenever a call to @code{read-char} would have hung.

As for @code{read-char}, decoding errors are handled in accordance with
the port's conversion strategy.
@end deffn

@deffn {Scheme Procedure} unread-char cobj [port]
@deffnx {C Function} scm_unread_char (cobj, port)
Place character @var{cobj} in @var{port} so that it will be read by the
next read operation. If called multiple times, the unread characters
will be read again in last-in first-out order. If @var{port} is
not supplied, the current input port is used.
@end deffn

@deffn {Scheme Procedure} unread-string str port
@deffnx {C Function} scm_unread_string (str, port)
Place the string @var{str} in @var{port} so that its characters will
be read from left-to-right as the next characters from @var{port}
during subsequent read operations. If called multiple times, the
unread characters will be read again in last-in first-out order. If
@var{port} is not supplied, the @code{current-input-port} is used.
@end deffn

@deffn {Scheme Procedure} drain-input port
@deffnx {C Function} scm_drain_input (port)
This procedure clears a port's input buffers, similar
to the way that @code{force-output} clears the output buffer. The
contents of the buffers are returned as a single string, e.g.,

@lisp
(define p (open-input-file ...))
(drain-input p) => empty string, nothing buffered yet.
(unread-char (read-char p) p)
(drain-input p) => initial chars from p, up to the buffer size.
@end lisp

Draining the buffers may be useful for cleanly finishing
buffered I/O so that the file descriptor can be used directly
for further input.
@end deffn

@deffn {Scheme Procedure} port-column port
@deffnx {Scheme Procedure} port-line port
@deffnx {C Function} scm_port_column (port)
@deffnx {C Function} scm_port_line (port)
Return the current column number or line number of @var{port}.
If the number is unknown, the result is @code{#f}. Otherwise, the
result is a 0-origin integer - i.e.@: the first character of the first
line is line 0, column 0. (However, when you display a file position,
for example in an error message, we recommend you add 1 to get 1-origin
integers. This is because lines and column numbers traditionally start
with 1, and that is what non-programmers will find most natural.)
@end deffn

@deffn {Scheme Procedure} set-port-column! port column
@deffnx {Scheme Procedure} set-port-line! port line
@deffnx {C Function} scm_set_port_column_x (port, column)
@deffnx {C Function} scm_set_port_line_x (port, line)
Set the current column or line number of @var{port}.
@end deffn

@node Writing
@subsection Writing
@cindex Writing

These procedures are for writing characters and strings to
ports. For more information on writing arbitrary Scheme objects to
ports, see @ref{Scheme Write}.

@deffn {Scheme Procedure} get-print-state port
@deffnx {C Function} scm_get_print_state (port)
Return the print state of the port @var{port}. If @var{port}
has no associated print state, @code{#f} is returned.
@end deffn

@rnindex newline
@deffn {Scheme Procedure} newline [port]
@deffnx {C Function} scm_newline (port)
Send a newline to @var{port}.
If @var{port} is omitted, send to the current output port.
@end deffn

@deffn {Scheme Procedure} port-with-print-state port [pstate]
@deffnx {C Function} scm_port_with_print_state (port, pstate)
Create a new port which behaves like @var{port}, but with an
included print state @var{pstate}. @var{pstate} is optional.
If @var{pstate} isn't supplied and @var{port} already has
a print state, the old print state is reused.
@end deffn

@deffn {Scheme Procedure} simple-format destination message . args
@deffnx {C Function} scm_simple_format (destination, message, args)
Write @var{message} to @var{destination}, defaulting to
the current output port.
@var{message} can contain @code{~A} and @code{~S} escapes. When
printed, the escapes are replaced with corresponding members of
@var{args}: @code{~A} formats using @code{display} and @code{~S}
formats using @code{write}.
If @var{destination} is @code{#t}, then use the current output
port, if @var{destination} is @code{#f}, then return a string
containing the formatted text. Does not add a trailing newline.
@end deffn

@rnindex write-char
@deffn {Scheme Procedure} write-char chr [port]
@deffnx {C Function} scm_write_char (chr, port)
Send character @var{chr} to @var{port}.
@end deffn

@deftypefn {C Function} void scm_c_write (SCM port, const void *buffer, size_t size)
Write @var{size} bytes at @var{buffer} to @var{port}.

Note that this function does not update @code{port-line} and
@code{port-column} (@pxref{Reading}).
@end deftypefn

@deftypefn {C Function} void scm_lfwrite (const char *buffer, size_t size, SCM port)
Write @var{size} bytes at @var{buffer} to @var{port}. The @code{lf}
indicates that unlike @code{scm_c_write}, this function updates the
port's @code{port-line} and @code{port-column}, and also flushes the
port if the data contains a newline (@code{\n}) and the port is
line-buffered.
@end deftypefn

@findex fflush
@deffn {Scheme Procedure} force-output [port]
@deffnx {C Function} scm_force_output (port)
Flush the specified output port, or the current output port if @var{port}
is omitted. The current output buffer contents are passed to the
underlying port implementation (e.g., in the case of fports, the
data will be written to the file and the output buffer will be cleared.)
It has no effect on an unbuffered port.

The return value is unspecified.
@end deffn

@deffn {Scheme Procedure} flush-all-ports
@deffnx {C Function} scm_flush_all_ports ()
Equivalent to calling @code{force-output} on
all open output ports.
The return value is unspecified.
@end deffn

@node Closing
@subsection Closing
@cindex Closing ports
@cindex Port, close

@deffn {Scheme Procedure} close-port port
@deffnx {C Function} scm_close_port (port)
Close the specified port object. Return @code{#t} if it successfully
closes a port or @code{#f} if it was already closed. An exception may
be raised if an error occurs, for example when flushing buffered output.
@xref{Buffering}, for more on buffered output. See also @ref{Ports and
File Descriptors, close}, for a procedure which can close file
descriptors.
@end deffn

@deffn {Scheme Procedure} close-input-port port
@deffnx {Scheme Procedure} close-output-port port
@deffnx {C Function} scm_close_input_port (port)
@deffnx {C Function} scm_close_output_port (port)
@rnindex close-input-port
@rnindex close-output-port
Close the specified input or output @var{port}. An exception may be
raised if an error occurs while closing. If @var{port} is already
closed, nothing is done. The return value is unspecified.

See also @ref{Ports and File Descriptors, close}, for a procedure
which can close file descriptors.
@end deffn

@deffn {Scheme Procedure} port-closed? port
@deffnx {C Function} scm_port_closed_p (port)
Return @code{#t} if @var{port} is closed or @code{#f} if it is
open.
@end deffn

@node Buffering
@subsection Buffering
@cindex Port, buffering

Every port has associated input and output buffers. You can think of
ports as being backed by some mutable store, and that store might be far
away. For example, ports backed by file descriptors have to go all the
way to the kernel to read and write their data. To avoid this
round-trip cost, Guile usually reads in data from the mutable store in
chunks, and then services small requests like @code{get-char} out of
that intermediate buffer. Similarly, small writes like
@code{write-char} first go to a buffer, and are sent to the store when
the buffer is full (or when the port is flushed).
Buffered ports speed up your program by reducing the number of
round-trips to the mutable store, and they do so in a way that is mostly
transparent to the user.

There are two major ways, however, in which buffering affects program
semantics. Building correct, performant programs requires understanding
these situations.

The first case is in random-access read/write ports (@pxref{Random
Access}). These ports, usually backed by a file, logically operate over
the same mutable store when both reading and writing. So, if you read a
character, causing the buffer to fill, then write a character, the bytes
you filled in your read buffer are now invalid. Every time you switch
between reading and writing, Guile has to flush any pending buffer. If
this happens frequently, the cost can be high. In that case you should
reduce the amount that you buffer, in both directions. Similarly, Guile
has to flush buffers before seeking. None of these considerations apply
to sockets, which don't logically read from and write to the same
mutable store, and are not seekable. Note also that sockets are
unbuffered by default. @xref{Network Sockets and Communication}.

The second case is the more pernicious one. If you write data to a
buffered port, it probably hasn't gone out to the mutable store yet.
(This ``probably'' introduces some indeterminism in your program: what
goes to the store, and when, depends on how full the buffer is. It is
something that the user needs to explicitly be aware of.) The data is
written to the store later -- when the buffer fills up due to another
write, or when @code{force-output} is called, or when @code{close-port}
is called, or when the program exits, or even when the garbage collector
runs. The salient point is, @emph{the errors are signalled then too}.
Buffered writes defer error detection (and defer the side effects to the
mutable store), perhaps indefinitely if the port type does not need to
be closed at GC.
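
One way to cope with deferred error detection is to flush at a point
where the program is prepared to handle a failure. The following is a
minimal sketch under that assumption; the procedure name
@code{save-report} is hypothetical, and the sketch assumes that write
failures on a file port are reported as @code{system-error} exceptions.

@lisp
;; Hypothetical helper: write TEXT to FILENAME and return #t
;; only once the data has been pushed to the mutable store.
(define (save-report filename text)
  (let ((port (open-output-file filename)))
    (catch 'system-error
      (lambda ()
        ;; This may only fill the port's output buffer.
        (display text port)
        ;; Any deferred write error should surface here or at
        ;; close, while the handler below is still in effect.
        (force-output port)
        (close-port port)
        #t)
      (lambda args
        ;; Closing may attempt another flush; ignore a second
        ;; failure so that the caller just sees #f.
        (false-if-exception (close-port port))
        #f))))

(save-report "/tmp/report-demo.txt" "all systems nominal\n")
@end lisp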

One common heuristic that works well for textual ports is to flush
output when a newline (@code{\n}) is written. This @dfn{line buffering}
mode is on by default for TTY ports. Most other ports are @dfn{block
buffered}, meaning that once the output buffer reaches the block size,
which depends on the port and its configuration, the output is flushed
as a block, without regard to what is in the block. Likewise reads are
read in at the block size, though if there are fewer bytes available to
read, the buffer may not be entirely filled.

Note that binary reads or writes that are larger than the buffer size go
directly to the mutable store without passing through the buffers. If
your access pattern involves many big reads or writes, buffering might
not matter so much to you.

To control the buffering behavior of a port, use @code{setvbuf}.

@deffn {Scheme Procedure} setvbuf port mode [size]
@deffnx {C Function} scm_setvbuf (port, mode, size)
@cindex port buffering
Set the buffering mode for @var{port}. @var{mode} can be one of the
following symbols:

@table @code
@item none
non-buffered
@item line
line buffered
@item block
block buffered, using a newly allocated buffer of @var{size} bytes.
If @var{size} is omitted, a default size will be used.
@end table
@end deffn

Another way to set the buffering, for file ports, is to open the file
with @code{0} or @code{l} as part of the mode string, for unbuffered or
line-buffered ports, respectively. @xref{File Ports}, for more.

All of these considerations are very similar to those of streams in the
C library, although Guile's ports are not built on top of C streams.
Still, it is useful to read what other systems do.
@xref{Streams,,,libc,The GNU C Library Reference Manual}, for more
discussion on C streams.
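
As an illustration of the modes described above, the following sketch
switches one file port between buffering modes. The file name, the
variable @code{log-port}, and the buffer size of 65536 bytes are
arbitrary choices made for this example.

@lisp
;; Hypothetical log file, used only for this illustration.
(define log-port (open-output-file "/tmp/setvbuf-demo.log"))

;; Line buffering: output is flushed whenever a newline is
;; written, which suits progress messages.
(setvbuf log-port 'line)
(display "step 1 done\n" log-port)

;; Block buffering with an explicit buffer size, for bulk output.
(setvbuf log-port 'block 65536)
(display "a large amount of data would go here\n" log-port)
;; Nothing is guaranteed to reach the file until a flush.
(force-output log-port)

;; No buffering: each write goes to the store directly.
(setvbuf log-port 'none)
(display "written immediately\n" log-port)

(close-port log-port)
@end lisp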

@node Random Access
@subsection Random Access
@cindex Random access, ports
@cindex Port, random access

@deffn {Scheme Procedure} seek fd_port offset whence
@deffnx {C Function} scm_seek (fd_port, offset, whence)
Sets the current position of @var{fd_port} to the integer
@var{offset}. For a file port, @var{offset} is expressed
as a number of bytes; for other types of ports, such as string
ports, @var{offset} is an abstract representation of the
position within the port's data, not necessarily expressed
as a number of bytes. @var{offset} is interpreted according to
the value of @var{whence}.

One of the following variables should be supplied for
@var{whence}:
@defvar SEEK_SET
Seek from the beginning of the file.
@end defvar
@defvar SEEK_CUR
Seek from the current position.
@end defvar
@defvar SEEK_END
Seek from the end of the file.
@end defvar
If @var{fd_port} is a file descriptor, the underlying system
call is @code{lseek}. @var{fd_port} may be a string port.

The value returned is the new position in @var{fd_port}. This means
that the current position of a port can be obtained using:
@lisp
(seek fd_port 0 SEEK_CUR)
@end lisp
@end deffn

@deffn {Scheme Procedure} ftell fd_port
@deffnx {C Function} scm_ftell (fd_port)
Return an integer representing the current position of
@var{fd_port}, measured from the beginning. Equivalent to:

@lisp
(seek fd_port 0 SEEK_CUR)
@end lisp
@end deffn

@findex truncate
@findex ftruncate
@deffn {Scheme Procedure} truncate-file file [length]
@deffnx {C Function} scm_truncate_file (file, length)
Truncate @var{file} to @var{length} bytes. @var{file} can be a
filename string, a port object, or an integer file descriptor. The
return value is unspecified.

For a port or file descriptor @var{length} can be omitted, in which case
the file is truncated at the current position (per @code{ftell} above).

On most systems a file can be extended by giving a length greater than
the current size, but this is not mandatory in the POSIX standard.
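
The following sketch combines @code{seek}, @code{ftell} and
@code{truncate-file} on a single read/write file port. The file name
and the variable @code{rw-port} are hypothetical, and the sketch
assumes the @code{"w+"} mode string of @code{open-file} for a port that
can be both written and read.

@lisp
;; Hypothetical scratch file, opened for writing and reading.
(define rw-port (open-file "/tmp/seek-demo.txt" "w+"))

(display "hello world" rw-port)

;; Move to byte offset 5, counted from the start of the file.
(seek rw-port 5 SEEK_SET)

;; ftell reports the same position as (seek rw-port 0 SEEK_CUR).
(ftell rw-port)

;; With LENGTH omitted, truncate at the current position, so
;; only the first five bytes should remain.
(truncate-file rw-port)

;; Rewind and read back what is left of the file.
(seek rw-port 0 SEEK_SET)
(read-char rw-port)

(close-port rw-port)
@end lisp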
@end deffn

@node Line/Delimited
@subsection Line Oriented and Delimited Text
@cindex Line input/output
@cindex Port, line input/output

The delimited-I/O module can be accessed with:

@lisp
(use-modules (ice-9 rdelim))
@end lisp

It can be used to read or write lines of text, or read text delimited by
a specified set of characters. It's similar to the @code{(scsh rdelim)}
module from guile-scsh, but does not use multiple values or character
sets and has an extra procedure @code{write-line}.

@c begin (scm-doc-string "rdelim.scm" "read-line")
@deffn {Scheme Procedure} read-line [port] [handle-delim]
Return a line of text from @var{port} if specified, otherwise from the
value returned by @code{(current-input-port)}. Under Unix, a line of text
is terminated by the first end-of-line character or by end-of-file.

If @var{handle-delim} is specified, it should be one of the following
symbols:
@table @code
@item trim
Discard the terminating delimiter. This is the default, but it will
be impossible to tell whether the read terminated with a delimiter or
end-of-file.
@item concat
Append the terminating delimiter (if any) to the returned string.
@item peek
Push the terminating delimiter (if any) back on to the port.
@item split
Return a pair containing the string read from the port and the
terminating delimiter or end-of-file object.
@end table
@end deffn

@c begin (scm-doc-string "rdelim.scm" "read-line!")
@deffn {Scheme Procedure} read-line! buf [port]
Read a line of text into the supplied string @var{buf} and return the
number of characters added to @var{buf}. If @var{buf} is filled, then
@code{#f} is returned.
Read from @var{port} if
specified, otherwise from the value returned by @code{(current-input-port)}.
@end deffn

@c begin (scm-doc-string "rdelim.scm" "read-delimited")
@deffn {Scheme Procedure} read-delimited delims [port] [handle-delim]
Read text until one of the characters in the string @var{delims} is found
or end-of-file is reached.
Read from @var{port} if supplied, otherwise
from the value returned by @code{(current-input-port)}.
@var{handle-delim} takes the same values as described for @code{read-line}.
@end deffn

@c begin (scm-doc-string "rdelim.scm" "read-delimited!")
@deffn {Scheme Procedure} read-delimited! delims buf [port] [handle-delim] [start] [end]
Read text into the supplied string @var{buf}.

If a delimiter was found, return the number of characters written,
except if @var{handle-delim} is @code{split}, in which case the return
value is a pair, as noted above.

As a special case, if @var{port} was already at end-of-stream, the EOF
object is returned. Also, if no characters were written because the
buffer was full, @code{#f} is returned.

It's something of a wacky interface, to be honest.
@end deffn

@deffn {Scheme Procedure} write-line obj [port]
@deffnx {C Function} scm_write_line (obj, port)
Display @var{obj} and a newline character to @var{port}. If
@var{port} is not specified, @code{(current-output-port)} is
used. This function is equivalent to:
@lisp
(display obj [port])
(newline [port])
@end lisp
@end deffn

In the past, Guile did not have a procedure that would just read out all
of the characters from a port. As a workaround, many people just called
@code{read-delimited} with no delimiters, knowing that would produce the
behavior they wanted. This prompted Guile developers to add some
routines that would read all characters from a port. So it is that
@code{(ice-9 rdelim)} is also the home for procedures that can read
undelimited text:

@deffn {Scheme Procedure} read-string [port] [count]
Read all of the characters out of @var{port} and return them as a
string. If the @var{count} is present, treat it as a limit to the
number of characters to read.

By default, read from the current input port, with no size limit on the
result. This procedure always returns a string, even if no characters
were read.
@end deffn

@deffn {Scheme Procedure} read-string! buf [port] [start] [end]
Fill @var{buf} with characters read from @var{port}, defaulting to the
current input port. Return the number of characters read.

If @var{start} or @var{end} are specified, store data only into the
substring of @var{buf} bounded by @var{start} and @var{end} (which
default to the beginning and end of the string, respectively).
@end deffn

Some of the aforementioned I/O functions rely on the following C
primitives. These will mainly be of interest to people hacking Guile
internals.

@deffn {Scheme Procedure} %read-delimited! delims str gobble [port [start [end]]]
@deffnx {C Function} scm_read_delimited_x (delims, str, gobble, port, start, end)
Read characters from @var{port} into @var{str} until one of the
characters in the @var{delims} string is encountered. If
@var{gobble} is true, discard the delimiter character;
otherwise, leave it in the input stream for the next read. If
@var{port} is not specified, use the value of
@code{(current-input-port)}. If @var{start} or @var{end} are
specified, store data only into the substring of @var{str}
bounded by @var{start} and @var{end} (which default to the
beginning and end of the string, respectively).

Return a pair consisting of the delimiter that terminated the
string and the number of characters read. If reading stopped
at the end of file, the delimiter returned is the
@var{eof-object}; if the string was filled without encountering
a delimiter, this value is @code{#f}.
@end deffn

@deffn {Scheme Procedure} %read-line [port]
@deffnx {C Function} scm_read_line (port)
Read a newline-terminated line from @var{port}, allocating storage as
necessary. The newline terminator (if any) is removed from the string,
and a pair consisting of the line and its delimiter is returned. The
delimiter may be either a newline or the @var{eof-object}; if
@code{%read-line} is called at the end of file, it returns the pair
@code{(#<eof> . #<eof>)}.
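
The pairs described above can be observed with a short sketch that
reads from a string port. It assumes that @code{%read-line} is
available once @code{(ice-9 rdelim)} has been loaded; the input text is
arbitrary.

@lisp
(use-modules (ice-9 rdelim))

;; Read three times from a two-line input whose last line has
;; no trailing newline, collecting each returned pair.
(call-with-input-string "one\ntwo"
  (lambda (port)
    (let* ((first  (%read-line port))
           (second (%read-line port))
           (third  (%read-line port)))
      (list first second third))))
@end lisp

Going by the description, the first pair should carry a newline as its
delimiter, the second the end-of-file object, and the third should be
the pair of two end-of-file objects. User code would normally call
@code{(read-line port 'split)} rather than this primitive.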
@end deffn

@node Block Reading and Writing
@subsection Block reading and writing
@cindex Block read/write
@cindex Port, block read/write

The Block-string-I/O module can be accessed with:

@lisp
(use-modules (ice-9 rw))
@end lisp

It currently contains procedures that help to implement the
@code{(scsh rw)} module in guile-scsh.

@deffn {Scheme Procedure} read-string!/partial str [port_or_fdes [start [end]]]
@deffnx {C Function} scm_read_string_x_partial (str, port_or_fdes, start, end)
Read characters from a port or file descriptor into a
string @var{str}. A port must have an underlying file
descriptor --- a so-called fport. This procedure is
scsh-compatible and can efficiently read large strings.
It will:

@itemize
@item
attempt to fill the entire string, unless the @var{start}
and/or @var{end} arguments are supplied, i.e., @var{start}
defaults to 0 and @var{end} defaults to
@code{(string-length str)}
@item
use the current input port if @var{port_or_fdes} is not
supplied.
@item
return fewer than the requested number of characters in some
cases, e.g., on end of file, if interrupted by a signal, or if
not all the characters are immediately available.
@item
wait indefinitely for some input if no characters are
currently available,
unless the port is in non-blocking mode.
@item
read characters from the port's input buffers if available,
instead of from the underlying file descriptor.
@item
return @code{#f} if end-of-file is encountered before reading
any characters, otherwise return the number of characters
read.
@item
return 0 if the port is in non-blocking mode and no characters
are immediately available.
@item
return 0 if the request is for 0 bytes, with no
end-of-file check.
@end itemize
@end deffn

@deffn {Scheme Procedure} write-string/partial str [port_or_fdes [start [end]]]
@deffnx {C Function} scm_write_string_partial (str, port_or_fdes, start, end)
Write characters from a string @var{str} to a port or file
descriptor.
A port must have an underlying file descriptor (a so-called fport). This procedure is scsh-compatible and can efficiently write large strings. It will: @itemize @item attempt to write the entire string, unless the @var{start} and/or @var{end} arguments are supplied; @var{start} defaults to 0 and @var{end} defaults to @code{(string-length str)}. @item use the current output port if @var{port_or_fdes} is not supplied. @item in the case of a buffered port, store the characters in the port's output buffer, if all will fit. If they will not fit then any existing buffered characters will be flushed before attempting to write the new characters directly to the underlying file descriptor. If the port is in non-blocking mode and buffered characters cannot be flushed immediately, then an @code{EAGAIN} system-error exception will be raised. (Note: scsh does not support the use of non-blocking buffered ports.) @item write fewer than the requested number of characters in some cases, e.g., if interrupted by a signal or if not all of the output can be accepted immediately. @item wait indefinitely for at least one character from @var{str} to be accepted by the port, unless the port is in non-blocking mode. @item return the number of characters accepted by the port. @item return 0 if the port is in non-blocking mode and cannot accept at least one character from @var{str} immediately. @item return 0 immediately if the request size is 0 bytes. @end itemize @end deffn @node Default Ports @subsection Default Ports for Input, Output and Errors @cindex Default ports @cindex Port, default @rnindex current-input-port @deffn {Scheme Procedure} current-input-port @deffnx {C Function} scm_current_input_port () @cindex standard input Return the current input port. This is the default port used by many input procedures. Initially this is the @dfn{standard input} in Unix and C terminology. When the standard input is a tty the port is unbuffered, otherwise it's fully buffered.
Unbuffered input is good if an application runs an interactive subprocess, since any type-ahead input won't go into Guile's buffer and be unavailable to the subprocess. Note that Guile buffering is completely separate from the tty ``line discipline''. In the usual cooked mode on a tty Guile only sees a line of input once the user presses @key{Return}. @end deffn @rnindex current-output-port @deffn {Scheme Procedure} current-output-port @deffnx {C Function} scm_current_output_port () @cindex standard output Return the current output port. This is the default port used by many output procedures. Initially this is the @dfn{standard output} in Unix and C terminology. When the standard output is a tty this port is unbuffered, otherwise it's fully buffered. Unbuffered output to a tty is good for ensuring progress output or a prompt is seen. But an application which always prints whole lines could change to line buffered, or an application with a lot of output could go fully buffered and perhaps make explicit @code{force-output} calls (@pxref{Writing}) at selected points. @end deffn @deffn {Scheme Procedure} current-error-port @deffnx {C Function} scm_current_error_port () @cindex standard error output Return the port to which errors and warnings should be sent. Initially this is the @dfn{standard error} in Unix and C terminology. When the standard error is a tty this port is unbuffered, otherwise it's fully buffered. @end deffn @deffn {Scheme Procedure} set-current-input-port port @deffnx {Scheme Procedure} set-current-output-port port @deffnx {Scheme Procedure} set-current-error-port port @deffnx {C Function} scm_set_current_input_port (port) @deffnx {C Function} scm_set_current_output_port (port) @deffnx {C Function} scm_set_current_error_port (port) Change the ports returned by @code{current-input-port}, @code{current-output-port} and @code{current-error-port}, respectively, so that they use the supplied @var{port} for input or output. 
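A minimal sketch of redirecting output by hand with these setters follows. The helper name @code{capture-display} and the local names @code{saved} and @code{sink} are hypothetical, introduced only for this illustration.

@lisp
;; Hypothetical helper: send the output of DISPLAY to a string port,
;; then restore the previous current output port.
(define (capture-display obj)
  (let ((saved (current-output-port))
        (sink  (open-output-string)))
    (set-current-output-port sink)
    (display obj)                     ; written to SINK
    (set-current-output-port saved)   ; restore by hand
    (get-output-string sink)))

(capture-display "captured")   ; @result{} "captured"
@end lisp

Note that this sketch does not restore the port if @code{display} exits non-locally; forms that manage the current port with @code{dynamic-wind}, such as @code{with-output-to-string}, are usually the safer choice.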
@end deffn @deftypefn {C Function} void scm_dynwind_current_input_port (SCM port) @deftypefnx {C Function} void scm_dynwind_current_output_port (SCM port) @deftypefnx {C Function} void scm_dynwind_current_error_port (SCM port) These functions must be used inside a pair of calls to @code{scm_dynwind_begin} and @code{scm_dynwind_end} (@pxref{Dynamic Wind}). During the dynwind context, the indicated port is set to @var{port}. More precisely, the current port is swapped with a `backup' value whenever the dynwind context is entered or left. The backup value is initialized with the @var{port} argument. @end deftypefn @node Port Types @subsection Types of Port @cindex Types of ports @cindex Port, types @menu * File Ports:: Ports on an operating system file. * String Ports:: Ports on a Scheme string. * Soft Ports:: Ports on arbitrary Scheme procedures. * Void Ports:: Ports on nothing at all. @end menu @node File Ports @subsubsection File Ports @cindex File port @cindex Port, file The following procedures are used to open file ports. See also @ref{Ports and File Descriptors, open}, for an interface to the Unix @code{open} system call. Most systems have limits on how many files can be open, so it's strongly recommended that file ports be closed explicitly when no longer required (@pxref{Ports}). @deffn {Scheme Procedure} open-file filename mode @ [#:guess-encoding=#f] [#:encoding=#f] @deffnx {C Function} scm_open_file_with_encoding @ (filename, mode, guess_encoding, encoding) @deffnx {C Function} scm_open_file (filename, mode) Open the file whose name is @var{filename}, and return a port representing that file. The attributes of the port are determined by the @var{mode} string. The way in which this is interpreted is similar to C stdio. The first character must be one of the following: @table @samp @item r Open an existing file for input. @item w Open a file for output, creating it if it doesn't already exist or removing its contents if it does. 
@item a Open a file for output, creating it if it doesn't already exist. All writes to the port will go to the end of the file. The ``append mode'' can be turned off while the port is in use (@pxref{Ports and File Descriptors, fcntl}). @end table The following additional characters can be appended: @table @samp @item + Open the port for both input and output. E.g., @code{r+}: open an existing file for both input and output. @item 0 Create an ``unbuffered'' port. In this case input and output operations are passed directly to the underlying port implementation without additional buffering. This is likely to slow down I/O operations. The buffering mode can be changed while a port is in use (@pxref{Buffering}). @item l Add line-buffering to the port. The port output buffer will be automatically flushed whenever a newline character is written. @item b Use binary mode, ensuring that each byte in the file will be read as one Scheme character. To provide this property, the file will be opened with the 8-bit character encoding "ISO-8859-1", ignoring the default port encoding. @xref{Ports}, for more information on port encodings. Note that while it is possible to read and write binary data as characters or strings, it is usually better to treat bytes as octets, and byte sequences as bytevectors. @xref{R6RS Binary Input}, and @ref{R6RS Binary Output}, for more. This option had another historical meaning, for DOS compatibility: in the default (textual) mode, DOS reads a CR-LF sequence as one LF byte. The @code{b} flag prevents this from happening, adding @code{O_BINARY} to the underlying @code{open} call. Still, the flag is generally useful because of its port encoding ramifications. @end table Unless binary mode is requested, the character encoding of the new port is determined as follows: First, if @var{guess-encoding} is true, the @code{file-encoding} procedure is used to guess the encoding of the file (@pxref{Character Encoding of Source Files}).
If @var{guess-encoding} is false or if @code{file-encoding} fails, @var{encoding} is used unless it is also false. As a last resort, the default port encoding is used. @xref{Ports}, for more information on port encodings. It is an error to pass a non-false @var{guess-encoding} or @var{encoding} if binary mode is requested. If a file cannot be opened with the access requested, @code{open-file} throws an exception. @end deffn @rnindex open-input-file @deffn {Scheme Procedure} open-input-file filename @ [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f] Open @var{filename} for input. If @var{binary} is true, open the port in binary mode, otherwise use text mode. @var{encoding} and @var{guess-encoding} determine the character encoding as described above for @code{open-file}. Equivalent to @lisp (open-file @var{filename} (if @var{binary} "rb" "r") #:guess-encoding @var{guess-encoding} #:encoding @var{encoding}) @end lisp @end deffn @rnindex open-output-file @deffn {Scheme Procedure} open-output-file filename @ [#:encoding=#f] [#:binary=#f] Open @var{filename} for output. If @var{binary} is true, open the port in binary mode, otherwise use text mode. @var{encoding} specifies the character encoding as described above for @code{open-file}. Equivalent to @lisp (open-file @var{filename} (if @var{binary} "wb" "w") #:encoding @var{encoding}) @end lisp @end deffn @deffn {Scheme Procedure} call-with-input-file filename proc @ [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f] @deffnx {Scheme Procedure} call-with-output-file filename proc @ [#:encoding=#f] [#:binary=#f] @rnindex call-with-input-file @rnindex call-with-output-file Open @var{filename} for input or output, and call @code{(@var{proc} port)} with the resulting port. Return the value returned by @var{proc}. @var{filename} is opened as per @code{open-input-file} or @code{open-output-file} respectively, and an error is signaled if it cannot be opened. When @var{proc} returns, the port is closed. 
If @var{proc} does not return (e.g.@: if it throws an error), then the port might not be closed automatically, though it will be garbage collected in the usual way if not otherwise referenced. @end deffn @deffn {Scheme Procedure} with-input-from-file filename thunk @ [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f] @deffnx {Scheme Procedure} with-output-to-file filename thunk @ [#:encoding=#f] [#:binary=#f] @deffnx {Scheme Procedure} with-error-to-file filename thunk @ [#:encoding=#f] [#:binary=#f] @rnindex with-input-from-file @rnindex with-output-to-file Open @var{filename} and call @code{(@var{thunk})} with the new port set up as, respectively, the @code{current-input-port}, @code{current-output-port}, or @code{current-error-port}. Return the value returned by @var{thunk}. @var{filename} is opened as per @code{open-input-file} or @code{open-output-file} respectively, and an error is signaled if it cannot be opened. When @var{thunk} returns, the port is closed and the previous setting of the respective current port is restored. The current port setting is managed with @code{dynamic-wind}, so the previous value is restored no matter how @var{thunk} exits (e.g.@: via an exception), and if @var{thunk} is re-entered (via a captured continuation) then it's set again to the @var{filename} port. The port is closed when @var{thunk} returns normally, but not when exited via an exception or new continuation. This ensures it's still ready for use if @var{thunk} is re-entered by a captured continuation. Of course the port is always garbage collected and closed in the usual way when no longer referenced anywhere. @end deffn @deffn {Scheme Procedure} port-mode port @deffnx {C Function} scm_port_mode (port) Return the port modes associated with the open port @var{port}. These will not necessarily be identical to the modes used when the port was opened, since modes such as "append" which are used only during port creation are not retained.
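A small sketch of querying a port's modes follows. A string port is used only to keep the example free of file system side effects, and the variable name @code{in} is arbitrary. The exact mode strings are not spelled out in the text above, so no result is shown here.

@lisp
;; Open an input string port and ask for its modes.
(define in (open-input-string "abc"))

(port-mode in)     ; a mode string describing an input port
(close-input-port in)
@end lisp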
@end deffn @deffn {Scheme Procedure} port-filename port @deffnx {C Function} scm_port_filename (port) Return the filename associated with @var{port}, or @code{#f} if no filename is associated with the port. @var{port} must be open; @code{port-filename} cannot be used once the port is closed. @end deffn @deffn {Scheme Procedure} set-port-filename! port filename @deffnx {C Function} scm_set_port_filename_x (port, filename) Change the filename associated with @var{port}, using the current input port if none is specified. Note that this does not change the port's source of data, but only the value that is returned by @code{port-filename} and reported in diagnostic output. @end deffn @deffn {Scheme Procedure} file-port? obj @deffnx {C Function} scm_file_port_p (obj) Determine whether @var{obj} is a port that is related to a file. @end deffn @node String Ports @subsubsection String Ports @cindex String port @cindex Port, string The following allow string ports to be opened by analogy to R4RS file port facilities. With string ports, the port encoding is treated differently than for other types of ports. When string ports are created, they do not inherit a character encoding from the current locale. They are given a default encoding that allows them to handle all valid string characters. Typically one should not modify a string port's character encoding away from its default. @deffn {Scheme Procedure} call-with-output-string proc @deffnx {C Function} scm_call_with_output_string (proc) Calls the one-argument procedure @var{proc} with a newly created output port. When the function returns, the string composed of the characters written into the port is returned. @var{proc} should not close the port. @end deffn @deffn {Scheme Procedure} call-with-input-string string proc @deffnx {C Function} scm_call_with_input_string (string, proc) Calls the one-argument procedure @var{proc} with a newly created input port from which @var{string}'s contents may be read.
The value yielded by @var{proc} is returned. @end deffn @deffn {Scheme Procedure} with-output-to-string thunk Calls the zero-argument procedure @var{thunk} with the current output port set temporarily to a new string port. It returns a string composed of the characters written to the current output. @end deffn @deffn {Scheme Procedure} with-input-from-string string thunk Calls the zero-argument procedure @var{thunk} with the current input port set temporarily to a string port opened on the specified @var{string}. The value yielded by @var{thunk} is returned. @end deffn @deffn {Scheme Procedure} open-input-string str @deffnx {C Function} scm_open_input_string (str) Take a string and return an input port that delivers characters from the string. The port can be closed by @code{close-input-port}, though its storage will be reclaimed by the garbage collector if it becomes inaccessible. @end deffn @deffn {Scheme Procedure} open-output-string @deffnx {C Function} scm_open_output_string () Return an output port that will accumulate characters for retrieval by @code{get-output-string}. The port can be closed by the procedure @code{close-output-port}, though its storage will be reclaimed by the garbage collector if it becomes inaccessible. @end deffn @deffn {Scheme Procedure} get-output-string port @deffnx {C Function} scm_get_output_string (port) Given an output port created by @code{open-output-string}, return a string consisting of the characters that have been output to the port so far. @code{get-output-string} must be used before closing @var{port}; once the port is closed, the string cannot be obtained. @end deffn A string port can be used in many procedures which accept a port but which are not dependent on implementation details of fports. E.g., seeking and truncating will work on a string port, but trying to extract the file descriptor number will fail.
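The string port procedures above can be sketched together as follows. The variable name @code{out} and the sample data are illustrative only.

@lisp
;; Collect everything written to a fresh output string port.
(call-with-output-string
  (lambda (port)
    (display "x = " port)
    (write 42 port)))                      ; @result{} "x = 42"

;; Parse a datum by making a string the current input port.
(with-input-from-string "(1 2 3)" read)    ; @result{} (1 2 3)

;; Explicit open and retrieve; fetch the string before closing.
(define out (open-output-string))
(write 'hello out)
(get-output-string out)                    ; @result{} "hello"
(close-output-port out)
@end lisp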
@node Soft Ports @subsubsection Soft Ports @cindex Soft port @cindex Port, soft A @dfn{soft port} is a port based on a vector of procedures capable of accepting or delivering characters. It allows emulation of I/O ports. @deffn {Scheme Procedure} make-soft-port pv modes @deffnx {C Function} scm_make_soft_port (pv, modes) Return a port capable of receiving or delivering characters as specified by the @var{modes} string (@pxref{File Ports, open-file}). @var{pv} must be a vector of length 5 or 6. Its components are as follows: @enumerate 0 @item procedure accepting one character for output @item procedure accepting a string for output @item thunk for flushing output @item thunk for getting one character @item thunk for closing port (not by garbage collection) @item (if present and not @code{#f}) thunk for computing the number of characters that can be read from the port without blocking. @end enumerate For an output-only port only elements 0, 1, 2, and 4 need be procedures. For an input-only port only elements 3 and 4 need be procedures. Thunks 2 and 4 can instead be @code{#f} if there is no useful operation for them to perform. If thunk 3 returns @code{#f} or an @code{eof-object} (@pxref{Input, eof-object?, ,r5rs, The Revised^5 Report on Scheme}) it indicates that the port has reached end-of-file. For example:
@lisp
(define stdout (current-output-port))
(define p (make-soft-port
           (vector
            (lambda (c) (write c stdout))
            (lambda (s) (display s stdout))
            (lambda () (display "." stdout))
            (lambda () (char-upcase (read-char)))
            (lambda () (display "@@" stdout)))
           "rw"))

(write p p) @result{} #<input-output: soft 8081e20>
@end lisp
@end deffn @node Void Ports @subsubsection Void Ports @cindex Void port @cindex Port, void This kind of port causes any data to be discarded when written to, and always returns the end-of-file object when read from. @deffn {Scheme Procedure} %make-void-port mode @deffnx {C Function} scm_sys_make_void_port (mode) Create and return a new void port.
A void port acts like @file{/dev/null}. The @var{mode} argument specifies the input/output modes for this port: see the documentation for @code{open-file} in @ref{File Ports}. @end deffn @node R6RS I/O Ports @subsection R6RS I/O Ports @cindex R6RS @cindex R6RS ports The I/O port API of the @uref{http://www.r6rs.org/, Revised^6 Report on the Algorithmic Language Scheme (R6RS)} is provided by the @code{(rnrs io ports)} module. It provides features, such as binary I/O and Unicode string I/O, that complement or refine Guile's historical port API presented above (@pxref{Input and Output}). Note that R6RS ports are not disjoint from Guile's native ports, so Guile-specific procedures will work on ports created using the R6RS API, and vice versa. The text in this section is taken from the R6RS standard libraries document, with only minor adaptations for inclusion in this manual. The Guile developers offer their thanks to the R6RS editors for having provided the report's text under permissive conditions making this possible.
@c FIXME: Update description when implemented.
@emph{Note}: The implementation of this R6RS API is not complete yet. @menu * R6RS File Names:: File names. * R6RS File Options:: Options for opening files. * R6RS Buffer Modes:: Influencing buffering behavior. * R6RS Transcoders:: Influencing port encoding. * R6RS End-of-File:: The end-of-file object. * R6RS Port Manipulation:: Manipulating R6RS ports. * R6RS Input Ports:: Input Ports. * R6RS Binary Input:: Binary input. * R6RS Textual Input:: Textual input. * R6RS Output Ports:: Output Ports. * R6RS Binary Output:: Binary output. * R6RS Textual Output:: Textual output. @end menu A subset of the @code{(rnrs io ports)} module, plus one non-standard procedure @code{unget-bytevector} (@pxref{R6RS Binary Input}), is provided by the @code{(ice-9 binary-ports)} module. It contains binary input/output procedures and does not rely on R6RS support.
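As a brief sketch of what the @code{(ice-9 binary-ports)} subset offers, the following reads octets from an in-memory bytevector port. It additionally assumes the @code{(rnrs bytevectors)} module for constructing the bytevector; the variable name @code{in} is arbitrary.

@lisp
(use-modules (ice-9 binary-ports)
             (rnrs bytevectors))

;; An input port backed by the three octets 1, 2 and 3.
(define in
  (open-bytevector-input-port (u8-list->bytevector '(1 2 3))))

(get-u8 in)                  ; @result{} 1
(get-bytevector-n in 2)      ; @result{} #vu8(2 3)
(eof-object? (get-u8 in))    ; @result{} #t
@end lisp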
@node R6RS File Names @subsubsection File Names Some of the procedures described in this chapter accept a file name as an argument. Valid values for such a file name include strings that name a file using the native notation of file system paths on an implementation's underlying operating system, and may include implementation-dependent values as well. A @var{filename} parameter name means that the corresponding argument must be a file name. @node R6RS File Options @subsubsection File Options @cindex file options When opening a file, the various procedures in this library accept a @code{file-options} object that encapsulates flags to specify how the file is to be opened. A @code{file-options} object is an enum-set (@pxref{rnrs enums}) over the symbols constituting valid file options. A @var{file-options} parameter name means that the corresponding argument must be a file-options object. @deffn {Scheme Syntax} file-options @var{file-options-symbol} ... Each @var{file-options-symbol} must be a symbol. The @code{file-options} syntax returns a file-options object that encapsulates the specified options. When supplied to an operation that opens a file for output, the file-options object returned by @code{(file-options)} specifies that the file is created if it does not exist and an exception with condition type @code{&i/o-file-already-exists} is raised if it does exist. The following standard options can be included to modify the default behavior. @table @code @item no-create If the file does not already exist, it is not created; instead, an exception with condition type @code{&i/o-file-does-not-exist} is raised. If the file already exists, the exception with condition type @code{&i/o-file-already-exists} is not raised and the file is truncated to zero length. @item no-fail If the file already exists, the exception with condition type @code{&i/o-file-already-exists} is not raised, even if @code{no-create} is not included, and the file is truncated to zero length. 
@item no-truncate If the file already exists and the exception with condition type @code{&i/o-file-already-exists} has been inhibited by inclusion of @code{no-create} or @code{no-fail}, the file is not truncated, but the port's current position is still set to the beginning of the file. @end table These options have no effect when a file is opened only for input. Symbols other than those listed above may be used as @var{file-options-symbol}s; they have implementation-specific meaning, if any. @quotation Note Only the name of @var{file-options-symbol} is significant. @end quotation @end deffn @node R6RS Buffer Modes @subsubsection Buffer Modes Each port has an associated buffer mode. For an output port, the buffer mode defines when an output operation flushes the buffer associated with the output port. For an input port, the buffer mode defines how much data will be read to satisfy read operations. The possible buffer modes are the symbols @code{none} for no buffering, @code{line} for flushing upon line endings and reading up to line endings, or other implementation-dependent behavior, and @code{block} for arbitrary buffering. This section uses the parameter name @var{buffer-mode} for arguments that must be buffer-mode symbols. If two ports are connected to the same mutable source, both ports are unbuffered, and reading a byte or character from that shared source via one of the two ports would change the bytes or characters seen via the other port, a lookahead operation on one port will render the peeked byte or character inaccessible via the other port, while a subsequent read operation on the peeked port will see the peeked byte or character even though the port is otherwise unbuffered. In other words, the semantics of buffering is defined in terms of side effects on shared mutable sources, and a lookahead operation has the same side effect on the shared source as a read operation. 
@deffn {Scheme Syntax} buffer-mode @var{buffer-mode-symbol} @var{buffer-mode-symbol} must be a symbol whose name is one of @code{none}, @code{line}, and @code{block}. The result is the corresponding symbol, and specifies the associated buffer mode. @quotation Note Only the name of @var{buffer-mode-symbol} is significant. @end quotation @end deffn @deffn {Scheme Procedure} buffer-mode? obj Returns @code{#t} if the argument is a valid buffer-mode symbol, and returns @code{#f} otherwise. @end deffn @node R6RS Transcoders @subsubsection Transcoders @cindex codec @cindex end-of-line style @cindex transcoder @cindex binary port @cindex textual port Several different Unicode encoding schemes describe standard ways to encode characters and strings as byte sequences and to decode those sequences. Within this document, a @dfn{codec} is an immutable Scheme object that represents a Unicode or similar encoding scheme. An @dfn{end-of-line style} is a symbol that, if it is not @code{none}, describes how a textual port transcodes representations of line endings. A @dfn{transcoder} is an immutable Scheme object that combines a codec with an end-of-line style and a method for handling decoding errors. Each transcoder represents some specific bidirectional (but not necessarily lossless), possibly stateful translation between byte sequences and Unicode characters and strings. Every transcoder can operate in the input direction (bytes to characters) or in the output direction (characters to bytes). A @var{transcoder} parameter name means that the corresponding argument must be a transcoder. A @dfn{binary port} is a port that supports binary I/O, does not have an associated transcoder and does not support textual I/O. A @dfn{textual port} is a port that supports textual I/O, and does not support binary I/O. A textual port may or may not have an associated transcoder. 
@deffn {Scheme Procedure} latin-1-codec @deffnx {Scheme Procedure} utf-8-codec @deffnx {Scheme Procedure} utf-16-codec These are predefined codecs for the ISO 8859-1, UTF-8, and UTF-16 encoding schemes. A call to any of these procedures returns a value that is equal in the sense of @code{eqv?} to the result of any other call to the same procedure. @end deffn @deffn {Scheme Syntax} eol-style @var{eol-style-symbol} @var{eol-style-symbol} should be a symbol whose name is one of @code{lf}, @code{cr}, @code{crlf}, @code{nel}, @code{crnel}, @code{ls}, and @code{none}. The form evaluates to the corresponding symbol. If the name of @var{eol-style-symbol} is not one of these symbols, the effect and result are implementation-dependent; in particular, the result may be an eol-style symbol acceptable as an @var{eol-style} argument to @code{make-transcoder}. Otherwise, an exception is raised. All eol-style symbols except @code{none} describe a specific line-ending encoding: @table @code @item lf linefeed @item cr carriage return @item crlf carriage return, linefeed @item nel next line @item crnel carriage return, next line @item ls line separator @end table For a textual port with a transcoder, and whose transcoder has an eol-style symbol @code{none}, no conversion occurs. For a textual input port, any eol-style symbol other than @code{none} means that all of the above line-ending encodings are recognized and are translated into a single linefeed. For a textual output port, @code{none} and @code{lf} are equivalent. Linefeed characters are encoded according to the specified eol-style symbol, and all other characters that participate in possible line endings are encoded as is. @quotation Note Only the name of @var{eol-style-symbol} is significant. @end quotation @end deffn @deffn {Scheme Procedure} native-eol-style Returns the default end-of-line style of the underlying platform, e.g., @code{lf} on Unix and @code{crlf} on Windows. 
@end deffn @deffn {Condition Type} &i/o-decoding @deffnx {Scheme Procedure} make-i/o-decoding-error port @deffnx {Scheme Procedure} i/o-decoding-error? obj This condition type could be defined by @lisp (define-condition-type &i/o-decoding &i/o-port make-i/o-decoding-error i/o-decoding-error?) @end lisp An exception with this type is raised when one of the operations for textual input from a port encounters a sequence of bytes that cannot be translated into a character or string by the input direction of the port's transcoder. When such an exception is raised, the port's position is past the invalid encoding. @end deffn @deffn {Condition Type} &i/o-encoding @deffnx {Scheme Procedure} make-i/o-encoding-error port char @deffnx {Scheme Procedure} i/o-encoding-error? obj @deffnx {Scheme Procedure} i/o-encoding-error-char condition This condition type could be defined by @lisp (define-condition-type &i/o-encoding &i/o-port make-i/o-encoding-error i/o-encoding-error? (char i/o-encoding-error-char)) @end lisp An exception with this type is raised when one of the operations for textual output to a port encounters a character that cannot be translated into bytes by the output direction of the port's transcoder. @var{char} is the character that could not be encoded. @end deffn @deffn {Scheme Syntax} error-handling-mode @var{error-handling-mode-symbol} @var{error-handling-mode-symbol} should be a symbol whose name is one of @code{ignore}, @code{raise}, and @code{replace}. The form evaluates to the corresponding symbol. If @var{error-handling-mode-symbol} is not one of these identifiers, effect and result are implementation-dependent: The result may be an error-handling-mode symbol acceptable as a @var{handling-mode} argument to @code{make-transcoder}. If it is not acceptable as a @var{handling-mode} argument to @code{make-transcoder}, an exception is raised. @quotation Note Only the name of @var{error-handling-mode-symbol} is significant. 
@end quotation The error-handling mode of a transcoder specifies the behavior of textual I/O operations in the presence of encoding or decoding errors. If a textual input operation encounters an invalid or incomplete character encoding, and the error-handling mode is @code{ignore}, an appropriate number of bytes of the invalid encoding are ignored and decoding continues with the following bytes. If the error-handling mode is @code{replace}, the replacement character U+FFFD is injected into the data stream, an appropriate number of bytes are ignored, and decoding continues with the following bytes. If the error-handling mode is @code{raise}, an exception with condition type @code{&i/o-decoding} is raised. If a textual output operation encounters a character it cannot encode, and the error-handling mode is @code{ignore}, the character is ignored and encoding continues with the next character. If the error-handling mode is @code{replace}, a codec-specific replacement character is emitted by the transcoder, and encoding continues with the next character. The replacement character is U+FFFD for transcoders whose codec is one of the Unicode encodings, but is the @code{?} character for the Latin-1 encoding. If the error-handling mode is @code{raise}, an exception with condition type @code{&i/o-encoding} is raised. @end deffn @deffn {Scheme Procedure} make-transcoder codec @deffnx {Scheme Procedure} make-transcoder codec eol-style @deffnx {Scheme Procedure} make-transcoder codec eol-style handling-mode @var{codec} must be a codec; @var{eol-style}, if present, an eol-style symbol; and @var{handling-mode}, if present, an error-handling-mode symbol. @var{eol-style} may be omitted, in which case it defaults to the native end-of-line style of the underlying platform. @var{handling-mode} may be omitted, in which case it defaults to @code{replace}. The result is a transcoder with the behavior specified by its arguments. 
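A minimal sketch of constructing a transcoder and inspecting it follows. It assumes that the installed Guile provides these procedures in @code{(rnrs io ports)}, bearing in mind the note that the implementation is incomplete; the variable name @code{tc} is arbitrary.

@lisp
(use-modules (rnrs io ports))

;; UTF-8, linefeed line endings, raise on coding errors.
(define tc
  (make-transcoder (utf-8-codec)
                   (eol-style lf)
                   (error-handling-mode raise)))

(transcoder-eol-style tc)              ; @result{} lf
(transcoder-error-handling-mode tc)    ; @result{} raise
(string->bytevector "hi" tc)           ; @result{} #vu8(104 105)
@end lisp

Passing all three arguments explicitly avoids depending on the platform's native end-of-line style and on the default error-handling mode.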
@end deffn @deffn {Scheme Procedure} native-transcoder Returns an implementation-dependent transcoder that represents a possibly locale-dependent ``native'' transcoding. @end deffn @deffn {Scheme Procedure} transcoder-codec transcoder @deffnx {Scheme Procedure} transcoder-eol-style transcoder @deffnx {Scheme Procedure} transcoder-error-handling-mode transcoder These are accessors for transcoder objects; when applied to a transcoder returned by @code{make-transcoder}, they return the @var{codec}, @var{eol-style}, and @var{handling-mode} arguments, respectively. @end deffn @deffn {Scheme Procedure} bytevector->string bytevector transcoder Returns the string that results from transcoding the @var{bytevector} according to the input direction of the transcoder. @end deffn @deffn {Scheme Procedure} string->bytevector string transcoder Returns the bytevector that results from transcoding the @var{string} according to the output direction of the transcoder. @end deffn @node R6RS End-of-File @subsubsection The End-of-File Object @cindex EOF @cindex end-of-file R5RS' @code{eof-object?} procedure is provided by the @code{(rnrs io ports)} module: @deffn {Scheme Procedure} eof-object? obj @deffnx {C Function} scm_eof_object_p (obj) Return true if @var{obj} is the end-of-file (EOF) object. @end deffn In addition, the following procedure is provided: @deffn {Scheme Procedure} eof-object @deffnx {C Function} scm_eof_object () Return the end-of-file (EOF) object. @lisp (eof-object? (eof-object)) @result{} #t @end lisp @end deffn @node R6RS Port Manipulation @subsubsection Port Manipulation The procedures listed below operate on any kind of R6RS I/O port. @deffn {Scheme Procedure} port? obj Returns @code{#t} if the argument is a port, and returns @code{#f} otherwise.
@end deffn @deffn {Scheme Procedure} port-transcoder port Returns the transcoder associated with @var{port} if @var{port} is textual and has an associated transcoder, and returns @code{#f} if @var{port} is binary or does not have an associated transcoder. @end deffn @deffn {Scheme Procedure} binary-port? port Return @code{#t} if @var{port} is a @dfn{binary port}, suitable for binary data input/output. Note that internally Guile does not differentiate between binary and textual ports, unlike the R6RS. Thus, this procedure returns true when @var{port} does not have an associated encoding---i.e., when @code{(port-encoding @var{port})} is @code{#f} (@pxref{Ports, port-encoding}). This is the case for ports returned by R6RS procedures such as @code{open-bytevector-input-port} and @code{make-custom-binary-output-port}. However, Guile currently does not prevent use of textual I/O procedures such as @code{display} or @code{read-char} with binary ports. Doing so ``upgrades'' the port from binary to textual, under the ISO-8859-1 encoding. Likewise, Guile does not prevent use of @code{set-port-encoding!} on a binary port, which also turns it into a ``textual'' port. @end deffn @deffn {Scheme Procedure} textual-port? port Always return @code{#t}, as all ports can be used for textual I/O in Guile. @end deffn @deffn {Scheme Procedure} transcoded-port binary-port transcoder The @code{transcoded-port} procedure returns a new textual port with the specified @var{transcoder}. Otherwise the new textual port's state is largely the same as that of @var{binary-port}. If @var{binary-port} is an input port, the new textual port will be an input port and will transcode the bytes that have not yet been read from @var{binary-port}. If @var{binary-port} is an output port, the new textual port will be an output port and will transcode output characters into bytes that are written to the byte sink represented by @var{binary-port}. 
As a side effect, however, @code{transcoded-port} closes @var{binary-port} in a special way that allows the new textual port to continue to use the byte source or sink represented by @var{binary-port}, even though @var{binary-port} itself is closed and cannot be used by the input and output operations described in this chapter. @end deffn @deffn {Scheme Procedure} port-position port If @var{port} supports it (see below), return the offset (an integer) indicating where the next octet will be read from/written to in @var{port}. If @var{port} does not support this operation, an error condition is raised. This is similar to Guile's @code{seek} procedure with the @code{SEEK_CUR} argument (@pxref{Random Access}). @end deffn @deffn {Scheme Procedure} port-has-port-position? port Return @code{#t} if @var{port} supports @code{port-position}. @end deffn @deffn {Scheme Procedure} set-port-position! port offset If @var{port} supports it (see below), set the position where the next octet will be read from/written to in @var{port} to @var{offset} (an integer). If @var{port} does not support this operation, an error condition is raised. This is similar to Guile's @code{seek} procedure with the @code{SEEK_SET} argument (@pxref{Random Access}). @end deffn @deffn {Scheme Procedure} port-has-set-port-position!? port Return @code{#t} if @var{port} supports @code{set-port-position!}. @end deffn @deffn {Scheme Procedure} call-with-port port proc Call @var{proc}, passing it @var{port} and closing @var{port} upon exit of @var{proc}. Return the return values of @var{proc}. @end deffn @node R6RS Input Ports @subsubsection Input Ports @deffn {Scheme Procedure} input-port? obj Returns @code{#t} if the argument is an input port (or a combined input and output port), and returns @code{#f} otherwise. @end deffn @deffn {Scheme Procedure} port-eof?
input-port Returns @code{#t} if the @code{lookahead-u8} procedure (if @var{input-port} is a binary port) or the @code{lookahead-char} procedure (if @var{input-port} is a textual port) would return the end-of-file object, and @code{#f} otherwise. The operation may block indefinitely if no data is available but the port cannot be determined to be at end of file. @end deffn @deffn {Scheme Procedure} open-file-input-port filename @deffnx {Scheme Procedure} open-file-input-port filename file-options @deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode @deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode maybe-transcoder @var{maybe-transcoder} must be either a transcoder or @code{#f}. The @code{open-file-input-port} procedure returns an input port for the named file. The @var{file-options} and @var{maybe-transcoder} arguments are optional. The @var{file-options} argument, which may determine various aspects of the returned port (@pxref{R6RS File Options}), defaults to the value of @code{(file-options)}. The @var{buffer-mode} argument, if supplied, must be one of the symbols that name a buffer mode. The @var{buffer-mode} argument defaults to @code{block}. If @var{maybe-transcoder} is a transcoder, it becomes the transcoder associated with the returned port. If @var{maybe-transcoder} is @code{#f} or absent, the port will be a binary port and will support the @code{port-position} and @code{set-port-position!} operations. Otherwise the port will be a textual port, and whether it supports the @code{port-position} and @code{set-port-position!} operations is implementation-dependent (and possibly transcoder-dependent). @end deffn @deffn {Scheme Procedure} standard-input-port Returns a fresh binary input port connected to standard input. Whether the port supports the @code{port-position} and @code{set-port-position!} operations is implementation-dependent. 
@end deffn @deffn {Scheme Procedure} current-input-port This returns a default textual port for input. Normally, this default port is associated with standard input, but can be dynamically re-assigned using the @code{with-input-from-file} procedure from the @code{io simple (6)} library (@pxref{rnrs io simple}). The port may or may not have an associated transcoder; if it does, the transcoder is implementation-dependent. @end deffn @node R6RS Binary Input @subsubsection Binary Input @cindex binary input R6RS binary input ports can be created with the procedures described below. @deffn {Scheme Procedure} open-bytevector-input-port bv [transcoder] @deffnx {C Function} scm_open_bytevector_input_port (bv, transcoder) Return an input port whose contents are drawn from bytevector @var{bv} (@pxref{Bytevectors}). @c FIXME: Update description when implemented. The @var{transcoder} argument is currently not supported. @end deffn @cindex custom binary input ports @deffn {Scheme Procedure} make-custom-binary-input-port id read! get-position set-position! close @deffnx {C Function} scm_make_custom_binary_input_port (id, read!, get-position, set-position!, close) Return a new custom binary input port@footnote{This is similar in spirit to Guile's @dfn{soft ports} (@pxref{Soft Ports}).} named @var{id} (a string) whose input is drained by invoking @var{read!} and passing it a bytevector, an index where bytes should be written, and the number of bytes to read. The @code{read!} procedure must return an integer indicating the number of bytes read, or @code{0} to indicate the end-of-file. Optionally, if @var{get-position} is not @code{#f}, it must be a thunk that will be called when @code{port-position} is invoked on the custom binary port and should return an integer indicating the position within the underlying data stream; if @var{get-position} was not supplied, the returned port does not support @code{port-position}. 
Likewise, if @var{set-position!} is not @code{#f}, it should be a one-argument procedure. When @code{set-port-position!} is invoked on the custom binary input port, @var{set-position!} is passed an integer indicating the position of the next byte to read. Finally, if @var{close} is not @code{#f}, it must be a thunk. It is invoked when the custom binary input port is closed. The returned port is fully buffered by default, but its buffering mode can be changed using @code{setvbuf} (@pxref{Buffering}). Using a custom binary input port, the @code{open-bytevector-input-port} procedure could be implemented as follows:

@lisp
(define (open-bytevector-input-port source)
  (define position 0)
  (define length (bytevector-length source))

  ;; Copy at most COUNT bytes into BV; returning 0 signals EOF.
  (define (read! bv start count)
    (let ((count (min count (- length position))))
      (bytevector-copy! source position
                        bv start count)
      (set! position (+ position count))
      count))

  (define (get-position) position)

  (define (set-position! new-position)
    (set! position new-position))

  ;; No close procedure is needed, so pass #f for CLOSE.
  (make-custom-binary-input-port "the port" read!
                                 get-position
                                 set-position!
                                 #f))

(read (open-bytevector-input-port (string->utf8 "hello")))
@result{} hello
@end lisp
@end deffn @cindex binary input Binary input is achieved using the procedures below: @deffn {Scheme Procedure} get-u8 port @deffnx {C Function} scm_get_u8 (port) Return an octet read from @var{port}, a binary input port, blocking as necessary, or the end-of-file object. @end deffn @deffn {Scheme Procedure} lookahead-u8 port @deffnx {C Function} scm_lookahead_u8 (port) Like @code{get-u8} but does not update @var{port}'s position to point past the octet. @end deffn @deffn {Scheme Procedure} get-bytevector-n port count @deffnx {C Function} scm_get_bytevector_n (port, count) Read @var{count} octets from @var{port}, blocking as necessary, and return a bytevector containing the octets read. If fewer bytes are available, a bytevector smaller than @var{count} is returned. @end deffn @deffn {Scheme Procedure} get-bytevector-n!
port bv start count @deffnx {C Function} scm_get_bytevector_n_x (port, bv, start, count) Read @var{count} bytes from @var{port} and store them in @var{bv} starting at index @var{start}. Return either the number of bytes actually read or the end-of-file object. @end deffn @deffn {Scheme Procedure} get-bytevector-some port @deffnx {C Function} scm_get_bytevector_some (port) Read from @var{port}, blocking as necessary, until bytes are available or an end-of-file is reached. Return either the end-of-file object or a new bytevector containing some of the available bytes (at least one), and update the port position to point just past these bytes. @end deffn @deffn {Scheme Procedure} get-bytevector-all port @deffnx {C Function} scm_get_bytevector_all (port) Read from @var{port}, blocking as necessary, until the end-of-file is reached. Return either a new bytevector containing the data read or the end-of-file object (if no data were available). @end deffn The @code{(ice-9 binary-ports)} module provides the following procedure as an extension to @code{(rnrs io ports)}: @deffn {Scheme Procedure} unget-bytevector port bv [start [count]] @deffnx {C Function} scm_unget_bytevector (port, bv, start, count) Place the contents of @var{bv} in @var{port}, optionally starting at index @var{start} and limiting to @var{count} octets, so that its bytes will be read from left-to-right as the next bytes from @var{port} during subsequent read operations. If called multiple times, the unread bytes will be read again in last-in first-out order. @end deffn @node R6RS Textual Input @subsubsection Textual Input @deffn {Scheme Procedure} get-char textual-input-port Reads from @var{textual-input-port}, blocking as necessary, until a complete character is available from @var{textual-input-port}, or until an end of file is reached. If a complete character is available before the next end of file, @code{get-char} returns that character and updates the input port to point past the character. 
If an end of file is reached before any character is read, @code{get-char} returns the end-of-file object. @end deffn @deffn {Scheme Procedure} lookahead-char textual-input-port The @code{lookahead-char} procedure is like @code{get-char}, but it does not update @var{textual-input-port} to point past the character. @end deffn @deffn {Scheme Procedure} get-string-n textual-input-port count @var{count} must be an exact, non-negative integer object, representing the number of characters to be read. The @code{get-string-n} procedure reads from @var{textual-input-port}, blocking as necessary, until @var{count} characters are available, or until an end of file is reached. If @var{count} characters are available before end of file, @code{get-string-n} returns a string consisting of those @var{count} characters. If fewer characters are available before an end of file, but one or more characters can be read, @code{get-string-n} returns a string containing those characters. In either case, the input port is updated to point just past the characters read. If no characters can be read before an end of file, the end-of-file object is returned. @end deffn @deffn {Scheme Procedure} get-string-n! textual-input-port string start count @var{start} and @var{count} must be exact, non-negative integer objects, with @var{count} representing the number of characters to be read. @var{string} must be a string with at least @math{@var{start} + @var{count}} characters. The @code{get-string-n!} procedure reads from @var{textual-input-port} in the same manner as @code{get-string-n}. If @var{count} characters are available before an end of file, they are written into @var{string} starting at index @var{start}, and @var{count} is returned. If fewer characters are available before an end of file, but one or more can be read, those characters are written into @var{string} starting at index @var{start} and the number of characters actually read is returned as an exact integer object.
If no characters can be read before an end of file, the end-of-file object is returned. @end deffn @deffn {Scheme Procedure} get-string-all textual-input-port Reads from @var{textual-input-port} until an end of file, decoding characters in the same manner as @code{get-string-n} and @code{get-string-n!}. If characters are available before the end of file, a string containing all the characters decoded from that data is returned. If no character precedes the end of file, the end-of-file object is returned. @end deffn @deffn {Scheme Procedure} get-line textual-input-port Reads from @var{textual-input-port} up to and including the linefeed character or end of file, decoding characters in the same manner as @code{get-string-n} and @code{get-string-n!}. If a linefeed character is read, a string containing all of the text up to (but not including) the linefeed character is returned, and the port is updated to point just past the linefeed character. If an end of file is encountered before any linefeed character is read, but some characters have been read and decoded as characters, a string containing those characters is returned. If an end of file is encountered before any characters are read, the end-of-file object is returned. @quotation Note The end-of-line style, if not @code{none}, will cause all line endings to be read as linefeed characters. @xref{R6RS Transcoders}. @end quotation @end deffn @deffn {Scheme Procedure} get-datum textual-input-port Reads an external representation from @var{textual-input-port} and returns the datum it represents. The @code{get-datum} procedure returns the next datum that can be parsed from the given @var{textual-input-port}, updating @var{textual-input-port} to point exactly past the end of the external representation of the object. Any @emph{interlexeme space} (comment or whitespace, @pxref{Scheme Syntax}) in the input is first skipped.
If an end of file occurs after the interlexeme space, the end-of-file object (@pxref{R6RS End-of-File}) is returned. If a character inconsistent with an external representation is encountered in the input, an exception with condition types @code{&lexical} and @code{&i/o-read} is raised. Also, if the end of file is encountered after the beginning of an external representation, but the external representation is incomplete and therefore cannot be parsed, an exception with condition types @code{&lexical} and @code{&i/o-read} is raised. @end deffn @node R6RS Output Ports @subsubsection Output Ports @deffn {Scheme Procedure} output-port? obj Returns @code{#t} if the argument is an output port (or a combined input and output port), @code{#f} otherwise. @end deffn @deffn {Scheme Procedure} flush-output-port port Flushes any buffered output from the buffer of @var{port} to the underlying file, device, or object. The @code{flush-output-port} procedure returns an unspecified value. @end deffn @deffn {Scheme Procedure} open-file-output-port filename @deffnx {Scheme Procedure} open-file-output-port filename file-options @deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode @deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode maybe-transcoder @var{maybe-transcoder} must be either a transcoder or @code{#f}. The @code{open-file-output-port} procedure returns an output port for the named file. The @var{file-options} argument, which may determine various aspects of the returned port (@pxref{R6RS File Options}), defaults to the value of @code{(file-options)}. The @var{buffer-mode} argument, if supplied, must be one of the symbols that name a buffer mode. The @var{buffer-mode} argument defaults to @code{block}. If @var{maybe-transcoder} is a transcoder, it becomes the transcoder associated with the port.
If @var{maybe-transcoder} is @code{#f} or absent, the port will be a binary port and will support the @code{port-position} and @code{set-port-position!} operations. Otherwise the port will be a textual port, and whether it supports the @code{port-position} and @code{set-port-position!} operations is implementation-dependent (and possibly transcoder-dependent). @end deffn @deffn {Scheme Procedure} standard-output-port @deffnx {Scheme Procedure} standard-error-port Returns a fresh binary output port connected to the standard output or standard error, respectively. Whether the port supports the @code{port-position} and @code{set-port-position!} operations is implementation-dependent. @end deffn @deffn {Scheme Procedure} current-output-port @deffnx {Scheme Procedure} current-error-port These return default textual ports for regular output and error output. Normally, these default ports are associated with standard output and standard error, respectively. The return value of @code{current-output-port} can be dynamically re-assigned using the @code{with-output-to-file} procedure from the @code{io simple (6)} library (@pxref{rnrs io simple}). A port returned by one of these procedures may or may not have an associated transcoder; if it does, the transcoder is implementation-dependent. @end deffn @node R6RS Binary Output @subsubsection Binary Output Binary output ports can be created with the procedures below. @deffn {Scheme Procedure} open-bytevector-output-port [transcoder] @deffnx {C Function} scm_open_bytevector_output_port (transcoder) Return two values: a binary output port and a procedure. The latter should be called with zero arguments to obtain a bytevector containing the data accumulated by the port, as illustrated below.

@lisp
(call-with-values
  (lambda ()
    (open-bytevector-output-port))
  (lambda (port get-bytevector)
    (display "hello" port)
    (get-bytevector)))

@result{} #vu8(104 101 108 108 111)
@end lisp

@c FIXME: Update description when implemented.
The @var{transcoder} argument is currently not supported. @end deffn @cindex custom binary output ports @deffn {Scheme Procedure} make-custom-binary-output-port id write! get-position set-position! close @deffnx {C Function} scm_make_custom_binary_output_port (id, write!, get-position, set-position!, close) Return a new custom binary output port named @var{id} (a string) whose output is sunk by invoking @var{write!} and passing it a bytevector, an index where bytes should be read from this bytevector, and the number of bytes to be ``written''. The @code{write!} procedure must return an integer indicating the number of bytes actually written; when it is passed @code{0} as the number of bytes to write, it should behave as though an end-of-file was sent to the byte sink. The other arguments are as for @code{make-custom-binary-input-port} (@pxref{R6RS Binary Input, @code{make-custom-binary-input-port}}). @end deffn @cindex binary output Writing to a binary output port can be done using the following procedures: @deffn {Scheme Procedure} put-u8 port octet @deffnx {C Function} scm_put_u8 (port, octet) Write @var{octet}, an integer in the 0--255 range, to @var{port}, a binary output port. @end deffn @deffn {Scheme Procedure} put-bytevector port bv [start [count]] @deffnx {C Function} scm_put_bytevector (port, bv, start, count) Write the contents of @var{bv} to @var{port}, optionally starting at index @var{start} and limiting to @var{count} octets. @end deffn @node R6RS Textual Output @subsubsection Textual Output @deffn {Scheme Procedure} put-char port char Writes @var{char} to the port. The @code{put-char} procedure returns an unspecified value. @end deffn @deffn {Scheme Procedure} put-string port string @deffnx {Scheme Procedure} put-string port string start @deffnx {Scheme Procedure} put-string port string start count @var{start} and @var{count} must be non-negative exact integer objects. @var{string} must have a length of at least @math{@var{start} + @var{count}}. 
@var{start} defaults to 0. @var{count} defaults to @math{@code{(string-length @var{string})} - @var{start}}. The @code{put-string} procedure writes the @var{count} characters of @var{string} starting at index @var{start} to the port. The @code{put-string} procedure returns an unspecified value. @end deffn @deffn {Scheme Procedure} put-datum textual-output-port datum @var{datum} should be a datum value. The @code{put-datum} procedure writes an external representation of @var{datum} to @var{textual-output-port}. The specific external representation is implementation-dependent. However, whenever possible, an implementation should produce a representation for which @code{get-datum}, when reading the representation, will return an object equal (in the sense of @code{equal?}) to @var{datum}. @quotation Note Not all datums may allow producing an external representation for which @code{get-datum} will produce an object that is equal to the original. Specifically, NaNs contained in @var{datum} may make this impossible. @end quotation @quotation Note The @code{put-datum} procedure merely writes the external representation, but no trailing delimiter. If @code{put-datum} is used to write several subsequent external representations to an output port, care should be taken to delimit them properly so they can be read back in by subsequent calls to @code{get-datum}. @end quotation @end deffn @node I/O Extensions @subsection Implementing New Port Types in C This section describes how to implement a new port type in C. Although ports support many operations, as a data structure they present an opaque interface to the user. As the port implementor, you have two additional pieces of information: the port type, which is an opaque pointer allocated when defining your port type; and a port's ``stream'', which you allocate when you create a port. The port type helps you identify which ports are actually yours.
The ``stream'' is the private data associated with that port which you and only you control. Get a stream from a port using the @code{SCM_STREAM} macro. Note that your port methods are only ever called with ports of your type. A port type is created by calling @code{scm_make_port_type}. Once you have your port type, you can create ports with @code{scm_c_make_port}, or @code{scm_c_make_port_with_encoding}. @deftypefun scm_t_port_type* scm_make_port_type (char *name, size_t (*read) (SCM port, SCM dst, size_t start, size_t count), size_t (*write) (SCM port, SCM src, size_t start, size_t count)) Define a new port type. The @var{name}, @var{read} and @var{write} parameters are initial values for those port type fields, as described below. The other fields are initialized with default values and can be changed later. @end deftypefun @deftypefun SCM scm_c_make_port_with_encoding (scm_t_port_type *type, unsigned long mode_bits, SCM encoding, SCM conversion_strategy, scm_t_bits stream) @deftypefunx SCM scm_c_make_port (scm_t_port_type *type, unsigned long mode_bits, scm_t_bits stream) Make a port with the given @var{type}. The @var{stream} indicates the private data associated with the port, which your port implementation may later retrieve with @code{SCM_STREAM}. The mode bits should include one or more of the flags @code{SCM_RDNG} or @code{SCM_WRTNG}, indicating that the port is an input and/or an output port, respectively. The mode bits may also include @code{SCM_BUF0} or @code{SCM_BUFLINE}, indicating that the port should be unbuffered or line-buffered, respectively. The default is that the port will be block-buffered. @xref{Buffering}. As you would imagine, @var{encoding} and @var{conversion_strategy} specify the port's initial textual encoding and conversion strategy. Both are symbols. @code{scm_c_make_port} is the same as @code{scm_c_make_port_with_encoding}, except it uses the default port encoding and conversion strategy. 
@end deftypefun The port type has a number of associated procedures and properties which collectively implement the port's behavior. Creating a new port type mostly involves writing these procedures. @table @code @item name A pointer to a NUL-terminated string: the name of the port type. This property is initialized via the first argument to @code{scm_make_port_type}. @item read A port's @code{read} implementation fills read buffers. It should copy bytes to the supplied bytevector @code{dst}, starting at offset @code{start} and continuing for @code{count} bytes, returning the number of bytes read. @item write A port's @code{write} implementation flushes write buffers to the mutable store. It should write out bytes from the supplied bytevector @code{src}, starting at offset @code{start} and continuing for @code{count} bytes, and return the number of bytes that were written. @item read_wait_fd @itemx write_wait_fd If a port's @code{read} or @code{write} function returns @code{(size_t) -1}, that indicates that reading or writing would block. In that case, to preserve the illusion of a blocking read or write operation, Guile's C port run-time will @code{poll} on the file descriptor returned by either the port's @code{read_wait_fd} or @code{write_wait_fd} function. Set using @deftypefun void scm_set_port_read_wait_fd (scm_t_port_type *type, int (*wait_fd) (SCM port)) @deftypefunx void scm_set_port_write_wait_fd (scm_t_port_type *type, int (*wait_fd) (SCM port)) @end deftypefun Only a port type which implements the @code{read_wait_fd} or @code{write_wait_fd} port methods can usefully return @code{(size_t) -1} from a read or write function. @xref{Non-Blocking I/O}, for more on non-blocking I/O in Guile. @item print Called when @code{write} is called on the port, to print a port description. For example, for a file port it may produce something like: @code{#}.
Set using @deftypefun void scm_set_port_print (scm_t_port_type *type, int (*print) (SCM port, SCM dest_port, scm_print_state *pstate)) The first argument @var{port} is the port being printed; the second argument @var{dest_port} is where its description should go. @end deftypefun @item close Called when the port is closed. It should free any resources used by the port. Set using @deftypefun void scm_set_port_close (scm_t_port_type *type, void (*close) (SCM port)) @end deftypefun By default, ports that are garbage collected just go away without closing. If your port type needs to release some external resource like a file descriptor, or needs to make sure that its internal buffers are flushed even if the port is collected while it was open, then mark the port type as needing a close on GC. @deftypefun void scm_set_port_needs_close_on_gc (scm_t_port_type *type, int needs_close_p) @end deftypefun @item seek Set the current position of the port. Guile will flush read and/or write buffers before seeking, as appropriate. @deftypefun void scm_set_port_seek (scm_t_port_type *type, scm_t_off (*seek) (SCM port, scm_t_off offset, int whence)) @end deftypefun @item truncate Truncate the port data to the specified length. Guile will flush buffers beforehand, as appropriate. Set using @deftypefun void scm_set_port_truncate (scm_t_port_type *type, void (*truncate) (SCM port, scm_t_off length)) @end deftypefun @item random_access_p Determine whether this port is a random-access port. @cindex random access Seeking on a random-access port with buffered input, or switching to writing after reading, will cause the buffered input to be discarded and Guile will seek the port back the buffered number of bytes. Likewise seeking on a random-access port with buffered output, or switching to reading after writing, will flush pending bytes with a call to the @code{write} procedure. @xref{Buffering}.
Indicate to Guile that your port needs this behavior by returning a nonzero value from your @code{random_access_p} function. The default implementation of this function returns nonzero if the port type supplies a seek implementation. @deftypefun void scm_set_port_random_access_p (scm_t_port_type *type, int (*random_access_p) (SCM port)) @end deftypefun @item get_natural_buffer_sizes Guile will internally attach buffers to ports. An input port always has a read buffer and an output port always has a write buffer. @xref{Buffering}. A port buffer consists of a bytevector, along with some cursors into that bytevector denoting where to get and put data. Port implementations generally don't have to be concerned with buffering: a port type's @code{read} or @code{write} function will receive the buffer's bytevector as an argument, along with an offset and a length into that bytevector, and should then either fill or empty that bytevector. However, in some cases, port implementations may be able to provide an appropriate default buffer size to Guile. @deftypefun void scm_set_port_get_natural_buffer_sizes (scm_t_port_type *type, void (*get_natural_buffer_sizes) (SCM, size_t *read_buf_size, size_t *write_buf_size)) Fill in @var{read_buf_size} and @var{write_buf_size} with an appropriate buffer size for this port, if one is known. @end deftypefun File ports implement a @code{get_natural_buffer_sizes} to let the operating system inform Guile about the appropriate buffer sizes for the particular file opened by the port. @end table @node Non-Blocking I/O @subsection Non-Blocking I/O Most ports in Guile are @dfn{blocking}: when you try to read a character from a port, Guile will block on the read until a character is ready, or end-of-stream is detected. Likewise whenever Guile goes to write (possibly buffered) data to an output port, Guile will block until all the data is written.
Interacting with ports in blocking mode is very convenient: you can write straightforward, sequential algorithms whose code flow reflects the flow of data. However, blocking I/O has two main limitations. The first is that it's easy to get into a situation where code is waiting on data. Time spent waiting on data when code could be doing something else is wasteful and prevents your program from reaching its peak throughput. If you implement a web server that sequentially handles requests from clients, it's very easy for the server to end up waiting on a client to finish its HTTP request, or waiting on it to consume the response. The end result is that you are able to serve fewer requests per second than you'd like to serve. The second limitation is related: a blocking parser over user-controlled input is a denial-of-service vulnerability. Indeed the so-called ``slow loris'' attack of the early 2010s was just that: an attack on common web servers that drip-fed HTTP requests, one character at a time. All it took was a handful of slow loris connections to occupy an entire web server. In Guile we would like to preserve the ability to write straightforward blocking networking processes of all kinds, but under the hood to allow those processes to suspend their requests if they would block. To do this, the first piece is to allow Guile ports to declare themselves as being nonblocking. This is currently supported only for file ports, which include sockets, terminals, and any other port that is backed by a file descriptor. To do that, we use an arcane UNIX incantation:

@example
(let ((flags (fcntl socket F_GETFL)))
  (fcntl socket F_SETFL (logior O_NONBLOCK flags)))
@end example

Now the file descriptor is open in non-blocking mode. If Guile tries to read from or write to this file descriptor in C, it will block by polling on the socket's @code{read_wait_fd}, to preserve the illusion of a blocking read or write. @xref{I/O Extensions}, for more on that internal interface.
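
The effect of the incantation can be checked by reading the flags back. The sketch below is only illustrative: it uses the read end of a pipe in place of a socket so that it is self-contained, and the name @code{in} is an assumption of the example, not part of any API.

@example
;; A pipe stands in for the socket here; IN is its read end.
(define in (car (pipe)))

(let ((flags (fcntl in F_GETFL)))
  (fcntl in F_SETFL (logior O_NONBLOCK flags)))

;; Read the flags back to confirm that O_NONBLOCK is now set.
(logtest O_NONBLOCK (fcntl in F_GETFL))
@result{} #t
@end example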

However, if a user uses the new and experimental Scheme implementation
of ports in @code{(ice-9 sports)}, Guile instead calls the value of the
@code{current-read-waiter} or @code{current-write-waiter} parameters on
the port before re-trying the read or write.  The default values of
these parameters do the same thing as the C port runtime: they block.
However, it's possible to dynamically bind these parameters to handlers
that can suspend the current coroutine to a scheduler, to be re-animated
once the port becomes readable or writable.  While the coroutine is
suspended, the scheduler can run other code, for example servicing other
web requests.

Guile does not currently include such a scheduler.  For now, we want to
make sure that we're providing the right primitives that can be used to
build schedulers and other user-space concurrency patterns.  In the
meantime, have a look at 8sync (@url{https://gnu.org/software/8sync})
for a prototype of an asynchronous I/O and concurrency facility.

@node BOM Handling
@subsection Handling of Unicode Byte Order Marks
@cindex BOM
@cindex byte order mark

This section documents the finer points of Guile's handling of Unicode
byte order marks (BOMs).  A byte order mark (U+FEFF) is typically found
at the start of a UTF-16 or UTF-32 stream, to allow readers to reliably
determine the byte order.  Occasionally, a BOM is found at the start of
a UTF-8 stream, but this is much less common and not generally
recommended.

Guile attempts to handle BOMs automatically, and in accordance with the
recommendations of the Unicode Standard, when the port encoding is set
to @code{UTF-8}, @code{UTF-16}, or @code{UTF-32}.  In brief, Guile
automatically writes a BOM at the start of a UTF-16 or UTF-32 stream,
and automatically consumes one from the start of a UTF-8, UTF-16, or
UTF-32 stream.
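
As a rough illustration of that summary, the following sketch reads a
UTF-8 stream that begins with a BOM and writes a short UTF-16 stream,
using in-memory bytevector ports.  The choice of bytevector ports, the
variable names and the sample text are assumptions made for this
example; it also assumes that @code{(ice-9 binary-ports)} and
@code{(ice-9 textual-ports)} are available, and the results noted in the
comments are what the description above leads one to expect rather than
verified output.

@example
(use-modules (ice-9 binary-ports)    ; open-bytevector-*-port
             (ice-9 textual-ports))  ; get-string-all, put-string

;; Reading: the bytes EF BB BF are a UTF-8 BOM, followed by "hi".
(define in
  (open-bytevector-input-port #vu8(#xEF #xBB #xBF 104 105)))
(set-port-encoding! in "UTF-8")
(define text (get-string-all in))
;; TEXT should hold only the two characters after the BOM.

;; Writing: a UTF-16 stream should receive a BOM before the
;; first character is written.
(define-values (out get-bytes) (open-bytevector-output-port))
(set-port-encoding! out "UTF-16")
(put-string out "hi")
(force-output out)
(define bytes (get-bytes))
;; BYTES should hold a 2-byte BOM and two 2-byte code units.
@end example

Setting the encoding to @code{UTF-16LE} or @code{UTF-16BE} instead would
be expected to produce no BOM at all.
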

As specified in the Unicode Standard, a BOM is only handled specially at
the start of a stream, and only if the port encoding is set to
@code{UTF-8}, @code{UTF-16} or @code{UTF-32}.  If the port encoding is
set to @code{UTF-16BE}, @code{UTF-16LE}, @code{UTF-32BE}, or
@code{UTF-32LE}, then BOMs are @emph{not} handled specially, and none of
the special handling described in this section applies.

@itemize @bullet
@item
To ensure that Guile will properly detect the byte order of a UTF-16 or
UTF-32 stream, you must perform a textual read before any writes, seeks,
or binary I/O.  Guile will not attempt to read a BOM unless a read is
explicitly requested at the start of the stream.

@item
If a textual write is performed before the first read, then an arbitrary
byte order will be chosen.  Currently, big-endian is the default on all
platforms, but that may change in the future.  If you wish to explicitly
control the byte order of an output stream, set the port encoding to
@code{UTF-16BE}, @code{UTF-16LE}, @code{UTF-32BE}, or @code{UTF-32LE},
and explicitly write a BOM (@code{#\xFEFF}) if desired.

@item
If @code{set-port-encoding!} is called in the middle of a stream, Guile
treats this as a new logical ``start of stream'' for purposes of BOM
handling, and will forget about any BOMs that had previously been seen.
Therefore, it may choose a different byte order than had been used
previously.  This is intended to support multiple logical text streams
embedded within a larger binary stream.

@item
Binary I/O operations are not guaranteed to update Guile's notion of
whether the port is at the ``start of the stream'', nor are they
guaranteed to produce or consume BOMs.

@item
For ports that support seeking (e.g.@: normal files), the input and
output streams are considered linked: if the user reads first, then a
BOM will be consumed (if appropriate), but later writes will @emph{not}
produce a BOM.  Similarly, if the user writes first, then later reads
will @emph{not} consume a BOM.

@item
For ports that are not random access (e.g.@: pipes, sockets, and
terminals), the input and output streams are considered
@emph{independent} for purposes of BOM handling: the first read will
consume a BOM (if appropriate), and the first write will @emph{also}
produce a BOM (if appropriate).  However, the input and output streams
will always use the same byte order.

@item
Seeks to the beginning of a file will set the ``start of stream'' flags.
Therefore, a subsequent textual read or write will consume or produce a
BOM.  However, unlike @code{set-port-encoding!}, if a byte order had
already been chosen for the port, it will remain in effect after a seek,
and cannot be changed by the presence of a BOM.  Seeks anywhere other
than the beginning of a file clear the ``start of stream'' flags.
@end itemize

@c Local Variables:
@c TeX-master: "guile.texi"
@c End: