This reverts commit 0f983e3db0.
After discussing with Mike we are going to punt the read-line changes
for now. Open the port in O_TEXT mode if you want to chomp the CR in
CFLF sequences.
* libguile/rdelim.c (scm_read_line): handle CRLF, LS and PS
* module/ice-9/suspendable-ports.scm (read-line): handle CRLF, LS, and PS
* module/web/http.scm (read-header-line): take advantage of CRLF in read-line
(read-header): don't need to test for \return
* test-suite/tests/rdelim.test: new tests for read-line CRLF, LS and PS
* doc/ref/api-io.texi: update doc for read-line
* doc/ref/api-io.texi (Custom Ports): Add additional argument to example's
invocation of make-custom-binary-input-port. Previously had mismatched arity
by missing "closed" argument.
* module/ice-9/suspendable-ports.scm: Rename from ice-9/sports.scm, and
adapt module names. Remove exports that are not related to the
suspendable ports facility; we want people to continue using the port
operations from their original locations. Add put-string to the
replacement list.
* module/Makefile.am: Adapt to rename.
* test-suite/tests/suspendable-ports.test: Rename from sports.test.
* test-suite/Makefile.am: Adapt to rename.
* module/ice-9/textual-ports.scm (unget-char, unget-string): New
functions.
* doc/ref/api-io.texi (Textual I/O, Simple Output): Flesh out
documentation.
(Line/Delimited): Undocument write-line, read-string, and
read-string!. This is handled by (ice-9 textual-ports).
(Bytevector Ports): Fix duplicated section.
(String Ports): Move the note about encodings down to the end.
(Custom Ports): Add explanatory text. Remove mention of C functions;
they should use the C port interface.
(Venerable Port Interfaces): Add text, and make documentation refer to
recommended interfaces.
(Using Ports from C): Add documentation.
(Non-Blocking I/O): Document more fully and adapt to suspendable-ports
name change.
* doc/ref/api-io.texi: Update to document ports more thoroughly. Still
some work needed.
* doc/ref/r6rs.texi: Move ports documentation back to r6rs.texi, now
that Guile has a more thorough binary/textual I/O story, heavily based
on R6RS.
* doc/ref/api-control.texi:
* doc/ref/api-data.texi:
* doc/ref/api-options.texi:
* doc/ref/misc-modules.texi:
* doc/ref/posix.texi:
* doc/ref/srfi-modules.texi: Update references.
* doc/ref/api-io.texi (I/O Extensions): Update documentation on
implementing port types. Document get_natural_buffer_sizes. Document
the new random_access_p.
* libguile/fports.c (scm_i_fdes_to_port, fport_random_access_p):
(scm_make_fptob): Instead of frobbing rw_random manually, implement a
random_access_p function.
* libguile/ports.c (default_random_access_p)
(scm_set_port_random_access_p): New functions.
scm_make_port_type, scm_c_make_port_with_encoding): Arrange for
random_access_p to work.
* libguile/ports.h (struct scm_t_port_buffer): New data type.
(struct scm_t_port): Refactor to use port buffers instead of
implementation-managed read and write pointers. Add "read_buffering"
member.
(SCM_INITIAL_PUTBACK_BUF_SIZE, SCM_READ_BUFFER_EMPTY_P): Remove.
(scm_t_ptob_descriptor): Rename "fill_input" function to "read", and
take a port buffer, returning void. Likewise "write" takes a port
buffer and returns void. Remove "end_input"; instead if there is
buffered input and rw_random is true, then there must be a seek
function, so just seek back if needed. Remove "flush"; instead all
calls to the "write" function implicitly include a "flush", since the
buffering happens in the generic port code now. Remove "setvbuf", but
add "get_natural_buffer_sizes"; instead the generic port code can
buffer any port.
(scm_make_port_type): Adapt to read and write prototype changes.
(scm_set_port_flush, scm_set_port_end_input, scm_set_port_setvbuf):
Remove.
(scm_slow_get_byte_or_eof_unlocked)
(scm_slow_get_peek_or_eof_unlocked): Remove; the slow path is to call
scm_fill_input.
(scm_set_port_get_natural_buffer_sizes): New function.
(scm_c_make_port_buffer): New internal function.
(scm_port_non_buffer): Remove. This was a function for
implementations that is no longer needed. Instead open with BUF0 or
use (setvbuf port 'none).
(scm_fill_input, scm_fill_input_unlocked): Return the filled port
buffer.
(scm_get_byte_or_eof_unlocked, scm_peek_byte_or_eof_unlocked): Adapt
to changes in buffering and EOF management.
* libguile/ports.c: Adapt to port interface changes.
(initialize_port_buffers): New function, using the port mode flags to
set up appropriate initial buffering for all ports.
(scm_c_make_port_with_encoding): Create port buffers here instead of
delegating to implementations.
(scm_close_port): Flush the port if needed instead of delegating to
the implementation.
* libguile/filesys.c (set_element): Adapt to buffering changes.
* libguile/fports.c (fport_get_natural_buffer_sizes): New function,
replacing scm_fport_buffer_add.
(fport_write, fport_read): Update to let the generic ports code do the
buffering.
(fport_flush, fport_end_input): Remove.
(fport_close): Don't flush in a dynwind; that's the core ports' job.
(scm_make_fptob): Adapt.
* libguile/ioext.c (scm_redirect_port): Adapt to buffering changes.
* libguile/poll.c (scm_primitive_poll): Adapt to buffering changes.
* libguile/ports-internal.h (struct scm_port_internal): Remove
pending_eof flag; this is now set on the read buffer.
* libguile/r6rs-ports.c (struct bytevector_input_port): New type. The
new buffering arrangement means that there's now an intermediate
buffer between the bytevector and the user of the port; this could
lead to a perf degradation, but on the other hand there are some other
speedups enabled by the buffering refactor, so probably the memcpy
cost is dwarfed by the cost of the other parts of the ports
machinery.
(make_bytevector_input_port, bytevector_input_port_read):
(bytevector_input_port_seek, initialize_bytevector_input_ports): Adapt
to new buffering arrangement.
(struct custom_binary_port): Remove read buffer, as Guile handles that
now.
(custom_binary_input_port_setvbuf): Remove; now handled by Guile.
(make_custom_binary_input_port, custom_binary_input_port_read)
(initialize_custom_binary_input_ports): Adapt.
(scm_get_bytevector_some): Adapt to new EOF management.
(scm_t_bytevector_output_port_buffer): Hold on to the underlying port,
so we can flush it if it's open.
(make_bytevector_output_port, bytevector_output_port_write):
(bytevector_output_port_seek): Adapt.
(bytevector_output_port_procedure): Flush the port as appropriate, so
that we get all the bytes.
(make_custom_binary_output_port, custom_binary_output_port_write):
Adapt.
(make_transcoded_port): Don't muck with buffering.
(transcoded_port_write): Simply forward the write to the underlying
port.
(transcoded_port_read): Likewise.
(transcoded_port_close): No need to flush.
(initialize_transcoded_ports): Adapt.
* libguile/read.c (scm_i_scan_for_encoding): Adapt to buffering
changes.
* libguile/rw.c (scm_write_string_partial): Adapt to buffering changes.
* libguile/strports.c: Adapt to the fact that we don't manage the
buffer. Probably room for speed improvements here...
* libguile/vports.c (soft_port_get_natural_buffer_sizes): New function.
Adapt the rest of the file for the new buffering regime.
* test-suite/tests/r6rs-ports.test ("8.2.10 Output ports"): Custom
binary output ports need to be flushed before you can rely on the
write! procedure having been called. Add necessary flush-port
invocations.
("8.2.6 Input and output ports"): Transcoded ports now have an
internal buffer by default. This test checks that the characters are
transcoded one at a time, so to do that, call setvbuf on the
transcoded port to remove the buffer.
* test-suite/tests/web-client.test (run-with-http-transcript): Fix for
different flushing regime on soft ports. (The vestigial flush
procedure is now called after each write, which is not what the test
was expecting.)
* test-suite/standalone/test-scm-c-read.c: Update for changes to the C
interface for defining port types.
* doc/ref/api-io.texi (Ports): Update to discuss buffering in a generic
way, and to remove a hand-wavey paragraph describing string ports as
"interesting and powerful".
(Reading, Writing): Remove placeholder comments. Document
`scm_lfwrite'.
(Buffering): New section.
(File Ports): Link to buffering.
(I/O Extensions): Join subnodes into parent and describe new API,
including buffering API.
* doc/ref/posix.texi (Ports and File Descriptors): Link to buffering.
Remove unread-char etc, as they are documented elsewhere.
(Pipes, Network Sockets and Communication): Link to buffering.
* libguile/ports.h (scm_t_ptob_descriptor): The port close function now
returns void.
(scm_set_port_close): Adapt prototype.
* libguile/ports.c (scm_close_port): Always return true if we managed to
call the close function. There's no other sensible result; exceptions
are handled, well, exceptionally.
* libguile/fports.c (fport_close)
* libguile/r6rs-ports.c (custom_binary_port_close, transcoded_port_close):
* libguile/vports.c (soft_port_close): Adapt.
* doc/ref/api-io.texi (Port Implementation): Update.
* libguile/ports.h (scm_t_port_type_flags): Replace
SCM_PORT_TYPE_HAS_FLUSH with SCM_PORT_TYPE_NEEDS_CLOSE_ON_GC.
(scm_t_ptob_descriptor): Remove free function.
* libguile/ports.c (scm_set_port_needs_close_on_gc): New function.
(scm_set_port_flush): Don't set flags.
(scm_c_make_port_with_encoding, scm_close_port): Use the new flag to
determine when to add a finalizer and also when to include the port in
the weak set.
(scm_set_port_free): Remove.
(do_close, finalize_port): Close port instead of calling free
function.
* libguile/r6rs-ports.c (initialize_transcoded_ports):
* libguile/vports.c (scm_make_sfptob):
* libguile/fports.c (scm_make_fptob): Mark these ports as needing close
on GC.
* libguile/fports.c (fport_free): Remove.
* NEWS: Update.
* doc/ref/api-io.texi (Port Implementation): Update.
Reported in <http://bugs.gnu.org/18520>
by David Kastrup <dak@gnu.org>.
* doc/ref/api-io.texi (Random Access): Clarify the unit of the 'offset'
argument to 'seek'.
* libguile/r6rs-ports.c (CBIP_BUFFER_SIZE): Adjust comment. Set to 8KiB.
(SCM_SET_CBIP_BYTEVECTOR): New macro.
(cbip_setvbuf): New function.
(make_cbip): Set PORT's 'setvbuf' internal field.
(cbip_fill_input): Check whether PORT is buffered. When unbuffered,
check whether BV can hold C_REQUESTED bytes, and allocate a new
bytevector if not; copy the data back from BV to c_port->read_pos.
Remove 'again' label, and don't loop there.
* test-suite/tests/r6rs-ports.test ("7.2.7 Input Ports")["custom binary
input port unbuffered & 'port-position'", "custom binary input port
unbuffered & 'read!' calls", "custom binary input port, unbuffered
then buffered", "custom binary input port, buffered then unbuffered"]:
New tests.
* doc/ref/api-io.texi (R6RS Binary Input): Document the buffering of
custom binary input ports, and link to 'setvbuf'.
* libguile/strports.c (scm_mkstrport): Use UTF-8; ignore
%default-port-encoding. Rename 'str_len' and 'c_pos' to
'num_bytes' and 'c_byte_pos'. Interpret 'pos' argument
as a character index instead of a byte index.
* module/ice-9/boot-9.scm (%cond-expand-features): Add srfi-6 to the
list of core features.
* module/srfi/srfi-6.scm (open-input-string, open-output-string): Simply
re-export these, since the core versions are now compliant.
* doc/ref/api-io.texi (String Ports): Remove text that describes
non-compliant behavior of string ports with regard to encoding.
* doc/ref/srfi-modules.texi (SRFI-0): Add srfi-6 to the list of
core features.
(SRFI-6): Remove text that mentions non-compliant behavior of
core string ports.
* module/ice-9/format.scm (format):
* module/ice-9/pretty-print.scm (truncated-print):
* module/rnrs/io/ports.scm (open-string-input-port,
open-string-output-port):
* test-suite/test-suite/lib.scm (format-test-name):
* test-suite/tests/chars.test ("combining accent is pretty-printed",
"combining X is pretty-printed"):
* test-suite/tests/ecmascript.test (eread, eread/1):
* test-suite/tests/rdelim.test:
* test-suite/tests/reader.test (read-string):
* test-suite/tests/regexp.test:
* test-suite/tests/srfi-105.test (read-string): Don't set
%default-port-encoding before creating string ports.
* benchmark-suite/benchmarks/ports.bm (%latin1-port): Use
'set-port-encoding!' to set the string port encoding.
(%utf8/ascii-port, %utf8/wide-port, "rdelim"): Don't set
%default-port-encoding before creating string ports.
* test-suite/tests/r6rs-ports.test ("lookahead-u8 non-ASCII"): Don't set
%default-port-encoding before creating string ports.
("put-bytevector with UTF-16 string port", "put-bytevector with
wrong-encoding string port"): Use 'set-port-encoding!' to set the
string port encoding.
* test-suite/tests/print.test (tprint): Use 'set-port-encoding!' to set
the string port encoding.
("truncated-print"): Use 'pass-if-equal'.
* test-suite/tests/ports.test ("encoding failure leads to exception",
"%default-port-encoding is honored", "peek-char [latin-1]", "peek-char
[utf-8]", "peek-char [utf-16]"): Remove tests.
("%default-port-encoding is ignored", "peek-char"): Add tests.
("suitable encoding [latin-1]", "suitable encoding [latin-3]",
"wrong encoding, error", "wrong encoding, substitute",
"wrong encoding, escape"): Use 'set-port-encoding!' to set the
string port encoding.
("%default-port-encoding, wrong encoding"): Rewrite to use
a file port instead of a string port.
* libguile/fports.c (scm_open_file_with_encoding): New API function,
containing the code previously found in 'scm_open_file', but modified
to accept the new 'guess_encoding' and 'encoding' arguments.
(scm_open_file): Now just a simple wrapper that calls
'scm_open_file_with_encoding'.
(scm_i_open_file): New implementation of 'open-file' that accepts
keyword arguments '#:guess-encoding' and '#:encoding', and calls
'scm_open_file_with_encoding'.
(scm_init_fports_keywords): New initialization function that gets
called after keywords are initialized.
* libguile/fports.h (scm_open_file_with_encoding,
scm_init_fports_keywords): Add prototypes.
* libguile/init.c (scm_i_init_guile): Call 'scm_init_fports_keywords'.
* module/ice-9/boot-9.scm: Add enhanced versions of 'open-input-file',
'open-output-file', 'call-with-input-file', 'call-with-output-file',
'with-input-from-file', 'with-output-to-file', and
'with-error-to-file', that accept keyword arguments '#:binary',
'#:encoding', and (for input port constructors) '#:guess-encoding'.
* doc/ref/api-io.texi (File Ports): Update documentation.
* test-suite/tests/ports.test ("keyword arguments for file openers"):
Add tests.
* libguile/ports.c (scm_i_unget_bytes): New static function.
(scm_unget_bytes): New API function.
(scm_unget_byte): Rewrite to simply call 'scm_i_unget_bytes'.
(scm_ungetc, scm_peek_char, looking_at_bytes): Use 'scm_i_unget_bytes'.
* libguile/ports.h: Add prototype for 'scm_unget_bytes'.
* libguile/fports.c (scm_setvbuf): Use 'scm_unget_bytes'.
* libguile/r6rs-ports.c (scm_unget_bytevector): New procedure.
* module/ice-9/binary-ports.scm (unget-bytevector): New export.
* doc/ref/api-io.texi (R6RS Binary Input): Add documentation.
(R6RS I/O Ports): Update brief description of (ice-9 binary-ports) to
reflect the new reality: it is no longer a subset of (rnrs io ports).
* test-suite/tests/ports.test ("unget-bytevector"): Add test.
* libguile/fports.c (scm_open_file): Do not scan for coding
declarations. Replace 'use_encoding' local variable with
'binary'. Update documentation string.
* module/ice-9/psyntax.scm (include): Add the same file-encoding
logic that's used in compile-file and scm_primitive_load.
* module/ice-9/psyntax-pp.scm: Regenerate.
* doc/ref/api-io.texi (File Ports): Update docs.
* test-suite/tests/ports.test: Change "open-file HONORS file coding
declarations" test to "open-file IGNORES file coding declaration".
* test-suite/tests/coding.test (scan-coding): Use 'file-encoding' to
scan for the encoding, since 'open-input-file' no longer does so.
* libguile/ports-internal.h (struct scm_port_internal): Add new members
'at_stream_start_for_bom_read' and 'at_stream_start_for_bom_write'.
(SCM_UNICODE_BOM): New macro.
(scm_i_port_iconv_descriptors): Add 'mode' parameter to prototype.
* libguile/ports.c (scm_new_port_table_entry): Initialize
'at_stream_start_for_bom_read' and 'at_stream_start_for_bom_write'.
(get_iconv_codepoint): Pass new 'mode' parameter to
'scm_i_port_iconv_descriptors'.
(get_codepoint): After reading a codepoint at stream start, record
that we're no longer at stream start, and consume a BOM where
appropriate.
(scm_seek): Set the stream start flags according to the new position.
(looking_at_bytes): New static function.
(scm_utf8_bom, scm_utf16be_bom, scm_utf16le_bom, scm_utf32be_bom,
scm_utf32le_bom): New static const arrays.
(decide_utf16_encoding, decide_utf32_encoding): New static functions.
(scm_i_port_iconv_descriptors): Add new 'mode' parameter. If the
specified encoding is UTF-16 or UTF-32, make that precise by deciding
what byte order to use, and construct iconv descriptors based on the
precise encoding.
(scm_i_set_port_encoding_x): Record that we are now at stream start.
Do not open the new iconv descriptors immediately; let them be
initialized lazily.
* libguile/print.c (display_string_using_iconv): Record that we're no
longer at stream start. Write a BOM if appropriate.
* doc/ref/api-io.texi (BOM Handling): New node.
* test-suite/tests/ports.test ("set-port-encoding!, wrong encoding"):
Adapt test to cope with the fact that 'set-port-encoding!' does not
immediately open the iconv descriptors.
(bv-read-test): New procedure.
("unicode byte-order marks (BOMs)"): New test prefix.
* libguile/r6rs-ports.c (scm_get_bytevector_some): Rewrite to
efficiently take the contents of the read/putback buffers. In the
docstring, clarify that it might not return all available bytes.
* doc/ref/api-io.texi (R6RS Binary Input): Clarify that
'get-bytevector-some' might not return all available bytes.
* test-suite/tests/r6rs-ports.test ("get-bytevector-some [only-some]"):
Remove bogus test, which requires more than the R6RS requires.
Fixes <http://bugs.gnu.org/11468>.
* libguile/ports.c (scm_conversion_strategy): Remove.
(default_conversion_strategy_var, sym_error, sym_substitute,
sym_escape): New variables.
(scm_i_get_conversion_strategy, scm_i_set_conversion_strategy_x):
Remove.
(scm_i_default_port_conversion_handler,
scm_i_set_default_port_conversion_handler): New functions.
(scm_port_conversion_strategy): Use
`scm_i_default_port_conversion_handler' when PORT is #f.
(scm_set_port_conversion_strategy_x): Use SYM_ERROR, SYM_SUBSTITUTE,
and SYM_ESCAPE. Use `scm_i_set_default_port_conversion_handler' when
PORT is #f.
(scm_init_ports): Initialize DEFAULT_CONVERSION_STRATEGY_VAR.
* libguile/ports.h: Update declarations accordingly.
* libguile/foreign.c: Change
`scm_i_get_conversion_strategy (SCM_BOOL_F)' to
`scm_i_default_port_conversion_handler ()'.
* libguile/strings.c: Likewise.
* test-suite/tests/ports.test ("%default-port-conversion-strategy"): New
test prefix.
* test-suite/tests/foreign.test ("pointer<->string")["%default-port-conversion-strategy
is error", "%default-port-conversion-strategy is soft"]: New tests.
* test-suite/test-suite/lib.scm (exception:encoding-error): Allow the
regexp to match `scm_to_stringn' error messages.
* doc/ref/api-io.texi (Ports): Document `%default-port-conversion-strategy'.
* doc/ref/api-compound.texi
* doc/ref/api-evaluation.texi
* doc/ref/api-foreign.texi
* doc/ref/api-io.texi
* doc/ref/posix.texi
* doc/ref/srfi-modules.texi: Add missing parentheses and commas to definitions
of C functions.
* doc/ref/api-data.texi: Change from @deffn to @deftypefn for C function
with arguments not of SCM type.