* libguile/ports.c (scm_specialize_port_encoding_x)
(scm_port_clear_stream_start_for_bom_read): New functions exported
to (ice-9 ports).
* module/ice-9/ports.scm (clear-stream-start-for-bom-read):
(fill-input, peek-char-and-len): Rework to handle BOM in fill-input
instead of once per peek-char.
* libguile/print.c (display_string_using_iconv): Remove BOM handling;
this is now handled by scm_lfwrite.
* libguile/ports.c (open_iconv_descriptors): Refactor to take encoding
as a symbol.
(prepare_iconv_descriptors): New helper.
(scm_i_port_iconv_descriptors): Remove scm_t_port_rw_active argument,
and don't sniff UTF-16/UTF-32 byte orders here. Instead BOM handlers
will call prepare_iconv_descriptors.
(scm_c_read_bytes): Call new port_clear_stream_start_for_bom_read
helper.
(port_maybe_consume_initial_byte_order_mark)
(scm_port_maybe_consume_initial_byte_order_mark): Remove. This leaves
the Scheme %peek-char broken, but it is currently unused, so that's OK.
(peek_iconv_codepoint): Fetch iconv descriptors after doing fill-input
because it's fill-input that will sniff the BOM.
(peek_codepoint): Handle the BOM in fill-input instead of at every
character.
(maybe_consume_bom, port_clear_stream_start_for_bom_read)
(port_clear_stream_start_for_bom_write): New helpers.
(scm_fill_input): Slurp a BOM if needed.
(scm_i_write): Clear the start-of-stream-for-bom-write flag.
(scm_lfwrite): Write a BOM if needed.
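
A minimal sketch of the kind of check a BOM-slurping fill-input might perform, assuming the buffer already holds the first bytes of the stream. The names bom_kind and sniff_bom are hypothetical and are not Guile's internal API. The sketch also ignores the port's declared encoding, which a real implementation would presumably consult before deciding which mark to look for.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names for illustration; not Guile's internal API.  */
enum bom_kind
{
  BOM_NONE, BOM_UTF8, BOM_UTF16BE, BOM_UTF16LE, BOM_UTF32BE, BOM_UTF32LE
};

/* Report which byte order mark, if any, starts BUF, and store its
   length in *BOM_LEN.  The 4-byte UTF-32 marks are tested first
   because the UTF-32LE mark begins with the UTF-16LE mark; without
   knowing the declared encoding this remains a heuristic.  */
static enum bom_kind
sniff_bom (const unsigned char *buf, size_t len, size_t *bom_len)
{
  static const struct
  {
    enum bom_kind kind;
    size_t len;
    unsigned char bytes[4];
  } marks[] = {
    { BOM_UTF32BE, 4, { 0x00, 0x00, 0xFE, 0xFF } },
    { BOM_UTF32LE, 4, { 0xFF, 0xFE, 0x00, 0x00 } },
    { BOM_UTF8,    3, { 0xEF, 0xBB, 0xBF } },
    { BOM_UTF16BE, 2, { 0xFE, 0xFF } },
    { BOM_UTF16LE, 2, { 0xFF, 0xFE } },
  };
  size_t i;

  assert (buf != NULL && bom_len != NULL);
  for (i = 0; i < sizeof marks / sizeof marks[0]; i++)
    if (len >= marks[i].len
        && memcmp (buf, marks[i].bytes, marks[i].len) == 0)
      {
        *bom_len = marks[i].len;
        return marks[i].kind;
      }

  *bom_len = 0;
  return BOM_NONE;
}
```

A caller would advance its read cursor by *bom_len once, at the start of the stream, rather than repeating the check for every character.
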
* module/ice-9/ports.scm: Speed tweaks to %peek-char. Ultimately
somewhat fruitless; I can get 1.4s instead of 1.5s by only
half-inlining the UTF-8 case though.
* module/ice-9/ports.scm (EILSEQ, decoding-error, peek-char-and-len/utf8):
(peek-char-and-len/iso-8859-1, peek-char-and-len/iconv):
(peek-char-and-len, %peek-char): New definitions. Missing iconv1 for
peek-char, but enough to benchmark.
* libguile/ports.h (scm_sys_port_encoding, scm_sys_set_port_encoding):
New functions, to expose port encodings as symbols directly to (ice-9
ports).
(scm_port_maybe_consume_initial_byte_order_mark): New function.
* libguile/ports.c (scm_port_encoding): Dispatch to %port-encoding.
(scm_set_port_encoding_x): Dispatch to %set-port-encoding!.
(port_maybe_consume_initial_byte_order_mark): New helper, factored out
of peek_codepoint.
(scm_port_maybe_consume_initial_byte_order_mark, peek_codepoint): Call
port_maybe_consume_initial_byte_order_mark.
* module/ice-9/ports.scm (port-encoding): Implement in Scheme.
* libguile/ports.h (scm_t_port): Represent the conversion strategy as a
symbol, to make things easier for Scheme. Rename to
"conversion_strategy".
(scm_c_make_port_with_encoding): Change to take encoding and
conversion_strategy arguments as symbols.
(scm_i_string_failed_conversion_handler): New internal helper, to turn
a symbol into a scm_t_string_failed_conversion_handler.
(scm_i_default_port_encoding): Return the default port encoding as a
symbol.
(scm_i_default_port_conversion_strategy)
(scm_i_set_default_port_conversion_strategy): Rename from
scm_i_default_port_conversion_handler et al. Take and return Scheme
symbols.
* libguile/foreign.c (scm_string_to_pointer, scm_pointer_to_string): Use
scm_i_default_string_failed_conversion_handler instead of
scm_i_default_port_conversion_handler.
* libguile/print.c (PORT_CONVERSION_HANDLER): Update definition.
(print_normal_symbol): Use PORT_CONVERSION_HANDLER.
* libguile/r6rs-ports.c (make_bytevector_input_port):
(make_custom_binary_input_port, make_bytevector_output_port): Adapt to
changes in scm_c_make_port_with_encoding.
* libguile/strings.h:
* libguile/strings.c (scm_i_default_string_failed_conversion_handler):
New helper.
(scm_from_locale_stringn, scm_from_port_stringn):
(scm_to_locale_stringn, scm_to_port_stringn): Adapt to interface
changes.
* libguile/strports.c (scm_mkstrport): Adapt to
scm_c_make_port_with_encoding change.
* libguile/ports.c (scm_c_make_port): Adapt to
scm_c_make_port_with_encoding change.
(ascii_toupper, encoding_matches, canonicalize_encoding): Move down in
the file.
(peek_codepoint, get_codepoint, scm_ungetc): Adapt to port conversion
strategy change. Remove duplicate case in get_codepoint.
(scm_init_ports): Move symbol initializations to the same place.
* libguile/ports-internal.h (scm_t_port_internal): Remove encoding_mode
member.
* libguile/ports.h (scm_t_port): "encoding" member is now a SCM symbol.
* libguile/ports.c (scm_init_ports): Define symbols for the encodings
that we handle explicitly.
(encoding_matches): Adapt to check against an encoding as a symbol.
(canonicalize_encoding): Return an encoding as a symbol.
(scm_c_make_port_with_encoding, scm_i_set_default_port_encoding)
(decide_utf16_encoding, decide_utf32_encoding)
(scm_i_port_iconv_descriptors, scm_i_set_port_encoding_x)
(scm_port_encoding, peek_codepoint, scm_ungetc): Adapt to encoding
change.
* libguile/print.c (display_string_using_iconv, display_string):
* libguile/read.c (scm_read_character):
* libguile/strings.c (scm_from_port_stringn, scm_to_port_stringn): Adapt
to port encoding change.
* module/ice-9/ports.scm (fill-input): Rewrite to make changes like the
ones made to the C scm_fill_input: allow callers to specify a minimum
amount of buffering.
* libguile/ports.c (scm_i_set_pending_eof): Remove now-unused helper.
(peek_utf8_codepoint, peek_latin1_codepoint, peek_iconv_codepoint):
(peek_codepoint): Refactor the fundamental character readers in Guile
to peek into the read buffer instead of reading then unreading. This
will allow Scheme to use the port buffer to convert, when we port this
to Scheme.
(get_codepoint): Use peek_codepoint.
(scm_getc): Adapt.
(scm_peek_char): Use peek_codepoint.
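
To illustrate peeking into the read buffer instead of reading then unreading, here is a rough sketch of a UTF-8 peek that decodes in place and reports the sequence length, leaving it to the caller to advance the cursor. The helper name peek_utf8 is hypothetical and this is not Guile's peek_utf8_codepoint. The sketch does not reject overlong forms or surrogates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper for illustration.  Decode the UTF-8 sequence at
   BUF without consuming it.  Returns the code point and stores the
   sequence length in *LEN, or returns -1 with *LEN == 0 if the bytes
   are truncated or malformed.  Overlong forms and surrogates are not
   rejected, to keep the sketch short.  */
static long
peek_utf8 (const unsigned char *buf, size_t avail, size_t *len)
{
  size_t need, i;
  long cp;

  assert (buf != NULL && len != NULL);
  *len = 0;
  if (avail == 0)
    return -1;

  /* The lead byte determines how many bytes the sequence needs.  */
  if (buf[0] < 0x80)
    { need = 1; cp = buf[0]; }
  else if ((buf[0] & 0xE0) == 0xC0)
    { need = 2; cp = buf[0] & 0x1F; }
  else if ((buf[0] & 0xF0) == 0xE0)
    { need = 3; cp = buf[0] & 0x0F; }
  else if ((buf[0] & 0xF8) == 0xF0)
    { need = 4; cp = buf[0] & 0x07; }
  else
    return -1;

  if (avail < need)
    return -1;

  /* Each continuation byte contributes its low six bits.  */
  for (i = 1; i < need; i++)
    {
      if ((buf[i] & 0xC0) != 0x80)
        return -1;
      cp = (cp << 6) | (buf[i] & 0x3F);
    }

  *len = need;
  return cp;
}
```

Because nothing is consumed, a get operation can be built as a peek followed by advancing the cursor by the reported length, and no putback is needed.
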
* libguile/ports.h (scm_fill_input): Add "minimum_size" argument. Adapt
all callers to pass 0 as this argument.
* libguile/ports.c (scm_i_read): Inline into scm_fill_input.
(scm_fill_input): "minimum_size" argument ensures that there are a
certain number of bytes available, or EOF. Instead of shrinking the
read buffer, only fill by the read_buffering amount, or the
minimum_size, whichever is larger.
* libguile/r6rs-ports.c:
* libguile/read.c: Adapt scm_fill_input callers.
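
The minimum_size behaviour described for scm_fill_input can be sketched roughly as follows, under stated assumptions. The struct input_buffer, fill_input, chunk_source and chunk_read names are hypothetical, the buffer is a fixed 64-byte array, and treating a minimum_size of 0 as 1 is this sketch's own choice, not something the entries above state.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types and names; not Guile's port API.  */
struct input_buffer
{
  unsigned char bytes[64];   /* storage */
  size_t cur, end;           /* unread bytes are bytes[cur..end) */
  int has_eof;               /* set once the source returns 0 bytes */
};

typedef size_t (*read_fn) (void *source, unsigned char *dst, size_t count);

/* Ensure at least MINIMUM_SIZE bytes are buffered, or that EOF has
   been seen.  Each read asks for enough to reach the larger of
   READ_BUFFERING and MINIMUM_SIZE.  Returns the bytes available.  */
static size_t
fill_input (struct input_buffer *buf, size_t read_buffering,
            size_t minimum_size, read_fn read_bytes, void *source)
{
  size_t buffered = buf->end - buf->cur;
  size_t target;

  if (minimum_size == 0)
    minimum_size = 1;   /* assumption of this sketch */
  target = read_buffering > minimum_size ? read_buffering : minimum_size;
  assert (target <= sizeof buf->bytes);

  if (buffered >= minimum_size || buf->has_eof)
    return buffered;

  /* Move unread bytes to the front so the whole target fits.  */
  memmove (buf->bytes, buf->bytes + buf->cur, buffered);
  buf->cur = 0;

  while (buffered < minimum_size && !buf->has_eof)
    {
      size_t count = read_bytes (source, buf->bytes + buffered,
                                 target - buffered);
      buffered += count;
      buf->has_eof = (count == 0);
    }

  buf->end = buffered;
  return buffered;
}

/* Demo source that hands out at most two bytes per call.  */
struct chunk_source { const char *data; size_t pos, len; };

static size_t
chunk_read (void *source, unsigned char *dst, size_t count)
{
  struct chunk_source *s = source;
  size_t left = s->len - s->pos;
  size_t n = count < left ? count : left;

  if (n > 2)
    n = 2;
  memcpy (dst, s->data + s->pos, n);
  s->pos += n;
  return n;
}
```

With a source that returns short reads, the loop keeps reading until the minimum is met, so a caller that needs, say, a full multi-byte sequence can ask for it in one call.
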
* libguile/ports.c (trampoline_to_c_read, trampoline_to_c_write): Since
C might assume that the indices are within bounds of the bytevector,
verify them more thoroughly here.
(scm_port_random_access_p, scm_port_read_buffering)
(scm_set_port_read_buffer, scm_port_read, scm_port_write): New helpers
exposed to (ice-9 ports).
(scm_port_read_buffer, scm_port_write_buffer): Don't flush or validate
port mode; we do that in Scheme.
* module/ice-9/ports.scm: Implement enough of port machinery to
implement peek-byte in Scheme. Not yet exported.
* libguile/init.c (scm_i_init_guile): Initialize ports before
strports/fports, so that we have initialized the read/write
trampolines before making port types.
* libguile/ports.c (scm_init_ice_9_ports): Define the-eof-object here.
Update a comment.
* module/ice-9/ports.scm: Use the-eof-object definition from C.
* libguile/ports.h (scm_t_ptob_descriptor): Add "scm_read" and
"scm_write" members, for calling from Scheme.
(scm_set_port_scm_read, scm_set_port_scm_write): New procedures.
* libguile/ports.c (trampoline_to_c_read_subr)
(trampoline_to_c_write_subr): New static variables.
* libguile/ports.c (scm_make_port_type): Initialize scm_read and
scm_write members to trampoline to C.
(trampoline_to_c_read, trampoline_to_scm_read)
(trampoline_to_c_write, trampoline_to_scm_write): New helpers.
(scm_set_port_scm_read, scm_set_port_scm_write): New functions.
(default_buffer_size): Move definition down.
(scm_i_read_bytes, scm_i_write_bytes): Use new names for read and
write procedures.
(scm_init_ports): Initialize trampolines.
* libguile/ports.c (scm_peek_char): Optimize. A loop calling peek-char
on a buffered string port 10e6 times goes down from 50ns/iteration to
32ns/iteration.
* libguile/print.c (scm_write, scm_display):
* libguile/read.c (set_port_read_option): Remove port locking. Reading
from and writing to the same port from multiple threads must not
crash, but it doesn't have to make sense.
* libguile/ports.h (scm_unget_bytes_unlocked, scm_unget_byte_unlocked):
Remove.
* libguile/ports.c (looking_at_bytes): Use scm_unget_bytes instead of
scm_i_unget_bytes_unlocked.
(scm_unget_bytes): Rename from scm_i_unget_bytes_unlocked. Remove
other implementations of this function.
(scm_unget_byte): Likewise.
(scm_ungetc_unlocked, scm_peek_char): Use scm_unget_byte.
* libguile/read.c (read_token): Use scm_unget_byte.
* libguile/ports.h (scm_getc_unlocked): Remove, or rather rename to
scm_getc. This probably introduces some thread-related bugs but we'll
fix them in a different way.
* libguile/ports.c (scm_getc): Rename from scm_getc_unlocked, replacing
the locking implementation.
(scm_read_char): Use scm_getc.
* libguile/r6rs-ports.c (scm_get_string_n_x): Use scm_getc.
* libguile/rdelim.c (scm_read_delimited_x, scm_read_line): Use
scm_getc.
* libguile/read.c: Use scm_getc.
* libguile/ports.h (scm_flush_unlocked, scm_end_input_unlocked):
Remove.
* libguile/ports.c (scm_c_read_bytes_unlocked):
(scm_i_unget_bytes_unlocked, scm_setvbuf, scm_force_output)
(scm_fill_input_unlocked, scm_c_write_bytes_unlocked)
(scm_c_write_unlocked, scm_lfwrite_unlocked, scm_seek)
(scm_truncate_file, flush_output_port): Call scm_flush / scm_end_input
instead of the _unlocked variants.
(scm_end_input): Lock while discarding the input buffer but not while
calling out to the seek function.
* libguile/filesys.c (scm_fsync):
* libguile/ioext.c (scm_redirect_port):
* libguile/read.c (scm_i_scan_for_encoding):
* libguile/rw.c (scm_write_string_partial): Use scm_flush, not
scm_flush_unlocked.
* libguile/ports.h (scm_c_read_unlocked): Remove.
* libguile/ports.c (scm_c_read): Rename from scm_c_read_unlocked.
Remove old scm_c_read. Lock around access to the rw_active flag, and
call scm_flush instead of scm_flush_unlocked, and scm_fill_input
instead of scm_fill_input_unlocked.
* libguile/read.c (scm_i_scan_for_encoding): Use scm_c_read instead of
the _unlocked function.
* libguile/ports.h (scm_get_byte_or_eof_unlocked)
(scm_peek_byte_or_eof_unlocked): Remove inline functions. The
important uses are in ports.c anyway and we will use a static function
there.
(scm_slow_get_byte_or_eof_unlocked)
(scm_slow_peek_byte_or_eof_unlocked): Remove declarations without
definitions.
* libguile/ports.c (looking_at_bytes): Use scm_peek_byte_or_eof instead
of the _unlocked variant.
(get_byte_or_eof, peek_byte_or_eof): New static functions.
(scm_get_byte_or_eof, scm_peek_byte_or_eof): Don't lock: the port
buffer mechanism means that we won't crash. More comments to come.
(get_utf8_codepoint, get_latin1_codepoint, get_iconv_codepoint): Use
new static functions.
* libguile/read.c (read_token, scm_read_semicolon_comment): Use
scm_get_byte_or_eof, not scm_get_byte_or_eof_unlocked.
* libguile/ports-internal.h (scm_port_buffer_bytevector)
(scm_port_buffer_cur, scm_port_buffer_set_cur)
(scm_port_buffer_end, scm_port_buffer_set_end)
(scm_port_buffer_has_eof_p, scm_port_buffer_set_has_eof_p): New
helpers.
* libguile/ports-internal.h (scm_port_buffer_size)
(scm_port_buffer_reset, scm_port_buffer_reset_end)
(scm_port_buffer_can_take, scm_port_buffer_can_put)
(scm_port_buffer_can_putback, scm_port_buffer_did_take)
(scm_port_buffer_did_put, scm_port_buffer_take_pointer)
(scm_port_buffer_put_pointer, scm_port_buffer_take)
(scm_port_buffer_put, scm_port_buffer_putback): Adapt to treat port
buffers as SCM values and use helpers to access them.
* libguile/ports.c (scm_i_clear_pending_eof, scm_i_set_pending_eof)
(scm_c_make_port_buffer, scm_i_read_unlocked)
(scm_c_read_bytes_unlocked, scm_i_unget_bytes_unlocked)
(scm_setvbuf, scm_fill_input, scm_take_from_input_buffers)
(scm_drain_input, scm_end_input_unlocked, scm_flush_unlocked)
(scm_fill_input_unlocked, scm_i_write_unlocked)
(scm_c_write_bytes_unlocked, scm_c_write_unlocked)
(scm_char_ready_p): Adapt to treat port buffers as SCM values and use
helpers to access them.
(scm_port_read_buffer, scm_port_write_buffer): New functions,
allowing (ice-9 ports) to access port buffers.
* libguile/ports.h: Update comments on port buffers. Replace
scm_t_port_buffer structure with a Scheme vector whose fields are
enumerated by "enum scm_port_buffer_field".
(scm_get_byte_or_eof_unlocked, scm_peek_byte_or_eof_unlocked): Adapt
these implementations to port buffer representation change.
* libguile/r6rs-ports.c (scm_get_bytevector_some):
* libguile/read.c (scm_i_scan_for_encoding):
* libguile/rw.c (scm_write_string_partial): Port buffers are Scheme
objects.
* libguile/ports-internal.h (scm_port_buffer_size): Verify that the
bytevector field is a bytevector, in anticipation of Schemification.
(scm_port_buffer_can_take, scm_port_buffer_can_put)
(scm_port_buffer_can_putback): Enforce invariants on cur and end
here.
(scm_port_buffer_did_take, scm_port_buffer_did_put): Relax to not call
other functions.
* libguile/ports.h (scm_get_byte_or_eof_unlocked)
(scm_peek_byte_or_eof_unlocked): Refactor to call no functions on the
fast path.
* libguile/ports.h (scm_t_port_buffer): Change "cur" and "end" members
to be SCM values, in preparation for changing port buffers to be
Scheme vectors.
(scm_get_byte_or_eof_unlocked, scm_peek_byte_or_eof_unlocked): Adapt.
* libguile/ports.c (scm_c_make_port_buffer): Initialize cur and end
members.
(looking_at_bytes): Use helper instead of incrementing cur.
(scm_i_read_unlocked): Adapt to end type change.
(CONSUME_PEEKED_BYTE): Use helper instead of incrementing cur.
(scm_i_unget_bytes_unlocked): Use helper instead of comparing cur.
(scm_i_write_unlocked): Fix for changing end/cur types.
* libguile/read.c (scm_i_scan_for_encoding): Use helpers instead of
addressing cursors directly.
* libguile/rw.c (scm_write_string_partial): Likewise.
* libguile/ports-internal.h (scm_port_buffer_reset):
(scm_port_buffer_reset_end, scm_port_buffer_can_take):
(scm_port_buffer_can_put, scm_port_buffer_can_putback):
(scm_port_buffer_did_take, scm_port_buffer_did_put):
(scm_port_buffer_take_pointer, scm_port_buffer_put_pointer):
(scm_port_buffer_putback): Adapt to data types.
* libguile/ports-internal.h (scm_port_buffer_reset_end): New helper.
(scm_port_buffer_putback): New helper.
* libguile/ports.h (scm_t_port_buffer): Remove "buf" field.
(scm_get_byte_or_eof_unlocked, scm_peek_byte_or_eof_unlocked): Adapt.
* libguile/ports.c (scm_c_make_port_buffer): No more "buf" field.
(scm_i_unget_bytes_unlocked): Use helper.
* libguile/read.c (scm_i_scan_for_encoding): No more "buf" field.
* libguile/ports.h (scm_t_port_buffer): Rename has_eof member to
has_eof_p, and make it a Scheme value, in anticipation of moving the
port buffers to be Scheme objects.
* libguile/vports.c (struct soft_port): Inline the encoding buffer so as
to not use scm_t_port_buffer, in anticipation of changing the port
buffer representations. Adapt users.
* module/ice-9/ports.scm: New file.
* am/bootstrap.am (SOURCES): Add ice-9/ports.scm.
* libguile/fports.c (scm_init_ice_9_fports): New function.
(scm_init_fports): Arrange for scm_init_ice_9_fports to be called via
load-extension, and load snarfed things there. Move open-file
definition early, to allow ports to bootstrap.
* libguile/ioext.c (scm_init_ice_9_ioext): New function.
(scm_init_ioext): Similarly, register scm_init_ice_9_ioext as an
extension.
* libguile/ports.c (scm_set_current_input_port)
(scm_set_current_output_port, scm_set_current_error_port): Don't
define Scheme bindings; do that in Scheme.
* libguile/ports.c (scm_i_set_default_port_encoding):
(scm_i_default_port_encoding, scm_i_default_port_conversion_handler):
(scm_i_set_default_port_conversion_handler): Since we now init
encoding early, remove the "init" flags on these encoding/strategy
vars.
(scm_init_ice_9_ports): New function.
(scm_init_ports): Register scm_init_ice_9_ports extension, and define
some bindings needed by the bootstrap.
* module/Makefile.am (SOURCES): Add ice-9/ports.scm.
* module/ice-9/boot-9.scm: Remove code that's not on the boot path,
moving it to ice-9/ports.scm. At the end, load (ice-9 ports).
* module/ice-9/psyntax.scm (include): Use close-port instead of
close-input-port.
* module/ice-9/psyntax-pp.scm (include): Regenerate.
* module/ice-9/r6rs-libraries.scm (resolve-r6rs-interface): In Guile, a
module's public interface is just another module, and that means that
it can import other modules as well. Allow R6RS modules that import
modules whose interfaces import other modules to access all visible
bindings.
* test-suite/tests/rnrs-libraries.test ("import features"): Update
test.
* libguile/struct.c (scm_init_struct): Use scm_from_latin1_string to
avoid locale-dependency for what is really a latin1 string. Also
avoids an early dependency on the default port conversion handler,
though I wonder if using port conversion handlers in strings is the
right thing.
* module/ice-9/boot-9.scm (exception-printers): Fix error in which, for
a pure bootstrap with no compiled files, the exception printer would
use false-with-exception before it had been defined, which doesn't
work for macros. We wouldn't normally see this problem because,
oddly, the macro is already defined in the usual case, for boot reasons.
* libguile/ports.c (scm_i_write_bytes_unlocked): Allow incomplete
writes from the implementation.
(scm_c_write_bytes_unlocked): Use scm_i_write_bytes_unlocked helper to
call the write function.
* libguile/r6rs-ports.c (custom_binary_output_port_write): Don't loop;
core Guile will do that.
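
The division of labour described here, where the port implementation may write fewer bytes than asked and the core loops, might look roughly like this. The names write_fn, write_all, small_sink and small_sink_write are hypothetical, and stopping when a write makes no progress is an assumption of the sketch rather than documented Guile behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical write-callback type: returns how many bytes it took,
   which may be fewer than COUNT.  */
typedef size_t (*write_fn) (void *sink, const unsigned char *src,
                            size_t count);

/* Call WRITE_SOME until all COUNT bytes are written.  Returns the
   number of bytes written; stops early if a call makes no progress
   (an assumption of this sketch, to avoid spinning forever).  */
static size_t
write_all (write_fn write_some, void *sink,
           const unsigned char *src, size_t count)
{
  size_t written = 0;

  assert (write_some != NULL);
  while (written < count)
    {
      size_t n = write_some (sink, src + written, count - written);
      if (n == 0)
        break;
      written += n;
    }
  return written;
}

/* Demo sink that accepts at most three bytes per call.  */
struct small_sink
{
  unsigned char data[32];
  size_t len;
  unsigned calls;
};

static size_t
small_sink_write (void *sink, const unsigned char *src, size_t count)
{
  struct small_sink *s = sink;
  size_t room = sizeof s->data - s->len;
  size_t n = count < 3 ? count : 3;

  if (n > room)
    n = room;
  memcpy (s->data + s->len, src, n);
  s->len += n;
  s->calls++;
  return n;
}
```

Keeping the loop in one place means each port type's write function can stay a single attempt, which is simpler to express from Scheme as well.
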
This will allow better Scheme integration for ports.
* libguile/ports.h (scm_t_port_buffer): Change "holder" member to be a
bytevector defined to have "buf" as its starting point.
(scm_t_ptob_descriptor): Change read and write functions to take
bytevectors as arguments and to return the number of octets read or
written.
(scm_make_port_type): Adapt accordingly.
(scm_c_read_bytes, scm_c_write_bytes): New functions that take
bytevectors.
* libguile/ports.c (scm_make_port_type): Adapt to read/write function
prototype change.
(scm_c_make_port_buffer): Arrange to populate the "bytevector" field.
(scm_i_read_bytes_unlocked): New function.
(scm_i_read_unlocked): Use scm_i_read_bytes_unlocked.
(scm_c_read_bytes_unlocked): New function.
(scm_c_read_unlocked): Update comment, and always go through the
buffer.
(scm_c_read_bytes): New function.
(scm_flush_unlocked): Use scm_i_write_unlocked instead of the port's
write function.
(scm_i_write_bytes_unlocked): New function.
(scm_i_write_unlocked): Use scm_i_write_bytes_unlocked.
(scm_c_write_bytes_unlocked): New function.
(scm_c_write_unlocked): Always write through the buffer.
(scm_c_write_bytes): New function.
(scm_truncate_file): Remove unused variable.
(void_port_read, void_port_write): Adapt to read/write prototype
change.
* libguile/fports.c (fport_read, fport_write):
* libguile/r6rs-ports.c (bytevector_input_port_read)
(custom_binary_input_port_read, bytevector_output_port_write)
(custom_binary_output_port_write, transcoded_port_write)
(transcoded_port_read): Adapt to read/write prototype
change.
(scm_get_bytevector_n, scm_get_bytevector_n_x)
(scm_get_bytevector_all): Use scm_c_read_bytes.
(scm_put_bytevector): Use scm_c_write_bytes.
* libguile/strports.c (string_port_read, string_port_write):
* libguile/vports.c (soft_port_write, soft_port_read): Adapt to
read/write prototype change.
* test-suite/standalone/test-scm-c-read.c (custom_port_read): Fix for
read API change.