* libguile/ports-internal.h (scm_port_buffer_reset_end): New helper.
(scm_port_buffer_putback): New helper.
* libguile/ports.h (scm_t_port_buffer): Remove "buf" field.
(scm_get_byte_or_eof_unlocked, scm_peek_byte_or_eof_unlocked): Adapt.
* libguile/ports.c (scm_c_make_port_buffer): No more "buf" field.
(scm_i_unget_bytes_unlocked): Use helper.
* libguile/read.c (scm_i_scan_for_encoding): No more "buf" field.
* libguile/ports.h (struct scm_t_port_buffer): New data type.
(struct scm_t_port): Refactor to use port buffers instead of
implementation-managed read and write pointers. Add "read_buffering"
member.
(SCM_INITIAL_PUTBACK_BUF_SIZE, SCM_READ_BUFFER_EMPTY_P): Remove.
(scm_t_ptob_descriptor): Rename "fill_input" function to "read", and
take a port buffer, returning void. Likewise "write" takes a port
buffer and returns void. Remove "end_input"; instead if there is
buffered input and rw_random is true, then there must be a seek
function, so just seek back if needed. Remove "flush"; instead all
calls to the "write" function implicitly include a "flush", since the
buffering happens in the generic port code now. Remove "setvbuf", but
add "get_natural_buffer_sizes"; instead the generic port code can
buffer any port.
(scm_make_port_type): Adapt to read and write prototype changes.
(scm_set_port_flush, scm_set_port_end_input, scm_set_port_setvbuf):
Remove.
(scm_slow_get_byte_or_eof_unlocked)
(scm_slow_get_peek_or_eof_unlocked): Remove; the slow path is to call
scm_fill_input.
(scm_set_port_get_natural_buffer_sizes): New function.
(scm_c_make_port_buffer): New internal function.
(scm_port_non_buffer): Remove. This was a function for
implementations that is no longer needed. Instead open with BUF0 or
use (setvbuf port 'none).
(scm_fill_input, scm_fill_input_unlocked): Return the filled port
buffer.
(scm_get_byte_or_eof_unlocked, scm_peek_byte_or_eof_unlocked): Adapt
to changes in buffering and EOF management.
* libguile/ports.c: Adapt to port interface changes.
(initialize_port_buffers): New function, using the port mode flags to
set up appropriate initial buffering for all ports.
(scm_c_make_port_with_encoding): Create port buffers here instead of
delegating to implementations.
(scm_close_port): Flush the port if needed instead of delegating to
the implementation.
* libguile/filesys.c (set_element): Adapt to buffering changes.
* libguile/fports.c (fport_get_natural_buffer_sizes): New function,
replacing scm_fport_buffer_add.
(fport_write, fport_read): Update to let the generic ports code do the
buffering.
(fport_flush, fport_end_input): Remove.
(fport_close): Don't flush in a dynwind; that's the core ports' job.
(scm_make_fptob): Adapt.
* libguile/ioext.c (scm_redirect_port): Adapt to buffering changes.
* libguile/poll.c (scm_primitive_poll): Adapt to buffering changes.
* libguile/ports-internal.h (struct scm_port_internal): Remove
pending_eof flag; this is now set on the read buffer.
* libguile/r6rs-ports.c (struct bytevector_input_port): New type. The
new buffering arrangement means that there's now an intermediate
buffer between the bytevector and the user of the port; this could
lead to a perf degradation, but on the other hand there are some other
speedups enabled by the buffering refactor, so probably the memcpy
cost is dwarfed by the cost of the other parts of the ports
machinery.
(make_bytevector_input_port, bytevector_input_port_read):
(bytevector_input_port_seek, initialize_bytevector_input_ports): Adapt
to new buffering arrangement.
(struct custom_binary_port): Remove read buffer, as Guile handles that
now.
(custom_binary_input_port_setvbuf): Remove; now handled by Guile.
(make_custom_binary_input_port, custom_binary_input_port_read)
(initialize_custom_binary_input_ports): Adapt.
(scm_get_bytevector_some): Adapt to new EOF management.
(scm_t_bytevector_output_port_buffer): Hold on to the underlying port,
so we can flush it if it's open.
(make_bytevector_output_port, bytevector_output_port_write):
(bytevector_output_port_seek): Adapt.
(bytevector_output_port_procedure): Flush the port as appropriate, so
that we get all the bytes.
(make_custom_binary_output_port, custom_binary_output_port_write):
Adapt.
(make_transcoded_port): Don't muck with buffering.
(transcoded_port_write): Simply forward the write to the underlying
port.
(transcoded_port_read): Likewise.
(transcoded_port_close): No need to flush.
(initialize_transcoded_ports): Adapt.
* libguile/read.c (scm_i_scan_for_encoding): Adapt to buffering
changes.
* libguile/rw.c (scm_write_string_partial): Adapt to buffering changes.
* libguile/strports.c: Adapt to the fact that we don't manage the
buffer. Probably room for speed improvements here...
* libguile/vports.c (soft_port_get_natural_buffer_sizes): New function.
Adapt the rest of the file for the new buffering regime.
* test-suite/tests/r6rs-ports.test ("8.2.10 Output ports"): Custom
binary output ports need to be flushed before you can rely on the
write! procedure having been called. Add necessary flush-port
invocations.
("8.2.6 Input and output ports"): Transcoded ports now have an
internal buffer by default. This test checks that the characters are
transcoded one at a time, so to do that, call setvbuf on the
transcoded port to remove the buffer.
* test-suite/tests/web-client.test (run-with-http-transcript): Fix for
different flushing regime on soft ports. (The vestigial flush
procedure is now called after each write, which is not what the test
was expecting.)
* test-suite/standalone/test-scm-c-read.c: Update for changes to the C
interface for defining port types.
* doc/ref/api-io.texi (Ports): Update to discuss buffering in a generic
way, and to remove a hand-wavey paragraph describing string ports as
"interesting and powerful".
(Reading, Writing): Remove placeholder comments. Document
`scm_lfwrite'.
(Buffering): New section.
(File Ports): Link to buffering.
(I/O Extensions): Join subnodes into parent and describe new API,
including buffering API.
* doc/ref/posix.texi (Ports and File Descriptors): Link to buffering.
Remove unread-char etc, as they are documented elsewhere.
(Pipes, Network Sockets and Communication): Link to buffering.
* libguile/fports.c (fport_flush, fport_end_input): Move rw_active
handling to ports.c.
* libguile/ioext.c (scm_redirect_port): Use scm_flush_unlocked instead
of calling the flush function directly.
* libguile/ports.c (scm_c_make_port_with_encoding): Ports default to
"rw_random" mode when they have a seek function.
(scm_c_read_unlocked, scm_i_unget_bytes_unlocked)
(scm_slow_get_byte_or_eof_unlocked)
(scm_slow_peek_byte_or_eof_unlocked): Flush write buffer and set
rw_active always in the same way, and only if rw_random is true.
(scm_end_input_unlocked, scm_flush_unlocked): Clear rw_active here
unconditionally.
(scm_c_write_unlocked): Flush read buffer and set rw_active always in
the same way, but only if rw_random is true.
(scm_c_write, scm_lfwrite): Whitespace fixes.
(scm_lfwrite_substr): Don't flush read buffer; lower-level code will
do this.
(scm_truncate_file): Use scm_flush_unlocked instead of calling the
flush function directly.
* libguile/r6rs-ports.c (transcoded_port_flush): Don't muck with
rw_active.
* libguile/read.c (scm_i_scan_for_encoding): Flush write buffer if
needed in same way as ports.c.
* libguile/strports.c (st_end_input): Don't muck with rw_active.
(scm_mkstrport): rw_random defaults to 1 now.
Suggested by David Kastrup <dak@gnu.org> in <http://bugs.gnu.org/13644>.
* libguile/read.c (scm_read_string_like_syntax): Accept "\(" as
equivalent to "(".
* doc/ref/api-data.texi (String Syntax): Document it.
* test-suite/tests/reader.test ("reading"): Add test.
Fixes <http://bugs.gnu.org/16463>.
Reported by Sree Harsha Totakura <sreeharsha@totakura.in>.
* libguile/read.c (ENCODING_NAME_MAX_SIZE): New macro.
(SCM_ENCODING_SEARCH_SIZE): Change to 500 + ENCODING_NAME_MAX_SIZE.
(scm_i_scan_for_encoding): Return NULL if there's less than
ENCODING_NAME_MAX_SIZE bytes once "coding: *" has been read.
* test-suite/tests/coding.test ("line
comment")["http://bugs.gnu.org/16463"]: New test.
* libguile/private-options.h (SCM_R7RS_SYMBOLS_P): New macro.
(SCM_N_READ_OPTIONS): Increment.
* libguile/read.c (scm_read_opts): Add entry for 'r7rs-symbols'.
(t_read_opts): Add field for 'r7rs_symbols_p'.
(scm_read_string_like_syntax): New function based on earlier
'scm_read_string' that handles either string literals or R7RS quoted
symbols (delimited by vertical bars), depending on the value of 'chr'.
(scm_read_string): Reimplement based on 'scm_read_string_like_syntax'.
(scm_read_r7rs_symbol): New static function.
* doc/ref/api-data.texi (Symbol Read Syntax): Briefly describe the R7RS
symbol syntax, mention the 'r7rs-symbols' read option, and give some
examples.
* doc/ref/api-evaluation.texi (Scheme Read): Mention the 'r7rs-symbols'
read option.
* test-suite/tests/reader.test ("reading"): Add test.
* libguile/ports.c (scm_i_port_property, scm_i_set_port_property_x):
Lock the port mutex while accessing the port alist.
* libguile/read.c (set_port_read_option): Lock the port mutex
while modifying port read options.
* libguile/ports.c (scm_i_port_alist, scm_i_set_port_alist_x): Removed.
(scm_i_port_property, scm_i_set_port_property_x): New procedures,
available from Scheme as '%port-property' and '%set-port-property!'.
* libguile/ports.h (scm_i_port_alist, scm_i_set_port_alist_x): Removed.
(scm_i_port_property, scm_i_set_port_property_x): New prototypes.
* libguile/read.c (set_port_read_option, init_read_options): Adapt to
use scm_i_port_property and scm_i_set_port_property_x.
* libguile/ports-internal.h (struct scm_port_internal): Add 'alist'
member.
* libguile/ports.c (scm_i_port_alist, scm_i_set_port_alist_x): New
internal functions.
(scm_i_port_weak_hash): Update comment: the hash table is no longer
used to store the port's alist.
(scm_new_port_table_entry): Initialize 'alist'. Store SCM_BOOL_F in
the port weak hash, not SCM_EOL.
* libguile/ports.h (scm_i_port_alist, scm_i_set_port_alist_x): Add
protoypes.
* libguile/read.c (set_port_read_option, init_read_options): Access the
port's alist via 'scm_i_port_alist' and 'scm_i_set_port_alist_x'.
* libguile/ports.h:
* libguile/ports.c (scm_consume_byte_order_mark): New procedure.
* libguile/fports.c (scm_open_file): Call consume-byte-order-mark if we
are opening a file in "r" mode.
* libguile/read.c (scm_i_scan_for_encoding): Don't do anything about
byte-order marks.
* libguile/load.c (scm_primitive_load): Add a note about the duplicate
encoding scan.
* test-suite/tests/filesys.test: Add tests for UTF-8, UTF-16BE, and
UTF-16LE BOM handling.
* libguile/ports.c (scm_c_read_unlocked, scm_ungetc_unlocked):
* libguile/read.c (scm_read_character):
* libguile/vports.c (sf_fill_input): Port encodings cannot be NULL any
more, now that encodings are canonicalized, so simplify these.
* libguile/ports.c (ascii_toupper, encoding_matches)
(canonicalize_encoding): New helpers.
(scm_c_make_port_with_encoding):
(scm_i_set_default_port_encoding):
(scm_i_set_port_encoding_x): Use the new helpers to be
case-insensitive and also to canonicalize the internal representation
to upper-case ASCII names.
(scm_i_default_port_encoding): Never return NULL.
(scm_port_encoding): The encoding is always a string.
* libguile/read.c (scm_i_scan_for_encoding): Use a locale-independent
check instead of isalnum. Don't upcase the result: the port code will
handle that.
* test-suite/tests/web-response.test ("example-1"): Adapt test to expect
normalized (upper-case) encoding for the response port.
Moved scm_i_struct_hash from struct.c to hash.c and made it static.
The port's alist is now a field of 'scm_t_port'.
Conflicts:
libguile/arrays.c
libguile/hash.c
libguile/ports.c
libguile/print.h
libguile/read.c
* libguile/private-options.h: Add SCM_CURLY_INFIX_P macro, and increment
SCM_N_READ_OPTIONS.
* libguile/read.c (sym_nfx, sym_bracket_list, sym_bracket_apply): New
variables.
(scm_read_opts): Add curly-infix reader option. Reformat to comply
with GNU coding standards.
(scm_t_read_opts): Add curly_infix_p and neoteric_p fields.
(init_read_options): Initialize new fields.
(CHAR_IS_DELIMITER): Add '{', '}', '[', and ']' as delimiters if
curly_infix_p is set.
(set_port_square_brackets_p, set_port_curly_infix_p): New functions.
(read_inner_expression): New function which contains the code that was
previously in 'scm_read_expression'. Handle curly braces when
curly_infix_p is set. If curly_infix_p is set and square_brackets_p
is unset, follow the Kawa convention: [...] => ($bracket-list$ ...)
(scm_read_expression): New function body to handle neoteric
expressions where appropriate.
(scm_read_shebang): Handle the new reader directives: '#!curly-infix'
and the non-standard '#!curly-infix-and-bracket-lists'.
(scm_read_sexp): Handle curly infix lists.
* module/ice-9/boot-9.scm (%cond-expand-features): Add srfi-105 feature
identifier.
* doc/ref/srfi-modules.texi (SRFI-105): Add stub doc for SRFI-105.
* doc/ref/api-evaluation.texi (Scheme Read): Add documentation for the
'curly-infix' read option, and the '#!curly-infix' and
'#!curly-infix-and-bracket-lists' reader directives.
* doc/ref/api-options.texi (Runtime Options): Add 'curly-infix' to the
list of read options.
* test-suite/Makefile.am: Add tests/srfi-105.test.
* test-suite/tests/srfi-105.test: New file.
* libguile/read.c (scm_t_read_opts): Update comment to mention the
per-port read options.
(sym_port_read_options): New variable.
(set_port_read_option): New function.
(init_read_options): Add new 'port' parameter, and consult the
per-port read option overrides when initializing the 'scm_t_read_opts'
struct. Move to bottom of file.
(scm_read): Pass 'port' parameter to init_read_options.
* libguile/read.c (scm_read_sharp_extension): Attach source properties
to the result of a custom token reader if the returned datum is not
immediate. Previously, source properties were added to pairs only.
* libguile/read.c (enum t_keyword_style, struct t_read_opts,
scm_t_read_opts): New types.
(init_read_options): New function.
(CHAR_IS_DELIMITER): Look up square-brackets option via local 'opts'.
(scm_read): Call 'init_read_options', and pass 'opts' to helpers.
(flush_ws, maybe_annotate_source, read_complete_token, read_token,
scm_read_bytevector, scm_read_character,
scm_read_commented_expression, scm_read_expression,
scm_read_guile_bit_vector, scm_read_keyword,
scm_read_mixed_case_symbol, scm_read_nil, scm_read_number,
scm_read_number_and_radix, scm_read_quote, scm_read_sexp,
scm_read_sharp, scm_read_sharp_extension, scm_read_shebang,
scm_read_srfi4_vector, scm_read_string, scm_read_syntax,
scm_read_vector, scm_read_array): Add 'opts' as an additional
parameter, and use it to look up read options. Previously the global
read options were consulted directly.
* libguile/read.c (CHAR_IS_R5RS_DELIMITER, CHAR_IS_DELIMITER): Move the
'[' and ']' delimiters from CHAR_IS_R5RS_DELIMITER to
CHAR_IS_DELIMITER. Parenthesize all references to the macro
parameter. Don't check the global square-brackets read option until
after we know the character is '[' or ']'.
(scm_read_sexp): Don't check the global square-brackets read option
until after we know the character is ']'.
* libguile/arrays.c (read_decimal_integer): Move to read.c.
(scm_i_read_array): Remove. Incorporate the code into the
'scm_read_array' static function in read.c.
* libguile/arrays.h (scm_i_read_array): Remove prototype.
* libguile/read.c (read_decimal_integer): Move here from read.c.
(scm_read_array): Incorporate the code from 'scm_i_read_array'. Call
'scm_read_vector' and 'scm_read_sexp' instead of 'scm_read'.
According to the new benchmarks, this leads a 5% speed improvement when
reading small strings, and a 27% improvement when reading large strings.
* libguile/read.c (READER_STRING_BUFFER_SIZE): Change to 128; update
comment to mention codepoints.
(scm_read_string): Make `str' a list of strings, instead of a string.
Store characters read in buffer `c_str'. Cons to STR when C_STR is
full, and concatenate/reverse at the end.
* benchmark-suite/benchmarks/read.bm (small, large): New variables.
Set %DEFAULT-PORT-ENCODING to "UTF-8".
("read")["small strings", "large strings"]: New benchmarks.
* libguile/read.c (read_token): Remove unneeded `const' before `size_t'.
(read_complete_token): Remove `overflow_buffer' parameter; return
`char *' instead of `int'. Allocate the overflow buffer with
`scm_gc_malloc_pointerless' instead of `scm_malloc'. Return either
the overflow buffer or BUFFER.
(scm_read_number, scm_read_mixed_case_symbol,
scm_read_number_and_radix): Rename `buffer' to `local_buffer', and
`overflow_buffer' to `buffer'. Remove `overflow'. Adjust code to new
`read_complete_token'.
* libguile/read.c (scm_read_number): Set source properties on
non-immediate numbers if the 'positions' reader option is set.
* doc/ref/api-debug.texi (Source Properties): Update manual.
* libguile/read.c (scm_read_array): New internal helper that
calls scm_i_read_array and sets its source property if the
'positions' reader option is set.
(scm_read_string): Set source properties on strings if the 'positions'
reader option is set.
(scm_read_vector, scm_read_srfi4_vector, scm_read_bytevector,
scm_read_guile_bitvector, scm_read_sharp): Add new arguments for the
'line' and 'column' of the first character of the datum being read.
Set source properties if the 'positions' reader option is set.
(scm_read_expression): Pass 'line' and 'column' to scm_read_sharp.
* doc/ref/api-debug.texi (Source Properties): Update manual.
* libguile/read.c (scm_read_string): Return a freshly allocated string
every time, even for empty strings. The motivation is to allow source
properties to be added to all strings. Previously, the shared global
'scm_nullstr' was returned for empty strings. Note that empty strings
still share a common global 'null_stringbuf'.
* test-suite/tests/srfi-13.test (substring/shared): Fix tests to reflect
the fact that empty string literals are no longer guaranteed to be
'eq?' to each other.