When encountering the #!r6rs directive, apply the appropriate reader
settings to the port.
* libguile/read.c (read_string_as_list): New helper procedure.
(scm_read_shebang): Set reader options implied by the R6RS syntax
upon encountering the #!r6rs directive.
* test-suite/tests/reader.test (per-port-read-options): Add tests for
the #!r6rs directive.
Suggested by David Kastrup <dak@gnu.org> in <http://bugs.gnu.org/13644>.
* libguile/read.c (scm_read_string_like_syntax): Accept "\(" as
equivalent to "(".
* doc/ref/api-data.texi (String Syntax): Document it.
* test-suite/tests/reader.test ("reading"): Add test.
Fixes <http://bugs.gnu.org/16463>.
Reported by Sree Harsha Totakura <sreeharsha@totakura.in>.
* libguile/read.c (ENCODING_NAME_MAX_SIZE): New macro.
(SCM_ENCODING_SEARCH_SIZE): Change to 500 + ENCODING_NAME_MAX_SIZE.
(scm_i_scan_for_encoding): Return NULL if there are fewer than
ENCODING_NAME_MAX_SIZE bytes once "coding: *" has been read.
* test-suite/tests/coding.test ("line
comment")["http://bugs.gnu.org/16463"]: New test.
* libguile/private-options.h (SCM_R7RS_SYMBOLS_P): New macro.
(SCM_N_READ_OPTIONS): Increment.
* libguile/read.c (scm_read_opts): Add entry for 'r7rs-symbols'.
(t_read_opts): Add field for 'r7rs_symbols_p'.
(scm_read_string_like_syntax): New function based on earlier
'scm_read_string' that handles either string literals or R7RS quoted
symbols (delimited by vertical bars), depending on the value of 'chr'.
(scm_read_string): Reimplement based on 'scm_read_string_like_syntax'.
(scm_read_r7rs_symbol): New static function.
* doc/ref/api-data.texi (Symbol Read Syntax): Briefly describe the R7RS
symbol syntax, mention the 'r7rs-symbols' read option, and give some
examples.
* doc/ref/api-evaluation.texi (Scheme Read): Mention the 'r7rs-symbols'
read option.
* test-suite/tests/reader.test ("reading"): Add test.
* libguile/ports.c (scm_i_port_alist, scm_i_set_port_alist_x): Removed.
(scm_i_port_property, scm_i_set_port_property_x): New procedures,
available from Scheme as '%port-property' and '%set-port-property!'.
* libguile/ports.h (scm_i_port_alist, scm_i_set_port_alist_x): Removed.
(scm_i_port_property, scm_i_set_port_property_x): New prototypes.
* libguile/read.c (set_port_read_option, init_read_options): Adapt to
use scm_i_port_property and scm_i_set_port_property_x.
* libguile/ports-internal.h (struct scm_port_internal): Add 'alist'
member.
* libguile/ports.c (scm_i_port_alist, scm_i_set_port_alist_x): New
internal functions.
(scm_i_port_weak_hash): Update comment: the hash table is no longer
used to store the port's alist.
(scm_new_port_table_entry): Initialize 'alist'. Store SCM_BOOL_F in
the port weak hash, not SCM_EOL.
* libguile/ports.h (scm_i_port_alist, scm_i_set_port_alist_x): Add
prototypes.
* libguile/read.c (set_port_read_option, init_read_options): Access the
port's alist via 'scm_i_port_alist' and 'scm_i_set_port_alist_x'.
* libguile/ports.h:
* libguile/ports.c (scm_consume_byte_order_mark): New procedure.
* libguile/fports.c (scm_open_file): Call consume-byte-order-mark if we
are opening a file in "r" mode.
* libguile/read.c (scm_i_scan_for_encoding): Don't do anything about
byte-order marks.
* libguile/load.c (scm_primitive_load): Add a note about the duplicate
encoding scan.
* test-suite/tests/filesys.test: Add tests for UTF-8, UTF-16BE, and
UTF-16LE BOM handling.
* libguile/private-options.h: Add SCM_CURLY_INFIX_P macro, and increment
SCM_N_READ_OPTIONS.
* libguile/read.c (sym_nfx, sym_bracket_list, sym_bracket_apply): New
variables.
(scm_read_opts): Add curly-infix reader option. Reformat to comply
with GNU coding standards.
(scm_t_read_opts): Add curly_infix_p and neoteric_p fields.
(init_read_options): Initialize new fields.
(CHAR_IS_DELIMITER): Add '{', '}', '[', and ']' as delimiters if
curly_infix_p is set.
(set_port_square_brackets_p, set_port_curly_infix_p): New functions.
(read_inner_expression): New function which contains the code that was
previously in 'scm_read_expression'. Handle curly braces when
curly_infix_p is set. If curly_infix_p is set and square_brackets_p
is unset, follow the Kawa convention: [...] => ($bracket-list$ ...)
(scm_read_expression): New function body to handle neoteric
expressions where appropriate.
(scm_read_shebang): Handle the new reader directives: '#!curly-infix'
and the non-standard '#!curly-infix-and-bracket-lists'.
(scm_read_sexp): Handle curly infix lists.
* module/ice-9/boot-9.scm (%cond-expand-features): Add srfi-105 feature
identifier.
* doc/ref/srfi-modules.texi (SRFI-105): Add stub doc for SRFI-105.
* doc/ref/api-evaluation.texi (Scheme Read): Add documentation for the
'curly-infix' read option, and the '#!curly-infix' and
'#!curly-infix-and-bracket-lists' reader directives.
* doc/ref/api-options.texi (Runtime Options): Add 'curly-infix' to the
list of read options.
* test-suite/Makefile.am: Add tests/srfi-105.test.
* test-suite/tests/srfi-105.test: New file.
* libguile/read.c (scm_t_read_opts): Update comment to mention the
per-port read options.
(sym_port_read_options): New variable.
(set_port_read_option): New function.
(init_read_options): Add new 'port' parameter, and consult the
per-port read option overrides when initializing the 'scm_t_read_opts'
struct. Move to bottom of file.
(scm_read): Pass 'port' parameter to init_read_options.
* libguile/read.c (scm_read_sharp_extension): Attach source properties
to the result of a custom token reader if the returned datum is not
immediate. Previously, source properties were added to pairs only.
* libguile/read.c (enum t_keyword_style, struct t_read_opts,
scm_t_read_opts): New types.
(init_read_options): New function.
(CHAR_IS_DELIMITER): Look up square-brackets option via local 'opts'.
(scm_read): Call 'init_read_options', and pass 'opts' to helpers.
(flush_ws, maybe_annotate_source, read_complete_token, read_token,
scm_read_bytevector, scm_read_character,
scm_read_commented_expression, scm_read_expression,
scm_read_guile_bit_vector, scm_read_keyword,
scm_read_mixed_case_symbol, scm_read_nil, scm_read_number,
scm_read_number_and_radix, scm_read_quote, scm_read_sexp,
scm_read_sharp, scm_read_sharp_extension, scm_read_shebang,
scm_read_srfi4_vector, scm_read_string, scm_read_syntax,
scm_read_vector, scm_read_array): Add 'opts' as an additional
parameter, and use it to look up read options. Previously the global
read options were consulted directly.
* libguile/read.c (CHAR_IS_R5RS_DELIMITER, CHAR_IS_DELIMITER): Move the
'[' and ']' delimiters from CHAR_IS_R5RS_DELIMITER to
CHAR_IS_DELIMITER. Parenthesize all references to the macro
parameter. Don't check the global square-brackets read option until
after we know the character is '[' or ']'.
(scm_read_sexp): Don't check the global square-brackets read option
until after we know the character is ']'.
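The macro hygiene and check ordering described in this entry can be illustrated with a minimal sketch. The names `square_brackets_enabled`, `IS_R5RS_DELIM`, and `IS_DELIM` are hypothetical stand-ins and the delimiter set is only a plausible one; none of this is copied from libguile.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the global square-brackets read option. */
static bool square_brackets_enabled = true;

/* Every use of the parameter is parenthesized, so an argument such as
   `flag ? '[' : 'x'' expands without precedence surprises.  */
#define IS_R5RS_DELIM(c)                                        \
  ((c) == ' ' || (c) == '\t' || (c) == '\n' || (c) == '\r'      \
   || (c) == '\f' || (c) == '(' || (c) == ')' || (c) == ';'     \
   || (c) == '"')

/* The option is consulted only once the character is known to be
   '[' or ']', so the common case never reads it.  */
#define IS_DELIM(c)                                             \
  (IS_R5RS_DELIM (c)                                            \
   || (((c) == '[' || (c) == ']') && square_brackets_enabled))
```

Like most function-like macros, these evaluate their argument more than once, so the argument should be free of side effects.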
* libguile/arrays.c (read_decimal_integer): Move to read.c.
(scm_i_read_array): Remove. Incorporate the code into the
'scm_read_array' static function in read.c.
* libguile/arrays.h (scm_i_read_array): Remove prototype.
* libguile/read.c (read_decimal_integer): Move here from arrays.c.
(scm_read_array): Incorporate the code from 'scm_i_read_array'. Call
'scm_read_vector' and 'scm_read_sexp' instead of 'scm_read'.
According to the new benchmarks, this leads to a 5% speed improvement when
reading small strings, and a 27% improvement when reading large strings.
* libguile/read.c (READER_STRING_BUFFER_SIZE): Change to 128; update
comment to mention codepoints.
(scm_read_string): Make `str' a list of strings, instead of a string.
Store characters read in buffer `c_str'. Cons to STR when C_STR is
full, and concatenate/reverse at the end.
* benchmark-suite/benchmarks/read.bm (small, large): New variables.
Set %DEFAULT-PORT-ENCODING to "UTF-8".
("read")["small strings", "large strings"]: New benchmarks.
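The buffering strategy in this entry (fill a fixed-size buffer, push it onto a list when full, and concatenate in order at the end) can be sketched in plain C. This is only an illustration under assumptions: the `accumulator` and `chunk` types and the `acc_*` functions are hypothetical, the sketch stores bytes rather than codepoints, and it uses `malloc` where the real reader works with Scheme strings.

```c
#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE 128   /* same value as the new READER_STRING_BUFFER_SIZE */

struct chunk
{
  struct chunk *next;
  size_t len;
  char data[CHUNK_SIZE];
};

struct accumulator
{
  struct chunk *chunks;   /* full chunks, most recent first */
  size_t total;           /* bytes stored in `chunks' */
  char buf[CHUNK_SIZE];   /* current, partially filled buffer */
  size_t used;
};

static void
acc_init (struct accumulator *acc)
{
  acc->chunks = NULL;
  acc->total = 0;
  acc->used = 0;
}

/* Push the current buffer onto the front of the chunk list.  */
static int
acc_flush (struct accumulator *acc)
{
  struct chunk *c = malloc (sizeof *c);
  if (c == NULL)
    return -1;
  memcpy (c->data, acc->buf, acc->used);
  c->len = acc->used;
  c->next = acc->chunks;
  acc->chunks = c;
  acc->total += acc->used;
  acc->used = 0;
  return 0;
}

/* Add one character, flushing first if the buffer is full.  */
static int
acc_add (struct accumulator *acc, char ch)
{
  if (acc->used == CHUNK_SIZE && acc_flush (acc) != 0)
    return -1;
  acc->buf[acc->used++] = ch;
  return 0;
}

/* Concatenate everything in input order and free the chunk list.
   Returns a malloc'd NUL-terminated string, or NULL on failure.  */
static char *
acc_finish (struct accumulator *acc)
{
  size_t total = acc->total + acc->used;
  size_t end = acc->total;
  char *result = malloc (total + 1);
  struct chunk *c = acc->chunks;

  if (result != NULL)
    {
      memcpy (result + acc->total, acc->buf, acc->used);
      result[total] = '\0';
    }
  /* The list is newest-first, so fill the result from the back.  */
  while (c != NULL)
    {
      struct chunk *next = c->next;
      end -= c->len;
      if (result != NULL)
        memcpy (result + end, c->data, c->len);
      free (c);
      c = next;
    }
  acc_init (acc);
  return result;
}
```

Each character costs one store into the local buffer, and the final string is allocated once at its exact size. That is the kind of saving the benchmark figures above refer to, though the numbers themselves apply only to the real implementation.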
* libguile/read.c (read_token): Remove unneeded `const' before `size_t'.
(read_complete_token): Remove `overflow_buffer' parameter; return
`char *' instead of `int'. Allocate the overflow buffer with
`scm_gc_malloc_pointerless' instead of `scm_malloc'. Return either
the overflow buffer or BUFFER.
(scm_read_number, scm_read_mixed_case_symbol,
scm_read_number_and_radix): Rename `buffer' to `local_buffer', and
`overflow_buffer' to `buffer'. Remove `overflow'. Adjust code to new
`read_complete_token'.
* libguile/read.c (scm_read_number): Set source properties on
non-immediate numbers if the 'positions' reader option is set.
* doc/ref/api-debug.texi (Source Properties): Update manual.
* libguile/read.c (scm_read_array): New internal helper that
calls scm_i_read_array and sets its source property if the
'positions' reader option is set.
(scm_read_string): Set source properties on strings if the 'positions'
reader option is set.
(scm_read_vector, scm_read_srfi4_vector, scm_read_bytevector,
scm_read_guile_bit_vector, scm_read_sharp): Add new arguments for the
'line' and 'column' of the first character of the datum being read.
Set source properties if the 'positions' reader option is set.
(scm_read_expression): Pass 'line' and 'column' to scm_read_sharp.
* doc/ref/api-debug.texi (Source Properties): Update manual.
* libguile/read.c (scm_read_string): Return a freshly allocated string
every time, even for empty strings. The motivation is to allow source
properties to be added to all strings. Previously, the shared global
'scm_nullstr' was returned for empty strings. Note that empty strings
still share a common global 'null_stringbuf'.
* test-suite/tests/srfi-13.test (substring/shared): Fix tests to reflect
the fact that empty string literals are no longer guaranteed to be
'eq?' to each other.
* libguile/read.c (scm_read_r6rs_block_comment):
* test-suite/tests/reader.test ("reading"): Fix reading of #||||#,
originally reported in bug debbugs.gnu.org/9672, by Bruno Haible.
Thanks, Bruno!
* libguile/read.c (scm_read_sexp): Don't confuse `#{.}#' with `.' for
the purpose of reading dotted pairs. Thanks to CRLF0710 for the
report.
* test-suite/tests/reader.test ("#{}#"): Add test.
* libguile/srcprop.h: Remove internal scm_source_whash declaration.
* libguile/srcprop.c (scm_i_set_source_properties_x)
(scm_i_has_source_properties): New helpers.
(scm_source_whash): Make static.
* libguile/read.c (scm_read_sexp): Remove register declarations here;
let's trust the compiler. Remove code to incrementally build up a
copy; instead let's let scm_i_set_source_properties_x handle copying
the expression if needed.
(scm_read_quote, scm_read_syntax): Use scm_i_set_source_properties_x.
(recsexpr): Remove this helper from 1996.
(scm_read_sharp_extension): Instead of trying to recursively label
sharp-read subforms with source properties, just label the outside
form and rely on the macro-expander to propagate it down.
* libguile/numbers.c (scm_logand): Fix a type error (comparing a SCM
against an int, when we really wanted to compare the unpacked
fixnum).
* libguile/ports.c (scm_i_set_conversion_strategy_x): Check
scm_conversion_strategy_init, not scm_conversion_strategy.
* libguile/read.c (recsexpr): Fix loops to avoid strange test of SCM
values.
* libguile/tags.h (SCM_MAKE_ITAG8_BITS): New helper, produces a
scm_t_bits instead of a SCM, because SCM_UNPACK is not a constant
expression with SCM_DEBUG_TYPING_STRICTNESS==2.
(SCM_MAKIFLAG_BITS): Remove SCM_MAKIFLAG, and replace with this, which
returns bits.
(SCM_BOOL_F_BITS, SCM_ELISP_NIL_BITS, SCM_EOL_BITS, SCM_BOOL_T_BITS):
(SCM_UNSPECIFIED_BITS, SCM_UNDEFINED_BITS, SCM_EOF_VAL_BITS):
(SCM_UNBOUND_BITS): New definitions. Define SCM_BOOL_F, etc. in terms
of them.
(SCM_XXX_ANOTHER_BOOLEAN_DONT_USE_0):
(SCM_XXX_ANOTHER_BOOLEAN_DONT_USE_1):
(SCM_XXX_ANOTHER_BOOLEAN_DONT_USE_2):
(SCM_XXX_ANOTHER_LISP_FALSE_DONT_USE): Be bits instead of SCM values.
(SCM_BITS_DIFFER_IN_EXACTLY_ONE_BIT_POSITION):
(SCM_BITS_DIFFER_IN_EXACTLY_TWO_BIT_POSITIONS): Rename from
SCM_VALUES_DIFFER_..., and take unpacked bits as the args.
* libguile/boolean.c: Update verify block to use
SCM_BITS_DIFFER_IN_EXACTLY_TWO_BIT_POSITIONS et al.
* libguile/debug.c (scm_debug_opts):
* libguile/print.c (scm_print_opts):
* libguile/read.c (scm_read_opts): Use iflags bits for initializers.
* libguile/hash.c (scm_hasher): Use _BITS for iflags as case labels.
* libguile/pairs.c: Nil/null compile-time check uses
SCM_ELISP_NIL_BITS.
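The reason for working on raw bits is that such checks stay usable in constant expressions. A minimal sketch of how a "differ in exactly N bit positions" test can be written follows; the macro names mirror the entry without the SCM_ prefix, the `EXAMPLE_*` values are illustrative only, and the definitions are an assumption about one reasonable implementation rather than a copy of tags.h.

```c
/* A nonzero value has exactly one bit set when clearing its lowest
   set bit leaves zero.  */
#define HAS_EXACTLY_ONE_BIT_SET(x) \
  ((x) != 0 && ((x) & ((x) - 1)) == 0)

/* A value has exactly two bits set when clearing its lowest set bit
   leaves exactly one bit.  */
#define HAS_EXACTLY_TWO_BITS_SET(x) \
  HAS_EXACTLY_ONE_BIT_SET ((x) & ((x) - 1))

/* XOR yields a 1 in every position where the two words differ.  */
#define BITS_DIFFER_IN_EXACTLY_ONE_BIT_POSITION(a, b) \
  HAS_EXACTLY_ONE_BIT_SET ((a) ^ (b))
#define BITS_DIFFER_IN_EXACTLY_TWO_BIT_POSITIONS(a, b) \
  HAS_EXACTLY_TWO_BITS_SET ((a) ^ (b))

/* Illustrative tag values only; not taken from tags.h.  */
#define EXAMPLE_FALSE_BITS 0x004u
#define EXAMPLE_NIL_BITS   0x104u
#define EXAMPLE_EOL_BITS   0x304u

/* Compile-time checks: the array size is negative, and compilation
   fails, if a condition does not hold.  */
typedef char example_false_nil_check
  [BITS_DIFFER_IN_EXACTLY_ONE_BIT_POSITION (EXAMPLE_FALSE_BITS,
                                            EXAMPLE_NIL_BITS) ? 1 : -1];
typedef char example_false_eol_check
  [BITS_DIFFER_IN_EXACTLY_TWO_BIT_POSITIONS (EXAMPLE_FALSE_BITS,
                                             EXAMPLE_EOL_BITS) ? 1 : -1];
```

Because the arguments are plain unsigned integers, the same macros also work as array sizes, as here, or in `case` labels.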
* libguile/deprecated.h:
* libguile/deprecated.c (scm_whash_get_handle, SCM_WHASHFOUNDP)
(SCM_WHASHREF, SCM_WHASHSET, scm_whash_create_handle)
(scm_whash_lookup, scm_whash_insert): Deprecate this API.
* libguile/srcprop.c:
* libguile/srcprop.h:
* libguile/read.c (scm_read_sexp): Use the hashq API instead of the
whash API.
* libguile/read.c (scm_read_extended_symbol): Interpret '\' as an escape
character. Due to some historical oddities we have to support '\'
before any character, but since we never emitted '\' in front of
"normal" characters like 'x' we can interpret "\x..;" to be an R6RS
hex escape.
* test-suite/tests/reader.test ("#{}#"): Add tests.
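For illustration, parsing the body of an R6RS-style "\x..;" hex escape might look like the sketch below. The function `parse_hex_escape` is hypothetical and is not the code in scm_read_extended_symbol. It assumes the caller has already consumed the backslash and the 'x', and it rejects values above U+10FFFF but does not check for surrogates.

```c
#include <stddef.h>

/* Parse the digits of an escape of the form "\xHH...;".  S points just
   past the "\x".  On success, store the code point in *CP and return
   the number of characters consumed, including the ';'.  Return 0 if
   the text is not a well-formed escape.  */
static size_t
parse_hex_escape (const char *s, unsigned long *cp)
{
  unsigned long value = 0;
  size_t i = 0;

  for (;; i++)
    {
      char c = s[i];
      unsigned digit;

      if (c >= '0' && c <= '9')
        digit = (unsigned) (c - '0');
      else if (c >= 'a' && c <= 'f')
        digit = (unsigned) (c - 'a') + 10;
      else if (c >= 'A' && c <= 'F')
        digit = (unsigned) (c - 'A') + 10;
      else
        break;

      value = value * 16 + digit;
      if (value > 0x10FFFF)     /* beyond the Unicode range */
        return 0;
    }

  /* At least one digit is required, and the escape must end in ';'.  */
  if (i == 0 || s[i] != ';')
    return 0;

  *cp = value;
  return i + 1;
}
```

A reader following the approach in this entry would fall back to treating '\' as a plain escape for the next character when the hex form does not apply. The sketch signals that case by returning 0.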
* libguile/read.c (scm_read_sharp): Move the "#c..." case outside of
#if SCM_ENABLE_DEPRECATED, and to the same section which handles
"#s...", "#u..." and "#f...".
Thanks to Andreas Rottmann <a.rottmann@gmx.at> for the bug report.
* libguile/read.c (scm_i_scan_for_encoding): Fix for coding on first
line #! and for !# immediately following the coding.
* test-suite/Makefile.am:
* test-suite/tests/coding.test: Add tests.
* libguile/read.c (scm_i_scan_for_encoding): If possible, just use the
read buffer for the encoding scan, and avoid seeking. Fixes
`(open-input-file "/dev/urandom")', because /dev/urandom does not
support seeking backwards.
* libguile/read.c (scm_read_scsh_block_comment): Use `scm_getc' instead
of `scm_get_byte_or_eof'.
* test-suite/tests/reader.test ("read-options")["position of SCSH block
comment"]: New test.