Reported by Massimiliano Gubinelli <m.gubinelli@gmail.com> in
<https://lists.gnu.org/archive/html/guile-user/2019-05/msg00070.html>.
* libguile/strings.c (scm_to_stringn): Check for (encoding == NULL)
before passing it to 'c_strcasecmp'. Eliminate redundant 'enc'
variable.
* libguile/bytevectors.c (INTEGER_ACCESSOR_PROLOGUE)
(scm_bytevector_copy_x, bytevector_large_set): Rewrite checks to reliably
detect overflows.
(make_bytevector): Constrain the bytevector length to avoid later
overflows during allocation.
(make_bytevector_from_buffer): Fix indentation.
(scm_bytevector_length): Use 'scm_from_size_t' to convert a 'size_t',
not 'scm_from_uint'.
* libguile/fports.c (fport_seek): Check for overflow before the implicit
conversion of the return value.
* libguile/guardians.c (guardian_print): Use 'scm_from_ulong' to convert
an 'unsigned long', not 'scm_from_uint'.
* libguile/ports.c (scm_unread_string): Change a variable to type 'size_t'.
(scm_seek, scm_truncate_file): Use 'scm_t_off' instead of
'off_t_or_off64_t' to avoid implicit type conversions that could
overflow, because 'ptob->seek' and 'ptob->truncate' use 'scm_t_off'.
* libguile/r6rs-ports.c (bytevector_input_port_seek)
(custom_binary_port_seek, bytevector_output_port_seek): Rewrite offset
calculations to reliably detect overflows. Use 'scm_from_off_t' to
convert a 'scm_t_off', not 'scm_from_long' nor 'scm_from_int'.
(scm_get_bytevector_n_x, scm_get_bytevector_all, scm_unget_bytevector)
(bytevector_output_port_write): Rewrite checks to reliably detect
overflows. Use 'size_t' where appropriate.
(bytevector_output_port_buffer_grow): Rewrite size calculations to
reliably detect overflows. Minor change in the calculation of the new
size: now it is max(min_size, 2*current_size), whereas previously it
would multiply current_size by the smallest power of 2 needed to surpass
min_size.
* libguile/strings.c (make_stringbuf): Constrain the stringbuf length to
avoid later overflows during allocation.
(scm_string_append): Change overflow check to use INT_ADD_OVERFLOW.
* libguile/strports.c (string_port_write): Rewrite size calculations to
reliably detect overflows.
(string_port_seek): Rewrite offset calculations to reliably detect
overflows. Use 'scm_from_off_t' to convert a 'scm_t_off', not
'scm_from_long'.
(string_port_truncate): Use 'scm_from_off_t' to convert a 'scm_t_off',
not 'scm_from_off_t_or_off64_t'.
* libguile/vectors.c (scm_c_make_vector): Change a variable to type
'size_t'.
* libguile/i18n.c (SCM_MAX_ALLOCA): New macro.
(SCM_STRING_TO_U32_BUF): Accept an additional variable to remember
whether we used malloc to allocate the buffer. Use malloc if the
allocation size is greater than SCM_MAX_ALLOCA.
(SCM_CLEANUP_U32_BUF): New macro.
(compare_u32_strings, compare_u32_strings_ci, str_to_case): Adapt.
* libguile/strings.c (SCM_MAX_ALLOCA): New macro.
(normalize_str, unistring_escapes_to_r6rs_escapes): Use malloc if the
allocation size is greater than SCM_MAX_ALLOCA.
* test-suite/tests/i18n.test, test-suite/tests/strings.test: Add tests.
This reverts the change to SCM_MAKE_CHAR made in the previous commit
63818453ad, which used an arithmetic trick
to avoid evaluating its argument more than once.
Here, we restore the previous implementation of SCM_MAKE_CHAR, which
evaluates its argument twice. Instead, we introduce a new inlinable
function 'scm_c_make_char' and replace uses of SCM_MAKE_CHAR with calls
to 'scm_c_make_char' where appropriate.
* libguile/chars.h (scm_c_make_char): New inline function.
* libguile/inline.c: Include chars.h.
* libguile/srfi-13.c (REF_IN_CHARSET, scm_string_any, scm_string_every)
(scm_string_trim, scm_string_trim_right, scm_string_trim_both)
(scm_string_index, scm_string_index_right, scm_string_skip)
(scm_string_skip_right, scm_string_count, string_titlecase_x)
(string_reverse_x, scm_string_fold, scm_string_fold_right)
(scm_string_for_each, scm_string_filter, scm_string_delete):
Use 'scm_c_make_char' instead of 'SCM_MAKE_CHAR' in cases where the
argument calls a function.
* libguile/chars.c (scm_char_upcase, scm_char_downcase, scm_char_titlecase),
libguile/ports.c (scm_port_decode_char),
libguile/print.c (scm_simple_format),
libguile/read.c (scm_read_character),
libguile/strings.c (scm_string_ref, scm_c_string_ref),
Reported by Jeffrey Walton <noloader@gmail.com> in
<https://lists.gnu.org/archive/html/guile-devel/2019-03/msg00001.html>.
Note that C11 section 7.1.4 (Use of library functions) states that:
"unless explicitly stated otherwise in the detailed descriptions [of
library functions] that follow: If an argument to a function has an
invalid value (such as ... a null pointer ...) ..., the behavior is
undefined." Note that 'strxfrm' is an example of a standard C function
that explicitly states otherwise, allowing NULL to be passed in the
first argument if the size argument is zero, but no similar allowance is
specified for 'memcpy' or 'memcmp'.
* libguile/bytevectors.c (scm_uniform_array_to_bytevector): Call memcpy
only if 'byte_len' is non-zero.
* libguile/srfi-14.c (charsets_equal): Call memcmp only if the number of
ranges is non-zero.
* libguile/stime.c (setzone): Pass 1-character buffer to
'scm_to_locale_stringbuf', instead of NULL.
* libguile/strings.c (scm_to_locale_stringbuf): Call memcpy only if the
number of bytes to copy is non-zero.
* libguile/i18n.c (SCM_MAX_ALLOCA): New macro.
(SCM_STRING_TO_U32_BUF): Accept an additional variable to remember
whether we used malloc to allocate the buffer. Use malloc if the
allocation size is greater than SCM_MAX_ALLOCA.
(SCM_CLEANUP_U32_BUF): New macro.
(compare_u32_strings, compare_u32_strings_ci, str_to_case): Adapt.
* libguile/strings.c (SCM_MAX_ALLOCA): New macro.
(normalize_str, unistring_escapes_to_r6rs_escapes): Use malloc if the
allocation size is greater than SCM_MAX_ALLOCA.
* test-suite/tests/i18n.test, test-suite/tests/strings.test: Add tests.
This reverts the change to SCM_MAKE_CHAR made in the previous commit
63818453ad, which used an arithmetic trick
to avoid evaluating its argument more than once.
Here, we restore the previous implementation of SCM_MAKE_CHAR, which
evaluates its argument twice. Instead, we introduce a new inlinable
function 'scm_c_make_char' and replace uses of SCM_MAKE_CHAR with calls
to 'scm_c_make_char' where appropriate.
* libguile/chars.h (scm_c_make_char): New inline function.
* libguile/inline.c: Include chars.h.
* libguile/srfi-13.c (REF_IN_CHARSET, scm_string_any, scm_string_every)
(scm_string_trim, scm_string_trim_right, scm_string_trim_both)
(scm_string_index, scm_string_index_right, scm_string_skip)
(scm_string_skip_right, scm_string_count, string_titlecase_x)
(string_reverse_x, scm_string_fold, scm_string_fold_right)
(scm_string_for_each, scm_string_filter, scm_string_delete):
Use 'scm_c_make_char' instead of 'SCM_MAKE_CHAR' in cases where the
argument calls a function.
* libguile/chars.c (scm_char_upcase, scm_char_downcase, scm_char_titlecase),
libguile/ports.c (scm_port_decode_char),
libguile/print.c (scm_simple_format),
libguile/read.c (scm_read_character),
libguile/strings.c (scm_string_ref, scm_c_string_ref),
libguile/vm-engine.c ("string-ref"): Ditto.
Reported by Jeffrey Walton <noloader@gmail.com> in
<https://lists.gnu.org/archive/html/guile-devel/2019-03/msg00001.html>.
Note that C11 section 7.1.4 (Use of library functions) states that:
"unless explicitly stated otherwise in the detailed descriptions [of
library functions] that follow: If an argument to a function has an
invalid value (such as ... a null pointer ...) ..., the behavior is
undefined." Note that 'strxfrm' is an example of a standard C function
that explicitly states otherwise, allowing NULL to be passed in the
first argument if the size argument is zero, but no similar allowance is
specified for 'memcpy' or 'memcmp'.
* libguile/bytevectors.c (scm_uniform_array_to_bytevector): Call memcpy
only if 'byte_len' is non-zero.
* libguile/srfi-14.c (charsets_equal): Call memcmp only if the number of
ranges is non-zero.
* libguile/stime.c (setzone): Pass 1-character buffer to
'scm_to_locale_stringbuf', instead of NULL.
* libguile/strings.c (scm_to_locale_stringbuf): Call memcpy only if the
number of bytes to copy is non-zero.
As the FSF advises, 'There is no legal significance to using the
three-character sequence “(C)”, but it does no harm.' It does take up
space though! For that reason, we remove it here from our C files.
* libguile/array-handle.c (initialize_vector_handle): Add mutable_p
argument. Unless the vector handle is mutable, null out its
writable_elements member.
(scm_array_get_handle): Adapt to determine mutability of the various
arrays.
(scm_array_handle_elements, scm_array_handle_writable_elements):
Reverse the sense: instead of implementing read-only in terms of
read-write, go the other way around, adding an assertion in the
read-write case that the array handle is mutable.
* libguile/array-map.c (racp): Assert that the destination is mutable.
* libguile/bitvectors.c (SCM_F_BITVECTOR_IMMUTABLE, IS_BITVECTOR):
(IS_MUTABLE_BITVECTOR): Add a flag to indicate immutability.
(scm_i_bitvector_bits): Fix indentation.
(scm_i_is_mutable_bitvector): New helper.
(scm_array_handle_bit_elements)
((scm_array_handle_bit_writable_elements): Build writable_elements in
terms of elements.
(scm_bitvector_elements, scm_bitvector_writable_elements): Likewise.
(scm_c_bitvector_set_x): Require a mutable bitvector for the
fast-path.
(scm_bitvector_to_list, scm_bit_count): Use read-only elements()
function.
* libguile/bitvectors.h (scm_i_is_mutable_bitvector): New decl.
* libguile/bytevectors.c (INTEGER_ACCESSOR_PROLOGUE):
(INTEGER_GETTER_PROLOGUE, INTEGER_SETTER_PROLOGUE):
(INTEGER_REF, INTEGER_NATIVE_REF, INTEGER_SET, INTEGER_NATIVE_SET):
(GENERIC_INTEGER_ACCESSOR_PROLOGUE):
(GENERIC_INTEGER_GETTER_PROLOGUE, GENERIC_INTEGER_SETTER_PROLOGUE):
(LARGE_INTEGER_NATIVE_REF, LARGE_INTEGER_NATIVE_SET):
(IEEE754_GETTER_PROLOGUE, IEEE754_SETTER_PROLOGUE):
(IEEE754_REF, IEEE754_NATIVE_REF, IEEE754_SET, IEEE754_NATIVE_SET):
Setters require a mutable bytevector.
(SCM_BYTEVECTOR_SET_FLAG): New helper.
(SCM_BYTEVECTOR_SET_CONTIGUOUS_P, SCM_BYTEVECTOR_SET_ELEMENT_TYPE):
Remove helpers.
(SCM_VALIDATE_MUTABLE_BYTEVECTOR): New helper.
(make_bytevector, make_bytevector_from_buffer): Use
SCM_SET_BYTEVECTOR_FLAGS.
(scm_c_bytevector_set_x, scm_bytevector_fill_x)
(scm_bytevector_copy_x): Require a mutable bytevector.
* libguile/bytevectors.h (SCM_F_BYTEVECTOR_CONTIGUOUS)
(SCM_F_BYTEVECTOR_IMMUTABLE, SCM_MUTABLE_BYTEVECTOR_P): New
definitions.
* libguile/bytevectors.h (SCM_BYTEVECTOR_CONTIGUOUS_P): Just access one
bit.
* libguile/srfi-4.c (DEFINE_SRFI_4_C_FUNCS): Implement
writable_elements() in terms of elements().
* libguile/strings.c (scm_i_string_is_mutable): New helper.
* libguile/uniform.c (scm_array_handle_uniform_elements):
(scm_array_handle_uniform_writable_elements): Implement
writable_elements in terms of elements.
* libguile/vectors.c (SCM_VALIDATE_MUTABLE_VECTOR): New helper.
(scm_vector_elements, scm_vector_writable_elements): Implement
writable_elements in terms of elements.
(scm_c_vector_set_x): Require a mutable vector.
* libguile/vectors.h (SCM_F_VECTOR_IMMUTABLE, SCM_I_IS_MUTABLE_VECTOR):
New definitions.
* libguile/vm-engine.c (VM_VALIDATE_MUTABLE_BYTEVECTOR):
(VM_VALIDATE_MUTABLE_VECTOR, vector-set!, vector-set!/immediate)
(BV_BOUNDED_SET, BV_SET): Require mutable bytevector/vector.
* libguile/vm.c (vm_error_not_a_mutable_bytevector):
(vm_error_not_a_mutable_vector): New definitions.
* module/system/vm/assembler.scm (link-data): Mark residualized vectors,
bytevectors, and bitvectors as being read-only.
* libguile/strings.c (STRINGBUF_SET_MUTABLE): New helper.
(scm_i_string_ensure_mutable_x): Use new helper.
(scm_make_string): Mark stringbuf as mutable.
* libguile/snarf.h (SCM_IMMUTABLE_STRINGBUF): Remove shared flag.
Stringbufs are immutable by default.
* libguile/strings.c: Rewrite blurb. Change to have stringbufs be
immutable by default and mutable only when marked as such. Going
mutable means making a private copy.
(STRINGBUF_MUTABLE, STRINGBUF_F_MUTABLE): New definitions.
(SET_STRINGBUF_SHARED): Remove.
(scm_i_print_stringbuf): Simplify to just alias the stringbuf as-is.
(substring_with_immutable_stringbuf): New helper.
(scm_i_substring, scm_i_substring_read_only, scm_i_substring_copy):
use new helper.
(scm_i_string_ensure_mutable_x): New helper.
(scm_i_substring_shared): Use scm_i_string_ensure_mutable_x.
(stringbuf_write_mutex): Remove; yaaaaaaaay.
(scm_i_string_start_writing): Use scm_i_string_ensure_mutable_x. No
more mutex.
(scm_i_string_stop_writing): Now a no-op.
(scm_i_make_symbol): Use substring/copy.
(scm_sys_string_dump, scm_sys_symbol_dump): Update.
* libguile/strings.h (SCM_I_STRINGBUF_F_SHARED): Remove.
(SCM_I_STRINGBUF_F_MUTABLE): Add.
* module/system/vm/assembler.scm (link-data): Don't add shared flag any
more. Existing compiled flags are harmless tho.
* test-suite/tests/strings.test ("string internals"): Update.
* libguile/root.h:
* libguile/root.c: Remove these files.
* libguile/deprecated.h:
* libguile/deprecated.c (scm_internal_cwdr, scm_call_with_dynamic_root)
(scm_dynamic_root, scm_apply_with_dynamic_root): Deprecate.
Remove all root.h usage, which was vestigial.
* module/ice-9/serialize.scm: Use (current-thread) instead
of (dynamic-root).
* libguile/strings.c (scm_from_port_stringn, scm_to_port_stringn):
Access the conversion_strategy directly to make sure these functions
can run while the port is flushing on close. Fixes web-http.test to
allow the closed flag to be atomically cleared on a port before
flushing.
* libguile/ports.h (scm_t_port): Represent the conversion strategy as a
symbol, to make things easier for Scheme. Rename to
"conversion_strategy".
(scm_c_make_port_with_encoding): Change to take encoding and
conversion_strategy arguments as symbols.
(scm_i_string_failed_conversion_handler): New internal helper, to turn
a symbol to a scm_t_string_failed_conversion_handler.
(scm_i_default_port_encoding): Return the default port encoding as a
symbol.
(scm_i_default_port_conversion_strategy)
(scm_i_set_default_port_conversion_strategy): Rename from
scm_i_default_port_conversion_handler et al. Take and return Scheme
symbols.
* libguile/foreign.c (scm_string_to_pointer, scm_pointer_to_string): Use
scm_i_default_string_failed_conversion_handler instead of
scm_i_default_port_conversion_handler.
* libguile/print.c (PORT_CONVERSION_HANDLER): Update definition.
(print_normal_symbol): Use PORT_CONVERSION_HANDLER.
* libguile/r6rs-ports.c (make_bytevector_input_port):
(make_custom_binary_input_port, make_bytevector_output_port): Adapt to
changes in scm_c_make_port_with_encoding.
* libguile/strings.h:
* libguile/strings.c (scm_i_default_string_failed_conversion_handler):
New helper.
(scm_from_locale_stringn, scm_from_port_stringn):
(scm_to_locale_stringn, scm_to_port_stringn): Adapt to interface
changes.
* libguile/strports.c (scm_mkstrport): Adapt to
scm_c_make_port_with_encoding change.
* libguile/ports.c (scm_c_make_port): Adapt to
scm_c_make_port_with_encoding change.
(ascii_toupper, encoding_matches, canonicalize_encoding): Move down in
the file.
(peek_codepoint, get_codepoint, scm_ungetc): Adapt to port conversion
strategy change. Remove duplicate case in get_codepoint.
(scm_init_ports): Move symbol initializations to the same place.
* libguile/ports-internal.h (scm_t_port_internal): Remove encoding_mode
member.
* libguile/ports.h (scm_t_port): "encoding" member is now a SCM symbol.
* libguile/ports.c (scm_init_ports): Define symbols for the encodings
that we handle explicitly.
(encoding_matches): Adapt to check against an encoding as a symbol.
(canonicalize_encoding): Return an encoding as a symbol.
(scm_c_make_port_with_encoding, scm_i_set_default_port_encoding)
(decide_utf16_encoding, decide_utf32_encoding)
(scm_i_port_iconv_descriptors, scm_i_set_port_encoding_x)
(scm_port_encoding, peek_codepoint, scm_ungetc): Adapt to encoding
change.
* libguile/print.c (display_string_using_iconv, display_string):
* libguile/read.c (scm_read_character):
* libguile/strings.c (scm_from_port_stringn, scm_to_port_stringn): Adapt
to port encoding change.
* libguile/strings.c (scm_from_utf8_stringn): Use 'u8_mbtoucr' and check
for a decoding error by its 'nbytes' return value. Previously we used
'u8_mbtouc' and improperly assumed that a U+FFFD character indicated a
decoding error.
* libguile/symbols.c (utf8_string_equals_wide_string): Likewise.
* test-suite/tests/bytevectors.test (exception:decoding-error): New
variable.
("2.9 Operations on Strings"): Add tests.
* libguile/array-handle.h (scm_t_vector_ref, scm_t_vector_set): Rename
from scm_t_array_ref, scm_t_array_set. These were named
scm_i_t_array_ref and scm_i_t_array_set in 1.8 and 2.0. Change to
take the vector directly, instead of the array handle. In this way,
generic array handles are layered on top of specific implementations
of backing stores.
Remove scm_t_array_implementation, introduced in 2.0 but never
documented. It was a failed attempt to layer the array implementation
that actually introduced too many layers, as it prevented the "vref"
and "vset" members of scm_t_array_handle (called "ref" and "set" in
1.8, not present in 2.0) from specializing on array backing stores.
(scm_i_register_array_implementation) (scm_i_array_implementation_for_obj):
Remove these internal interfaces.
(scm_t_array_handle): Adapt to scm_t_vector_ref / scm_t_vector_set
change.
(scm_array_handle_ref, scm_array_handle_set): Adapt to change in
vref/vset prototype.
* libguile/array-handle.c (scm_array_get_handle): Inline all the
necessary initializations here for all specific array types.
* libguile/array-map.c (rafill, racp, ramap, rafe, array_index_map_1):
* libguile/arrays.c: Remove array implementation code.
* libguile/bitvectors.h:
* libguile/bitvectors.c: Remove array implementation code.
(scm_i_bitvector_bits): New internal interface.
* libguile/bytevectors.c: Remove array implementation code.
* libguile/srfi-4.h: Remove declarations for internal procedures that
don't exist (!).
* libguile/strings.c: Remove array implementation code.
* libguile/vectors.c: Remove array implementation code.
* libguile/strings.h:
* libguile/strings.c (scm_i_print_stringbuf):
* libguile/print.c (iprin1): Add a printer for stringbufs. The
disassembler can print a stringbuf.
* libguile/strings.c (scm_string_append): Check for numerical overflow
while computing the length of the result. Double-check that we don't
overflow the result string, and that it is the correct length in the
end (in case another thread changed the list). When copying a narrow
string to a wide result, avoid calling 'scm_i_string_length' and
'scm_i_string_chars' on each character.