* libguile/strings.c (scm_from_stringn): Always return a freshly
allocated string from scm_from_stringn, even when asked to construct
the null string, in accordance with the R5RS. Previously, we
optimized the null string case by returning a reference to a global
null string object (scm_nullstr).
* libguile/strings.c (scm_i_is_narrow_string, scm_i_try_narrow_string,
scm_i_string_set_x): Check to see if the provided string is a
mutation-sharing substring, and do the right thing in that case.
Previously, if such a string was passed to these functions, they would
behave very badly: while trying to fetch and/or mutate the cell
containing the stringbuf, they were actually fetching or mutating the
cell containing the original shared string. That's because
mutation-sharing substrings store the original string in CELL_1,
whereas all other strings store the stringbuf there.
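The CELL_1 confusion described above can be illustrated with a small
standalone sketch. Every name below (sketch_string, sketch_stringbuf,
sketch_string_buf) is hypothetical, and the layout is deliberately
simplified; it is not the actual representation in libguile/strings.c.
It only shows why an accessor has to follow the extra indirection
before touching the stringbuf.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified layout; not Guile's real string cells. */
enum sketch_kind { SKETCH_STRING, SKETCH_SHARED_SUBSTRING };

struct sketch_stringbuf
{
  int is_wide;
  size_t len;
};

struct sketch_string
{
  enum sketch_kind kind;
  /* For an ordinary string this points at the stringbuf.  For a
     mutation-sharing substring it points at the original string.  */
  void *cell_1;
  size_t start, length;
};

/* Return the stringbuf, following the indirection through the
   original string when S is a mutation-sharing substring.  */
static struct sketch_stringbuf *
sketch_string_buf (struct sketch_string *s)
{
  if (s->kind == SKETCH_SHARED_SUBSTRING)
    s = (struct sketch_string *) s->cell_1;
  assert (s->kind == SKETCH_STRING);
  return (struct sketch_stringbuf *) s->cell_1;
}

/* A query that would read the wrong cell without the check above.  */
static int
sketch_is_narrow (struct sketch_string *s)
{
  return !sketch_string_buf (s)->is_wide;
}
```

Without the kind check, sketch_is_narrow would reinterpret the original
string as a stringbuf, which is the kind of misbehavior the entry
describes.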
* libguile/strings.c (scm_init_strings): Make scm_nullstr mutable. It
is still usable as a common object, because of course it contains no
characters to mutate anyway. It is returned by several procedures
that are specified to return mutable strings, and string mutators
raise errors when passed an immutable string, even if it is the null
string.
* libguile/strings.c (decoding_error): Factor out of scm_from_stringn,
properly handling errno.
(scm_from_stringn): Adapt.
(scm_from_utf8_stringn): Inline the conversion here, to avoid going
through iconv.
* libguile/tags.h (SCM_UNPACK_POINTER, SCM_PACK_POINTER): New macros.
The old SCM2PTR and PTR2SCM were defined in such a way that
round-tripping through a pointer could lose precision, even when you
had no interest in dereferencing the pointer and simply needed to
plumb a SCM through APIs that take pointers. These new macros are
more like SCM_PACK and SCM_UNPACK, but for pointer types. The bit
representation of the pointer should be the same as the scm_t_bits
representation.
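As a rough illustration of the idea, here is a standalone sketch of
bit-preserving pack/unpack helpers. The names sketch_bits, sketch_scm,
and the sketch_* functions are hypothetical stand-ins for scm_t_bits,
SCM, and the new macros; the real definitions live in libguile/tags.h
and may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the definitions in libguile/tags.h.  */
typedef uintptr_t sketch_bits;                  /* role of scm_t_bits */
typedef struct { sketch_bits n; } sketch_scm;   /* role of SCM */

static inline sketch_scm
sketch_pack (sketch_bits b)
{
  sketch_scm x;
  x.n = b;
  return x;
}

static inline sketch_bits
sketch_unpack (sketch_scm x)
{
  return x.n;
}

/* Pointer flavors: the bits stay the same, only the C type changes.
   This assumes that converting between uintptr_t and void * keeps
   the bit pattern, which holds on mainstream platforms but is
   implementation-defined in ISO C.  */
static inline void *
sketch_unpack_pointer (sketch_scm x)
{
  return (void *) sketch_unpack (x);
}

static inline sketch_scm
sketch_pack_pointer (void *p)
{
  return sketch_pack ((sketch_bits) p);
}

/* An API that only accepts void *, such as callback user data.  The
   value is never dereferenced, only carried through.  */
static sketch_bits
sketch_callback (void *data)
{
  return sketch_unpack (sketch_pack_pointer (data));
}
```

The point of the design is that a tagged value that is not a valid
address can still pass through a void * parameter and come back
unchanged.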
* libguile/gc.h (PTR2SCM, SCM2PTR): Remove support for (old) UNICOS
pointers. We are going to try tagging the SCM object itself in the
future, and I don't think that keeping this support is worth its
cost. It probably doesn't work anyway.
* libguile/backtrace.c:
* libguile/bytevectors.c:
* libguile/continuations.c:
* libguile/fluids.c:
* libguile/foreign.c:
* libguile/gc.h:
* libguile/guardians.c:
* libguile/hashtab.c:
* libguile/load.c:
* libguile/numbers.c:
* libguile/ports.c:
* libguile/smob.c:
* libguile/strings.c:
* libguile/symbols.c:
* libguile/vm.c:
* libguile/weak-set.c:
* libguile/weak-table.c:
* libguile/weak-vector.c: Update many sites to use the new macros.
This was a pretty big merge involving a fair amount of porting,
especially to peval and its tests. I did not update psyntax-pp.scm;
that comes in the next commit.
Conflicts:
module/ice-9/boot-9.scm
module/ice-9/psyntax-pp.scm
module/language/ecmascript/compile-tree-il.scm
module/language/tree-il.scm
module/language/tree-il/analyze.scm
module/language/tree-il/inline.scm
test-suite/tests/tree-il.test
* libguile/strings.c (scm_to_latin1_stringn): Fix for substrings.
* test-suite/standalone/Makefile.am:
* test-suite/standalone/test-scm-to-latin1-string.c: Add test case.
Thanks to David Hansen for the bug report and test case, and Stefan
Israelsson Tampe for the fix.
* libguile/bytevectors.h:
* libguile/bytevectors.c (scm_c_take_gc_bytevector): Rename this
internal function, from scm_c_take_bytevector. This indicates that
unlike the other scm_take_* functions, this one takes GC-managed
memory.
* libguile/objcodes.c (scm_objcode_to_bytecode):
* libguile/vm.c (really_make_boot_program): Use
scm_gc_malloc_pointerless, not scm_malloc. Thanks to Stefan
Israelsson Tampe!
* libguile/r6rs-ports.c:
* libguile/strings.c: Adapt to renames.
* libguile/strings.c (scm_i_allocate_string_pointers): Encode strings
using the current locale. Previously, Latin-1 was used. Indirectly,
this affects the encoding of strings in `system*', `execl', `execlp',
`execle', `environ', and `dynamic-args-call'.
(scm_makfromstrs): In header comment, clarify that the C strings are
interpreted according to the current locale encoding.
* NEWS: Add NEWS entry.
* libguile/inline.h:
* libguile/deprecated.h:
* libguile/deprecated.c (scm_immutable_cell, scm_immutable_double_cell):
Deprecate these, as the GC_STUBBORN API doesn't do anything any more.
* libguile/strings.c (scm_i_c_make_symbol): Change the one use of
scm_immutable_double_cell to scm_double_cell.
* libguile/async.c:
* libguile/async.h:
* libguile/debug.h:
* libguile/deprecated.c:
* libguile/deprecated.h:
* libguile/evalext.h:
* libguile/gc-malloc.c:
* libguile/gc.h:
* libguile/gen-scmconfig.c:
* libguile/numbers.c:
* libguile/ports.c:
* libguile/ports.h:
* libguile/procprop.c:
* libguile/procprop.h:
* libguile/read.c:
* libguile/socket.c:
* libguile/srfi-4.h:
* libguile/strings.c:
* libguile/strings.h:
* libguile/tags.h:
* module/ice-9/boot-9.scm:
* module/ice-9/deprecated.scm: Remove all deprecated code. CPP defines
that were not previously issuing warnings were changed so that their
expansions would indicate the replacement forms to use,
e.g. scm_sizet__GONE__REPLACE_WITH__size_t.
The two exceptions were SCM_LISTN, which did not produce warnings
before, and the string-filter argument order stuff.
Drops the initial dirty memory usage of Guile down to 2.8 MB on my
machine, from 4.4 MB.
* libguile/bytevectors.h (SCM_BYTEVECTOR_HEADER_SIZE): Bump, giving
bytevectors another word: a parent pointer. Will allow for
sub-bytevectors and efficient mmap bindings.
* libguile/bytevectors.c (make_bytevector):
(make_bytevector_from_buffer): Init parent to #f.
(scm_c_take_bytevector, scm_c_take_typed_bytevector): Another
argument, the parent, which gets set in the bytevector.
* libguile/foreign.c (scm_pointer_to_bytevector): Use the parent field
instead of registering a weak reference from bytevector to foreign
pointer.
* libguile/objcodes.c (scm_objcode_to_bytecode): Use the parent field to
avoid copying the objcode.
* libguile/srfi-4.c (DEFINE_SRFI_4_C_FUNCS):
* libguile/strings.c (scm_from_stringn):
* libguile/vm.c (really_make_boot_program):
* libguile/r6rs-ports.c (scm_get_bytevector_some)
(scm_get_bytevector_all, bytevector_output_port_procedure): Set the
parent to #f.
* libguile/strings.c (scm_encoding_error, scm_decoding_error): Use
scm_from_latin1_string for the subr and message args, as these are
internal functions, and we know their callers.
* libguile/strings.c (scm_to_locale_stringn, scm_from_locale_stringn):
Use the encoding of the current locale, not of the current i/o ports.
Also use the current conversion strategy.
* doc/ref/api-data.texi (Conversion to/from C): Update docs.
* libguile/ports.c (scm_read_char): Mention `decoding-error' in the
docstring.
(get_codepoint): Change to return an error code; add `codepoint'
output parameter. Don't raise an error from here.
(scm_getc): Raise an error with `scm_decoding_error' if
`get_codepoint' returns an error.
(scm_peek_char): Likewise. Update docstring.
* libguile/strings.c (scm_decoding_error_key): New variable.
(scm_decoding_error): New function.
(scm_from_stringn): Use `scm_decoding_error' instead of
`scm_encoding_error'.
* libguile/strings.h (scm_decoding_error): New declaration.
* test-suite/tests/ports.test ("string ports")["read-char, wrong
encoding, error"]: Change to expect `decoding-error'. Make sure PORT
points past the error.
["read-char, wrong encoding, escape"]: Likewise.
["peek-char, wrong encoding, error"]: New test.
* test-suite/tests/r6rs-ports.test ("7.2.11 Binary
Output")["put-bytevector with wrong-encoding string port"]: Change to
expect `decoding-error'.
("8.2.6 Input and output ports")["transcoded-port [error handling
mode = raise]"]: Likewise.
* test-suite/tests/rdelim.test ("read-line")["decoding error", "decoding
error, substitute"]: New tests.
* doc/ref/api-io.texi (Reading): Update documentation of `read-char' and
`peek-char'.
(Line/Delimited): Update documentation of `read-line'.
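The split described above, where the decoder reports an error code
through its return value and the caller decides whether to raise, can
be sketched in standalone C. The functions sketch_get_codepoint and
sketch_getc are hypothetical and work on a plain byte buffer rather
than a port; the decoder is a minimal UTF-8 reader that does not
reject overlong forms or surrogates.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one UTF-8 sequence from BUF, which holds LEN > 0 bytes.
   Return 0 and store the result in *CODEPOINT on success, or return
   EILSEQ on malformed input.  *CONSUMED always says how many bytes
   to skip, so the read position ends up past a bad lead byte.  */
static int
sketch_get_codepoint (const unsigned char *buf, size_t len,
                      uint32_t *codepoint, size_t *consumed)
{
  size_t need, i;
  uint32_t c;

  assert (len > 0);
  *consumed = 1;

  if (buf[0] < 0x80)
    {
      *codepoint = buf[0];
      return 0;
    }
  else if ((buf[0] & 0xe0) == 0xc0)
    need = 1, c = buf[0] & 0x1f;
  else if ((buf[0] & 0xf0) == 0xe0)
    need = 2, c = buf[0] & 0x0f;
  else if ((buf[0] & 0xf8) == 0xf0)
    need = 3, c = buf[0] & 0x07;
  else
    return EILSEQ;

  if (len < need + 1)
    return EILSEQ;

  for (i = 1; i <= need; i++)
    {
      if ((buf[i] & 0xc0) != 0x80)
        return EILSEQ;
      c = (c << 6) | (buf[i] & 0x3f);
    }

  *consumed = need + 1;
  *codepoint = c;
  return 0;
}

/* Caller in the style of a getc: this is where an error would be
   raised.  The sketch returns -1 instead of throwing.  */
static long
sketch_getc (const unsigned char *buf, size_t len, size_t *consumed)
{
  uint32_t cp;

  if (sketch_get_codepoint (buf, len, &cp, consumed) != 0)
    return -1;
  return (long) cp;
}
```

Keeping the raise out of the decoder lets each caller choose its own
behavior, for example raising from a read but substituting a
replacement character elsewhere.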
* libguile/strings.c (scm_from_latin1_stringn): Directly return a narrow
string instead of going through `scm_from_stringn'.
(scm_to_latin1_stringn): Directly return a copy of STR's raw bytes when
it's narrow.
* libguile/bytevectors.c:
* libguile/eval.c:
* libguile/goops.c:
* libguile/i18n.c:
* libguile/load.c:
* libguile/memoize.c:
* libguile/modules.c:
* libguile/ports.c:
* libguile/print.c:
* libguile/procs.c:
* libguile/programs.c:
* libguile/read.c:
* libguile/script.c:
* libguile/srfi-14.c:
* libguile/stacks.c:
* libguile/strings.c:
* libguile/throw.c:
* libguile/vm.c: Use scm_from_latin1_symboln to make symbols from string
literals, because they aren't in the user's locale -- they are in
ASCII, and we can optimize this case.
* libguile/vm-i-loader.c: Also use scm_from_latin1_symboln when loading
narrow symbols.
* libguile/strings.h:
* libguile/strings.c (scm_from_latin1_string, scm_to_latin1_string): New
functions, in terms of the latin1_stringn variants.
(scm_from_utf8_string, scm_from_utf8_stringn)
(scm_to_utf8_string, scm_to_utf8_stringn): New functions.
(scm_i_from_utf8_string, scm_i_to_utf8_string): Removed these internal
functions.
(scm_from_stringn): Handle -1 as a length. Unlike the previous
behavior of scm_from_locale_string (NULL), which returned the empty
string, we now raise an error. The null pointer is not the same as
the empty string.
* libguile/stime.c (scm_strftime, scm_strptime): Adapt to publishing of
utf8 functions.
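The length convention described for scm_from_stringn can be sketched
without libguile. The function sketch_from_stringn below is
hypothetical: it copies into a malloc'd C string and reports failure
through an error code, whereas the real function builds a Scheme
string and raises an error.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Copy LEN bytes of STR into a fresh NUL-terminated buffer.  A LEN
   of (size_t) -1 means "measure STR with strlen".  A null STR is an
   error, not an empty string.  On failure return NULL and set *ERR;
   on success set *ERR to 0.  */
static char *
sketch_from_stringn (const char *str, size_t len, int *err)
{
  char *copy;

  *err = 0;

  if (str == NULL)
    {
      *err = EINVAL;
      return NULL;
    }

  if (len == (size_t) -1)
    len = strlen (str);

  copy = malloc (len + 1);
  if (copy == NULL)
    {
      *err = ENOMEM;
      return NULL;
    }

  memcpy (copy, str, len);
  copy[len] = '\0';
  return copy;
}
```

Treating the null pointer as an error keeps a caller's bug from being
silently turned into an empty string.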
* libguile/gc-malloc.c: Add a note that the gc-malloc does not clear the
memory block, so users need to make sure it is initialized.
* libguile/bitvectors.c (scm_c_make_bitvector):
* libguile/bytevectors.c (scm_make_bytevector):
* libguile/strings.c (scm_c_make_string): If no initializer is given,
initialize the bytes to 0. Prevents information leakage if an app uses
make-string et al without initializers.
* libguile/foreign.c (make_cif): Initialize this too, to prevent leakage
in the struct holes. Paranoia...
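A minimal sketch of the zero-initialization policy follows, using
plain malloc rather than Guile's allocators. The names
sketch_make_bytes, sketch_record, and sketch_init_record are
hypothetical and only illustrate the two cases mentioned: buffers
created without an initializer, and structs with padding holes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Allocate LEN bytes.  If FILL is NULL, no initializer was given and
   the bytes are cleared, so stale heap contents never reach the
   caller.  Otherwise every byte is set to *FILL.  */
static unsigned char *
sketch_make_bytes (size_t len, const unsigned char *fill)
{
  unsigned char *p = malloc (len ? len : 1);

  if (p == NULL)
    return NULL;
  memset (p, fill ? *fill : 0, len);
  return p;
}

/* A struct that is likely to have padding between its fields.  */
struct sketch_record
{
  char tag;
  long value;
};

/* Clear the whole object first so the padding holes are zeroed as
   well.  Common compilers leave padding untouched on the member
   stores that follow, though ISO C does not promise that.  */
static void
sketch_init_record (struct sketch_record *r, char tag, long value)
{
  memset (r, 0, sizeof *r);
  r->tag = tag;
  r->value = value;
}
```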
Reported by Mike Gran <spk121@yahoo.com>.
* libguile/strings.c (scm_i_unistring_escapes_to_guile_escapes,
scm_i_unistring_escapes_to_r6rs_escapes): Augment comments.
(scm_to_stringn): When `handler ==
SCM_FAILED_CONVERSION_ESCAPE_SEQUENCE && SCM_R6RS_ESCAPES_P', realloc
BUF so that it's large enough for the worst case.
* libguile/print.c (display_character): When `result != NULL && strategy
== SCM_FAILED_CONVERSION_ESCAPE_SEQUENCE && SCM_R6RS_ESCAPES_P', make
LOCALE_ENCODED large enough to hold an R6RS escape.
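The reason the buffer has to grow can be seen in a small sketch. It
assumes, as an illustration only, that a 6-byte escape of the form
\uXXXX is rewritten as the 7-byte R6RS form \xXXXX; so the output can
be up to 7/6 the size of the input. The helper names are hypothetical,
only the \uXXXX form is handled, and the sketch allocates a new buffer
rather than reallocating in place.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Worst case: every 6-byte "\uXXXX" becomes a 7-byte "\xXXXX;".
   One extra byte leaves room for the terminating NUL.  */
static size_t
sketch_r6rs_worst_case (size_t len)
{
  return (len * 7) / 6 + 1;
}

/* Return a malloc'd copy of IN with each "\uXXXX" rewritten as
   "\xXXXX;", or NULL if allocation fails.  */
static char *
sketch_to_r6rs_escapes (const char *in)
{
  size_t len = strlen (in), i = 0, j = 0;
  size_t cap = sketch_r6rs_worst_case (len);
  char *out = malloc (cap);

  if (out == NULL)
    return NULL;

  while (i < len)
    {
      if (in[i] == '\\' && i + 5 < len && in[i + 1] == 'u'
          && isxdigit ((unsigned char) in[i + 2])
          && isxdigit ((unsigned char) in[i + 3])
          && isxdigit ((unsigned char) in[i + 4])
          && isxdigit ((unsigned char) in[i + 5]))
        {
          out[j++] = '\\';
          out[j++] = 'x';
          memcpy (out + j, in + i + 2, 4);
          j += 4;
          out[j++] = ';';
          i += 6;
        }
      else
        out[j++] = in[i++];
    }

  /* The worst-case bound guarantees room for the terminator.  */
  assert (j < cap);
  out[j] = '\0';
  return out;
}
```

Sizing the buffer for the worst case up front avoids writing past the
end when the escaped form is longer than what the converter produced.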
* doc/ref/api-data.texi: Document scm_to_stringn, scm_from_stringn,
scm_to_latin1_stringn, and scm_from_latin1_stringn.
* libguile/strings.h (scm_to_stringn): Make public.
(scm_to_latin1_stringn): New declaration.
(scm_from_latin1_stringn): New declaration.
* libguile/strings.c (scm_to_latin1_stringn): New function.
(scm_from_latin1_stringn): New function.
* libguile/strings.c (STRINGBUF_CONTENTS): New macro.
(STRINGBUF_CHARS, STRINGBUF_WIDE_CHARS): Use it.
(scm_i_string_data): New function.
* libguile/strings.h (scm_i_string_data): New declaration.
* libguile/strings.c (scm_encoding_error): Change arguments to convey
more information. Raise the error with `scm_throw ()', passing all
the information to the handler.
(scm_from_stringn, scm_to_stringn): Update accordingly.
* test-suite/tests/ports.test ("string ports")["wrong encoding"]: Check
the arguments passed to the `throw' handler.
* test-suite/tests/r6rs-ports.test ("7.2.11 Binary
Output")["put-bytevector with wrong-encoding string port"]: Likewise.
scm_to_stringn failed to do the necessary escape conversion for
R6RS hex escapes.
* libguile/strings.c (unistring_escapes_to_r6rs_escapes): New function.
(scm_to_stringn): Use the new function when R6RS hex escapes are
enabled.
* test-suite/tests/reader.test: New test for string display.
* libguile/strings.c (SCM_ARRAY_IMPLEMENTATION): The mask for the string
array implementation should be 0x7f, without masking out 0x2.
Otherwise numbers were being thought to be vectors!
* test-suite/tests/unif.test: Add test.
* libguile/vectors.c (SCM_ARRAY_IMPLEMENTATION): Only register one
implementation, because weak vectors can be checked with the mask &
~2, and the functions are the same.
* libguile/root.h:
* libguile/root.c (scm_sys_protects): We used to define a special
array of "protected" values. This was always a little silly, but
with the BDW GC it is completely unnecessary. Also, many of these
variables were unused, and none of them were good API. So remove
this array, and either eliminate, make static, or make internal the
various values.
* libguile/snarf.h: No need to generate calls to scm_permanent_object.
* guile-readline/readline.c (scm_init_readline): No need to call
scm_permanent_object.
* libguile/array-map.c (ramap, rafe): Remove the dubious nullvect
optimizations.
* libguile/async.c (scm_init_async): No need to init scm_asyncs, it is
no more.
* libguile/eval.c (scm_init_eval): No need to init scm_listofnull, it is
no more.
* libguile/gc.c: Make scm_protects a static var.
(scm_storage_prehistory): Change the sanity check to use the address
of protects.
(scm_init_gc_protect_object): No need to clear the scm_sys_protects,
as it is no more.
* libguile/keywords.c: Make the keyword obarray a static var.
* libguile/numbers.c: Make flo0 a static var.
* libguile/objprop.c: Make object_whash a static var.
* libguile/properties.c: Make properties_whash a static var.
* libguile/srcprop.h:
* libguile/srcprop.c: Make scm_source_whash a global with internal
linkage.
* libguile/strings.h:
* libguile/strings.c: Make scm_nullstr a global with internal linkage.
* libguile/vectors.c (scm_init_vectors): No need to init scm_nullvect,
it's unused.
The intent is to allow compilation with `-Wundef', which in turn should
make it easier to catch erroneous uses of nonexistent macros.
* libguile/__scm.h: Don't assume `BUILDING_LIBGUILE' is defined.
* libguile/conv-uinteger.i.c (SCM_TO_TYPE_PROTO): Remove unneeded CPP
conditional on `TYPE_MIN == 0'.
* libguile/fports.c: Check for the definition of `HAVE_CHSIZE' and
`HAVE_FTRUNCATE', not for their value.
* libguile/ports.c: Likewise.
* libguile/numbers.c (guile_ieee_init): Likewise with `HAVE_DINFINITY'
and `HAVE_DQNAN'.
* test-suite/standalone/test-conversion.c (ieee_init): Likewise.
* libguile/strings.c: Likewise with `SCM_STRING_LENGTH_HISTOGRAM'.
* libguile/strings.h: Likewise.
* libguile/tags.h: Likewise with `HAVE_INTTYPES_H' and `HAVE_STDINT_H'.
* libguile/threads.c: Likewise with `HAVE_PTHREAD_GET_STACKADDR_NP'.
* libguile/vm-engine.c (VM_NAME): Likewise with `VM_CHECK_IP'.
* libguile/gen-scmconfig.c (main): Use "#ifdef HAVE_", not "#if HAVE_".
* libguile/socket.c (scm_setsockopt): Likewise.
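The difference between testing a macro's definition and testing its
value can be shown in a short sketch. SKETCH_HAVE_FOO and
SKETCH_HAVE_BAR are hypothetical feature macros standing in for the
HAVE_ macros listed above.

```c
#include <assert.h>

#define SKETCH_HAVE_FOO 1
/* SKETCH_HAVE_BAR is deliberately left undefined.  */

static int
sketch_has_foo (void)
{
#ifdef SKETCH_HAVE_FOO
  return 1;
#else
  return 0;
#endif
}

static int
sketch_has_bar (void)
{
  /* Writing `#if SKETCH_HAVE_BAR' here would still compile, because
     an undefined identifier evaluates to 0 in `#if', but it would
     trigger a warning under `-Wundef'.  `#ifdef' asks only whether
     the macro is defined, so it stays quiet.  */
#ifdef SKETCH_HAVE_BAR
  return 1;
#else
  return 0;
#endif
}
```

With every feature test written this way, a `-Wundef' warning that
does appear is more likely to point at a misspelled or nonexistent
macro.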