* test-suite/tests/regexp.test (mysetlocale, set-latin-1): new functions
(with-latin1-locale): removed
(regexp-quote tests): try to print test names in locale but run tests
in ISO-8859-1.
* libguile/threads.c (scm_t_guile_ticket): Remove type.
(resume, scm_enter_guile, suspend, scm_leave_guile): Remove.
(scm_i_init_thread_for_guile): Set `t->top' to NULL, which has the
same effect as calling `scm_enter_guile ()'.
(scm_leave_guile_cleanup, scm_i_with_guile_and_parent): Remove
`scm_leave_guile ()' call.
* doc/ref/api-memory.texi (Garbage Collection
Functions)[scm_gc_protect_object]: Explain that it's equivalent to
storing in a global variable.
* doc/ref/api-smobs.texi (Smobs)[scm_set_smob_free]: Expand on the
relationship with `scm_gc_malloc ()'.
[scm_set_smob_mark]: Explain that it's usually not needed.
* gc-benchmarks/run-benchmark.scm (pretty-print-result)[ref-heap,
ref-time]: New variable.
[distance, score, score-string]: New procedures.
[print-line]: Use `score-string'.
(print-raw-result): New procedure.
(%options)["raw", "load-results"]: New options.
(%default-options): Add `printer' pair.
(show-help): Update.
(main): Add support for `--raw' and `--load-results'.
Since the regex library expects 8-bit clean characters and
an 8-bit locale, tests of 8-bit characters need to occur within
the context of an 8-bit locale.
* test-suite/tests/regexp.test (regexp-quote tests): wrap them in an
ISO-8859-1 locale
This requires separate small fixes.
Readline has internal logic to deal with multi-byte characters, so
it wants bytes, not characters.
scm_c_read gets called by the vm when readline is activated, and it was
truncating multi-byte characters because soft ports didn't have the
UCS-4 capability.
Soft ports need the capability to read UCS-4 characters. Since soft ports
may have a single byte buffer, full characters need to be stored into the
pushback buffer.
This broke the optimizations in scm_c_read for using an alternate buffer
for single-byte-buffered ports, because the opimization wasn't expecting
anything in the pushback buffer.
* libguile/vports.c (sf_fill_input): store complete chars, not single bytes
* libguile/ports.c (scm_c_read): don't use optimized path for non Latin-1.
Add debug prints.
* libguile/string.h: make scm_i_from_stringn and scm_i_string_ref public
so that readline can use them
* guile-readline/readline.c: read bytes, not complete chars, from the
input port. Convert output to the output port's locale
* NEWS
* doc/ref/scheme-scripts.texi: doc updates for character encoding of
source code
* doc/ref/api-evaluation.texi: doc updates for character encoding of
source code
String ports should be able to accept any string characters, regardless
of the current locale. Setting it to UTF-8 achieves that.
* libguile/strports.c (scm_i_mkstrport): set port's locale to UTF-8
(scm_mkstrport): convert input string to UTF-8
* libguile/unidata_to_charset.pl (designated): renamed from full
* libguile/srfi-14.c (scm_char_set_designated): new char-set
* libguile/srfi-14.i.c (cs_designated): renamed from cs_full
Since combining characters, such as accents, modify the appearance of the
previous letter, it looks awkward in its character literal form (#\name)
since it modified the backslash. This instead prints the combining
character on a small circle.
* libguile/chars.h (SCM_CODEPOINT_DOTTED_CIRCLE): new #define
* libguile/print.c (iprint1): print combining characters on dotted circles
* libguile/read.c (scm_read_character): parse the combination of combining
characters and dotted circles
This is a followup to commit d7e7a02a62
("Fix leaky behavior of `scm_take_TAGvector ()'.") and a reminder that
the uniform vector implementation can't do away with the cell->buffer
indirection.
* test-suite/standalone/Makefile.am (test_scm_take_u8vector_SOURCES,
test_scm_take_u8vector_CFLAGS, test_scm_take_u8vector_LDADD): New.
(check_PROGRAMS, TESTS): Add `test-scm-take-u8vector'.
* test-suite/standalone/test-scm-take-u8vector.c: New file.
* libguile/srfi-14.c (scm_i_ucs_range_to_char_set): new function that
contains the functionality of ucs_range_to_char_set, fixes
off-by-one, and doesn't store surroges
(scm_ucs_range_to_char_set, scm_ucs_range_to_char_set_x): call
scm_i_ucs_range_to_char_set
(scm_i_charset_set_range): new helper function
char-set-xor! was not modifying its input parameter. It isn't
technically required to do so by the spec, but, the other similar
functions do it.
* libguile/srfi-14.c (scm_char_set_xor_x): modify the input parameter
* libguile/srfi-4.c (free_user_data): New function.
* libguile/srfi-4.i.c (scm_take_TAGvector): Register `free_user_data ()'
as a finalizer for DATA.
* libguile/objcodes.c (scm_objcode_to_bytecode): Allocate with
`scm_malloc ()' since the memory taken by `scm_take_u8vector ()' will
eventually be free(3)d.
* libguile/vm.c (really_make_boot_program): Likewise.
* module/language/tree-il/compile-glil.scm (flatten): Fix compilation of
loops within loops in non-tail positions. Will add a test case soon,
but one way to reproduce it was with the following function:
(define (test)
(let lp ()
(pk 'zero)
(let ((fk (lambda ()
(let ((fk2 (lambda () (pk 'two))))
(let ((fk3 (lambda () (if #t (pk 'three) (fk2)))))
(if #t
(fk3)
(fk2)))))))
(pk 'one)
(fk))
(lp)))
One would expect to see a sequence of "zero one three", but in fact zero
only showed once.
This should fix simplex as well.
* libguile/strings.c (STRINGBUF_HEADER_SIZE, STRINGBUF_HEADER_BYTES):
New macros.
(STRINGBUF_F_INLINE, STRINGBUF_INLINE, STRINGBUF_OUTLINE_CHARS,
STRINGBUF_OUTLINE_LENGTH, STRINGBUF_INLINE_CHARS,
STRINGBUF_INLINE_LENGTH, STRINGBUF_MAX_INLINE_LEN): Remove.
(STRINGBUF_CHARS, STRINGBUF_WIDE_CHARS): Adjust to return a fixed
location.
(STRINGBUF_LENGTH): Get the length from word 1.
(make_stringbuf, make_wide_stringbuf): Adjust to use a contiguous
memory region.
(wide_stringbuf): Renamed from `widen_stringbuf'. Adjust similarly.
Return the new stringbuf. Callers updated.
(narrow_stringbuf): Likewise.
(scm_sys_string_dump, scm_sys_symbol_dump): Remove `stringbuf-inline'
pair.
* test-suite/tests/strings.test ("string internals")["null strings are
inlined", "short Latin-1 encoded strings are inlined", "long Latin-1
encoded strings are not inlined", "short UCS-4 encoded strings are not
inlined", "long UCS-4 encoded strings are not inlined"]: Remove.
* test-suite/tests/symbols.test ("symbol internals")["null symbols are
inlined", "short Latin-1 encoded symbols are inlined", "long Latin-1
encoded symbols are not inlined", "short UCS-4 encoded symbols are not
inlined", "long UCS-4 encoded symbols are not inlined"]: Remove.