This requres the creation of a new type
scm_t_string_failed_conversion_handler to replace libunistring's
enum iconveh_ilseq_handler.
* libguile/strings.h: don't include <uniconv.h>
(scm_t_string_failed_conversion_handler): new enum type
(SCM_FAILED_CONVERSION_ERROR, SCM_FAILED_CONVERSION_QUESTION_MARK):
(SCM_FAILED_CONVERSION_ESCAPE_SEQUENCE): new enum type values
* libguile/strings.c (scm_to_stringn): now takes type
scm_t_string_failed_conversion_handler. All callers changed.
* libguile/print.c: include <uniconv.h>
* libguile/ports.c (scm_lfwrite_substr): use
scm_t_string_conversion_handler's constants
* libguile/gen-scmconfig.c (SCM_ICONVEH_ERROR):
(SCM_ICONVEH_QUESTION_MARK, SCM_ICONVEH_ESCAPE_SEQUENCE): store
iconveh_ilseq_hander constants as #define's
Also, scm_charprint is renamed to scm_i_charprint.
* libguile/strings.h: make scm_i_string_wide_chars internal.
* libguile/print.h: rename scm_charprint to scm_i_charprint. Make
internal.
* libguile/print.c (scm_i_charprint): renamed from scm_charprint
(scm_charprint): renamed to scm_i_charprint. All callers changed.
This adds full Unicode strings as a datatype, and it adds some
minimal functionality. The terminal and port encoding is assumed
to be ISO-8859-1. Non-ISO-8859-1 characters are written or
input as string character escapes.
The string character escapes now have 3 forms: \xXX \uXXXX and
\UXXXXXX, for unprintable characters that have 2, 4 or 6 hex digits.
The process for writing to strings has been modified. There is now a
function scm_i_string_start_writing that does the copy-on-write
conversion if necessary.
To compile strings that may be wide, the VM storage of strings and
string-likes has changed.
Most string-using functions have not yet been updated and may break
when used with wide strings.
* module/language/assembly/compile-bytecode.scm (write-bytecode):
use variable width string bytecode format
* module/language/assembly.scm (byte-length): use variable width
bytecode format
* libguile/vm-i-loader.c (load-string, load-symbol):
(load-keyword, define): use variable-width bytecode format
* libguile/vm-engine.h (FETCH_WIDTH): new macro
* libguile/strings.h: new declarations
* libguile/strings.c (make_wide_stringbuf): new function
(widen_stringbuf): new function
(scm_i_make_wide_string): new function
(scm_i_is_narrow_string): new function
(scm_i_string_wide_chars): new function
(scm_i_string_start_writing): new function
(scm_i_string_ref): new function
(scm_i_string_set_x): new function
(scm_i_is_narrow_symbol): new function
(scm_i_symbol_wide_chars, scm_i_symbol_ref): new function
(scm_string_width): new function
(unistring_escapes_to_guile_escapes): new function
(scm_to_stringn): new function
(scm_i_stringbuf_free): modify for wide strings
(scm_i_substring_copy): modify for wide strings
(scm_i_string_chars, scm_string_append): modify for wide strings
(scm_i_make_symbol, scm_to_locale_stringn): modify for wide strings
(scm_string_dump, scm_symbol_dump, scm_to_locale_stringbuf):
(scm_string, scm_i_deprecated_string_chars): modify for wide strings
(scm_from_locale_string, scm_from_locale_stringn): add null test
* libguile/srfi-13.c: add calls for scm_i_string_start_writing for
each call of scm_i_string_stop_writing
(scm_string_for_each): modify for wide strings
* libguile/socket.c: add calls for scm_i_string_start_writing for each
call of scm_i_string_stop_writing
* libguile/rw.c: add calls for scm_i_string_start_writing for each
call of scm_i_string_stop_writing
* libguile/read.c (scm_read_string): allow reading of wide strings
* libguile/print.h: add declaration for scm_charprint
* libguile/print.c (iprin1): print wide strings and add new string
escapes
(scm_charprint): new function
* libguile/ports.h: new declarations for scm_lfwrite_substr and
scm_lfwrite_str
* libguile/ports.c (update_port_lf): new function
(scm_lfwrite): use update_port_lf
(scm_lfwrite_substr): new function
(scm_lfwrite_str): new function
* test-suite/tests/asm-to-bytecode.test ("compiler"): add string
width byte to sting-like asm tests
This adds the 32-bit standalone characters. Strings are still
8-bit. Characters larger than 8-bit can only be entered or
displayed in octal format at this point. At this point, the
terminal's display encoding is expected to be Latin-1.
* module/language/assembly/compile-bytecode.scm (write-bytecode):
add 32-bit char
* module/language/assembly.scm (object->assembly): add 32-bit char
(assembly->object): add 32-bit char
* libguile/vm-i-system.c (make-char32): new op
* libguile/print.c (iprin1): print 32-bit char
* libguile/numbers.h: add type scm_t_wchar
* libguile/numbers.c: add type scm_t_wchar
* libguile/chars.h: new type scm_t_wchar
(SCM_CODEPOINT_MAX): new
(SCM_IS_UNICODE_CHAR): new
(SCM_MAKE_CHAR): operate on 32-bit char
* libguile/chars.c: comparison operators now use Unicode
codepoints
(scm_c_upcase): now receives and returns scm_t_wchar
(scm_c_downcase): now receives and returns scm_t_wchar
The global variables scm_charnames and scm_charnums are replaced with
the accessor functions scm_i_charname and scm_i_charname_to_num.
Also, the incomplete and broken EBCDIC support is removed.
* libguile/print.c (iprin1): use new func scm_i_charname
* libguile/read.c (scm_read_character): use new func
scm_i_charname_to_num
* libguile/chars.c (scm_i_charname): new function
(scm_i_charname_to_char): new function
(scm_charnames, scm_charnums): removed
* libguile/chars.h: new declarations
The idea is to introduce `gsubrs' whose arity is encoded in their type
(more precisely in the sizeof (void *) - 8 MSBs). This removes the
indirection introduced by cclos and simplifies the code.
* libguile/__scm.h (CCLO): Remove.
* libguile/debug.c (scm_procedure_source, scm_procedure_environment):
Remove references to `scm_tc7_cclo'.
* libguile/eval.c (scm_trampoline_0, scm_trampoline_1,
scm_trampoline_2): Replace `scm_tc7_cclo' with `scm_tc7_gsubr'.
* libguile/eval.i.c (CEVAL): Likewise. No longer make PROC the first
argument. Directly invoke `scm_gsubr_apply ()' instead of jump to the
`evap(N+1)' label or call to `SCM_APPLY ()'.
* libguile/evalext.c (scm_self_evaluating_p): Remove reference to
`scm_tc7_cclo'.
* libguile/gc-card.c (scm_i_sweep_card, scm_i_tag_name): Likewise.
* libguile/gc-mark.c (scm_gc_mark_dependencies): Likewise.
* libguile/goops.c (scm_class_of): Likewise.
* libguile/print.c (iprin1): Likewise.
* libguile/gsubr.c (create_gsubr): Use `unsigned int's for REQ, OPT and
RST. Use `scm_tc7_gsubr' instead of `scm_makcclo ()' in the default
case.
(scm_gsubr_apply): Remove calls to `SCM_GSUBR_PROC ()'.
(scm_f_gsubr_apply): Remove.
* libguile/gsubr.h (SCM_GSUBR_TYPE): New definition.
(SCM_GSUBR_MAX): Changed to 33.
(SCM_SET_GSUBR_TYPE, SCM_GSUBR_PROC, SCM_SET_GSUBR_PROC,
scm_f_gsubr_apply): Remove.
* libguile/procprop.c (scm_i_procedure_arity): Remove reference to
`scm_tc7_cclo'; add proper handling of `scm_tc7_gsubr'.
* libguile/procs.c (scm_makcclo, scm_make_cclo): Remove.
(scm_procedure_p): Remove reference to `scm_tc7_cclo'.
(scm_thunk_p): Likewise, plus add proper `scm_tc7_gsubr' handling.
* libguile/procs.h (SCM_CCLO_LENGTH, SCM_MAKE_CCLO_TAG,
SCM_SET_CCLO_LENGTH, SCM_CCLO_BASE, SCM_SET_CCLO_BASE, SCM_CCLO_REF,
SCM_CCLO_SET, SCM_CCLO_SUBR, SCM_SET_CCLO_SUBR, scm_makcclo,
scm_make_cclo): Remove.
* libguile/stacks.c (read_frames): Remove reference to `scm_f_gsubr_apply'.
* libguile/tags.h (scm_tc7_cclo): Remove.
(scm_tc7_gsubr): New.
(scm_tcs_subrs): Add `scm_tc7_gsubr'.
* libguile/print.c (iprin1): When displaying a weak vector, access
elements via `scm_c_vector_ref ()', not via the macro.
git-archimport-id: lcourtes@laas.fr--2005-libre/guile-core--boehm-gc--1.9--patch-14
the value at its top. This fixes a reference leak.
(PUSH_REF): Perform `pstate->top++' after calling
`PSTATE_STACK_SET ()' in order to avoid undesired potential side
effects.
New.
(sym_reader): New.
(scm_print_opts): Added "quote-keywordish-symbols" option.
(quote_keywordish_symbol): New, for evaluating the option.
(scm_print_symbol_name): Use it.
(scm_init_print): Initialize new option to sym_reader.
Removed ref_stack field.
(PSTATE_STACK_REF, PSTATE_STACK_SET): New, for accessing the stack
of a print state. Use them everywhere instead of ref_stack.
print.c, ports.c, mallocs.c, hooks.c, hashtab.c, fports.c,
guardians.c, filesys.c, coop-pthreads.c, continuations.c: Use
scm_uintprint to print unsigned integers, raw heap words, and
adresses, using a cast to scm_t_bits to turn pointers into
integers.
SCM_PRINT_HIGHLIGHT_SUFFIX): New printer options.
(scm_iprin1): Use them instead of the previoulsy hardcoded
strings.
(scm_init_print): Initialize them.
* print.c (make_print_state, scm_free_print_state): Initialize it
to SCM_EOL.
(scm_iprin1): Wrap output in '{...}' when object is contained in
highlight_objects.
(SCM_VALIDATE_STRING_COPY): Deprecated. Replaced all uses with
SCM_VALIDATE_STRING plus SCM_I_STRING_CHARS or
scm_to_locale_string, etc.
(SCM_VALIDATE_SUBSTRING_SPEC_COPY): Deprecated. Replaced as
above, plus scm_i_get_substring_spec.
* regex-posix.c, read.c, random.c, ramap.c, print.c, numbers.c,
hash.c, gc.c, gc-card.c, convert.i.c, backtrace.c, strop.c,
strorder.c, strports.c, struct.c, symbols.c, unif.c, ports.c: Use
SCM_I_STRING_CHARS, SCM_I_STRING_UCHARS, and SCM_I_STRING_LENGTH
instead of SCM_STRING_CHARS, SCM_STRING_UCHARS, and
SCM_STRING_LENGTH, respectively. Also, replaced scm_return_first
with more explicit scm_remember_upto_here_1, etc, or introduced
them in the first place.
SCM_INUM): Deprecated by reenaming them to SCM_I_INUMP, SCM_I_NINUMP
and SCM_I_INUM, respectively and adding deprecated versions to
deprecated.h and deprecated.c. Changed all uses to either use the
SCM_I_ variants or scm_is_*, scm_to_*, or scm_from_*, as appropriate.
SCM_NEGATE_BOOL, SCM_BOOLP): Deprecated by moving into "deprecated.h".
Replaced all uses with scm_is_false, scm_is_true, scm_from_bool, and
scm_is_bool, respectively.
scm_i_unmemoize_expr for unmemoizing a memoized object holding a
single memoized expression.
* debug.c (memoized_print): Don't try to unmemoize the memoized
object, since we can't know whether it holds a single expression
or a body.
(scm_mem_to_proc): Removed check for lambda expression, since it
was moot anyway. Whoever uses these functions for debugging
purposes should know what they do: Creating invalid memoized code
will cause crashes, independent of whether this check is present
or not.
(scm_proc_to_mem): Take the closure's code as it is and don't
append a SCM_IM_LAMBDA isym. To allow easier debugging, the
memoized code should not be modified.
* debug.[ch] (scm_unmemoize, scm_i_unmemoize_expr): Removed
scm_unmemoize from public use, but made scm_i_unmemoize_expr
available as a guile internal function instead. However,
scm_i_unmemoize_expr will only work on memoized objects that hold
a single memoized expression. It won't work with bodies.
* debug.c (scm_procedure_source), macros.c (macro_print), print.c
(scm_iprin1): Call scm_i_unmemocopy_body for unmemoizing a body,
i. e. a list of expressions.
* eval.c (unmemoize_exprs): Drop internal body markers from the
output during unmemoization.
* eval.[ch] (scm_unmemocopy, scm_i_unmemocopy_expr,
scm_i_unmemocopy_body): Removed scm_unmemocopy from public use,
but made scm_i_unmemocopy_expr and scm_i_unmemocopy_body available
as guile internal functions instead. scm_i_unmemoize_expr will
only work on a single memoized expression, while
scm_i_unmemocopy_body will only work on bodies.
* deprecated.h (SCM_IFRINC, SCM_ICDR, SCM_IFRAME, SCM_IDIST,
SCM_ICDRP), eval.c (SCM_IFRINC, SCM_ICDR, SCM_IFRAME, SCM_IDIST,
SCM_ICDRP), eval.h (SCM_ICDR, SCM_IFRINC, SCM_IFRAME, SCM_IDIST,
SCM_ICDRP): Deprecated and added to deprecated.h. Moved from
eval.h to eval.c.
* deprecated.c (scm_isymnames), deprecated.h (scm_isymnames,
SCM_ISYMNUM, SCM_ISYMCHARS), eval.c (SCM_ISYMNUM, isymnames,
scm_unmemocopy, CEVAL), print.c (scm_isymnames), tags.h
(SCM_ISYMNUM, scm_isymnames, SCM_ISYMCHARS): Deprecated
scm_isymnames, SCM_ISYMNUM and SCM_ISYMCHARS and added to
deprecated.[hc]. Moved scm_isymnames from print.c to eval.c and
renamed to isymnames. Moved SCM_ISYMNUM from tags.h to eval.c and
renamed to ISYMNUM.
* eval.c (scm_i_print_iloc, scm_i_print_isym), eval.h
(scm_i_print_iloc, scm_i_print_isym), print.c (scm_iprin1):
Extracted printing of ilocs and isyms to guile internal functions
scm_i_print_iloc, scm_i_print_isym of eval.c.