* Renumbers the IFLAG constants.
* Adds several macros related to boolean type tests, null tests, and
boolean-truth testing (including lisp-style boolean-truth tests).
* Adds compile-time checks to verify the necessary IFLAG numbering
properties needed for the checks to work properly.
* Changes some existing code to use the new optimized macros, without
changing the semantics of the code at all (except that scm_is_bool
is changed from a function to a macro).
I added the following macros, whose names explicitly state how %nil
should be handled. See the comments in the patch for more information
about these.
scm_is_false_assume_not_lisp_nil scm_is_true_assume_not_lisp_nil
scm_is_false_and_not_lisp_nil scm_is_true_or_lisp_nil
scm_is_false_or_lisp_nil scm_is_true_and_not_lisp_nil
scm_is_lisp_false scm_is_lisp_true
scm_is_null_assume_not_lisp_nil
scm_is_null_and_not_lisp_nil
scm_is_null_or_lisp_nil
scm_is_bool_and_not_lisp_nil
scm_is_bool_or_lisp_nil
The following already-existing macros are defined as aliases, such
that their semantics is unchanged (although scm_is_bool used to be a
function and is now a macro).
scm_is_null --> scm_is_null_and_not_lisp_nil
scm_is_false --> scm_is_false_and_not_lisp_nil
scm_is_true --> scm_is_true_or_lisp_nil
scm_is_bool --> scm_is_bool_and_not_lisp_nil
(I still believe that these should be changed to versions that handle
%nil properly, but await approval on that point, so these patches do
not make those changes)
Also, if the preprocessor macro SCM_ENABLE_ELISP is not true (this
macro already existed and was used in lang.h), all overheads
associated with %nil handling are eliminated from the above macros.
* libguile/tags.h (SCM_BOOL_F, SCM_BOOL_T, SCM_UNSPECIFIED)
(SCM_UNDEFINED, SCM_UNBOUND, SCM_ELISP_NIL): Renumber, so that a
number of important distinctions (false versus true, end-of-list, etc)
can be made by masking a single bit. Also define a number of
build-time tests to assert that this condition holds.
* libguile/boolean.h (scm_is_false_and_not_nil, scm_is_true_or_nil)
(scm_is_false_assume_not_nil, scm_is_true_assume_not_nil):
(scm_is_false_or_nil, scm_is_true_and_not_nil)
(scm_is_bool_or_nil, scm_is_bool_and_not_nil): New exciting macros to
test certain boolean/end-of-list properties.
(scm_is_false, scm_is_true): Use a restrictive definition, where only
SCM_BOOL_F is false. Should probably change in the future.
(scm_is_bool): Incompatible change: changed to be a macro. Was a
function before. Probably should allow nil as a boolean, but that will
be for a later patch.
(scm_is_lisp_false, scm_is_lisp_true): New macros, implementing the
standard Lisp boolean predicates, where '() is actually false.
* libguile/eval.i.c (CEVAL): Fix a number of false-or-nil and similar
tests to use the new macros.
* libguile/lang.h (SCM_NULL_OR_NIL_P): Use scm_is_null_or_nil.
* libguile/pairs.c: Add a compile-time check that null and nil differ by
only one bit.
* libguile/pairs.h (scm_is_null_and_not_nil, scm_is_null_assume_not_nil)
(scm_is_null_or_nil): New exciting macros!
(scm_is_null): Just be scm_is_null_and_not_nil, for now.
* libguile/print.c: Adapt to the reordering, and print suitably nasty
things for the not-to-be-used values.
Since combining characters, such as accents, modify the appearance of the
previous letter, it looks awkward in its character literal form (#\name)
since it modified the backslash. This instead prints the combining
character on a small circle.
* libguile/chars.h (SCM_CODEPOINT_DOTTED_CIRCLE): new #define
* libguile/print.c (iprint1): print combining characters on dotted circles
* libguile/read.c (scm_read_character): parse the combination of combining
characters and dotted circles
Ports are given two additional properties: a character encoding and
a conversion failure strategy. These properties have getters and setters.
The new properties are used to convert any locale text to/from the
internal representation of strings.
If unspecified, ports use a default value. The default value of these
properties is held in a fluid. The default character encoding can be
modified by calling setlocale.
ISO-8859-1 is treated specially. Since it is a native encoding of
strings, it can be processed more quickly. Source code is assumed to be
ISO-8859-1 unless otherwise specified. The encoding of a source code
file can be given as 'coding: XXXXX' in a magic comment at the top of a
file.
The C functions that deal with encoding often use a null pointer
as shorthand for the native Latin-1 encoding, for efficiency's sake.
* test-suite/tests/encoding-iso88591.test: new tests
* test-suite/tests/encoding-iso88597.test: new tests
* test-suite/tests/encoding-utf8.test: new tests
* test-suite/tests/encoding-escapes.test: new tests
* test-suite/tests/numbers.test: declare 'binary' encoding
* test-suite/tests/ports.test: declare 'binary' encoding
* test-suite/tests/r6rs-ports.test: declare 'binary' encoding
* module/system/base/compile.scm (compile-file): use source-code
file's self-declared encoding when compiling files
* libguile/strports.c: store string ports in locale encoding
(scm_strport_to_locale_u8vector, scm_call_with_output_locale_u8vector)
(scm_open_input_locale_u8vector, scm_get_output_locale_u8vector):
new functions
* libguile/strings.h: new declaration for scm_i_string_contains_char
* libguile/strings.c (scm_i_string_contains_char): new function
(scm_from_stringn, scm_to_stringn): use NULL for Latin-1
(scm_from_locale_stringn, scm_to_locale_stringn): respect character
encoding of input and output ports
* libguile/read.h: declaration for scm_scan_for_encoding
* libguile/read.c:
(read_token): now takes scheme string instead of C string/length
(read_complete_token): new function
(scm_read_sexp, scm_read_number, scm_read_mixed_case_symbol)
(scm_read_number_and_radix, scm_read_quote, scm_read_semicolon_comment)
(scm_read_srfi4_vector, scm_read_bytevector, scm_read_guile_bit_vector)
(scm_read_scsh_block_comment, scm_read_commented_expression)
(scm_read_extended_symbol, scm_read_sharp_extension, scm_read_shart)
(scm_read_expression): use scm_t_wchar for char type, use read_complete_token
(scm_scan_for_encoding): new function to find a file's character encoding
(scm_file_encoding): new function to find a port's character encoding
* libguile/rdelim.c: don't unpack strings
* libguile/print.h: declaration for modified function
scm_i_charprint
* libguile/print.c: use locale when printing characters and
strings
(scm_i_charprint): input parameter is now scm_t_wchar
(scm_simple_format): don't unpack strings
* libguile/posix.h: new declaration for scm_setbinary.
* libguile/posix.c (scm_setlocale): set default and stdio port
encodings based on the locale's character encoding
(scm_setbinary): new function
* libguile/ports.h (scm_t_port): add encoding and failed
conversion handler to port type. Declarations for new or modified
functions scm_getc, scm_unget_byte, scm_ungetc,
scm_i_get_port_encoding, scm_i_set_port_encoding_x,
scm_port_encoding, scm_set_port_encoding_x,
scm_i_get_conversion_strategy, scm_i_set_conversion_strategy_x,
scm_port_conversion_strategy, scm_set_port_conversion_strategy_x.
* libguile/ports.c: assign the current ports to zero on startup so
we can see if they've been set.
(scm_current_input_port, scm_current_output_port,
scm_current_error_port): return #f if the port is not yet
initialized
(scm_new_port_table_entry): set up a new port's encoding and
illegal sequence handler based on the thread's current defaults
(scm_i_remove_port): free port encoding name when port is removed
(scm_i_mode_bits_n): now takes a scheme string instead of a c
string and length. All callers changed.
(SCM_MBCHAR_BUF_SIZE): new const
(scm_getc): new function, since the scm_getc in inline.h is now
scm_get_byte_or_eof. This pulls one codepoint from a port.
(scm_lfwrite_substr, scm_lfwrite_str): now uses port's encoding
(scm_unget_byte): new function, incorportaing the low-level functionality
of scm_ungetc
(scm_ungetc): uses scm_unget_byte
* libguile/numbers.h (scm_t_wchar): compilation order problem with
scm_t_wchar being use in functions in multiple headers. Forward
declare scm_t_wchar.
* libguile/load.c (scm_primitive_load): scan for file encoding at
top of file and use it to set the load port's encoding
* libguile/inline.h (scm_get_byte_or_eof): new function
incorporating most of the functionality of scm_getc.
* libguile/fports.c (fport_fill_input): now returns scm_t_wchar
* libguile/chars.h (scm_t_wchar): avoid compilation order problem
with declaration of scm_t_wchar
* libguile/hash.c (scm_i_string_hash): new function
(scm_hasher): don't unpack string: use scm_i_string_hash
* libguile/hash.h: new declaration for scm_i_string_hash
* libguile/print.c (quote_keywordish_symbol): use symbol accessors
(scm_i_print_symbol_name): new function
(scm_print_symbol_name): call scm_i_print_symbol_name
(iprin1): use scm_i_print_symbol_name to print symbols
* libguile/print.h: new declaration for scm_i_print_symbol_name
* libguile/symbols.c (lookup_interned_symbol): now takes scheme string
instead of c string; callers changed
(lookup_interned_symbol): add wide symbol support
(scm_i_c_mem2symbol): removed
(scm_i_mem2symbol): removed and replaced with scm_i_str2symbol
(scm_i_str2symbol): new function
(scm_i_mem2uninterned_symbol): removed and replaced with
scm_i_str2uninterned_symbol
(scm_i_str2uninterned_symbol): new function
(scm_make_symbol, scm_string_to_symbol, scm_from_locale_symbol)
(scm_from_locale_symboln): use scm_i_str2symbol
* test-suite/tests/symbols.test: new tests
* libguile/tags.h (scm_tc7_program):
* libguile/programs.h: Programs now have their own tc7 code. Fix up the
macros appropriately.
* libguile/programs.c: Remove smobby bits, leaving marking, printing,
and application for other parts of Guile.
* libguile/debug.c (scm_procedure_source):
* libguile/eval.c (scm_trampoline_0, scm_trampoline_1)
(scm_trampoline_2): Add cases for tc7_program.
* libguile/eval.i.c (CEVAL, SCM_APPLY):
* libguile/evalext.c (scm_self_evaluating_p):
* libguile/gc-card.c (scm_i_sweep_card, scm_i_tag_name):
* libguile/gc-mark.c (1):
* libguile/print.c (iprin1):
* libguile/procs.c (scm_procedure_p, scm_thunk_p)
* libguile/vm-i-system.c (make-closure): Adapt to new procedure
representation.
* libguile/procprop.c (scm_i_procedure_arity): Do the right thing for
programs.
* test-suite/tests/procprop.test ("procedure-arity"): Arity test now
succeeds.
* libguile/goops.c (scm_class_of): Programs now belong to the class
<procedure>, not a smob class.
* libguile/vm.h (struct vm, struct vm_cont):
* libguile/vm-engine.c (vm_engine):
* libguile/frames.h (SCM_FRAME_BYTE_CAST, struct vm_frame):
* libguile/frames.c (scm_c_make_vm_frame): Fix usages of scm_byte_t,
changing them to scm_t_uint8.
This requres the creation of a new type
scm_t_string_failed_conversion_handler to replace libunistring's
enum iconveh_ilseq_handler.
* libguile/strings.h: don't include <uniconv.h>
(scm_t_string_failed_conversion_handler): new enum type
(SCM_FAILED_CONVERSION_ERROR, SCM_FAILED_CONVERSION_QUESTION_MARK):
(SCM_FAILED_CONVERSION_ESCAPE_SEQUENCE): new enum type values
* libguile/strings.c (scm_to_stringn): now takes type
scm_t_string_failed_conversion_handler. All callers changed.
* libguile/print.c: include <uniconv.h>
* libguile/ports.c (scm_lfwrite_substr): use
scm_t_string_conversion_handler's constants
* libguile/gen-scmconfig.c (SCM_ICONVEH_ERROR):
(SCM_ICONVEH_QUESTION_MARK, SCM_ICONVEH_ESCAPE_SEQUENCE): store
iconveh_ilseq_hander constants as #define's
Also, scm_charprint is renamed to scm_i_charprint.
* libguile/strings.h: make scm_i_string_wide_chars internal.
* libguile/print.h: rename scm_charprint to scm_i_charprint. Make
internal.
* libguile/print.c (scm_i_charprint): renamed from scm_charprint
(scm_charprint): renamed to scm_i_charprint. All callers changed.
This adds full Unicode strings as a datatype, and it adds some
minimal functionality. The terminal and port encoding is assumed
to be ISO-8859-1. Non-ISO-8859-1 characters are written or
input as string character escapes.
The string character escapes now have 3 forms: \xXX \uXXXX and
\UXXXXXX, for unprintable characters that have 2, 4 or 6 hex digits.
The process for writing to strings has been modified. There is now a
function scm_i_string_start_writing that does the copy-on-write
conversion if necessary.
To compile strings that may be wide, the VM storage of strings and
string-likes has changed.
Most string-using functions have not yet been updated and may break
when used with wide strings.
* module/language/assembly/compile-bytecode.scm (write-bytecode):
use variable width string bytecode format
* module/language/assembly.scm (byte-length): use variable width
bytecode format
* libguile/vm-i-loader.c (load-string, load-symbol):
(load-keyword, define): use variable-width bytecode format
* libguile/vm-engine.h (FETCH_WIDTH): new macro
* libguile/strings.h: new declarations
* libguile/strings.c (make_wide_stringbuf): new function
(widen_stringbuf): new function
(scm_i_make_wide_string): new function
(scm_i_is_narrow_string): new function
(scm_i_string_wide_chars): new function
(scm_i_string_start_writing): new function
(scm_i_string_ref): new function
(scm_i_string_set_x): new function
(scm_i_is_narrow_symbol): new function
(scm_i_symbol_wide_chars, scm_i_symbol_ref): new function
(scm_string_width): new function
(unistring_escapes_to_guile_escapes): new function
(scm_to_stringn): new function
(scm_i_stringbuf_free): modify for wide strings
(scm_i_substring_copy): modify for wide strings
(scm_i_string_chars, scm_string_append): modify for wide strings
(scm_i_make_symbol, scm_to_locale_stringn): modify for wide strings
(scm_string_dump, scm_symbol_dump, scm_to_locale_stringbuf):
(scm_string, scm_i_deprecated_string_chars): modify for wide strings
(scm_from_locale_string, scm_from_locale_stringn): add null test
* libguile/srfi-13.c: add calls for scm_i_string_start_writing for
each call of scm_i_string_stop_writing
(scm_string_for_each): modify for wide strings
* libguile/socket.c: add calls for scm_i_string_start_writing for each
call of scm_i_string_stop_writing
* libguile/rw.c: add calls for scm_i_string_start_writing for each
call of scm_i_string_stop_writing
* libguile/read.c (scm_read_string): allow reading of wide strings
* libguile/print.h: add declaration for scm_charprint
* libguile/print.c (iprin1): print wide strings and add new string
escapes
(scm_charprint): new function
* libguile/ports.h: new declarations for scm_lfwrite_substr and
scm_lfwrite_str
* libguile/ports.c (update_port_lf): new function
(scm_lfwrite): use update_port_lf
(scm_lfwrite_substr): new function
(scm_lfwrite_str): new function
* test-suite/tests/asm-to-bytecode.test ("compiler"): add string
width byte to sting-like asm tests
This adds the 32-bit standalone characters. Strings are still
8-bit. Characters larger than 8-bit can only be entered or
displayed in octal format at this point. At this point, the
terminal's display encoding is expected to be Latin-1.
* module/language/assembly/compile-bytecode.scm (write-bytecode):
add 32-bit char
* module/language/assembly.scm (object->assembly): add 32-bit char
(assembly->object): add 32-bit char
* libguile/vm-i-system.c (make-char32): new op
* libguile/print.c (iprin1): print 32-bit char
* libguile/numbers.h: add type scm_t_wchar
* libguile/numbers.c: add type scm_t_wchar
* libguile/chars.h: new type scm_t_wchar
(SCM_CODEPOINT_MAX): new
(SCM_IS_UNICODE_CHAR): new
(SCM_MAKE_CHAR): operate on 32-bit char
* libguile/chars.c: comparison operators now use Unicode
codepoints
(scm_c_upcase): now receives and returns scm_t_wchar
(scm_c_downcase): now receives and returns scm_t_wchar
The global variables scm_charnames and scm_charnums are replaced with
the accessor functions scm_i_charname and scm_i_charname_to_num.
Also, the incomplete and broken EBCDIC support is removed.
* libguile/print.c (iprin1): use new func scm_i_charname
* libguile/read.c (scm_read_character): use new func
scm_i_charname_to_num
* libguile/chars.c (scm_i_charname): new function
(scm_i_charname_to_char): new function
(scm_charnames, scm_charnums): removed
* libguile/chars.h: new declarations
The idea is to introduce `gsubrs' whose arity is encoded in their type
(more precisely in the sizeof (void *) - 8 MSBs). This removes the
indirection introduced by cclos and simplifies the code.
* libguile/__scm.h (CCLO): Remove.
* libguile/debug.c (scm_procedure_source, scm_procedure_environment):
Remove references to `scm_tc7_cclo'.
* libguile/eval.c (scm_trampoline_0, scm_trampoline_1,
scm_trampoline_2): Replace `scm_tc7_cclo' with `scm_tc7_gsubr'.
* libguile/eval.i.c (CEVAL): Likewise. No longer make PROC the first
argument. Directly invoke `scm_gsubr_apply ()' instead of jump to the
`evap(N+1)' label or call to `SCM_APPLY ()'.
* libguile/evalext.c (scm_self_evaluating_p): Remove reference to
`scm_tc7_cclo'.
* libguile/gc-card.c (scm_i_sweep_card, scm_i_tag_name): Likewise.
* libguile/gc-mark.c (scm_gc_mark_dependencies): Likewise.
* libguile/goops.c (scm_class_of): Likewise.
* libguile/print.c (iprin1): Likewise.
* libguile/gsubr.c (create_gsubr): Use `unsigned int's for REQ, OPT and
RST. Use `scm_tc7_gsubr' instead of `scm_makcclo ()' in the default
case.
(scm_gsubr_apply): Remove calls to `SCM_GSUBR_PROC ()'.
(scm_f_gsubr_apply): Remove.
* libguile/gsubr.h (SCM_GSUBR_TYPE): New definition.
(SCM_GSUBR_MAX): Changed to 33.
(SCM_SET_GSUBR_TYPE, SCM_GSUBR_PROC, SCM_SET_GSUBR_PROC,
scm_f_gsubr_apply): Remove.
* libguile/procprop.c (scm_i_procedure_arity): Remove reference to
`scm_tc7_cclo'; add proper handling of `scm_tc7_gsubr'.
* libguile/procs.c (scm_makcclo, scm_make_cclo): Remove.
(scm_procedure_p): Remove reference to `scm_tc7_cclo'.
(scm_thunk_p): Likewise, plus add proper `scm_tc7_gsubr' handling.
* libguile/procs.h (SCM_CCLO_LENGTH, SCM_MAKE_CCLO_TAG,
SCM_SET_CCLO_LENGTH, SCM_CCLO_BASE, SCM_SET_CCLO_BASE, SCM_CCLO_REF,
SCM_CCLO_SET, SCM_CCLO_SUBR, SCM_SET_CCLO_SUBR, scm_makcclo,
scm_make_cclo): Remove.
* libguile/stacks.c (read_frames): Remove reference to `scm_f_gsubr_apply'.
* libguile/tags.h (scm_tc7_cclo): Remove.
(scm_tc7_gsubr): New.
(scm_tcs_subrs): Add `scm_tc7_gsubr'.
* libguile/print.c (iprin1): When displaying a weak vector, access
elements via `scm_c_vector_ref ()', not via the macro.
git-archimport-id: lcourtes@laas.fr--2005-libre/guile-core--boehm-gc--1.9--patch-14
the value at its top. This fixes a reference leak.
(PUSH_REF): Perform `pstate->top++' after calling
`PSTATE_STACK_SET ()' in order to avoid undesired potential side
effects.
New.
(sym_reader): New.
(scm_print_opts): Added "quote-keywordish-symbols" option.
(quote_keywordish_symbol): New, for evaluating the option.
(scm_print_symbol_name): Use it.
(scm_init_print): Initialize new option to sym_reader.
Removed ref_stack field.
(PSTATE_STACK_REF, PSTATE_STACK_SET): New, for accessing the stack
of a print state. Use them everywhere instead of ref_stack.
print.c, ports.c, mallocs.c, hooks.c, hashtab.c, fports.c,
guardians.c, filesys.c, coop-pthreads.c, continuations.c: Use
scm_uintprint to print unsigned integers, raw heap words, and
adresses, using a cast to scm_t_bits to turn pointers into
integers.