* libguile/ports.h: Slight reorder.
* libguile/ports.c: Reorder ports implementation to match the header
file. This will make it easier to add locking and _unlocked
variants.
* libguile/ports.h (SCM_PORT_DESCRIPTOR): New macro, to get at a port
descriptor in the third word of a port instead of looking it up in a
table.
(scm_c_port_type_ref, scm_c_port_type_add_x): New API for working with
numbered ptob descriptors.
(SCM_PTOBNAME): Implement in terms of scm_c_port_type_ref.
(scm_get_byte_or_eof, scm_peek_byte_or_eof): Use SCM_PORT_DESCRIPTOR.
* libguile/ports.c (scm_c_num_port_types, scm_c_port_type_ref)
(scm_c_port_type_add_x, scm_make_port_type): Protect scm_ptobs access
with a mutex. Have it be an array of pointers instead of an array of
structures. Adapt users to the new APIs.
(scm_c_make_port_with_encoding): Allocate ports with three words. The
third word is the ptob descriptor.
* libguile/backtrace.c:
* libguile/goops.c:
* libguile/ioext.c:
* libguile/print.c: Adapt to use scm_c_port_type_ref and
SCM_PORT_DESCRIPTOR.
* libguile/ports.h (scm_grow_port_cbuf, scm_pt_size, scm_pt_member):
Remove declarations of unimplemented functions. Move a couple of
other definitions around.
* libguile/ports.h (scm_c_lock_port, scm_c_try_lock_port)
(scm_c_unlock_port): New inline functions.
(scm_t_port): Add a lock field, if threads are enabled. This is a
first step towards threadsafe ports.
* libguile/ports.c (scm_c_make_port_with_encoding): Init the port's
lock.
* libguile/inline.c: Residualize the inline functions from ports.h.
* libguile/ports.c (scm_c_make_port_with_encoding, scm_c_make_port): New
functions, to replace scm_new_port_table_entry. Use a weak set
instead of a weak table.
(scm_i_remove_port):
(scm_c_port_for_each, scm_port_for_each): Adapt to use weak set.
(scm_i_void_port): Use scm_c_make_port.
(scm_init_ports): Make a weak set.
* libguile/fports.c:
* libguile/ioext.c:
* libguile/r6rs-ports.c:
* libguile/strports.c:
* libguile/vports.c: Adapt to use the new scm_c_make_port API.
* libguile/async.c:
* libguile/async.h:
* libguile/debug.h:
* libguile/deprecated.c:
* libguile/deprecated.h:
* libguile/evalext.h:
* libguile/gc-malloc.c:
* libguile/gc.h:
* libguile/gen-scmconfig.c:
* libguile/numbers.c:
* libguile/ports.c:
* libguile/ports.h:
* libguile/procprop.c:
* libguile/procprop.h:
* libguile/read.c:
* libguile/socket.c:
* libguile/srfi-4.h:
* libguile/strings.c:
* libguile/strings.h:
* libguile/tags.h:
* module/ice-9/boot-9.scm:
* module/ice-9/deprecated.scm: Remove all deprecated code. CPP defines
that were not previously issuing warnings were changed so that their
expansions would indicate the replacement forms to use,
e.g. scm_sizet__GONE__REPLACE_WITH__size_t.
The two exceptions were SCM_LISTN, which did not produce warnings
before, and the string-filter argument order stuff.
Drops the initial dirty memory usage of Guile down to 2.8 MB on my
machine, from 4.4 MB.
* libguile/ports.h (scm_i_remove_port): Remove declaration, as it was
SCM_INTERNAL.
* libguile/ports.c (scm_add_to_port_table): Issue a deprecation
warning if this function is called. Remove needless SCM_API
declaration, it was already declared as such in ports.h. Safely
access the port table.
(scm_i_remove_port): Remove bogus comment about lack of need for
threadsafety. Take the port table mutex.
(scm_close_port): No need to take port table mutex around calling
scm_i_remove_port.
* libguile/ports.c (scm_i_set_default_port_encoding,
scm_i_default_port_encoding): New function. Replace
`scm_i_set_port_encoding_x' and `scm_i_get_port_encoding' with
PORT == SCM_BOOL_F.
(scm_i_set_port_encoding_x): Assume PORT is a port.
(scm_i_get_port_encoding): Remove.
(scm_port_encoding): Adjust accordingly.
(scm_new_port_table_entry): Use `scm_i_default_port_encoding'.
* libguile/ports.h (scm_i_get_port_encoding): Remove declarations.
(scm_i_default_port_encoding, scm_i_set_default_port_encoding): New
declarations.
* libguile/posix.c (setlocale): Use `scm_i_set_default_port_encoding'.
Thanks to Bruno Haible for his suggestions. See
<http://lists.gnu.org/archive/html/bug-libunistring/2010-09/msg00007.html>,
for details.
* libguile/ports.c (register_finalizer_for_port): Always register a
finalizer for PORT.
(finalize_port): Close ENTRY->input_cd and ENTRY->output_cd.
(scm_new_port_table_entry): Initialize the `input_cd' and `output_cd'
fields.
(utf8_to_codepoint): New function.
(get_codepoint): Rewrite to use `iconv' instead of libunistring.
(scm_i_set_port_encoding_x): Initialize the `input_cd' and `output_cd'
fields.
(update_port_lf): Move upward. Use `switch' instead of `if's.
* libguile/ports.h (scm_t_port)[input_cd, output_cd]: New fields.
* libguile/print.c (codepoint_to_utf8, display_string): New functions.
(display_character): Use `display_string'.
(write_combining_character): Likewise.
(iprin1): Use `display_string' instead of `scm_lfwrite_str', and
`display_character' instead of `scm_putc'.
(write_character): Likewise.
(write_character_escaped): New function.
* test-suite/tests/encoding-escapes.test ("display output
escapes")["Rashomon"]: Use lower-case escapes.
["fake escape"]: New test.
* libguile/ports.h:
* libguile/ports.c (scm_ports_prehistory): Remove this function. Just
initialize global vars to NULL/0 instead of needing a prehistory
function.
Ports are given two additional properties: a character encoding and
a conversion failure strategy. These properties have getters and setters.
The new properties are used to convert any locale text to/from the
internal representation of strings.
If unspecified, ports use a default value. The default value of these
properties is held in a fluid. The default character encoding can be
modified by calling setlocale.
ISO-8859-1 is treated specially. Since it is a native encoding of
strings, it can be processed more quickly. Source code is assumed to be
ISO-8859-1 unless otherwise specified. The encoding of a source code
file can be given as 'coding: XXXXX' in a magic comment at the top of a
file.
The C functions that deal with encoding often use a null pointer
as shorthand for the native Latin-1 encoding, for efficiency's sake.
* test-suite/tests/encoding-iso88591.test: new tests
* test-suite/tests/encoding-iso88597.test: new tests
* test-suite/tests/encoding-utf8.test: new tests
* test-suite/tests/encoding-escapes.test: new tests
* test-suite/tests/numbers.test: declare 'binary' encoding
* test-suite/tests/ports.test: declare 'binary' encoding
* test-suite/tests/r6rs-ports.test: declare 'binary' encoding
* module/system/base/compile.scm (compile-file): use source-code
file's self-declared encoding when compiling files
* libguile/strports.c: store string ports in locale encoding
(scm_strport_to_locale_u8vector, scm_call_with_output_locale_u8vector)
(scm_open_input_locale_u8vector, scm_get_output_locale_u8vector):
new functions
* libguile/strings.h: new declaration for scm_i_string_contains_char
* libguile/strings.c (scm_i_string_contains_char): new function
(scm_from_stringn, scm_to_stringn): use NULL for Latin-1
(scm_from_locale_stringn, scm_to_locale_stringn): respect character
encoding of input and output ports
* libguile/read.h: declaration for scm_scan_for_encoding
* libguile/read.c:
(read_token): now takes scheme string instead of C string/length
(read_complete_token): new function
(scm_read_sexp, scm_read_number, scm_read_mixed_case_symbol)
(scm_read_number_and_radix, scm_read_quote, scm_read_semicolon_comment)
(scm_read_srfi4_vector, scm_read_bytevector, scm_read_guile_bit_vector)
(scm_read_scsh_block_comment, scm_read_commented_expression)
(scm_read_extended_symbol, scm_read_sharp_extension, scm_read_shart)
(scm_read_expression): use scm_t_wchar for char type, use read_complete_token
(scm_scan_for_encoding): new function to find a file's character encoding
(scm_file_encoding): new function to find a port's character encoding
* libguile/rdelim.c: don't unpack strings
* libguile/print.h: declaration for modified function
scm_i_charprint
* libguile/print.c: use locale when printing characters and
strings
(scm_i_charprint): input parameter is now scm_t_wchar
(scm_simple_format): don't unpack strings
* libguile/posix.h: new declaration for scm_setbinary.
* libguile/posix.c (scm_setlocale): set default and stdio port
encodings based on the locale's character encoding
(scm_setbinary): new function
* libguile/ports.h (scm_t_port): add encoding and failed
conversion handler to port type. Declarations for new or modified
functions scm_getc, scm_unget_byte, scm_ungetc,
scm_i_get_port_encoding, scm_i_set_port_encoding_x,
scm_port_encoding, scm_set_port_encoding_x,
scm_i_get_conversion_strategy, scm_i_set_conversion_strategy_x,
scm_port_conversion_strategy, scm_set_port_conversion_strategy_x.
* libguile/ports.c: assign the current ports to zero on startup so
we can see if they've been set.
(scm_current_input_port, scm_current_output_port,
scm_current_error_port): return #f if the port is not yet
initialized
(scm_new_port_table_entry): set up a new port's encoding and
illegal sequence handler based on the thread's current defaults
(scm_i_remove_port): free port encoding name when port is removed
(scm_i_mode_bits_n): now takes a scheme string instead of a c
string and length. All callers changed.
(SCM_MBCHAR_BUF_SIZE): new const
(scm_getc): new function, since the scm_getc in inline.h is now
scm_get_byte_or_eof. This pulls one codepoint from a port.
(scm_lfwrite_substr, scm_lfwrite_str): now uses port's encoding
(scm_unget_byte): new function, incorportaing the low-level functionality
of scm_ungetc
(scm_ungetc): uses scm_unget_byte
* libguile/numbers.h (scm_t_wchar): compilation order problem with
scm_t_wchar being use in functions in multiple headers. Forward
declare scm_t_wchar.
* libguile/load.c (scm_primitive_load): scan for file encoding at
top of file and use it to set the load port's encoding
* libguile/inline.h (scm_get_byte_or_eof): new function
incorporating most of the functionality of scm_getc.
* libguile/fports.c (fport_fill_input): now returns scm_t_wchar
* libguile/chars.h (scm_t_wchar): avoid compilation order problem
with declaration of scm_t_wchar
The port position macros incorrectly required enclosing braces
when used within if statements.
* libguile/ports.h (SCM_INCLINE, SCM_ZEROCOL, SCM_INCCOL)
(SCM_DECCOL, SCM_TABCOL): enclose macro in do/while
* libguile/ports.c (update_port_lf): remove extra braces
This adds full Unicode strings as a datatype, and it adds some
minimal functionality. The terminal and port encoding is assumed
to be ISO-8859-1. Non-ISO-8859-1 characters are written or
input as string character escapes.
The string character escapes now have 3 forms: \xXX \uXXXX and
\UXXXXXX, for unprintable characters that have 2, 4 or 6 hex digits.
The process for writing to strings has been modified. There is now a
function scm_i_string_start_writing that does the copy-on-write
conversion if necessary.
To compile strings that may be wide, the VM storage of strings and
string-likes has changed.
Most string-using functions have not yet been updated and may break
when used with wide strings.
* module/language/assembly/compile-bytecode.scm (write-bytecode):
use variable width string bytecode format
* module/language/assembly.scm (byte-length): use variable width
bytecode format
* libguile/vm-i-loader.c (load-string, load-symbol):
(load-keyword, define): use variable-width bytecode format
* libguile/vm-engine.h (FETCH_WIDTH): new macro
* libguile/strings.h: new declarations
* libguile/strings.c (make_wide_stringbuf): new function
(widen_stringbuf): new function
(scm_i_make_wide_string): new function
(scm_i_is_narrow_string): new function
(scm_i_string_wide_chars): new function
(scm_i_string_start_writing): new function
(scm_i_string_ref): new function
(scm_i_string_set_x): new function
(scm_i_is_narrow_symbol): new function
(scm_i_symbol_wide_chars, scm_i_symbol_ref): new function
(scm_string_width): new function
(unistring_escapes_to_guile_escapes): new function
(scm_to_stringn): new function
(scm_i_stringbuf_free): modify for wide strings
(scm_i_substring_copy): modify for wide strings
(scm_i_string_chars, scm_string_append): modify for wide strings
(scm_i_make_symbol, scm_to_locale_stringn): modify for wide strings
(scm_string_dump, scm_symbol_dump, scm_to_locale_stringbuf):
(scm_string, scm_i_deprecated_string_chars): modify for wide strings
(scm_from_locale_string, scm_from_locale_stringn): add null test
* libguile/srfi-13.c: add calls for scm_i_string_start_writing for
each call of scm_i_string_stop_writing
(scm_string_for_each): modify for wide strings
* libguile/socket.c: add calls for scm_i_string_start_writing for each
call of scm_i_string_stop_writing
* libguile/rw.c: add calls for scm_i_string_start_writing for each
call of scm_i_string_stop_writing
* libguile/read.c (scm_read_string): allow reading of wide strings
* libguile/print.h: add declaration for scm_charprint
* libguile/print.c (iprin1): print wide strings and add new string
escapes
(scm_charprint): new function
* libguile/ports.h: new declarations for scm_lfwrite_substr and
scm_lfwrite_str
* libguile/ports.c (update_port_lf): new function
(scm_lfwrite): use update_port_lf
(scm_lfwrite_substr): new function
(scm_lfwrite_str): new function
* test-suite/tests/asm-to-bytecode.test ("compiler"): add string
width byte to sting-like asm tests
* libguile/gen-scmconfig.c (main): Produce a definition for
`scm_t_off'.
* libguile/ports.h (scm_t_port)[read_buf_size, saved_read_buf_size,
write_buf_size, seek, truncate]: Use `scm_t_off' instead of `off_t' so
that the layout and size of the structure does not depend on the
application's `_FILE_OFFSET_BITS' value. Reported by Bill
Schottstaedt, see
http://lists.gnu.org/archive/html/bug-guile/2009-06/msg00018.html.
(scm_set_port_seek, scm_set_port_truncate): Update.
* libguile/ports.c (scm_set_port_seek, scm_set_port_truncate): Use
`scm_t_off' and `off_t_or_off64_t'.
* libguile/fports.c (fport_seek, fport_truncate): Use `scm_t_off'
instead of `off_t'.
* libguile/r6rs-ports.c (bip_seek, cbp_seek, bop_seek): Use `scm_t_off'
instead of `off_t'.
* libguile/rw.c (scm_write_string_partial): Likewise.
* libguile/strports.c (st_resize_port, st_seek, st_truncate): Likewise.
* doc/ref/api-io.texi (Port Implementation): Update prototype of
`scm_set_port_seek ()' and `scm_set_port_truncate ()'.
* NEWS: Update.
scm_fdes_to_port, but take mode bits directly instead of as a C
string.
(scm_i_fdes_to_port): Implement using above.
(scm_open_file): Use scm_i_fdes_to_port together with
scm_i_mode_bits to avoid accessing internals of SCM string from C.
* vports.c (scm_make_soft_port): Use scm_i_fdes_to_port together
with scm_i_mode_bits to avoid accessing internals of SCM string
from C.
* ports.h (scm_i_mode_bits): New, same as scm_mode_bits but with a
SCM string as argument.
* ports.c (scm_i_void_port): New, like scm_void_port but take mode
bits directly instead of C string.
(scm_void_port): Implement using above.
(scm_sys_make_void_port): Use scm_i_void_port together with
scm_i_mode_bits to avoid accessing internals of SCM string.
* convert.i.c, backtrace.c, strop.c, strorder.c, strports.c,
struct.c, unif.c, ports.c: Use SCM_I_STRING_CHARS,
SCM_I_STRING_UCHARS, and SCM_I_STRING_LENGTH instead of
SCM_STRING_CHARS, SCM_STRING_UCHARS, and SCM_STRING_LENGTH,
respectively. Also, replaced scm_return_first with more explicit
scm_remember_upto_here_1, etc, or introduced them in the first
place.
malloc.
* gc-segment.c (scm_i_get_new_heap_segment): remove cluster cruft:
only use SCM_MIN_HEAP_SEG_SIZE.
* ports.c (scm_add_to_port_table): add backwards compatibility
function
* ports.h: use scm_i_ prefix for port table and port table size.
* *.c: add space after commas everywhere.
* *.c: use SCM_VECTOR_SET everywhere, where a vector is written.
Document cases where SCM_WRITABLE_VELTS() is used.
* vectors.h (SCM_VELTS): prepare for write barrier, and let
SCM_VELTS() return a const pointer
(SCM_VECTOR_SET): add macro.
* autogen.sh (mscripts): find and check version number of
autoconf. Complain if 2.53 is not found.