The open-file port should use the 8-bit ISO-8859-1 encoding when
a file is opened using mode "b". Also, it should honor a "coding:"
declaration at the top of a file when reading files where it is present.
* libguile/fports.c (scm_open_file): modified
* test-suite/tests/ports.test: more tests for open-file
* doc/ref/api-io.texi (File Ports): more documentation for open-file
* libguile/strings.c (scm_encoding_error): Change arguments to convey
more information. Raise the error with `scm_throw ()', passing all
the information to the handler.
(scm_from_stringn, scm_to_stringn): Update accordingly.
* test-suite/tests/ports.test ("string ports")["wrong encoding"]: Check
the arguments passed to the `throw' handler.
* test-suite/tests/r6rs-ports.test ("7.2.11 Binary
Output")["put-bytevector with wrong-encoding string port"]: Likewise.
* libguile/strports.c (scm_i_mkstrport): Remove.
(scm_mkstrport): Don't change the port's encoding to UTF-8; convert
STR to the default port encoding.
(scm_strport_to_string): Fix documentation & indentation.
* libguile/strports.h (scm_i_mkstrport): Remove.
* test-suite/lib.scm (exception:encoding-error): New variable.
(format-test-name): Set `%default-port-encoding' to "UTF-8".
* test-suite/tests/ports.test ("string ports")["%default-port-encoding
is honored", "suitable encoding [latin-1]", "suitable encoding
[latin-3]", "wrong encoding"]: New tests.
* test-suite/tests/r6rs-ports.test ("7.2.11 Binary
Output")["put-bytevector with UTF-16 string port", "put-bytevector
with wrong-encoding string port"]: New tests.
* test-suite/tests/reader.test (read-string): Set
`%default-port-encoding' to `#f'.
("reading")["unprintable symbol"]: Use a string that doesn't contain
zeros.
* doc/ref/api-io.texi (String Ports): Document encoding issues with
`call-with-output-string' and `with-output-to-string'.
Ports are given two additional properties: a character encoding and
a conversion failure strategy. These properties have getters and setters.
The new properties are used to convert any locale text to/from the
internal representation of strings.
If unspecified, ports use a default value. The default value of these
properties is held in a fluid. The default character encoding can be
modified by calling setlocale.
ISO-8859-1 is treated specially. Since it is a native encoding of
strings, it can be processed more quickly. Source code is assumed to be
ISO-8859-1 unless otherwise specified. The encoding of a source code
file can be given as 'coding: XXXXX' in a magic comment at the top of a
file.
The C functions that deal with encoding often use a null pointer
as shorthand for the native Latin-1 encoding, for efficiency's sake.
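As a rough illustration of the 'coding: XXXXX' convention described above, the following C sketch looks for a declaration in the first bytes of a file and falls back to ISO-8859-1 when none is found. It is not the code of scm_scan_for_encoding: the names scan_for_coding and source_encoding are hypothetical, and the sketch assumes the declaration may appear anywhere in the scanned buffer, without checking that it sits inside a comment.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not Guile's scm_scan_for_encoding: look for
   "coding:" in the first LEN bytes of BUF and copy the encoding name
   that follows into OUT (at most OUTSZ - 1 characters).  Return 1 if
   a non-empty name was found, 0 otherwise.  */
static int
scan_for_coding (const char *buf, size_t len, char *out, size_t outsz)
{
  static const char key[] = "coding:";
  const size_t keylen = sizeof key - 1;
  size_t i, n = 0;

  assert (buf != NULL && out != NULL && outsz > 0);
  out[0] = '\0';

  /* Find the first occurrence of the key inside the buffer.  */
  for (i = 0; i + keylen <= len; i++)
    if (memcmp (buf + i, key, keylen) == 0)
      break;
  if (i + keylen > len)
    return 0;

  /* Skip the key and any blanks after it.  */
  i += keylen;
  while (i < len && (buf[i] == ' ' || buf[i] == '\t'))
    i++;

  /* Copy the characters that can plausibly belong to an encoding name.  */
  while (i < len && n + 1 < outsz
         && (isalnum ((unsigned char) buf[i])
             || buf[i] == '-' || buf[i] == '_' || buf[i] == '.'))
    out[n++] = buf[i++];
  out[n] = '\0';

  return n > 0;
}

/* Return the encoding to assume for a source file whose first bytes
   are in BUF: the declared one if present, else ISO-8859-1.  */
static const char *
source_encoding (const char *buf, size_t len, char *scratch, size_t scratchsz)
{
  return scan_for_coding (buf, len, scratch, scratchsz)
         ? scratch : "ISO-8859-1";
}
```

A caller would read the first few hundred bytes of a file into a buffer and pass them to source_encoding; how much of the file is actually scanned is a design choice this sketch leaves open.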
* test-suite/tests/encoding-iso88591.test: new tests
* test-suite/tests/encoding-iso88597.test: new tests
* test-suite/tests/encoding-utf8.test: new tests
* test-suite/tests/encoding-escapes.test: new tests
* test-suite/tests/numbers.test: declare 'binary' encoding
* test-suite/tests/ports.test: declare 'binary' encoding
* test-suite/tests/r6rs-ports.test: declare 'binary' encoding
* module/system/base/compile.scm (compile-file): use source-code
file's self-declared encoding when compiling files
* libguile/strports.c: store string ports in locale encoding
(scm_strport_to_locale_u8vector, scm_call_with_output_locale_u8vector)
(scm_open_input_locale_u8vector, scm_get_output_locale_u8vector):
new functions
* libguile/strings.h: new declaration for scm_i_string_contains_char
* libguile/strings.c (scm_i_string_contains_char): new function
(scm_from_stringn, scm_to_stringn): use NULL for Latin-1
(scm_from_locale_stringn, scm_to_locale_stringn): respect character
encoding of input and output ports
* libguile/read.h: declaration for scm_scan_for_encoding
* libguile/read.c:
(read_token): now takes scheme string instead of C string/length
(read_complete_token): new function
(scm_read_sexp, scm_read_number, scm_read_mixed_case_symbol)
(scm_read_number_and_radix, scm_read_quote, scm_read_semicolon_comment)
(scm_read_srfi4_vector, scm_read_bytevector, scm_read_guile_bit_vector)
(scm_read_scsh_block_comment, scm_read_commented_expression)
(scm_read_extended_symbol, scm_read_sharp_extension, scm_read_shart)
(scm_read_expression): use scm_t_wchar for char type, use read_complete_token
(scm_scan_for_encoding): new function to find a file's character encoding
(scm_file_encoding): new function to find a port's character encoding
* libguile/rdelim.c: don't unpack strings
* libguile/print.h: declaration for modified function
scm_i_charprint
* libguile/print.c: use locale when printing characters and
strings
(scm_i_charprint): input parameter is now scm_t_wchar
(scm_simple_format): don't unpack strings
* libguile/posix.h: new declaration for scm_setbinary.
* libguile/posix.c (scm_setlocale): set default and stdio port
encodings based on the locale's character encoding
(scm_setbinary): new function
* libguile/ports.h (scm_t_port): add encoding and failed
conversion handler to port type. Declarations for new or modified
functions scm_getc, scm_unget_byte, scm_ungetc,
scm_i_get_port_encoding, scm_i_set_port_encoding_x,
scm_port_encoding, scm_set_port_encoding_x,
scm_i_get_conversion_strategy, scm_i_set_conversion_strategy_x,
scm_port_conversion_strategy, scm_set_port_conversion_strategy_x.
* libguile/ports.c: assign the current ports to zero on startup so
we can see if they've been set.
(scm_current_input_port, scm_current_output_port,
scm_current_error_port): return #f if the port is not yet
initialized
(scm_new_port_table_entry): set up a new port's encoding and
illegal sequence handler based on the thread's current defaults
(scm_i_remove_port): free port encoding name when port is removed
(scm_i_mode_bits_n): now takes a Scheme string instead of a C
string and length. All callers changed.
(SCM_MBCHAR_BUF_SIZE): new const
(scm_getc): new function, since the scm_getc in inline.h is now
scm_get_byte_or_eof. This pulls one codepoint from a port.
(scm_lfwrite_substr, scm_lfwrite_str): now use the port's encoding
(scm_unget_byte): new function, incorporating the low-level functionality
of scm_ungetc
(scm_ungetc): uses scm_unget_byte
* libguile/numbers.h (scm_t_wchar): compilation order problem with
scm_t_wchar being used in functions in multiple headers. Forward
declare scm_t_wchar.
* libguile/load.c (scm_primitive_load): scan for file encoding at
top of file and use it to set the load port's encoding
* libguile/inline.h (scm_get_byte_or_eof): new function
incorporating most of the functionality of scm_getc.
* libguile/fports.c (fport_fill_input): now returns scm_t_wchar
* libguile/chars.h (scm_t_wchar): avoid compilation order problem
with declaration of scm_t_wchar
(main data-file-name test-file-name): Exported.
((guile-user)::main): New function, wrapper for function
(test-suite guile-test)::main.
* tests/load.test: Wrapped in module (test-suite test-load).
* tests/ports.test: Wrapped in module (test-suite test-ports).
* tests/r4rs.test: Wrapped in module (test-suite test-r4rs).
Added comments about the required structure of the file itself,
since it is subject to some tests. Removed some now unnecessary
undefine operations.
* tests/syntax.test: Wrapped in module (test-suite test-syntax)
* NEWS: Corrected remarks about SCM_API.
* configure.in: Define USE_DLL_IMPORT to indicate
usage of DLL import macros in `libguile/__scm.h'.
(LIBOBJS): Removed `fileblocks.o' from the list of object files.
Somehow Jim Blandy's patch from 1997 did not survive.
2001-11-04 Stefan Jahn <stefan@lkcc.org>
* configure.in (EXTRA_DEFS): Follow-up patch. Using SCM_IMPORT
instead of __SCM_IMPORT__.
* readline.c (scm_readline_init_ports): Disable input/output
stream redirection for Win32. The readline package for Win32
does not support this. The guile-readline library works fine
for command line editing.
* readline.h (SCM_RL_API): Renamed __FOO__ macros into FOO.
2001-11-04 Stefan Jahn <stefan@lkcc.org>
* Makefile.am (libguile_la_LIBADD): Added $(THREAD_LIBS_LOCAL)
here (was at guile_LDADD) which describes the dependency
correctly and allows a clean build on Win32.
* __scm.h (SCM_API): Follow-up patch. Renamed __FOO__ macros
into FOO.
* __scm.h: USE_DLL_IMPORT indicates the usage of the DLL
import macros for external libraries (libcrypt, libqthreads,
libreadline and libregex).
* coop-defs.h: Include <winsock2.h> for `struct timeval'.
* posix.c (flock): Added support for flock() in M$-Windows.
* guile.c (SCM_IMPORT): Follow-up patch. Use SCM_IMPORT instead
of __SCM_IMPORT__.
* fports.c (getflags): Differentiate reading and writing pipe
descriptors.
* filesys.c (S_IS*): Redefine all of the S_IS*() macros for
M$-Windows.
* coop.c (coop_condition_variable_timed_wait_mutex): Use
conditionalized error code if `ETIMEDOUT' is not available.
(scm_thread_usleep): Remove bogus declaration of `struct timeval
timeout'.
* numbers.c (PTRDIFF_MIN): Moved this definition where it actually
belongs. That is because NO_PREPRO_MAGIC gets undefined after
each inclusion of `num2integral.i.c'.
(SIZE_MAX): Define NO_PREPRO_MAGIC if SIZE_MAX is undefined.
2001-11-04 Stefan Jahn <stefan@lkcc.org>
* md/Makefile.am (EXTRA_DIST): Added `i386.asm'.
* md/i386.asm: New file. Contains the Intel syntax version for
nasm/tasm/masm of the file `i386.s'.
* qt.h.in: Definition of QT_API, QT_IMPORT and QT_EXPORT.
Prefixed each symbol that is meant to go into a DLL.
* Makefile.am (libqthreads_la_LDFLAGS): Put `-no-undefined'
into LDFLAGS to support linkers which do not allow unresolved
symbols inside shared libraries.
(EXTRA_DIST): Add `libqthreads.def', which is an export file
definition for M$-Windows. It defines exported symbols. This is
necessary because the M$VC linker does not know how to export
assembler symbols into a DLL.
2001-11-04 Stefan Jahn <stefan@lkcc.org>
* srfi-13.h, srfi-14.h, srfi-4.h: Follow-up patch. Renamed
__FOO__ macros into FOO.
2001-11-04 Stefan Jahn <stefan@lkcc.org>
* tests/ports.test: Run (close-port) before (delete-file) if
necessary/advisable.
have run.
* tests/ports.test (test-file), tests/load.test (temp-dir):
Redefined using data-file-name instead of tmpnam. The test files
will be created in the build directory instead of /var/tmp or
wherever tmpnam puts them.