* libguile/strports.c (scm_mkstrport): Use UTF-8; ignore
%default-port-encoding. Rename 'str_len' and 'c_pos' to
'num_bytes' and 'c_byte_pos'. Interpret 'pos' argument
as a character index instead of a byte index.
* module/ice-9/boot-9.scm (%cond-expand-features): Add srfi-6 to the
list of core features.
* module/srfi/srfi-6.scm (open-input-string, open-output-string): Simply
re-export these, since the core versions are now compliant.
* doc/ref/api-io.texi (String Ports): Remove text that describes
non-compliant behavior of string ports with regard to encoding.
* doc/ref/srfi-modules.texi (SRFI-0): Add srfi-6 to the list of
core features.
(SRFI-6): Remove text that mentions non-compliant behavior of
core string ports.
* module/ice-9/format.scm (format):
* module/ice-9/pretty-print.scm (truncated-print):
* module/rnrs/io/ports.scm (open-string-input-port,
open-string-output-port):
* test-suite/test-suite/lib.scm (format-test-name):
* test-suite/tests/chars.test ("combining accent is pretty-printed",
"combining X is pretty-printed"):
* test-suite/tests/ecmascript.test (eread, eread/1):
* test-suite/tests/rdelim.test:
* test-suite/tests/reader.test (read-string):
* test-suite/tests/regexp.test:
* test-suite/tests/srfi-105.test (read-string): Don't set
%default-port-encoding before creating string ports.
* benchmark-suite/benchmarks/ports.bm (%latin1-port): Use
'set-port-encoding!' to set the string port encoding.
(%utf8/ascii-port, %utf8/wide-port, "rdelim"): Don't set
%default-port-encoding before creating string ports.
* test-suite/tests/r6rs-ports.test ("lookahead-u8 non-ASCII"): Don't set
%default-port-encoding before creating string ports.
("put-bytevector with UTF-16 string port", "put-bytevector with
wrong-encoding string port"): Use 'set-port-encoding!' to set the
string port encoding.
* test-suite/tests/print.test (tprint): Use 'set-port-encoding!' to set
the string port encoding.
("truncated-print"): Use 'pass-if-equal'.
* test-suite/tests/ports.test ("encoding failure leads to exception",
"%default-port-encoding is honored", "peek-char [latin-1]", "peek-char
[utf-8]", "peek-char [utf-16]"): Remove tests.
("%default-port-encoding is ignored", "peek-char"): Add tests.
("suitable encoding [latin-1]", "suitable encoding [latin-3]",
"wrong encoding, error", "wrong encoding, substitute",
"wrong encoding, escape"): Use 'set-port-encoding!' to set the
string port encoding.
("%default-port-encoding, wrong encoding"): Rewrite to use
a file port instead of a string port.
* libguile/fports.c (scm_open_file_with_encoding): New API function,
containing the code previously found in 'scm_open_file', but modified
to accept the new 'guess_encoding' and 'encoding' arguments.
(scm_open_file): Now just a simple wrapper that calls
'scm_open_file_with_encoding'.
(scm_i_open_file): New implementation of 'open-file' that accepts
keyword arguments '#:guess-encoding' and '#:encoding', and calls
'scm_open_file_with_encoding'.
(scm_init_fports_keywords): New initialization function that gets
called after keywords are initialized.
* libguile/fports.h (scm_open_file_with_encoding,
scm_init_fports_keywords): Add prototypes.
* libguile/init.c (scm_i_init_guile): Call 'scm_init_fports_keywords'.
* module/ice-9/boot-9.scm: Add enhanced versions of 'open-input-file',
'open-output-file', 'call-with-input-file', 'call-with-output-file',
'with-input-from-file', 'with-output-to-file', and
'with-error-to-file', that accept keyword arguments '#:binary',
'#:encoding', and (for input port constructors) '#:guess-encoding'.
* doc/ref/api-io.texi (File Ports): Update documentation.
* test-suite/tests/ports.test ("keyword arguments for file openers"):
Add tests.
* libguile/ports.c (scm_i_unget_bytes): New static function.
(scm_unget_bytes): New API function.
(scm_unget_byte): Rewrite to simply call 'scm_i_unget_bytes'.
(scm_ungetc, scm_peek_char, looking_at_bytes): Use 'scm_i_unget_bytes'.
* libguile/ports.h: Add prototype for 'scm_unget_bytes'.
* libguile/fports.c (scm_setvbuf): Use 'scm_unget_bytes'.
* libguile/r6rs-ports.c (scm_unget_bytevector): New procedure.
* module/ice-9/binary-ports.scm (unget-bytevector): New export.
* doc/ref/api-io.texi (R6RS Binary Input): Add documentation.
(R6RS I/O Ports): Update brief description of (ice-9 binary-ports) to
reflect the new reality: it is no longer a subset of (rnrs io ports).
* test-suite/tests/ports.test ("unget-bytevector"): Add test.
* libguile/fports.c (scm_open_file): Do not scan for coding
declarations. Replace 'use_encoding' local variable with
'binary'. Update documentation string.
* module/ice-9/psyntax.scm (include): Add the same file-encoding
logic that's used in compile-file and scm_primitive_load.
* module/ice-9/psyntax-pp.scm: Regenerate.
* doc/ref/api-io.texi (File Ports): Update docs.
* test-suite/tests/ports.test: Change "open-file HONORS file coding
declarations" test to "open-file IGNORES file coding declaration".
* test-suite/tests/coding.test (scan-coding): Use 'file-encoding' to
scan for the encoding, since 'open-input-file' no longer does so.
* libguile/ports-internal.h (struct scm_port_internal): Add new members
'at_stream_start_for_bom_read' and 'at_stream_start_for_bom_write'.
(SCM_UNICODE_BOM): New macro.
(scm_i_port_iconv_descriptors): Add 'mode' parameter to prototype.
* libguile/ports.c (scm_new_port_table_entry): Initialize
'at_stream_start_for_bom_read' and 'at_stream_start_for_bom_write'.
(get_iconv_codepoint): Pass new 'mode' parameter to
'scm_i_port_iconv_descriptors'.
(get_codepoint): After reading a codepoint at stream start, record
that we're no longer at stream start, and consume a BOM where
appropriate.
(scm_seek): Set the stream start flags according to the new position.
(looking_at_bytes): New static function.
(scm_utf8_bom, scm_utf16be_bom, scm_utf16le_bom, scm_utf32be_bom,
scm_utf32le_bom): New static const arrays.
(decide_utf16_encoding, decide_utf32_encoding): New static functions.
(scm_i_port_iconv_descriptors): Add new 'mode' parameter. If the
specified encoding is UTF-16 or UTF-32, make that precise by deciding
what byte order to use, and construct iconv descriptors based on the
precise encoding.
(scm_i_set_port_encoding_x): Record that we are now at stream start.
Do not open the new iconv descriptors immediately; let them be
initialized lazily.
* libguile/print.c (display_string_using_iconv): Record that we're no
longer at stream start. Write a BOM if appropriate.
* doc/ref/api-io.texi (BOM Handling): New node.
* test-suite/tests/ports.test ("set-port-encoding!, wrong encoding"):
Adapt test to cope with the fact that 'set-port-encoding!' does not
immediately open the iconv descriptors.
(bv-read-test): New procedure.
("unicode byte-order marks (BOMs)"): New test prefix.
* libguile/r6rs-ports.c (scm_get_bytevector_some): Rewrite to
efficiently take the contents of the read/putback buffers. In the
docstring, clarify that it might not return all available bytes.
* doc/ref/api-io.texi (R6RS Binary Input): Clarify that
'get-bytevector-some' might not return all available bytes.
* test-suite/tests/r6rs-ports.test ("get-bytevector-some [only-some]"):
Remove bogus test, which requires more than the R6RS requires.
Fixes <http://bugs.gnu.org/11468>.
* libguile/ports.c (scm_conversion_strategy): Remove.
(default_conversion_strategy_var, sym_error, sym_substitute,
sym_escape): New variables.
(scm_i_get_conversion_strategy, scm_i_set_conversion_strategy_x):
Remove.
(scm_i_default_port_conversion_handler,
scm_i_set_default_port_conversion_handler): New functions.
(scm_port_conversion_strategy): Use
`scm_i_default_port_conversion_handler' when PORT is #f.
(scm_set_port_conversion_strategy_x): Use SYM_ERROR, SYM_SUBSTITUTE,
and SYM_ESCAPE. Use `scm_i_set_default_port_conversion_handler' when
PORT is #f.
(scm_init_ports): Initialize DEFAULT_CONVERSION_STRATEGY_VAR.
* libguile/ports.h: Update declarations accordingly.
* libguile/foreign.c: Change
`scm_i_get_conversion_strategy (SCM_BOOL_F)' to
`scm_i_default_port_conversion_handler ()'.
* libguile/strings.c: Likewise.
* test-suite/tests/ports.test ("%default-port-conversion-strategy"): New
test prefix.
* test-suite/tests/foreign.test ("pointer<->string")["%default-port-conversion-strategy
is error", "%default-port-conversion-strategy is soft"]: New tests.
* test-suite/test-suite/lib.scm (exception:encoding-error): Allow the
regexp to match `scm_to_stringn' error messages.
* doc/ref/api-io.texi (Ports): Document `%default-port-conversion-strategy'.
* doc/ref/api-compound.texi
* doc/ref/api-evaluation.texi
* doc/ref/api-foreign.texi
* doc/ref/api-io.texi
* doc/ref/posix.texi
* doc/ref/srfi-modules.texi: Add missing parentheses and commas to definitions
of C functions.
* doc/ref/api-data.texi: Change from @deffn to @deftypefn for C function
with arguments not of SCM type.
* doc/ref/api-evaluation.texi (Scheme Read): Note that read-set! is
syntax.
(Scheme Write): Likewise for print-set!.
* doc/ref/api-io.texi (Writing): Remove reference to
print-options-interface.
* doc/ref/repl-modules.texi (Readline Options): Update, and add entries
for readline-options, readline-set! et al.
* libguile/ports.c (scm_read_char): Mention `decoding-error' in the
docstring.
(get_codepoint): Change to return an error code; add `codepoint'
output parameter. Don't raise an error from here.
(scm_getc): Raise an error with `scm_decoding_error' if
`get_codepoint' returns an error.
(scm_peek_char): Likewise. Update docstring.
* libguile/strings.c (scm_decoding_error_key): New variable.
(scm_decoding_error): New function.
(scm_from_stringn): Use `scm_decoding_error' instead of
`scm_encoding_error'.
* libguile/strings.h (scm_decoding_error): New declaration.
* test-suite/tests/ports.test ("string ports")["read-char, wrong
encoding, error"]: Change to expect `decoding-error'. Make sure PORT
points past the error.
["read-char, wrong encoding, escape"]: Likewise.
["peek-char, wrong encoding, error"]: New test.
* test-suite/tests/r6rs-ports.test ("7.2.11 Binary
Output")["put-bytevector with wrong-encoding string port"]: Change to
expect `decoding-error'.
("8.2.6 Input and output ports")["transcoded-port [error handling
mode = raise]"]: Likewise.
* test-suite/tests/rdelim.test ("read-line")["decoding error", "decoding
error, substitute"]: New tests.
* doc/ref/api-io.texi (Reading): Update documentation of `read-char' and
`peek-char'.
(Line/Delimited): Update documentation of `read-line'.
* doc/ref/api-evaluation.texi (Scheme Read): Fold all reader options
docs into this section. Undocument read-options-interface.
(Scheme Write): New section for `write' and `display', and the print
options. print-enable/print-disable are not documented, as there are
no boolean print options. print-options-interface is likewise
undocumented.
* doc/ref/api-options.texi: Remove discussion of options in
general. Move read options to Scheme Read, and print options to Scheme
Write.
* doc/ref/api-io.texi (Reading): Link to Scheme Read.
(Writing): Move write and display to Scheme Write, and link there.
* doc/ref/srfi-modules.texi:
* doc/ref/api-debug.texi:
* doc/ref/api-data.texi: Update xrefs.
The open-file port should use the 8-bit ISO-8859-1 encoding when
a file is opened using mode "b". Also, it should honor a "coding:"
declaration at the top of a file when reading files where it is present.
* libguile/fports.c (scm_open_file): modified
* test-suite/tests/ports.test: more tests for open-file
* doc/ref/api-io.texi (File Ports): more documentation for open-file
Conflicts:
doc/ref/api-procedures.texi
doc/ref/misc-modules.texi
(Caused by me removing `@page' from a couple of sections that have been modified
by others.)
* libguile/strports.c (scm_i_mkstrport): Remove.
(scm_mkstrport): Don't change the port's encoding to UTF-8; convert
STR to the default port encoding.
(scm_strport_to_string): Fix documentation & indentation.
* libguile/strports.h (scm_i_mkstrport): Remove.
* test-suite/lib.scm (exception:encoding-error): New variable.
(format-test-name): Set `%default-port-encoding' to "UTF-8".
* test-suite/tests/ports.test ("string ports")["%default-port-encoding
is honored", "suitable encoding [latin-1]", "suitable encoding
[latin-3]", "wrong encoding"]: New tests.
* test-suite/tests/r6rs-ports.test ("7.2.11 Binary
Output")["put-bytevector with UTF-16 string port", "put-bytevector
with wrong-encoding string port"]: New tests.
* test-suite/tests/reader.test (read-string): Set
`%default-port-encoding' to `#f'.
("reading")["unprintable symbol"]: Use a string that doesn't contain
zeros.
* doc/ref/api-io.texi (String Ports): Document encoding issues with
`call-with-output-string' and `with-output-to-string'.
* doc/ref/api-evaluation.texi (Character Encoding of Source Files):
Mention IANA as the list of supported character encodings. Thanks to
Bruno Haible for pointing this out.
* doc/ref/api-io.texi (Ports): Likewise. Improve documentation of
`%default-port-encoding'.
because it looks better in the DVI output. Exceptions are
- wide examples, which would cause overfull hboxes if they
used the bigger @lisp font
- very large examples, which may look too big at the @lisp size.
* libguile/gen-scmconfig.c (main): Produce a definition for
`scm_t_off'.
* libguile/ports.h (scm_t_port)[read_buf_size, saved_read_buf_size,
write_buf_size, seek, truncate]: Use `scm_t_off' instead of `off_t' so
that the layout and size of the structure does not depend on the
application's `_FILE_OFFSET_BITS' value. Reported by Bill
Schottstaedt, see
http://lists.gnu.org/archive/html/bug-guile/2009-06/msg00018.html.
(scm_set_port_seek, scm_set_port_truncate): Update.
* libguile/ports.c (scm_set_port_seek, scm_set_port_truncate): Use
`scm_t_off' and `off_t_or_off64_t'.
* libguile/fports.c (fport_seek, fport_truncate): Use `scm_t_off'
instead of `off_t'.
* libguile/r6rs-ports.c (bip_seek, cbp_seek, bop_seek): Use `scm_t_off'
instead of `off_t'.
* libguile/rw.c (scm_write_string_partial): Likewise.
* libguile/strports.c (st_resize_port, st_seek, st_truncate): Likewise.
* doc/ref/api-io.texi (Port Implementation): Update prototype of
`scm_set_port_seek ()' and `scm_set_port_truncate ()'.
* NEWS: Update.
* scheme-compound.texi: Renamed to api-compound.texi.
* scheme-control.texi: Renamed to api-control.texi.
* scheme-data.texi: Renamed to api-data.texi.
* scheme-debug.texi: Renamed to api-debug.texi.
* deprecated.texi: Renamed to api-deprecated.texi.
* scheme-evaluation.texi: Renamed to api-evaluation.texi.
* ref-init.texi: Renamed to api-init.texi.
* scheme-io.texi: Renamed to api-io.texi.
* scheme-memory.texi: Renamed to api-memory.texi.
* scheme-modules.texi: Renamed to api-modules.texi.
* scheme-options.texi: Renamed to api-options.texi.
* scm.texi: Renamed to api-overview.texi.
* scheme-procedures.texi: Renamed to api-procedures.texi.
* scheme-scheduling.texi: Renamed to api-scheduling.texi.
* scheme-scm.texi: Renamed to api-scm.texi.
* scheme-smobs.texi: Renamed to api-smobs.texi.
* scheme-snarf.texi: Renamed to api-snarf.texi.
* scheme-translation.texi: Renamed to api-translation.texi.
* scheme-utility.texi: Renamed to api-utility.texi.
* debugging.texi: Renamed to scheme-debugging.texi.
* scripts.texi: Renamed to scheme-scripts.texi.
* program.texi: Renamed to libguile-program.texi.