* libguile/ports-internal.h (struct scm_port_internal): Add
setvbuf' field. Change 'pending_eof' to a 1-bit unsigned char.
* libguile/ports.c (scm_new_port_table_entry): Clear 'pti->setvbuf'.
* libguile/fports.c (scm_setvbuf): Accept any open port, and error out
when PORT's setvbuf' field is NULL. Remove explicit 'scm_gc_free' calls.
(scm_i_fdes_to_port): Set PORT's 'setvbuf' field.
* test-suite/tests/ports.test ("setvbuf")["closed port", "string port"]:
New tests.
* doc/ref/posix.texi (Ports and File Descriptors): Suggest that
'setvbuf' works for different port types.
* libguile/fports.c (scm_open_file_with_encoding): New API function,
containing the code previously found in 'scm_open_file', but modified
to accept the new 'guess_encoding' and 'encoding' arguments.
(scm_open_file): Now just a simple wrapper that calls
'scm_open_file_with_encoding'.
(scm_i_open_file): New implementation of 'open-file' that accepts
keyword arguments '#:guess-encoding' and '#:encoding', and calls
'scm_open_file_with_encoding'.
(scm_init_fports_keywords): New initialization function that gets
called after keywords are initialized.
* libguile/fports.h (scm_open_file_with_encoding,
scm_init_fports_keywords): Add prototypes.
* libguile/init.c (scm_i_init_guile): Call 'scm_init_fports_keywords'.
* module/ice-9/boot-9.scm: Add enhanced versions of 'open-input-file',
'open-output-file', 'call-with-input-file', 'call-with-output-file',
'with-input-from-file', 'with-output-to-file', and
'with-error-to-file', that accept keyword arguments '#:binary',
'#:encoding', and (for input port constructors) '#:guess-encoding'.
* doc/ref/api-io.texi (File Ports): Update documentation.
* test-suite/tests/ports.test ("keyword arguments for file openers"):
Add tests.
* libguile/ports.c (scm_i_unget_bytes): New static function.
(scm_unget_bytes): New API function.
(scm_unget_byte): Rewrite to simply call 'scm_i_unget_bytes'.
(scm_ungetc, scm_peek_char, looking_at_bytes): Use 'scm_i_unget_bytes'.
* libguile/ports.h: Add prototype for 'scm_unget_bytes'.
* libguile/fports.c (scm_setvbuf): Use 'scm_unget_bytes'.
* libguile/r6rs-ports.c (scm_unget_bytevector): New procedure.
* module/ice-9/binary-ports.scm (unget-bytevector): New export.
* doc/ref/api-io.texi (R6RS Binary Input): Add documentation.
(R6RS I/O Ports): Update brief description of (ice-9 binary-ports) to
reflect the new reality: it is no longer a subset of (rnrs io ports).
* test-suite/tests/ports.test ("unget-bytevector"): Add test.
* libguile/fports.c (scm_open_file): Do not scan for coding
declarations. Replace 'use_encoding' local variable with
'binary'. Update documentation string.
* module/ice-9/psyntax.scm (include): Add the same file-encoding
logic that's used in compile-file and scm_primitive_load.
* module/ice-9/psyntax-pp.scm: Regenerate.
* doc/ref/api-io.texi (File Ports): Update docs.
* test-suite/tests/ports.test: Change "open-file HONORS file coding
declarations" test to "open-file IGNORES file coding declaration".
* test-suite/tests/coding.test (scan-coding): Use 'file-encoding' to
scan for the encoding, since 'open-input-file' no longer does so.
* libguile/fports.c: Reorder includes to put system includes first;
fixes windows/winsock2 include error problem. Rely on Gnulib's
select module.
(fport_input_waiting): Use select instead of scm_std_select.
* libguile/socket.c (scm_init_socket): Remove mingw-specific code.
* libguile/fports.c: Remove ftruncate redefine; mingw is fine.
(scm_i_fdes_to_port): If we have no F_GETFL, just do an fstat. The
right place for an F_GETFL replacement would be in gnulib.
(fport_input_waiting): Remove an outdated comment.
* libguile/error.c (SCM_I_STRERROR, SCM_I_ERRNO): Remove, replacing uses
with strerror and errno.
* libguile/win32-socket.c:
* libguile/win32-socket.h: Remove. Mingw has suitable replacements.
* configure.ac:
* libguile/Makefile.am (EXTRA_libguile_@GUILE_EFFECTIVE_VERSION@_la_SOURCES):
(noinst_HEADERS): Update for win32-socket removal.
* libguile/ports.h:
* libguile/ports.c (scm_consume_byte_order_mark): New procedure.
* libguile/fports.c (scm_open_file): Call consume-byte-order-mark if we
are opening a file in "r" mode.
* libguile/read.c (scm_i_scan_for_encoding): Don't do anything about
byte-order marks.
* libguile/load.c (scm_primitive_load): Add a note about the duplicate
encoding scan.
* test-suite/tests/filesys.test: Add tests for UTF-8, UTF-16BE, and
UTF-16LE BOM handling.
* libguile/fports.c (scm_setvbuf): Use `scm_take_from_input_buffers'
directly instead of `scm_drain_input'; use `scm_unget_byte' instead of
`scm_unread_string' to put the drained input back to PORT. This
leaves PORT's line/column numbers unchanged, whereas they'd previously
be decreased by the `scm_unread_string' call.
* libguile/ports.c (scm_take_from_input_buffers): Update description and
variable names to refer to "bytes", not "chars".
* test-suite/tests/ports.test ("setvbuf"): New test prefix.
* libguile/fports.c (scm_revealed_count, scm_port_revealed)
(scm_set_port_revealed_x, scm_adjust_port_revealed_x): Move these APIs
here, and only operate on fports. To keep revealed ports alive, now
we will just keep them in a data structure that the GC knows about --
a static list.
* libguile/fports.h: Add revealed count to scm_t_fport, and move decls
of revealed-count functions here.
* libguile/ports.h:
* libguile/ports.c: Adapt to change. Remove SCM_REVEALED and
SCM_SETREVEALED; since they only apply to fports now, keeping them
around would be inviting type errors.
(finalize_port): We don't need to worry about resuscitating ports
here.
* libguile/init.c: Use the scm_set_port_revealed_x function to set the
revealed counts on stream ports.
* libguile/fports.c (close_the_fd, fport_close): Arrange to always close
the fd, even if the flush procedure throws an exception. Perhaps the
port machinery should do this for us, though. Don't wrap the close
call in SCM_SYSCALL, EINTR leaves the fd in an unspecified state.
Don't bother freeing buffers, the collector will handle that; simply
drop references via scm_port_non_buffer.
* libguile/ports.c (do_free, finalize_port): Catch exceptions caused by
the free procedure. Don't bother setting the stream to 0 at all.
(scm_close_port): Ensure that exceptions thrown by the "close"
procedure don't prevent the port from being marked as closed.
* libguile/ports.c (scm_putc, scm_puts):
* libguile/ports.h (scm_putc_unlocked, scm_puts_unlocked): Separate into
_unlocked and locked variants. Change all callers to use the
_unlocked versions.
* libguile/ports.c (scm_c_make_port_with_encoding, scm_c_make_port): New
functions, to replace scm_new_port_table_entry. Use a weak set
instead of a weak table.
(scm_i_remove_port):
(scm_c_port_for_each, scm_port_for_each): Adapt to use weak set.
(scm_i_void_port): Use scm_c_make_port.
(scm_init_ports): Make a weak set.
* libguile/fports.c:
* libguile/ioext.c:
* libguile/r6rs-ports.c:
* libguile/strports.c:
* libguile/vports.c: Adapt to use the new scm_c_make_port API.
* libguile/ports.c (scm_drain_input): Slight optimization.
* libguile/fports.c (scm_setvbuf): If there is buffered output, flush
it. If there is input, drain it, and then unread it after updating
the buffers. Much more sensible than dropping it silently...
The open-file port should use the 8-bit ISO-8859-1 encoding when
a file is opened using mode "b". Also, it should honor a "coding:"
declaration at the top of a file when reading files where it is present.
* libguile/fports.c (scm_open_file): modified
* test-suite/tests/ports.test: more tests for open-file
* doc/ref/api-io.texi (File Ports): more documentation for open-file
* libguile/filesys.h:
* libguile/filesys.c (scm_i_relativize_path): New function, moved here
from fports.c. Internal for now; we can make it external though if
people like its interface.
* libguile/fports.c (fport_canonicalize_filename): Move all of the
tricky bits to filesys.c. Also fixes a bug in which a delimiter wasn't
stripped.
* libguile/fports.c (%file-port-name-canonicalization): New global var.
(fport_canonicalize_filename): New helper. If
%file-port-name-canonicalization is 'absolute, then run file port
names through canonicalize_path; if it's 'relative, then canonicalize
the name, but strip off load paths; otherwise leave the port name
alone.
(scm_open_file): Use fport_canonicalize_filename.
(scm_init_fports): Define %file-port-name-canonicalization.
The intent is to allow compilation with `-Wundef', which in turn should
make it easier to catch erroneous uses of nonexistent macros.
* libguile/__scm.h: Don't assume `BUILDING_LIBGUILE' is defined.
* libguile/conv-uinteger.i.c (SCM_TO_TYPE_PROTO): Remove unneeded CPP
conditional on `TYPE_MIN == 0'.
* libguile/fports.c: Check for the definition of `HAVE_CHSIZE' and
`HAVE_FTRUNCATE', not for their value.
* libguile/ports.c: Likewise.
* libguile/numbers.c (guile_ieee_init): Likewise with `HAVE_DINFINITY'
and `HAVE_DQNAN'.
* test-suite/standalone/test-conversion.c (ieee_init): Likewise.
* libguile/strings.c: Likewise with `SCM_STRING_LENGTH_HISTOGRAM'.
* libguile/strings.h: Likewise.
* libguile/tags.h: Likewise with `HAVE_INTTYPES_H' and `HAVE_STDINT_H'.
* libguile/threads.c: Likewise with `HAVE_PTHREAD_GET_STACKADDR_NP'.
* libguile/vm-engine.c (VM_NAME): Likewise with `VM_CHECK_IP'.
* libguile/gen-scmconfig.c (main): Use "#ifdef HAVE_", not "#if HAVE_".
* libguile/socket.c (scm_setsockopt): Likewise.
Ports are given two additional properties: a character encoding and
a conversion failure strategy. These properties have getters and setters.
The new properties are used to convert any locale text to/from the
internal representation of strings.
If unspecified, ports use a default value. The default value of these
properties is held in a fluid. The default character encoding can be
modified by calling setlocale.
ISO-8859-1 is treated specially. Since it is a native encoding of
strings, it can be processed more quickly. Source code is assumed to be
ISO-8859-1 unless otherwise specified. The encoding of a source code
file can be given as 'coding: XXXXX' in a magic comment at the top of a
file.
The C functions that deal with encoding often use a null pointer
as shorthand for the native Latin-1 encoding, for efficiency's sake.
* test-suite/tests/encoding-iso88591.test: new tests
* test-suite/tests/encoding-iso88597.test: new tests
* test-suite/tests/encoding-utf8.test: new tests
* test-suite/tests/encoding-escapes.test: new tests
* test-suite/tests/numbers.test: declare 'binary' encoding
* test-suite/tests/ports.test: declare 'binary' encoding
* test-suite/tests/r6rs-ports.test: declare 'binary' encoding
* module/system/base/compile.scm (compile-file): use source-code
file's self-declared encoding when compiling files
* libguile/strports.c: store string ports in locale encoding
(scm_strport_to_locale_u8vector, scm_call_with_output_locale_u8vector)
(scm_open_input_locale_u8vector, scm_get_output_locale_u8vector):
new functions
* libguile/strings.h: new declaration for scm_i_string_contains_char
* libguile/strings.c (scm_i_string_contains_char): new function
(scm_from_stringn, scm_to_stringn): use NULL for Latin-1
(scm_from_locale_stringn, scm_to_locale_stringn): respect character
encoding of input and output ports
* libguile/read.h: declaration for scm_scan_for_encoding
* libguile/read.c:
(read_token): now takes scheme string instead of C string/length
(read_complete_token): new function
(scm_read_sexp, scm_read_number, scm_read_mixed_case_symbol)
(scm_read_number_and_radix, scm_read_quote, scm_read_semicolon_comment)
(scm_read_srfi4_vector, scm_read_bytevector, scm_read_guile_bit_vector)
(scm_read_scsh_block_comment, scm_read_commented_expression)
(scm_read_extended_symbol, scm_read_sharp_extension, scm_read_shart)
(scm_read_expression): use scm_t_wchar for char type, use read_complete_token
(scm_scan_for_encoding): new function to find a file's character encoding
(scm_file_encoding): new function to find a port's character encoding
* libguile/rdelim.c: don't unpack strings
* libguile/print.h: declaration for modified function
scm_i_charprint
* libguile/print.c: use locale when printing characters and
strings
(scm_i_charprint): input parameter is now scm_t_wchar
(scm_simple_format): don't unpack strings
* libguile/posix.h: new declaration for scm_setbinary.
* libguile/posix.c (scm_setlocale): set default and stdio port
encodings based on the locale's character encoding
(scm_setbinary): new function
* libguile/ports.h (scm_t_port): add encoding and failed
conversion handler to port type. Declarations for new or modified
functions scm_getc, scm_unget_byte, scm_ungetc,
scm_i_get_port_encoding, scm_i_set_port_encoding_x,
scm_port_encoding, scm_set_port_encoding_x,
scm_i_get_conversion_strategy, scm_i_set_conversion_strategy_x,
scm_port_conversion_strategy, scm_set_port_conversion_strategy_x.
* libguile/ports.c: assign the current ports to zero on startup so
we can see if they've been set.
(scm_current_input_port, scm_current_output_port,
scm_current_error_port): return #f if the port is not yet
initialized
(scm_new_port_table_entry): set up a new port's encoding and
illegal sequence handler based on the thread's current defaults
(scm_i_remove_port): free port encoding name when port is removed
(scm_i_mode_bits_n): now takes a scheme string instead of a c
string and length. All callers changed.
(SCM_MBCHAR_BUF_SIZE): new const
(scm_getc): new function, since the scm_getc in inline.h is now
scm_get_byte_or_eof. This pulls one codepoint from a port.
(scm_lfwrite_substr, scm_lfwrite_str): now uses port's encoding
(scm_unget_byte): new function, incorportaing the low-level functionality
of scm_ungetc
(scm_ungetc): uses scm_unget_byte
* libguile/numbers.h (scm_t_wchar): compilation order problem with
scm_t_wchar being use in functions in multiple headers. Forward
declare scm_t_wchar.
* libguile/load.c (scm_primitive_load): scan for file encoding at
top of file and use it to set the load port's encoding
* libguile/inline.h (scm_get_byte_or_eof): new function
incorporating most of the functionality of scm_getc.
* libguile/fports.c (fport_fill_input): now returns scm_t_wchar
* libguile/chars.h (scm_t_wchar): avoid compilation order problem
with declaration of scm_t_wchar
* libguile/gen-scmconfig.c (main): Produce a definition for
`scm_t_off'.
* libguile/ports.h (scm_t_port)[read_buf_size, saved_read_buf_size,
write_buf_size, seek, truncate]: Use `scm_t_off' instead of `off_t' so
that the layout and size of the structure does not depend on the
application's `_FILE_OFFSET_BITS' value. Reported by Bill
Schottstaedt, see
http://lists.gnu.org/archive/html/bug-guile/2009-06/msg00018.html.
(scm_set_port_seek, scm_set_port_truncate): Update.
* libguile/ports.c (scm_set_port_seek, scm_set_port_truncate): Use
`scm_t_off' and `off_t_or_off64_t'.
* libguile/fports.c (fport_seek, fport_truncate): Use `scm_t_off'
instead of `off_t'.
* libguile/r6rs-ports.c (bip_seek, cbp_seek, bop_seek): Use `scm_t_off'
instead of `off_t'.
* libguile/rw.c (scm_write_string_partial): Likewise.
* libguile/strports.c (st_resize_port, st_seek, st_truncate): Likewise.
* doc/ref/api-io.texi (Port Implementation): Update prototype of
`scm_set_port_seek ()' and `scm_set_port_truncate ()'.
* NEWS: Update.
This fixes bug #24009 reported by Martin Pitt.
* libguile/threads.c (guilify_self_1): Check the return value of
pipe(2).
(scm_std_select): Use `full_read ()' instead of `read ()' when reading
from WAKEUP_FD.
* libguile/async.c (scm_i_queue_async_cell): Use `full_write ()' instead
of write(2) when writing to SLEEP_FD.
* libguile/fports.c (fport_flush): Likewise.
* libguile/posix.c (getgroups): Use the return value of getgroups(2) as
NGROUPS.
(scm_nice): Get the return value of nice(2) to make glibc happy.
* libguile/scmsigs.c (take_signal): Use `full_write ()' instead of
write(2).