The open-file port should use the 8-bit ISO-8859-1 encoding when
a file is opened using mode "b". Also, it should honor a "coding:"
declaration at the top of a file when reading files where it is present.
* libguile/fports.c (scm_open_file): Use ISO-8859-1 for mode "b";
honor a "coding:" declaration when reading.
* test-suite/tests/ports.test: more tests for open-file
* doc/ref/api-io.texi (File Ports): more documentation for open-file
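A minimal sketch of the two checks described above, assuming the mode is
a plain C string and the start of the file has already been read into a
NUL-terminated buffer. Both helpers (sketch_mode_is_binary and
sketch_scan_coding) are hypothetical names used for illustration; they
are not the functions in fports.c.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: nonzero when MODE (e.g. "rb") asks for binary
   I/O, in which case the port would use ISO-8859-1.  */
int
sketch_mode_is_binary (const char *mode)
{
  return strchr (mode, 'b') != NULL;
}

/* Hypothetical helper: look for "coding:" in HEADER, the first bytes of
   a file, and copy the encoding name that follows into OUT (at most
   OUTLEN - 1 characters).  Return 1 if a name was found, 0 otherwise.  */
int
sketch_scan_coding (const char *header, char *out, size_t outlen)
{
  const char *p = strstr (header, "coding:");
  size_t n = 0;

  if (p == NULL || outlen == 0)
    return 0;

  p += strlen ("coding:");
  while (*p == ' ' || *p == '\t')
    p++;

  /* Accept the characters commonly found in encoding names.  */
  while (n + 1 < outlen
         && (isalnum ((unsigned char) *p)
             || *p == '-' || *p == '_' || *p == '.'))
    out[n++] = *p++;
  out[n] = '\0';

  return n > 0;
}
```

A caller would presumably check the mode first and scan for a
declaration only when the file is opened as text.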
* libguile/filesys.h:
* libguile/filesys.c (scm_i_relativize_path): New function, moved here
from fports.c. Internal for now; we can make it external though if
people like its interface.
* libguile/fports.c (fport_canonicalize_filename): Move all of the
tricky bits to filesys.c. Also fixes a bug in which a delimiter wasn't
stripped.
* libguile/fports.c (%file-port-name-canonicalization): New global var.
(fport_canonicalize_filename): New helper. If
%file-port-name-canonicalization is 'absolute, then run file port
names through canonicalize_path; if it's 'relative, then canonicalize
the name, but strip off load paths; otherwise leave the port name
alone.
(scm_open_file): Use fport_canonicalize_filename.
(scm_init_fports): Define %file-port-name-canonicalization.
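The three-way choice above, together with the delimiter stripping
mentioned for scm_i_relativize_path, can be sketched on plain C strings
as follows. Everything here is an assumption made for illustration: the
enum, both function names, and the use of '/' as the only delimiter. The
canonical name is taken as an input rather than computed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the three settings described above.  */
enum sketch_canon_mode { SKETCH_AS_IS, SKETCH_ABSOLUTE, SKETCH_RELATIVE };

/* If CANON lies under directory DIR, return the part of CANON after DIR
   and the '/' that follows it; otherwise return NULL.  */
const char *
sketch_relativize (const char *canon, const char *dir)
{
  size_t len = strlen (dir);

  if (len == 0)
    return NULL;

  /* Ignore trailing delimiters on DIR so "/a" and "/a/" behave alike.  */
  while (len > 0 && dir[len - 1] == '/')
    len--;

  if (strncmp (canon, dir, len) != 0 || canon[len] != '/')
    return NULL;

  return canon + len + 1;       /* skip the delimiter as well */
}

/* NAME is the name as given, CANON its canonical absolute form, and
   LOAD_PATH a NULL-terminated array of directories.  */
const char *
sketch_port_name (enum sketch_canon_mode mode, const char *name,
                  const char *canon, const char *const *load_path)
{
  if (mode == SKETCH_ABSOLUTE)
    return canon;

  if (mode == SKETCH_RELATIVE)
    {
      for (; *load_path != NULL; load_path++)
        {
          const char *rel = sketch_relativize (canon, *load_path);
          if (rel != NULL)
            return rel;
        }
      return canon;             /* not under any load path */
    }

  return name;                  /* leave the port name alone */
}
```

Requiring a delimiter right after the prefix keeps "/opt/sitemap" from
being treated as if it were under "/opt/site".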
The intent is to allow compilation with `-Wundef', which in turn should
make it easier to catch erroneous uses of nonexistent macros.
* libguile/__scm.h: Don't assume `BUILDING_LIBGUILE' is defined.
* libguile/conv-uinteger.i.c (SCM_TO_TYPE_PROTO): Remove unneeded CPP
conditional on `TYPE_MIN == 0'.
* libguile/fports.c: Check for the definition of `HAVE_CHSIZE' and
`HAVE_FTRUNCATE', not for their value.
* libguile/ports.c: Likewise.
* libguile/numbers.c (guile_ieee_init): Likewise with `HAVE_DINFINITY'
and `HAVE_DQNAN'.
* test-suite/standalone/test-conversion.c (ieee_init): Likewise.
* libguile/strings.c: Likewise with `SCM_STRING_LENGTH_HISTOGRAM'.
* libguile/strings.h: Likewise.
* libguile/tags.h: Likewise with `HAVE_INTTYPES_H' and `HAVE_STDINT_H'.
* libguile/threads.c: Likewise with `HAVE_PTHREAD_GET_STACKADDR_NP'.
* libguile/vm-engine.c (VM_NAME): Likewise with `VM_CHECK_IP'.
* libguile/gen-scmconfig.c (main): Use "#ifdef HAVE_", not "#if HAVE_".
* libguile/socket.c (scm_setsockopt): Likewise.
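To illustrate the difference, the fragment below tests for the
definition of a feature macro rather than its value. The two `#undef'
and `#define' lines only simulate what a configure run might produce,
and SKETCH_TRUNCATE_METHOD and sketch_truncate_method are hypothetical
names.

```c
/* Pretend configure left this feature macro undefined...  */
#undef HAVE_FTRUNCATE
/* ...and defined this one.  */
#define HAVE_CHSIZE 1

/* `#if HAVE_FTRUNCATE' would silently evaluate the undefined macro as
   0, which is exactly what `-Wundef' warns about.  Testing for the
   definition is explicit and does not trigger the warning.  */
#ifdef HAVE_FTRUNCATE
# define SKETCH_TRUNCATE_METHOD "ftruncate"
#elif defined HAVE_CHSIZE
# define SKETCH_TRUNCATE_METHOD "chsize"
#else
# define SKETCH_TRUNCATE_METHOD "none"
#endif

/* Report which branch the preprocessor selected.  */
const char *
sketch_truncate_method (void)
{
  return SKETCH_TRUNCATE_METHOD;
}
```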
Ports are given two additional properties: a character encoding and
a conversion failure strategy. These properties have getters and setters.
The new properties are used to convert any locale text to/from the
internal representation of strings.
If unspecified, ports use a default value. The default value of these
properties is held in a fluid. The default character encoding can be
modified by calling setlocale.
ISO-8859-1 is treated specially. Since it is a native encoding of
strings, it can be processed more quickly. Source code is assumed to be
ISO-8859-1 unless otherwise specified. The encoding of a source code
file can be given as 'coding: XXXXX' in a magic comment at the top of a
file.
The C functions that deal with encoding often use a null pointer
as shorthand for the native Latin-1 encoding, for efficiency's sake.
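As a rough illustration of the null-pointer convention, the sketch below
treats a NULL encoding name as Latin-1 and takes a fast path in which
each byte is its own code point. The function name and signature are
hypothetical and do not correspond to a libguile interface.

```c
#include <stddef.h>

/* Decode LEN bytes from IN into OUT.  A NULL ENCODING stands for
   Latin-1, whose byte values are the code points U+0000..U+00FF, so no
   conversion table is needed.  Return 1 on success, or 0 when ENCODING
   names something this sketch cannot handle.  */
int
sketch_latin1_to_codepoints (const char *encoding, const unsigned char *in,
                             size_t len, unsigned int *out)
{
  size_t i;

  if (encoding != NULL)
    return 0;   /* a real converter would be needed here */

  for (i = 0; i < len; i++)
    out[i] = in[i];

  return 1;
}
```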
* test-suite/tests/encoding-iso88591.test: new tests
* test-suite/tests/encoding-iso88597.test: new tests
* test-suite/tests/encoding-utf8.test: new tests
* test-suite/tests/encoding-escapes.test: new tests
* test-suite/tests/numbers.test: declare 'binary' encoding
* test-suite/tests/ports.test: declare 'binary' encoding
* test-suite/tests/r6rs-ports.test: declare 'binary' encoding
* module/system/base/compile.scm (compile-file): use source-code
file's self-declared encoding when compiling files
* libguile/strports.c: store string ports in locale encoding
(scm_strport_to_locale_u8vector, scm_call_with_output_locale_u8vector)
(scm_open_input_locale_u8vector, scm_get_output_locale_u8vector):
new functions
* libguile/strings.h: new declaration for scm_i_string_contains_char
* libguile/strings.c (scm_i_string_contains_char): new function
(scm_from_stringn, scm_to_stringn): use NULL for Latin-1
(scm_from_locale_stringn, scm_to_locale_stringn): respect character
encoding of input and output ports
* libguile/read.h: declaration for scm_scan_for_encoding
* libguile/read.c:
(read_token): now takes scheme string instead of C string/length
(read_complete_token): new function
(scm_read_sexp, scm_read_number, scm_read_mixed_case_symbol)
(scm_read_number_and_radix, scm_read_quote, scm_read_semicolon_comment)
(scm_read_srfi4_vector, scm_read_bytevector, scm_read_guile_bit_vector)
(scm_read_scsh_block_comment, scm_read_commented_expression)
(scm_read_extended_symbol, scm_read_sharp_extension, scm_read_sharp)
(scm_read_expression): use scm_t_wchar for char type, use read_complete_token
(scm_scan_for_encoding): new function to find a file's character encoding
(scm_file_encoding): new function to find a port's character encoding
* libguile/rdelim.c: don't unpack strings
* libguile/print.h: declaration for modified function
scm_i_charprint
* libguile/print.c: use locale when printing characters and
strings
(scm_i_charprint): input parameter is now scm_t_wchar
(scm_simple_format): don't unpack strings
* libguile/posix.h: new declaration for scm_setbinary.
* libguile/posix.c (scm_setlocale): set default and stdio port
encodings based on the locale's character encoding
(scm_setbinary): new function
* libguile/ports.h (scm_t_port): add encoding and failed
conversion handler to port type. Declarations for new or modified
functions scm_getc, scm_unget_byte, scm_ungetc,
scm_i_get_port_encoding, scm_i_set_port_encoding_x,
scm_port_encoding, scm_set_port_encoding_x,
scm_i_get_conversion_strategy, scm_i_set_conversion_strategy_x,
scm_port_conversion_strategy, scm_set_port_conversion_strategy_x.
* libguile/ports.c: assign the current ports to zero on startup so
we can see if they've been set.
(scm_current_input_port, scm_current_output_port,
scm_current_error_port): return #f if the port is not yet
initialized
(scm_new_port_table_entry): set up a new port's encoding and
illegal sequence handler based on the thread's current defaults
(scm_i_remove_port): free port encoding name when port is removed
(scm_i_mode_bits_n): now takes a Scheme string instead of a C
string and length. All callers changed.
(SCM_MBCHAR_BUF_SIZE): new const
(scm_getc): new function, since the scm_getc in inline.h is now
scm_get_byte_or_eof. This pulls one codepoint from a port.
(scm_lfwrite_substr, scm_lfwrite_str): now uses port's encoding
(scm_unget_byte): new function, incorporating the low-level functionality
of scm_ungetc
(scm_ungetc): uses scm_unget_byte
* libguile/numbers.h (scm_t_wchar): compilation order problem with
scm_t_wchar being used in functions in multiple headers. Forward
declare scm_t_wchar.
* libguile/load.c (scm_primitive_load): scan for file encoding at
top of file and use it to set the load port's encoding
* libguile/inline.h (scm_get_byte_or_eof): new function
incorporating most of the functionality of scm_getc.
* libguile/fports.c (fport_fill_input): now returns scm_t_wchar
* libguile/chars.h (scm_t_wchar): avoid compilation order problem
with declaration of scm_t_wchar
* libguile/gen-scmconfig.c (main): Produce a definition for
`scm_t_off'.
* libguile/ports.h (scm_t_port)[read_buf_size, saved_read_buf_size,
write_buf_size, seek, truncate]: Use `scm_t_off' instead of `off_t' so
that the layout and size of the structure does not depend on the
application's `_FILE_OFFSET_BITS' value. Reported by Bill
Schottstaedt, see
http://lists.gnu.org/archive/html/bug-guile/2009-06/msg00018.html.
(scm_set_port_seek, scm_set_port_truncate): Update.
* libguile/ports.c (scm_set_port_seek, scm_set_port_truncate): Use
`scm_t_off' and `off_t_or_off64_t'.
* libguile/fports.c (fport_seek, fport_truncate): Use `scm_t_off'
instead of `off_t'.
* libguile/r6rs-ports.c (bip_seek, cbp_seek, bop_seek): Use `scm_t_off'
instead of `off_t'.
* libguile/rw.c (scm_write_string_partial): Likewise.
* libguile/strports.c (st_resize_port, st_seek, st_truncate): Likewise.
* doc/ref/api-io.texi (Port Implementation): Update prototype of
`scm_set_port_seek ()' and `scm_set_port_truncate ()'.
* NEWS: Update.
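The layout issue can be sketched as follows, assuming for illustration
that the library-side offset type is a fixed 64-bit integer. The names
sketch_off_t, struct sketch_port and sketch_to_off_t are hypothetical;
how `scm_t_off' is really defined is decided by gen-scmconfig.c.

```c
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical fixed-width offset type, standing in for `scm_t_off'.  */
typedef int64_t sketch_off_t;

/* The structure has the same size and layout whatever width the
   application's `off_t' has, because `off_t' does not appear in it.  */
struct sketch_port
{
  unsigned char *read_buf;
  sketch_off_t read_buf_size;
  sketch_off_t write_buf_size;
};

/* Convert at the boundary with system calls.  Return 0 on success, or
   -1 if VALUE does not fit in this translation unit's `off_t'.  */
int
sketch_to_off_t (sketch_off_t value, off_t *result)
{
  off_t narrowed = (off_t) value;

  if ((sketch_off_t) narrowed != value)
    return -1;

  *result = narrowed;
  return 0;
}
```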
This fixes bug #24009 reported by Martin Pitt.
* libguile/threads.c (guilify_self_1): Check the return value of
pipe(2).
(scm_std_select): Use `full_read ()' instead of `read ()' when reading
from WAKEUP_FD.
* libguile/async.c (scm_i_queue_async_cell): Use `full_write ()' instead
of write(2) when writing to SLEEP_FD.
* libguile/fports.c (fport_flush): Likewise.
* libguile/posix.c (getgroups): Use the return value of getgroups(2) as
NGROUPS.
(scm_nice): Get the return value of nice(2) to make glibc happy.
* libguile/scmsigs.c (take_signal): Use `full_write ()' instead of
write(2).
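For reference, a "full write" loop of the kind meant above might look
like this. It is a minimal sketch, not the `full_write ()' used by
libguile: the name sketch_full_write is hypothetical, and treating a
zero-byte write as ENOSPC is an assumption of this example.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write all LEN bytes of BUF to FD, retrying after short writes and
   after EINTR.  Return the number of bytes written, which is less than
   LEN only if an error occurred (errno is then set).  */
size_t
sketch_full_write (int fd, const void *buf, size_t len)
{
  const char *p = buf;
  size_t total = 0;

  while (total < len)
    {
      ssize_t n = write (fd, p + total, len - total);

      if (n < 0)
        {
          if (errno == EINTR)
            continue;           /* interrupted: try again */
          break;                /* real error: give up */
        }
      if (n == 0)
        {
          errno = ENOSPC;       /* no progress is possible */
          break;
        }
      total += (size_t) n;
    }

  return total;
}
```

A bare write(2) may return after transferring only part of the buffer,
so callers that ignore its return value can lose data without noticing.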
print.c, ports.c, mallocs.c, hooks.c, hashtab.c, fports.c,
guardians.c, filesys.c, coop-pthreads.c, continuations.c: Use
scm_uintprint to print unsigned integers, raw heap words, and
addresses, using a cast to scm_t_bits to turn pointers into
integers.
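The cast can be illustrated with standard C alone. In this sketch
`uintptr_t' stands in for scm_t_bits and snprintf stands in for
scm_uintprint; both substitutions, and the name sketch_format_address,
are assumptions made for the example.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for `scm_t_bits': an unsigned integer wide
   enough to hold a pointer.  */
typedef uintptr_t sketch_bits_t;

/* Format the address PTR in hexadecimal into BUF (of size LEN) by first
   turning the pointer into an unsigned integer.  Return the number of
   characters the full result needs, as snprintf does.  */
int
sketch_format_address (const void *ptr, char *buf, size_t len)
{
  sketch_bits_t bits = (sketch_bits_t) ptr;

  return snprintf (buf, len, "%" PRIxPTR, bits);
}
```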
scm_fdes_to_port, but take mode bits directly instead of as a C
string.
(scm_i_fdes_to_port): Implement using above.
(scm_open_file): Use scm_i_fdes_to_port together with
scm_i_mode_bits to avoid accessing internals of SCM string from C.
* vports.c (scm_make_soft_port): Use scm_i_fdes_to_port together
with scm_i_mode_bits to avoid accessing internals of SCM string
from C.
* ports.h (scm_i_mode_bits): New, same as scm_mode_bits but with a
SCM string as argument.
* ports.c (scm_i_void_port): New, like scm_void_port but take mode
bits directly instead of C string.
(scm_void_port): Implement using above.
(scm_sys_make_void_port): Use scm_i_void_port together with
scm_i_mode_bits to avoid accessing internals of SCM string.
* convert.i.c, backtrace.c, strop.c, strorder.c, strports.c,
struct.c, unif.c, ports.c: Use SCM_I_STRING_CHARS,
SCM_I_STRING_UCHARS, and SCM_I_STRING_LENGTH instead of
SCM_STRING_CHARS, SCM_STRING_UCHARS, and SCM_STRING_LENGTH,
respectively. Also, replaced scm_return_first with the more explicit
scm_remember_upto_here_1, etc, or introduced them in the first
place.
net_db.c, fports.c, filesys.c, eval.c, deprecation.c, dynl.c:
Replaced uses of SCM_STRING_CHARS with proper uses of
scm_to_locale_string. Replaced SCM_STRINGP with scm_is_string.
Replaced scm_mem2string with scm_from_locale_string.
* simpos.c, posix.c (allocate_string_pointers, environ_list_to_c):
Removed, replaced all uses with scm_i_allocate_string_pointers.
it's not thread safe.
(scm_syserror): Use scm_strerror rather than SCM_I_STRERROR, to take
advantage of this.
* fports.c (scm_open_file): Use scm_strerror likewise.
SCM_INUM): Deprecated by renaming them to SCM_I_INUMP, SCM_I_NINUMP
and SCM_I_INUM, respectively and adding deprecated versions to
deprecated.h and deprecated.c. Changed all uses to either use the
SCM_I_ variants or scm_is_*, scm_to_*, or scm_from_*, as appropriate.
SCM_VALIDATE_BIGINT, SCM_VALIDATE_INUM_MIN,
SCM_VALIDATE_INUM_MIN_COPY,
SCM_VALIDATE_INUM_MIN_DEF_COPY, SCM_VALIDATE_INUM_DEF,
SCM_VALIDATE_INUM_DEF_COPY, SCM_VALIDATE_INUM_RANGE,
SCM_VALIDATE_INUM_RANGE_COPY): Deprecated because they make the
fixnum/bignum distinction visible. Changed all uses to scm_to_size_t
or similar.
SCM_NEGATE_BOOL, SCM_BOOLP): Deprecated by moving into "deprecated.h".
Replaced all uses with scm_is_false, scm_is_true, scm_from_bool, and
scm_is_bool, respectively.
* configure.in: Removed -lm check and added a cached check for
__libc_stack_end to get it building for mingw32 hosts.
2003-05-29 Stefan Jahn <stefan@lkcc.org>
* win32-dirent.c: Use malloc() instead of scm_malloc().
* stime.c (s_scm_strftime): Add a type cast to avoid compiler
warning.
* posix.c (s_scm_putenv): Disable use of unsetenv() for the
mingw32 build.
* modules.c (s_scm_module_import_interface): Renamed local
variable interface to _interface. Seems like 'interface'
is a special compiler directive for the mingw32 compiler.
* mkstemp.c: Provide prototype to avoid compiler warning.
* load.c (s_scm_search_path): Fixed absolute and relative
path detection for native Windows platforms.
* gc.h, threads.h: Export some more symbols using SCM_API
(necessary to build on mingw32).
* gc-freelist.c ("s_scm_map_free_list",
"s_scm_gc_set_debug_check_freelist_x"): Fixed use of FUNC_NAME.
* fports.c (fport_fill_input): Disable use of
fport_wait_for_input() on Win32 platforms.
* filesys.c (s_scm_basename): Fixed __MINGW32__ code.
* Makefile.am: Modified some rules for cross compiling.
2003-05-29 Stefan Jahn <stefan@lkcc.org>
* raw-ltdl.c: Some more modifications for mingw32 platforms.
2003-05-29 Stefan Jahn <stefan@lkcc.org>
* Makefile.am (libguile_srfi_srfi_1_la_LDFLAGS,
libguile_srfi_srfi_4_la_LDFLAGS,
libguile_srfi_srfi_13_14__la_LDFLAGS): Added the -no-undefined
option for the mingw32 build.
2003-05-29 Stefan Jahn <stefan@lkcc.org>
* standalone/Makefile.am: Set up to build on mingw32.