mirror of
https://git.savannah.gnu.org/git/guile.git
synced 2025-06-24 12:20:20 +02:00
Improve handling of Unicode byte-order marks (BOMs).

* libguile/ports-internal.h (struct scm_port_internal): Add new members
  'at_stream_start_for_bom_read' and 'at_stream_start_for_bom_write'.
  (SCM_UNICODE_BOM): New macro.
  (scm_i_port_iconv_descriptors): Add 'mode' parameter to prototype.

* libguile/ports.c (scm_new_port_table_entry): Initialize
  'at_stream_start_for_bom_read' and 'at_stream_start_for_bom_write'.
  (get_iconv_codepoint): Pass new 'mode' parameter to
  'scm_i_port_iconv_descriptors'.
  (get_codepoint): After reading a codepoint at stream start, record
  that we're no longer at stream start, and consume a BOM where
  appropriate.
  (scm_seek): Set the stream start flags according to the new position.
  (looking_at_bytes): New static function.
  (scm_utf8_bom, scm_utf16be_bom, scm_utf16le_bom, scm_utf32be_bom,
  scm_utf32le_bom): New static const arrays.
  (decide_utf16_encoding, decide_utf32_encoding): New static functions.
  (scm_i_port_iconv_descriptors): Add new 'mode' parameter.  If the
  specified encoding is UTF-16 or UTF-32, make that precise by deciding
  what byte order to use, and construct iconv descriptors based on the
  precise encoding.
  (scm_i_set_port_encoding_x): Record that we are now at stream start.
  Do not open the new iconv descriptors immediately; let them be
  initialized lazily.

* libguile/print.c (display_string_using_iconv): Record that we're no
  longer at stream start.  Write a BOM if appropriate.

* doc/ref/api-io.texi (BOM Handling): New node.

* test-suite/tests/ports.test ("set-port-encoding!, wrong encoding"):
  Adapt test to cope with the fact that 'set-port-encoding!' does not
  immediately open the iconv descriptors.
  (bv-read-test): New procedure.
  ("unicode byte-order marks (BOMs)"): New test prefix.
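The read-side change described above (choosing a precise byte order when a port is declared as plain UTF-16) can be illustrated with a small standalone sketch. This is not the code from libguile/ports.c: the names `starts_with` and `sketch_decide_utf16` are hypothetical, and the big-endian fallback when no BOM is present is an assumption of this sketch rather than something the commit message states.

```c
#include <stddef.h>
#include <string.h>

/* The two possible UTF-16 encodings of U+FEFF.  */
static const unsigned char utf16be_bom[2] = { 0xFE, 0xFF };
static const unsigned char utf16le_bom[2] = { 0xFF, 0xFE };

/* Return 1 if the LEN bytes at BUF begin with the PREFIX_LEN bytes at
   PREFIX, and 0 otherwise.  Hypothetical helper for this sketch.  */
static int
starts_with (const unsigned char *buf, size_t len,
             const unsigned char *prefix, size_t prefix_len)
{
  return len >= prefix_len && memcmp (buf, prefix, prefix_len) == 0;
}

/* Given the first LEN bytes of a stream declared as "UTF-16", return a
   precise encoding name.  Store in *BOM_LEN the number of leading bytes
   that form a BOM and should be skipped (0 if there is none).  */
const char *
sketch_decide_utf16 (const unsigned char *buf, size_t len, size_t *bom_len)
{
  *bom_len = 2;
  if (starts_with (buf, len, utf16le_bom, sizeof utf16le_bom))
    return "UTF-16LE";
  if (starts_with (buf, len, utf16be_bom, sizeof utf16be_bom))
    return "UTF-16BE";

  /* No BOM found.  Assumption of this sketch: default to big-endian.  */
  *bom_len = 0;
  return "UTF-16BE";
}
```

A caller would pass the bytes buffered at stream start, open its converter with the returned name, and skip `*bom_len` bytes before decoding. A UTF-32 variant would follow the same pattern with four-byte BOMs.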
parent 45c0878b86
commit cdd3d6c9f4
5 changed files with 515 additions and 32 deletions
--- a/libguile/print.c
+++ b/libguile/print.c
@@ -881,8 +881,24 @@ display_string_using_iconv (const void *str, int narrow_p, size_t len,
 {
   size_t printed;
   scm_t_iconv_descriptors *id;
+  scm_t_port_internal *pti = SCM_PORT_GET_INTERNAL (port);
 
-  id = scm_i_port_iconv_descriptors (port);
+  id = scm_i_port_iconv_descriptors (port, SCM_PORT_WRITE);
+
+  if (SCM_UNLIKELY (pti->at_stream_start_for_bom_write && len > 0))
+    {
+      scm_t_port *pt = SCM_PTAB_ENTRY (port);
+
+      /* Record that we're no longer at stream start. */
+      pti->at_stream_start_for_bom_write = 0;
+      if (pt->rw_random)
+        pti->at_stream_start_for_bom_read = 0;
+
+      /* Write a BOM if appropriate. */
+      if (SCM_UNLIKELY (strcasecmp(pt->encoding, "UTF-16") == 0
+                        || strcasecmp(pt->encoding, "UTF-32") == 0))
+        display_character (SCM_UNICODE_BOM, port, iconveh_error);
+    }
 
   printed = 0;
 
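The write-side rule in the hunk above can be restated as a standalone sketch: a BOM is emitted only before the first non-empty write, and only when the encoding name leaves the byte order unspecified. The struct `sketch_stream` and the function `sketch_should_write_bom` are hypothetical names introduced for illustration; they are not part of libguile, and the sketch leaves out the `rw_random` handling that also clears the read-side flag.

```c
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Hypothetical stand-in for the port state used by the hunk above.  */
struct sketch_stream
{
  const char *encoding;      /* encoding name as given by the user */
  int at_start_for_write;    /* nonzero until the first non-empty write */
};

/* Return 1 if a BOM should be written before a write of LEN units to S,
   and 0 otherwise.  The first non-empty write clears the stream-start
   flag whether or not a BOM is called for.  */
int
sketch_should_write_bom (struct sketch_stream *s, size_t len)
{
  if (!s->at_start_for_write || len == 0)
    return 0;

  /* Record that we are no longer at stream start.  */
  s->at_start_for_write = 0;

  /* Only the unqualified names get a BOM; "UTF-16LE", "UTF-8" and
     similar names already fix the byte order or do not need one.  */
  return strcasecmp (s->encoding, "UTF-16") == 0
         || strcasecmp (s->encoding, "UTF-32") == 0;
}
```

Empty writes leave the flag set, so a zero-length write at stream start does not use up the one opportunity to emit the BOM.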