mirror of https://git.savannah.gnu.org/git/guile.git synced 2025-04-30 03:40:34 +02:00

12 commits

Author SHA1 Message Date
Andy Wingo
e30ee90478 Revert "Handle CRLF and Unicode line endings in read-line"
This reverts commit 0f983e3db0.

After discussing with Mike we are going to punt the read-line changes
for now.  Open the port in O_TEXT mode if you want to chomp the CR in
CRLF sequences.
2021-03-12 22:08:16 +01:00
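With the revert in place, a line read from a CRLF-terminated source keeps its trailing CR unless the port is opened in text mode. Where changing the port mode is not an option, the CR can be stripped by hand after each line is read. The following is a minimal C sketch of that idea, not code from Guile; `chomp_cr` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of libguile): remove one trailing '\r'
   from LINE, if present.  Returns the resulting length.  */
size_t
chomp_cr (char *line)
{
  assert (line != NULL);

  size_t len = strlen (line);
  if (len > 0 && line[len - 1] == '\r')
    line[--len] = '\0';
  return len;
}
```

Calling `chomp_cr` on a buffer holding `"header\r"` leaves `"header"` in place; a line without a trailing CR is left unchanged.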
Mike Gran
0f983e3db0 Handle CRLF and Unicode line endings in read-line
* libguile/rdelim.c (scm_read_line): handle CRLF, LS and PS
* module/ice-9/suspendable-ports.scm (read-line): handle CRLF, LS, and PS
* module/web/http.scm (read-header-line): take advantage of CRLF in read-line
   (read-header): don't need to test for \return
* test-suite/tests/rdelim.test: new tests for read-line CRLF, LS and PS
* doc/ref/api-io.texi: update doc for read-line
2021-03-11 19:42:33 -08:00
Andy Wingo
1e058add7b U+FFFD is the input substitution character
* libguile/ports.c (UNICODE_REPLACEMENT_CHARACTER):
* libguile/ports.c (peek_utf8_codepoint)
  (scm_port_decode_char, peek_iconv_codepoint):
* module/ice-9/sports.scm (peek-char-and-len/utf8):
  (peek-char-and-len/iconv): Return U+FFFD when we get a decoding error
  when reading, instead of '?', in accordance with Unicode
  recommendations.
* test-suite/tests/iconv.test:
* test-suite/tests/ports.test:
* test-suite/tests/rdelim.test: Update tests.
* NEWS: Update.
2016-05-16 10:48:35 +02:00
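To illustrate the substitution behavior this commit describes, here is a self-contained C sketch of a UTF-8 decoder that yields U+FFFD for malformed input. It is not the code in libguile/ports.c: the name `decode_utf8`, its signature, and the number of bytes it reports as consumed on error are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT_CHARACTER 0xFFFD

/* Decode one code point from BUF[0..LEN).  Store the number of bytes
   examined in *USED.  Return U+FFFD for malformed, truncated, overlong,
   surrogate, or out-of-range input.  Illustrative only.  */
uint32_t
decode_utf8 (const unsigned char *buf, size_t len, size_t *used)
{
  assert (buf != NULL && used != NULL);

  if (len == 0)
    {
      *used = 0;
      return REPLACEMENT_CHARACTER;
    }

  unsigned char b0 = buf[0];
  size_t need;        /* number of continuation bytes expected */
  uint32_t cp, min;   /* accumulated code point; smallest legal value */

  if (b0 < 0x80)
    {
      *used = 1;
      return b0;
    }
  else if ((b0 & 0xE0) == 0xC0)
    { need = 1; cp = b0 & 0x1F; min = 0x80; }
  else if ((b0 & 0xF0) == 0xE0)
    { need = 2; cp = b0 & 0x0F; min = 0x800; }
  else if ((b0 & 0xF8) == 0xF0)
    { need = 3; cp = b0 & 0x07; min = 0x10000; }
  else
    {
      /* Stray continuation byte or invalid lead byte.  */
      *used = 1;
      return REPLACEMENT_CHARACTER;
    }

  for (size_t i = 1; i <= need; i++)
    {
      if (i >= len || (buf[i] & 0xC0) != 0x80)
        {
          *used = i;
          return REPLACEMENT_CHARACTER;
        }
      cp = (cp << 6) | (buf[i] & 0x3F);
    }

  *used = need + 1;
  if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
    return REPLACEMENT_CHARACTER;
  return cp;
}
```

A caller that previously printed `?` for undecodable bytes would, under this scheme, see U+FFFD instead, which keeps a genuine question mark in the input distinguishable from a decoding failure.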
Andy Wingo
1953d29038 Decoding errors do not advance read pointer
* libguile/ports.c (scm_getc): If the port conversion strategy is
  'error, signal an error before advancing the read pointer.  This is a
  change from previous behavior; before, we advanced the read pointer
  under an understanding that that was what R6RS required.  But, that
  seems to be not the case.
* test-suite/tests/ports.test ("string ports"): Update decoding-error
  tests to assume that read-char with an error doesn't advance the read
  pointer.
* test-suite/tests/rdelim.test ("read-line"): Likewise.
2016-05-10 11:36:28 +02:00
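The behavior change in this commit, signalling the error before the read pointer moves, can be sketched in plain C as follows. This is not the libguile implementation: `struct reader`, `read_ascii_char`, and the error codes are hypothetical, and the decoder is reduced to ASCII so that any byte of 0x80 or above counts as a decoding error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reader state: a byte buffer and a read position.  */
struct reader
{
  const unsigned char *buf;
  size_t len;
  size_t pos;
};

enum { READ_EOF = -1, READ_DECODING_ERROR = -2 };

/* Return the next ASCII character and advance, or report an error
   WITHOUT advancing, so that the offending byte is still unread.  */
int
read_ascii_char (struct reader *r)
{
  assert (r != NULL);

  if (r->pos >= r->len)
    return READ_EOF;

  unsigned char b = r->buf[r->pos];
  if (b >= 0x80)
    return READ_DECODING_ERROR;   /* r->pos is left unchanged */

  r->pos++;
  return b;
}
```

Because the position is untouched on error, a second call after a failure reports the same error at the same offset; the caller has to decide explicitly how to skip or recover.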
Mark H Weaver
856d318a9f Merge branch 'stable-2.0'
Conflicts:
	benchmark-suite/benchmarks/ports.bm
	libguile/async.h
	libguile/bytevectors.c
	libguile/foreign.c
	libguile/gsubr.c
	libguile/srfi-1.c
	libguile/vm-engine.h
	libguile/vm-i-scheme.c
	module/Makefile.am
	module/language/tree-il/analyze.scm
	module/language/tree-il/peval.scm
	module/scripts/compile.scm
	module/scripts/disassemble.scm
	test-suite/tests/asm-to-bytecode.test
	test-suite/tests/peval.test
	test-suite/tests/rdelim.test
2014-09-30 03:50:47 -04:00
Ludovic Courtès
a41b07a34f rdelim: Speed up 'read-string' (aka. 'get-string-all'.)
This yields a 20% improvement on the "read-string" benchmark.

* module/ice-9/rdelim.scm (read-string): Rewrite as a 'case-lambda',
  with a tight loop around 'read-char', and without using
  'read-string!'.
* test-suite/tests/rdelim.test ("read-string")["longer than 100 chars,
  with limit"]: New test.
* benchmark-suite/benchmarks/ports.bm ("rdelim")["read-string"]: New
  benchmark.
2014-05-28 23:00:20 +02:00
Mark H Weaver
6dce942c46 String ports use UTF-8; ignore %default-port-encoding.
* libguile/strports.c (scm_mkstrport): Use UTF-8; ignore
  %default-port-encoding.  Rename 'str_len' and 'c_pos' to
  'num_bytes' and 'c_byte_pos'.  Interpret 'pos' argument
  as a character index instead of a byte index.

* module/ice-9/boot-9.scm (%cond-expand-features): Add srfi-6 to the
  list of core features.

* module/srfi/srfi-6.scm (open-input-string, open-output-string): Simply
  re-export these, since the core versions are now compliant.

* doc/ref/api-io.texi (String Ports): Remove text that describes
  non-compliant behavior of string ports with regard to encoding.

* doc/ref/srfi-modules.texi (SRFI-0): Add srfi-6 to the list of
  core features.
  (SRFI-6): Remove text that mentions non-compliant behavior of
  core string ports.

* module/ice-9/format.scm (format):
* module/ice-9/pretty-print.scm (truncated-print):
* module/rnrs/io/ports.scm (open-string-input-port,
  open-string-output-port):
* test-suite/test-suite/lib.scm (format-test-name):
* test-suite/tests/chars.test ("combining accent is pretty-printed",
  "combining X is pretty-printed"):
* test-suite/tests/ecmascript.test (eread, eread/1):
* test-suite/tests/rdelim.test:
* test-suite/tests/reader.test (read-string):
* test-suite/tests/regexp.test:
* test-suite/tests/srfi-105.test (read-string): Don't set
  %default-port-encoding before creating string ports.

* benchmark-suite/benchmarks/ports.bm (%latin1-port): Use
  'set-port-encoding!' to set the string port encoding.
  (%utf8/ascii-port, %utf8/wide-port, "rdelim"): Don't set
  %default-port-encoding before creating string ports.

* test-suite/tests/r6rs-ports.test ("lookahead-u8 non-ASCII"): Don't set
  %default-port-encoding before creating string ports.
  ("put-bytevector with UTF-16 string port", "put-bytevector with
  wrong-encoding string port"): Use 'set-port-encoding!' to set the
  string port encoding.

* test-suite/tests/print.test (tprint): Use 'set-port-encoding!' to set
  the string port encoding.
  ("truncated-print"): Use 'pass-if-equal'.

* test-suite/tests/ports.test ("encoding failure leads to exception",
  "%default-port-encoding is honored", "peek-char [latin-1]", "peek-char
  [utf-8]", "peek-char [utf-16]"): Remove tests.
  ("%default-port-encoding is ignored", "peek-char"): Add tests.
  ("suitable encoding [latin-1]", "suitable encoding [latin-3]",
  "wrong encoding, error", "wrong encoding, substitute",
  "wrong encoding, escape"): Use 'set-port-encoding!' to set the
  string port encoding.
  ("%default-port-encoding, wrong encoding"): Rewrite to use
  a file port instead of a string port.
2013-08-07 01:22:22 -04:00
Andy Wingo
5a35d42aa5 add read-string and read-string! to (ice-9 rdelim)
* module/ice-9/rdelim.scm (read-string!, read-string): New functions.
* test-suite/tests/rdelim.test: Add tests.
* doc/ref/api-io.texi: Add docs.

* module/ice-9/iconv.scm:
* module/rnrs/io/ports.scm:
* module/web/uri.scm: Use the new functions.
2013-01-22 15:15:43 +01:00
Ludovic Courtès
96128014bf Make sure binary ports pass `binary-port?' regardless of the locale.
* libguile/r6rs-ports.c (make_bip, make_cbip, make_bop, make_cbop):
  Set `c_port->encoding' to NULL.

* test-suite/tests/r6rs-ports.test ("7.2.7 Input
  Ports")["bytevector-input-port is binary"]: New test.
  ("7.2.7 Input Ports")["make-custom-binary-input-port"]: Make sure PORT
  passes `binary-port?' and `input-port?'.
  ("8.2.10 Output ports")["bytevector-output-port is binary"]: New test.
  ["make-custom-binary-output"]: Rename to...
  ["make-custom-binary-output-port"]: ... this.

* test-suite/tests/ports.test ("string ports")["read-char, wrong
  encoding, error", "read-char, wrong encoding, escape", "read-char,
  wrong encoding, substitute", "peek-char, wrong encoding, error"]: Use
  `set-port-encoding!' instead of `%default-port-encoding' to set the
  encoding of bytevector input ports.

* test-suite/tests/rdelim.test ("read-line")["decoding error", "decoding
  error, substitute"]: Likewise.

* doc/ref/api-io.texi (R6RS Port Manipulation): Document `binary-port?'
  and `textual-port?'.

* doc/ref/r6rs.texi (R6RS Incompatibilities): Mention the soft
  distinction between textual and binary ports.
2011-04-22 23:58:00 +02:00
Ludovic Courtès
fe949e7bc6 Add `read-delimited' tests.
* test-suite/tests/rdelim.test ("read-delimited", "read-delimited!"):
  New test prefixes.
2011-02-10 23:04:38 +01:00
Ludovic Courtès
c62da8f891 Have `read-char' & co. throw to `decoding-error'.
* libguile/ports.c (scm_read_char): Mention `decoding-error' in the
  docstring.
  (get_codepoint): Change to return an error code; add `codepoint'
  output parameter.  Don't raise an error from here.
  (scm_getc): Raise an error with `scm_decoding_error' if
  `get_codepoint' returns an error.
  (scm_peek_char): Likewise.  Update docstring.

* libguile/strings.c (scm_decoding_error_key): New variable.
  (scm_decoding_error): New function.
  (scm_from_stringn): Use `scm_decoding_error' instead of
  `scm_encoding_error'.

* libguile/strings.h (scm_decoding_error): New declaration.

* test-suite/tests/ports.test ("string ports")["read-char, wrong
  encoding, error"]: Change to expect `decoding-error'.  Make sure PORT
  points past the error.
  ["read-char, wrong encoding, escape"]: Likewise.
  ["peek-char, wrong encoding, error"]: New test.

* test-suite/tests/r6rs-ports.test ("7.2.11 Binary
  Output")["put-bytevector with wrong-encoding string port"]: Change to
  expect `decoding-error'.
  ("8.2.6  Input and output ports")["transcoded-port [error handling
  mode = raise]"]: Likewise.

* test-suite/tests/rdelim.test ("read-line")["decoding error", "decoding
  error, substitute"]: New tests.

* doc/ref/api-io.texi (Reading): Update documentation of `read-char' and
  `peek-char'.
  (Line/Delimited): Update documentation of `read-line'.
2011-02-02 18:06:28 +01:00
Ludovic Courtès
a2c36371ce Rewrite `read-line' in terms of `scm_getc'.
As a result `read-line' handles decoding and decoding errors the same
way as `scm_getc'.  It's also simpler and free of `malloc' calls.

* libguile/rdelim.c (scm_do_read_line): Remove.
  (scm_read_line): Rewrite as a loop that calls `scm_getc'.

* test-suite/tests/rdelim.test: New file.
* test-suite/Makefile.am (SCM_TESTS): Add `tests/rdelim.test'.
2011-01-26 00:29:51 +01:00