mirror of https://git.savannah.gnu.org/git/guile.git
synced 2025-06-10 14:00:21 +02:00
Change iconv procedures to take optional instead of keyword arg
* module/ice-9/iconv.scm (call-with-encoded-output-string):
  (string->bytevector, bytevector->string): Take an optional instead of
  a keyword argument.

* doc/ref/api-data.texi (Representing Strings as Bytes): Adapt docs to
  change, and fix a number of errors. Thanks to Ludovic Courtès for the
  pointers.

* test-suite/tests/iconv.test ("wide non-ascii string"): Add a test for
  the 'substitute path.
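For callers, the practical effect of this change is that the conversion strategy moves from a keyword argument to a third positional argument. The following is a minimal sketch of the new calling convention, assuming a Guile build that includes this commit. The sample string and the variable names are illustrative and not part of the commit.

```scheme
(use-modules (ice-9 iconv)
             (rnrs bytevectors))

;; Illustrative input: a single non-ASCII character (Greek small lambda).
(define sample (string #\x3bb))

;; For "utf-8" the diff shows a direct call to string->utf8,
;; so the two results should be the same bytevector.
(define utf8-bytes (string->bytevector sample "utf-8"))
(define same-as-string->utf8?
  (equal? utf8-bytes (string->utf8 sample)))

;; New style: the strategy is the optional third argument.
;; Old style, before this commit:
;;   (string->bytevector sample "ascii" #:conversion-strategy 'substitute)
(define ascii-bytes (string->bytevector sample "ascii" 'substitute))

;; Decoding takes the same optional argument; 'error remains the default.
(define round-trip (bytevector->string utf8-bytes "utf-8"))
```

Code that still passes `#:conversion-strategy` will need updating, since the procedures no longer declare that keyword.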
parent 990b11c53f
commit 5ed4ea90a9
3 changed files with 29 additions and 15 deletions
--- a/doc/ref/api-data.texi
+++ b/doc/ref/api-data.texi
@@ -4190,6 +4190,11 @@ sequences of bytes. @xref{Bytevectors}, for more on how Guile
 represents raw byte sequences. This module gets its name from the
 common @sc{unix} command of the same name.

+Note that often it is sufficient to just read and write strings from
+ports instead of using these functions. To do this, specify the port
+encoding using @code{set-port-encoding!}. @xref{Ports}, for more on
+ports and character encodings.
+
 Unlike the rest of the procedures in this section, you have to load the
 @code{iconv} module before having access to these procedures:

@@ -4197,31 +4202,32 @@ Unlike the rest of the procedures in this section, you have to load the
 (use-modules (ice-9 iconv))
 @end example

-@deffn string->bytevector string encoding [#:conversion-strategy='error]
+@deffn string->bytevector string encoding [conversion-strategy]
 Encode @var{string} as a sequence of bytes.

 The string will be encoded in the character set specified by the
 @var{encoding} string. If the string has characters that cannot be
 represented in the encoding, by default this procedure raises an
-@code{encoding-error}, though the @code{#:conversion-strategy} keyword
-can specify other behaviors.
+@code{encoding-error}. Pass a @var{conversion-strategy} argument to
+specify other behaviors.

 The return value is a bytevector. @xref{Bytevectors}, for more on
 bytevectors. @xref{Ports}, for more on character encodings and
 conversion strategies.
 @end deffn

-@deffn bytevector->string bytevector encoding
+@deffn bytevector->string bytevector encoding [conversion-strategy]
 Decode @var{bytevector} into a string.

 The bytes will be decoded from the character set by the @var{encoding}
 string. If the bytes do not form a valid encoding, by default this
-procedure raises an @code{decoding-error}, though that may be overridden
-with the @code{#:conversion-strategy} keyword. @xref{Ports}, for more
-on character encodings and conversion strategies.
+procedure raises an @code{decoding-error}. As with
+@code{string->bytevector}, pass the optional @var{conversion-strategy}
+argument to modify this behavior. @xref{Ports}, for more on character
+encodings and conversion strategies.
 @end deffn

-@deffn call-with-output-encoded-string encoding proc [#:conversion-strategy='error]
+@deffn call-with-output-encoded-string encoding proc [conversion-strategy]
 Like @code{call-with-output-string}, but instead of returning a string,
 returns a encoding of the string according to @var{encoding}, as a
 bytevector. This procedure can be more efficient than collecting a
@@ -4371,7 +4377,7 @@ If @var{lenp} is @code{NULL}, this function will return a null-terminated C
 string. It will throw an error if the string contains a null
 character.

-The Scheme interface to this function is @code{encode-string}, from the
+The Scheme interface to this function is @code{string->bytevector}, from the
 @code{ice-9 iconv} module. @xref{Representing Strings as Bytes}.
 @end deftypefn

@@ -4382,7 +4388,7 @@ string is passed as the ASCII, null-terminated C string @code{encoding}.
 The @var{handler} parameters suggests a strategy for dealing with
 unconvertable characters.

-The Scheme interface to this function is @code{decode-string}.
+The Scheme interface to this function is @code{bytevector->string}.
 @xref{Representing Strings as Bytes}.
 @end deftypefn

--- a/module/ice-9/iconv.scm
+++ b/module/ice-9/iconv.scm
@@ -43,7 +43,8 @@
           bv))))

 (define* (call-with-encoded-output-string encoding proc
-                                          #:key (conversion-strategy 'error))
+                                          #:optional
+                                          (conversion-strategy 'error))
   (if (string-ci=? encoding "utf-8")
       ;; I don't know why, but this appears to be faster; at least for
       ;; serving examples/debug-sxml.scm (1464 reqs/s versus 850
@@ -59,16 +60,18 @@
 ;; TODO: Provide C implementations that call scm_from_stringn and
 ;; friends?

-(define* (string->bytevector str encoding #:key (conversion-strategy 'error))
+(define* (string->bytevector str encoding
+                             #:optional (conversion-strategy 'error))
   (if (string-ci=? encoding "utf-8")
       (string->utf8 str)
       (call-with-encoded-output-string
        encoding
        (lambda (port)
          (display str port))
-       #:conversion-strategy conversion-strategy)))
+       conversion-strategy)))

-(define* (bytevector->string bv encoding #:key (conversion-strategy 'error))
+(define* (bytevector->string bv encoding
+                             #:optional (conversion-strategy 'error))
   (if (string-ci=? encoding "utf-8")
       (utf8->string bv)
       (let ((p (open-bytevector-input-port bv)))
--- a/test-suite/tests/iconv.test
+++ b/test-suite/tests/iconv.test
@@ -112,4 +112,9 @@
       (string->bytevector s "ascii"))

     (pass-if-exception "encode as latin1" exception:encoding-error
-      (string->bytevector s "latin1"))))
+      (string->bytevector s "latin1"))
+
+    (pass-if "encode as ascii with substitutions"
+      (equal? (make-string (string-length s) #\?)
+              (bytevector->string (string->bytevector s "ascii" 'substitute)
+                                  "ascii")))))
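The same positional convention applies to `call-with-encoded-output-string`, which is the name the procedure is defined under in module/ice-9/iconv.scm (the texinfo hunk spells it `call-with-output-encoded-string`). This is a hedged sketch, assuming this commit is applied. The text written to the port is made up for illustration, and the substitution behaviour is inferred from the test added in this commit.

```scheme
(use-modules (ice-9 iconv))

;; Collect everything PROC writes to PORT and return it as a bytevector
;; in the requested encoding.  The optional third argument replaces the
;; old #:conversion-strategy keyword.
(define encoded
  (call-with-encoded-output-string
   "ascii"
   (lambda (port)
     (display "abc" port)
     ;; Greek small lambda: not representable in ASCII.
     (display (string #\x3bb) port))
   'substitute))

;; Decode again so the result can be inspected as a string.
(define decoded (bytevector->string encoded "ascii"))
```

With the default `'error` strategy the same call would be expected to raise an encoding error instead, as the existing "encode as ascii" tests exercise for `string->bytevector`.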