mirror of
https://git.savannah.gnu.org/git/guile.git
synced 2025-06-11 14:21:10 +02:00
Change iconv procedures to take optional instead of keyword arg
* module/ice-9/iconv.scm (call-with-encoded-output-string):
  (string->bytevector, bytevector->string): Take an optional instead of
  a keyword argument.

* doc/ref/api-data.texi (Representing Strings as Bytes): Adapt docs to
  change, and fix a number of errors. Thanks to Ludovic Courtès for the
  pointers.

* test-suite/tests/iconv.test ("wide non-ascii string"): Add a test for
  the 'substitute path.
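For callers, the practical effect is that the conversion strategy moves from a keyword to a third positional argument. A minimal sketch of a call site after this change, assuming a Guile build that includes this commit; the sample string `s` is hypothetical and not taken from the test suite:

```scheme
(use-modules (ice-9 iconv))

;; Hypothetical sample input: three Greek letters, none representable in ASCII.
(define s "αβγ")

;; Before this commit the strategy was a keyword argument:
;;   (string->bytevector s "ascii" #:conversion-strategy 'substitute)
;; After it, the strategy is an optional third positional argument.
(define encoded (string->bytevector s "ascii" 'substitute))

;; Decoding accepts the same optional argument; it is omitted here, so the
;; default 'error strategy applies.
(define round-tripped (bytevector->string encoded "ascii"))

;; With no strategy given, encoding also defaults to 'error and throws
;; encoding-error for characters the target encoding cannot represent.
(define default-result
  (catch 'encoding-error
    (lambda () (string->bytevector s "ascii"))
    (lambda (key . args) 'encoding-error)))
```

Going by the test added in this commit, `round-tripped` should consist of one `?` per input character, and `default-result` should be the symbol `encoding-error`.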
parent 990b11c53f
commit 5ed4ea90a9
3 changed files with 29 additions and 15 deletions
--- a/doc/ref/api-data.texi
+++ b/doc/ref/api-data.texi
@@ -4190,6 +4190,11 @@ sequences of bytes. @xref{Bytevectors}, for more on how Guile
 represents raw byte sequences. This module gets its name from the
 common @sc{unix} command of the same name.
 
+Note that often it is sufficient to just read and write strings from
+ports instead of using these functions. To do this, specify the port
+encoding using @code{set-port-encoding!}. @xref{Ports}, for more on
+ports and character encodings.
+
 Unlike the rest of the procedures in this section, you have to load the
 @code{iconv} module before having access to these procedures:
 
@@ -4197,31 +4202,32 @@ Unlike the rest of the procedures in this section, you have to load the
 (use-modules (ice-9 iconv))
 @end example
 
-@deffn string->bytevector string encoding [#:conversion-strategy='error]
+@deffn string->bytevector string encoding [conversion-strategy]
 Encode @var{string} as a sequence of bytes.
 
 The string will be encoded in the character set specified by the
 @var{encoding} string. If the string has characters that cannot be
 represented in the encoding, by default this procedure raises an
-@code{encoding-error}, though the @code{#:conversion-strategy} keyword
-can specify other behaviors.
+@code{encoding-error}. Pass a @var{conversion-strategy} argument to
+specify other behaviors.
 
 The return value is a bytevector. @xref{Bytevectors}, for more on
 bytevectors. @xref{Ports}, for more on character encodings and
 conversion strategies.
 @end deffn
 
-@deffn bytevector->string bytevector encoding
+@deffn bytevector->string bytevector encoding [conversion-strategy]
 Decode @var{bytevector} into a string.
 
 The bytes will be decoded from the character set by the @var{encoding}
 string. If the bytes do not form a valid encoding, by default this
-procedure raises an @code{decoding-error}, though that may be overridden
-with the @code{#:conversion-strategy} keyword. @xref{Ports}, for more
-on character encodings and conversion strategies.
+procedure raises an @code{decoding-error}. As with
+@code{string->bytevector}, pass the optional @var{conversion-strategy}
+argument to modify this behavior. @xref{Ports}, for more on character
+encodings and conversion strategies.
 @end deffn
 
-@deffn call-with-output-encoded-string encoding proc [#:conversion-strategy='error]
+@deffn call-with-output-encoded-string encoding proc [conversion-strategy]
 Like @code{call-with-output-string}, but instead of returning a string,
 returns a encoding of the string according to @var{encoding}, as a
 bytevector. This procedure can be more efficient than collecting a
@@ -4371,7 +4377,7 @@ If @var{lenp} is @code{NULL}, this function will return a null-terminated C
 string. It will throw an error if the string contains a null
 character.
 
-The Scheme interface to this function is @code{encode-string}, from the
+The Scheme interface to this function is @code{string->bytevector}, from the
 @code{ice-9 iconv} module. @xref{Representing Strings as Bytes}.
 @end deftypefn
 
@@ -4382,7 +4388,7 @@ string is passed as the ASCII, null-terminated C string @code{encoding}.
 The @var{handler} parameters suggests a strategy for dealing with
 unconvertable characters.
 
-The Scheme interface to this function is @code{decode-string}.
+The Scheme interface to this function is @code{bytevector->string}.
 @xref{Representing Strings as Bytes}.
 @end deftypefn
 
--- a/module/ice-9/iconv.scm
+++ b/module/ice-9/iconv.scm
@@ -43,7 +43,8 @@
         bv))))
 
 (define* (call-with-encoded-output-string encoding proc
-                                          #:key (conversion-strategy 'error))
+                                          #:optional
+                                          (conversion-strategy 'error))
   (if (string-ci=? encoding "utf-8")
       ;; I don't know why, but this appears to be faster; at least for
       ;; serving examples/debug-sxml.scm (1464 reqs/s versus 850
@@ -59,16 +60,18 @@
 ;; TODO: Provide C implementations that call scm_from_stringn and
 ;; friends?
 
-(define* (string->bytevector str encoding #:key (conversion-strategy 'error))
+(define* (string->bytevector str encoding
+                             #:optional (conversion-strategy 'error))
   (if (string-ci=? encoding "utf-8")
       (string->utf8 str)
       (call-with-encoded-output-string
        encoding
        (lambda (port)
          (display str port))
-       #:conversion-strategy conversion-strategy)))
+       conversion-strategy)))
 
-(define* (bytevector->string bv encoding #:key (conversion-strategy 'error))
+(define* (bytevector->string bv encoding
+                             #:optional (conversion-strategy 'error))
   (if (string-ci=? encoding "utf-8")
       (utf8->string bv)
       (let ((p (open-bytevector-input-port bv)))
--- a/test-suite/tests/iconv.test
+++ b/test-suite/tests/iconv.test
@@ -112,4 +112,9 @@
       (string->bytevector s "ascii"))
 
     (pass-if-exception "encode as latin1" exception:encoding-error
-      (string->bytevector s "latin1"))))
+      (string->bytevector s "latin1"))
+
+    (pass-if "encode as ascii with substitutions"
+      (equal? (make-string (string-length s) #\?)
+              (bytevector->string (string->bytevector s "ascii" 'substitute)
+                                  "ascii")))))
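
The same positional form applies to `call-with-encoded-output-string`, which this commit changes but the new test does not call directly. A rough sketch, assuming the procedure is exported under the name used in `module/ice-9/iconv.scm` (the texinfo hunks spell it `call-with-output-encoded-string`); the text written to the port is a hypothetical example:

```scheme
(use-modules (ice-9 iconv)
             (rnrs bytevectors))

;; Collect everything written to PORT as ASCII-encoded bytes.  The third
;; positional argument selects the 'substitute strategy, so characters
;; that ASCII cannot represent are replaced instead of raising an error.
(define bv
  (call-with-encoded-output-string
   "ascii"
   (lambda (port)
     (display "λ-calculus" port))   ; hypothetical sample text
   'substitute))

;; The result is a bytevector rather than a string.
(define result-is-bytevector? (bytevector? bv))
```

Omitting the third argument keeps the default `'error` strategy, matching the `#:optional (conversion-strategy 'error)` clause in the definition.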