mirror of https://git.savannah.gnu.org/git/guile.git synced 2025-05-20 11:40:18 +02:00

Fix uri-decode behavior for "+"

* module/web/uri.scm (uri-decode): Add #:decode-plus-to-space? keyword
  argument.
  (split-and-decode-uri-path): Don't decode plus to space.
* doc/ref/web.texi (URIs): Update documentation.
* test-suite/tests/web-uri.test ("decode"): Add tests.
* NEWS: Add entry.

Based on a patch by Brent <brent@tomski.co.za>.
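
A minimal sketch of the behavior described above, assuming a Guile build that includes this change. The expected values in the comments are taken from the tests and docstring touched by this commit, not from a separate run.

```scheme
(use-modules (web uri))

;; Default is unchanged: "+" decodes to a space, as needed for
;; application/x-www-form-urlencoded data.
(uri-decode "foo+bar")                             ; => "foo bar"

;; With the new keyword set to #f, "+" is left as-is.
(uri-decode "foo+bar" #:decode-plus-to-space? #f)  ; => "foo+bar"

;; Path splitting no longer turns "+" into a space ...
(split-and-decode-uri-path "foo+bar")              ; => ("foo+bar")

;; ... but percent-escapes in path components are still decoded.
(split-and-decode-uri-path "/foo/bar%20baz/")      ; => ("foo" "bar baz")
```

Callers that parse query strings or form bodies can keep relying on the default; only path handling changes.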
Andy Wingo 2016-06-20 14:34:19 +02:00
parent 4cf81b7ba0
commit 687d393e2c
4 changed files with 25 additions and 5 deletions

NEWS

@@ -6,6 +6,13 @@ Please send Guile bug reports to bug-guile@gnu.org.
+Changes in 2.1.4 (changes since the 2.1.3 alpha release):
+* Bug fixes
+** Don't replace + with space when splitting and decoding URI paths
+[TODO: Fold into generic 2.2 release notes.]
 Changes in 2.1.3 (changes since the 2.1.2 alpha release):
 * Notable changes

doc/ref/web.texi

@@ -269,7 +269,7 @@ serialization.
 Declare a default port for the given URI scheme.
 @end deffn
-@deffn {Scheme Procedure} uri-decode str [#:encoding=@code{"utf-8"}]
+@deffn {Scheme Procedure} uri-decode str [#:encoding=@code{"utf-8"}] [#:decode-plus-to-space? #t]
 Percent-decode the given @var{str}, according to @var{encoding}, which
 should be the name of a character encoding.
@@ -286,6 +286,11 @@ decoded bytes are not valid for the given encoding. Pass @code{#f} for
 @xref{Ports, @code{set-port-encoding!}}, for more information on
 character encodings.
+If @var{decode-plus-to-space?} is true, which is the default, also
+replace instances of the plus character @samp{+} with a space character.
+This is needed when parsing @code{application/x-www-form-urlencoded}
+data.
 Returns a string of the decoded characters, or a bytevector if
 @var{encoding} was @code{#f}.
 @end deffn

module/web/uri.scm

@@ -322,7 +322,7 @@ serialization."
 (define hex-chars
 (string->char-set "0123456789abcdefABCDEF"))
-(define* (uri-decode str #:key (encoding "utf-8"))
+(define* (uri-decode str #:key (encoding "utf-8") (decode-plus-to-space? #t))
 "Percent-decode the given STR, according to ENCODING,
 which should be the name of a character encoding.
@@ -338,6 +338,10 @@ bytes are not valid for the given encoding. Pass #f for ENCODING if
 you want decoded bytes as a bytevector directly. set-port-encoding!,
 for more information on character encodings.
+If DECODE-PLUS-TO-SPACE? is true, which is the default, also replace
+instances of the plus character (+) with a space character. This is
+needed when parsing application/x-www-form-urlencoded data.
 Returns a string of the decoded characters, or a bytevector if
 ENCODING was #f."
 (let* ((len (string-length str))
@@ -348,7 +352,7 @@ ENCODING was #f."
 (if (< i len)
 (let ((ch (string-ref str i)))
 (cond
-((eqv? ch #\+)
+((and (eqv? ch #\+) decode-plus-to-space?)
 (put-u8 port (char->integer #\space))
 (lp (1+ i)))
 ((and (< (+ i 2) len) (eqv? ch #\%)
@@ -431,7 +435,8 @@ removing empty components.
 For example, \"/foo/bar%20baz/\" decodes to the two-element list,
 (\"foo\" \"bar baz\")."
 (filter (lambda (x) (not (string-null? x)))
-(map uri-decode (string-split path #\/))))
+(map (lambda (s) (uri-decode s #:decode-plus-to-space? #f))
+(string-split path #\/))))
 (define (encode-and-join-uri-path parts)
 "URI-encode each element of PARTS, which should be a list of

test-suite/tests/web-uri.test

@@ -594,7 +594,10 @@
 (equal? "foo bar" (uri-decode "foo%20bar")))
 (pass-if "foo+bar"
-(equal? "foo bar" (uri-decode "foo+bar"))))
+(equal? "foo bar" (uri-decode "foo+bar")))
+(pass-if "foo+bar"
+(equal? '("foo+bar") (split-and-decode-uri-path "foo+bar"))))
 (with-test-prefix "encode"
 (pass-if (equal? "foo%20bar" (uri-encode "foo bar")))