1
Fork 0
mirror of https://git.savannah.gnu.org/git/guile.git synced 2025-04-30 03:40:34 +02:00

Fix some invalid unicode handling issues with suspendable ports.

Fixes <https://bugs.gnu.org/62290>.

Based on the implementation in ports.c.  I don't understand what this
code is really doing, but the suspendable ports implementation differs
from the similar C code for a couple of inequalities.

* module/ice-9/suspendable-ports.scm (decode-utf8, bad-utf8-len): Flip a
couple of inequalities.
* test-suite/tests/ports.test ("string ports"): Add additional invalid
UTF-8 test case.
* NEWS: Update.

Signed-off-by: Ludovic Courtès <ludo@gnu.org>
This commit is contained in:
Christopher Baines 2023-03-20 09:15:13 +00:00 committed by Ludovic Courtès
parent ffb95239aa
commit cba2e7e3fe
No known key found for this signature in database
GPG key ID: 090B11993D9AEBB5
3 changed files with 14 additions and 4 deletions

3
NEWS
View file

@ -23,6 +23,9 @@ the compiler reports it as "possibly unused".
* Bug fixes * Bug fixes
** (ice-9 suspendable-ports) incorrect UTF-8 decoding
(https://bugs.gnu.org/62290)
* Hashing of UTF-8 symbols with non-ASCII characters avoids corruption * Hashing of UTF-8 symbols with non-ASCII characters avoids corruption
This issue could cause `scm_from_utf8_symbol' and This issue could cause `scm_from_utf8_symbol' and

View file

@ -419,7 +419,7 @@
(= (logand u8_2 #xc0) #x80) (= (logand u8_2 #xc0) #x80)
(case u8_0 (case u8_0
((#xe0) (>= u8_1 #xa0)) ((#xe0) (>= u8_1 #xa0))
((#xed) (>= u8_1 #x9f)) ((#xed) (<= u8_1 #x9f))
(else #t))) (else #t)))
(kt (integer->char (kt (integer->char
(logior (ash (logand u8_0 #x0f) 12) (logior (ash (logand u8_0 #x0f) 12)
@ -436,7 +436,7 @@
(= (logand u8_3 #xc0) #x80) (= (logand u8_3 #xc0) #x80)
(case u8_0 (case u8_0
((#xf0) (>= u8_1 #x90)) ((#xf0) (>= u8_1 #x90))
((#xf4) (>= u8_1 #x8f)) ((#xf4) (<= u8_1 #x8f))
(else #t))) (else #t)))
(kt (integer->char (kt (integer->char
(logior (ash (logand u8_0 #x07) 18) (logior (ash (logand u8_0 #x07) 18)
@ -462,7 +462,7 @@
((< buffering 2) 1) ((< buffering 2) 1)
((not (= (logand (ref 1) #xc0) #x80)) 1) ((not (= (logand (ref 1) #xc0) #x80)) 1)
((and (eq? first-byte #xe0) (< (ref 1) #xa0)) 1) ((and (eq? first-byte #xe0) (< (ref 1) #xa0)) 1)
((and (eq? first-byte #xed) (< (ref 1) #x9f)) 1) ((and (eq? first-byte #xed) (> (ref 1) #x9f)) 1)
((< buffering 3) 2) ((< buffering 3) 2)
((not (= (logand (ref 2) #xc0) #x80)) 2) ((not (= (logand (ref 2) #xc0) #x80)) 2)
(else 0))) (else 0)))
@ -471,7 +471,7 @@
((< buffering 2) 1) ((< buffering 2) 1)
((not (= (logand (ref 1) #xc0) #x80)) 1) ((not (= (logand (ref 1) #xc0) #x80)) 1)
((and (eq? first-byte #xf0) (< (ref 1) #x90)) 1) ((and (eq? first-byte #xf0) (< (ref 1) #x90)) 1)
((and (eq? first-byte #xf4) (< (ref 1) #x8f)) 1) ((and (eq? first-byte #xf4) (> (ref 1) #x8f)) 1)
((< buffering 3) 2) ((< buffering 3) 2)
((not (= (logand (ref 2) #xc0) #x80)) 2) ((not (= (logand (ref 2) #xc0) #x80)) 2)
((< buffering 4) 3) ((< buffering 4) 3)

View file

@ -1059,6 +1059,13 @@
eof)) eof))
(test-decoding-error (#xf0 #x88 #x88 #x88) "UTF-8" (test-decoding-error (#xf0 #x88 #x88 #x88) "UTF-8"
(error ;; 2nd byte should be in the 90..BF range
error ;; 88: not a valid starting byte
error ;; 88: not a valid starting byte
error ;; 88: not a valid starting byte
eof))
(test-decoding-error (#xf4 #xa4 #xbd #xa4) "UTF-8"
(error ;; 2nd byte should be in the 90..BF range (error ;; 2nd byte should be in the 90..BF range
error ;; 88: not a valid starting byte error ;; 88: not a valid starting byte
error ;; 88: not a valid starting byte error ;; 88: not a valid starting byte