mirror of https://git.savannah.gnu.org/git/guile.git synced 2025-04-30 03:40:34 +02:00

scm_i_utf8_string_hash: compute u8 chars not bytes

Noticed while investigating a migration to utf-8 strings.  After making
changes that routed non-ascii symbol hashing through this function,
encoding-iso88597.test began failing intermittently: the hash traversed
trailing garbage because u8_strnlen reported 8 (the length in bytes)
where the character count, 4, was needed.
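
The difference between the two lengths can be sketched without
libunistring.  The helpers below are hypothetical stand-ins, not the
library functions: count_units mimics the kind of result u8_strnlen
gives (code units, i.e. bytes), and count_chars mimics u8_mbsnlen
(characters), assuming the input is valid UTF-8.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a unit count: the number of bytes before
   the first NUL, looking at no more than maxlen bytes.  */
static size_t
count_units (const uint8_t *s, size_t maxlen)
{
  size_t n = 0;
  while (n < maxlen && s[n] != 0)
    n++;
  return n;
}

/* Hypothetical stand-in for a character count: the number of
   characters encoded in the first len bytes.  Assumes valid UTF-8,
   so every byte that is not a continuation byte (10xxxxxx) starts a
   new character.  */
static size_t
count_chars (const uint8_t *s, size_t len)
{
  size_t n = 0;
  for (size_t i = 0; i < len; i++)
    if ((s[i] & 0xC0) != 0x80)
      n++;
  return n;
}
```

For the four Greek letters alpha, beta, gamma, delta, encoded as the
eight bytes "\xce\xb1\xce\xb2\xce\xb3\xce\xb4", count_units returns 8
and count_chars returns 4.  Code that decodes 8 characters from that
buffer reads past its end, which is the kind of overrun described
above.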

Change the scm_i_str2symbol and scm_i_str2uninterned_symbol internal
hash type to unsigned long to explicitly match the scm_i_string_hash
result type.

* libguile/hash.c (scm_i_utf8_string_hash): Call u8_mbsnlen not u8_strnlen.
* libguile/symbols.c (scm_i_str2symbol, scm_i_str2uninterned_symbol):
Use unsigned long for scm_i_string_hash result.
* test-suite/standalone/.gitignore: Add test-hashing.
* test-suite/standalone/Makefile.am: Add test-hashing.
* test-suite/standalone/test-hashing.c: Add.
Rob Browning 2023-03-12 14:26:10 -05:00
parent f0df1ed0fd
commit ffb95239aa
6 changed files with 86 additions and 3 deletions

NEWS

@@ -21,6 +21,18 @@ definitely unused---this is notably the case for modules that are only
used at macro-expansion time, such as (srfi srfi-26). In those cases,
the compiler reports it as "possibly unused".
* Bug fixes
** Hashing of UTF-8 symbols with non-ASCII characters avoids corruption
This issue could cause `scm_from_utf8_symbol' and
`scm_from_utf8_symboln' to incorrectly conclude that the symbol hadn't
already been interned, and then create a new one, which of course
wouldn't be `eq?' to the other(s). The incorrect hash was the result of
a buffer overrun, and so might vary. This problem affected a number of
other operations, given the internal use of those functions.
(<https://bugs.gnu.org/56413>)
Changes in 3.0.9 (since 3.0.8)