mirror of https://git.savannah.gnu.org/git/guile.git
synced 2025-05-01 04:10:18 +02:00
Document string-bytes-per-char and %string-dump
* doc/ref/api-data.texi (String Internals): new subsubsection. Document string-bytes-per-char and %string-dump.
parent 7beae9f15a
commit d85ae24dfb
1 changed files with 66 additions and 0 deletions
@ -2601,6 +2601,7 @@ If you want to prevent modifications, use @code{substring/read-only}.
Guile provides all procedures of SRFI-13 and a few more.

@menu
* String Internals:: The storage strategy for strings.
* String Syntax:: Read syntax for strings.
* String Predicates:: Testing strings for certain properties.
* String Constructors:: Creating new string objects.
@ -2616,6 +2617,71 @@ Guile provides all procedures of SRFI-13 and a few more.
* Conversion to/from C::
@end menu

@node String Internals
@subsubsection String Internals

Guile stores each string in memory as a contiguous array of Unicode code
points along with an associated set of attributes. If every code point
of a string has an integer value between 0 and 255 inclusive, the code
point array is stored with one byte per code point, that is, as an
ISO-8859-1 (Latin-1) string. If any code point of the string has an
integer value greater than 255, the code point array is stored with four
bytes per code point, that is, as a UTF-32 string.

Conversion between the one-byte-per-code-point and
four-bytes-per-code-point representations happens automatically as
necessary.

No API is provided to set the internal representation of strings;
however, a pair of procedures is available to query it. These are
debugging procedures. Using them in production code is discouraged,
since the details of Guile's internal representation of strings may
change from release to release.

@deffn {Scheme Procedure} string-bytes-per-char str
@deffnx {C Function} scm_string_bytes_per_char (str)
Return the number of bytes used to encode a Unicode code point in string
@var{str}. The result is one or four.
@end deffn
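
As a minimal sketch of the storage rule described in this section, the
following expressions could be evaluated at a Guile REPL. The sample
strings are illustrative only, and the results shown assume the
one-or-four-byte representation documented here, which may change from
release to release.

@lisp
;; Every code point is <= 255, so one byte per code point suffices.
(string-bytes-per-char "hello")                 ; => 1

;; U+03BB (Greek small letter lambda) is > 255, so the whole
;; string is stored with four bytes per code point.
(string-bytes-per-char (string #\x3bb))         ; => 4

;; A string that combines both must also use the wide form.
(string-bytes-per-char
 (string-append "hello" (string #\x3bb)))       ; => 4
@end lisp
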

@deffn {Scheme Procedure} %string-dump str
@deffnx {C Function} scm_sys_string_dump (str)
Return an association list containing debugging information for
@var{str}. The association list has the following entries.
@table @code

@item string
The string itself.

@item start
The start index of the string within its stringbuf.

@item length
The length of the string.

@item shared
If this string is a substring, its parent string; otherwise
@code{#f}.

@item read-only
@code{#t} if the string is read-only.

@item stringbuf-chars
A new string containing the characters of this string's stringbuf.

@item stringbuf-length
The number of characters in this string's stringbuf.

@item stringbuf-shared
@code{#t} if this string's stringbuf is shared.

@item stringbuf-wide
@code{#t} if this stringbuf's characters are stored in a 32-bit buffer,
or @code{#f} if they are stored in an 8-bit buffer.
@end table
@end deffn
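
The entries listed above might be inspected as in the following sketch.
It assumes that the keys of the association list are the symbols named
in the table, and @code{dump-ref} is a hypothetical helper introduced
only for this illustration; it is not part of Guile.

@lisp
;; Hypothetical helper: fetch one entry of the debugging alist.
(define (dump-ref str key)
  (assq-ref (%string-dump str) key))

(dump-ref "hello" 'length)                      ; => 5
(dump-ref "hello" 'stringbuf-wide)              ; => #f

;; U+03BB is greater than 255, so its stringbuf is wide.
(dump-ref (string #\x3bb) 'stringbuf-wide)      ; => #t
@end lisp

Since these are debugging procedures, such queries are better suited to
interactive exploration than to production code.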
@node String Syntax
@subsubsection String Read Syntax