@node SRFI-13/14
@chapter SRFI-13 and SRFI-14

This chapter documents the SRFI-13/14 library, which provides the string
utility procedures defined in SRFI-13 and the character-set procedures
defined in SRFI-14 for Guile.

@menu
* Introduction::                What is this all about?
* Loading SRFI-13/14::          Loading the module into a running Guile.
* String Functions::            Available string processing procedures.
* Character-set Procedures::    Procedures for manipulating character sets.
@end menu

@c ===================================================================

@node Introduction
@section Introduction

The SRFI-13/14 library is a shared library which provides the procedures
defined in SRFI-13 (string library) and the procedures defined in
SRFI-14 (character-set library).  You should also refer to the SRFI
documents, which provide some details I will not document here.

If you don't know what SRFI means, and what all the numbers are about,
you may want to refer to the SRFI home page at
@url{http://srfi.schemers.org}.

Note that only those SRFI-13 procedures which are not already contained
in Guile are documented here.  For procedures not documented here,
please refer to the relevant chapters in the Guile Reference Manual, for
example the documentation of strings and string procedures (REFFIXME).
The SRFI-14 procedures are documented completely.

@menu
* What can be done::            What is possible with SRFI-13/14
* What cannot be done::         and what is not?
@end menu

@c ===================================================================

@node What can be done
@subsection What can be done

All of the procedures defined in SRFI-13 which are not already included
in the Guile core library are implemented in the module
@code{(srfi srfi-13)}.  The procedures which are both in Guile and in
SRFI-13, but which are slightly extended, have been implemented in this
module, and the bindings overwrite those in the Guile core.

All procedures from SRFI-14 (character-set library) are implemented in
the module @code{(srfi srfi-14)}, as well as the standard variables
@code{char-set:letter}, @code{char-set:digit} etc.

@c ===================================================================

@node What cannot be done
@subsection What cannot be done

The procedures which are defined in the section @emph{Low-level
procedures} of SRFI-13 for parsing optional string indices, substring
specification checking and Knuth-Morris-Pratt searching are not
implemented.

The procedures @code{string-contains} and @code{string-contains-ci} are
not implemented very efficiently at the moment.  This will be changed as
soon as possible.

@c ===================================================================

@node Loading SRFI-13/14
@section Loading SRFI-13/14

When Guile is properly installed, the SRFI-13 procedures can be loaded
into a running Guile by using the @code{(srfi srfi-13)} module.

@example
$ guile
guile> (use-modules (srfi srfi-13))
guile>
@end example

If this step causes any errors, Guile is not properly installed.

One possible reason is that Guile cannot find either the Scheme module
file @file{srfi-13.scm} or the shared object file
@file{libguile-srfi-srfi-13-14.so}.  Make sure that the former is in the
Guile load path and that the latter is either installed in some default
location like @file{/usr/local/lib} or that the directory it was
installed to is in your @code{LTDL_LIBRARY_PATH}.  The same applies to
@file{srfi-14.scm}.

Now you can test whether the SRFI-13 procedures are working by calling
the @code{string-concatenate} procedure.

@example
guile> (string-concatenate '("Hello" " " "World!"))
"Hello World!"
@end example

The same goes for the SRFI-14 module, of course.
@example
$ guile
guile> (use-modules (srfi srfi-14))
guile> (char-set-union (char-set #\f #\o #\o) (string->char-set "bar"))
#<charset @{#\a #\b #\f #\o #\r@}>
guile>
@end example

@c ===================================================================

@node String Functions
@section String Functions

In this section, we will describe all procedures defined in SRFI-13
(string library) and implemented by the module @code{(srfi srfi-13)}.

Except for the procedures in the section @emph{Low-level procedures} of
SRFI-13, all string procedures defined there are implemented completely.

@menu
* Predicates::                  Testing strings.
* SRFI-13 Constructors::        Constructing strings.
* SRFI-13 List/String Conversion::  Conversion from/to character lists.
* SRFI-13 Selection::           Selecting portions from strings.
* SRFI-13 Modification::        Modifying strings in-place.
* SRFI-13 Comparison::          Comparing strings.
* Prefixes/Suffixes::           Checking for common pre-/suffixes.
* Searching::                   Searching in strings.
* Case Mapping::                Changing the case of strings.
* Reverse/Append::              Append, concatenate and reverse strings.
* Fold/Unfold/Map::             Fold/Unfold/Map over strings.
* Replicate/Rotate::            String replication and rotation.
* Miscellaneous::               Miscellaneous string procedures.
* Filtering/Deleting::          Deleting characters from strings.
@end menu

@c ===================================================================

@node Predicates
@subsection Predicates

In addition to the primitives @code{string?} and @code{string-null?},
which are already in the Guile core, the string predicates
@code{string-any} and @code{string-every} are defined by SRFI-13.

@deffn primitive string-any pred s [start end]
Check if the predicate @var{pred} is true for any character in the
string @var{s}, proceeding from left (index @var{start}) to right (index
@var{end}).  If @code{string-any} returns true, the returned true value
is the one produced by the first successful application of @var{pred}.
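
A few calls may make the return-value rule concrete.  This is only a
sketch with made-up sample strings, assuming @code{(srfi srfi-13)} has
been loaded as described above; the last call uses a hypothetical
predicate that returns the matching character itself.

@lisp
(use-modules (srfi srfi-13))

(string-any char-numeric? "abc1")     @result{} #t
(string-any char-numeric? "abc")      @result{} #f
;; The search starts at index 1, so the leading digit is skipped.
(string-any char-numeric? "1abc" 1)   @result{} #f
;; The true value is whatever the predicate returned.
(string-any (lambda (c) (and (char-numeric? c) c)) "ab7")
                                      @result{} #\7
@end lisp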
@end deffn

@deffn primitive string-every pred s [start end]
Check if the predicate @var{pred} is true for every character in the
string @var{s}, proceeding from left (index @var{start}) to right (index
@var{end}).  If @code{string-every} returns true, the returned true
value is the one produced by the final application of @var{pred} to the
last character of @var{s}.
@end deffn

@c ===================================================================

@node SRFI-13 Constructors
@subsection Constructors

SRFI-13 defines several procedures for constructing new strings.  In
addition to @code{make-string} and @code{string} (available in the Guile
core library), the procedure @code{string-tabulate} is provided.

@deffn primitive string-tabulate proc len
@var{proc} is an integer->char procedure.  Construct a string of size
@var{len} by applying @var{proc} to each index to produce the
corresponding string element.  The order in which @var{proc} is applied
to the indices is not specified.
@end deffn

@c ===================================================================

@node SRFI-13 List/String Conversion
@subsection List/String Conversion

The procedure @code{string->list} is extended by SRFI-13, which is why
it is included in @code{(srfi srfi-13)}.  The other procedures are new.
The Guile core already contains the procedure @code{list->string} for
converting a list of characters into a string (REFFIXME).

@deffn primitive string->list str [start end]
Convert the string @var{str} into a list of characters.
@end deffn

@deffn primitive reverse-list->string chrs
An efficient implementation of @code{(compose list->string reverse)}:

@smalllisp
(reverse-list->string '(#\a #\B #\c)) @result{} "cBa"
@end smalllisp
@end deffn

@deffn primitive string-join ls [delimiter grammar]
Append the strings in the string list @var{ls}, using the string
@var{delimiter} as a delimiter between the elements of @var{ls}.
@var{grammar} is a symbol which specifies how the delimiter is placed
between the strings, and defaults to the symbol @code{infix}.

@table @code
@item infix
Insert the separator between list elements.  An empty list will produce
an empty string.

@item strict-infix
Like @code{infix}, but will raise an error if given the empty list.

@item suffix
Insert the separator after every list element.

@item prefix
Insert the separator before each list element.
@end table
@end deffn

@c ===================================================================

@node SRFI-13 Selection
@subsection Selection

These procedures are called @dfn{selectors}, because they access
information about the string or select pieces of a given string.

Additional selector procedures are documented in the Strings section
(REFFIXME), like @code{string-length} or @code{string-ref}.

@code{string-copy} is also available in core Guile, but this version
accepts additional start/end indices.

@deffn primitive string-copy str [start end]
Return a freshly allocated copy of the string @var{str}.  If given,
@var{start} and @var{end} delimit the portion of @var{str} which is
copied.
@end deffn

@deffn primitive substring/shared str start [end]
Like @code{substring}, but the result may share memory with the argument
@var{str}.
@end deffn

@deffn primitive string-copy! target tstart s [start end]
Copy the sequence of characters from index range [@var{start},
@var{end}) in string @var{s} to string @var{target}, beginning at index
@var{tstart}.  The characters are copied left-to-right or right-to-left
as needed -- the copy is guaranteed to work, even if @var{target} and
@var{s} are the same string.  It is an error if the copy operation runs
off the end of the target string.
@end deffn

@deffn primitive string-take s n
@deffnx primitive string-take-right s n
Return the @var{n} first/last characters of @var{s}.
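
As a small illustration (the sample string is arbitrary, and
@code{(srfi srfi-13)} is assumed to be loaded):

@lisp
(use-modules (srfi srfi-13))

(string-take "hello" 2)         @result{} "he"
(string-take-right "hello" 2)   @result{} "lo"
@end lisp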
@end deffn

@deffn primitive string-drop s n
@deffnx primitive string-drop-right s n
Return all but the first/last @var{n} characters of @var{s}.
@end deffn

@deffn primitive string-pad s len [chr start end]
@deffnx primitive string-pad-right s len [chr start end]
Take the characters from @var{start} to @var{end} from the string
@var{s} and return a new string, left (right)-padded by the character
@var{chr} to length @var{len}.  If the resulting string is longer than
@var{len}, it is truncated on the left (right).
@end deffn

@deffn primitive string-trim s [char_pred start end]
@deffnx primitive string-trim-right s [char_pred start end]
@deffnx primitive string-trim-both s [char_pred start end]
Trim @var{s} by skipping over all characters on the left/right/both
sides of the string that satisfy the parameter @var{char_pred}:

@itemize @bullet
@item
if it is the character @var{ch}, characters equal to @var{ch} are
trimmed,

@item
if it is a procedure @var{pred}, characters that satisfy @var{pred} are
trimmed,

@item
if it is a character set, characters in that set are trimmed.
@end itemize

If called without a @var{char_pred} argument, all whitespace is trimmed.
@end deffn

@c ===================================================================

@node SRFI-13 Modification
@subsection Modification

The procedure @code{string-fill!} is extended from R5RS because it
accepts optional start/end indices.  This binding shadows the procedure
of the same name in the Guile core.  The second modification procedure
@code{string-set!} is documented in the Strings section (REFFIXME).

@deffn primitive string-fill! str chr [start end]
Stores @var{chr} in every element of the given @var{str} and returns an
unspecified value.
@end deffn

@c ===================================================================

@node SRFI-13 Comparison
@subsection Comparison

The procedures in this section are used for comparing strings in
different ways.
The comparison predicates differ from those in R5RS in that they do not
simply return @code{#t} or @code{#f}; when the comparison succeeds, the
true value they return is the mismatch index.

@code{string-hash} and @code{string-hash-ci} are for calculating hash
values for strings, useful for implementing fast lookup mechanisms.

@deffn primitive string-compare s1 s2 proc_lt proc_eq proc_gt [start1 end1 start2 end2]
@deffnx primitive string-compare-ci s1 s2 proc_lt proc_eq proc_gt [start1 end1 start2 end2]
Apply @var{proc_lt}, @var{proc_eq}, @var{proc_gt} to the mismatch index,
depending upon whether @var{s1} is less than, equal to, or greater than
@var{s2}.  The mismatch index is the largest index @var{i} such that for
every 0 <= @var{j} < @var{i}, @var{s1}[@var{j}] = @var{s2}[@var{j}] --
that is, @var{i} is the first position that does not match.
@code{string-compare-ci} compares the characters case-insensitively.
@end deffn

@deffn primitive string= s1 s2 [start1 end1 start2 end2]
@deffnx primitive string<> s1 s2 [start1 end1 start2 end2]
@deffnx primitive string< s1 s2 [start1 end1 start2 end2]
@deffnx primitive string> s1 s2 [start1 end1 start2 end2]
@deffnx primitive string<= s1 s2 [start1 end1 start2 end2]
@deffnx primitive string>= s1 s2 [start1 end1 start2 end2]
Compare @var{s1} and @var{s2} and return @code{#f} if the predicate
fails.  Otherwise, the mismatch index is returned (or @var{end1} in the
case of @code{string=}).
@end deffn

@deffn primitive string-ci= s1 s2 [start1 end1 start2 end2]
@deffnx primitive string-ci<> s1 s2 [start1 end1 start2 end2]
@deffnx primitive string-ci< s1 s2 [start1 end1 start2 end2]
@deffnx primitive string-ci> s1 s2 [start1 end1 start2 end2]
@deffnx primitive string-ci<= s1 s2 [start1 end1 start2 end2]
@deffnx primitive string-ci>= s1 s2 [start1 end1 start2 end2]
Compare @var{s1} and @var{s2} and return @code{#f} if the predicate
fails.  Otherwise, the mismatch index is returned (or @var{end1} in the
case of @code{string-ci=}).  These are the case-insensitive variants.
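
The following sketch uses arbitrary sample strings and assumes
@code{(srfi srfi-13)} is loaded.  Only the failing case shows a concrete
result; for the succeeding cases the exact true value may depend on the
Guile version, so the comments say no more than the text above does.

@lisp
(use-modules (srfi srfi-13))

;; Succeeds: the strings differ only in case.
(string-ci= "Hello" "hELLO")
;; Succeeds: "abc" sorts before "abd" when case is ignored.
(string-ci< "abc" "ABD")
;; Fails: "abd" does not sort before "abc".
(string-ci< "abd" "ABC")        @result{} #f
@end lisp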
@end deffn

@deffn primitive string-hash s [bound start end]
@deffnx primitive string-hash-ci s [bound start end]
Return a hash value of the string @var{s} in the range 0 @dots{}
@var{bound} - 1.  @code{string-hash-ci} is the case-insensitive variant.
@end deffn

@c ===================================================================

@node Prefixes/Suffixes
@subsection Prefixes/Suffixes

Using these procedures you can determine whether a given string is a
prefix or suffix of another string or how long a common prefix/suffix
is.

@deffn primitive string-prefix-length s1 s2 [start1 end1 start2 end2]
@deffnx primitive string-prefix-length-ci s1 s2 [start1 end1 start2 end2]
@deffnx primitive string-suffix-length s1 s2 [start1 end1 start2 end2]
@deffnx primitive string-suffix-length-ci s1 s2 [start1 end1 start2 end2]
Return the length of the longest common prefix/suffix of the two
strings.  @code{string-prefix-length-ci} and
@code{string-suffix-length-ci} are the case-insensitive variants.
@end deffn

@deffn primitive string-prefix? s1 s2 [start1 end1 start2 end2]
@deffnx primitive string-prefix-ci? s1 s2 [start1 end1 start2 end2]
@deffnx primitive string-suffix? s1 s2 [start1 end1 start2 end2]
@deffnx primitive string-suffix-ci? s1 s2 [start1 end1 start2 end2]
Return true if @var{s1} is a prefix/suffix of @var{s2}.
@code{string-prefix-ci?} and @code{string-suffix-ci?} are the
case-insensitive variants.
@end deffn

@c ===================================================================

@node Searching
@subsection Searching

Use these procedures to find out whether a string contains a given
character or a given substring, or a character from a set of characters.
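
Before the individual descriptions, here is a brief sketch of how the
searching procedures behave.  The sample strings are arbitrary, and
@code{(srfi srfi-13)} is assumed to be loaded.

@lisp
(use-modules (srfi srfi-13))

(string-index "hello" #\l)            @result{} 2
(string-index-right "hello" #\l)      @result{} 3
(string-index "hello" #\z)            @result{} #f
;; Index of the first character that is not a space.
(string-skip "  hi" #\space)          @result{} 2
(string-count "banana" #\a)           @result{} 3
(string-contains "hello world" "wor") @result{} 6
@end lisp
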
@deffn primitive string-index s char_pred [start end]
@deffnx primitive string-index-right s char_pred [start end]
Search through the string @var{s} from left to right (right to left),
returning the index of the first (last) occurrence of a character which

@itemize
@item
equals @var{char_pred}, if it is a character,

@item
satisfies the predicate @var{char_pred}, if it is a procedure,

@item
is in the set @var{char_pred}, if it is a character set.
@end itemize
@end deffn

@deffn primitive string-skip s char_pred [start end]
@deffnx primitive string-skip-right s char_pred [start end]
Search through the string @var{s} from left to right (right to left),
returning the index of the first (last) occurrence of a character which

@itemize
@item
does not equal @var{char_pred}, if it is a character,

@item
does not satisfy the predicate @var{char_pred}, if it is a procedure,

@item
is not in the set @var{char_pred}, if it is a character set.
@end itemize
@end deffn

@deffn primitive string-count s char_pred [start end]
Return the number of characters in the string @var{s} which

@itemize @bullet
@item
equal @var{char_pred}, if it is a character,

@item
satisfy the predicate @var{char_pred}, if it is a procedure,

@item
are in the set @var{char_pred}, if it is a character set.
@end itemize
@end deffn

@deffn primitive string-contains s1 s2 [start1 end1 start2 end2]
@deffnx primitive string-contains-ci s1 s2 [start1 end1 start2 end2]
Does string @var{s1} contain string @var{s2}?  Return the index in
@var{s1} where @var{s2} occurs as a substring, or false.  The optional
start/end indices restrict the operation to the indicated substrings.
@code{string-contains-ci} is the case-insensitive variant.
@end deffn

@c ===================================================================

@node Case Mapping
@subsection Alphabetic Case Mapping

These procedures convert the alphabetic case of strings.
They are similar to the procedures in the Guile core, but are extended
to handle optional start/end indices.

@deffn primitive string-upcase s [start end]
@deffnx primitive string-upcase! s [start end]
Upcase every character in @var{s}.  @code{string-upcase!} is the
side-effecting variant.
@end deffn

@deffn primitive string-downcase s [start end]
@deffnx primitive string-downcase! s [start end]
Downcase every character in @var{s}.  @code{string-downcase!} is the
side-effecting variant.
@end deffn

@deffn primitive string-titlecase s [start end]
@deffnx primitive string-titlecase! s [start end]
Upcase the first character of every word in @var{s} and downcase the
other characters.  @code{string-titlecase!} is the side-effecting
variant.
@end deffn

@c ===================================================================

@node Reverse/Append
@subsection Reverse/Append

One appending procedure, @code{string-append}, is the same in R5RS and
in SRFI-13, so it is not redefined.

@deffn primitive string-reverse str [start end]
@deffnx primitive string-reverse! str [start end]
Reverse the string @var{str}.  The optional arguments @var{start} and
@var{end} delimit the region of @var{str} to operate on.

@code{string-reverse!} modifies the argument string and returns an
unspecified value.
@end deffn

@deffn primitive string-append/shared ls @dots{}
Like @code{string-append}, but the result may share memory with the
argument strings.
@end deffn

@deffn primitive string-concatenate ls
Append the elements of @var{ls} (which must be strings) together into a
single string.  Guaranteed to return a freshly allocated string.
@end deffn

@deffn primitive string-concatenate/shared ls
Like @code{string-concatenate}, but the result may share memory with the
strings in the list @var{ls}.
@end deffn

@deffn primitive string-concatenate-reverse ls [final_string end]
Without optional arguments, this procedure is equivalent to

@smalllisp
(string-concatenate (reverse ls))
@end smalllisp

If the optional argument @var{final_string} is specified, it is consed
onto the beginning of @var{ls} before performing the list-reverse and
string-concatenate operations.  If @var{end} is given, only the
characters of @var{final_string} up to index @var{end} are used.

Guaranteed to return a freshly allocated string.
@end deffn

@deffn primitive string-concatenate-reverse/shared ls [final_string end]
Like @code{string-concatenate-reverse}, but the result may share memory
with the strings in the @var{ls} arguments.
@end deffn

@c ===================================================================

@node Fold/Unfold/Map
@subsection Fold/Unfold/Map

@code{string-map}, @code{string-for-each} etc. are for iterating over
the characters a string is composed of.  The fold and unfold procedures
are string iterators and constructors.

@deffn primitive string-map proc s [start end]
@var{proc} is a char->char procedure; it is mapped over @var{s}.  The
order in which the procedure is applied to the string elements is not
specified.
@end deffn

@deffn primitive string-map! proc s [start end]
@var{proc} is a char->char procedure; it is mapped over @var{s}.  The
order in which the procedure is applied to the string elements is not
specified.  The string @var{s} is modified in-place; the return value is
not specified.
@end deffn

@deffn primitive string-fold kons knil s [start end]
@deffnx primitive string-fold-right kons knil s [start end]
Fold @var{kons} over the characters of @var{s}, with @var{knil} as the
terminating element, from left to right (or right to left, for
@code{string-fold-right}).  @var{kons} must expect two arguments: the
current character and the result of the previous application of
@var{kons}.
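
The argument order of @var{kons} is easiest to see with @code{cons}
itself.  The sketch below uses arbitrary sample strings and assumes
@code{(srfi srfi-13)} is loaded; the counting procedure in the last call
is a hypothetical helper written inline.

@lisp
(use-modules (srfi srfi-13))

;; Left to right: each character is consed onto the result so far.
(string-fold cons '() "abc")         @result{} (#\c #\b #\a)
;; Right to left: the original order is preserved.
(string-fold-right cons '() "abc")   @result{} (#\a #\b #\c)
;; Counting occurrences of #\a with a numeric accumulator.
(string-fold (lambda (c n) (if (char=? c #\a) (+ n 1) n))
             0 "banana")             @result{} 3
@end lisp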
@end deffn

@deffn primitive string-unfold p f g seed [base make_final]
@deffnx primitive string-unfold-right p f g seed [base make_final]
These are the fundamental string constructors.

@itemize
@item
@var{g} is used to generate a series of @emph{seed} values from the
initial @var{seed}: @var{seed}, (@var{g} @var{seed}), (@var{g}^2
@var{seed}), (@var{g}^3 @var{seed}), @dots{}

@item
@var{p} tells us when to stop -- when it returns true when applied to
one of these seed values.

@item
@var{f} maps each seed value to the corresponding character in the
result string.  These chars are assembled into the string in a
left-to-right (right-to-left) order.

@item
@var{base} is the optional initial/leftmost (rightmost) portion of the
constructed string; it defaults to the empty string.

@item
@var{make_final} is applied to the terminal seed value (on which
@var{p} returns true) to produce the final/rightmost (leftmost) portion
of the constructed string.  It defaults to @code{(lambda (x) "")}.
@end itemize
@end deffn

@deffn primitive string-for-each proc s [start end]
@var{proc} is mapped over @var{s} in left-to-right order.  The return
value is not specified.
@end deffn

@c ===================================================================

@node Replicate/Rotate
@subsection Replicate/Rotate

These procedures are special substring procedures, which can also be
used for replicating strings.  They are a bit tricky to use, but
consider this code fragment, which replicates the input string
@code{"foo"} as often as needed so that the resulting string has a
length of six.

@lisp
(xsubstring "foo" 0 6) @result{} "foofoo"
@end lisp

@deffn primitive xsubstring s from [to start end]
This is the @emph{extended substring} procedure that implements
replicated copying of a substring of some string.

@var{s} is a string, @var{start} and @var{end} are optional arguments
that demarcate a substring of @var{s}, defaulting to 0 and the length of
@var{s}.
Replicate this substring up and down index space, in both the positive
and negative directions.  @code{xsubstring} returns the substring of
this string beginning at index @var{from}, and ending at @var{to}, which
defaults to @var{from} + (@var{end} - @var{start}).
@end deffn

@deffn primitive string-xcopy! target tstart s sfrom [sto start end]
Exactly the same as @code{xsubstring}, but the extracted text is written
into the string @var{target} starting at index @var{tstart}.  The
operation is not defined if @code{(eq? @var{target} @var{s})} or these
arguments share storage -- you cannot copy a string on top of itself.
@end deffn

@c ===================================================================

@node Miscellaneous
@subsection Miscellaneous

@code{string-replace} is for replacing a portion of a string with
another string and @code{string-tokenize} splits a string into a list of
strings, breaking it up at a specified character.

@deffn primitive string-replace s1 s2 [start1 end1 start2 end2]
Return the string @var{s1}, but with the characters @var{start1}
@dots{} @var{end1} replaced by the characters @var{start2} @dots{}
@var{end2} from @var{s2}.
@end deffn

@deffn primitive string-tokenize s [token_char start end]
Split the string @var{s} into a list of substrings, where each substring
is a maximal non-empty contiguous sequence of characters equal to the
character @var{token_char}, or whitespace, if @var{token_char} is not
given.  If @var{token_char} is a character set, it is used for finding
the token borders.
@end deffn

@c ===================================================================

@node Filtering/Deleting
@subsection Filtering/Deleting

@dfn{Filtering} means to remove all characters from a string which do
not match a given criterion; @dfn{deleting} means the opposite.

@deffn primitive string-filter s char_pred [start end]
Filter the string @var{s}, retaining only those characters that satisfy
the @var{char_pred} argument.
If the argument is a procedure, it is applied to each character as a
predicate; if it is a character, it is tested for equality; and if it is
a character set, it is tested for membership.
@end deffn

@deffn primitive string-delete s char_pred [start end]
Filter the string @var{s}, retaining only those characters that do not
satisfy the @var{char_pred} argument.  If the argument is a procedure,
it is applied to each character as a predicate; if it is a character, it
is tested for equality; and if it is a character set, it is tested for
membership.
@end deffn

@c ===================================================================

@node Character-set Procedures
@section Character-set Procedures

SRFI-14 defines the data type @dfn{character set}, and also defines a
lot of procedures for handling this data type, and a few standard
character sets like whitespace, alphabetic characters and others.

@menu
* Character Set Data Type::     Description of the character set data type.
* Predicates/Comparison::       Testing character sets.
* Iterating Over Character Sets::  Iterating over the members of a set.
* Creating Character Sets::     Creating new character sets.
* Querying Character Sets::     Extracting information from character sets.
* Character-Set Algebra::       Set-algebra on character sets.
* Standard Character Sets::     Variables containing standard character sets.
@end menu

@c ===================================================================

@node Character Set Data Type
@subsection Character Set Data Type

The data type @dfn{charset} implements sets of characters (REFFIXME).
Because the internal representation of character sets is not visible to
the user, a lot of procedures for handling them are provided.

Character sets can be created, extended, tested for the membership of a
character and compared to other character sets.

The Guile implementation of character sets deals with 8-bit characters.
In the standard variables, only the ASCII part of the character range is
really used, so that for example @dfn{Umlaute} and other accented
characters are not considered to be letters.  In the future, as Guile
may get support for international character sets, this will change, so
don't rely on these ``features''.

@c ===================================================================

@node Predicates/Comparison
@subsection Predicates/Comparison

Use these procedures for testing whether an object is a character set,
or whether several character sets are equal or subsets of each other.
@code{char-set-hash} can be used for calculating a hash value, for
example for use in fast lookup procedures.

@deffn primitive char-set? obj
Return @code{#t} if @var{obj} is a character set, @code{#f} otherwise.
@end deffn

@deffn primitive char-set= cs1 @dots{}
Return @code{#t} if all given character sets are equal.
@end deffn

@deffn primitive char-set<= cs1 @dots{}
Return @code{#t} if every character set @var{cs}i is a subset of
character set @var{cs}i+1.
@end deffn

@deffn primitive char-set-hash cs [bound]
Compute a hash value for the character set @var{cs}.  If @var{bound} is
given and not @code{#f}, it restricts the returned value to the range 0
@dots{} @var{bound} - 1.
@end deffn

@c ===================================================================

@node Iterating Over Character Sets
@subsection Iterating Over Character Sets

Character set cursors are a means for iterating over the members of a
character set.  After creating a character set cursor with
@code{char-set-cursor}, a cursor can be dereferenced with
@code{char-set-ref} and advanced to the next member with
@code{char-set-cursor-next}.  Whether a cursor has passed the last
element of the set can be checked with @code{end-of-char-set?}.

Additionally, mapping and (un-)folding procedures for character sets are
provided.

@deffn primitive char-set-cursor cs
Return a cursor into the character set @var{cs}.
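
The cursor procedures are typically combined into a loop.  The following
is a minimal sketch, assuming @code{(srfi srfi-14)} is loaded; the
variable names @code{cs}, @code{cur} and @code{acc} are chosen for
illustration only.  It collects the members of a set and counts them,
without relying on any particular iteration order.

@lisp
(use-modules (srfi srfi-14))

(define cs (char-set #\a #\b #\c))

(let loop ((cur (char-set-cursor cs))   ; start at the first member
           (acc '()))
  (if (end-of-char-set? cur)
      (length acc)                      ; all members have been visited
      (loop (char-set-cursor-next cs cur)
            (cons (char-set-ref cs cur) acc))))
@result{} 3
@end lisp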
@end deffn

@deffn primitive char-set-ref cs cursor
Return the character at the current cursor position @var{cursor} in the
character set @var{cs}.  It is an error to pass a cursor for which
@code{end-of-char-set?} returns true.
@end deffn

@deffn primitive char-set-cursor-next cs cursor
Advance the character set cursor @var{cursor} to the next character in
the character set @var{cs}.  It is an error if the cursor given
satisfies @code{end-of-char-set?}.
@end deffn

@deffn primitive end-of-char-set? cursor
Return @code{#t} if @var{cursor} has reached the end of a character set,
@code{#f} otherwise.
@end deffn

@deffn primitive char-set-fold kons knil cs
Fold the procedure @var{kons} over the character set @var{cs},
initializing it with @var{knil}.
@end deffn

@deffn primitive char-set-unfold p f g seed [base_cs]
@deffnx primitive char-set-unfold! p f g seed base_cs
This is a fundamental constructor for character sets.

@itemize
@item
@var{g} is used to generate a series of ``seed'' values from the initial
seed: @var{seed}, (@var{g} @var{seed}), (@var{g}^2 @var{seed}),
(@var{g}^3 @var{seed}), @dots{}

@item
@var{p} tells us when to stop -- when it returns true when applied to
one of the seed values.

@item
@var{f} maps each seed value to a character.  These characters are added
to the base character set @var{base_cs} to form the result;
@var{base_cs} defaults to the empty set.
@end itemize

@code{char-set-unfold!} is the side-effecting variant.
@end deffn

@deffn primitive char-set-for-each proc cs
Apply @var{proc} to every character in the character set @var{cs}.  The
return value is not specified.
@end deffn

@deffn primitive char-set-map proc cs
Map the procedure @var{proc} over every character in @var{cs}.
@var{proc} must be a character -> character procedure.
@end deffn

@c ===================================================================

@node Creating Character Sets
@subsection Creating Character Sets

New character sets are produced with these procedures.
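
As a quick orientation, the sketch below builds the same three-element
set in several ways and compares the results.  It assumes
@code{(srfi srfi-14)} is loaded and an ASCII-compatible character
encoding, so that the codes 97 to 99 denote @code{a}, @code{b} and
@code{c}; the sample characters are arbitrary.

@lisp
(use-modules (srfi srfi-14))

(char-set= (char-set #\a #\b #\c)
           (list->char-set '(#\a #\b #\c))
           (string->char-set "abc")
           ;; half-open range: codes 97, 98 and 99
           (ucs-range->char-set 97 100))
@result{} #t

;; Keep only the members that satisfy a predicate.
(char-set->list
 (char-set-filter char-upper-case? (string->char-set "aBc")))
@result{} (#\B)
@end lisp
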
@deffn primitive char-set-copy cs
Return a newly allocated character set containing all characters in
@var{cs}.
@end deffn

@deffn primitive char-set char1 @dots{}
Return a character set containing all given characters.
@end deffn

@deffn primitive list->char-set char_list [base_cs]
@deffnx primitive list->char-set! char_list base_cs
Convert the character list @var{char_list} to a character set.  If the
character set @var{base_cs} is given, the characters in this set are
also included in the result.

@code{list->char-set!} is the side-effecting variant.
@end deffn

@deffn primitive string->char-set s [base_cs]
@deffnx primitive string->char-set! s base_cs
Convert the string @var{s} to a character set.  If the character set
@var{base_cs} is given, the characters in this set are also included in
the result.

@code{string->char-set!} is the side-effecting variant.
@end deffn

@deffn primitive char-set-filter pred cs [base_cs]
@deffnx primitive char-set-filter! pred cs base_cs
Return a character set containing every character from @var{cs} that
satisfies @var{pred}.  If provided, the characters from @var{base_cs}
are added to the result.

@code{char-set-filter!} is the side-effecting variant.
@end deffn

@deffn primitive ucs-range->char-set lower upper [error? base_cs]
@deffnx primitive ucs-range->char-set! lower upper error? base_cs
Return a character set containing all characters whose character codes
lie in the half-open range [@var{lower},@var{upper}).

If @var{error?} is a true value, an error is signalled if the specified
range contains characters which are not contained in the implemented
character range.  If @var{error?} is @code{#f}, these characters are
silently left out of the resulting character set.

The characters in @var{base_cs} are added to the result, if given.

@code{ucs-range->char-set!} is the side-effecting variant.
@end deffn

@deffn procedure ->char-set x
Coerce @var{x} into a character set.  @var{x} may be a string, a
character or a character set.
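
A short sketch of the coercion, with arbitrary sample values and
assuming @code{(srfi srfi-14)} is loaded:

@lisp
(use-modules (srfi srfi-14))

(char-set= (->char-set "abc") (char-set #\a #\b #\c))   @result{} #t
(char-set= (->char-set #\a) (char-set #\a))             @result{} #t
@end lisp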
@end deffn

@c ===================================================================
@node Querying Character Sets
@subsection Querying Character Sets

These procedures access the elements of a character set and report
other information about it.

@deffn primitive char-set-size cs
Return the number of elements in character set @var{cs}.
@end deffn

@deffn primitive char-set-count pred cs
Return the number of elements in the character set @var{cs} which
satisfy the predicate @var{pred}.
@end deffn

@deffn primitive char-set->list cs
Return a list containing the elements of the character set @var{cs}.
@end deffn

@deffn primitive char-set->string cs
Return a string containing the elements of the character set @var{cs}.
The order in which the characters are placed in the string is not
defined.
@end deffn

@deffn primitive char-set-contains? cs char
Return @code{#t} if and only if the character @var{char} is contained
in the character set @var{cs}.
@end deffn

@deffn primitive char-set-every pred cs
Return a true value if every character in the character set @var{cs}
satisfies the predicate @var{pred}.
@end deffn

@deffn primitive char-set-any pred cs
Return a true value if any character in the character set @var{cs}
satisfies the predicate @var{pred}.
@end deffn

@c ===================================================================
@node Character-Set Algebra
@subsection Character-Set Algebra

Character sets can be manipulated with the common set algebra
operations, such as union, complement, intersection etc.  All of these
procedures provide side-effecting variants, which modify their
character set argument(s).

@deffn primitive char-set-adjoin cs char1 @dots{}
@deffnx primitive char-set-adjoin! cs char1 @dots{}
Add all character arguments to the first argument, which must be a
character set.
@end deffn

@deffn primitive char-set-delete cs char1 @dots{}
@deffnx primitive char-set-delete! cs char1 @dots{}
Delete all character arguments from the first argument, which must be a
character set.
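
A short sketch of this procedure and of @code{char-set-adjoin},
assuming @code{(srfi srfi-14)} is loaded; the variable name @code{cs}
is illustrative.  The variants without the trailing @code{!} return a
new set and leave their argument unchanged, which the last line checks.

@example
guile> (define cs (char-set #\a #\b))
guile> (char-set-contains? (char-set-adjoin cs #\c) #\c)
#t
guile> (char-set-contains? (char-set-delete cs #\a) #\a)
#f
guile> (char-set-size cs)
2
@end example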
@end deffn

@deffn primitive char-set-complement cs
@deffnx primitive char-set-complement! cs
Return the complement of the character set @var{cs}.
@end deffn

@deffn primitive char-set-union cs1 @dots{}
@deffnx primitive char-set-union! cs1 @dots{}
Return the union of all argument character sets.
@end deffn

@deffn primitive char-set-intersection cs1 @dots{}
@deffnx primitive char-set-intersection! cs1 @dots{}
Return the intersection of all argument character sets.
@end deffn

@deffn primitive char-set-difference cs1 @dots{}
@deffnx primitive char-set-difference! cs1 @dots{}
Return the difference of the argument character sets, that is, the
characters of @var{cs1} which are contained in none of the remaining
sets.
@end deffn

@deffn primitive char-set-xor cs1 @dots{}
@deffnx primitive char-set-xor! cs1 @dots{}
Return the exclusive-or of all argument character sets.
@end deffn

@deffn primitive char-set-diff+intersection cs1 @dots{}
@deffnx primitive char-set-diff+intersection! cs1 @dots{}
Return two values, the difference and the intersection of the argument
character sets.
@end deffn

@c ===================================================================
@node Standard Character Sets
@subsection Standard Character Sets

To make the character set data type and its procedures convenient to
use, several predefined character set variables exist.

@defvar char-set:lower-case
All lower-case characters.
@end defvar

@defvar char-set:upper-case
All upper-case characters.
@end defvar

@defvar char-set:title-case
This is empty, because ASCII has no titlecase characters.
@end defvar

@defvar char-set:letter
All letters, i.e.@: the union of @code{char-set:lower-case} and
@code{char-set:upper-case}.
@end defvar

@defvar char-set:digit
All digits.
@end defvar

@defvar char-set:letter+digit
The union of @code{char-set:letter} and @code{char-set:digit}.
@end defvar

@defvar char-set:graphic
All characters which would put ink on the paper.
@end defvar

@defvar char-set:printing
The union of @code{char-set:graphic} and @code{char-set:whitespace}.
@end defvar

@defvar char-set:whitespace
All whitespace characters.
@end defvar

@defvar char-set:blank
All horizontal whitespace characters, that is @code{#\space} and
@code{#\tab}.
@end defvar

@defvar char-set:iso-control
The ISO control characters with the codes 0--31 and 127.
@end defvar

@defvar char-set:punctuation
The characters @code{!"#%&'()*,-./:;?@@[\\]_@{@}}.
@end defvar

@defvar char-set:symbol
The characters @code{$+<=>^`|~}.
@end defvar

@defvar char-set:hex-digit
The hexadecimal digits @code{0123456789abcdefABCDEF}.
@end defvar

@defvar char-set:ascii
All ASCII characters.
@end defvar

@defvar char-set:empty
The empty character set.
@end defvar

@defvar char-set:full
This character set contains all possible characters.
@end defvar
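
The predefined variables combine naturally with the set algebra
procedures.  The following transcript is a minimal sketch, assuming
that @code{(srfi srfi-14)} loads without errors; the parameter names
@code{diff} and @code{both} are made up for illustration.  Since
@code{char-set-diff+intersection} returns two values, the sketch
receives them with @code{call-with-values}.

@example
guile> (use-modules (srfi srfi-14))
guile> (char-set-contains? char-set:digit #\7)
#t
guile> (char-set-size (char-set-difference char-set:hex-digit
                                           char-set:digit))
12
guile> (char-set-size (char-set-intersection char-set:hex-digit
                                             char-set:upper-case))
6
guile> (call-with-values
         (lambda ()
           (char-set-diff+intersection (char-set #\a #\b #\c)
                                       (char-set #\b #\c #\d)))
         (lambda (diff both)
           (list (char-set-size diff) (char-set-size both))))
(1 2)
@end example
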