1
Fork 0
mirror of https://git.savannah.gnu.org/git/guile.git synced 2025-04-29 19:30:36 +02:00
Commit graph

97 commits

Author SHA1 Message Date
Ludovic Courtès
b3da54d181 Placate a number of `syntax-check' verifications.
- "filesystem" -> "file system"
  - remove doubled words
  - use EXIT_* macros instead of literal numbers
  - update `syntax-check' exclusion files
2012-01-05 23:38:10 +01:00
Ludovic Courtès
190d4b0d93 Make VM string literals immutable.
* libguile/strings.c (scm_i_make_string, scm_i_make_wide_string): Add
  `read_only_p' parameter.  All callers updated.

* libguile/vm-i-loader.c (load_string, load_wide_string): Push read-only
  strings.

* test-suite/tests/strings.test ("literals"): New test prefix.
2011-03-20 23:34:42 +01:00
Ludovic Courtès
6851d3be80 Change `scm_encoding_error' to pass the port and faulty character.
* libguile/strings.c (scm_encoding_error): Remove the `from', `to', and
  `string_or_bv' parameters; add `port' and `chr'.
  (scm_to_stringn): Update accordingly.

* libguile/strings.h (scm_encoding_error): Update accordingly.

* libguile/ports.c (scm_ungetc): Update accordingly.

* libguile/print.c (iprin1, scm_write_char): Update accordingly.

* test-suite/tests/encoding-escapes.test ("display output
  errors")["ultima", "Rashomon"]: Check the arguments of
  `encoding-error'.
  ["tekniko"]: New test.

* test-suite/tests/ports.test ("string ports")["wrong encoding"]: Adjust
  to new `encoding-error' arguments.
2011-02-02 18:06:29 +01:00
Ludovic Courtès
c62da8f891 Have read-char' & co. throw to decoding-error'.
* libguile/ports.c (scm_read_char): Mention `decoding-error' in the
  docstring.
  (get_codepoint): Change to return an error code; add `codepoint'
  output parameter.  Don't raise an error from here.
  (scm_getc): Raise an error with `scm_decoding_error' if
  `get_codepoint' returns an error.
  (scm_peek_char): Likewise.  Update docstring.

* libguile/strings.c (scm_decoding_error_key): New variable.
  (scm_decoding_error): New function.
  (scm_from_stringn): Use `scm_decoding_error' instead of
  `scm_encoding_error'.

* libguile/strings.h (scm_decoding_error): New declaration.

* test-suite/tests/ports.test ("string ports")["read-char, wrong
  encoding, error"]: Change to expect `decoding-error'.  Make sure PORT
  points past the error.
  ["read-char, wrong encoding, escape"]: Likewise.
  ["peek-char, wrong encoding, error"]: New test.

* test-suite/tests/r6rs-ports.test ("7.2.11 Binary
  Output")["put-bytevector with wrong-encoding string port"]: Change to
  expect `decoding-error'.
  ("8.2.6  Input and output ports")["transcoded-port [error handling
  mode = raise]"]: Likewise.

* test-suite/tests/rdelim.test ("read-line")["decoding error", "decoding
  error, substitute"]: New tests.

* doc/ref/api-io.texi (Reading): Update documentation of `read-char' and
  `peek-char'.
  (Line/Delimited): Update documentation of `read-line'.
2011-02-02 18:06:28 +01:00
Ludovic Courtès
647dc1ac23 Add `scm_{to,from}_utf32_string'.
* libguile/strings.c (scm_from_utf32_string, scm_from_utf32_stringn,
  scm_to_utf32_string, scm_to_utf32_stringn): New functions.

* libguile/strings.h (scm_from_utf32_string, scm_from_utf32_stringn,
  scm_to_utf32_string, scm_to_utf32_stringn): New declarations.

* doc/ref/api-data.texi (Conversion to/from C): Document
  `scm_{to,from}_{utf8,utf32}_stringn'.
2011-01-26 00:29:50 +01:00
Ludovic Courtès
31d4d02be7 Hide the string escaping hacks.
* libguile/strings.c (scm_i_unistring_escapes_to_guile_escapes): Rename
  to...
  (unistring_escapes_to_guile_escapes): ... this.  Make `static'.
  (scm_i_unistring_escapes_to_r6rs_escapes): Rename to...
  (unistring_escapes_to_r6rs_escapes): ... this.  Make `static'.

* libguile/strings.h (scm_i_unistring_escapes_to_guile_escapes,
  scm_i_unistring_escapes_to_r6rs_escapes): Remove declarations.
2011-01-23 00:37:25 +01:00
Andy Wingo
d40e1ca893 add scm_{to,from}_{utf8,latin1}_string{n,}
* libguile/strings.h:
* libguile/strings.c (scm_from_latin1_string, scm_to_latin1_string): New
  functions, in terms of the latin1_stringn variants.
  (scm_from_utf8_string, scm_from_utf8_stringn)
  (scm_to_utf8_string, scm_to_utf8_stringn): New functions.
  (scm_i_from_utf8_string, scm_i_to_utf8_string): Removed these internal
  functions.
  (scm_from_stringn): Handle -1 as a length. Unlike the previous
  behavior of scm_from_locale_string (NULL), which returned the empty
  string, we now raise an error.  The null pointer is not the same as
  the empty string.

* libguile/stime.c (scm_strftime, scm_strptime): Adapt to publishing of
  utf8 functions.
2011-01-07 09:18:36 -08:00
Ludovic Courtès
6bf623ca34 Remove conflicting `scm_is_string' declaration.
* libguile/strings.h: Move `scm_is_string' declaration...
* libguile/inline.h: ... here.
  Reported by Noah Lavine <noah.b.lavine@gmail.com>.
2010-12-17 14:34:22 +01:00
Ludovic Courtès
4ff2b9f4b6 Internally expose `scm_i_unistring_escapes_to_{guile,r6rs}_escapes'.
* libguile/strings.c (unistring_escapes_to_guile_escapes): Rename to...
  (scm_i_unistring_escapes_to_guile_escapes): ... this.  Change `char **bufp'
  to `char *buf'; leave realloc responsibility to the caller.  Update caller.
  (unistring_escapes_to_r6rs_escapes): Rename to...
  (scm_i_unistring_escapes_to_r6rs_escapes): ... this.  Likewise.
2010-09-14 16:10:08 +02:00
Michael Gran
cf313a947b Provide non-locale C/Scheme string conversion functions
* doc/ref/api-data.texi: document scm_to_stringn, scm_from_stringn,
  scm_to_latin1_stringn, and scm_from_latin1_stringn
* libguile/strings.h (scm_to_stringn): make public
  (scm_to_latin1_stringn): new declaration
  (scm_from_latin1_stringn): new declaration
* libguile/strings.c (scm_to_latin1_stringn): new function
  (scm_from_latin1_stringn): new function
2010-09-12 08:29:31 -07:00
Ludovic Courtès
d14418a535 Expose `scm_encoding_error'.
* libguile/strings.c (scm_encoding_error): Make public.

* libguile/strings.h (scm_encoding_error): New internal declaration.
2010-07-15 23:12:57 +02:00
Ludovic Courtès
100e20c7fa Add `scm_i_string_data'.
* libguile/strings.c (STRINGBUF_CONTENTS): New macro.
  (STRINGBUF_CHARS, STRINGBUF_WIDE_CHARS): Use it.
  (scm_i_string_data): New function.

* libguile/strings.h (scm_i_string_data): New declaration.
2010-07-04 18:38:53 +02:00
Julian Graham
edb7bb4766 Support for Unicode string normalization functions
* libguile/strings.c, libguile/strings.h (normalize_str,
  scm_string_normalize_nfc, scm_string_normalize_nfd, scm_normalize_nfkc,
  scm_string_normalize_nfkd): New functions.
* test-suite/tests/strings.test: Unit tests for `string-normalize-nfc',
  `string-normalize-nfd', `string-normalize-nfkc', and
  `string-normalize-nfkd'.
* doc/ref/api-data.texi (String Comparison): Documentation for normalization
  functions.
2010-01-03 01:08:37 -05:00
Andy Wingo
e7efe8e793 decruftify scm_sys_protects
* libguile/root.h
* libguile/root.c (scm_sys_protects): It used to be that for some reason
  we'd define a special array of "protected" values. This was a little
  silly, always, but with the BDW GC it's completely unnecessary. Also
  many of these variables were unused, and none of them were good API.
  So remove this array, and either eliminate, make static, or make
  internal the various values.

* libguile/snarf.h: No need to generate calls to scm_permanent_object.

* guile-readline/readline.c (scm_init_readline): No need to call
  scm_permanent_object.

* libguile/array-map.c (ramap, rafe): Remove the dubious nullvect
  optimizations.

* libguile/async.c (scm_init_async): No need to init scm_asyncs, it is
  no more.

* libguile/eval.c (scm_init_eval): No need to init scm_listofnull, it is
  no more.

* libguile/gc.c: Make scm_protects a static var.
  (scm_storage_prehistory): Change the sanity check to use the address
  of protects.
  (scm_init_gc_protect_object): No need to clear the scm_sys_protects,
  as it is no more.

* libguile/keywords.c: Make the keyword obarray a static var.
* libguile/numbers.c: Make flo0 a static var.
* libguile/objprop.c: Make object_whash a static var.
* libguile/properties.c: Make properties_whash a static var.

* libguile/srcprop.h:
* libguile/srcprop.c: Make scm_source_whash a global with internal
  linkage.

* libguile/strings.h:
* libguile/strings.c: Make scm_nullstr a global with internal linkage.

* libguile/vectors.c (scm_init_vectors): No need to init scm_nullvect,
  it's unused.
2009-12-05 12:38:43 +01:00
Ludovic Courtès
56a3dcd431 Remove references to undefined macros.
The intent is to allow compilation with `-Wundef', which in turn should
make it easier to catch erroneous uses of nonexistent macros.

* libguile/__scm.h: Don't assume `BUILDING_LIBGUILE' is defined.

* libguile/conv-uinteger.i.c (SCM_TO_TYPE_PROTO): Remove unneeded CPP
  conditional on `TYPE_MIN == 0'.

* libguile/fports.c: Check for the definition of `HAVE_CHSIZE' and
  `HAVE_FTRUNCATE', not for their value.

* libguile/ports.c: Likewise.

* libguile/numbers.c (guile_ieee_init): Likewise with `HAVE_DINFINITY'
  and `HAVE_DQNAN'.

* test-suite/standalone/test-conversion.c (ieee_init): Likewise.

* libguile/strings.c: Likewise with `SCM_STRING_LENGTH_HISTOGRAM'.

* libguile/strings.h: Likewise.

* libguile/tags.h: Likewise with `HAVE_INTTYPES_H' and `HAVE_STDINT_H'.

* libguile/threads.c: Likewise with `HAVE_PTHREAD_GET_STACKADDR_NP'.

* libguile/vm-engine.c (VM_NAME): Likewise with `VM_CHECK_IP'.

* libguile/gen-scmconfig.c (main): Use "#ifdef HAVE_", not "#if HAVE_".

* libguile/socket.c (scm_setsockopt): Likewise.
2009-11-17 23:42:22 +01:00
Ludovic Courtès
731dd0ce19 Merge branch 'bdw-gc-static-alloc'
Conflicts:
	acinclude.m4
	libguile/__scm.h
	libguile/bdw-gc.h
	libguile/eval.c
2009-11-01 18:17:31 +01:00
Ludovic Courtès
0eb934f1f0 Use `SCM_DEPRECATED' in declarations of deprecated functions/variables.
* libguile/deprecated.c (SCM_BUILDING_DEPRECATED_CODE): New macro.

* libguile/async.c (SCM_BUILDING_DEPRECATED_CODE): Likewise.

* libguile/macros.c (SCM_BUILDING_DEPRECATED_CODE): Likewise.

* libguile/async.h, libguile/deprecated.h, libguile/eval.h,
  libguile/gc.h, libguile/gc.h, libguile/macros.h, libguile/ports.h,
  libguile/srfi-4.h, libguile/strings.h: Change declarations of
  deprecated functions and variables to use `SCM_DEPRECATED' instead of
  `SCM_API'.
2009-10-02 14:48:22 +02:00
Ludovic Courtès
6dc797eee9 Merge branch 'master' into boehm-demers-weiser-gc
Conflicts:
	libguile/gc_os_dep.c
2009-09-09 22:39:49 +02:00
Michael Gran
f7f4d0477e Make scm_i_from_stringn into API for use with libguilereadline
* libguile/strings.c (scm_i_from_stringn): renamed to scm_from_stringn.
  All callers changed.

* libguile/strings.h: change declaration of scm_i_from_stringn to
  scm_from_stringn

* libguile/strports.c (scm_strport_to_string): scm_i_from_stringn ->
  scm_from_stringn

* guile-readline/readline.c (internal_readline): scm_i_from_stringn ->
  scm_from_stringn
2009-09-09 08:07:53 -07:00
Michael Gran
7519234547 Fix broken interaction between readline and Unicode
This requires separate small fixes.

Readline has internal logic to deal with multi-byte characters, so
it wants bytes, not characters.

scm_c_read gets called by the vm when readline is activated, and it was
truncating multi-byte characters because soft ports didn't have the
UCS-4 capability.

Soft ports need the capability to read UCS-4 characters.  Since soft ports
may have a single byte buffer, full characters need to be stored into the
pushback buffer.

This broke the optimizations in scm_c_read for using an alternate buffer
for single-byte-buffered ports, because the opimization wasn't expecting
anything in the pushback buffer.

* libguile/vports.c (sf_fill_input): store complete chars, not single bytes

* libguile/ports.c (scm_c_read): don't use optimized path for non Latin-1.
  Add debug prints.

* libguile/string.h: make scm_i_from_stringn and scm_i_string_ref public
  so that readline can use them

* guile-readline/readline.c: read bytes, not complete chars, from the
  input port.  Convert output to the output port's locale
2009-09-07 19:12:34 -07:00
Ludovic Courtès
5f236208d0 Merge branch 'boehm-demers-weiser-gc' into bdw-gc-static-alloc
Conflicts:
	acinclude.m4
	libguile/strings.c
2009-09-02 01:37:37 +02:00
Ludovic Courtès
13a9455669 Fix leaky handling of `scm_take_locale_{symbol,string} ()'.
* libguile/strings.c (scm_i_take_stringbufn, scm_i_c_take_symbol):
  Remove.
  (scm_take_locale_stringn): Rewrite in terms of `scm_from_locale_stringn ()'.

* libguile/strings.h (scm_i_c_take_symbol, scm_i_take_stringbufn):
  Remove declarations.
2009-09-01 00:38:40 +02:00
Michael Gran
fac32b518e Fix encoding errors with strings returned by string ports
String ports, being 8-bit, store strings using the character encoding
of the port.  This fixes a bug where the default character encoding, and
not the port's encoding, was being used to convert the string port data
back to a string.

* libguile/strports.c: extra comments
  (scm_strport_to_string):  use port's encoding when converting port data
  to a string

* libguile/strings.c (scm_i_from_stringn): renamed from scm_from_stringn
  and made internal.  All callers changed.
  (scm_from_stringn): renamed to scm_i_from_stringn.

* libguile/strings.h: declaration for scm_i_from_stringn
2009-08-30 16:54:49 -07:00
Ludovic Courtès
7af531508c Merge branch 'master' into boehm-demers-weiser-gc
Conflicts:
	libguile/Makefile.am
	libguile/bytevectors.c
	libguile/gc-card.c
	libguile/gc-mark.c
	libguile/programs.c
	libguile/srcprop.c
	libguile/srfi-14.c
	libguile/symbols.c
	libguile/threads.c
	libguile/unif.c
	libguile/vm.c
2009-08-28 19:16:46 +02:00
Michael Gran
889975e51a Add full Unicode capability to ports and the default reader
Ports are given two additional properties: a character encoding and
a conversion failure strategy.  These properties have getters and setters.
The new properties are used to convert any locale text to/from the
internal representation of strings.

If unspecified, ports use a default value. The default value of these
properties is held in a fluid.  The default character encoding can be
modified by calling setlocale.

ISO-8859-1 is treated specially.  Since it is a native encoding of
strings, it can be processed more quickly.  Source code is assumed to be
ISO-8859-1 unless otherwise specified.  The encoding of a source code
file can be given as 'coding: XXXXX' in a magic comment at the top of a
file.

The C functions that deal with encoding often use a null pointer
as shorthand for the native Latin-1 encoding, for efficiency's sake.

* test-suite/tests/encoding-iso88591.test: new tests
* test-suite/tests/encoding-iso88597.test: new tests
* test-suite/tests/encoding-utf8.test: new tests
* test-suite/tests/encoding-escapes.test: new tests
* test-suite/tests/numbers.test: declare 'binary' encoding
* test-suite/tests/ports.test: declare 'binary' encoding
* test-suite/tests/r6rs-ports.test: declare 'binary' encoding

* module/system/base/compile.scm (compile-file): use source-code
  file's self-declared encoding when compiling files

* libguile/strports.c: store string ports in locale encoding
  (scm_strport_to_locale_u8vector, scm_call_with_output_locale_u8vector)
  (scm_open_input_locale_u8vector, scm_get_output_locale_u8vector):
  new functions

* libguile/strings.h: new declaration for scm_i_string_contains_char

* libguile/strings.c (scm_i_string_contains_char): new function
  (scm_from_stringn, scm_to_stringn):  use NULL for Latin-1
  (scm_from_locale_stringn, scm_to_locale_stringn): respect character
  encoding of input and output ports

* libguile/read.h: declaration for scm_scan_for_encoding

* libguile/read.c:
  (read_token): now takes scheme string instead of C string/length
  (read_complete_token): new function
  (scm_read_sexp, scm_read_number, scm_read_mixed_case_symbol)
  (scm_read_number_and_radix, scm_read_quote, scm_read_semicolon_comment)
  (scm_read_srfi4_vector, scm_read_bytevector, scm_read_guile_bit_vector)
  (scm_read_scsh_block_comment, scm_read_commented_expression)
  (scm_read_extended_symbol, scm_read_sharp_extension, scm_read_shart)
  (scm_read_expression): use scm_t_wchar for char type, use read_complete_token
  (scm_scan_for_encoding): new function to find a file's character encoding
  (scm_file_encoding): new function to find a port's character encoding

* libguile/rdelim.c: don't unpack strings

* libguile/print.h: declaration for modified function
  scm_i_charprint

* libguile/print.c: use locale when printing characters and
  strings
  (scm_i_charprint): input parameter is now scm_t_wchar
  (scm_simple_format): don't unpack strings

* libguile/posix.h: new declaration for scm_setbinary.

* libguile/posix.c (scm_setlocale): set default and stdio port
  encodings based on the locale's character encoding
  (scm_setbinary): new function

* libguile/ports.h (scm_t_port): add encoding and failed
  conversion handler to port type.  Declarations for new or modified
  functions scm_getc, scm_unget_byte, scm_ungetc,
  scm_i_get_port_encoding, scm_i_set_port_encoding_x,
  scm_port_encoding, scm_set_port_encoding_x,
  scm_i_get_conversion_strategy, scm_i_set_conversion_strategy_x,
  scm_port_conversion_strategy, scm_set_port_conversion_strategy_x.

* libguile/ports.c: assign the current ports to zero on startup so
  we can see if they've been set.
  (scm_current_input_port, scm_current_output_port,
  scm_current_error_port): return #f if the port is not yet
  initialized
  (scm_new_port_table_entry): set up a new port's encoding and
  illegal sequence handler based on the thread's current defaults
  (scm_i_remove_port): free port encoding name when port is removed
  (scm_i_mode_bits_n): now takes a scheme string instead of a c
  string and length.  All callers changed.
  (SCM_MBCHAR_BUF_SIZE): new const
  (scm_getc): new function, since the scm_getc in inline.h is now
  scm_get_byte_or_eof.  This pulls one codepoint from a port.
  (scm_lfwrite_substr, scm_lfwrite_str): now uses port's encoding
  (scm_unget_byte): new function, incorportaing the low-level functionality
  of scm_ungetc
  (scm_ungetc): uses scm_unget_byte

* libguile/numbers.h (scm_t_wchar): compilation order problem with
  scm_t_wchar being use in functions in multiple headers.  Forward
  declare scm_t_wchar.

* libguile/load.c (scm_primitive_load): scan for file encoding at
  top of file and use it to set the load port's encoding

* libguile/inline.h (scm_get_byte_or_eof): new function
  incorporating most of the functionality of scm_getc.

* libguile/fports.c (fport_fill_input): now returns scm_t_wchar

* libguile/chars.h (scm_t_wchar): avoid compilation order problem
  with declaration of scm_t_wchar
2009-08-25 07:54:37 -07:00
Michael Gran
587a33556f Modify socket and time functions for wide strings
* libguile/socket.c (scm_recv): receive the message without holding the
  stringbuf writing lock
  (scm_send): try to narrow a string before using it

* libguile/stime.c (strftime): convert string to UTF-8 so that it can
  be safely passed to strftime
  (strptime): convert input string to UTF-8 so that it can be safely
  passed through strptime

* libguile/strings.c (narrow_stringbuf): new function
  (scm_i_try_narrow_string): new function

* libguile/strings.h: new declaration for scm_i_try_narrow_string
2009-08-23 09:29:45 -07:00
Michael Gran
3f47e52621 Use string accessors for string->number conversion
* libguile/numbers.c (scm_i_print_fraction): use string accessors
  (XDIGIT2UINT): use libunistring function
  (mem2uinteger, mem2integer, mem2decimal_from_point, mem2ureal)
  (mem2complex): take scheme string instead of c string; use accessors
  (scm_i_string_to_number): new function
  (scm_c_locale_string_to_number): use scm_i_string_to_number

* libguile/numbers.h: declaration for scm_i_string_to_number

* libguile/strings.c (scm_i_string_strcmp): new function

* libguile/strings.h: declaration for scm_i_string_strcmp
2009-08-21 09:18:30 -07:00
Michael Gran
f8ba2bb911 Rename string-width to string-bytes-per-char
* libguile/strings.h: rename scm_string_width to scm_string_bytes_per_char

* libguile/strings.c (scm_string_width): renamed to scm_string_bytes_per_char
  (scm_string_bytes_per_char): renamed from scm_string_width

* module/language/assembly/compile-bytecode.scm (write-bytecode): string-width
  -> string-bytes-per-char

* module/language/glil/compile-assembly.scm (dump-object): string-width
  -> string-bytes-per-char
2009-08-19 22:15:22 -07:00
Ludovic Courtès
fbb857a472 Merge branch 'master' into boehm-demers-weiser-gc
Conflicts:
	lib/Makefile.am
	libguile/Makefile.am
	libguile/frames.c
	libguile/gc-card.c
	libguile/gc-freelist.c
	libguile/gc-mark.c
	libguile/gc-segment.c
	libguile/gc_os_dep.c
	libguile/load.c
	libguile/macros.c
	libguile/objcodes.c
	libguile/programs.c
	libguile/strings.c
	libguile/vm.c
	m4/gnulib-cache.m4
	m4/gnulib-comp.m4
	m4/inline.m4
2009-08-18 00:06:45 +02:00
Michael Gran
eca29b0202 Don't include libunistring headers in Guile public headers
This requres the creation of a new type
scm_t_string_failed_conversion_handler to replace libunistring's
enum iconveh_ilseq_handler.

* libguile/strings.h: don't include <uniconv.h>
(scm_t_string_failed_conversion_handler): new enum type
(SCM_FAILED_CONVERSION_ERROR, SCM_FAILED_CONVERSION_QUESTION_MARK):
(SCM_FAILED_CONVERSION_ESCAPE_SEQUENCE): new enum type values

* libguile/strings.c (scm_to_stringn): now takes type
scm_t_string_failed_conversion_handler.  All callers changed.

* libguile/print.c: include <uniconv.h>

* libguile/ports.c (scm_lfwrite_substr): use
scm_t_string_conversion_handler's constants

* libguile/gen-scmconfig.c (SCM_ICONVEH_ERROR):
(SCM_ICONVEH_QUESTION_MARK, SCM_ICONVEH_ESCAPE_SEQUENCE): store
iconveh_ilseq_hander constants as #define's
2009-08-12 09:21:55 -07:00
Michael Gran
32be5735cd Make scm_charprint and scm_i_string_wide_chars SCM_INTERNAL.
Also, scm_charprint is renamed to scm_i_charprint.

        * libguile/strings.h: make scm_i_string_wide_chars internal.

        * libguile/print.h: rename scm_charprint to scm_i_charprint.  Make
        internal.

        * libguile/print.c (scm_i_charprint): renamed from scm_charprint
        (scm_charprint): renamed to scm_i_charprint. All callers changed.
2009-08-10 05:51:05 -07:00
Michael Gran
6ce6923b68 Improve %string-dump and %symbol-dump
%string-dump and %symbol-dump are modified to return assocation lists
of string and symbol attributes instead of printing to stderr.  They
are no longer conditional on SCM_DEBUG.

        * libguile/strings.c (scm_sys_string_dump)
        (scm_sys_symbol_dump): now returns alist of properties.  No longer
        require that SCM_DEBUG be defined.
        (scm_sys_stringbuf_hist): now conditional on
        SCM_STRING_LENGTH_HISTOGRAM

        * libguile/strings.h: scm_sys_string_dump and scm_sys_symbol dump
        are now declared as API
2009-08-10 00:09:33 -07:00
Michael Gran
9c44cd4559 Add Unicode strings and symbols
This adds full Unicode strings as a datatype, and it adds some
minimal functionality.  The terminal and port encoding is assumed
to be ISO-8859-1.  Non-ISO-8859-1 characters are written or
input as string character escapes.

The string character escapes now have 3 forms: \xXX \uXXXX and
\UXXXXXX, for unprintable characters that have 2, 4 or 6 hex digits.

The process for writing to strings has been modified.  There is now a
function scm_i_string_start_writing that does the copy-on-write
conversion if necessary.

To compile strings that may be wide, the VM storage of strings and
string-likes has changed.

Most string-using functions have not yet been updated and may break
when used with wide strings.


        * module/language/assembly/compile-bytecode.scm (write-bytecode):
        use variable width string bytecode format

        * module/language/assembly.scm (byte-length): use variable width
        bytecode format

        * libguile/vm-i-loader.c (load-string, load-symbol):
        (load-keyword, define): use variable-width bytecode format

        * libguile/vm-engine.h (FETCH_WIDTH): new macro

        * libguile/strings.h: new declarations

        * libguile/strings.c (make_wide_stringbuf): new function
        (widen_stringbuf): new function
        (scm_i_make_wide_string): new function
        (scm_i_is_narrow_string): new function
        (scm_i_string_wide_chars): new function
        (scm_i_string_start_writing): new function
        (scm_i_string_ref): new function
        (scm_i_string_set_x): new function
        (scm_i_is_narrow_symbol): new function
        (scm_i_symbol_wide_chars, scm_i_symbol_ref): new function
        (scm_string_width): new function
        (unistring_escapes_to_guile_escapes): new function
        (scm_to_stringn): new function
        (scm_i_stringbuf_free): modify for wide strings
        (scm_i_substring_copy): modify for wide strings
        (scm_i_string_chars, scm_string_append): modify for wide strings
        (scm_i_make_symbol, scm_to_locale_stringn): modify for wide strings
        (scm_string_dump, scm_symbol_dump, scm_to_locale_stringbuf):
        (scm_string, scm_i_deprecated_string_chars): modify for wide strings
        (scm_from_locale_string, scm_from_locale_stringn): add null test

        * libguile/srfi-13.c: add calls for scm_i_string_start_writing for
        each call of scm_i_string_stop_writing
        (scm_string_for_each): modify for wide strings

        * libguile/socket.c: add calls for scm_i_string_start_writing for each
        call of scm_i_string_stop_writing

        * libguile/rw.c: add calls for scm_i_string_start_writing for each
        call of scm_i_string_stop_writing

        * libguile/read.c (scm_read_string): allow reading of wide strings

        * libguile/print.h: add declaration for scm_charprint

        * libguile/print.c (iprin1): print wide strings and add new string
        escapes
        (scm_charprint): new function

        * libguile/ports.h: new declarations for scm_lfwrite_substr and
        scm_lfwrite_str

        * libguile/ports.c (update_port_lf): new function
        (scm_lfwrite): use update_port_lf
        (scm_lfwrite_substr): new function
        (scm_lfwrite_str): new function

        * test-suite/tests/asm-to-bytecode.test ("compiler"): add string
        width byte to sting-like asm tests
2009-08-08 02:35:00 -07:00
Neil Jerram
53befeb700 Change Guile license to LGPLv3+
(Not quite finished, the following will be done tomorrow.
   module/srfi/*.scm
   module/rnrs/*.scm
   module/scripts/*.scm
   testsuite/*.scm
   guile-readline/*
)
2009-06-17 00:22:09 +01:00
Ludovic Courtès
5bec288a67 Merge branch 'boehm-demers-weiser-gc' into bdw-gc-static-alloc 2009-01-19 22:31:38 +01:00
Ludovic Courtès
2a77682322 Use scm_gc malloc_pointerless ()' in scm_i allocate_string_pointers ()'.
* libguile/dynl.c (free_string_pointers): Remove.
  (scm_dynamic_args_call): Remove reference to `free_string_pointers ()'
  and remove dynwind.

* libguile/posix.c (free_string_pointers): Remove.
  (scm_execl, scm_execlp, scm_execle, scm_environ): Remove references
  to `free_string_pointers ()'.

* libguile/simpos.c (free_string_pointers): Remove.
  (scm_system_star): Remove reference to `free_string_pointers ()',
  remove enclosing dynwind.

* libguile/strings.c (scm_i_allocate_string_pointers): Use
  `scm_gc_malloc_pointerless ()' and `scm_gc_malloc ()'
  instead of `scm_malloc ()' and `scm_to_locale_string ()',
  so that the result is automatically GC'd when no longer
  referenced.  Remove unneeded dynwind.
  (scm_i_free_string_pointers): Remove.

* libguile/strings.h (scm_i_free_string_pointers): Remove declaration.
2009-01-18 16:02:04 +01:00
Ludovic Courtès
35920c00a8 Expose some of the string/stringbuf internal flags and tags.
* libguile/strings.h (scm_tc7_ro_string, SCM_I_STRINGBUF_F_SHARED,
  SCM_I_STRINGBUF_F_INLINE): New macros.

* libguile/strings.c (STRINGBUF_F_SHARED): Alias for
  `SCM_I_STRINGBUF_F_SHARED'.
  (STRINGBUF_F_INLINE): Alias for `SCM_I_STRINGBUF_F_INLINE'.
  (RO_STRING_TAG): Alias for `scm_tc7_ro_string'.
2009-01-13 23:59:17 +01:00
Ludovic Courtès
074f69cdf2 Merge branch 'master' into boehm-demers-weiser-gc
Conflicts:
	libguile/Makefile.am
	libguile/threads.c
2008-10-11 19:25:54 +02:00
Ludovic Courtès
45a9f43049 Revert "Make literal strings (i.e., returned by `read') read-only."
This reverts commit fb2f8886c4.

The rationale is that `read' must return mutable strings, as reported
by szgyg <szgyg@ludens.elte.hu>.
2008-10-09 22:21:33 +02:00
Ludovic Courtès
b66a552487 Merge branch 'master' into boehm-demers-weiser-gc 2008-09-23 19:01:01 +02:00
Ludovic Courtès
fb2f8886c4 Make literal strings (i.e., returned by `read') read-only.
* libguile/read.c (scm_read_string): Use `scm_i_make_read_only_string ()' to
  return a read-only string, as mandated by R5RS.  Reported by Bill
  Schottstaedt <bil@ccrma.Stanford.EDU>.

* libguile/strings.c (scm_i_make_read_only_string): New function.
  (scm_i_shared_substring_read_only): Special-case the empty string
  so that the read-only and read-write empty strings are `eq?'.  This
  optimization is relied on by the `substring/shared' `empty string'
  test case in `srfi-13.test'.

* libguile/strings.h (scm_i_make_read_only_string): New declaration.

* test-suite/tests/strings.test ("string-set!")["literal string"]: New test.

* NEWS: Update.
2008-09-23 18:45:27 +02:00
Ludovic Courtès
d6c74168a7 Remove unused GC string/symbol functions.
* libguile/strings.c (scm_i_stringbuf_mark, scm_i_stringbuf_free,
  scm_i_string_mark, scm_i_string_free, scm_i_symbol_mark,
  scm_i_symbol_free): Remove.

* libguile/strings.h: Remove corresponding declarations.
2008-09-15 22:59:08 +02:00
Ludovic Courtès
071bb6a840 Add `scm_c_symbol_length ()'. 2008-07-05 20:16:12 +02:00
Ludovic Courtès
102dbb6f6c Add `SCM_INTERNAL' macro, use it. 2008-05-31 23:21:02 +02:00
Kevin Ryde
2b829bbb3d merge from 1.8 branch 2006-04-17 00:05:42 +00:00
Han-Wen Nienhuys
fd0a5bbcb7 patches by Ludovic Courtès for symbol generation. 2006-01-24 20:30:09 +00:00
Marius Vollmer
92205699d0 The FSF has a new address. 2005-05-23 19:57:22 +00:00
Mikael Djurfeldt
e654b06261 (SCM_STRING_UCHARS): Added missing argument. 2005-03-20 13:56:53 +00:00
Marius Vollmer
fe78c51aa2 Turn all deprecated features that once were macros but are now
functions back into macros.
2005-01-12 11:10:02 +00:00
Marius Vollmer
ed35de727a (scm_substring_read_only,
scm_c_substring_read_only, scm_i_substring_read_only): New.
(RO_STRING_TAG, IS_RO_STRING): New.
(scm_i_string_writable_chars): Bail on read-only strings.
2004-09-22 13:54:15 +00:00