1
Fork 0
mirror of https://git.savannah.gnu.org/git/guile.git synced 2025-04-30 11:50:28 +02:00
Commit graph

481 commits

Author SHA1 Message Date
Andy Wingo
fc7bd367ab remove all deprecated code
* libguile/async.c:
* libguile/async.h:
* libguile/debug.h:
* libguile/deprecated.c:
* libguile/deprecated.h:
* libguile/evalext.h:
* libguile/gc-malloc.c:
* libguile/gc.h:
* libguile/gen-scmconfig.c:
* libguile/numbers.c:
* libguile/ports.c:
* libguile/ports.h:
* libguile/procprop.c:
* libguile/procprop.h:
* libguile/read.c:
* libguile/socket.c:
* libguile/srfi-4.h:
* libguile/strings.c:
* libguile/strings.h:
* libguile/tags.h:
* module/ice-9/boot-9.scm:
* module/ice-9/deprecated.scm: Remove all deprecated code.  CPP defines
  that were not previously issuing warnings were changed so that their
  expansions would indicate the replacement forms to use,
  e.g. scm_sizet__GONE__REPLACE_WITH__size_t.

  The two exceptions were SCM_LISTN, which did not produce warnings
  before, and the string-filter argument order stuff.

  Drops the initial dirty memory usage of Guile down to 2.8 MB on my
  machine, from 4.4 MB.
2011-05-12 14:01:26 +02:00
Ludovic Courtès
7be1705dbd Fix `get_utf8_codepoint' to not consume valid starting bytes.
Thanks to Mark H. Weaver for pointing this out.

* libguile/ports.c (CONSUME_PEEKED_BYTE): New macro.
  (get_utf8_codepoint): New variable `pt'.  Use
  `scm_peek_byte_or_eof'/`CONSUME_PEEKED_BYTE' pairs instead of
  `scm_get_byte_or_eof'.

* test-suite/tests/ports.test ("string ports")[#xc2 #x41 #x42, #xe0 #xa0
  #x41 #x42, #xf0 #x88 #x88 #x88]: Fix to conform to Unicode 6.0.0.
  [#xe0 #x88 #x88]: Remove test.
  [#xf0 #x80 #x80 #x41]: New test.
2011-05-07 22:47:49 +02:00
Ludovic Courtès
7b292a9d34 Special-case UTF-8 ports to bypass `iconv' entirely.
* libguile/ports.c (update_port_lf): Handle EOF.
  (get_utf8_codepoint, get_iconv_codepoint): New functions.
  (get_codepoint): Use them.
  (scm_i_set_port_encoding_x): Don't open conversion descriptors when
  ENCODING is "UTF-8".

* libguile/print.c (display_string_as_utf8, display_string_using_iconv):
  New functions.
  (display_string): Use them.

* test-suite/tests/ports.test ("string ports")[#xc2 #x41 #x42]: Add a
  note that this is not the wrong behavior per Unicode 6.0.0.
2011-05-06 17:54:09 +02:00
Ludovic Courtès
a42d79711b Gracefully handle unterminated UTF-8 sequences instead of hitting an `assert'.
* libguile/ports.c (get_codepoint): Return EILSEQ when OUTPUT_SIZE == 0.

* test-suite/tests/ports.test ("string ports"): Add 2 tests for
  unterminated sequences.
2011-04-27 20:34:08 +02:00
Ludovic Courtès
190d4b0d93 Make VM string literals immutable.
* libguile/strings.c (scm_i_make_string, scm_i_make_wide_string): Add
  `read_only_p' parameter.  All callers updated.

* libguile/vm-i-loader.c (load_string, load_wide_string): Push read-only
  strings.

* test-suite/tests/strings.test ("literals"): New test prefix.
2011-03-20 23:34:42 +01:00
Andy Wingo
03976fee3b fix code that causes warnings on gcc 4.6
* libguile/arrays.c (scm_i_read_array):
* libguile/backtrace.c (display_backtrace_body):
* libguile/filesys.c (scm_readdir)
* libguile/i18n.c (chr_to_case):
* libguile/ports.c (register_finalizer_for_port):
* libguile/posix.c (scm_nice):
* libguile/stacks.c (scm_make_stack): Clean up a number of
  set-but-unused vars.  Thanks to Douglas Mencken for the report.

* libguile/numbers.c (scm_log, scm_exp): Fix a few #if cases that should
  be #ifdef.
2011-03-17 12:39:53 +01:00
Andy Wingo
ac012a27a2 update port-filename docs
* doc/ref/api-io.texi (File Ports):
* libguile/ports.c (scm_port_filename): Fix docs to match
  implementation.
2011-02-28 20:54:03 +01:00
Andy Wingo
3e05fc0466 fix a couple leaks in ports.c. thanks valgrind!
* libguile/ports.c (scm_i_remove_port): Fix a case in which ports
  explictly closed via close-port would leak their iconv_t data.
  (scm_set_port_encoding_x): scm_i_set_port_encoding_x strdups its
  argument, so we need to free the locale encoding of the incoming str.
2011-02-18 19:28:37 +01:00
Andy Wingo
8269ba5b2c ports.c safely accesses the port weak hash table
* libguile/ports.h (scm_i_remove_port): Remove declaration, as it was
  SCM_INTERNAL.
* libguile/ports.c (scm_add_to_port_table): Issue a deprecation
  warning if this function is called.  Remove needless SCM_API
  declaration, it was already declared as such in ports.h.  Safely
  access the port table.
  (scm_i_remove_port): Remove bogus comment about lack of need for
  threadsafety.  Take the port table mutex.
  (scm_close_port): No need to take port table mutex around calling
  scm_i_remove_port.
2011-02-10 23:16:52 +01:00
Andy Wingo
b7b4aef97c fix potential concurrency bugs in port-for-each
* libguile/ports.c (scm_c_port_for_each): Simplify to avoid concurrency-
  and gc-related bugs.
2011-02-10 23:16:52 +01:00
Ludovic Courtès
9d9c66ba82 Add scm_i_set_default_port_encoding' and scm_i_default_port_encoding'.
* libguile/ports.c (scm_i_set_default_port_encoding,
  scm_i_default_port_encoding): New function.  Replace
  `scm_i_set_port_encoding_x' and `scm_i_get_port_encoding' with
  PORT == SCM_BOOL_F.
  (scm_i_set_port_encoding_x): Assume PORT is a port.
  (scm_i_get_port_encoding): Remove.
  (scm_port_encoding): Adjust accordingly.
  (scm_new_port_table_entry): Use `scm_i_default_port_encoding'.

* libguile/ports.h (scm_i_get_port_encoding): Remove declarations.
  (scm_i_default_port_encoding, scm_i_set_default_port_encoding): New
  declarations.

* libguile/posix.c (setlocale): Use `scm_i_set_default_port_encoding'.
2011-02-10 23:04:43 +01:00
Ludovic Courtès
064c27c4ef Simplify `scm_i_set_port_encoding_x'.
* libguile/ports.c (find_valid_encoding): Remove.
  (scm_i_set_port_encoding_x): Remove call to `find_valid_encoding'.
  Remove `valid_enc'.  Rename `enc' to `encoding'.

* test-suite/tests/ports.test ("port-encoding"): New test prefix.
2011-02-10 23:04:43 +01:00
Ludovic Courtès
6851d3be80 Change `scm_encoding_error' to pass the port and faulty character.
* libguile/strings.c (scm_encoding_error): Remove the `from', `to', and
  `string_or_bv' parameters; add `port' and `chr'.
  (scm_to_stringn): Update accordingly.

* libguile/strings.h (scm_encoding_error): Update accordingly.

* libguile/ports.c (scm_ungetc): Update accordingly.

* libguile/print.c (iprin1, scm_write_char): Update accordingly.

* test-suite/tests/encoding-escapes.test ("display output
  errors")["ultima", "Rashomon"]: Check the arguments of
  `encoding-error'.
  ["tekniko"]: New test.

* test-suite/tests/ports.test ("string ports")["wrong encoding"]: Adjust
  to new `encoding-error' arguments.
2011-02-02 18:06:29 +01:00
Ludovic Courtès
c62da8f891 Have read-char' & co. throw to decoding-error'.
* libguile/ports.c (scm_read_char): Mention `decoding-error' in the
  docstring.
  (get_codepoint): Change to return an error code; add `codepoint'
  output parameter.  Don't raise an error from here.
  (scm_getc): Raise an error with `scm_decoding_error' if
  `get_codepoint' returns an error.
  (scm_peek_char): Likewise.  Update docstring.

* libguile/strings.c (scm_decoding_error_key): New variable.
  (scm_decoding_error): New function.
  (scm_from_stringn): Use `scm_decoding_error' instead of
  `scm_encoding_error'.

* libguile/strings.h (scm_decoding_error): New declaration.

* test-suite/tests/ports.test ("string ports")["read-char, wrong
  encoding, error"]: Change to expect `decoding-error'.  Make sure PORT
  points past the error.
  ["read-char, wrong encoding, escape"]: Likewise.
  ["peek-char, wrong encoding, error"]: New test.

* test-suite/tests/r6rs-ports.test ("7.2.11 Binary
  Output")["put-bytevector with wrong-encoding string port"]: Change to
  expect `decoding-error'.
  ("8.2.6  Input and output ports")["transcoded-port [error handling
  mode = raise]"]: Likewise.

* test-suite/tests/rdelim.test ("read-line")["decoding error", "decoding
  error, substitute"]: New tests.

* doc/ref/api-io.texi (Reading): Update documentation of `read-char' and
  `peek-char'.
  (Line/Delimited): Update documentation of `read-line'.
2011-02-02 18:06:28 +01:00
Ludovic Courtès
cc540d0bbd Have `scm_getc' honor the port's conversion strategy.
* libguile/ports.c (get_codepoint): Reset `pt->input_cd' upon failure.
  If `pt->ilseq_handler' is `SCM_ICONVEH_QUESTION_MARK', then return a
  question mark.
  [failure]: Use `scm_encoding_error' when raising an error.

* test-suite/lib.scm (exception:encoding-error): Adjust regexp.

* test-suite/tests/ports.test ("string ports")["read-char, wrong
  encoding, error", "read-char, wrong encoding, escape", "read-char,
  wrong encoding, substitute"]: New tests.
2011-01-26 00:29:51 +01:00
Ludovic Courtès
da288f50bf Remove useless branches in the port code.
* libguile/ports.c (scm_i_get_port_encoding): Remove useless `if'.
  (scm_set_port_encoding_x): Remove redundant `find_valid_encoding'
  call.
2011-01-24 23:15:18 +01:00
Ludovic Courtès
d9544bf012 Always initialize a port's encoding name.
* libguile/ports.c (scm_i_set_port_encoding_x): Always initialize
  PT->encoding to something non-NULL.  This fixes callers of
  `scm_encoding_error' such that they always pass a non-NULL encoding
  name.  Reported by Matei Conovici.
2011-01-24 23:15:18 +01:00
Ludovic Courtès
4325620f6d Rewrite scm_lfwrite_substr' in terms of scm_display'.
* libguile/ports.c (scm_lfwrite_substr): Rewrite in terms of
  `scm_display'.
2011-01-23 01:26:07 +01:00
Ludovic Courtès
a917871527 Remove `scm_lfwrite_str'.
* libguile/ports.c (scm_lfwrite_str): Remove.

* libguile/ports.h (scm_lfwrite_str): Remove declaration.

* libguile/numbers.c (scm_i_print_fraction): Use `scm_display' instead
  of `scm_lfwrite_str'.
2011-01-23 01:14:51 +01:00
Ludovic Courtès
f4bc4e5934 Rewrite read-char', display', etc. using iconv calls instead of libunistring.
Thanks to Bruno Haible for his suggestions.  See
<http://lists.gnu.org/archive/html/bug-libunistring/2010-09/msg00007.html>,
for details.

* libguile/ports.c (register_finalizer_for_port): Always register a
  finalizer for PORT.
  (finalize_port): Close ENTRY->input_cd and ENTRY->output_cd.
  (scm_new_port_table_entry): Initialize the `input_cd' and `output_cd'
  fields.
  (utf8_to_codepoint): New function.
  (get_codepoint): Rewrite to use `iconv' instead of libunistring.
  (scm_i_set_port_encoding_x): Initialize the `input_cd' and `output_cd'
  fields.
  (update_port_lf): Move upward.  Use `switch' instead of `if's.

* libguile/ports.h (scm_t_port)[input_cd, output_cd]: New fields.

* libguile/print.c (codepoint_to_utf8, display_string): New functions.
  (display_character): Use `display_string'.
  (write_combining_character): Likewise.
  (iprin1): Use `display_string' instead of `scm_lfwrite_str', and
  `display_character' instead of `scm_putc'.
  (write_character): Likewise.
  (write_character_escaped): New function.

* test-suite/tests/encoding-escapes.test ("display output
  escapes")["Rashomon"]: Use lower-case escapes.
  ["fake escape"]: New test.
2011-01-23 00:37:25 +01:00
Andy Wingo
4a655e50a3 use scm_from_latin1_symboln for string literals and load-symbol
* libguile/bytevectors.c:
* libguile/eval.c:
* libguile/goops.c:
* libguile/i18n.c:
* libguile/load.c:
* libguile/memoize.c:
* libguile/modules.c:
* libguile/ports.c:
* libguile/print.c:
* libguile/procs.c:
* libguile/programs.c:
* libguile/read.c:
* libguile/script.c:
* libguile/srfi-14.c:
* libguile/stacks.c:
* libguile/strings.c:
* libguile/throw.c:
* libguile/vm.c: Use scm_from_latin1_symboln to make symbols from string
  literals, because they aren't in the user's locale -- they are in
  ASCII, and we can optimize this case.

* libguile/vm-i-loader.c: Also use scm_from_latin1_symboln when loading
  narrow symbols.
2011-01-07 09:18:41 -08:00
Andy Wingo
67a72dc13c scm_setvbuf doesn't throw away current buffers
* libguile/ports.c (scm_drain_input): Slight optimization.

* libguile/fports.c (scm_setvbuf): If there is buffered output, flush
  it.  If there is input, drain it, and then unread it after updating
  the buffers.  Much more sensible than dropping it silently...
2010-12-06 19:59:15 +01:00
Andy Wingo
e1ee45e78b indentation fix in ports.c
* libguile/ports.c (scm_i_get_conversion_strategy): Indentation fix.
2010-12-02 12:46:15 +01:00
Andy Wingo
b2456dd434 fix segfaults when closing the current input port
* libguile/ports.c (scm_char_ready_p, scm_peek_char, scm_unread_char)
  (scm_unread_string): Always validate the port, even in the case that
  we get it the default current-input-port. Otherwise the following
  causes a segfault:

    (begin (close-port (current-input-port)) (peek-char))
2010-10-10 12:13:04 +02:00
Ludovic Courtès
fd5eec2b6e Optimize `peek-char'.
This makes `peek-char' 40x faster on a port whose encoding is
faster on a UTF-8 port containing multi-byte codepoints.

The `xml->sxml' procedure is 4x faster on a 2.7 MiB XML file.

* libguile/ports.c (get_codepoint): New procedure, moved here from
  `scm_getc', with the additional BUF and LEN parameters.
  (scm_getc): Use it.
  (scm_peek_char): Use it instead of the `scm_getc'/`scm_ungetc'
  sequence.

* test-suite/tests/ports.test ("string ports")["peek-char [latin-1]",
  "peek-char [utf-8]"]: New tests.

* benchmark-suite/Makefile.am (SCM_BENCHMARKS): Add
  `benchmarks/ports.bm'.

* benchmark-suite/benchmarks/ports.bm: New file.
2010-09-15 18:38:57 +02:00
Ludovic Courtès
63479112d6 Remove heap allocations in scm_getc', scm_ungetc', and `find_valid_encoding'.
* libguile/ports.c (scm_getc): Provide `u32_conv_from_encoding' with the
  RESULT_BUF stack-allocated buffer to avoid heap allocation.
  (find_valid_encoding): Likewise.
  (scm_ungetc): Ditto with `u32_conv_to_encoding'.
2010-07-15 23:12:57 +02:00
Andy Wingo
0af34a3f83 port-encoding returns #f if port encoding not set
* libguile/ports.c (scm_port_encoding): Instead of returning "NONE" if
  we don't know the encoding, return #f. Allows truncated-print to work
  if you don't have a locale set.
2010-01-09 19:21:09 +01:00
Andy Wingo
f39448c5a3 remove a bunch of needless scm_permanent_object calls
* libguile/array-handle.c:
* libguile/bytevectors.c:
* libguile/deprecated.c:
* libguile/eval.c:
* libguile/feature.c:
* libguile/filesys.c:
* libguile/gc.c:
* libguile/gdbint.c:
* libguile/goops.c:
* libguile/instructions.c:
* libguile/load.c:
* libguile/modules.c:
* libguile/numbers.c:
* libguile/options.c:
* libguile/ports.c:
* libguile/scmsigs.c:
* libguile/srcprop.c:
* libguile/srfi-4.c:
* libguile/stacks.c:
* libguile/threads.c:
* libguile/vm.c: Remove calls to scm_permanent_object, as they are no
  longer needed with the BDW GC.
2009-12-05 11:32:50 +01:00
Andy Wingo
dd3a26f3da remove scm_ports_prehistory
* libguile/ports.h:
* libguile/ports.c (scm_ports_prehistory): Remove this function. Just
  initialize global vars to NULL/0 instead of needing a prehistory
  function.
2009-12-05 11:10:11 +01:00
Ludovic Courtès
56a3dcd431 Remove references to undefined macros.
The intent is to allow compilation with `-Wundef', which in turn should
make it easier to catch erroneous uses of nonexistent macros.

* libguile/__scm.h: Don't assume `BUILDING_LIBGUILE' is defined.

* libguile/conv-uinteger.i.c (SCM_TO_TYPE_PROTO): Remove unneeded CPP
  conditional on `TYPE_MIN == 0'.

* libguile/fports.c: Check for the definition of `HAVE_CHSIZE' and
  `HAVE_FTRUNCATE', not for their value.

* libguile/ports.c: Likewise.

* libguile/numbers.c (guile_ieee_init): Likewise with `HAVE_DINFINITY'
  and `HAVE_DQNAN'.

* test-suite/standalone/test-conversion.c (ieee_init): Likewise.

* libguile/strings.c: Likewise with `SCM_STRING_LENGTH_HISTOGRAM'.

* libguile/strings.h: Likewise.

* libguile/tags.h: Likewise with `HAVE_INTTYPES_H' and `HAVE_STDINT_H'.

* libguile/threads.c: Likewise with `HAVE_PTHREAD_GET_STACKADDR_NP'.

* libguile/vm-engine.c (VM_NAME): Likewise with `VM_CHECK_IP'.

* libguile/gen-scmconfig.c (main): Use "#ifdef HAVE_", not "#if HAVE_".

* libguile/socket.c (scm_setsockopt): Likewise.
2009-11-17 23:42:22 +01:00
Andy Wingo
efcebb5b56 fold objects.[ch] into goops.[ch]
Remove objects.h #includes as appropriate.
2009-11-15 20:28:12 +01:00
Ludovic Courtès
d6a6989e08 Replace setbinary' by a public %default-port-encoding' fluid.
* doc/ref/api-evaluation.texi (Character Encoding of Source Files): Add
  reference to the "Ports" node.

* doc/ref/api-io.texi (Ports): Document `%default-port-encoding'.

* libguile/ports.c (scm_port_encoding_var): Rename to...
  (default_port_encoding_var): ... this; update callers.  Make `static'.

* libguile/posix.c (scm_setbinary): Remove.

* libguile/posix.h: Adjust accordingly.

* test-suite/tests/numbers.test: Remove unneeded `setbinary' call.

* test-suite/tests/ports.test: Replace `setbinary' call by equivalent
  `%default-port-encoding' mutation and `set-port-encoding!' calls.

* test-suite/tests/r6rs-ports.test: Replace `setbinary' call by
  equivalent `%default-port-encoding' mutation.
2009-11-14 16:59:25 +01:00
Ludovic Courtès
682d78d05e Use GC-managed memory for port->encoding.
* libguile/ports.c (scm_new_port_table_entry): Use `scm_gc_strdup ()'
  instead of `strdup ()' for `entry->encoding'.
  (scm_i_set_port_encoding_x): Likewise.  Remove free(3) call.
  (scm_i_remove_port): Don't explicitly free memory.
  (scm_unget_byte): Remove outdated and misleading comment.
2009-10-15 00:56:56 +02:00
Ludovic Courtès
3051344be5 Remove default port/SMOB finalizers.
* libguile/ports.c (scm_port_free0): Remove.
  (scm_make_port_type): Set `free' to NULL.

* libguile/smob.c (scm_make_smob_type): Leave `free' to NULL.
  (scm_smob_free): Remove.

* libguile/smob.h (scm_smob_free): Remove.

* libguile/deprecated.c (scm_smob_free): New.

* libguile/deprecated.h (scm_smob_free): New declaration.
2009-09-28 23:32:33 +02:00
Neil Jerram
b77afe82a4 Typo fixes 2009-09-21 23:22:32 +01:00
Ludovic Courtès
6dc797eee9 Merge branch 'master' into boehm-demers-weiser-gc
Conflicts:
	libguile/gc_os_dep.c
2009-09-09 22:39:49 +02:00
Michael Gran
7519234547 Fix broken interaction between readline and Unicode
This requires separate small fixes.

Readline has internal logic to deal with multi-byte characters, so
it wants bytes, not characters.

scm_c_read gets called by the vm when readline is activated, and it was
truncating multi-byte characters because soft ports didn't have the
UCS-4 capability.

Soft ports need the capability to read UCS-4 characters.  Since soft ports
may have a single byte buffer, full characters need to be stored into the
pushback buffer.

This broke the optimizations in scm_c_read for using an alternate buffer
for single-byte-buffered ports, because the opimization wasn't expecting
anything in the pushback buffer.

* libguile/vports.c (sf_fill_input): store complete chars, not single bytes

* libguile/ports.c (scm_c_read): don't use optimized path for non Latin-1.
  Add debug prints.

* libguile/string.h: make scm_i_from_stringn and scm_i_string_ref public
  so that readline can use them

* guile-readline/readline.c: read bytes, not complete chars, from the
  input port.  Convert output to the output port's locale
2009-09-07 19:12:34 -07:00
Ludovic Courtès
7af531508c Merge branch 'master' into boehm-demers-weiser-gc
Conflicts:
	libguile/Makefile.am
	libguile/bytevectors.c
	libguile/gc-card.c
	libguile/gc-mark.c
	libguile/programs.c
	libguile/srcprop.c
	libguile/srfi-14.c
	libguile/symbols.c
	libguile/threads.c
	libguile/unif.c
	libguile/vm.c
2009-08-28 19:16:46 +02:00
Michael Gran
8736ef70ac scm_getc improperly handles Latin-1 characters
Upper-plane Latin-1 characters should be converted to codepoints.

* libguile/ports.c (scm_getc): improper conversion of char to scm_t_wchar
2009-08-27 20:42:36 -07:00
Michael Gran
889975e51a Add full Unicode capability to ports and the default reader
Ports are given two additional properties: a character encoding and
a conversion failure strategy.  These properties have getters and setters.
The new properties are used to convert any locale text to/from the
internal representation of strings.

If unspecified, ports use a default value. The default value of these
properties is held in a fluid.  The default character encoding can be
modified by calling setlocale.

ISO-8859-1 is treated specially.  Since it is a native encoding of
strings, it can be processed more quickly.  Source code is assumed to be
ISO-8859-1 unless otherwise specified.  The encoding of a source code
file can be given as 'coding: XXXXX' in a magic comment at the top of a
file.

The C functions that deal with encoding often use a null pointer
as shorthand for the native Latin-1 encoding, for efficiency's sake.

* test-suite/tests/encoding-iso88591.test: new tests
* test-suite/tests/encoding-iso88597.test: new tests
* test-suite/tests/encoding-utf8.test: new tests
* test-suite/tests/encoding-escapes.test: new tests
* test-suite/tests/numbers.test: declare 'binary' encoding
* test-suite/tests/ports.test: declare 'binary' encoding
* test-suite/tests/r6rs-ports.test: declare 'binary' encoding

* module/system/base/compile.scm (compile-file): use source-code
  file's self-declared encoding when compiling files

* libguile/strports.c: store string ports in locale encoding
  (scm_strport_to_locale_u8vector, scm_call_with_output_locale_u8vector)
  (scm_open_input_locale_u8vector, scm_get_output_locale_u8vector):
  new functions

* libguile/strings.h: new declaration for scm_i_string_contains_char

* libguile/strings.c (scm_i_string_contains_char): new function
  (scm_from_stringn, scm_to_stringn):  use NULL for Latin-1
  (scm_from_locale_stringn, scm_to_locale_stringn): respect character
  encoding of input and output ports

* libguile/read.h: declaration for scm_scan_for_encoding

* libguile/read.c:
  (read_token): now takes scheme string instead of C string/length
  (read_complete_token): new function
  (scm_read_sexp, scm_read_number, scm_read_mixed_case_symbol)
  (scm_read_number_and_radix, scm_read_quote, scm_read_semicolon_comment)
  (scm_read_srfi4_vector, scm_read_bytevector, scm_read_guile_bit_vector)
  (scm_read_scsh_block_comment, scm_read_commented_expression)
  (scm_read_extended_symbol, scm_read_sharp_extension, scm_read_shart)
  (scm_read_expression): use scm_t_wchar for char type, use read_complete_token
  (scm_scan_for_encoding): new function to find a file's character encoding
  (scm_file_encoding): new function to find a port's character encoding

* libguile/rdelim.c: don't unpack strings

* libguile/print.h: declaration for modified function
  scm_i_charprint

* libguile/print.c: use locale when printing characters and
  strings
  (scm_i_charprint): input parameter is now scm_t_wchar
  (scm_simple_format): don't unpack strings

* libguile/posix.h: new declaration for scm_setbinary.

* libguile/posix.c (scm_setlocale): set default and stdio port
  encodings based on the locale's character encoding
  (scm_setbinary): new function

* libguile/ports.h (scm_t_port): add encoding and failed
  conversion handler to port type.  Declarations for new or modified
  functions scm_getc, scm_unget_byte, scm_ungetc,
  scm_i_get_port_encoding, scm_i_set_port_encoding_x,
  scm_port_encoding, scm_set_port_encoding_x,
  scm_i_get_conversion_strategy, scm_i_set_conversion_strategy_x,
  scm_port_conversion_strategy, scm_set_port_conversion_strategy_x.

* libguile/ports.c: assign the current ports to zero on startup so
  we can see if they've been set.
  (scm_current_input_port, scm_current_output_port,
  scm_current_error_port): return #f if the port is not yet
  initialized
  (scm_new_port_table_entry): set up a new port's encoding and
  illegal sequence handler based on the thread's current defaults
  (scm_i_remove_port): free port encoding name when port is removed
  (scm_i_mode_bits_n): now takes a scheme string instead of a c
  string and length.  All callers changed.
  (SCM_MBCHAR_BUF_SIZE): new const
  (scm_getc): new function, since the scm_getc in inline.h is now
  scm_get_byte_or_eof.  This pulls one codepoint from a port.
  (scm_lfwrite_substr, scm_lfwrite_str): now uses port's encoding
  (scm_unget_byte): new function, incorportaing the low-level functionality
  of scm_ungetc
  (scm_ungetc): uses scm_unget_byte

* libguile/numbers.h (scm_t_wchar): compilation order problem with
  scm_t_wchar being use in functions in multiple headers.  Forward
  declare scm_t_wchar.

* libguile/load.c (scm_primitive_load): scan for file encoding at
  top of file and use it to set the load port's encoding

* libguile/inline.h (scm_get_byte_or_eof): new function
  incorporating most of the functionality of scm_getc.

* libguile/fports.c (fport_fill_input): now returns scm_t_wchar

* libguile/chars.h (scm_t_wchar): avoid compilation order problem
  with declaration of scm_t_wchar
2009-08-25 07:54:37 -07:00
Michael Gran
06b961904d Avoid possible mutex hang on error message output
Avoid possible mutex hang when scm_lfwrite_substr is used in error
message output and when an error has caused the stringbuf write
mutex to not be unlocked.  scm_lfwrite_substr makes a substring:
making a substring requires that mutex.

Hopefully, all cases of non-local jumps when the stringbuf write
lock is held have been eliminated anyway, making this O.B.E.

* libguile/ports.c (scm_lfwrite_str): include functionality in this
  function instead of making this a special case of scm_lfwrite_substr
2009-08-19 22:15:41 -07:00
Ludovic Courtès
fbb857a472 Merge branch 'master' into boehm-demers-weiser-gc
Conflicts:
	lib/Makefile.am
	libguile/Makefile.am
	libguile/frames.c
	libguile/gc-card.c
	libguile/gc-freelist.c
	libguile/gc-mark.c
	libguile/gc-segment.c
	libguile/gc_os_dep.c
	libguile/load.c
	libguile/macros.c
	libguile/objcodes.c
	libguile/programs.c
	libguile/strings.c
	libguile/vm.c
	m4/gnulib-cache.m4
	m4/gnulib-comp.m4
	m4/inline.m4
2009-08-18 00:06:45 +02:00
Michael Gran
eca29b0202 Don't include libunistring headers in Guile public headers
This requres the creation of a new type
scm_t_string_failed_conversion_handler to replace libunistring's
enum iconveh_ilseq_handler.

* libguile/strings.h: don't include <uniconv.h>
(scm_t_string_failed_conversion_handler): new enum type
(SCM_FAILED_CONVERSION_ERROR, SCM_FAILED_CONVERSION_QUESTION_MARK):
(SCM_FAILED_CONVERSION_ESCAPE_SEQUENCE): new enum type values

* libguile/strings.c (scm_to_stringn): now takes type
scm_t_string_failed_conversion_handler.  All callers changed.

* libguile/print.c: include <uniconv.h>

* libguile/ports.c (scm_lfwrite_substr): use
scm_t_string_conversion_handler's constants

* libguile/gen-scmconfig.c (SCM_ICONVEH_ERROR):
(SCM_ICONVEH_QUESTION_MARK, SCM_ICONVEH_ESCAPE_SEQUENCE): store
iconveh_ilseq_hander constants as #define's
2009-08-12 09:21:55 -07:00
Michael Gran
bd4911efd2 Some signed/unsigned comparison and conversions
* libguile/ports.c (scm_lfwrite_str, scm_lfwrite_substr): signed/unsigned
  conversion and comparison

* libguile/strings.c (scm_string_append): signed/unsigned comparison
2009-08-12 08:50:45 -07:00
Michael Gran
b07992900b Port position macros shouldn't require enclosing braces
The port position macros incorrectly required enclosing braces
when used within if statements.

        * libguile/ports.h (SCM_INCLINE, SCM_ZEROCOL, SCM_INCCOL)
        (SCM_DECCOL, SCM_TABCOL): enclose macro in do/while

        * libguile/ports.c (update_port_lf): remove extra braces
2009-08-09 16:58:14 -07:00
Michael Gran
9c44cd4559 Add Unicode strings and symbols
This adds full Unicode strings as a datatype, and it adds some
minimal functionality.  The terminal and port encoding is assumed
to be ISO-8859-1.  Non-ISO-8859-1 characters are written or
input as string character escapes.

The string character escapes now have 3 forms: \xXX \uXXXX and
\UXXXXXX, for unprintable characters that have 2, 4 or 6 hex digits.

The process for writing to strings has been modified.  There is now a
function scm_i_string_start_writing that does the copy-on-write
conversion if necessary.

To compile strings that may be wide, the VM storage of strings and
string-likes has changed.

Most string-using functions have not yet been updated and may break
when used with wide strings.


        * module/language/assembly/compile-bytecode.scm (write-bytecode):
        use variable width string bytecode format

        * module/language/assembly.scm (byte-length): use variable width
        bytecode format

        * libguile/vm-i-loader.c (load-string, load-symbol):
        (load-keyword, define): use variable-width bytecode format

        * libguile/vm-engine.h (FETCH_WIDTH): new macro

        * libguile/strings.h: new declarations

        * libguile/strings.c (make_wide_stringbuf): new function
        (widen_stringbuf): new function
        (scm_i_make_wide_string): new function
        (scm_i_is_narrow_string): new function
        (scm_i_string_wide_chars): new function
        (scm_i_string_start_writing): new function
        (scm_i_string_ref): new function
        (scm_i_string_set_x): new function
        (scm_i_is_narrow_symbol): new function
        (scm_i_symbol_wide_chars, scm_i_symbol_ref): new function
        (scm_string_width): new function
        (unistring_escapes_to_guile_escapes): new function
        (scm_to_stringn): new function
        (scm_i_stringbuf_free): modify for wide strings
        (scm_i_substring_copy): modify for wide strings
        (scm_i_string_chars, scm_string_append): modify for wide strings
        (scm_i_make_symbol, scm_to_locale_stringn): modify for wide strings
        (scm_string_dump, scm_symbol_dump, scm_to_locale_stringbuf):
        (scm_string, scm_i_deprecated_string_chars): modify for wide strings
        (scm_from_locale_string, scm_from_locale_stringn): add null test

        * libguile/srfi-13.c: add calls for scm_i_string_start_writing for
        each call of scm_i_string_stop_writing
        (scm_string_for_each): modify for wide strings

        * libguile/socket.c: add calls for scm_i_string_start_writing for each
        call of scm_i_string_stop_writing

        * libguile/rw.c: add calls for scm_i_string_start_writing for each
        call of scm_i_string_stop_writing

        * libguile/read.c (scm_read_string): allow reading of wide strings

        * libguile/print.h: add declaration for scm_charprint

        * libguile/print.c (iprin1): print wide strings and add new string
        escapes
        (scm_charprint): new function

        * libguile/ports.h: new declarations for scm_lfwrite_substr and
        scm_lfwrite_str

        * libguile/ports.c (update_port_lf): new function
        (scm_lfwrite): use update_port_lf
        (scm_lfwrite_substr): new function
        (scm_lfwrite_str): new function

        * test-suite/tests/asm-to-bytecode.test ("compiler"): add string
        width byte to sting-like asm tests
2009-08-08 02:35:00 -07:00
Ludovic Courtès
0a94eb002e Remove seek/truncate shortcuts to file ports.
Suggested by Neil.

* libguile/fports.c (fport_seek_or_seek64): Rename to `fport_seek ()'.
  (fport_seek, scm_i_fport_seek, scm_i_fport_truncate): Remove.

* libguile/fports.h (scm_i_fport_seek, scm_i_fport_truncate): Remove
  declarations.

* libguile/ports.c (scm_seek): Remove shortcut that would call out to
  `scm_i_fport_seek ()'.
  (scm_truncate_file): Likewise.
2009-06-28 23:33:17 +02:00
Ludovic Courtès
f1ce919933 Add scm_t_off' type so that scm_t_port' has a fixed layout.
* libguile/gen-scmconfig.c (main): Produce a definition for
  `scm_t_off'.

* libguile/ports.h (scm_t_port)[read_buf_size, saved_read_buf_size,
  write_buf_size, seek, truncate]: Use `scm_t_off' instead of `off_t' so
  that the layout and size of the structure does not depend on the
  application's `_FILE_OFFSET_BITS' value.  Reported by Bill
  Schottstaedt, see
  http://lists.gnu.org/archive/html/bug-guile/2009-06/msg00018.html.
  (scm_set_port_seek, scm_set_port_truncate): Update.

* libguile/ports.c (scm_set_port_seek, scm_set_port_truncate): Use
  `scm_t_off' and `off_t_or_off64_t'.

* libguile/fports.c (fport_seek, fport_truncate): Use `scm_t_off'
  instead of `off_t'.

* libguile/r6rs-ports.c (bip_seek, cbp_seek, bop_seek): Use `scm_t_off'
  instead of `off_t'.

* libguile/rw.c (scm_write_string_partial): Likewise.

* libguile/strports.c (st_resize_port, st_seek, st_truncate): Likewise.

* doc/ref/api-io.texi (Port Implementation): Update prototype of
  `scm_set_port_seek ()' and `scm_set_port_truncate ()'.

* NEWS: Update.
2009-06-25 23:32:44 +02:00
Neil Jerram
53befeb700 Change Guile license to LGPLv3+
(Not quite finished, the following will be done tomorrow.
   module/srfi/*.scm
   module/rnrs/*.scm
   module/scripts/*.scm
   testsuite/*.scm
   guile-readline/*
)
2009-06-17 00:22:09 +01:00
Ludovic Courtès
6290d3f109 GOOPS: Statically allocate the PORT class array.
* libguile/goops.c (scm_port_class): Statically allocate it.
  (create_port_classes): Don't use `scm_calloc ()'.

* libguile/goops.h (scm_port_class): Update declaration.

* libguile/ports.c (scm_make_port_type): When checking whether
  GOOPS is initialized, check whether the first element of
  SCM_PORT_CLASS is non-zero.
2009-02-03 00:03:09 +01:00