This adds full Unicode strings as a datatype, and it adds some
minimal functionality. The terminal and port encoding is assumed
to be ISO-8859-1. Non-ISO-8859-1 characters are written or
input as string character escapes.
String character escapes for unprintable characters now have three
forms: \xXX, \uXXXX and \UXXXXXX, taking 2, 4 and 6 hex digits respectively.
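As a rough illustration of the three escape forms, the sketch below picks the shortest form that fits a code point. The helper name format_char_escape is hypothetical and not part of libguile, and the lowercase hex digits are an assumption; the real printer may choose differently.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not a libguile function): write the escape for
   code point CP into BUF, choosing the shortest of the three forms.
   Returns the number of characters snprintf would have written.  */
static int
format_char_escape (uint32_t cp, char *buf, size_t size)
{
  if (cp <= 0xFF)
    /* Two hex digits: \xXX */
    return snprintf (buf, size, "\\x%02lx", (unsigned long) cp);
  if (cp <= 0xFFFF)
    /* Four hex digits: \uXXXX */
    return snprintf (buf, size, "\\u%04lx", (unsigned long) cp);
  /* Six hex digits: \UXXXXXX (assumes CP fits in 24 bits) */
  return snprintf (buf, size, "\\U%06lx", (unsigned long) cp);
}
```

Under these assumptions, code point 0x3bb would be formatted as \u03bb and 0x1f600 as \U01f600.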
The process for writing to strings has been modified. There is now a
function scm_i_string_start_writing that does the copy-on-write
conversion if necessary.
To compile strings that may be wide, the VM storage of strings and
string-likes has changed.
Most string-using functions have not yet been updated and may break
when used with wide strings.
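The narrow/wide distinction described above can be pictured with a minimal sketch. It assumes a buffer that stores one byte per character until a character above 0xFF is written, at which point it is widened to four bytes per character. The struct and the strbuf_* names are hypothetical and do not reflect the actual stringbuf layout in strings.c.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical buffer: narrow (uint8_t elements) or wide (uint32_t).  */
struct strbuf
{
  size_t len;
  int is_wide;
  void *data;
};

/* Allocate a zero-filled narrow buffer of LEN characters.  */
static int
strbuf_init (struct strbuf *b, size_t len)
{
  b->len = len;
  b->is_wide = 0;
  b->data = calloc (len ? len : 1, sizeof (uint8_t));
  return b->data ? 0 : -1;
}

/* Convert a narrow buffer to a wide one, copying each character.  */
static int
strbuf_widen (struct strbuf *b)
{
  uint32_t *wide;
  const uint8_t *narrow = b->data;
  size_t i;

  if (b->is_wide)
    return 0;
  wide = malloc ((b->len ? b->len : 1) * sizeof *wide);
  if (!wide)
    return -1;
  for (i = 0; i < b->len; i++)
    wide[i] = narrow[i];
  free (b->data);
  b->data = wide;
  b->is_wide = 1;
  return 0;
}

/* Store CP at index I, widening first if CP does not fit in a byte.  */
static int
strbuf_set (struct strbuf *b, size_t i, uint32_t cp)
{
  if (i >= b->len)
    return -1;
  if (!b->is_wide && cp > 0xFF && strbuf_widen (b) != 0)
    return -1;
  if (b->is_wide)
    ((uint32_t *) b->data)[i] = cp;
  else
    ((uint8_t *) b->data)[i] = (uint8_t) cp;
  return 0;
}

/* Read the character at index I (caller guarantees I < len).  */
static uint32_t
strbuf_ref (const struct strbuf *b, size_t i)
{
  return b->is_wide ? ((const uint32_t *) b->data)[i]
                    : ((const uint8_t *) b->data)[i];
}
```

Code that indexes the data directly as bytes would misread a widened buffer, which is one way functions not yet updated for wide strings could break.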
* module/language/assembly/compile-bytecode.scm (write-bytecode):
use variable width string bytecode format
* module/language/assembly.scm (byte-length): use variable width
bytecode format
* libguile/vm-i-loader.c (load-string, load-symbol):
(load-keyword, define): use variable-width bytecode format
* libguile/vm-engine.h (FETCH_WIDTH): new macro
* libguile/strings.h: new declarations
* libguile/strings.c (make_wide_stringbuf): new function
(widen_stringbuf): new function
(scm_i_make_wide_string): new function
(scm_i_is_narrow_string): new function
(scm_i_string_wide_chars): new function
(scm_i_string_start_writing): new function
(scm_i_string_ref): new function
(scm_i_string_set_x): new function
(scm_i_is_narrow_symbol): new function
(scm_i_symbol_wide_chars, scm_i_symbol_ref): new function
(scm_string_width): new function
(unistring_escapes_to_guile_escapes): new function
(scm_to_stringn): new function
(scm_i_stringbuf_free): modify for wide strings
(scm_i_substring_copy): modify for wide strings
(scm_i_string_chars, scm_string_append): modify for wide strings
(scm_i_make_symbol, scm_to_locale_stringn): modify for wide strings
(scm_string_dump, scm_symbol_dump, scm_to_locale_stringbuf):
(scm_string, scm_i_deprecated_string_chars): modify for wide strings
(scm_from_locale_string, scm_from_locale_stringn): add null test
* libguile/srfi-13.c: add calls for scm_i_string_start_writing for
each call of scm_i_string_stop_writing
(scm_string_for_each): modify for wide strings
* libguile/socket.c: add calls for scm_i_string_start_writing for each
call of scm_i_string_stop_writing
* libguile/rw.c: add calls for scm_i_string_start_writing for each
call of scm_i_string_stop_writing
* libguile/read.c (scm_read_string): allow reading of wide strings
* libguile/print.h: add declaration for scm_charprint
* libguile/print.c (iprin1): print wide strings and add new string
escapes
(scm_charprint): new function
* libguile/ports.h: new declarations for scm_lfwrite_substr and
scm_lfwrite_str
* libguile/ports.c (update_port_lf): new function
(scm_lfwrite): use update_port_lf
(scm_lfwrite_substr): new function
(scm_lfwrite_str): new function
* test-suite/tests/asm-to-bytecode.test ("compiler"): add string
width byte to string-like asm tests
* libguile/vm-i-scheme.c: Add add1 and sub1 instructions.
* module/language/tree-il/compile-glil.scm: Compile 1+ and 1- to add1
and sub1.
* module/language/tree-il/primitives.scm (define-primitive-expander):
Add support for `if' statements in the consequent.
(+, -): Compile (- x 1), (+ x 1), and (+ 1 x) to 1- or 1+ as
appropriate.
(1-): Remove this one. Seems we forgot 1+ before, but we weren't
compiling it nicely anyway.
* test-suite/tests/tree-il.test ("void"): Fix expected compilation of (+
(void) 1) to allow for add1.
So far this affects only let-bound symbols; lambda arguments are to come in the future.
* module/language/elisp/README: Document it.
* module/language/elisp/compile-tree-il.scm: Add :always-lexical option.
* test-suite/tests/elisp-compiler.test: Test it.
* module/language/tree-il/analyze.scm (<binding-info>): New record type.
(report-unused-variables): New procedure.
* module/language/tree-il/compile-glil.scm (%warning-passes): New
variable.
(compile-glil): Honor `#:warnings' from OPTS.
* test-suite/tests/tree-il.test (call-with-warnings): New procedure.
(%opts-w-unused): New variable.
("warnings"): New test prefix.
* module/language/elisp/README: Document it.
* module/language/elisp/compile-tree-il.scm: Handle without-void-checks.
* test-suite/tests/elisp-compiler.test: Test it.
* module/language/elisp/README: Document new features.
* module/language/elisp/runtime/function-slot.scm: Implement funcall, apply and
eval by using the existing compiler code.
* test-suite/tests/elisp-compiler.test: Test those.
* module/language/elisp/README: Document it.
* module/language/elisp/compile-tree-il.scm: Implement guile-primitive.
* test-suite/tests/elisp-compiler.test: Switched a usage of guile-ref to
the now available guile-primitive.
* module/language/elisp/README: Document it.
* module/language/elisp/bindings.scm: New fields in bindings data structure
to keep track of lexical bindings for symbols.
* module/language/elisp/compile-tree-il.scm: Implement lexical-let(*).
* test-suite/tests/elisp-compiler.test: Test lexical scoping with lexical-let.
* libguile/objcodes.c (OBJCODE_COOKIE): Bump objcode cookie, as we added
to struct scm_objcode.
* libguile/objcodes.h (struct scm_objcode): Add a uint32 after metalen
and before base, so that if the structure has 8-byte alignment, base
will have 8-byte alignment too. (Before, base was 12 bytes from the
start of the structure, now it's 16 bytes.)
* libguile/vm-engine.h (ASSERT_ALIGNED_PROCEDURE): Add a check that can
be turned on with VM_ENABLE_PARANOID_ASSERTIONS.
(CACHE_PROGRAM): Call ASSERT_ALIGNED_PROCEDURE.
* libguile/vm-i-system.c (long-local-ref): Add a missing semicolon.
* libguile/vm.c (really_make_boot_program): Rework to operate directly
on a malloc'd buffer, so that the program will be 8-byte aligned.
* module/language/assembly.scm (*program-header-len*): Add another 4 for
the padding.
(object->assembly): Fix case in which we would return (make-int8 0)
instead of (make-int8:0). This would throw off compile-assembly.scm's
use of addr+.
* module/language/assembly/compile-bytecode.scm (write-bytecode): Write
out the padding int.
* module/language/assembly/decompile-bytecode.scm (decode-load-program):
And pop off the padding int too.
* module/language/glil/compile-assembly.scm (glil->assembly): Don't pack
the assembly, assume that assembly.scm has done it for us. If a
program has a meta, pad out the program so that meta will be aligned.
* test-suite/tests/asm-to-bytecode.test: Adapt to expect programs to
have the extra 4-byte padding int.
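The effect of the padding word on the offset of base can be checked with offsetof. The two layouts below are hypothetical stand-ins for struct scm_objcode: only metalen and base are named in the entries above, and the other fields are assumptions chosen so that base starts 12 bytes in before the change.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout before the change: base starts at byte 12.  */
struct objcode_before
{
  uint8_t small_fields[4];   /* assumed: four bytes of small header fields */
  uint32_t len;              /* assumed field */
  uint32_t metalen;
  uint8_t base[1];
};

/* Hypothetical layout after the change: one uint32 of padding
   moves base to byte 16.  */
struct objcode_after
{
  uint8_t small_fields[4];   /* assumed, as above */
  uint32_t len;              /* assumed field */
  uint32_t metalen;
  uint32_t unused;           /* the added padding word */
  uint8_t base[1];
};

/* Nonzero if a member at OFFSET is 8-byte aligned whenever the
   enclosing structure itself is 8-byte aligned.  */
static int
offset_is_8_aligned (size_t offset)
{
  return offset % 8 == 0;
}
```

With these layouts, offsetof gives 12 for the old base and 16 for the new one, so only the new layout keeps base 8-byte aligned when the structure is.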
* module/language/elisp/README: Document the change.
* module/language/elisp/compile-tree-il.scm: Add disable-void-check option.
* test-suite/tests/elisp-compiler.test: Test it.
* libguile/objcodes.h (struct scm_objcode): Remove the "unused" field --
the old "nexts" -- and expand nlocs to 16 bits.
* module/language/assembly/compile-bytecode.scm (write-bytecode): Write
the nlocs as a uint16.
* module/language/assembly/decompile-bytecode.scm (decode-load-program):
Decompile 16-bit nlocs. It seems this decompilation is little-endian
:-/
* test-suite/tests/asm-to-bytecode.test: Fix up to understand nlocs as a
little-endian value. The test does the right thing regarding
endianness.
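For reference, a 16-bit value written in little-endian order puts the low byte first. The sketch below shows byte-wise encoding and decoding that is independent of host endianness; the function names are hypothetical and this is not the code used by the bytecode writer or decompiler.

```c
#include <stdint.h>

/* Hypothetical helper: store V into OUT[0..1], low byte first.  */
static void
put_uint16_le (uint8_t *out, uint16_t v)
{
  out[0] = (uint8_t) (v & 0xFF);
  out[1] = (uint8_t) (v >> 8);
}

/* Hypothetical helper: read a little-endian 16-bit value from IN[0..1].  */
static uint16_t
get_uint16_le (const uint8_t *in)
{
  return (uint16_t) (in[0] | ((uint16_t) in[1] << 8));
}
```

For example, the value 0x0102 is stored as the bytes 0x02, 0x01.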
* module/language/elisp/README: Document it.
* module/language/elisp/compile-tree-il.scm: Implement flet and flet*.
* test-suite/tests/elisp-compiler.test: Test flet and flet*.
* libguile/programs.h:
* libguile/programs.c: (SCM_PROGRAM_FREE_VARIABLES): Rename from
SCM_PROGRAM_FREE_VARS. Callers changed.
* libguile/programs.c (scm_make_program): Rename arg to
"free_variables".
(scm_program_free_variables): Rename from program-free-vars.
* libguile/vm-engine.h:
* libguile/vm-engine.c (VM_CHECK_FREE_VARIABLES): Rename from
VM_CHECK_CLOSURE.
(vm_engine, CACHE_PROGRAM): Rename closure and closure_count to free_vars and
free_vars_count.
* libguile/vm-i-system.c (FREE_VARIABLE_REF): Rename from CLOSURE_REF.
(free-ref, free-boxed-ref, free-boxed-set): Rename from closure-ref,
closure-boxed-ref, closure-boxed-set.
(make-closure): Renamed from make-closure2.
* module/language/glil/compile-assembly.scm (glil->assembly): Hack to
never write out the old "make-closure" instruction. Will fix
better later. Change to emit free-ref etc instead of closure-ref.
* module/language/tree-il/compile-glil.scm (flatten): Emit make-closure
instead of make-closure2, now that the old make-closure is gone.
* module/system/vm/program.scm (system): Rename program-free-vars to
program-free-variables.
* test-suite/tests/tree-il.test ("lambda"): Update for make-closure.
* module/language/glil.scm (<glil>): New GLIL type, <glil-lexical>,
which will subsume other lexical types.
* module/language/glil/compile-assembly.scm: Compile <glil-lexical>.
(make-open-binding): Change the interpretation of the second argument
-- instead of indicating an "external" var, it now indicates a boxed
var.
(open-binding): Adapt to new glil-bind format.
* module/language/tree-il/analyze.scm: Add a lot more docs.
(analyze-lexicals): Change the allocation algorithm and output format
to allow the tree-il->glil compiler to capture free variables
appropriately and to reference bound variables in boxes if necessary.
Amply documented.
* module/language/tree-il/compile-glil.scm (compile-glil): Compile
lexical variable access to <glil-lexical>. Emit variable capture and
closure creation code here, instead of leaving that task to the
GLIL->assembly compiler.
* test-suite/tests/tree-il.test: Update expected code emission.
* module/language/elisp/README: Document it.
* module/language/elisp/compile-tree-il.scm: Moved ensure-fluid! to runtime function.
* module/language/elisp/runtime.scm: Runtime functions to support dynamic value access.
* module/language/elisp/runtime/function-slot.scm: Defined the built-ins.
* test-suite/tests/elisp-compiler.test: Test them.
* module/language/elisp/README: Document it and some further ideas written down.
* module/language/elisp/compile-tree-il.scm: Implement prog1, dolist.
* module/language/elisp/runtime/macro-slot.scm: prog2 and dotimes.
* test-suite/tests/elisp-compiler.test: Test prog1, prog2, dotimes, dolist.
* libguile/array-handle.c (scm_i_register_array_implementation):
(scm_i_array_implementation_for_obj): Add generic array facility,
which will (in a few commits) detangle the array code.
(scm_array_get_handle): Use the generic array facility. Note that
scm_t_array_handle no longer has ref and set function pointers;
instead it has a pointer to the array implementation. It is unlikely
that code out there used these functions, however, as the supported
way was through scm_array_handle_ref/set_x.
(scm_array_handle_pos): Move this function here from arrays.c.
(scm_array_handle_element_type): New function, returns a Scheme value
representing the type of element stored in this array.
* libguile/array-handle.h (scm_t_array_element_type): New enum, for
generically determining the type of an array.
(scm_array_handle_rank):
(scm_array_handle_dims): These are now just #defines.
* libguile/arrays.c:
* libguile/bitvectors.c:
* libguile/bytevectors.c:
* libguile/srfi-4.c:
* libguile/strings.c:
* libguile/vectors.c: Register array implementations for all of these.
* libguile/inline.h: Update for array_handle_ref/set change.
* libguile/deprecated.h: Need to include arrays.h now.
* module/language/elisp/README: Document it.
* module/language/elisp/compile-tree-il.scm: Implement defmacro and expansion.
* module/language/elisp/runtime/macro-slot.scm: New module to keep definitions.
* test-suite/Makefile.am: Add elisp-compiler.test to list of tests.
* test-suite/tests/elisp-compiler.test: Basic macro tests.
* module/language/elisp/runtime/function-slot.scm: Fixed errors in number preds.
* test-suite/tests/elisp-compiler.test: Test built-ins already implemented.
Thanks to Bill Schottstaedt for reporting this problem!
* libguile/numbers.c (mem2ureal): Don't be misled by *p_exactness
being INEXACT on entry (as is possible when reading a complex
number): use local exactness variable x which starts as EXACT.
Call mem2decimal_from_point () with &x instead of p_exactness.
* test-suite/tests/numbers.test ("string->number"): Add complex number
tests suggested by Bill.
As a side effect, this allows compilation of literal bytevectors
("#vu8(...)"), which gets done by the generic array handling
of the GLIL->assembly compiler.
* doc/ref/api-compound.texi (Generalized Vectors): Mention bytevectors.
(Arrays, Array Syntax): Likewise.
* doc/ref/api-data.texi (Bytevectors as Generalized Vectors): New node.
* libguile/bytevectors.c (scm_i_bytevector_generalized_set_x): New.
* libguile/bytevectors.h (scm_i_bytevector_generalized_set_x): New
declaration.
* libguile/srfi-4.c (scm_i_generalized_vector_type,
scm_array_handle_uniform_element_size,
scm_array_handle_uniform_writable_elements): Add support for
bytevectors.
* libguile/unif.c (type_creator_table): Add `vu8'.
(bytevector_ref, bytevector_set): New functions.
(memoize_ref, memoize_set): Add support for bytevectors.
* libguile/vectors.c (scm_is_generalized_vector,
scm_c_generalized_vector_length, scm_c_generalized_vector_ref,
scm_c_generalized_vector_set_x): Add support for bytevectors.
* test-suite/tests/bytevectors.test ("Generalized Vectors"): New test
set.
* libguile/read.c (scm_read_bytevector): New function.
(scm_read_sharp): Add `v' case for bytevectors.
* test-suite/lib.scm (exception:read-error): New variable.
* test-suite/tests/bytevectors.test ("Datum Syntax"): New test set.
* libguile/bytevectors.c (bytevector_equal_p): New function.
* test-suite/tests/bytevectors.test ("2.3 Operations on Bytes and
Octets")["equal?"]: New test.
Thanks to Greg Troxel for reporting, and Barry Fishman for the
explanation and fix.
* test-suite/tests/popen.test ("open-input-pipe"): Use shell function
`read' with an explicit argument, as apparently not all shells
support read with no argument.
* libguile/load.c (scm_try_autocompile): Punt if compiled-file-name does
not resolve, which would indicate that the file in question is part of
the compiler itself.
* test-suite/tests/elisp.test: Today I was an evil one -- disable
autocompilation for the elisp tests, as they are meant only for the
memoizer's eyes. Hopefully Daniel will fix this :-)