* module/language/elisp/parser.scm: New parser file.
* module/language/elisp/lexer.scm: Fix lexer/1 and add unquote-splicing support.
* module/language/elisp/spec.scm: Use new elisp-reader.
* module/language/elisp/README: Document we've got a reader now.
* test-suite/tests/elisp-reader.test: Test the parser.
* module/language/elisp/lexer.scm: New lexer file.
* test-suite/Makefile.am: Register elisp-reader.test as new test.
* test-suite/tests/elisp-reader.test: New test-case.
Ports are given two additional properties: a character encoding and
a conversion failure strategy. These properties have getters and setters.
The new properties are used to convert any locale text to/from the
internal representation of strings.
If unspecified, ports use a default value. The default value of these
properties is held in a fluid. The default character encoding can be
modified by calling setlocale.
ISO-8859-1 is treated specially. Since it is a native encoding of
strings, it can be processed more quickly. Source code is assumed to be
ISO-8859-1 unless otherwise specified. The encoding of a source code
file can be given as 'coding: XXXXX' in a magic comment at the top of a
file.
The C functions that deal with encoding often use a null pointer
as shorthand for the native Latin-1 encoding, for efficiency's sake.
* test-suite/tests/encoding-iso88591.test: new tests
* test-suite/tests/encoding-iso88597.test: new tests
* test-suite/tests/encoding-utf8.test: new tests
* test-suite/tests/encoding-escapes.test: new tests
* test-suite/tests/numbers.test: declare 'binary' encoding
* test-suite/tests/ports.test: declare 'binary' encoding
* test-suite/tests/r6rs-ports.test: declare 'binary' encoding
* module/system/base/compile.scm (compile-file): use source-code
file's self-declared encoding when compiling files
* libguile/strports.c: store string ports in locale encoding
(scm_strport_to_locale_u8vector, scm_call_with_output_locale_u8vector)
(scm_open_input_locale_u8vector, scm_get_output_locale_u8vector):
new functions
* libguile/strings.h: new declaration for scm_i_string_contains_char
* libguile/strings.c (scm_i_string_contains_char): new function
(scm_from_stringn, scm_to_stringn): use NULL for Latin-1
(scm_from_locale_stringn, scm_to_locale_stringn): respect character
encoding of input and output ports
* libguile/read.h: declaration for scm_scan_for_encoding
* libguile/read.c:
(read_token): now takes scheme string instead of C string/length
(read_complete_token): new function
(scm_read_sexp, scm_read_number, scm_read_mixed_case_symbol)
(scm_read_number_and_radix, scm_read_quote, scm_read_semicolon_comment)
(scm_read_srfi4_vector, scm_read_bytevector, scm_read_guile_bit_vector)
(scm_read_scsh_block_comment, scm_read_commented_expression)
(scm_read_extended_symbol, scm_read_sharp_extension, scm_read_shart)
(scm_read_expression): use scm_t_wchar for char type, use read_complete_token
(scm_scan_for_encoding): new function to find a file's character encoding
(scm_file_encoding): new function to find a port's character encoding
* libguile/rdelim.c: don't unpack strings
* libguile/print.h: declaration for modified function
scm_i_charprint
* libguile/print.c: use locale when printing characters and
strings
(scm_i_charprint): input parameter is now scm_t_wchar
(scm_simple_format): don't unpack strings
* libguile/posix.h: new declaration for scm_setbinary.
* libguile/posix.c (scm_setlocale): set default and stdio port
encodings based on the locale's character encoding
(scm_setbinary): new function
* libguile/ports.h (scm_t_port): add encoding and failed
conversion handler to port type. Declarations for new or modified
functions scm_getc, scm_unget_byte, scm_ungetc,
scm_i_get_port_encoding, scm_i_set_port_encoding_x,
scm_port_encoding, scm_set_port_encoding_x,
scm_i_get_conversion_strategy, scm_i_set_conversion_strategy_x,
scm_port_conversion_strategy, scm_set_port_conversion_strategy_x.
* libguile/ports.c: assign the current ports to zero on startup so
we can see if they've been set.
(scm_current_input_port, scm_current_output_port,
scm_current_error_port): return #f if the port is not yet
initialized
(scm_new_port_table_entry): set up a new port's encoding and
illegal sequence handler based on the thread's current defaults
(scm_i_remove_port): free port encoding name when port is removed
(scm_i_mode_bits_n): now takes a scheme string instead of a c
string and length. All callers changed.
(SCM_MBCHAR_BUF_SIZE): new const
(scm_getc): new function, since the scm_getc in inline.h is now
scm_get_byte_or_eof. This pulls one codepoint from a port.
(scm_lfwrite_substr, scm_lfwrite_str): now uses port's encoding
(scm_unget_byte): new function, incorportaing the low-level functionality
of scm_ungetc
(scm_ungetc): uses scm_unget_byte
* libguile/numbers.h (scm_t_wchar): compilation order problem with
scm_t_wchar being use in functions in multiple headers. Forward
declare scm_t_wchar.
* libguile/load.c (scm_primitive_load): scan for file encoding at
top of file and use it to set the load port's encoding
* libguile/inline.h (scm_get_byte_or_eof): new function
incorporating most of the functionality of scm_getc.
* libguile/fports.c (fport_fill_input): now returns scm_t_wchar
* libguile/chars.h (scm_t_wchar): avoid compilation order problem
with declaration of scm_t_wchar
* libguile/socket.c (scm_recv): receive the message without holding the
stringbuf writing lock
(scm_send): try to narrow a string before using it
* libguile/stime.c (strftime): convert string to UTF-8 so that it can
be safely passed to strftime
(strptime): convert input string to UTF-8 so that it can be safely
passed through strptime
* libguile/strings.c (narrow_stringbuf): new function
(scm_i_try_narrow_string): new function
* libguile/strings.h: new declaration for scm_i_try_narrow_string
* libguile/srcprop.c (scm_set_source_properties_x): Look for the special
source properties, save them off, and then construct a srcprops object
using them.
* libguile/hash.c (scm_i_string_hash): new function
(scm_hasher): don't unpack string: use scm_i_string_hash
* libguile/hash.h: new declaration for scm_i_string_hash
* libguile/print.c (quote_keywordish_symbol): use symbol accessors
(scm_i_print_symbol_name): new function
(scm_print_symbol_name): call scm_i_print_symbol_name
(iprin1): use scm_i_print_symbol_name to print symbols
* libguile/print.h: new declaration for scm_i_print_symbol_name
* libguile/symbols.c (lookup_interned_symbol): now takes scheme string
instead of c string; callers changed
(lookup_interned_symbol): add wide symbol support
(scm_i_c_mem2symbol): removed
(scm_i_mem2symbol): removed and replaced with scm_i_str2symbol
(scm_i_str2symbol): new function
(scm_i_mem2uninterned_symbol): removed and replaced with
scm_i_str2uninterned_symbol
(scm_i_str2uninterned_symbol): new function
(scm_make_symbol, scm_string_to_symbol, scm_from_locale_symbol)
(scm_from_locale_symboln): use scm_i_str2symbol
* test-suite/tests/symbols.test: new tests
* module/ice-9/boot-9.scm (eval): Here at the tail of boot-9, replace
the root definition of `eval' with a procedure that will call
`compile'.
* test-suite/tests/syntax.test ("top-level define"):
("internal define"): Run unmemoization tests in the interpreter, using
primitive-eval.
* libguile/tags.h (scm_tc7_program):
* libguile/programs.h: Programs now have their own tc7 code. Fix up the
macros appropriately.
* libguile/programs.c: Remove smobby bits, leaving marking, printing,
and application for other parts of Guile.
* libguile/debug.c (scm_procedure_source):
* libguile/eval.c (scm_trampoline_0, scm_trampoline_1)
(scm_trampoline_2): Add cases for tc7_program.
* libguile/eval.i.c (CEVAL, SCM_APPLY):
* libguile/evalext.c (scm_self_evaluating_p):
* libguile/gc-card.c (scm_i_sweep_card, scm_i_tag_name):
* libguile/gc-mark.c (1):
* libguile/print.c (iprin1):
* libguile/procs.c (scm_procedure_p, scm_thunk_p)
* libguile/vm-i-system.c (make-closure): Adapt to new procedure
representation.
* libguile/procprop.c (scm_i_procedure_arity): Do the right thing for
programs.
* test-suite/tests/procprop.test ("procedure-arity"): Arity test now
succeeds.
* libguile/goops.c (scm_class_of): Programs now belong to the class
<procedure>, not a smob class.
* libguile/vm.h (struct vm, struct vm_cont):
* libguile/vm-engine.c (vm_engine):
* libguile/frames.h (SCM_FRAME_BYTE_CAST, struct vm_frame):
* libguile/frames.c (scm_c_make_vm_frame): Fix usages of scm_byte_t,
changing them to scm_t_uint8.
* module/ice-9/boot-9.scm (@bind): Define a VM-compatible syntax
definition for this old evaluator primitive.
* test-suite/tests/dynamic-scope.test: Change the expected error
messages.
* libguile/_scm.h (SCM_OBJCODE_MINOR_VERSION): Bump
* libguile/vm-engine.c (vm_engine): Push a frame corresponding to the
mv-call.
* libguile/vm-i-system.c: Renumber ops.
(new-frame): New op, pushes a frame.
(call, mv-call): No need to shuffle args, though we do need to pop the
frame in the non-vm call case.
(goto/args): Inconsequential tweaks.
(call/cc): Push a frame if needed.
* module/language/tree-il/compile-glil.scm (flatten): Emit `new-frame'
as appropriate.
* test-suite/tests/tree-il.test: Fix to expect new-frame.
* test-suite/tests/compiler.test ("psyntax")["redefinition", "compile in
current module", "compile in fresh module"]: Use `begin' to enforce
evaluation order. Thanks Andy!
* module/ice-9/boot-9.scm (module-name): When making MOD non-anonymous,
bind it in the `(%app modules)' name space.
* test-suite/tests/compiler.test ("psyntax")["compile in current
module", "compile in fresh module"]: New tests.
* test-suite/tests/modules.test ("foundations")["modules don't remain
anonymous"]: New test.
* module/ice-9/psyntax-pp.scm: Regenerate.
* module/ice-9/psyntax.scm (chi-top)[define-form]: If a same-named
imported variable exists, take its value instead of `#f'.
* test-suite/tests/compiler.test ("psyntax")["redefinition"]: New tests.
* module/language/tree-il/compile-glil.scm (compile-glil): Compute
warnings before optimizing, as unreferenced variables will be
optimized out.
* libguile/_scm.h: Fix C99 comment.
* module/language/tree-il/fix-letrec.scm (partition-vars): Also analyze
let-bound vars.
(fix-letrec!): Fix a bug whereby a set! to an unreffed var would be
called for value, not effect. Also "fix" <let>-bound lambda
expressions -- really speeds up pmatch.
* test-suite/tests/tree-il.test ("lexical sets", "the or hack"): Update
to take into account the new optimizations.
* libguile/string.c (scm_string): Restores the functionality
where scm_string tests for circular lists
* test-suite/tests/strings.test: add test for circular lists
* libguile/_scm.h (SCM_OBJCODE_MINOR_VERSION): Bump.
* libguile/vm-engine.c (vm_error_bad_wide_string_length): New error
case.
* libguile/vm-i-loader.c (load-unsigned-integer, load-integer)
(load-keyword): Remove these instructions. The former two are
obsoleted by make-int64/make-uint64, the latter via make-keyword.
(load-string): Only handle narrow strings.
(load-symbol): Only handle narrow symbols. The wide case is handled
via make-symbol.
(load-wide-string): New instruction, for wide strings.
* libguile/vm-i-system.c (define): Move here from loaders.c, as now it
just takes a sym on the stack.
(make-keyword, make-symbol): New instructions.
* module/language/assembly.scm: Remove removed instructions. No more
width byte in load-string etc.
* module/language/assembly/compile-bytecode.scm (write-bytecode): Adapt
to change in instruction set.
* module/language/glil/compile-assembly.scm (glil->assembly): Compile
define by pushing the sym then emitting (define).
(dump-object): Dump narrow and wide strings differently. Use
make-keyword and make-symbol as appropriate.
* module/language/tree-il/compile-glil.scm (flatten): When compiling a
ref to a primitive (not a call), first see if the primitive is
actually bound in the root module. (That's not the case with e.g.
bytevector-u8-ref).
* module/system/xref.scm (program-callee-rev-vars): Don't parse out
"nexts".
* test-suite/tests/asm-to-bytecode.test ("compiler"): Adapt to bytecode
format change.
This adds full Unicode strings as a datatype, and it adds some
minimal functionality. The terminal and port encoding is assumed
to be ISO-8859-1. Non-ISO-8859-1 characters are written or
input as string character escapes.
The string character escapes now have 3 forms: \xXX \uXXXX and
\UXXXXXX, for unprintable characters that have 2, 4 or 6 hex digits.
The process for writing to strings has been modified. There is now a
function scm_i_string_start_writing that does the copy-on-write
conversion if necessary.
To compile strings that may be wide, the VM storage of strings and
string-likes has changed.
Most string-using functions have not yet been updated and may break
when used with wide strings.
* module/language/assembly/compile-bytecode.scm (write-bytecode):
use variable width string bytecode format
* module/language/assembly.scm (byte-length): use variable width
bytecode format
* libguile/vm-i-loader.c (load-string, load-symbol):
(load-keyword, define): use variable-width bytecode format
* libguile/vm-engine.h (FETCH_WIDTH): new macro
* libguile/strings.h: new declarations
* libguile/strings.c (make_wide_stringbuf): new function
(widen_stringbuf): new function
(scm_i_make_wide_string): new function
(scm_i_is_narrow_string): new function
(scm_i_string_wide_chars): new function
(scm_i_string_start_writing): new function
(scm_i_string_ref): new function
(scm_i_string_set_x): new function
(scm_i_is_narrow_symbol): new function
(scm_i_symbol_wide_chars, scm_i_symbol_ref): new function
(scm_string_width): new function
(unistring_escapes_to_guile_escapes): new function
(scm_to_stringn): new function
(scm_i_stringbuf_free): modify for wide strings
(scm_i_substring_copy): modify for wide strings
(scm_i_string_chars, scm_string_append): modify for wide strings
(scm_i_make_symbol, scm_to_locale_stringn): modify for wide strings
(scm_string_dump, scm_symbol_dump, scm_to_locale_stringbuf):
(scm_string, scm_i_deprecated_string_chars): modify for wide strings
(scm_from_locale_string, scm_from_locale_stringn): add null test
* libguile/srfi-13.c: add calls for scm_i_string_start_writing for
each call of scm_i_string_stop_writing
(scm_string_for_each): modify for wide strings
* libguile/socket.c: add calls for scm_i_string_start_writing for each
call of scm_i_string_stop_writing
* libguile/rw.c: add calls for scm_i_string_start_writing for each
call of scm_i_string_stop_writing
* libguile/read.c (scm_read_string): allow reading of wide strings
* libguile/print.h: add declaration for scm_charprint
* libguile/print.c (iprin1): print wide strings and add new string
escapes
(scm_charprint): new function
* libguile/ports.h: new declarations for scm_lfwrite_substr and
scm_lfwrite_str
* libguile/ports.c (update_port_lf): new function
(scm_lfwrite): use update_port_lf
(scm_lfwrite_substr): new function
(scm_lfwrite_str): new function
* test-suite/tests/asm-to-bytecode.test ("compiler"): add string
width byte to sting-like asm tests
* libguile/vm-i-scheme.c: Add add1 and sub1 instructions.
* module/language/tree-il/compile-glil.scm: Compile 1+ and 1- to add1
and sub1.
* module/language/tree-il/primitives.scm (define-primitive-expander):
Add support for `if' statements in the consequent.
(+, -): Compile (- x 1), (+ x 1), and (+ 1 x) to 1- or 1+ as
appropriate.
(1-): Remove this one. Seems we forgot 1+ before, but we weren't
compiling it nicely anyway.
* test-suite/tests/tree-il.test ("void"): Fix expected compilation of (+
(void) 1) to allow for add1.
Affects so far let-bound symbols, lambda arguments to come in the future.
* module/language/elisp/README: Document it.
* module/language/elisp/compile-tree-il.scm: Add :always-lexical option.
* test-suite/tests/elisp-compiler.test: Test it.
* module/language/tree-il/analyze.scm (<binding-info>): New record type.
(report-unused-variables): New procedure.
* module/language/tree-il/compile-glil.scm (%warning-passes): New
variable.
(compile-glil): Honor `#:warnings' from OPTS.
* test-suite/tests/tree-il.test (call-with-warnings): New procedure.
(%opts-w-unused): New variable.
("warnings"): New test prefix.
* module/language/elisp/README: Document it.
* module/language/elisp/compile-tree-il.scm: Handle without-void-checks.
* test-suite/tests/elisp-compiler.test: Test it.
* module/language/elisp/README: Document new features.
* module/language/elisp/runtime/function-slot.scm: Implement funcall, apply and
eval by using the existing compiler code.
* test-suite/tests/elisp-compiler.test: Test those.
* module/language/elisp/README: Document it.
* module/language/elisp/compile-tree-il.scm: Implement guile-primitive.
* test-suite/tests/elisp-compiler.test: Switched a usage of guile-ref to
the now available guile-primitive.
* module/language/elisp/README: Document it.
* module/language/elisp/bindings.scm: New fields in bindings data structure
to keep track of lexical bindings for symbols.
* module/language/elisp/compile-tree-il.scm: Implement lexical-let(*).
* test-suite/tests/elisp-compiler.test: Test lexical scoping with lexical-let.
* libguile/objcodes.c (OBJCODE_COOKIE): Bump objcode cookie, as we added
to struct scm_objcode.
* libguile/objcodes.h (struct scm_objcode): Add a uint32 after metalen
and before base, so that if the structure has 8-byte alignment, base
will have 8-byte alignment too. (Before, base was 12 bytes from the
start of the structure, now it's 16 bytes.)
* libguile/vm-engine.h (ASSERT_ALIGNED_PROCEDURE): Add a check that can
be turned on with VM_ENABLE_PARANOID_ASSERTIONS.
(CACHE_PROGRAM): Call ASSERT_ALIGNED_PROCEDURE.
* libguile/vm-i-system.c (long-local-ref): Add a missing semicolon.
* libguile/vm.c (really_make_boot_program): Rework to operate directly
on a malloc'd buffer, so that the program will be 8-byte aligned.
* module/language/assembly.scm (*program-header-len*): Add another 4 for
the padding.
(object->assembly): Fix case in which we would return (make-int8 0)
instead of (make-int8:0). This would throw off compile-assembly.scm's
use of addr+.
* module/language/assembly/compile-bytecode.scm (write-bytecode): Write
out the padding int.
* module/language/assembly/decompile-bytecode.scm (decode-load-program):
And pop off the padding int too.
* module/language/glil/compile-assembly.scm (glil->assembly): Don't pack
the assembly, assume that assembly.scm has done it for us. If a
program has a meta, pad out the program so that meta will be aligned.
* test-suite/tests/asm-to-bytecode.test: Adapt to expect programs to
have the extra 4-byte padding int.
* module/language/elisp/README: Document the change.
* module/language/elisp/compile-tree-il.scm: Add disable-void-check option.
* test-suite/tests/elisp-compiler.test: Test it.
* libguile/objcodes.h (struct scm_objcode): Remove the "unused" field --
the old "nexts" -- and expand nlocs to 16 bits.
* module/language/assembly/compile-bytecode.scm (write-bytecode): Write
the nlocs as a uint16.
* module/language/assembly/decompile-bytecode.scm (decode-load-program):
Decompile 16-bit nlocs. It seems this decompilation is little-endian
:-/
* test-suite/tests/asm-to-bytecode.test: Fix up to understand nlocs as a
little-endian value. The test does the right thing regarding
endianness.
* module/language/elisp/README: Document it.
* module/language/elisp/compile-tree-il.scm: Implement flet and flet*.
* test-suite/tests/elisp-compiler.test: Test flet and flet*.
* libguile/programs.h:
* libguile/programs.c: (SCM_PROGRAM_FREE_VARIABLES): Rename from
SCM_PROGRAM_FREE_VARS. Callers changed.
* libguile/programs.c (scm_make_program): Rename arg to
"free_variables".
(scm_program_free_variables): Rename from program-free-vars.
* libguile/vm-engine.h:
* libguile/vm-engine.c (VM_CHECK_FREE_VARIABLES): Rename from
VM_CHECK_CLOSURE.
(vm_engine, CACHE_PROGRAM): Rename closure and closure_count to free_vars and
free_vars_vount.
* libguile/vm-i-system.c (FREE_VARIABLE_REF): Rename from CLOSURE_REF.
(free-ref, free-boxed-ref, free-boxed-set): Rename from closure-ref,
closure-boxed-ref, closure-boxed-set.
(make-closure): Renamed from make-closure2.
* module/language/glil/compile-assembly.scm (glil->assembly): Hack to
never write out the the old "make-closure" instruction. Will fix
better later. Change to emit free-ref etc instead of closure-ref.
* module/language/tree-il/compile-glil.scm (flatten): Emit make-closure
instead of make-closure2, now that the old make-closure is gone.
* module/system/vm/program.scm (system): Rename program-free-vars to
program-free-variables.
* test-suite/tests/tree-il.test ("lambda"): Update for make-closure.
* module/language/glil.scm (<glil>): New GLIL type, <glil-lexical>,
which will subsume other lexical types.
* module/language/glil/compile-assembly.scm: Compile <glil-lexical>.
(make-open-binding): Change the interpretation of the second argument
-- instead of indicating an "external" var, it now indicates a boxed
var.
(open-binding): Adapt to new glil-bind format.
* module/language/tree-il/analyze.scm: Add a lot more docs.
(analyze-lexicals): Change the allocation algorithm and output format
to allow the tree-il->glil compiler to capture free variables
appropriately and to reference bound variables in boxes if necessary.
Amply documented.
* module/language/tree-il/compile-glil.scm (compile-glil): Compile
lexical variable access to <glil-lexical>. Emit variable capture and
closure creation code here, instead of leaving that task to the
GLIL->assembly compiler.
* test-suite/tests/tree-il.test: Update expected code emission.