The real names \`, \, and \,@ should be used instead and are returned
now by the real reader.
* module/language/elisp/compile-tree-il.scm: Only accept correct names.
* module/language/elisp/parser.scm: New parser file.
* module/language/elisp/lexer.scm: Fix lexer/1 and add unquote-splicing support.
* module/language/elisp/spec.scm: Use new elisp-reader.
* module/language/elisp/README: Document we've got a reader now.
* test-suite/tests/elisp-reader.test: Test the parser.
* module/language/elisp/lexer.scm: New lexer file.
* test-suite/Makefile.am: Register elisp-reader.test as new test.
* test-suite/tests/elisp-reader.test: New test-case.
Ports are given two additional properties: a character encoding and
a conversion failure strategy. These properties have getters and setters.
The new properties are used to convert any locale text to/from the
internal representation of strings.
If unspecified, ports use a default value. The default value of these
properties is held in a fluid. The default character encoding can be
modified by calling setlocale.
ISO-8859-1 is treated specially. Since it is a native encoding of
strings, it can be processed more quickly. Source code is assumed to be
ISO-8859-1 unless otherwise specified. The encoding of a source code
file can be given as 'coding: XXXXX' in a magic comment at the top of a
file.
The C functions that deal with encoding often use a null pointer
as shorthand for the native Latin-1 encoding, for efficiency's sake.
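The magic-comment lookup described above can be pictured with a small
sketch. It is an illustration only: `find_coding_name' is a hypothetical
helper, not the scm_scan_for_encoding added in this change, and the set
of characters accepted in an encoding name is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not libguile's scm_scan_for_encoding: look for a
   "coding: NAME" declaration in HEADER (the first bytes of a source
   file) and copy NAME into OUT.  Returns 1 on success, 0 if there is no
   declaration or NAME does not fit in OUT.  */
int
find_coding_name (const char *header, char *out, size_t out_size)
{
  const char *p;
  size_t n = 0;

  assert (header != NULL && out != NULL);
  p = strstr (header, "coding:");
  if (p == NULL || out_size == 0)
    return 0;

  p += strlen ("coding:");
  while (*p == ' ' || *p == '\t')
    p++;

  /* Assumed name alphabet: letters, digits, '-', '_' and '.'.  */
  while (isalnum ((unsigned char) *p) || *p == '-' || *p == '_' || *p == '.')
    {
      if (n + 1 >= out_size)
        return 0;
      out[n++] = *p++;
    }

  out[n] = '\0';
  return n > 0;
}
```

Given a first line such as `;;; -*- coding: utf-8 -*-', the sketch
yields the name `utf-8'; a caller would fall back to the default
encoding when it returns 0.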
* test-suite/tests/encoding-iso88591.test: new tests
* test-suite/tests/encoding-iso88597.test: new tests
* test-suite/tests/encoding-utf8.test: new tests
* test-suite/tests/encoding-escapes.test: new tests
* test-suite/tests/numbers.test: declare 'binary' encoding
* test-suite/tests/ports.test: declare 'binary' encoding
* test-suite/tests/r6rs-ports.test: declare 'binary' encoding
* module/system/base/compile.scm (compile-file): use source-code
file's self-declared encoding when compiling files
* libguile/strports.c: store string ports in locale encoding
(scm_strport_to_locale_u8vector, scm_call_with_output_locale_u8vector)
(scm_open_input_locale_u8vector, scm_get_output_locale_u8vector):
new functions
* libguile/strings.h: new declaration for scm_i_string_contains_char
* libguile/strings.c (scm_i_string_contains_char): new function
(scm_from_stringn, scm_to_stringn): use NULL for Latin-1
(scm_from_locale_stringn, scm_to_locale_stringn): respect character
encoding of input and output ports
* libguile/read.h: declaration for scm_scan_for_encoding
* libguile/read.c:
(read_token): now takes Scheme string instead of C string/length
(read_complete_token): new function
(scm_read_sexp, scm_read_number, scm_read_mixed_case_symbol)
(scm_read_number_and_radix, scm_read_quote, scm_read_semicolon_comment)
(scm_read_srfi4_vector, scm_read_bytevector, scm_read_guile_bit_vector)
(scm_read_scsh_block_comment, scm_read_commented_expression)
(scm_read_extended_symbol, scm_read_sharp_extension, scm_read_sharp)
(scm_read_expression): use scm_t_wchar for char type, use read_complete_token
(scm_scan_for_encoding): new function to find a file's character encoding
(scm_file_encoding): new function to find a port's character encoding
* libguile/rdelim.c: don't unpack strings
* libguile/print.h: declaration for modified function
scm_i_charprint
* libguile/print.c: use locale when printing characters and
strings
(scm_i_charprint): input parameter is now scm_t_wchar
(scm_simple_format): don't unpack strings
* libguile/posix.h: new declaration for scm_setbinary.
* libguile/posix.c (scm_setlocale): set default and stdio port
encodings based on the locale's character encoding
(scm_setbinary): new function
* libguile/ports.h (scm_t_port): add encoding and failed
conversion handler to port type. Declarations for new or modified
functions scm_getc, scm_unget_byte, scm_ungetc,
scm_i_get_port_encoding, scm_i_set_port_encoding_x,
scm_port_encoding, scm_set_port_encoding_x,
scm_i_get_conversion_strategy, scm_i_set_conversion_strategy_x,
scm_port_conversion_strategy, scm_set_port_conversion_strategy_x.
* libguile/ports.c: assign the current ports to zero on startup so
we can see if they've been set.
(scm_current_input_port, scm_current_output_port,
scm_current_error_port): return #f if the port is not yet
initialized
(scm_new_port_table_entry): set up a new port's encoding and
illegal sequence handler based on the thread's current defaults
(scm_i_remove_port): free port encoding name when port is removed
(scm_i_mode_bits_n): now takes a Scheme string instead of a C
string and length. All callers changed.
(SCM_MBCHAR_BUF_SIZE): new const
(scm_getc): new function, since the scm_getc in inline.h is now
scm_get_byte_or_eof. This pulls one codepoint from a port.
(scm_lfwrite_substr, scm_lfwrite_str): now uses port's encoding
(scm_unget_byte): new function, incorporating the low-level functionality
of scm_ungetc
(scm_ungetc): uses scm_unget_byte
* libguile/numbers.h (scm_t_wchar): compilation order problem with
scm_t_wchar being used in functions in multiple headers. Forward
declare scm_t_wchar.
* libguile/load.c (scm_primitive_load): scan for file encoding at
top of file and use it to set the load port's encoding
* libguile/inline.h (scm_get_byte_or_eof): new function
incorporating most of the functionality of scm_getc.
* libguile/fports.c (fport_fill_input): now returns scm_t_wchar
* libguile/chars.h (scm_t_wchar): avoid compilation order problem
with declaration of scm_t_wchar
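The split noted above between byte-level reads (scm_get_byte_or_eof,
scm_unget_byte) and character-level reads (scm_getc, which pulls one
codepoint from a port) can be sketched as follows. Everything here is
hypothetical: `struct byte_port' and the `port_*' functions are not
libguile's, the sketch hard-codes UTF-8 where a real port would use its
own encoding, and returning U+FFFD on malformed input is an assumed
policy standing in for the conversion failure strategy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory port; the real scm_t_port is far richer.  */
struct byte_port
{
  const unsigned char *data;
  size_t len;
  size_t pos;
};

/* Byte-level read: returns the next byte, or -1 at end of input.  */
int
port_get_byte (struct byte_port *port)
{
  return port->pos < port->len ? port->data[port->pos++] : -1;
}

/* Byte-level pushback: steps back over the byte that was just read.  */
void
port_unget_byte (struct byte_port *port)
{
  assert (port->pos > 0);
  port->pos--;
}

/* Character-level read: decode one UTF-8 code point.  Returns -1 at end
   of input and U+FFFD for a malformed or truncated sequence.  Overlong
   forms and surrogates are not rejected, to keep the sketch short.  */
int32_t
port_get_codepoint (struct byte_port *port)
{
  int first = port_get_byte (port);
  int extra, i;
  int32_t cp;

  if (first < 0)
    return -1;
  if (first < 0x80)
    return first;

  if ((first & 0xE0) == 0xC0)
    { extra = 1; cp = first & 0x1F; }
  else if ((first & 0xF0) == 0xE0)
    { extra = 2; cp = first & 0x0F; }
  else if ((first & 0xF8) == 0xF0)
    { extra = 3; cp = first & 0x07; }
  else
    return 0xFFFD;

  for (i = 0; i < extra; i++)
    {
      int next = port_get_byte (port);
      if (next < 0)
        return 0xFFFD;
      if ((next & 0xC0) != 0x80)
        {
          /* Not a continuation byte: leave it for the next read.  */
          port_unget_byte (port);
          return 0xFFFD;
        }
      cp = (cp << 6) | (next & 0x3F);
    }
  return cp;
}
```

The character-level function is built only on the two byte-level ones,
which mirrors the layering of scm_ungetc on top of scm_unget_byte.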
* module/ice-9/boot-9.scm (eval): Here at the tail of boot-9, replace
the root definition of `eval' with a procedure that will call
`compile'.
* test-suite/tests/syntax.test ("top-level define"):
("internal define"): Run unmemoization tests in the interpreter, using
primitive-eval.
* module/ice-9/boot-9.scm (@bind): Define a VM-compatible syntax
definition for this old evaluator primitive.
* test-suite/tests/dynamic-scope.test: Change the expected error
messages.
* libguile/_scm.h (SCM_OBJCODE_MINOR_VERSION): Bump
* libguile/vm-engine.c (vm_engine): Push a frame corresponding to the
mv-call.
* libguile/vm-i-system.c: Renumber ops.
(new-frame): New op, pushes a frame.
(call, mv-call): No need to shuffle args, though we do need to pop the
frame in the non-vm call case.
(goto/args): Inconsequential tweaks.
(call/cc): Push a frame if needed.
* module/language/tree-il/compile-glil.scm (flatten): Emit `new-frame'
as appropriate.
* test-suite/tests/tree-il.test: Fix to expect new-frame.
* libguile/load.h:
* libguile/load.c (scm_sys_warn_autocompilation_enabled): New primitive,
not exported. Since `load' autocompiles now, it should warn in the
same way that the bits hardcoded into C warn.
(scm_try_autocompile): Use scm_sys_warn_autocompilation_enabled.
* module/ice-9/boot-9.scm (autocompiled-file-name): New helper.
(load): Try autocompiling the argument, if appropriate. Will
autocompile files passed on Guile's command line. `primitive-load' is
unaffected.
This should now work thanks to the changes in
28b119ee3d ("make sure all programs are
8-byte aligned"). This commit is a follow-up to
ec99fe8ecb ("Add FIXMEs about misaligned
objcode-metas.").
* libguile/objcodes.c (scm_c_make_objcode_slice): Uncomment assertion
that checks for proper alignment of PTR.
* module/language/assembly/compile-bytecode.scm (write-bytecode): Update
comment about META's alignment.
* module/ice-9/boot-9.scm (module-name): When making MOD non-anonymous,
bind it in the `(%app modules)' name space.
* test-suite/tests/compiler.test ("psyntax")["compile in current
module", "compile in fresh module"]: New tests.
* test-suite/tests/modules.test ("foundations")["modules don't remain
anonymous"]: New test.
* module/ice-9/psyntax-pp.scm: Regenerate.
* module/ice-9/psyntax.scm (chi-top)[define-form]: If a same-named
imported variable exists, take its value instead of `#f'.
* test-suite/tests/compiler.test ("psyntax")["redefinition"]: New tests.
* module/language/tree-il/compile-glil.scm (compile-glil): Compute
warnings before optimizing, as unreferenced variables will be
optimized out.
* libguile/_scm.h: Fix C99 comment.
* module/language/tree-il/fix-letrec.scm (partition-vars): Also analyze
let-bound vars.
(fix-letrec!): Fix a bug whereby a set! to an unreffed var would be
called for value, not effect. Also "fix" <let>-bound lambda
expressions -- really speeds up pmatch.
* test-suite/tests/tree-il.test ("lexical sets", "the or hack"): Update
to take into account the new optimizations.
* libguile/_scm.h (SCM_OBJCODE_MINOR_VERSION): Bump.
* libguile/vm-engine.c (vm_error_bad_wide_string_length): New error
case.
* libguile/vm-i-loader.c (load-unsigned-integer, load-integer)
(load-keyword): Remove these instructions. The former two are
obsoleted by make-int64/make-uint64, the latter via make-keyword.
(load-string): Only handle narrow strings.
(load-symbol): Only handle narrow symbols. The wide case is handled
via make-symbol.
(load-wide-string): New instruction, for wide strings.
* libguile/vm-i-system.c (define): Move here from loaders.c, as now it
just takes a sym on the stack.
(make-keyword, make-symbol): New instructions.
* module/language/assembly.scm: Remove removed instructions. No more
width byte in load-string etc.
* module/language/assembly/compile-bytecode.scm (write-bytecode): Adapt
to change in instruction set.
* module/language/glil/compile-assembly.scm (glil->assembly): Compile
define by pushing the sym then emitting (define).
(dump-object): Dump narrow and wide strings differently. Use
make-keyword and make-symbol as appropriate.
* module/language/tree-il/compile-glil.scm (flatten): When compiling a
ref to a primitive (not a call), first see if the primitive is
actually bound in the root module. (That's not the case with e.g.
bytevector-u8-ref).
* module/system/xref.scm (program-callee-rev-vars): Don't parse out
"nexts".
* test-suite/tests/asm-to-bytecode.test ("compiler"): Adapt to bytecode
format change.
* module/Makefile.am (ECMASCRIPT_LANG_SOURCES):
* module/language/ecmascript/compile-ghil.scm:
* module/language/ecmascript/compile-tree-il.scm: Replace the
GHIL compiler with a ->tree-il compiler. Not fully functional, but the
basics work.
* module/language/ecmascript/spec.scm: Only include the tree-il compiler.
* module/language/ecmascript/tokenize.scm (read-punctuation): Avoid
mutating a constant.
This adds full Unicode strings as a datatype, and it adds some
minimal functionality. The terminal and port encoding is assumed
to be ISO-8859-1. Non-ISO-8859-1 characters are written or
input as string character escapes.
The string character escapes now have 3 forms: \xXX, \uXXXX and
\UXXXXXX, for unprintable characters whose code points take 2, 4 or 6
hex digits.
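As an illustration of the three escape forms, the sketch below picks the
shortest one that can hold a given code point. `format_char_escape' is a
hypothetical helper, not a libguile function, and the use of lowercase
hex digits is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write the escape for code point CP into OUT,
   using \xXX, \uXXXX or \UXXXXXX depending on how many hex digits CP
   needs.  Returns the number of characters written (excluding the NUL),
   or -1 if the escape does not fit in OUT.  */
int
format_char_escape (uint32_t cp, char *out, size_t out_size)
{
  int n;

  assert (out != NULL);
  if (cp <= 0xFF)
    n = snprintf (out, out_size, "\\x%02lx", (unsigned long) cp);
  else if (cp <= 0xFFFF)
    n = snprintf (out, out_size, "\\u%04lx", (unsigned long) cp);
  else
    n = snprintf (out, out_size, "\\U%06lx", (unsigned long) cp);

  return (n >= 0 && (size_t) n < out_size) ? n : -1;
}
```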
The process for writing to strings has been modified. There is now a
function scm_i_string_start_writing that does the copy-on-write
conversion if necessary.
To compile strings that may be wide, the VM storage of strings and
string-likes has changed.
Most string-using functions have not yet been updated and may break
when used with wide strings.
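The copy-on-write step before a string is written, and the widening of
a narrow buffer when a character no longer fits in one byte, can be
sketched as below. This is a simplified model under stated assumptions:
`struct strbuf' and the `strbuf_*' functions are hypothetical, a plain
reference count stands in for however sharing is actually tracked, and
the real scm_i_string_start_writing and widen_stringbuf work on Guile's
own string objects.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string buffer: one byte or four bytes per character.  */
struct strbuf
{
  int wide;      /* 0: narrow (Latin-1), 1: wide */
  int refs;      /* number of strings sharing this buffer */
  size_t len;    /* length in characters */
  void *chars;   /* unsigned char[len] if narrow, uint32_t[len] if wide */
};

struct strbuf *
strbuf_new_narrow (const char *text)
{
  struct strbuf *buf = malloc (sizeof *buf);
  assert (buf != NULL);
  buf->wide = 0;
  buf->refs = 1;
  buf->len = strlen (text);
  buf->chars = malloc (buf->len ? buf->len : 1);
  assert (buf->chars != NULL);
  memcpy (buf->chars, text, buf->len);
  return buf;
}

/* Copy-on-write: return a buffer that only the caller references.  */
struct strbuf *
strbuf_start_writing (struct strbuf *buf)
{
  struct strbuf *copy;
  size_t bytes;

  if (buf->refs == 1)
    return buf;

  bytes = buf->len * (buf->wide ? sizeof (uint32_t) : 1);
  copy = malloc (sizeof *copy);
  assert (copy != NULL);
  *copy = *buf;
  copy->refs = 1;
  copy->chars = malloc (bytes ? bytes : 1);
  assert (copy->chars != NULL);
  memcpy (copy->chars, buf->chars, bytes);
  buf->refs--;
  return copy;
}

/* Store CP at IDX, widening the buffer first if CP needs more than one
   byte.  BUF must be unshared, i.e. come from strbuf_start_writing.  */
void
strbuf_set (struct strbuf *buf, size_t idx, uint32_t cp)
{
  assert (buf->refs == 1 && idx < buf->len);
  if (!buf->wide && cp > 0xFF)
    {
      const unsigned char *old = buf->chars;
      uint32_t *widened = malloc (buf->len * sizeof *widened);
      size_t i;
      assert (widened != NULL);
      for (i = 0; i < buf->len; i++)
        widened[i] = old[i];
      free (buf->chars);
      buf->chars = widened;
      buf->wide = 1;
    }
  if (buf->wide)
    ((uint32_t *) buf->chars)[idx] = cp;
  else
    ((unsigned char *) buf->chars)[idx] = (unsigned char) cp;
}

uint32_t
strbuf_ref (const struct strbuf *buf, size_t idx)
{
  assert (idx < buf->len);
  return buf->wide ? ((const uint32_t *) buf->chars)[idx]
                   : ((const unsigned char *) buf->chars)[idx];
}
```

In this model a writer always calls strbuf_start_writing first, so a
store into one string never changes another string that shared the same
buffer.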
* module/language/assembly/compile-bytecode.scm (write-bytecode):
use variable width string bytecode format
* module/language/assembly.scm (byte-length): use variable width
bytecode format
* libguile/vm-i-loader.c (load-string, load-symbol):
(load-keyword, define): use variable-width bytecode format
* libguile/vm-engine.h (FETCH_WIDTH): new macro
* libguile/strings.h: new declarations
* libguile/strings.c (make_wide_stringbuf): new function
(widen_stringbuf): new function
(scm_i_make_wide_string): new function
(scm_i_is_narrow_string): new function
(scm_i_string_wide_chars): new function
(scm_i_string_start_writing): new function
(scm_i_string_ref): new function
(scm_i_string_set_x): new function
(scm_i_is_narrow_symbol): new function
(scm_i_symbol_wide_chars, scm_i_symbol_ref): new functions
(scm_string_width): new function
(unistring_escapes_to_guile_escapes): new function
(scm_to_stringn): new function
(scm_i_stringbuf_free): modify for wide strings
(scm_i_substring_copy): modify for wide strings
(scm_i_string_chars, scm_string_append): modify for wide strings
(scm_i_make_symbol, scm_to_locale_stringn): modify for wide strings
(scm_string_dump, scm_symbol_dump, scm_to_locale_stringbuf):
(scm_string, scm_i_deprecated_string_chars): modify for wide strings
(scm_from_locale_string, scm_from_locale_stringn): add null test
* libguile/srfi-13.c: add calls for scm_i_string_start_writing for
each call of scm_i_string_stop_writing
(scm_string_for_each): modify for wide strings
* libguile/socket.c: add calls for scm_i_string_start_writing for each
call of scm_i_string_stop_writing
* libguile/rw.c: add calls for scm_i_string_start_writing for each
call of scm_i_string_stop_writing
* libguile/read.c (scm_read_string): allow reading of wide strings
* libguile/print.h: add declaration for scm_charprint
* libguile/print.c (iprin1): print wide strings and add new string
escapes
(scm_charprint): new function
* libguile/ports.h: new declarations for scm_lfwrite_substr and
scm_lfwrite_str
* libguile/ports.c (update_port_lf): new function
(scm_lfwrite): use update_port_lf
(scm_lfwrite_substr): new function
(scm_lfwrite_str): new function
* test-suite/tests/asm-to-bytecode.test ("compiler"): add string
width byte to string-like asm tests
* module/language/tree-il/analyze.scm (analyze-lexicals): Rework to
actually determine when a fixed-point procedure may be allocated as a
label.
* module/language/tree-il/compile-glil.scm (emit-bindings): Always emit
a <glil-bind>. Otherwise it's too hard to pair with unbindings.
(flatten-lambda): Consequently, here we only `bind' if there are any
vars to bind. This doesn't make any difference, given that lambdas
don't have trailing unbind instructions, but it does keep the GLIL
output the same for thunks -- no extraneous (bind) instructions. Keeps
tree-il.test happy.
(flatten): Some bugfixes. Yaaay, it works!!!
* module/language/tree-il/compile-glil.scm (flatten-lambda, flatten):
Implement compilation of label-allocated lambda expressions. Quite
tricky, we'll see if this works when the new analyzer lands.
* module/language/tree-il/analyze.scm: Add some more comments about
something that will land in a future commit: compiling fixpoint
lambdas as labels.
(analyze-lexicals): Reorder a bit, and add a label alist to procedure
allocations. Empty for now.
* module/language/tree-il/compile-glil.scm (flatten): Adapt to the free
variables being in the cddr of the allocation, not the cdr.
* libguile/vm-i-scheme.c (vector-ref, vector-set): Sync registers if we
call out to C.
* module/language/tree-il/compile-glil.scm (flatten-lambda): Add an
extra argument, the self-label, which should be the gensym under which
the procedure is bound in a <fix> expression.
(flatten): If we see a call to a lexical ref to the self-label in a
tail position, rename and goto instead of goto/args, which will tear
down the frame -- or will, in the future. It's a primitive form of
loop detection.
* module/language/tree-il/primitives.scm (zero?): Expand to (= x 0).
* module/Makefile.am (SOURCES): Reorganize so GHIL is compiled last,
along with ecmascript.
* module/language/scheme/spec.scm: Remove references to GHIL, as it's
bitrotten and obsolete.
* module/language/tree-il.scm (make-tree-il-folder): Rework so that we
only have down and up procs, and call down and up on each element.
* module/language/tree-il/analyze.scm (analyze-lexicals): Fix a thinko
handling let-values.
* module/language/tree-il/fix-letrec.scm: Actually implement fixing
letrec. The resulting code will perform better, but violations of the
letrec restriction are not detected. This behavior is allowed by the
spec, but it is undesirable. Perhaps that will be fixed later.
* module/language/tree-il/inline.scm (inline!): Fix a case in which
((lambda args foo)) would be erroneously inlined to foo. Remove empty
let, letrec, and fix statements.
* module/language/tree-il/primitives.scm (effect-free-primitive?): New
public predicate.
* module/language/tree-il.scm (tree-il-fold): Fix for let-values case.
(make-tree-il-folder): New public macro, makes a multi-valued folder
specific to the number of seeds that the user wants.
* module/language/tree-il/optimize.scm (optimize!): Reverse the order of
inline! and fix-letrec!, as the latter might expose opportunities for
the former.
* module/srfi/srfi-11.scm (let-values): Reimplement in terms of
syntax-case, so that its expressions may reference hygienically bound
variables. See the NEWS for the rationale.
(let*-values): An empty let*-values still introduces a local `let'
binding contour.
* module/system/base/syntax.scm (record-case): Yukkkk. Reimplement in
terms of syntax-case. Ug-ly, but see the NEWS again: "Lexical bindings
introduced by hygienic macros may not be referenced by nonhygienic
macros."
* libguile/vm-i-system.c (fix-closure): New instruction, for wiring
together fixpoint procedures.
* module/Makefile.am (TREE_IL_LANG_SOURCES): Add fix-letrec.scm.
* module/language/glil/compile-assembly.scm (glil->assembly): Reindent
the <glil-lexical> case, and handle 'fix for locally-bound vars.
* module/language/tree-il.scm (<fix>): Add the <fix> tree-il type and
accessors, for fixed-point bindings. This IL construct is taken from
the Waddell paper.
(parse-tree-il, unparse-tree-il, tree-il->scheme, tree-il-fold)
(pre-order!, post-order!): Update for <fix>.
* module/language/tree-il/analyze.scm (analyze-lexicals): Update for
<fix>. The difference here is that the bindings may not be assigned,
and are not marked as such. They are not boxed.
(report-unused-variables): Update for <fix>.
* module/language/tree-il/compile-glil.scm (flatten): Compile <fix> to
GLIL.
* module/language/tree-il/fix-letrec.scm: A stub implementation of
fixing letrec -- will flesh out in a separate commit.
* module/language/tree-il/inline.scm: Fix license, it was mistakenly
added with LGPL v2.1+.
* module/language/tree-il/optimize.scm (optimize!): Run the fix-letrec!
pass.
* libguile/vm-i-scheme.c: Add add1 and sub1 instructions.
* module/language/tree-il/compile-glil.scm: Compile 1+ and 1- to add1
and sub1.
* module/language/tree-il/primitives.scm (define-primitive-expander):
Add support for `if' statements in the consequent.
(+, -): Compile (- x 1), (+ x 1), and (+ 1 x) to 1- or 1+ as
appropriate.
(1-): Remove this one. Seems we forgot 1+ before, but we weren't
compiling it nicely anyway.
* test-suite/tests/tree-il.test ("void"): Fix expected compilation of (+
(void) 1) to allow for add1.
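The rewriting of (- x 1), (+ x 1), and (+ 1 x) described above can be
modelled with a small sketch. It is written in C purely for
illustration; the actual expansion lives in the Scheme primitive
expanders, and `select_opcode', `struct operand' and the opcode names
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum opcode { OP_ADD, OP_SUB, OP_ADD1, OP_SUB1 };

struct operand
{
  int is_constant;   /* nonzero for a literal integer */
  long value;        /* the literal's value; unused for variables */
};

static int
is_one (struct operand x)
{
  return x.is_constant && x.value == 1;
}

/* Choose the opcode for (OP LHS RHS).  *KEPT is set to the operand that
   still has to be pushed: 0 for LHS, 1 for RHS, or -1 when both are
   needed because no specialization applies.  */
enum opcode
select_opcode (enum opcode op, struct operand lhs, struct operand rhs,
               int *kept)
{
  assert (kept != NULL);
  *kept = -1;

  if (op == OP_ADD && is_one (rhs))     /* (+ x 1) */
    { *kept = 0; return OP_ADD1; }
  if (op == OP_ADD && is_one (lhs))     /* (+ 1 x) */
    { *kept = 1; return OP_ADD1; }
  if (op == OP_SUB && is_one (rhs))     /* (- x 1) */
    { *kept = 0; return OP_SUB1; }

  return op;                            /* e.g. (- 1 x) is left alone */
}
```

Subtraction is not commutative, so only the form with the literal 1 on
the right is specialized.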
* module/language/elisp/compile-tree-il.scm: Move dynamic binding to one place
and change names that refer to `fluids' for dynamic binding.
* module/language/elisp/bindings.scm: Change names referring to `fluids'.
Affects so far let-bound symbols, lambda arguments to come in the future.
* module/language/elisp/README: Document it.
* module/language/elisp/compile-tree-il.scm: Add :always-lexical option.
* test-suite/tests/elisp-compiler.test: Test it.
* module/language/tree-il/analyze.scm (<binding-info>): New record type.
(report-unused-variables): New procedure.
* module/language/tree-il/compile-glil.scm (%warning-passes): New
variable.
(compile-glil): Honor `#:warnings' from OPTS.
* test-suite/tests/tree-il.test (call-with-warnings): New procedure.
(%opts-w-unused): New variable.
("warnings"): New test prefix.