* libguile/boolean.c (scm_nil_p): New function.
* libguile/vm-i-scheme.c (nilp, not_nilp):
* libguile/vm-i-system.c (br_if_nil, br_if_not_nil): New instructions.
Renumber other ops.
* libguile/_scm.h (SCM_OBJCODE_MINOR_VERSION): Increment.
* module/language/assembly/compile-bytecode.scm (compile-bytecode): Add
support for writing `br-if-nil' and `br-if-not-nil' instructions.
* module/language/assembly/disassemble.scm (code-annotation): Add
`br-if-nil' and `br-if-not-nil' to the list of branch instructions.
* module/language/tree-il/compile-glil.scm: Add `nil?' to
`*primcall-ops*'.
(flatten): Use the new branch instructions for `nil?' conditionals.
* module/language/tree-il/primitives.scm: Add `nil?' to
`*interesting-primitive-names*', `*effect-free-primitives', and
`*effect+exception-free-primitives*'.
* module/language/assembly/compile-bytecode.scm (compile-bytecode):
Rewrite to fill a bytevector directly, instead of using bytevector
ports. `write-bytecode' itself is still present and almost the same
as before; it's just that `write-byte' et al now inline the effect of
writing a byte to a binary port.
* test-suite/tests/asm-to-bytecode.test (comp-test): Refactor to use
public interfaces.
* module/system/vm/program.scm (source:line-for-user): New exported
procedure, returns a 1-indexed line, suitable for presentation to a
user.
(write-program): Use source:line-for-user when making fallback names.
* module/system/vm/coverage.scm (coverage-data->lcov):
* module/language/assembly/disassemble.scm (source->string):
* module/system/repl/debug.scm (print-frame): Use source:line-for-user.
* module/system/base/language.scm (<language>): Remove the `version'
field from languages. It just wasn't useful.
* module/language/assembly/spec.scm:
* module/language/brainfuck/spec.scm:
* module/language/bytecode/spec.scm:
* module/language/ecmascript/spec.scm:
* module/language/elisp/spec.scm:
* module/language/glil/spec.scm:
* module/language/objcode/spec.scm:
* module/language/scheme/spec.scm:
* module/language/tree-il/spec.scm:
* module/language/value/spec.scm: Remove #:version from all language
definitions. Shorten some language names (e.g. "Guile Scheme" ->
"Scheme").
The bug was introduced dad6817f ("Use the R6RS I/O API in
`write-bytecode'.").
* module/language/assembly/compile-bytecode.scm
(write-bytecode)[write-string]: Rename to...
[write-latin1-string]: ... this. Add the `write-loader-len' call.
Write each character individually instead of using `string->utf8'.
[write-loader]: Remove.
* test-suite/tests/asm-to-bytecode.test ("compiler")[load-string "æ"]:
New test.
* module/language/assembly/compile-bytecode.scm (write-bytecode):
Replace the WRITE-BYTE and GET-ADDR parameters with PORT. New ADDRESS
and EMIT-OPCODE? parameters. Callers updated.
[write-byte, get-addr]: New procedures.
Adjust to write to PORT.
(compile-bytecode): Update accordingly.
* test-suite/tests/asm-to-bytecode.test (munge-bytecode): Return a
bytevector instead of a u8vector.
(comp-test): Deal with bytevectors.
* module/language/assembly/disassemble.scm (disassemble-meta):
Properties start with the fourth element, not the third. (The third is
the set of arities.)
A byte ordering error caused incorrect display of wide strings
when using the ",c" decompilation from the REPL.
* module/language/assembly/decompile-bytecode.scm (decode-bytecode):
wide strings are encoded in native endianness
* module/language/assembly/compile-bytecode.scm (write-bytecode): Handle
br-if-nargs compilation.
* module/language/assembly/decompile-bytecode.scm (decode-load-program):
And decompile them nicely as well.
* module/language/assembly/disassemble.scm (code-annotation): And,
present the disassembly if br-if-nargs-* nicely.
* libguile/objcodes.h (struct scm_objcode): Remove nargs, nrest, and
nlocs, as they are no longer needed. Also obviates the need for a
padding word.
* libguile/procs.c (scm_thunk_p): Use scm_i_program_arity for programs.
* libguile/procprop.c (scm_i_procedure_arity): Use scm_i_program_arity
for programs.
(scm_procedure_properties, scm_set_procedure_properties_x)
(scm_procedure_property, scm_set_procedure_property_x): Rework so that
non-closure properties are stored directly in a weak hash, instead of
needing a weak hash of "stand-in" closures to hold the properties. Fix
docstrings also.
* libguile/root.h (scm_stand_in_procs): Remove from the scm_sys_protects
set. Actually with libGC, we should be able to store the elements of
scm_sys_protects directly as global variables.
* libguile/gc.c (scm_init_storage): Remove scm_stand_in_procs
initialization.
* libguile/programs.c (scm_i_program_arity): New private accessor, tries
to determine the "minimum arity" of a program.
* libguile/vm.c (really_make_boot_program): Adapt to changes in
struct scm_objcode.
* module/language/assembly.scm (*program-header-len*, byte-length):
* module/language/assembly/compile-bytecode.scm (write-bytecode):
* module/language/assembly/decompile-bytecode.scm (decode-load-program):
* module/language/assembly/disassemble.scm (disassemble-load-program):
Adapt to changes in objcode.
* module/system/xref.scm (program-callee-rev-vars): Adapt to changes in
assembly.
* module/language/glil.scm: Remove nargs, nrest, and nlocs from
glil-program.
* module/language/glil/compile-assembly.scm (make-meta, glil->assembly):
* module/language/glil/decompile-assembly.scm (decompile-toplevel):
(decompile-load-program): Adapt to changes in GLIL and assembly.
* module/language/tree-il/compile-glil.scm (flatten-lambda): Adapt to
changes in GLIL.
* test-suite/tests/asm-to-bytecode.test: Adapt to assembly and bytecode
changes.
* test-suite/tests/tree-il.test: Adapt to GLIL changes.
* module/language/assembly/spec.scm:
* module/language/brainfuck/spec.scm:
* module/language/bytecode/spec.scm:
* module/language/ecmascript/spec.scm:
* module/language/glil/spec.scm:
* module/language/scheme/spec.scm:
* module/language/tree-il/spec.scm: Language-readers now take two
arguments: the port and the environment. This should allow for
compile-environment-specific reader behavior.
* module/system/base/compile.scm (read-and-compile):
* module/system/repl/common.scm (repl-read): Pass the environment to the
language-reader.
* module/system/repl/repl.scm (meta-reader, prompting-meta-read):
* module/system/repl/command.scm (define-meta-command): Use the second
argument to repl-reader, so we avoid frobbing current-reader.
The intent is to maintain the readability of `pmatch' invocations.
* module/language/assembly/disassemble.scm (disassemble-load-program):
Don't use wildcards in `pmatch' invocations, even when the matched
elements are unused.
* module/language/glil/decompile-assembly.scm (decompile-toplevel,
decompile-load-program): Likewise.
* module/system/xref.scm (program-callee-rev-vars): Likewise.
* module/language/assembly.scm (byte-length): Likewise.
* module/language/tree-il/compile-glil.scm (flatten): Likewise.
This should now work thanks to the changes in
28b119ee3d ("make sure all programs are
8-byte aligned"). This commit is a follow-up to
ec99fe8ecb ("Add FIXMEs about misaligned
objcode-metas.").
* libguile/objcodes.c (scm_c_make_objcode_slice): Uncomment assertion
that checks for proper alignment of PTR.
* module/language/assembly/compile-bytecode.scm (write-bytecode): Update
comment about META's alignment.
* libguile/_scm.h (SCM_OBJCODE_MINOR_VERSION): Bump.
* libguile/vm-engine.c (vm_error_bad_wide_string_length): New error
case.
* libguile/vm-i-loader.c (load-unsigned-integer, load-integer)
(load-keyword): Remove these instructions. The former two are
obsoleted by make-int64/make-uint64, the latter via make-keyword.
(load-string): Only handle narrow strings.
(load-symbol): Only handle narrow symbols. The wide case is handled
via make-symbol.
(load-wide-string): New instruction, for wide strings.
* libguile/vm-i-system.c (define): Move here from loaders.c, as now it
just takes a sym on the stack.
(make-keyword, make-symbol): New instructions.
* module/language/assembly.scm: Remove removed instructions. No more
width byte in load-string etc.
* module/language/assembly/compile-bytecode.scm (write-bytecode): Adapt
to change in instruction set.
* module/language/glil/compile-assembly.scm (glil->assembly): Compile
define by pushing the sym then emitting (define).
(dump-object): Dump narrow and wide strings differently. Use
make-keyword and make-symbol as appropriate.
* module/language/tree-il/compile-glil.scm (flatten): When compiling a
ref to a primitive (not a call), first see if the primitive is
actually bound in the root module. (That's not the case with e.g.
bytevector-u8-ref).
* module/system/xref.scm (program-callee-rev-vars): Don't parse out
"nexts".
* test-suite/tests/asm-to-bytecode.test ("compiler"): Adapt to bytecode
format change.
This adds full Unicode strings as a datatype, and it adds some
minimal functionality. The terminal and port encoding is assumed
to be ISO-8859-1. Non-ISO-8859-1 characters are written or
input as string character escapes.
The string character escapes now have 3 forms: \xXX \uXXXX and
\UXXXXXX, for unprintable characters that have 2, 4 or 6 hex digits.
The process for writing to strings has been modified. There is now a
function scm_i_string_start_writing that does the copy-on-write
conversion if necessary.
To compile strings that may be wide, the VM storage of strings and
string-likes has changed.
Most string-using functions have not yet been updated and may break
when used with wide strings.
* module/language/assembly/compile-bytecode.scm (write-bytecode):
use variable width string bytecode format
* module/language/assembly.scm (byte-length): use variable width
bytecode format
* libguile/vm-i-loader.c (load-string, load-symbol):
(load-keyword, define): use variable-width bytecode format
* libguile/vm-engine.h (FETCH_WIDTH): new macro
* libguile/strings.h: new declarations
* libguile/strings.c (make_wide_stringbuf): new function
(widen_stringbuf): new function
(scm_i_make_wide_string): new function
(scm_i_is_narrow_string): new function
(scm_i_string_wide_chars): new function
(scm_i_string_start_writing): new function
(scm_i_string_ref): new function
(scm_i_string_set_x): new function
(scm_i_is_narrow_symbol): new function
(scm_i_symbol_wide_chars, scm_i_symbol_ref): new function
(scm_string_width): new function
(unistring_escapes_to_guile_escapes): new function
(scm_to_stringn): new function
(scm_i_stringbuf_free): modify for wide strings
(scm_i_substring_copy): modify for wide strings
(scm_i_string_chars, scm_string_append): modify for wide strings
(scm_i_make_symbol, scm_to_locale_stringn): modify for wide strings
(scm_string_dump, scm_symbol_dump, scm_to_locale_stringbuf):
(scm_string, scm_i_deprecated_string_chars): modify for wide strings
(scm_from_locale_string, scm_from_locale_stringn): add null test
* libguile/srfi-13.c: add calls for scm_i_string_start_writing for
each call of scm_i_string_stop_writing
(scm_string_for_each): modify for wide strings
* libguile/socket.c: add calls for scm_i_string_start_writing for each
call of scm_i_string_stop_writing
* libguile/rw.c: add calls for scm_i_string_start_writing for each
call of scm_i_string_stop_writing
* libguile/read.c (scm_read_string): allow reading of wide strings
* libguile/print.h: add declaration for scm_charprint
* libguile/print.c (iprin1): print wide strings and add new string
escapes
(scm_charprint): new function
* libguile/ports.h: new declarations for scm_lfwrite_substr and
scm_lfwrite_str
* libguile/ports.c (update_port_lf): new function
(scm_lfwrite): use update_port_lf
(scm_lfwrite_substr): new function
(scm_lfwrite_str): new function
* test-suite/tests/asm-to-bytecode.test ("compiler"): add string
width byte to sting-like asm tests
This adds the 32-bit standalone characters. Strings are still
8-bit. Characters larger than 8-bit can only be entered or
displayed in octal format at this point. At this point, the
terminal's display encoding is expected to be Latin-1.
* module/language/assembly/compile-bytecode.scm (write-bytecode):
add 32-bit char
* module/language/assembly.scm (object->assembly): add 32-bit char
(assembly->object): add 32-bit char
* libguile/vm-i-system.c (make-char32): new op
* libguile/print.c (iprin1): print 32-bit char
* libguile/numbers.h: add type scm_t_wchar
* libguile/numbers.c: add type scm_t_wchar
* libguile/chars.h: new type scm_t_wchar
(SCM_CODEPOINT_MAX): new
(SCM_IS_UNICODE_CHAR): new
(SCM_MAKE_CHAR): operate on 32-bit char
* libguile/chars.c: comparison operators now use Unicode
codepoints
(scm_c_upcase): now receives and returns scm_t_wchar
(scm_c_downcase): now receives and returns scm_t_wchar
* libguile/objcodes.c (OBJCODE_COOKIE): Bump again, as our jump offsets
are now multiplied by 8.
* libguile/vm-i-system.c (BR): Interpret the 16-bit offset as a relative
jump to the nearest 8-byte-aligned block -- increasing relative jump
range from +/-32K to +/-240K.
(mvra): Do the same for the mvra jump.
* libguile/vm.c (really_make_boot_program): Align the mvra.
* module/language/assembly.scm (align-block): New export, for aligning
blocks.
* module/language/assembly/compile-bytecode.scm (write-bytecode): Emit
jumps to the nearest 8-byte-aligned block. Effectively our range is 18
bits in either direction. I would like to do this differently -- have
long-br and long-br-if, and all the other br instructions go to 8 bits
only. But the assembler doesn't have an appropriate representation to
allow me to do this yet, so for now this is what we have.
* module/language/assembly/decompile-bytecode.scm (decode-load-program):
Decode the 19-bit jumps.
* libguile/objcodes.c (OBJCODE_COOKIE): Bump objcode cookie, as we added
to struct scm_objcode.
* libguile/objcodes.h (struct scm_objcode): Add a uint32 after metalen
and before base, so that if the structure has 8-byte alignment, base
will have 8-byte alignment too. (Before, base was 12 bytes from the
start of the structure, now it's 16 bytes.)
* libguile/vm-engine.h (ASSERT_ALIGNED_PROCEDURE): Add a check that can
be turned on with VM_ENABLE_PARANOID_ASSERTIONS.
(CACHE_PROGRAM): Call ASSERT_ALIGNED_PROCEDURE.
* libguile/vm-i-system.c (long-local-ref): Add a missing semicolon.
* libguile/vm.c (really_make_boot_program): Rework to operate directly
on a malloc'd buffer, so that the program will be 8-byte aligned.
* module/language/assembly.scm (*program-header-len*): Add another 4 for
the padding.
(object->assembly): Fix case in which we would return (make-int8 0)
instead of (make-int8:0). This would throw off compile-assembly.scm's
use of addr+.
* module/language/assembly/compile-bytecode.scm (write-bytecode): Write
out the padding int.
* module/language/assembly/decompile-bytecode.scm (decode-load-program):
And pop off the padding int too.
* module/language/glil/compile-assembly.scm (glil->assembly): Don't pack
the assembly, assume that assembly.scm has done it for us. If a
program has a meta, pad out the program so that meta will be aligned.
* test-suite/tests/asm-to-bytecode.test: Adapt to expect programs to
have the extra 4-byte padding int.
* libguile/objcodes.h (struct scm_objcode): Remove the "unused" field --
the old "nexts" -- and expand nlocs to 16 bits.
* module/language/assembly/compile-bytecode.scm (write-bytecode): Write
the nlocs as a uint16.
* module/language/assembly/decompile-bytecode.scm (decode-load-program):
Decompile 16-bit nlocs. It seems this decompilation is little-endian
:-/
* test-suite/tests/asm-to-bytecode.test: Fix up to understand nlocs as a
little-endian value. The test does the right thing regarding
endianness.
This allows, e.g., ",c #u8(1 2 3)" at the REPL to actually work instead
of failing to decode `load-array'.
* module/language/assembly/decompile-bytecode.scm (decode-bytecode):
Account for the `load-array' instruction, which is followed by a
bytevector instead of a string. We should find a more elegant way to
do that.
* module/rnrs/bytevector.scm (rnrs):
* libguile/bytevectors.h:
* libguile/bytevectors.c (scm_uniform_array_to_bytevector): New function.
* libguile/unif.h:
* libguile/unif.c (scm_from_contiguous_typed_array): New function.
* libguile/vm-i-loader.c (load-array): New instruction, for loading byte
data into uniform vectors. Currently it copies out the data, though in
the future we could avoid that.
* module/language/assembly.scm (align-code): New exported function,
aligns code on some boundary.
(align-program): Use align-code.
* module/language/assembly/compile-bytecode.scm (write-bytecode): Support
the load-array instruction.
* module/language/glil/compile-assembly.scm (dump-object): Dump uniform
arrays. Neat :)
* doc/ref/api-procedures.texi (Compiled Procedures): Fix for API changes.
* doc/ref/compiler.texi (Compiling to the Virtual Machine): Replace GHIL
docs with Tree-IL docs. Update the bits about the Scheme compiler to
talk about Tree-IL and the expander instead of GHIL. Remove
<glil-argument>. Add placeholder sections for assembly and bytecode.
* doc/ref/vm.texi: Update examples with what currently happens. Reword
some things. Fix a couple errors.
* libguile/vm-i-system.c (externals): Remove this instruction, it's not
used.
* module/ice-9/documentation.scm (object-documentation): If the object is
a macro, try to return documentation on the macro transformer.
* module/language/assembly/disassemble.scm (disassemble-load-program):
Fix problem in which we skipped the first element of the object vector,
because of changes to procedure layouts a few months ago.
* module/language/scheme/spec.scm (read-file): Remove read-file
definition.
* module/language/tree-il.scm: Reorder exports. Remove <lexical>, it was
a compat shim to something that was never released. Fix `location'.
* module/language/tree-il/primitives.scm (/): Fix expander for more than
two args to /.
* module/system/base/compile.scm (read-file-in): Remove unused
definition.
* module/system/base/language.scm (system): Remove language-read-file.
* module/language/ecmascript/spec.scm (ecmascript): Remove read-file
definition.
* module/system/base/compile.scm: Expect compile passes to produce three
values, not two. The third is the "continuation environment", the
environment that can be used to compile a subsequent expression from
the same source language. For example, expansion-time side effects can
set the current module, which would be reflected appropriately in the
continuation environment.
* module/language/assembly/compile-bytecode.scm:
* module/language/bytecode/spec.scm:
* module/language/ecmascript/compile-ghil.scm:
* module/language/ghil/compile-glil.scm:
* module/language/glil/spec.scm:
* module/language/objcode/spec.scm:
* module/language/scheme/compile-ghil.scm:
* module/system/base/compile.scm: Update compile passes to return a
continuation environment.
* am/guilec (.scm.go): Create the target's directory, in case
$(builddir) != $(srcdir).
* configure.in: Don't output any makefile under `module/system' or
`module/language'.
* module/Makefile.am (SUBDIRS): Remove `language' and `system'. Add `.'
to the front.
(modpath, SOURCES, SCHEME_LANG_SOURCES, ECMASCRIPT_LANG_SOURCES,
GHIL_LANG_SOURCES, GLIL_LANG_SOURCES, ASSEMBLY_LANG_SOURCES,
BYTECODE_LANG_SOURCES, OBJCODE_LANG_SOURCES, VALUE_LANG_SOURCES): New
variables, taken from former `Makefile.am' files in sub-directories.