* test-suite/tests/asm-to-bytecode.test (native-cpu, native-word-size):
New procedures.
(test-target): When the target is the native CPU, use the native word
size instead of WORD-SIZE.
* test-suite/tests/asm-to-bytecode.test (%objcode-cookie-size)
(test-target): The objcode version embedded in the cookie is not an
effective version, so elide it from the test.
* module/system/base/target.scm (%target-endianness, %target-word-size):
New fluids.
(%native-word-size): New variable.
(with-target): Set these fluids.
(cpu-endianness, cpu-word-size, triplet-cpu, triplet-vendor,
triplet-os): New procedures.
(target-cpu, target-vendor, target-os): Use them.
(target-endianness, target-word-size): Refer to the corresponding
fluid.
* libguile/objcodes.c (target_endianness_var, target_word_size_var): New
global variables.
(NATIVE_ENDIANNESS): New macro.
(target_endianness, target_word_size, to_native_order): New functions.
(make_objcode_from_file): Use `scm_bytecode_to_native_objcode' instead
of `scm_bytecode_to_objcode'.
(bytecode_to_objcode): New function, based on `scm_bytecode_to_objcode',
with the addition of an `endianness' and `word_size' parameters.
(scm_bytecode_to_objcode): Use it.
(scm_bytecode_to_native_objcode): New function.
(scm_write_objcode): Use `target_word_size' and `target_endianness'.
Convert OBJCODE's len and meta-len to native byte order.
(scm_init_objcodes): Initialize `target_endianness_var' and
`target_word_size_var'.
* libguile/objcodes.h (scm_bytecode_to_native_objcode): New declaration.
* libguile/vm.c (really_make_boot_program): Use
`scm_bytecode_to_native_objcode' instead of `scm_bytecode_to_objcode'.
* test-suite/tests/asm-to-bytecode.test (%objcode-cookie-size): New
variable.
(test-target): New procedure.
("cross-compilation"): Add `test-target' calls and the "unknown
target" test.
* module/system/base/target.scm (validate-target): Accept any tuple with
at least 3 parts.
* test-suite/tests/asm-to-bytecode.test (test-triplet): New procedure.
("cross-compilation"): New test prefix.
* module/language/assembly/compile-bytecode.scm (compile-bytecode):
Rewrite to fill a bytevector directly, instead of using bytevector
ports. `write-bytecode' itself is still present and almost the same
as before; it's just that `write-byte' et al now inline the effect of
writing a byte to a binary port.
* test-suite/tests/asm-to-bytecode.test (comp-test): Refactor to use
public interfaces.
The bug was introduced dad6817f ("Use the R6RS I/O API in
`write-bytecode'.").
* module/language/assembly/compile-bytecode.scm
(write-bytecode)[write-string]: Rename to...
[write-latin1-string]: ... this. Add the `write-loader-len' call.
Write each character individually instead of using `string->utf8'.
[write-loader]: Remove.
* test-suite/tests/asm-to-bytecode.test ("compiler")[load-string "æ"]:
New test.
* module/language/assembly/compile-bytecode.scm (write-bytecode):
Replace the WRITE-BYTE and GET-ADDR parameters with PORT. New ADDRESS
and EMIT-OPCODE? parameters. Callers updated.
[write-byte, get-addr]: New procedures.
Adjust to write to PORT.
(compile-bytecode): Update accordingly.
* test-suite/tests/asm-to-bytecode.test (munge-bytecode): Return a
bytevector instead of a u8vector.
(comp-test): Deal with bytevectors.
* libguile/objcodes.h (struct scm_objcode): Remove nargs, nrest, and
nlocs, as they are no longer needed. Also obviates the need for a
padding word.
* libguile/procs.c (scm_thunk_p): Use scm_i_program_arity for programs.
* libguile/procprop.c (scm_i_procedure_arity): Use scm_i_program_arity
for programs.
(scm_procedure_properties, scm_set_procedure_properties_x)
(scm_procedure_property, scm_set_procedure_property_x): Rework so that
non-closure properties are stored directly in a weak hash, instead of
needing a weak hash of "stand-in" closures to hold the properties. Fix
docstrings also.
* libguile/root.h (scm_stand_in_procs): Remove from the scm_sys_protects
set. Actually with libGC, we should be able to store the elements of
scm_sys_protects directly as global variables.
* libguile/gc.c (scm_init_storage): Remove scm_stand_in_procs
initialization.
* libguile/programs.c (scm_i_program_arity): New private accessor, tries
to determine the "minimum arity" of a program.
* libguile/vm.c (really_make_boot_program): Adapt to changes in
struct scm_objcode.
* module/language/assembly.scm (*program-header-len*, byte-length):
* module/language/assembly/compile-bytecode.scm (write-bytecode):
* module/language/assembly/decompile-bytecode.scm (decode-load-program):
* module/language/assembly/disassemble.scm (disassemble-load-program):
Adapt to changes in objcode.
* module/system/xref.scm (program-callee-rev-vars): Adapt to changes in
assembly.
* module/language/glil.scm: Remove nargs, nrest, and nlocs from
glil-program.
* module/language/glil/compile-assembly.scm (make-meta, glil->assembly):
* module/language/glil/decompile-assembly.scm (decompile-toplevel):
(decompile-load-program): Adapt to changes in GLIL and assembly.
* module/language/tree-il/compile-glil.scm (flatten-lambda): Adapt to
changes in GLIL.
* test-suite/tests/asm-to-bytecode.test: Adapt to assembly and bytecode
changes.
* test-suite/tests/tree-il.test: Adapt to GLIL changes.
* libguile/_scm.h (SCM_OBJCODE_MINOR_VERSION): Bump.
* libguile/vm-engine.c (vm_error_bad_wide_string_length): New error
case.
* libguile/vm-i-loader.c (load-unsigned-integer, load-integer)
(load-keyword): Remove these instructions. The former two are
obsoleted by make-int64/make-uint64, the latter via make-keyword.
(load-string): Only handle narrow strings.
(load-symbol): Only handle narrow symbols. The wide case is handled
via make-symbol.
(load-wide-string): New instruction, for wide strings.
* libguile/vm-i-system.c (define): Move here from loaders.c, as now it
just takes a sym on the stack.
(make-keyword, make-symbol): New instructions.
* module/language/assembly.scm: Remove removed instructions. No more
width byte in load-string etc.
* module/language/assembly/compile-bytecode.scm (write-bytecode): Adapt
to change in instruction set.
* module/language/glil/compile-assembly.scm (glil->assembly): Compile
define by pushing the sym then emitting (define).
(dump-object): Dump narrow and wide strings differently. Use
make-keyword and make-symbol as appropriate.
* module/language/tree-il/compile-glil.scm (flatten): When compiling a
ref to a primitive (not a call), first see if the primitive is
actually bound in the root module. (That's not the case with e.g.
bytevector-u8-ref).
* module/system/xref.scm (program-callee-rev-vars): Don't parse out
"nexts".
* test-suite/tests/asm-to-bytecode.test ("compiler"): Adapt to bytecode
format change.
This adds full Unicode strings as a datatype, and it adds some
minimal functionality. The terminal and port encoding is assumed
to be ISO-8859-1. Non-ISO-8859-1 characters are written or
input as string character escapes.
The string character escapes now have 3 forms: \xXX \uXXXX and
\UXXXXXX, for unprintable characters that have 2, 4 or 6 hex digits.
The process for writing to strings has been modified. There is now a
function scm_i_string_start_writing that does the copy-on-write
conversion if necessary.
To compile strings that may be wide, the VM storage of strings and
string-likes has changed.
Most string-using functions have not yet been updated and may break
when used with wide strings.
* module/language/assembly/compile-bytecode.scm (write-bytecode):
use variable width string bytecode format
* module/language/assembly.scm (byte-length): use variable width
bytecode format
* libguile/vm-i-loader.c (load-string, load-symbol):
(load-keyword, define): use variable-width bytecode format
* libguile/vm-engine.h (FETCH_WIDTH): new macro
* libguile/strings.h: new declarations
* libguile/strings.c (make_wide_stringbuf): new function
(widen_stringbuf): new function
(scm_i_make_wide_string): new function
(scm_i_is_narrow_string): new function
(scm_i_string_wide_chars): new function
(scm_i_string_start_writing): new function
(scm_i_string_ref): new function
(scm_i_string_set_x): new function
(scm_i_is_narrow_symbol): new function
(scm_i_symbol_wide_chars, scm_i_symbol_ref): new function
(scm_string_width): new function
(unistring_escapes_to_guile_escapes): new function
(scm_to_stringn): new function
(scm_i_stringbuf_free): modify for wide strings
(scm_i_substring_copy): modify for wide strings
(scm_i_string_chars, scm_string_append): modify for wide strings
(scm_i_make_symbol, scm_to_locale_stringn): modify for wide strings
(scm_string_dump, scm_symbol_dump, scm_to_locale_stringbuf):
(scm_string, scm_i_deprecated_string_chars): modify for wide strings
(scm_from_locale_string, scm_from_locale_stringn): add null test
* libguile/srfi-13.c: add calls for scm_i_string_start_writing for
each call of scm_i_string_stop_writing
(scm_string_for_each): modify for wide strings
* libguile/socket.c: add calls for scm_i_string_start_writing for each
call of scm_i_string_stop_writing
* libguile/rw.c: add calls for scm_i_string_start_writing for each
call of scm_i_string_stop_writing
* libguile/read.c (scm_read_string): allow reading of wide strings
* libguile/print.h: add declaration for scm_charprint
* libguile/print.c (iprin1): print wide strings and add new string
escapes
(scm_charprint): new function
* libguile/ports.h: new declarations for scm_lfwrite_substr and
scm_lfwrite_str
* libguile/ports.c (update_port_lf): new function
(scm_lfwrite): use update_port_lf
(scm_lfwrite_substr): new function
(scm_lfwrite_str): new function
* test-suite/tests/asm-to-bytecode.test ("compiler"): add string
width byte to sting-like asm tests
* libguile/objcodes.c (OBJCODE_COOKIE): Bump objcode cookie, as we added
to struct scm_objcode.
* libguile/objcodes.h (struct scm_objcode): Add a uint32 after metalen
and before base, so that if the structure has 8-byte alignment, base
will have 8-byte alignment too. (Before, base was 12 bytes from the
start of the structure, now it's 16 bytes.)
* libguile/vm-engine.h (ASSERT_ALIGNED_PROCEDURE): Add a check that can
be turned on with VM_ENABLE_PARANOID_ASSERTIONS.
(CACHE_PROGRAM): Call ASSERT_ALIGNED_PROCEDURE.
* libguile/vm-i-system.c (long-local-ref): Add a missing semicolon.
* libguile/vm.c (really_make_boot_program): Rework to operate directly
on a malloc'd buffer, so that the program will be 8-byte aligned.
* module/language/assembly.scm (*program-header-len*): Add another 4 for
the padding.
(object->assembly): Fix case in which we would return (make-int8 0)
instead of (make-int8:0). This would throw off compile-assembly.scm's
use of addr+.
* module/language/assembly/compile-bytecode.scm (write-bytecode): Write
out the padding int.
* module/language/assembly/decompile-bytecode.scm (decode-load-program):
And pop off the padding int too.
* module/language/glil/compile-assembly.scm (glil->assembly): Don't pack
the assembly, assume that assembly.scm has done it for us. If a
program has a meta, pad out the program so that meta will be aligned.
* test-suite/tests/asm-to-bytecode.test: Adapt to expect programs to
have the extra 4-byte padding int.
* libguile/objcodes.h (struct scm_objcode): Remove the "unused" field --
the old "nexts" -- and expand nlocs to 16 bits.
* module/language/assembly/compile-bytecode.scm (write-bytecode): Write
the nlocs as a uint16.
* module/language/assembly/decompile-bytecode.scm (decode-load-program):
Decompile 16-bit nlocs. It seems this decompilation is little-endian
:-/
* test-suite/tests/asm-to-bytecode.test: Fix up to understand nlocs as a
little-endian value. The test does the right thing regarding
endianness.
* libguile/objcodes.c (make_objcode_by_mmap, scm_c_make_objcode_slice):
Verify the lengths with the meta-length.
(scm_objcode_meta): New procedure, for getting at the meta-info of an
objcode.
(scm_objcode_to_bytecode):
(scm_write_objcode): Write bytecode with the metadata too.
* module/system/vm/objcode.scm: Export object-meta.
* module/language/assembly.scm (byte-length):
* module/language/assembly/compile-bytecode.scm (write-bytecode):
* module/language/assembly/decompile-bytecode.scm (decode-load-program):
* module/language/assembly/disassemble.scm (disassemble-load-program):
* module/language/glil/compile-assembly.scm (glil->assembly):
* test-suite/tests/asm-to-bytecode.test ("compiler"): Change to
load-program format to have meta-or-#f instead of meta-length, so that
we can serialize the meta as objcode without a load-program byte. Add a
test for writing out the meta.
* module/language/bytecode/Makefile.am:
* module/language/bytecode/spec.scm: Add another language to the stack,
bytecode. Bytecode is the u8vector form of object code..
* configure.in:
* module/language/Makefile.am:
* module/language/assembly/Makefile.am:
* test-suite/tests/asm-to-bytecode.test:
* module/language/assembly/spec.scm:
* module/language/assembly/compile-bytecode.scm: Update to include the
new pass.
* gdbinit: Untested attempts to get the stack fondling macros to deal
with the new program representation.
* libguile/frames.c (scm_vm_frame_arguments, scm_vm_frame_source)
(scm_vm_frame_local_ref, scm_vm_frame_local_set_x): SCM_PROGRAM_DATA is
a struct scm_objcode*.
* libguile/instructions.h:
* libguile/instructions.c: Hide the instruction table and the struct
scm_instruction structure; all access to instructions now goes through
procedures. This is because instructions are no longer in a packed
array indexed by opcode. Also, declare a mask that all instructions
should fit in.
* libguile/objcodes.h:
* libguile/objcodes.c: Rewrite so that object code directly maps its
arity and length from its bytecode. This makes it unnecessary to keep
this information in programs, allowing programs to be simple conses
between the code (objcodes) and data (the object table and the closure
variables).
* libguile/programs.c (scm_make_program): Rework so that make-program
takes objcode, an object table, and externals as arguments. It's much
clearer this way, and we avoid malloc().
* libguile/stacks.c (is_vm_bootstrap_frame): Update for program/objcode
changes.
* libguile/vm-engine.c (vm_run): Initialize the jump table on the first
run, with the opcodes declared in the instruction sources, and with bad
instructions raising an error instead of wandering off into the
Unknown.
* libguile/vm-engine.h (FETCH_LENGTH): Always represent lengths as 3
bytes. The old code was too error-prone.
(NEXT_JUMP): Mask the instruction with SCM_VM_INSTRUCTION_MASK.
(NEW_FRAME): Update for program/objcode changes.
* libguile/vm-expand.h (VM_DEFINE_FUNCTION, VM_DEFINE_INSTRUCTION)
(VM_DEFINE_LOADER): Update so that we explicitly specify opcodes, so
that we have a stable bytecode API.
* libguile/vm-i-loader.c: Update license to LGPLv2+. Explicitly declare
opcodes.
(load-integer): Use an int instead of a long as the accumulator; still
need to revisit this code at some point, I think.
(load-program): Simplify, thankfully!! Just creates the objcode slice
and rolls with it.
* libguile/vm-i-scheme.c: Number the opcodes explicitly.
* libguile/vm-i-system.c: Update license to LGPLv2+. Explicitly declare
opcodes.
(make-closure): Update for new program API.
* libguile/vm.c (vm_make_boot_program): Update for new program/objcode
API. Still a bit ugly.
(scm_load_compiled_with_vm): Update for new program/objcode API.
* module/language/assembly.scm (byte-length): Fix byte-length calculation
for loaders, and load-program.
(code-pack, code-unpack): Start to move things from (system vm conv)
here.
(object->code, code->object): More things from conv.scm.
* module/language/glil.scm (<glil-program>): Add a new field,
closure-level.
(make-glil-program, compute-closure-level): Calculate the "closure
level" when making a glil program. This is the maximum depth of
external binding refs in this closure.
(unparse-glil): Fix label serialization.
* module/language/glil/compile-assembly.scm (make-meta): Prepend #f for
the meta's object table, though maybe in the future we can avoid
creating assembly in the first place.
(assoc-ref-or-acons, object-index-and-alist): GRRR! Caught again by the
different sets of arguments to assoc and assoc-ref!
(glil->assembly): Attempt to make the <glil-program> case more
readable, and fix the bugs. Sorry I don't know how to comment this
change any more than this.
(glil->assembly): For <glil-module> serialize the whole key, not just
the name.
(dump-object): subprogram-code is already a list. Serialize integers as
strings, not u8vectors. Fix the order of lists and vectors.
* module/language/glil/spec.scm (glil): Switch orders, so we prefer glil
-> assembly -> objcode. Actually glil->objcode doesn't work any more,
needs to be removed I think.
* module/language/objcode/spec.scm (objcode->value):
s/objcode->program/make-program/.
* module/language/scheme/inline.scm: Add acons inline.
* module/system/vm/conv.scm (make-byte-decoder): Skip the first 8 bytes,
they are header. Handle subprograms properly. Still needs help though.
(decode-length): Lengths are always 3 bytes now.
* module/system/vm/disasm.scm: Superficial changes to keep things
working. I'd like to fix this better in the future.
* module/system/vm/frame.scm (bootstrap-frame?): Fixes for
program-bytecode.
* module/system/vm/program.scm: Export make-program. It's program-objcode
now, no more program-bytecode.
* module/system/vm/vm.scm (vm-load): Use make-program.
* test-suite/tests/asm-to-bytecode.test: New test, very minimal.
* module/system/vm/objcode.scm: Export word-size, byte-order, and
write-objcode.