* libguile/_scm.h (SCM_OBJCODE_MINOR_VERSION): Bump.
* libguile/vm-engine.c (vm_error_bad_wide_string_length): New error
case.
* libguile/vm-i-loader.c (load-unsigned-integer, load-integer)
(load-keyword): Remove these instructions. The former two are
obsoleted by make-int64/make-uint64, the latter via make-keyword.
(load-string): Only handle narrow strings.
(load-symbol): Only handle narrow symbols. The wide case is handled
via make-symbol.
(load-wide-string): New instruction, for wide strings.
* libguile/vm-i-system.c (define): Move here from vm-i-loader.c, as now it
just takes a sym on the stack.
(make-keyword, make-symbol): New instructions.
* module/language/assembly.scm: Remove removed instructions. No more
width byte in load-string etc.
* module/language/assembly/compile-bytecode.scm (write-bytecode): Adapt
to change in instruction set.
* module/language/glil/compile-assembly.scm (glil->assembly): Compile
define by pushing the sym then emitting (define).
(dump-object): Dump narrow and wide strings differently. Use
make-keyword and make-symbol as appropriate.
* module/language/tree-il/compile-glil.scm (flatten): When compiling a
ref to a primitive (not a call), first see if the primitive is
actually bound in the root module. (That's not the case with e.g.
bytevector-u8-ref).
* module/system/xref.scm (program-callee-rev-vars): Don't parse out
"nexts".
* test-suite/tests/asm-to-bytecode.test ("compiler"): Adapt to bytecode
format change.
This adds full Unicode strings as a datatype, and it adds some
minimal functionality. The terminal and port encoding is assumed
to be ISO-8859-1. Non-ISO-8859-1 characters are written or
input as string character escapes.
The string character escapes now have 3 forms: \xXX \uXXXX and
\UXXXXXX, for unprintable characters that have 2, 4 or 6 hex digits.
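The three escape forms can be illustrated with a small formatter. This is a sketch only, not the code in print.c: the helper name format_char_escape is hypothetical, and both the lowercase hex digits and the choice of the shortest form that fits are assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not part of libguile: write the escape for
   code point C into BUF and return the number of characters written.
   Assumption: the shortest form that can hold C is chosen.  */
static int
format_char_escape (uint32_t c, char *buf, size_t size)
{
  assert (c <= 0xFFFFFF);       /* at most six hex digits */
  if (c <= 0xFF)
    return snprintf (buf, size, "\\x%02x", (unsigned) c);   /* \xXX */
  if (c <= 0xFFFF)
    return snprintf (buf, size, "\\u%04x", (unsigned) c);   /* \uXXXX */
  return snprintf (buf, size, "\\U%06x", (unsigned) c);     /* \UXXXXXX */
}
```

Under these assumptions, code point 0x3bb would come out as six characters (backslash, u, and four hex digits).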
The process for writing to strings has been modified. There is now a
function scm_i_string_start_writing that does the copy-on-write
conversion if necessary.
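The copy-on-write step can be sketched as follows. The types and the function str_start_writing are hypothetical stand-ins; the real stringbuf layout and scm_i_string_start_writing in strings.c differ, and this sketch ignores locking and garbage collection.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types, much simpler than libguile's stringbufs.  */
struct buf { size_t len; int shared; char *chars; };
struct str { struct buf *buf; };

/* Call before mutating S in place: if the buffer is shared with
   another string, give S a private copy first.  */
static void
str_start_writing (struct str *s)
{
  struct buf *old = s->buf;
  struct buf *copy;

  if (!old->shared)
    return;                     /* already private, nothing to do */

  copy = malloc (sizeof *copy);
  assert (copy != NULL);
  copy->len = old->len;
  copy->shared = 0;
  copy->chars = malloc (old->len + 1);
  assert (copy->chars != NULL);
  memcpy (copy->chars, old->chars, old->len);
  s->buf = copy;
}
```

A writer would bracket its mutation with the start and stop calls, so that other strings sharing the old buffer never see the change.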
To compile strings that may be wide, the VM storage of strings and
string-likes has changed.
Most string-using functions have not yet been updated and may break
when used with wide strings.
* module/language/assembly/compile-bytecode.scm (write-bytecode):
use variable-width string bytecode format
* module/language/assembly.scm (byte-length): use variable-width
bytecode format
* libguile/vm-i-loader.c (load-string, load-symbol)
(load-keyword, define): use variable-width bytecode format
* libguile/vm-engine.h (FETCH_WIDTH): new macro
* libguile/strings.h: new declarations
* libguile/strings.c (make_wide_stringbuf): new function
(widen_stringbuf): new function
(scm_i_make_wide_string): new function
(scm_i_is_narrow_string): new function
(scm_i_string_wide_chars): new function
(scm_i_string_start_writing): new function
(scm_i_string_ref): new function
(scm_i_string_set_x): new function
(scm_i_is_narrow_symbol): new function
(scm_i_symbol_wide_chars, scm_i_symbol_ref): new functions
(scm_string_width): new function
(unistring_escapes_to_guile_escapes): new function
(scm_to_stringn): new function
(scm_i_stringbuf_free): modify for wide strings
(scm_i_substring_copy): modify for wide strings
(scm_i_string_chars, scm_string_append): modify for wide strings
(scm_i_make_symbol, scm_to_locale_stringn): modify for wide strings
(scm_string_dump, scm_symbol_dump, scm_to_locale_stringbuf)
(scm_string, scm_i_deprecated_string_chars): modify for wide strings
(scm_from_locale_string, scm_from_locale_stringn): add null test
* libguile/srfi-13.c: add a call to scm_i_string_start_writing for
each call to scm_i_string_stop_writing
(scm_string_for_each): modify for wide strings
* libguile/socket.c: add a call to scm_i_string_start_writing for each
call to scm_i_string_stop_writing
* libguile/rw.c: add a call to scm_i_string_start_writing for each
call to scm_i_string_stop_writing
* libguile/read.c (scm_read_string): allow reading of wide strings
* libguile/print.h: add declaration for scm_charprint
* libguile/print.c (iprin1): print wide strings and add new string
escapes
(scm_charprint): new function
* libguile/ports.h: new declarations for scm_lfwrite_substr and
scm_lfwrite_str
* libguile/ports.c (update_port_lf): new function
(scm_lfwrite): use update_port_lf
(scm_lfwrite_substr): new function
(scm_lfwrite_str): new function
* test-suite/tests/asm-to-bytecode.test ("compiler"): add string
width byte to string-like asm tests
This adds the 32-bit standalone characters. Strings are still
8-bit. For now, characters larger than 8 bits can only be entered or
displayed in octal format, and the terminal's display encoding is
expected to be Latin-1.
* module/language/assembly/compile-bytecode.scm (write-bytecode):
add 32-bit char
* module/language/assembly.scm (object->assembly): add 32-bit char
(assembly->object): add 32-bit char
* libguile/vm-i-system.c (make-char32): new op
* libguile/print.c (iprin1): print 32-bit char
* libguile/numbers.h: add type scm_t_wchar
* libguile/numbers.c: add type scm_t_wchar
* libguile/chars.h: new type scm_t_wchar
(SCM_CODEPOINT_MAX): new
(SCM_IS_UNICODE_CHAR): new
(SCM_MAKE_CHAR): operate on 32-bit char
* libguile/chars.c: comparison operators now use Unicode
codepoints
(scm_c_upcase): now receives and returns scm_t_wchar
(scm_c_downcase): now receives and returns scm_t_wchar
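As a rough illustration of what a 32-bit character type has to check, here is a sketch of a code point validity test. The names below are hypothetical and the definitions in chars.h may differ; the only facts relied on are that Unicode code points end at 0x10FFFF and that the surrogate range 0xD800 to 0xDFFF does not denote characters.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for scm_t_wchar and SCM_CODEPOINT_MAX.  */
typedef int32_t wchar_sketch;
#define CODEPOINT_MAX_SKETCH 0x10FFFF

/* Nonzero if C is a Unicode scalar value (not a surrogate, in range).  */
static int
is_unicode_char (wchar_sketch c)
{
  return (c >= 0 && c <= 0xD7FF)
    || (c >= 0xE000 && c <= CODEPOINT_MAX_SKETCH);
}

/* Return C unchanged, aborting if it is not a valid character.  */
static wchar_sketch
checked_char (wchar_sketch c)
{
  assert (is_unicode_char (c));
  return c;
}
```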
* libguile/objcodes.c (OBJCODE_COOKIE): Bump again, as our jump offsets
are now multiplied by 8.
* libguile/vm-i-system.c (BR): Interpret the 16-bit offset as a relative
jump to the nearest 8-byte-aligned block -- increasing relative jump
range from +/-32K to +/-256K.
(mvra): Do the same for the mvra jump.
* libguile/vm.c (really_make_boot_program): Align the mvra.
* module/language/assembly.scm (align-block): New export, for aligning
blocks.
* module/language/assembly/compile-bytecode.scm (write-bytecode): Emit
jumps to the nearest 8-byte-aligned block. Effectively our range is 18
bits in either direction. I would like to do this differently -- have
long-br and long-br-if, and all the other br instructions go to 8 bits
only. But the assembler doesn't have an appropriate representation to
allow me to do this yet, so for now this is what we have.
* module/language/assembly/decompile-bytecode.scm (decode-load-program):
Decode the 19-bit jumps.
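The arithmetic behind the scaled jumps: a signed 16-bit offset multiplied by 8 covers a signed 19-bit byte range, i.e. 18 bits in either direction. A minimal sketch, assuming the byte distance is measured between two 8-byte-aligned addresses; encode_jump and decode_jump are hypothetical names, and the actual BR code may compute its base address differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: turn a byte distance between two
   8-byte-aligned addresses into the 16-bit offset stored in
   the bytecode.  */
static int16_t
encode_jump (int32_t byte_delta)
{
  assert (byte_delta % 8 == 0);
  assert (byte_delta / 8 >= INT16_MIN && byte_delta / 8 <= INT16_MAX);
  return (int16_t) (byte_delta / 8);
}

/* Inverse: scale the stored offset back up to a byte distance.  */
static int32_t
decode_jump (int16_t offset)
{
  return (int32_t) offset * 8;
}
```

The extremes are -32768 * 8 = -262144 bytes backward and 32767 * 8 = 262136 bytes forward.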
* libguile/objcodes.c (OBJCODE_COOKIE): Bump objcode cookie, as we added
to struct scm_objcode.
* libguile/objcodes.h (struct scm_objcode): Add a uint32 after metalen
and before base, so that if the structure has 8-byte alignment, base
will have 8-byte alignment too. (Before, base was 12 bytes from the
start of the structure, now it's 16 bytes.)
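The effect of the padding word on the layout can be checked with offsetof. The struct below is a sketch: metalen and base are named in the entry above, while the other field names and the name of the padding field are assumptions about the header, and the offsets hold on ABIs where uint32_t is 4-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch of the header; only metalen and base are taken from the
   entry above, the remaining names are assumed.  */
struct objcode_sketch
{
  uint8_t nargs, nrest, nlocs, nexts;   /* 4 bytes */
  uint32_t len;                         /* offset 4 */
  uint32_t metalen;                     /* offset 8 */
  uint32_t unused;                      /* offset 12: the new padding */
  uint8_t base[];                       /* offset 16: the bytecode */
};

/* Return the offset of the bytecode, checking that it is a
   multiple of 8.  */
static size_t
objcode_base_offset (void)
{
  size_t off = offsetof (struct objcode_sketch, base);
  assert (off % 8 == 0);
  return off;
}
```

Without the unused field, base would sit at offset 12, which is only 4-byte aligned.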
* libguile/vm-engine.h (ASSERT_ALIGNED_PROCEDURE): Add a check that can
be turned on with VM_ENABLE_PARANOID_ASSERTIONS.
(CACHE_PROGRAM): Call ASSERT_ALIGNED_PROCEDURE.
* libguile/vm-i-system.c (long-local-ref): Add a missing semicolon.
* libguile/vm.c (really_make_boot_program): Rework to operate directly
on a malloc'd buffer, so that the program will be 8-byte aligned.
* module/language/assembly.scm (*program-header-len*): Add another 4 for
the padding.
(object->assembly): Fix case in which we would return (make-int8 0)
instead of (make-int8:0). This would throw off compile-assembly.scm's
use of addr+.
* module/language/assembly/compile-bytecode.scm (write-bytecode): Write
out the padding int.
* module/language/assembly/decompile-bytecode.scm (decode-load-program):
And pop off the padding int too.
* module/language/glil/compile-assembly.scm (glil->assembly): Don't pack
the assembly, assume that assembly.scm has done it for us. If a
program has a meta, pad out the program so that meta will be aligned.
* test-suite/tests/asm-to-bytecode.test: Adapt to expect programs to
have the extra 4-byte padding int.
* doc/ref/vm.texi (Loading Instructions): Remove references to
load-integer and load-unsigned-integer -- they're still in the VM but
will be removed at some point.
(Data Control Instructions): Add make-int64 and make-uint64.
* libguile/vm-i-loader.c (load-unsigned-integer): Allow 8-byte values.
But this instruction is on its way out, yo.
* libguile/vm-i-system.c (make-int64, make-uint64): New instructions.
* module/language/assembly.scm (object->assembly): Write out make-int64
and make-uint64 instructions, using bytevectors to do the endianness
conversion.
(assembly->object): And pretty-print them back, for disassembly.
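The endianness conversion amounts to splitting the integer into eight bytes in a fixed order, independent of the host. The sketch below assumes big-endian operand order, which this log does not state, and uses hypothetical helper names; the Scheme code does this with bytevectors rather than shifts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store V as eight bytes, most significant
   first (assumed order).  */
static void
int64_to_bytes (int64_t v, uint8_t out[8])
{
  uint64_t u = (uint64_t) v;    /* two's complement bit pattern */
  int i;

  assert (out != NULL);
  for (i = 0; i < 8; i++)
    out[i] = (uint8_t) ((u >> (56 - 8 * i)) & 0xff);
}

/* Inverse: rebuild the integer from its eight bytes.  The final
   conversion relies on the usual two's complement behaviour.  */
static int64_t
bytes_to_int64 (const uint8_t in[8])
{
  uint64_t u = 0;
  int i;

  assert (in != NULL);
  for (i = 0; i < 8; i++)
    u = (u << 8) | in[i];
  return (int64_t) u;
}
```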
* module/language/glil/compile-assembly.scm: Don't generate load-integer
/ load-unsigned-integer instructions.
* module/rnrs/bytevector.scm (rnrs):
* libguile/bytevectors.h:
* libguile/bytevectors.c (scm_uniform_array_to_bytevector): New function.
* libguile/unif.h:
* libguile/unif.c (scm_from_contiguous_typed_array): New function.
* libguile/vm-i-loader.c (load-array): New instruction, for loading byte
data into uniform vectors. Currently it copies out the data, though in
the future we could avoid that.
* module/language/assembly.scm (align-code): New exported function,
aligns code on some boundary.
(align-program): Use align-code.
* module/language/assembly/compile-bytecode.scm (write-bytecode): Support
the load-array instruction.
* module/language/glil/compile-assembly.scm (dump-object): Dump uniform
arrays. Neat :)
* module/language/assembly.scm (align-program): Whoops, align programs
properly.
* module/language/glil/compile-assembly.scm (compile-assembly): Start
with addr=-1, for the unserialized load-program byte.
(glil->assembly): Align programs in all cases.
* module/language/assembly.scm (addr+): New helper.
(align-program): New function, aligns a (load-program) form, currently
to 8-byte boundaries.
* module/language/glil/compile-assembly.scm (<subprogram>): Record the
object table and the program code separately, so that we can align the
program after the object table has been written.
(glil->assembly): Use addr+.
(dump-object): Rework to fold `addr' through dumping of compound
objects, so that procedures can be aligned properly.
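Aligning to an 8-byte boundary comes down to rounding the running address up and emitting that many padding bytes. A sketch in C with hypothetical names; align-program itself is Scheme and works on assembly forms rather than raw addresses.

```c
#include <assert.h>
#include <stddef.h>

/* Round ADDR up to the next multiple of ALIGNMENT, which must be
   a power of two.  */
static size_t
align_up (size_t addr, size_t alignment)
{
  assert (alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (addr + alignment - 1) & ~(alignment - 1);
}

/* Number of padding bytes needed before something placed at ADDR.  */
static size_t
padding_for (size_t addr, size_t alignment)
{
  return align_up (addr, alignment) - addr;
}
```

For example, a program that would otherwise start at address 13 needs 3 bytes of padding to start at 16.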
* libguile/objcodes.c (make_objcode_by_mmap, scm_c_make_objcode_slice):
Verify the lengths with the meta-length.
(scm_objcode_meta): New procedure, for getting at the meta-info of an
objcode.
(scm_objcode_to_bytecode):
(scm_write_objcode): Write bytecode with the metadata too.
* module/system/vm/objcode.scm: Export objcode-meta.
* module/language/assembly.scm (byte-length):
* module/language/assembly/compile-bytecode.scm (write-bytecode):
* module/language/assembly/decompile-bytecode.scm (decode-load-program):
* module/language/assembly/disassemble.scm (disassemble-load-program):
* module/language/glil/compile-assembly.scm (glil->assembly):
* test-suite/tests/asm-to-bytecode.test ("compiler"): Change the
load-program format to have meta-or-#f instead of meta-length, so that
we can serialize the meta as objcode without a load-program byte. Add a
test for writing out the meta.
* module/language/assembly.scm: Refactor a bit; remove the name "code"
from the API, as it's too generic, and replace with "assembly".
* module/language/assembly/compile-bytecode.scm: Get byte lengths via,
well, byte-length.
* module/language/glil/Makefile.am:
* module/language/glil/spec.scm:
* module/language/glil/compile-objcode.scm: Remove compile-objcode, as we
just go through bytecode now.
* module/language/glil/compile-assembly.scm (glil->assembly)
(dump-object): s/object->code/object->assembly/.
* gdbinit: Untested attempts to get the stack fondling macros to deal
with the new program representation.
* libguile/frames.c (scm_vm_frame_arguments, scm_vm_frame_source)
(scm_vm_frame_local_ref, scm_vm_frame_local_set_x): SCM_PROGRAM_DATA is
a struct scm_objcode*.
* libguile/instructions.h:
* libguile/instructions.c: Hide the instruction table and the struct
scm_instruction structure; all access to instructions now goes through
procedures. This is because instructions are no longer in a packed
array indexed by opcode. Also, declare a mask that all instructions
should fit in.
* libguile/objcodes.h:
* libguile/objcodes.c: Rewrite so that object code directly maps its
arity and length from its bytecode. This makes it unnecessary to keep
this information in programs, allowing programs to be simple conses
between the code (objcodes) and data (the object table and the closure
variables).
* libguile/programs.c (scm_make_program): Rework so that make-program
takes objcode, an object table, and externals as arguments. It's much
clearer this way, and we avoid malloc().
* libguile/stacks.c (is_vm_bootstrap_frame): Update for program/objcode
changes.
* libguile/vm-engine.c (vm_run): Initialize the jump table on the first
run, with the opcodes declared in the instruction sources, and with bad
instructions raising an error instead of wandering off into the
Unknown.
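The idea of a lazily initialized table whose unused slots all point at an error handler can be sketched with function pointers. This is not how vm_run is written; the opcodes, the table size, and every name below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { OP_NOP = 0, OP_INC = 1, TABLE_SIZE = 256 };

typedef int (*handler_fn) (int);

static int op_nop (int acc) { return acc; }
static int op_inc (int acc) { return acc + 1; }
static int op_bad (int acc) { (void) acc; return -1; }  /* error path */

static handler_fn table[TABLE_SIZE];
static int table_ready = 0;

/* Run the handler for OP on ACC, building the table on first use.  */
static int
dispatch (unsigned op, int acc)
{
  if (!table_ready)
    {
      size_t i;
      /* Every slot starts out as the error handler...  */
      for (i = 0; i < TABLE_SIZE; i++)
        table[i] = op_bad;
      /* ...and only explicitly declared opcodes are overridden.  */
      table[OP_NOP] = op_nop;
      table[OP_INC] = op_inc;
      table_ready = 1;
    }
  assert (table[op & (TABLE_SIZE - 1)] != NULL);
  return table[op & (TABLE_SIZE - 1)] (acc);
}
```

Masking the opcode before indexing keeps a corrupt byte inside the table, where it hits the error handler.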
* libguile/vm-engine.h (FETCH_LENGTH): Always represent lengths as 3
bytes. The old code was too error-prone.
(NEXT_JUMP): Mask the instruction with SCM_VM_INSTRUCTION_MASK.
(NEW_FRAME): Update for program/objcode changes.
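A fixed 3-byte length is simple to read and write. The sketch below assumes the most significant byte comes first, which this log does not state; fetch_length and write_length are hypothetical functions, whereas FETCH_LENGTH is a macro operating on the VM's instruction pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Read a 3-byte length at *IP (assumed big-endian) and advance
   *IP past it.  */
static uint32_t
fetch_length (const uint8_t **ip)
{
  uint32_t len = *(*ip)++;
  len = (len << 8) | *(*ip)++;
  len = (len << 8) | *(*ip)++;
  return len;
}

/* Write LEN as three bytes in the same order.  */
static void
write_length (uint8_t *out, uint32_t len)
{
  assert (len <= 0xFFFFFF);     /* must fit in 24 bits */
  out[0] = (uint8_t) (len >> 16);
  out[1] = (uint8_t) ((len >> 8) & 0xff);
  out[2] = (uint8_t) (len & 0xff);
}
```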
* libguile/vm-expand.h (VM_DEFINE_FUNCTION, VM_DEFINE_INSTRUCTION)
(VM_DEFINE_LOADER): Update so that we explicitly specify opcodes, so
that we have a stable bytecode API.
* libguile/vm-i-loader.c: Update license to LGPLv2+. Explicitly declare
opcodes.
(load-integer): Use an int instead of a long as the accumulator; still
need to revisit this code at some point, I think.
(load-program): Simplify, thankfully!! Just creates the objcode slice
and rolls with it.
* libguile/vm-i-scheme.c: Number the opcodes explicitly.
* libguile/vm-i-system.c: Update license to LGPLv2+. Explicitly declare
opcodes.
(make-closure): Update for new program API.
* libguile/vm.c (vm_make_boot_program): Update for new program/objcode
API. Still a bit ugly.
(scm_load_compiled_with_vm): Update for new program/objcode API.
* module/language/assembly.scm (byte-length): Fix byte-length calculation
for loaders, and load-program.
(code-pack, code-unpack): Start to move things from (system vm conv)
here.
(object->code, code->object): More things from conv.scm.
* module/language/glil.scm (<glil-program>): Add a new field,
closure-level.
(make-glil-program, compute-closure-level): Calculate the "closure
level" when making a glil program. This is the maximum depth of
external binding refs in this closure.
(unparse-glil): Fix label serialization.
* module/language/glil/compile-assembly.scm (make-meta): Prepend #f for
the meta's object table, though maybe in the future we can avoid
creating assembly in the first place.
(assoc-ref-or-acons, object-index-and-alist): GRRR! Caught again by the
different sets of arguments to assoc and assoc-ref!
(glil->assembly): Attempt to make the <glil-program> case more
readable, and fix the bugs. Sorry I don't know how to comment this
change any more than this.
(glil->assembly): For <glil-module> serialize the whole key, not just
the name.
(dump-object): subprogram-code is already a list. Serialize integers as
strings, not u8vectors. Fix the order of lists and vectors.
* module/language/glil/spec.scm (glil): Switch orders, so we prefer glil
-> assembly -> objcode. Actually glil->objcode doesn't work any more,
needs to be removed I think.
* module/language/objcode/spec.scm (objcode->value):
s/objcode->program/make-program/.
* module/language/scheme/inline.scm: Add acons inline.
* module/system/vm/conv.scm (make-byte-decoder): Skip the first 8 bytes,
they are header. Handle subprograms properly. Still needs help though.
(decode-length): Lengths are always 3 bytes now.
* module/system/vm/disasm.scm: Superficial changes to keep things
working. I'd like to fix this better in the future.
* module/system/vm/frame.scm (bootstrap-frame?): Fixes for
program-bytecode.
* module/system/vm/program.scm: Export make-program. It's program-objcode
now, no more program-bytecode.
* module/system/vm/vm.scm (vm-load): Use make-program.
* test-suite/tests/asm-to-bytecode.test: New test, very minimal.
* module/system/vm/objcode.scm: Export word-size, byte-order, and
write-objcode.
* configure.in:
* module/language/Makefile.am:
* module/language/assembly/Makefile.am: Automakery.
* module/language/assembly.scm:
* module/language/assembly/spec.scm: Add a new language, which is oddly
even lower than GLIL. I got tired of GLIL's terrible
compile-objcode.scm, and wanted a cleaner intermediate format.
* module/language/glil/compile-assembly.scm: A purely-functional
assembler, that produces "assembly". Will document later.
* module/language/glil/spec.scm: Declare the compiler to assembly.