* libguile/instructions.c (FOR_EACH_INSTRUCTION_WORD_TYPE): Add word
types for immediate f64 and u64 values.
(TYPE_WIDTH): Bump up by a bit, now that we have 32 word types.
(NOP, parse_instruction): Use 64-bit meta type.
* libguile/vm-engine.c (load-f64, load-u64): New instructions.
* module/language/bytecode.scm (compute-instruction-arity): Add parser
for new instruction word types.
* module/language/cps/compile-bytecode.scm (compile-function): Add
special-cased assemblers for new instructions, and also for scm->u64
and u64->scm which I missed before.
* module/language/cps/effects-analysis.scm (load-f64, load-u64): New
instructions.
* module/language/cps/slot-allocation.scm (compute-needs-slot): load-f64
and load-u64 don't need slots.
(compute-var-representations): Update for new instructions.
* module/language/cps/specialize-primcalls.scm (specialize-primcalls):
Specialize scm->f64 and scm->u64 to make-f64 and make-u64.
* module/language/cps/types.scm (load-f64, load-u64): Wire up to type
inference, though currently type inference only runs before
specialization.
* module/language/cps/utils.scm (compute-defining-expressions): For some
reason I don't understand, it's possible to see two definitions that
are equal but not `equal?' here. Allow for now.
(compute-constant-values): Punch through type conversions to get
constant u64/f64 values.
* module/system/vm/assembler.scm (assembler): Support for new word
types. Export the new assemblers.
* libguile/vm-engine.c (add/immediate, sub/immediate)
(uadd/immediate, usub/immediate, umul/immediate): New instructions.
* module/language/cps/compile-bytecode.scm (compile-function):
* module/language/cps/slot-allocation.scm (compute-needs-slot):
* module/language/cps/types.scm:
* module/system/vm/assembler.scm (system):
* module/language/cps/effects-analysis.scm: Support for new
instructions.
* module/language/cps/optimize.scm (optimize-first-order-cps): Move
primcall specialization to the last step -- the only benefit of doing
it earlier was easier reasoning about side effects, and we're already
doing that in a more general way with (language cps types).
* module/language/cps/specialize-primcalls.scm (specialize-primcalls):
Specialize add and sub to add/immediate and sub/immediate, and
specialize u64 addition as well. U64 specialization doesn't work now
though because computing constant values doesn't work for U64s; oh
well.
* libguile/vm-engine.c: Remove add1 and sub1 instructions. Will replace
with add/immediate and sub/immediate.
* module/language/tree-il/peval.scm (peval): If we reify a new
<primcall>, expand it. Removes 1- and similar primcalls.
* module/language/tree-il/primitives.scm: Don't specialize (+ x 1) to 1+.
(expand-primcall): New export, does a single primcall expansion.
(expand-primitives): Use the new helper.
* module/language/cps/effects-analysis.scm:
* module/language/cps/primitives.scm:
* module/language/cps/types.scm:
* module/system/vm/assembler.scm: Remove support for add1 and sub1 CPS
primitives.
* test-suite/tests/peval.test ("partial evaluation"): Adapt tests that
expect 1+/1- to expect +/-.
* module/language/cps/types.scm (vector-ref, vector-set!)
(string-ref, string-set!, struct-ref, struct-set!)
(define-bytevector-accessors, define-bytevector-uaccessors): Clamp
range of object and index to be within the range of indices, with a
maximum of *max-size-t*.
* module/language/cps/types.scm (*max-size-t*): New definition.
(type-entry-saturating-union): Saturate more slowly, first stopping at
[0,*max-size-t*] then at [&range-min, &range-max] before saturating to
[-inf.0, +inf.0]. This allows most offset phi variables to have their
range inferred within the u64 range.
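
The staged widening described for type-entry-saturating-union can be
sketched in C as below. This is only an illustration of the idea, not
the Scheme code in types.scm: the names range_t, widen_min, widen_max
and saturating_union are hypothetical, only the range half of a type
entry is modelled, and the three SKETCH_* bounds are placeholder values
chosen to be exactly representable as doubles rather than the real
values of *max-size-t*, &range-min and &range-max.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical range record: only the [min, max] part of a type entry. */
typedef struct { double min, max; } range_t;

/* Placeholder bounds (assumptions, not Guile's actual constants). */
static const double SKETCH_MAX_SIZE_T = 4294967295.0;           /* 2^32 - 1 */
static const double SKETCH_RANGE_MIN  = -9223372036854775808.0; /* -2^63 */
static const double SKETCH_RANGE_MAX  = 18446744073709551616.0; /* 2^64 */

/* If the lower bound moved down, jump to the next coarser stop:
   0, then the range minimum, then -inf. */
double widen_min(double old_min, double new_min)
{
  if (new_min >= old_min) return old_min;
  if (new_min >= 0.0) return 0.0;
  if (new_min >= SKETCH_RANGE_MIN) return SKETCH_RANGE_MIN;
  return -INFINITY;
}

/* If the upper bound moved up, jump to the next coarser stop:
   the size_t maximum, then the range maximum, then +inf. */
double widen_max(double old_max, double new_max)
{
  if (new_max <= old_max) return old_max;
  if (new_max <= SKETCH_MAX_SIZE_T) return SKETCH_MAX_SIZE_T;
  if (new_max <= SKETCH_RANGE_MAX) return SKETCH_RANGE_MAX;
  return INFINITY;
}

/* Union of a previous range A with a newly flowed-in range B. */
range_t saturating_union(range_t a, range_t b)
{
  range_t r;
  assert(a.min <= a.max && b.min <= b.max);
  r.min = widen_min(a.min, b.min);
  r.max = widen_max(a.max, b.max);
  return r;
}
```

With stops like these, a loop index that starts at 0 and keeps growing
settles at [0, SKETCH_MAX_SIZE_T] after one widening step instead of
going straight to +inf.
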
* module/language/cps/specialize-numbers.scm
(compute-specializable-vars): Refactor to work on any kind of
unboxable value, not just f64 values.
(compute-specializable-f64-vars, compute-specializable-u64-vars): New
helpers.
(apply-specialization): Support for u64 values.
* module/language/cps/specialize-numbers.scm
(compute-specializable-u64-vars): New stub.
* module/language/cps/specialize-numbers.scm
(compute-specializable-phis): Rename from
compute-specializable-f64-phis, and return an intmap instead of an
intset. The values distinguish f64 from u64 vars.
* module/language/cps/specialize-numbers.scm (apply-specialization):
Start of u64 phi unboxing.
* module/language/cps/specialize-numbers.scm (specialize-phis):
(specialize-numbers): Adapt.
* module/language/cps/specialize-numbers.scm
(specialize-u64-comparison): New function.
* module/language/cps/specialize-numbers.scm (specialize-operations):
Rename from specialize-f64-operations, as it will specialize both
kinds. Add a case to specialize u64 comparisons.
* module/language/cps/specialize-numbers.scm (specialize-numbers): Adapt
to specialize-operations name change.
* module/language/tree-il/compile-cps.scm (convert): bv-f32-ref,
bv-f32-set!, bv-f64-ref, and bv-f64-set! take the index as an untagged
u64 value.
* module/language/cps/types.scm (define-bytevector-uaccessors): New
helper, used while migrating bytevectors to take unboxed indexes.
Adapt f32/f64 accessors to use this definition helper.
* libguile/vm-engine.c (BV_FLOAT_REF, BV_FLOAT_SET): The index is
unboxed.
* module/language/cps/types.scm (*min-s32*, *max-s32*): Remove unused
definitions.
(&range-min, &range-max): New definitions, replacing min-fixnum and
max-fixnum as the bounds of precise range analysis.
(type-entry-min, type-entry-max): Store inf values directly as
-inf.0/+inf.0.
(type-entry-clamped-min, type-entry-clamped-max): Remove, as they are
no longer needed.
(clamp-min, clamp-max, make-type-entry): Clamp minimum and maximum
half-ranges in different ways.
(type-entry-union, type-entry-saturating-union)
(type-entry-intersection): Adapt to type-entry-min / type-entry-max
change.
(bv-u32-ref, bv-u32-set!):
(bv-s32-ref, bv-s32-set!):
(bv-u64-ref, bv-u64-set!):
(bv-s64-ref, bv-s64-set!): Precise range inference. This will allow
robust unboxing.
(ash): Infer 64-bit shifts.
* module/language/cps/compile-bytecode.scm (compile-function): Always
define a 'closure binding in slot 0.
* module/system/vm/frame.scm (available-bindings): No need to futz
around not having a closure binding.
* module/system/vm/debug.scm (arity-arguments-alist): Expect a closure
binding.
* test-suite/tests/rtl.test: Emit definitions for the closure.
* module/language/cps/compile-bytecode.scm (compile-function):
* module/language/cps/primitives.scm (*branching-primcall-arities*):
* module/language/cps/type-fold.scm (equal?):
* module/language/cps/types.scm (equal?):
* module/language/tree-il/compile-cps.scm (convert): `equal?' is no
longer a branching primcall, because it isn't inline. The
implementation could lead to bad backtraces also, as it didn't save
the IP, and actually could lead to segfaults as it didn't reload the
SP after the return. There is an eqv? fast-path, though.
* module/system/vm/assembler.scm (br-if-equal): Remove interface.
* module/system/vm/disassembler.scm (code-annotation):
(compute-labels): No need to handle br-if-equal.
* module/language/cps/specialize-numbers.scm (apply-f64-specialization):
Remove printout. I didn't see any when compiling Guile, which means
that probably this optimization doesn't hit for any code in Guile
itself, sadly :P
* module/language/cps/specialize-numbers.scm: New pass, to turn "add"
into "fadd", and similarly for sub, mul, and div.
* module/language/cps/optimize.scm:
* module/Makefile.am:
* bootstrap/Makefile.am: Wire up the new pass.
* libguile/vm-engine.c (fadd, fsub, fmul, fdiv): New instructions.
* module/language/cps/effects-analysis.scm:
* module/language/cps/types.scm: Wire up support for new instructions.
* module/system/vm/assembler.scm: Export emit-fadd and friends.
* module/language/tree-il/compile-cps.scm (convert): Box results of
bv-f32-ref and bv-f64-ref. Unbox the argument to bv-f32-set! and
bv-f64-set!.
* libguile/vm-engine.c (bv-f32-ref, bv-f64-ref): Results are raw.
(bv-f32-set!, bv-f64-set!): Take unboxed arguments.
* module/system/vm/assembler.scm (emit-scm->f64, emit-f64->scm):
Export.
* module/language/cps/compile-bytecode.scm (compile-function):
* module/language/cps/effects-analysis.scm: Add support for scm->f64 and
f64->scm.
* module/language/cps/slot-allocation.scm (compute-var-representations):
Add cases for primops returning raw values.
* module/language/cps/types.scm (bv-f32-ref, bv-f32-set!)
(bv-f64-ref, bv-f64-set!): Deal in &f64 values instead of reals.
* module/language/cps/types.scm (&f64): New type, for untagged f64
values. Having a distinct type prevents type folding from replacing
an untagged 3.0 with a tagged 3.0.
(scm->f64, f64->scm): Support these new primcalls.
* libguile/loader.c (scm_find_slot_map_unlocked): Rename from
scm_find_dead_slot_map_unlocked.
* libguile/vm.c (struct slot_map_cache_entry, struct slot_map_cache)
(find_slot_map): Rename, changing "dead_slot" to "slot".
(enum slot_desc): New type.
(scm_i_vm_mark_stack): Interpret slot maps as having two bits per
slot, allowing us to indicate that a slot is live but not a pointer.
* module/language/cps/compile-bytecode.scm (compile-function): Adapt to
emit-slot-map name change.
* module/system/vm/assembler.scm (<asm>): Rename dead-slot-maps field to
slot-maps.
(emit-slot-map): Rename from emit-dead-slot-map.
(link-frame-maps): 2 bits per slot.
* module/language/cps/slot-allocation.scm (lookup-slot-map): Rename from
lookup-dead-slot-map.
(compute-var-representations): New function.
(allocate-slots): Adapt to encode two-bit slot representations.
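
A minimal C sketch of a two-bits-per-slot map, as an illustration of
the encoding these entries describe. The enum values, the packing order
(four slots per byte, low bits first) and all sketch_* names are
assumptions made for the example; the actual layout is whatever
link-frame-maps emits and scm_i_vm_mark_stack reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical slot descriptions; each one fits in two bits. */
enum sketch_slot_desc {
  SKETCH_SLOT_DEAD     = 0,  /* nothing live in the slot */
  SKETCH_SLOT_LIVE_RAW = 1,  /* live, but not a pointer (e.g. an f64) */
  SKETCH_SLOT_LIVE_PTR = 2   /* live and must be traced by the GC */
};

/* Store description D for SLOT, four slots per byte. */
void sketch_slot_map_set(uint8_t *map, size_t slot, enum sketch_slot_desc d)
{
  unsigned shift = (unsigned) (slot % 4) * 2;
  assert((unsigned) d <= 3u);
  map[slot / 4] = (uint8_t) ((map[slot / 4] & ~(3u << shift))
                             | ((unsigned) d << shift));
}

/* Fetch the description for SLOT. */
enum sketch_slot_desc sketch_slot_map_ref(const uint8_t *map, size_t slot)
{
  return (enum sketch_slot_desc) ((map[slot / 4] >> ((slot % 4) * 2)) & 3u);
}

/* A marker would trace only slots that are live pointers. */
int sketch_should_mark(const uint8_t *map, size_t slot)
{
  return sketch_slot_map_ref(map, slot) == SKETCH_SLOT_LIVE_PTR;
}
```

A one-bit dead/live map cannot express the middle case, which is why
raw f64 and u64 values in the frame need the wider encoding.
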
* module/language/cps/compile-bytecode.scm (compile-function): Remove
special cases for nullary and unary returns; instead always use
return-values and rely on hinting to try to place values in the right
slot already.
* module/system/vm/assembler.scm (emit-init-constants): Use
return-values.
* module/system/vm/disassembler.scm (code-annotation): Add annotation
for return-values.
* libguile/vm-engine.c (return-values): Change to also reset the frame,
if nlocals is nonzero.
* doc/ref/vm.texi (Procedure Call and Return Instructions): Updated
docs.
* module/language/cps/compile-bytecode.scm (compile-function): Adapt to
call emit-return-values with the right number of arguments.
* module/language/cps/cse.scm (compute-truthy-expressions):
(compute-equivalent-subexpressions):
(eliminate-common-subexpressions): Refactor to be able to work on
first-order CPS.
* module/language/cps/optimize.scm (define-optimizer)
(optimize-higher-order-cps, optimize-first-order-cps): Obfuscate a bit
so that the bootstrap build won't have to expand optimization passes.
Might marginally speed up the bootstrap process.
* module/scripts/compile.scm (%options): Resurrect -O option and make it
follow GCC, more or less. The default is equivalent to -O2.
* module/language/cps/compile-bytecode.scm (lower-cps):
* module/language/cps/optimize.scm (optimize-higher-order-cps): Move
split-rec to run unconditionally for now, as closure conversion fails
without it.
(define-optimizer): Only verify the result if we are debugging, to
save time.
(cps-default-optimization-options): New exported procedure.
* module/language/tree-il/optimize.scm
(tree-il-default-optimization-options): New exported procedure.
* libguile/vm-engine.c: S24/S12/S8 operands addressed relative to the
SP, not the FP. Cache the SP instead of a FP-relative locals
pointer. Further cleanups to follow.
* libguile/vm.c (vm_builtin_call_with_values_code): Adapt to mov operand
addressing change.
* module/language/cps/compile-bytecode.scm (compile-function): Reify
SP-relative local indexes where appropriate.
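
As a rough illustration of reifying an SP-relative index, the sketch
below converts an FP-relative local index into an SP-relative one. It
assumes a frame of known size whose locals are contiguous, with
FP-relative slot 0 being the slot farthest from the SP; the function
name sketch_fp_to_sp_index is hypothetical and is not taken from
compile-bytecode.scm.

```c
#include <assert.h>
#include <stdint.h>

/* Convert an FP-relative local index to an SP-relative one, under the
   assumption that slot 0 (FP-relative) sits farthest from the SP and
   slot frame_size - 1 sits at the SP itself. */
uint32_t sketch_fp_to_sp_index(uint32_t frame_size, uint32_t fp_index)
{
  assert(fp_index < frame_size);
  return frame_size - 1 - fp_index;
}
```

Under this assumption the mapping is its own inverse, so the same
helper also converts back from an SP-relative index. It depends on the
frame size at that point in the code, so the same local can have a
different SP-relative index after the frame is resized.
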
* module/system/vm/assembler.scm (emit-fmov*): New helper, exported as
emit-fmov.
(shuffling-assembler, define-shuffling-assembler): Rewrite to shuffle
via push/pop/drop.
(standard-prelude, opt-prelude, kw-prelude): No need to provide for
shuffling args.
* test-suite/tests/rtl.test: Update.
* module/language/cps/slot-allocation.scm: Don't reserve slots 253-255.
* libguile/vm-engine.c: Renumber opcodes, and take the opportunity to
fold recent additions into more logical places. Be more precise when
describing the encoding of operands, to shuffle local references only
and not constants, immediates, or other such values.
(SP_REF, SP_SET): New helpers.
(BR_BINARY, BR_ARITHMETIC): Take full 24-bit operands. Our shuffle
strategy is to emit push when needed to bring far locals near, then
pop afterwards, shuffling away far destination values as needed; but
that doesn't work for conditionals, unless we introduce a trampoline.
Let's just do the simple thing for now. Native compilation will use
condition codes.
(push, pop, drop): Back from the dead! We'll only use these for
temporary shuffling though, when an opcode can't address the full
24-bit range.
(long-fmov): New instruction, like long-mov but relative to the frame
pointer.
(load-typed-array, make-array): Don't use a compressed encoding so
that we can avoid the shuffling case. It would be a pain, given that
they have so many operands already.
* module/language/bytecode.scm (compute-instruction-arity): Update for
new instruction word encodings.
* module/system/vm/assembler.scm: Update to expose some opcodes
directly, without the need for shuffling wrappers. Adapt to
instruction word encodings change.
* module/system/vm/disassembler.scm (disassembler): Adapt to instruction
coding change.
Fixes <http://bugs.gnu.org/21614>.
Reported by tantalum <sph@posteo.eu>.
* module/language/tree-il/compile-cps.scm (convert): Add missing 'cps'
argument to the continuation passed to 'convert-arg'.
* module/language/cps/compile-bytecode.scm (compute-forwarding-labels):
Analyze forwarding labels before emitting code. This lets us elide
conts that cause no shuffles, allowing more fallthrough.
* module/language/cps/peel-loops.scm: New pass. Only enabled if the
loop has one successor.
* module/language/cps/optimize.scm: Peel instead of doing LICM on
higher-order CPS, then LICM on first-order CPS.
* module/Makefile.am: Wire up new pass.
* module/language/cps/utils.scm (solve-flow-equations): Revert to take
separate in and out maps. Take an optional initial worklist.
* module/language/cps/slot-allocation.scm: Adapt to solve-flow-equations
change.