This reduces total build time to around 30 minutes.
* Makefile.am (SUBDIRS): Visit bootstrap/ before module/.
* bootstrap/Makefile.am: New file.
* configure.ac: Generate bootstrap/Makefile.
* meta/uninstalled-env.in (top_builddir): Add bootstrap/ to the
GUILE_LOAD_COMPILED_PATH.
* module/Makefile.am: Simplify to just sort files in alphabetical order;
since bootstrap/ was already compiled, we don't need to try to
optimize compilation order. Although the compiler will get faster as
more of the compiler itself is optimized, this isn't a significant
enough effect to worry about.
* module/scripts/compile.scm (%options): Resurrect -O option and make it
follow GCC, more or less. The default is equivalent to -O2.
* module/language/cps/compile-bytecode.scm (lower-cps):
* module/language/cps/optimize.scm (optimize-higher-order-cps): Move
split-rec to run unconditionally for now, as closure conversion fails
without it.
(define-optimizer): Only verify the result if we are debugging, to
save time.
(cps-default-optimization-options): New exported procedure.
* module/language/tree-il/optimize.scm
(tree-il-default-optimization-options): New exported procedure.
* doc/ref/vm.texi: Update for new stack layout.
* module/system/vm/disassembler.scm (code-annotation): Print the frame
sizes after alloc-frame, reset-frame, etc. to make reading the
disassembly easier.
* module/system/vm/disassembler.scm (define-stack-effect-parser)
(stack-effect-parsers, instruction-stack-size-after): New stack size
facility.
(define-clobber-parser, clobber-parsers, instruction-slot-clobbers):
Take incoming and outgoing stack sizes as arguments to interpret
SP-relative clobbers.
* module/system/vm/frame.scm (compute-frame-sizes): New helper that
computes frame sizes for each position in a function.
(compute-killv): Adapt to compute the clobbered set given the computed
frame sizes.
* libguile/vm-engine.c: S24/S12/S8 operands are now addressed relative
to the SP, not the FP. Cache the SP instead of an FP-relative locals
pointer. Further cleanups to follow.
* libguile/vm.c (vm_builtin_call_with_values_code): Adapt to mov operand
addressing change.
* module/language/cps/compile-bytecode.scm (compile-function): Reify
SP-relative local indexes where appropriate.
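
To illustrate what reifying an SP-relative index involves, here is a
minimal sketch in Python. It assumes a frame of frame_size slots in
which FP-relative slot 0 is the slot farthest from the stack pointer,
so the conversion is a simple mirror. The function name and the exact
formula are assumptions for illustration, not code taken from
compile-bytecode.scm.

```python
def sp_relative_index(fp_slot: int, frame_size: int) -> int:
    """Convert an FP-relative slot number to an SP-relative one.

    Assumption (hypothetical layout): FP-relative slot 0 is the slot
    farthest from SP, and slot frame_size - 1 is the slot at SP.
    """
    if not 0 <= fp_slot < frame_size:
        raise ValueError("slot lies outside the frame")
    return frame_size - 1 - fp_slot
```

Under this assumption the mapping is its own inverse, so the same
helper converts an SP-relative index back to an FP-relative one.
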
* module/system/vm/assembler.scm (emit-fmov*): New helper, exported as
emit-fmov.
(shuffling-assembler, define-shuffling-assembler): Rewrite to shuffle
via push/pop/drop.
(standard-prelude, opt-prelude, kw-prelude): No need to provide for
shuffling args.
* test-suite/tests/rtl.test: Update.
* module/language/cps/slot-allocation.scm: Don't reserve slots 253-255.
* libguile/vm-engine.c: Renumber opcodes, and take the opportunity to
fold recent additions into more logical places. Be more precise when
describing the encoding of operands, to shuffle local references only
and not constants, immediates, or other such values.
(SP_REF, SP_SET): New helpers.
(BR_BINARY, BR_ARITHMETIC): Take full 24-bit operands. Our shuffle
strategy is to emit push when needed to bring far locals near, then
pop afterwards, shuffling away far destination values as needed; but
that doesn't work for conditionals, unless we introduce a trampoline.
Let's just do the simple thing for now. Native compilation will use
condition codes.
(push, pop, drop): Back from the dead! We'll only use these for
temporary shuffling though, when an opcode can't address the full
24-bit range.
(long-fmov): New instruction, like long-mov but relative to the frame
pointer.
(load-typed-array, make-array): Don't use a compressed encoding so
that we can avoid the shuffling case. It would be a pain, given that
they have so many operands already.
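
The push/pop/drop shuffling strategy described above can be sketched
as follows. This is an illustrative Python model, not the assembler's
actual code: the 12-bit operand limit, the tuple instruction format,
the helper names, and the small interpreter are all assumptions. The
one detail worth noticing is that each push moves every existing slot
one position further from the SP, so the source index must be
adjusted by one.

```python
NEAR_LIMIT = 1 << 12  # assumed: a compressed operand holds 12 bits


def emit_unary(op, dst, src):
    """Return instructions computing dst = op(src), SP-relative."""
    if dst < NEAR_LIMIT and src < NEAR_LIMIT:
        return [(op, dst, src)]
    return [
        ("push", dst),      # reserve a near slot for the result
        ("push", src + 1),  # src is one slot further after the push
        (op, 1, 0),         # operate on the two near temporaries
        ("drop", 1),        # discard the copied source
        ("pop", dst),       # store the result into the far slot
    ]


def run(code, stack, ops):
    """Interpret code over a copy of stack; stack[0] is the SP slot."""
    stack = list(stack)
    for insn in code:
        name = insn[0]
        if name == "push":
            stack.insert(0, stack[insn[1]])
        elif name == "drop":
            del stack[:insn[1]]
        elif name == "pop":
            value = stack.pop(0)
            stack[insn[1]] = value
        else:
            stack[insn[1]] = ops[name](stack[insn[2]])
    return stack
```

A conditional branch cannot use this pattern without a trampoline,
because the trailing pop would have to run on both branch targets,
which is the reason given above for BR_BINARY and BR_ARITHMETIC
taking full 24-bit operands.
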
* module/language/bytecode.scm (compute-instruction-arity): Update for
new instruction word encodings.
* module/system/vm/assembler.scm: Update to expose some opcodes
directly, without the need for shuffling wrappers. Adapt to
instruction word encodings change.
* module/system/vm/disassembler.scm (disassembler): Adapt to instruction
coding change.
Fixes <http://bugs.gnu.org/21614>.
Reported by tantalum <sph@posteo.eu>.
* module/language/tree-il/compile-cps.scm (convert): Add missing 'cps'
argument to the continuation passed to 'convert-arg'.
* module/language/cps/compile-bytecode.scm (compute-forwarding-labels):
Analyze forwarding labels before emitting code. This lets us elide
conts that cause no shuffles, allowing more fallthrough.
* module/language/cps/peel-loops.scm: New pass. Only enabled if the
loop has one successor.
* module/language/cps/optimize.scm: Peel instead of doing LICM on
higher-order CPS, then LICM on first-order CPS.
* module/Makefile.am: Wire up new pass.
* module/language/cps/utils.scm (solve-flow-equations): Revert to take
separate in and out maps. Take an optional initial worklist.
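
For readers unfamiliar with this kind of solver, here is a minimal
worklist sketch in Python that keeps separate in and out maps and
accepts an optional initial worklist. It is a generic forward
data-flow solver using set union as the meet; the parameter names and
the union/subtract choices are assumptions for illustration and do
not mirror the actual signature of solve-flow-equations.

```python
from collections import deque


def solve_flow_equations(succs, kill, gen, init=frozenset(), worklist=None):
    """Solve out[n] = (in[n] - kill[n]) | gen[n] to a fixed point.

    succs maps each label to its successor labels.  Every label starts
    with in and out sets equal to init.  If worklist is None, all
    labels are visited at least once.
    """
    in_map = {label: set(init) for label in succs}
    out_map = {label: set(init) for label in succs}
    pending = deque(succs if worklist is None else worklist)
    while pending:
        label = pending.popleft()
        out_map[label] = (in_map[label] - kill[label]) | gen[label]
        for succ in succs[label]:
            merged = in_map[succ] | out_map[label]
            if merged != in_map[succ]:
                # The successor's input grew, so revisit it.
                in_map[succ] = merged
                pending.append(succ)
    return in_map, out_map
```

Passing a smaller initial worklist is only safe when the caller knows
the remaining labels are already at their fixed point.
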
* module/language/cps/slot-allocation.scm: Adapt to solve-flow-equations
change.
* module/language/cps/rotate-loops.scm (rotate-loop): Instead of
restricting rotation to loops with just one exit node, restrict to
loops with just one exit successor.
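
The difference between the two restrictions can be shown with a small
Python sketch. An exit node is a label inside the loop that has a
successor outside it; an exit successor is a label outside the loop
that is reached from inside. The function name and the graph
representation are hypothetical and not taken from rotate-loops.scm.

```python
def loop_exits(loop, successors):
    """Return (exit_nodes, exit_successors) for a set of loop labels."""
    exit_nodes = set()
    exit_successors = set()
    for label in loop:
        outside = [s for s in successors.get(label, ()) if s not in loop]
        if outside:
            exit_nodes.add(label)
            exit_successors.update(outside)
    return exit_nodes, exit_successors
```

A loop in which labels 1 and 2 both branch to the same outside label
4 has two exit nodes but only one exit successor, so it fails the old
test and passes the new one.
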
* module/language/cps/slot-allocation.scm (compute-lazy-vars):
(compute-live-variables): Adapt to solve-flow-equations interface
change.
* module/language/cps/utils.scm (solve-flow-equations): Move here. Use
an init value instead of an init map.
* module/language/cps/intset.scm (intset-intersect): Remove new-leaf
procedure, inlining to single call site. An empty intersection
properly produces #f so that the set can be pruned.
* module/language/cps2/reify-primitives.scm (uniquify-receive):
(reify-primitives): Ensure that $kreceive conts can have only one
predecessor. Otherwise return shuffles are incorrectly allocated.
* module/system/repl/debug.scm (print-frame): Pass #:top-frame? #t for
the top frame.
* module/system/vm/frame.scm (available-bindings): Be permissive and
allow #:top-frame? #f even when the IP is at the start of the
function.
* module/language/cps2/utils.scm (compute-successors): kfun is
optional.
(compute-sorted-strongly-connected-components): New function, moved
from split-rec.scm. Doesn't assume that 0 is a free node identifier.
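
As background, strongly connected components can be computed with
Tarjan's algorithm, sketched here in Python. This is a generic
textbook version, not a translation of the Scheme code: the function
name and the dict-based graph are assumptions, and the ordering shown
(each component emitted after the components it can reach) may differ
from the order the utils.scm procedure returns. Like the new
procedure, it treats 0 as an ordinary node identifier rather than
reserving it.

```python
def sorted_sccs(succs):
    """Return the SCCs of a graph given as {node: [successor, ...]}."""
    index = {}        # discovery order of each visited node
    low = {}          # smallest index reachable from the node
    stack = []
    on_stack = set()
    components = []

    def visit(node):
        index[node] = low[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for succ in succs.get(node, ()):
            if succ not in index:
                visit(succ)
                low[node] = min(low[node], low[succ])
            elif succ in on_stack:
                low[node] = min(low[node], index[succ])
        if low[node] == index[node]:
            # node is the root of a component; pop its members.
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            components.append(component)

    for node in succs:
        if node not in index:
            visit(node)
    return components
```

The recursive form is fine for small graphs; a large function graph
would call for an explicit stack to avoid Python's recursion limit.
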
* module/language/cps2/split-rec.scm
(compute-sorted-strongly-connected-components): Remove, use utils.scm
version instead.
* module/language/cps2/closure-conversion.scm (intset-select): Remove
unused function.
* module/language/cps/intset.scm (intset-prev): New function.
(make-intset-folder): Add forward? argument like make-intmap-folder.
(intset-fold-right): New function.
* module/language/cps2/optimize.scm: Move comments here from
cps/compile-bytecode.scm.
* module/language/cps/compile-bytecode.scm: Remove optimization and
closure conversion calls, since CPS2 does this for us.
* module/language/cps2/compile-cps.scm (compile-cps): Use set! to save
memory at bootstrap-time. Optimize first-order CPS, to get rid of
strangeness introduced in closure conversion.
* module/language/cps2/dce.scm (compute-live-code): Use the live-labels
set to indicate function liveness. $closure and $callk mark their
associated functions as live.
(process-eliminations): Handle $closure.
* module/language/cps2/effects-analysis.scm (expression-effects): Handle
$closure.
* module/language/cps2/utils.scm (compute-reachable-functions): New
function.
* module/language/cps2/verify.scm (check-label-partition)
(compute-reachable-labels): Use the new function.
* module/language/cps2/simplify.scm (compute-singly-referenced-vars):
Allow $closure.
(compute-eta-reductions, compute-beta-reductions): Use
compute-reachable-functions, which besides being a simplification also
allows simplification to work on first-order CPS.
* module/language/cps2/closure-conversion.scm
(rewrite-shared-closure-calls): Fix to make shared closures call the
right label.
(closure-label): New helper.
(prune-free-vars): If a shared closure is not well-known, don't use
the alias optimization.
(convert-one): Fix for shared closures with one not-well-known
closure.
* module/language/cps/compile-bytecode.scm (compile-bytecode): Only
convert closures if the #:cps2-convert? option is not passed.
* module/language/cps2/compile-cps.scm (conts->fun*, compile-cps): Add
support for CPS2 closure conversion, disabled by default.