* module/language/cps/dfg.scm (lookup-cont, lookup-block)
(lookup-def, constant-needs-allocation?): Rework these accessors to
avoid completely destructuring the $dfg.
* module/language/cps/compile-bytecode.scm (compile-bytecode): Renumber
each function before compiling it, so that the vars and labels are
contiguous within each function.
* module/language/cps/dfg.scm ($dfg): Change to store conts, blocks, and
use-maps as vectors. A DFG also records the minimum label, the minimum
variable, and the number of labels and variables. The first entry in
each of these vectors corresponds to the minimum label or variable.
This can be optimal in the local case if the conts and variables have
been renumbered contiguously.
Adapt callers.
(compute-live-variables): Adapt. This is currently suboptimal but it
works, so it's a useful base for optimization.
* module/language/cps/dfg.scm (lookup-cont): Change to take a DFG
instead of a cont table.
(build-cont-table): Change to return a vector.
* module/language/cps/arities.scm:
* module/language/cps/contification.scm:
* module/language/cps/dce.scm:
* module/language/cps/effects-analysis.scm:
* module/language/cps/elide-values.scm:
* module/language/cps/reify-primitives.scm:
* module/language/cps/simplify.scm:
* module/language/cps/slot-allocation.scm: Adapt to lookup-cont and
build-cont-table changes.
* module/language/cps.scm (make-cont-folder): Add global? parameter, and
make public.
(fold-conts): Adapt.
(fold-local-conts): Use make-cont-folder, and take a function instead
of a continuation.
* module/language/cps/arities.scm (fix-clause-arities, fix-arities*):
* module/language/cps/compile-bytecode.scm (collect-conts):
* module/language/cps/elide-values.scm (elide-values*): Adapt to
fold-local-conts change.
* module/language/cps/simplify.scm (compute-beta-reductions)
(beta-reduce): Separate state into two tables, so that we can relax the
current guarantee that vars and labels are mutually unique.
* module/language/tree-il/compile-cps.scm (fold-formals)
(unbound?, init-default-value, convert): Arrange to rename incoming
gensyms to small integers.
(canonicalize): Convert vector and abort here too.
* .dir-locals.el: Add with-fresh-name-state.
* module/language/cps.scm (fresh-label, fresh-var): Signal an error if
the counters are not initialized.
(with-fresh-name-state): New macro.
(make-cont-folder): New macro, generates an n-ary folder.
(compute-max-label-and-var): New function, uses make-cont-folder.
(fold-conts): Use make-cont-folder.
(let-gensyms): Remove.
* module/language/cps/arities.scm:
* module/language/cps/closure-conversion.scm:
* module/language/cps/constructors.scm:
* module/language/cps/dce.scm:
* module/language/cps/elide-values.scm:
* module/language/cps/reify-primitives.scm:
* module/language/cps/specialize-primcalls.scm: Use let-fresh instead of
let-gensyms, and wrap in a with-fresh-name-state as needed.
* module/language/tree-il/compile-cps.scm: Remove hack to avoid
importing let-gensyms from (language tree-il).
* module/language/cps.scm (label-counter, var-counter): New parameters,
for producing fresh label and var names.
(fresh-label, fresh-var): New procedures.
(let-fresh): New macro, will replace let-gensyms.
(build-cps-term): Use let-fresh.
* module/language/tree-il/compile-cps.scm: Use let-fresh to generate
fresh names.
* module/system/vm/assembler.scm (make-meta, begin-kw-arity): Allow
exact integers as labels.
(link-debug): Explicitly mark low-pc as being an "addr" value.
* module/language/cps/simplify.scm (prune-continuations): Prune
continuations as a post-pass with a fresh DFG. Using a
pre-eta-conversion DFG as we were doing before missed some cases.
* module/language/cps/simplify.scm (compute-eta-reductions): Avoid
trying to eta-reduce a jump-to-self, as in (let lp () (lp)). This
caused the compiler to hang.
* module/language/tree-il/peval.scm (peval): When evaluating a call
whose operator is not a literal lambda but a let-bound lambda, such as
one bound via define-inlinable, don't create a new counter if the
lambda is referenced only once in the source. This avoids needlessly
failing to inline once-referenced procedures.
* test-suite/tests/peval.test ("partial evaluation"): Update tests.
* module/language/cps/prune-top-level-scopes.scm: New pass, to prune
unneeded "cache-current-module!" forms.
* module/language/cps/compile-bytecode.scm:
* module/Makefile.am: Add the new pass to the build and enable it by
default.
* module/language/cps.scm ($callk): New expression type, for calls to
known labels. Part of "low CPS".
* module/language/cps/arities.scm:
* module/language/cps/closure-conversion.scm:
* module/language/cps/compile-bytecode.scm:
* module/language/cps/dce.scm:
* module/language/cps/dfg.scm:
* module/language/cps/effects-analysis.scm:
* module/language/cps/simplify.scm:
* module/language/cps/slot-allocation.scm:
* module/language/cps/verify.scm: Adapt call sites.
* libguile/vm-engine.c (call-label, tail-call-label): New instructions.
Renumber the rest; this is an ABI change.
* libguile/_scm.h (SCM_OBJCODE_MINOR_VERSION):
* module/system/vm/assembler.scm (*bytecode-minor-version*): Bump.
* doc/ref/compiler.texi (CPS in Guile): Document $callk.
* module/language/cps/slot-allocation.scm (lookup-dead-slot-map)
(allocate-slots): For each non-tail call in a function, compute the
set of slots that are dead after the function has begun the call.
* module/language/cps/compile-bytecode.scm (compile-fun): Emit the
`dead-slot-map' macro instruction for non-tail calls.
* module/system/vm/assembler.scm (<asm>): Add `dead-slot-maps' member.
(dead-slot-map): New macro-instruction.
(link-frame-maps, link-dynamic-section, link-objects): Write dead
slots information into .guile.frame-maps sections of ELF files.
* module/system/vm/elf.scm (DT_GUILE_FRAME_MAPS): New definition.
* libguile/loader.h:
* libguile/loader.c (DT_GUILE_FRAME_MAPS, process_dynamic_segment)
(load_thunk_from_memory, register_elf): Arrange to parse
DT_GUILE_FRAME_MAPS out of the dynamic section.
(find_mapped_elf_image_unlocked, find_mapped_elf_image): New helpers.
(scm_find_mapped_elf_image): Refactor.
(scm_find_dead_slot_map_unlocked): New interface.
* libguile/vm.c (scm_i_vm_mark_stack): Mark the hottest frame
conservatively, as before. Otherwise use the dead slots map, if
available, to avoid marking data that isn't live.
* module/language/cps/compile-bytecode.scm (compile-fun): Now that all
$call expressions continue to $ktail or $ktrunc, remove the $kargs
case, and make receive-values bail if too many values are returned.
* module/language/tree-il/compile-cps.scm (init-default-value, convert):
Explicitly insert $ktrunc nodes at all places that can truncate to a
single value.
* module/language/cps/slot-allocation.scm (allocate-slots): For
truncating calls, shuffle the first return value (if any). This avoids
frame size growth due to sparse locals that stay pegged where procedure
call returns left them. With this patch, eval with $ktrunc nodes goes
from 31 locals to 18 (similar to the size before adding $ktrunc
nodes).
* module/language/cps/slot-allocation.scm (allocate-slots): Fix bug in
allocate!, whereby a previously hinted allocation would not be added
to the live set if a hint was not given later.
* module/language/cps.scm:
* module/language/cps/closure-conversion.scm:
* module/language/cps/compile-bytecode.scm:
* module/language/cps/dfg.scm:
* module/language/cps/slot-allocation.scm:
* module/language/cps/verify.scm:
* module/language/tree-il/compile-cps.scm: Remove "pop" member from
$prompt data type, as it is no longer used.