* module/language/cps/slot-allocation.scm (allocate-slots): Avoid
allocating locals in the range [253,255].
* module/system/vm/assembler.scm: List exports explicitly. For
operations with limited-range operands, export wrapper assemblers that
handle shuffling their operands into and out of their range.
(define-assembler): Get rid of enclosing begin.
(shuffling-assembler, define-shuffling-assembler): New helpers to
define shuffling wrapper assemblers.
(emit-mov*, emit-receive*): New functions.
(shuffle-up-args): New helper.
(standard-prelude, opt-prelude, kw-prelude): Call shuffle-up-args
after finishing.
* test-suite/tests/compiler.test ("limits"): Add test cases.
* module/language/cps/slot-allocation.scm (dead-after-def?):
(dead-after-use?, allocate-slots): Remove some needless remapping
between label indexes in the CFA, the DFA, and their names.
* module/language/cps.scm ($closure, $program): New CPS types, part of
low-level (first-order) CPS.
(build-cps-exp, build-cps-term, parse-cps, unparse-cps)
(compute-max-label-and-var): Update for new CPS types.
* module/language/cps/closure-conversion.scm: Rewrite to produce a
$program with $closures, and no $funs.
* module/language/cps/reify-primitives.scm:
* module/language/cps/compile-bytecode.scm (compile-fun):
(compile-bytecode): Adapt to new first-order format.
* module/language/cps/dfg.scm (compute-dfg): Add $closure case.
* module/language/cps/renumber.scm (renumber): Allow this pass to work
on either format.
* module/language/cps/slot-allocation.scm (allocate-slots): Add $closure
case.
* module/language/cps/slot-allocation.scm (allocate-slots): Rework to
avoid computing a CFA, and just relying on the incoming term to have
sorted labels.
* module/language/cps.scm ($kclause, $kentry): Instead of having an
entry continuation contain a list of clauses, have the clauses contain
clauses (as in Tree-IL). In some ways it's not as convenient but it
does reflect the continuation tree correctly.
* module/language/cps/arities.scm:
* module/language/cps/closure-conversion.scm:
* module/language/cps/compile-bytecode.scm:
* module/language/cps/constructors.scm:
* module/language/cps/contification.scm:
* module/language/cps/dce.scm:
* module/language/cps/dfg.scm:
* module/language/cps/elide-values.scm:
* module/language/cps/prune-top-level-scopes.scm:
* module/language/cps/reify-primitives.scm:
* module/language/cps/renumber.scm:
* module/language/cps/simplify.scm:
* module/language/cps/slot-allocation.scm:
* module/language/cps/specialize-primcalls.scm:
* module/language/cps/verify.scm:
* module/language/tree-il/compile-cps.scm: Adapt aaaaaaall users.
* module/language/cps/dfg.scm (lookup-cont): Change to take a DFG
instead of a cont table.
(build-cont-table): Change to return a vector.
* module/language/cps/arities.scm:
* module/language/cps/contification.scm:
* module/language/cps/dce.scm:
* module/language/cps/effects-analysis.scm:
* module/language/cps/elide-values.scm:
* module/language/cps/reify-primitives.scm:
* module/language/cps/simplify.scm:
* module/language/cps/slot-allocation.scm: Adapt to lookup-cont and
build-cont-table changes.
* module/language/cps.scm ($callk): New expression type, for calls to
known labels. Part of "low CPS".
* module/language/cps/arities.scm:
* module/language/cps/closure-conversion.scm:
* module/language/cps/compile-bytecode.scm:
* module/language/cps/dce.scm:
* module/language/cps/dfg.scm:
* module/language/cps/effects-analysis.scm:
* module/language/cps/simplify.scm:
* module/language/cps/slot-allocation.scm:
* module/language/cps/verify.scm: Adapt call sites.
* libguile/vm-engine.c (call-label, tail-call-label): New instructions.
Renumber the rest; this is an ABI change.
* libguile/_scm.h (SCM_OBJCODE_MINOR_VERSION):
* module/system/vm/assembler.scm (*bytecode-minor-version*): Bump.
* doc/ref/compiler.texi (CPS in Guile): Document $callk.
* module/language/cps/slot-allocation.scm (lookup-dead-slot-map)
(allocate-slots): For each non-tail call in a function, compute the
set of slots that are dead after the function has begun the call.
* module/language/cps/compile-bytecode.scm (compile-fun): Emit the
`dead-slot-map' macro instruction for non-tail calls.
* module/system/vm/assembler.scm (<asm>): Add `dead-slot-maps' member.
(dead-slot-map): New macro-instruction.
(link-frame-maps, link-dynamic-section, link-objects): Write dead
slots information into .guile.frame-maps sections of ELF files.
* module/system/vm/elf.scm (DT_GUILE_FRAME_MAPS): New definition.
* libguile/loader.h:
* libguile/loader.c (DT_GUILE_FRAME_MAPS, process_dynamic_segment):
(load_thunk_from_memory, register_elf): Arrange to parse
DT_GUILE_FRAME_MAPS out of the dynamic section.
(find_mapped_elf_image_unlocked, find_mapped_elf_image): New helpers.
(scm_find_mapped_elf_image): Refactor.
(scm_find_dead_slot_map_unlocked): New interface.
* libguile/vm.c (scm_i_vm_mark_stack): Mark the hottest frame
conservatively, as before. Otherwise use the dead slots map, if
available, to avoid marking data that isn't live.
* module/language/cps/slot-allocation.scm (allocate-slots): For
truncating calls, shuffle the first return value (if any). Avoids
frame size growth due to sparse locals, pegged where they were left by
procedure call returns. With this patch, eval with $ktrunc nodes goes
from 31 locals to 18 (similar to the size before adding $ktrunc
nodes).
* module/language/cps/slot-allocation.scm (allocate-slots): Fix bug in
allocate!, whereby a previously hinted allocation would not be added
to the live set if a hint was not given later.
* module/language/cps.scm:
* module/language/cps/closure-conversion.scm:
* module/language/cps/compile-bytecode.scm:
* module/language/cps/dfg.scm:
* module/language/cps/slot-allocation.scm:
* module/language/cps/verify.scm:
* module/language/tree-il/compile-cps.scm: Remove "pop" member from
$prompt data type, as it is no longer used.
* module/language/cps/slot-allocation.scm (allocate-slots): Don't
allocate slots to unused results of function calls. This can allow us
to avoid consing a rest list for call-with-values with an ignored rest
parameter, and can improve the parallel move code.
* module/language/cps/compile-bytecode.scm (compile-fun): Adapt to avoid
emitting bind-rest in values context if the rest arg is unused.
* libguile/_scm.h (SCM_OBJCODE_MINOR_VERSION): Bump for frame layout
change.
* libguile/frames.c: Update some static checks.
(scm_frame_num_locals, scm_frame_local_ref, scm_frame_local_set_x):
Update to not skip over uninitialized frames, as that's not a thing
any more.
* libguile/frames.h: Update to remove MVRA. Woo!
* libguile/vm-engine.c (ALLOC_FRAME, RETURN_ONE_VALUE):
(rtl_vm_engine): Update for 3 words per frame instead of 4.
* libguile/vm.c (vm_return_to_continuation): Likewise.
* module/language/cps/slot-allocation.scm (allocate-slots): 3 words per
frame, not 4.
* module/system/vm/assembler.scm (*bytecode-minor-version*): Bump. Also
remove a couple of tc7's that aren't around any more.
* module/language/cps/slot-allocation.scm (allocate-slots): Convert
cont-table to a vector, for ease of access. Run a pass before
allocation that determines the set of variables whose slot allocation
can and should be delayed, so that they can ideally be allocated
directly in an argument slot.
* module/language/cps/slot-allocation.scm ($allocation): Refactor
internal format of allocations. Instead of an allocation being a hash
table of small $allocation objects, it is an $allocation object that
contains packed vectors.
(find-first-trailing-zero): Rework to not need a maximum.
(lookup-maybe-slot): New interface.
(lookup-slot): Raise an error if a var has no slot.
(lookup-call-allocation): New helper.
(lookup-constant-value, lookup-maybe-constant-value):
(lookup-call-proc-slot, lookup-parallel-moves): Adapt to $allocation
change
(allocate-slots): Rewrite so that instead of being recursive, it
traverses the blocks in CFA order. Also, procedure call frames are
now allocated with respect to the live set after using arguments (and
killing any dead-after-use vars); this should make call frames more
compact but it does necessitate a parallel move solution. Therefore
parallel moves are recorded for all calls, for arguments; also if the
continuation is a $ktrunc, the continuation gets parallel moves for
the results.
This rewrite is in preparation to allocating call args directly in the
appropriate slots, where possible.
* module/language/cps/compile-rtl.scm (compile-fun): Adapt to slot
allocation changes, using lookup-maybe-slot where appropriate,
performing parallel moves when calling functions, and expecting return
moves to be associated with $ktrunc continuations.
* module/language/cps.scm ($continue, $cont): Put source information on
the $continue, not on the $cont. Otherwise it is difficult for CPS
conversion to preserve source information.
($fun): Add a src member to $fun. Otherwise we might miss the source
info for the start of the function.
* .dir-locals.el:
* module/language/cps/arities.scm:
* module/language/cps/closure-conversion.scm:
* module/language/cps/compile-rtl.scm:
* module/language/cps/constructors.scm:
* module/language/cps/contification.scm:
* module/language/cps/dfg.scm:
* module/language/cps/elide-values.scm:
* module/language/cps/reify-primitives.scm:
* module/language/cps/slot-allocation.scm:
* module/language/cps/verify.scm:
* module/language/tree-il/compile-cps.scm: Update the whole CPS world
for this change.
* module/language/cps/compile-rtl.scm (compile-fun): Rewrite to visit
conts in reverse-post-order, which is a topological sort on the basic
blocks.
* module/language/cps/slot-allocation.scm (allocate-slots): Expect a DFG
as an argument.
* module/language/cps/dfg.scm (reverse-post-order): Add an optional
"fold-all-conts" argument.
(compute-live-variables): Take the function as an arg instead of the
start continuation, and implement fold-all-conts so that nodes that
never reach the tail also get liveness information.
* module/language/cps.scm ($prompt): Add a "pop" field, indicating the
continuation at which this prompt is popped. The body of the prompt
is dominated by the prompt, and post-dominated by the pop. Adapt all
builders and users.
* module/language/cps/closure-conversion.scm:
* module/language/cps/compile-rtl.scm:
* module/language/cps/slot-allocation.scm:
* module/language/cps/verify.scm:
* module/language/tree-il/compile-cps.scm: Adapt.
* module/language/cps/dfg.scm (visit-fun): Add an arc from the pop to
the handler, to keep handler variables alive through the prompt body.
* module/language/cps/dfg.scm (control-point?): New interface, replaces
branch?.
(dead-after-def?, dead-after-use?, dead-after-branch?): Remove these.
The first one was fine; dead-after-use? was conservative but OK; but
dead-after-branch? was totally bogus. Instead we use precise liveness
information in the allocator.
* module/language/cps/slot-allocation.scm ($allocation): Remove "def"
and "dead" slots. We'll communicate liveness information in some
other way to the compiler.
(allocate-slots): Rework to use precise liveness information.
* libguile/vm-engine.c (prompt): Adapt to explicitly set the saved SP so
we know how many incoming values the handler will receive, and to make
escape-only? a flag.
* module/language/cps/compile-rtl.scm (emit-rtl-sequence): $prompt
should only be found in a "seq" context, as it just pushes on a prompt
and doesn't bind any values. On the other hand it should emit
appropriate code for the handler to bind its values, so do that.
* module/language/cps/slot-allocation.scm ($cont-allocation): Add a note
that proc-slot is used by prompts as well.
(allocate-slots): Compute the allocation of a prompt handler's args.
* module/language/tree-il/compile-cps.scm (convert): Use "unwind"
instead of the nonexistent "pop-prompt".
* module/system/vm/disassembler.scm (code-annotation): Adapt to change
in prompt VM op.