* module/language/cps/cse.scm (compute-equivalent-subexpressions): When
CSE sees a definition like `(cons a b)', it will also record an
"auxiliary definition" for `(car x)', where x is the variable defined
by the cons, whereby calling `(car x)' can reduce to `a' if there is
no intervening effect that clobbers the definitions. However, when
the successor of the cons is a control-flow join, then any variables
defined there have multiple definitions. It's incorrect to add the
aux definition in that case.
* test-suite/tests/compiler.test ("cse auxiliary definitions"): New
test.
* module/language/cps/closure-conversion.scm (compute-elidable-closures):
New function.
(convert-one, convert-closures): Add ability to set "self" variable of
$kfun to $f, hopefully avoiding passing that argument in some cases.
* module/language/cps/compile-bytecode.scm (compile-function): Pass the
has-closure? bit on through to the assembler.
* module/system/vm/assembler.scm (begin-standard-arity)
(begin-opt-arity, begin-kw-arity): Only reserve space for the closure
as appropriate.
* module/language/cps/slot-allocation.scm (allocate-args)
(compute-defs-and-uses, compute-needs-slot)
(compute-var-representations): Allow for closure slot allocation
differences.
* module/language/cps/cse.scm (compute-defs):
* module/language/cps/dce.scm (compute-live-code):
* module/language/cps/renumber.scm (renumber, compute-renaming):
(allocate-args):
* module/language/cps/specialize-numbers.scm (compute-significant-bits):
(compute-defs):
* module/language/cps/split-rec.scm (compute-free-vars):
* module/language/cps/types.scm (infer-types):
* module/language/cps/utils.scm (compute-max-label-and-var):
* module/language/cps/verify.scm (check-distinct-vars):
(compute-available-definitions): Allow closure to be #f.
* module/language/cps.scm ($branch): Refactor to be its own CPS term
type, not relying on $continue to specify a continuation (which before
was only for the false case) or a source location. Update allllllll
callers.
* module/language/cps/cse.scm (compute-available-expressions):
(compute-equivalent-subexpressions): Improve algorithmic complexity of
CSE by pre-computing the labels whose reads are clobbered by a label's
writes.
* module/language/cps/cse.scm (compute-equivalent-subexpressions): Minor
optimization to reduce the size of equivalent expression keys, and to
avoid some work if an expression has no key.
* module/language/cps/compile-bytecode.scm (compile-function): Add
support for tag-fixnum/unlikely.
* module/language/cps/cse.scm (compute-equivalent-subexpressions): Add
equivalent subexpressions for tag-fixnum.
* module/language/cps/effects-analysis.scm:
* module/language/cps/primitives.scm (*macro-instruction-arities*): Add
tag-fixnum/unlikely.
* module/language/cps/specialize-numbers.scm (specialize-u64-unop)
(specialize-u64-binop, specialize-u64-shift)
(specialize-u64-comparison): Make the arg unboxers and result boxers
into keyword arguments.
(specialize-s64-unop): New helper.
(specialize-fixnum-comparison, specialize-fixnum-scm-comparison)
(specialize-scm-fixnum-comparison): Rename from
specialize-s64-comparison et al. Perhaps this should be expanded
again to include the whole s64 range, once we start to expand scm->s64
et al.
(specialize-operations): Specialize arithmetic, etc on signed
operands and results. Use less powerful unboxing/boxing ops if
possible -- e.g. tag-fixnum instead of u64->scm. Prefer fixnum
comparisons over u64 comparisons.
(compute-specializable-fixnum-vars): New helper.
(compute-specializable-phis): Specialize fixnum phis as well.
(specialize-primcalls): Specialize untag-fixnum of a constant to
load-s64.
* module/language/cps/type-fold.scm (u64->scm, s64->scm):
(scm->s64, scm->u64): Reduce to fixnum ops where possible.
* module/language/cps/types.scm: Remove type checkers for ops that don't
throw type errors. Alias tag-fixnum/unlikely to tag-fixnum.
* module/language/cps.scm ($primcall): Add "param" member, which will be
a constant parameter to the primcall. The idea is that constants used
by primcalls as immediates don't need to participate in optimizations
in any way -- they should not participate in CSE, have the same
lifetime as the primcall so not part of DCE either, and don't need
slot allocation. Indirecting them through a named $const binding is
complication for no benefit. This change should eventually improve
compilation time and memory usage, once we fully take advantage of it,
as the number of labels and variables will go down.
* module/language/cps/closure-conversion.scm:
* module/language/cps/compile-bytecode.scm:
* module/language/cps/constructors.scm:
* module/language/cps/contification.scm:
* module/language/cps/cse.scm:
* module/language/cps/dce.scm:
* module/language/cps/effects-analysis.scm:
* module/language/cps/elide-values.scm:
* module/language/cps/handle-interrupts.scm:
* module/language/cps/licm.scm:
* module/language/cps/peel-loops.scm:
* module/language/cps/prune-bailouts.scm:
* module/language/cps/prune-top-level-scopes.scm:
* module/language/cps/reify-primitives.scm:
* module/language/cps/renumber.scm:
* module/language/cps/rotate-loops.scm:
* module/language/cps/self-references.scm:
* module/language/cps/simplify.scm:
* module/language/cps/slot-allocation.scm:
* module/language/cps/specialize-numbers.scm:
* module/language/cps/specialize-primcalls.scm:
* module/language/cps/split-rec.scm:
* module/language/cps/type-checks.scm:
* module/language/cps/type-fold.scm:
* module/language/cps/types.scm:
* module/language/cps/utils.scm:
* module/language/cps/verify.scm:
* module/language/tree-il/compile-cps.scm: Adapt all users.
* module/language/cps/compile-bytecode.scm (compile-function):
* module/language/cps/cse.scm (compute-equivalent-subexpressions):
* module/language/cps/effects-analysis.scm:
* module/language/cps/primitives.scm (*macro-instruction-arities*):
* module/language/cps/specialize-numbers.scm (compute-specializable-vars):
* module/language/cps/types.scm: Add new variants of u64->scm and
s64->scm that can't be replaced by CSE's auxiliary definitions, so we
can sink unlikely allocations to side branches. This is a hack until
we can get allocation sinking working
* module/language/cps/cse.scm (compute-truthy-expressions):
(compute-equivalent-subexpressions):
(eliminate-common-subexpressions): Refactor to be able to work on
first-order CPS.
* module/language/cps.scm ($rec): Replace $letrec with $rec, which is an
expression, not a term. This means that the names bound by the letrec
appear twice: once in the $rec term, and once in the continuation.
This is not very elegant, but the situation is better than it was
before. Adapt all callers.
* doc/ref/compiler.texi (CPS in Guile): Incomplete documentation
updates. I'll update these later when the IL settles down.
* module/language/cps/cse.scm:
* module/language/cps/dfg.scm (compute-idoms, compute-dom-edges): Move
these procedures from cse.scm to dfg.scm.
Remove loop-detection code; that can come back later but it is
bitrotten for now.
* module/language/cps/effects-analysis.scm: Rewrite so that instead of
the depends/causes effects, there is just &type-check, &allocation,
&read, and &write. The object kind is a separate part of the
bitfield, and the field in the object (if appropriate) is another
field. Effects are still a fixnum. This enables precise effects for
vectors and structs on all architectures.
This kind of effects analysis was not possible in Tree-IL because
Tree-IL relied on logior-ing effects of subexpressions, whereas with
CPS we have no sub-expressions and we do flow analysis instead.
(effect-clobbers?): Replace effects-commute? with this inherently
directional and precise predicate.
* module/language/cps/cse.scm (compute-always-available-expressions):
(compute-equivalent-subexpressions): Adapt to effects analysis
change.
* module/language/cps/dce.scm (compute-live-code): Likewise.
* module/language/cps/cse.scm (compute-always-available-expressions):
Use constant? instead of zero?, to avoid punching through the effects
abstraction.
* module/language/cps/cse.scm (compute-available-expressions):
Simplify initialization.
(compute-equivalent-subexpressions): When synthesizing definitions,
use substed vars. Add synthetic definitions after processing an
expression, to take advantage of the substed vars.
* module/language/cps/effects-analysis.scm (effects-clobber): New
helper.
(length): Only depend on &cdr.
(synthesize-definition-effects!): New interface.
* module/language/cps/cse.scm (compute-available-expressions): Don't
count out constructors here -- we'll do that below.
(compute-defs): Add a comment.
(compute-equivalent-subexpressions): Synthesize getter calls at
constructor/setter sites, so that (set-car! x y) can cause a
future (car x) to just reference y. The equiv-labels set now stores
the defined vars, so there is no need for the defs vector.
(cse, apply-cse): Adapt to compute-equivalent-subexpressions change.