This fixes a bug whereby the compiler would sometimes allocate floats in
marked space.
* libguile/gc-inline.h (scm_inline_gc_malloc_pointerless_words): New
internal helper.
* libguile/intrinsics.h (SCM_FOR_ALL_VM_INTRINSICS):
* libguile/intrinsics.c (allocate_pointerless_words):
(allocate_pointerless_words_with_freelist): New intrinsics.
* libguile/jit.c (compile_allocate_pointerless_words):
(compile_allocate_pointerless_words_immediate): New compilers.
* libguile/vm-engine.c (allocate_pointerless_words)
(allocate_pointerless_words_immediate): New opcodes.
* module/language/cps/compile-bytecode.scm (compile-function):
* module/language/cps/effects-analysis.scm (param):
* module/language/cps/reify-primitives.scm (reify-primitives):
* module/language/cps/specialize-primcalls.scm (specialize-primcalls):
* module/language/cps/types.scm (allocate-words):
(allocate-words/immediate):
* module/system/vm/assembler.scm (system): Add support for the new
opcodes.
* libguile/intrinsics.c (scm_atan1): New intrinsic, wrapping scm_atan.
(scm_bootstrap_intrinsics): Add new intrinsics.
* libguile/intrinsics.h (scm_t_f64_from_f64_f64_intrinsic): New
intrinsic type.
(SCM_FOR_ALL_VM_INTRINSICS): Add intrinsics for floor, ceiling, sin,
cos, tan, asin, acos, atan, and their unboxed counterparts.
* libguile/jit.c (sp_f64_operand): New helper.
(compile_call_f64_from_f64, compile_call_f64_from_f64_f64): Call out
to intrinsics.
* libguile/vm-engine.c (call-f64<-f64-f64): New opcode.
* module/language/cps/effects-analysis.scm: Add new intrinsics.
* module/language/cps/reify-primitives.scm (compute-known-primitives):
Add new intrinsics.
* module/language/cps/slot-allocation.scm (compute-var-representations):
Add 'f64 slot types for the new unboxed intrinsics.
* module/language/cps/specialize-numbers.scm (specialize-operations):
Support unboxing the new intrinsics.
* module/language/cps/types.scm: Define type inferrers for the new
intrinsics.
* module/language/tree-il/cps-primitives.scm: Define CPS translations
for the new intrinsics.
* module/language/tree-il/primitives.scm (*interesting-primitive-names*):
(*effect-free-primitives*, atan): Define primitive resolvers.
* module/system/vm/assembler.scm: Export assemblers for the new
intrinsics.
(define-f64<-f64-f64-intrinsic): New helper.
Some components of this have been wired up for a while; this commit
finishes the compiler, runtime, and JIT support.
* libguile/intrinsics.h (SCM_FOR_ALL_VM_INTRINSICS):
* libguile/intrinsics.c (scm_bootstrap_intrinsics): Declare the new
intrinsics.
* libguile/jit.c (compile_call_f64_from_f64): Define code generators for
the new intrinsics.
* libguile/vm-engine.c (call-f64<-f64): New instruction.
* module/language/cps/effects-analysis.scm:
* module/language/cps/reify-primitives.scm (compute-known-primitives):
* module/language/cps/slot-allocation.scm (compute-var-representations):
* module/language/cps/specialize-numbers.scm (specialize-operations):
* module/language/tree-il/cps-primitives.scm (abs):
* module/system/vm/assembler.scm (system, define-f64<-f64-intrinsic):
(sqrt, abs, fsqrt, fabs):
* module/language/cps/types.scm (fsqrt, fabs): Add new f64<-f64
primitives.
* libguile/jit.c (OLD_FP_FOR_RETURN_TRAMPOLINE): Initialize static const
var from CPP define instead of T0.
(compile_return_values, emit_return_to_interpreter_trampoline): Adapt
to upper-casing.
This change speeds up the indirect branches at return sites by taking
advantage of the CPU's return address stack.
* libguile/jit.c (emit_push_frame): Don't store the mra; we do that via
a trampoline.
(emit_handle_interrupts_trampoline): Take MRA from link register
instead of T0.
(compile_call, compile_call_label): Compute MRA via the new
jmpi_with_link lightening instruction.
(compile_return_values): Return to caller via ret instead of jmp.
(compile_handle_interrupts): Jump to handle-interrupts trampoline via
jmpi_with_link, to provide the MRA.
(initialize_jit): Bless the trampolines so that they are valid
operands to BX on ARM.
This patch is a bit unfortunate, in the sense that it exposes some of
the JIT guts to the rest of the VM. Code needs to treat "machine return
addresses" as valid if non-NULL (as before) and also not equal to a
tier-down trampoline. This is because tier-down at a return needs the
old frame pointer to load the "virtual return address", and the way this
patch works is that it passes the vra in a well-known register. It's a
custom calling convention for a certain kind of return.
* libguile/jit.h (scm_jit_return_to_interpreter_trampoline): New
internal global.
* libguile/jit.c: (scm_jit_clear_mcode_return_addresses): Move here,
from vm.c. Instead of zeroing return addresses, set them to the
return-to-interpreter trampoline.
* libguile/vm-engine.c (return-values): Don't enter mcode if the mra is
scm_jit_return_to_interpreter_trampoline.
* libguile/vm.c (capture_continuation): Treat the tier-down trampoline
as NULL.
* libguile/jit.c (compile_alloc_frame): Stop initializing locals.
(compile_bind_rest): Use emit_alloc_frame.
* libguile/vm-engine.c (assert_nargs_ee_locals, allocate_frame): Don't
initialize locals.
(bind_rest): Don't initialize locals, and assert that the locals count
has a minimum.
* libguile/jit.c (jit_alloc_fn): On targets that need a dynamically
allocated literal pool, we will need to trace that pool, so pass a
pointerful malloc. Fixes JIT on AArch64.
* libguile/jit.c (compile_ursh_immediate):
(compile_ulsh_immediate): Fix immediate/register variant calling.
Happily a benefit of lightening, as type safety did this for us.
(DEFINE_CLOBBER_RECORDING_EMITTER_R_R_2): Pass JIT state.
* libguile/jit.c: Operands have their ABI in them. We can now have
addends on GPR and MEM operands, which can improve register
allocation. Use new jit_calli_3, etc helper APIs.
* libguile/jit.c (compute_mcode): Move analysis outside the code
emitter, as it doesn't need to re-run on overflow.
(compile): Clear labels before emitting, as they may have changed if we
overflowed.
* libguile/jit.c (fp_scm_operand): Fix assertion about register state.
(compile_call_scm_sz_u32): Fix ABI declaration for immediate.
Some whitespace cleanups as well.
* libguile/jit.c (prepare_jit_state): Remove unused function.
(initialize_thread_jit_state): Since the lightening state is allocated
using GC memory, trace the JIT state.
(compute_mcode): Avoid double-compile.
* libguile/jit.c (struct scm_jit_state): Remove entry_mcode member.
(add_inter_instruction_patch): Fix off-by-one.
(compile): Reset reloc_idx when restarting a compile. All instructions
record their addresses.
* libguile/jit.c (compile_u64_imm_less): Compare high word using
not-equal, to avoid a signedness compare.
(compile_s64_imm_less, compile_imm_s64_less): Fix the not-less cases.