* libguile/jit.c (jit_stop_after, jit_pause_when_stopping): New locals.
(scm_jit_compute_mcode): Add ability to stop after N compilations.
(scm_jit_enter_mcode): Comment out printfs for the time being.
(scm_init_jit): Init locals from environment variables.
* libguile/jit.c (compile1): Add debugging output for when instructions
are first compiled. To be removed once everything works.
(compute_mcode): Add debugging about what code is compiled.
(scm_sys_jit_compile): Remove per-instruction output.
(scm_jit_compute_mcode): Actually compile JIT code. Use
GUILE_JIT_COUNTER_THRESHOLD to control when JIT happens.
* libguile/jit.c (scm_jit_counter_threshold): Make a static variable
instead of a compile-time constant.
(scm_init_jit): Init scm_jit_counter_threshold from
GUILE_JIT_COUNTER_THRESHOLD environment variable. Default is -1
indicating "never JIT".
* libguile/vm-engine.c (instrument-entry, instrument-loop): Adapt to new
variable.
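A minimal sketch of the environment-variable initialization described above. It is not the code in jit.c: every `sketch_` name is hypothetical, and treating a negative threshold as "never JIT" via a plain comparison is an assumption made for illustration.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the static threshold; -1 means "never JIT". */
static long sketch_jit_counter_threshold = -1;

/* Read an integer from the environment variable NAME.  Fall back to
   DEFAULT_VALUE if the variable is unset, empty, or not a clean
   base-10 integer.  */
static long
sketch_env_long (const char *name, long default_value)
{
  const char *val = getenv (name);
  char *end;
  long n;

  if (val == NULL || *val == '\0')
    return default_value;
  n = strtol (val, &end, 10);
  return *end == '\0' ? n : default_value;
}

/* Initialize the threshold once at startup.  */
static void
sketch_init_jit (void)
{
  sketch_jit_counter_threshold
    = sketch_env_long ("GUILE_JIT_COUNTER_THRESHOLD", -1);
}

/* Nonzero if a function whose counter has reached COUNTER should be
   compiled.  A negative threshold disables compilation entirely.  */
static int
sketch_should_jit (long counter)
{
  return sketch_jit_counter_threshold >= 0
    && counter >= sketch_jit_counter_threshold;
}
```

With the variable unset, `sketch_should_jit` never returns true, which matches the "never JIT" default.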
* libguile/control.c (compose_continuation_code): Fix offset of code
end.
* libguile/jit.c (compile_compose_continuation): Fix test for mcode not
null.
* libguile/jit.c (compile_prompt): Actually push the MRA arg.
(analyze): Mark call continuations as entries, as both FP and SP are
set then, and also mark prompt handlers as entries (and blocks).
* libguile/intrinsics.c (add_immediate, sub_immediate, less_p)
(numerically_equal_p): Add fast paths. Makes one test locally go from
.77s interpreted to .60s.
(scm_to_uint64_truncate): Add a likelihood annotation.
* libguile/jit.c (struct scm_jit_state): Add beginnings of a little
local register allocator.
(reset_register_state): New helper.
(clear_scratch_register_state): Use new helper.
(record_gpr_clobber, record_fpr_clobber): New helpers, used when there
may be cached variables in registers, called when registers are
written.
(set_sp_cache_gpr, set_sp_cache_fpr): New helpers, called when results
are written to the stack.
(emit_retval, emit_movi, emit_ldxi, DEFINE_CLOBBER_RECORDING_EMITTER_R)
(DEFINE_CLOBBER_RECORDING_EMITTER_P, DEFINE_CLOBBER_RECORDING_EMITTER_R_I)
(DEFINE_CLOBBER_RECORDING_EMITTER_R_R): New wrappers for Lightning API
that also record register clobbers. Update callers.
(save_reloadable_register_state): New helper.
(restore_reloadable_register_state): Rename from
ensure_register_state.
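The clobber-tracking idea above could look roughly like the following. This is only an illustrative sketch: the per-register table, the register count, and all `sketch_` names are assumptions, and the actual fields of struct scm_jit_state may well differ.

```c
#include <stdint.h>

#define SKETCH_NUM_GPRS 4
#define SKETCH_NO_SLOT UINT32_MAX

/* For each general-purpose register, the stack slot whose value it
   currently mirrors, or SKETCH_NO_SLOT if it caches nothing.  */
struct sketch_reg_state
{
  uint32_t sp_cache_gpr[SKETCH_NUM_GPRS];
};

/* Forget all cached slots, e.g. at a branch target.  */
static void
sketch_reset_register_state (struct sketch_reg_state *st)
{
  int i;
  for (i = 0; i < SKETCH_NUM_GPRS; i++)
    st->sp_cache_gpr[i] = SKETCH_NO_SLOT;
}

/* Called whenever REG is written: whatever it cached is now stale.  */
static void
sketch_record_gpr_clobber (struct sketch_reg_state *st, int reg)
{
  st->sp_cache_gpr[reg] = SKETCH_NO_SLOT;
}

/* Called after REG has been stored to stack slot SLOT.  Any other
   register that mirrored SLOT is stale, so drop it first.  */
static void
sketch_set_sp_cache_gpr (struct sketch_reg_state *st, int reg, uint32_t slot)
{
  int i;
  for (i = 0; i < SKETCH_NUM_GPRS; i++)
    if (st->sp_cache_gpr[i] == slot)
      st->sp_cache_gpr[i] = SKETCH_NO_SLOT;
  st->sp_cache_gpr[reg] = slot;
}

/* Return a register holding SLOT's value, or -1 if it must be
   reloaded from the stack.  */
static int
sketch_find_cached_gpr (const struct sketch_reg_state *st, uint32_t slot)
{
  int i;
  for (i = 0; i < SKETCH_NUM_GPRS; i++)
    if (st->sp_cache_gpr[i] == slot)
      return i;
  return -1;
}
```

An emitter wrapper in this scheme would call `sketch_record_gpr_clobber` on its destination register before emitting the instruction, so that a later load of the same slot knows whether a reload is needed.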
* libguile/jit.c (scm_jit_state): Add op_attrs array, for a pre-pass,
and store state of what's in registers.
(SP, FP): Reassign to scratch registers, as in general these need to
be reloaded anyway after callouts.
(die, DIE, ASSERT, UNREACHABLE): Add better invariant-testing.
(clear_register_state, clear_scratch_register_state)
(set_register_state, has_register_state, ASSERT_HAS_REGISTER_STATE):
Add machinery to track state of SP and FP. Can eventually track
scratch register assignments as well. Adapt code to use these.
(compile_atomic_ref_scm_immediate): Compile to a vanilla load on x86.
(compile_handle_interrupts): Analogous atomic-ref changes here.
(analyze): New helper, a simple once-through pre-pass to identify
branch targets.
(compile): Only generate labels for branch targets. Reset register
state at branch targets.
(compute_mcode): Initialize j->op_attrs appropriately.
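A once-through pre-pass of the kind described above might be sketched as follows. The toy instruction encoding, the attribute flag, and every `sketch_` name are hypothetical; the real analyze works over Guile bytecode, which this does not model.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SKETCH_OP_NOP, SKETCH_OP_JUMP, SKETCH_OP_RETURN };

#define SKETCH_ATTR_BLOCK 1 /* instruction is a branch target */

/* Toy instruction: an opcode plus, for jumps, a signed offset
   relative to the jump itself, counted in instructions.  */
struct sketch_insn
{
  int op;
  int offset;
};

/* Walk CODE once and set SKETCH_ATTR_BLOCK in ATTRS[i] for every
   instruction i that some jump targets.  ATTRS must have room for N
   entries.  Out-of-range targets are ignored.  */
static void
sketch_analyze (const struct sketch_insn *code, size_t n, uint8_t *attrs)
{
  size_t i;

  memset (attrs, 0, n);
  for (i = 0; i < n; i++)
    if (code[i].op == SKETCH_OP_JUMP)
      {
        long target = (long) i + code[i].offset;
        if (target >= 0 && (size_t) target < n)
          attrs[target] |= SKETCH_ATTR_BLOCK;
      }
}
```

A compile loop could then emit a label, and reset any cached register state, only at instructions whose attribute has SKETCH_ATTR_BLOCK set.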
* libguile/jit.c (emit_push_frame): Simplification; we never need to
store old_fp and new_fp at once.
(compile_alloc_frame): Fix to not keep a pointer into the stack across
a stack expansion.
* libguile/jit.c (struct scm_jit_state): Store next ip, so that
compilers can fuse opcodes.
(op_lengths): New static variable.
(emit_direct_tail_call): Add a fast case for self-recursion.
(compile1): Move IP advancement out of the specific arity compilers;
instead precompute a "next_ip", that can be incremented.
* libguile/jit.c (add_inter_instruction_patch, compile, compute_mcode):
Add support for labels.
(compile_uadd_immediate, compile_usub_immediate): Fix cases where we
were adding the wrong operand as an immediate.
* libguile/jit.c (emit_handle_interrupts_trampoline)
(initialize_handle_interrupts_trampoline)
(compile_handle_interrupts): Handle most of the slow case of
handle-interrupts out-of-line, to avoid code bloat.
* libguile/vm-engine.c (instrument-entry): Eagerly check if data->mcode
is already set, and in that case just jump directly without checking
the counter.
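The eager check can be pictured with the following sketch. The struct layout, the action enum, and the counter policy are all assumptions made for illustration; only the ordering (test mcode first, touch the counter only if it is unset) comes from the entry above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-function JIT data.  */
struct sketch_function_data
{
  const void *mcode; /* machine code, or NULL if not yet compiled */
  uint32_t counter;  /* how "hot" the function is */
};

enum sketch_action
{
  SKETCH_INTERPRET,   /* keep running in the interpreter */
  SKETCH_ENTER_MCODE, /* jump straight to existing machine code */
  SKETCH_COMPILE      /* counter crossed the threshold */
};

/* Decide what to do on function entry.  If machine code already
   exists, return immediately without reading or bumping the counter.  */
static enum sketch_action
sketch_instrument_entry (struct sketch_function_data *data,
                         uint32_t threshold, uint32_t increment)
{
  if (data->mcode != NULL)
    return SKETCH_ENTER_MCODE;

  data->counter += increment;
  if (data->counter >= threshold)
    return SKETCH_COMPILE;
  return SKETCH_INTERPRET;
}
```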
* libguile/jit.c (emit_entry_trampoline): Don't bother hackily trying to
save registers; the "jit_frame" call handles that.
(compile_return_values, compile_return_from_interrupt): Fix bug when
computing previous FP: no need to add frame_overhead_slots.
(emit_load_prev_fp_offset, emit_store_prev_fp_offset): Rename from
emit_load_prev_frame_size, emit_store_prev_frame_size.
(emit_push_frame): Adapt to the emit_store_prev_fp_offset rename.
Don't subtract off the frame_overhead_slots.
(scm_jit_enter_mcode): Comment out a printf for the time being.
* libguile/jit.c (emit_alloc_frame_for_sp, compile_current_thread): Fix
some ldxr/ldxi stxr/stxi confusions.
(compile_alloc_frame): Omit if the frame size is already correct.
(compile, compute_mcode, scm_sys_jit_compile, scm_jit_enter_mcode):
Add a bit more debugging.
* libguile/lightning/lib/lightning.c (_jit_emit): The default code
allocator will simply mmap a code buffer, try to emit into that
buffer, and if it fails, try again with a larger buffer. However the
buffer size starts at 0, for some reason. Why? I can't see the
reason. Change the default to 4096. In the future we will need to
implement our own allocator anyway so that we can pack multiple JIT
runs in one page.
* libguile/init.c (scm_i_init_guile): Call scm_init_jit ().
* libguile/jit.c (enter_mcode, exit_mcode): New static variables; code
pointers for the JIT trampoline.
(emit_exit): New helper. The Lightning tramp/frame mechanism that we
use needs to exit via a jmp instead of a return. Adapt callers of
jit_ret.
(emit_entry_trampoline): Use the "frame" mechanism to enter the JIT.
(compile1): Add missing "break" after case statements. Oops!
(compile): Add prolog and "tramp" to compiled functions.
(initialize_jit): New local routine to init the JIT on demand.
(compute_mcode): New helper, to compile a function.
(scm_sys_jit_compile): New function, exported to Scheme as
%jit-compile.
(scm_jit_compute_mcode): Return the existing mcode if the function is
at the start.
(scm_jit_enter_mcode): Call the enter_mcode trampoline.
* libguile/jit.h (struct scm_jit_state): Declare, so we can make
pointers to it.
* libguile/threads.h (struct scm_thread): Add jit_state member.
* libguile/threads.c (on_thread_exit): Free the jit state.
* libguile/intrinsics.h: Add "intrinsic" for handle-interrupts code.
Unlike the other intrinsics, this one isn't a function.
* libguile/programs.c (try_parse_arity): Add cases for instructions used
in VM builtins.
(scm_primitive_call_ip): Return #f if call-ip not found.
* libguile/vm-engine.c (handle-interrupts): Get code from intrinsics.
* libguile/vm.c (instrumented_code, define_vm_builtins): Add
instrumentation to the builtins, so that they can be JIT-compiled.
(INIT_BUILTIN): Remove min-arity setting; the fallback min-arity
interpreter should figure it out.
(scm_bootstrap_vm): Call the new define_vm_builtins function.
* libguile/gsubr.c (primitive_call_ip): Return 0 if call IP not found.
(primitive_subr_idx): Interpret call ip == 0 as not-a-subr.
* module/system/vm/program.scm (program-arguments-alist): Allow a #f
call-ip.
* libguile/intrinsics.h:
* libguile/intrinsics.c (atomic_ref_scm, atomic_set_scm)
(atomic_swap_scm, atomic_compare_and_swap_scm): New intrinsics, given
that lightning doesn't know atomics.
(scm_bootstrap_intrinsics): Init new intrinsics.
* libguile/vm-engine.c (atomic-scm-ref/immediate)
(atomic-scm-set!/immediate, atomic-scm-swap!/immediate)
(atomic-scm-compare-and-swap!/immediate): Use intrinsics, to be like
the JIT.
* libguile/intrinsics.h (INDIRECT_INT64_INTRINSICS): New definition. If
true, int64 args and return values are passed by reference. Here to
make JIT easier.
* libguile/intrinsics.c (indirect_scm_to_int64, indirect_scm_to_uint64)
(indirect_scm_to_uint64_truncate, indirect_scm_from_int64)
(indirect_scm_from_uint64, indirect_lsh, indirect_rsh): New indirect
variants.
(scm_bootstrap_intrinsics): Use indirect variants as appropriate.
* libguile/vm-engine.c: Update to call indirect intrinsics if
appropriate.
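The by-reference convention for 64-bit values might look like the following sketch. The selection macro, its condition (pointers narrower than 64 bits), and all `sketch_` names are assumptions; the source only says that when INDIRECT_INT64_INTRINSICS is true, int64 arguments and return values are passed by reference.

```c
#include <stdint.h>

/* Assumption: use the indirect convention when pointers are narrower
   than 64 bits, so a 64-bit value does not fit in one register.  */
#ifndef SKETCH_INDIRECT_INT64
#define SKETCH_INDIRECT_INT64 (UINTPTR_MAX < UINT64_MAX)
#endif

/* Direct variant: 64-bit argument and result passed by value.  */
static uint64_t
sketch_lsh (uint64_t a, unsigned int b)
{
  return b < 64 ? a << b : 0;
}

/* Indirect variant: the 64-bit operand and result travel through
   pointers, so the caller only ever passes word-sized arguments.  */
static void
sketch_indirect_lsh (uint64_t *dst, const uint64_t *a, unsigned int b)
{
  *dst = sketch_lsh (*a, b);
}

/* Call site that picks the convention at compile time, roughly as an
   interpreter or JIT callout would.  */
static uint64_t
sketch_call_lsh (uint64_t a, unsigned int b)
{
#if SKETCH_INDIRECT_INT64
  uint64_t dst;
  sketch_indirect_lsh (&dst, &a, b);
  return dst;
#else
  return sketch_lsh (a, b);
#endif
}
```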
* libguile/intrinsics.h:
* libguile/intrinsics.c (string_set_x): Change to take size_t and u32 as
args.
(allocate_words): Change to take size_t as arg.
* libguile/vm.c (expand_apply_argument): Rename from rest_arg_length,
and also handle the stack manipulation.
* libguile/vm-engine.c (expand-apply-argument): Update for intrinsic
change.
(call-scm-sz-u32): Rename from call-scm-u64-u64, as it matches its
uses and will compile better on 32-bit systems.
* module/system/vm/assembler.scm (define-scm-sz-u32-intrinsic)
(string-set!): Update for new intrinsic call inst.
* libguile/jit.c (compile_call_scm_sz_u32): Adapt.
* libguile/intrinsics.h (SCM_FOR_ALL_VM_INTRINSICS):
* libguile/intrinsics.c (error_wrong_num_args): Take the thread as an
arg, instead of the ostensible callee.
* libguile/vm-engine.c: Update callers of wrong-num-args intrinsic to
pass a thread instead.
* libguile/intrinsics.h (SCM_FOR_ALL_VM_INTRINSICS): Update prototype of
capture-continuation.
* libguile/jit.h:
* libguile/jit.c (scm_jit_enter_mcode): Return void, not the vra.
Instead, we expect the code to set vp->ip for the vra.
* libguile/vm-engine.c (instrument-entry, instrument-loop)
(return-values, abort): Adapt scm_jit_enter_mcode calling convention.
(capture-continuation): No need to pass an mra; the intrinsic will
read it from the stack.
* libguile/vm.c (capture_continuation): Remove mra arg, as we take mra
from the continuation.
(scm_call_n): Adapt to scm_jit_enter_mcode change.