* libguile/gc-inline.h:
* libguile/threads.h (SCM_INLINE_GC_GRANULE_WORDS)
(SCM_INLINE_GC_GRANULE_BYTES, SCM_INLINE_GC_FREELIST_COUNT): Move
definitions here, from gc-inline.h.
(struct scm_thread): Inline freelist vectors.
* libguile/threads.c (thread_mark): Update marker for pointerless
freelists.
(on_thread_exit): Clear individual freelist entries, instead of the
vector as a whole.
(guilify_self_2): No need to alloc freelist vectors.
* libguile/threads.h (struct scm_thread): Move async-related bits up, so
that the VM can access them easier. Likewise for freelists (which we
will inline soon).
* libguile/jit.c (OLD_FP_FOR_RETURN_TRAMPOLINE): Initialize static const
var from CPP define instead of T0.
(compile_return_values, emit_return_to_interpreter_trampoline): Adapt
to upper-casing.
This change speeds up the indirect branches at return sites by taking
advantage of the CPU's return address stack.
* libguile/jit.c (emit_push_frame): Don't store the mra; we do that via
a trampoline.
(emit_handle_interrupts_trampoline): Take MRA from link register
instead of T0.
(compile_call, compile_call_label): Compute MRA via the new
jmpi_with_link lightening instruction.
(compile_return_values): Return to caller via ret instead of jmp.
(compile_handle_interrupts): Jump to handle-interrupts trampoline via
jmpi_with_link, to provide the MRA.
(initialize_jit): Bless the trampolines so that they are valid
operands to BX on ARM.
The existing calli / callr interface is for ABI calls. Sometimes though
you want to call some of your own code, just to get the current return
address. ARM's branch-and-link instructions are ideal for this but they
don't exist on x86; there we emulate them by adding corresponding
pop_link_register / push_link_register instructions that are no-ops on
ARM.
* lightening.h (FOR_EACH_INSTRUCTION): Add jit_jmpi_with_link,
pop_link_register, push_link_register.
* lightening/arm-cpu.c:
* lightening/x86-cpu.c:
* lightening/aarch64-cpu.c (jmpi_with_link, push_link_register)
(pop_link_register): Add implementations.
* lightening/arm.h:
* lightening/aarch64.h:
* lightening/x86.h (JIT_LR): New definition.
* tests/link-register.c: New test.
This patch is a bit unfortunate, in the sense that it exposes some of
the JIT guts to the rest of the VM. Code needs to treat "machine return
addresses" as valid if non-NULL (as before) and also not equal to a
tier-down trampoline. This is because tier-down at a return needs the
old frame pointer to load the "virtual return address", and the way this
patch works is that it passes the vra in a well-known register. It's a
custom calling convention for a certain kind of return.
* libguile/jit.h (scm_jit_return_to_interpreter_trampoline): New
internal global.
* libguile/jit.c: (scm_jit_clear_mcode_return_addresses): Move here,
from vm.c. Instead of zeroing return addresses, set them to the
return-to-interpreter trampoline.
* libguile/vm-engine.c (return-values): Don't enter mcode if the mra is
scm_jit_return_to_interpreter_trampoline.
* libguile/vm.c (capture_continuation): Treat the tier-down trampoline
as NULL.
* module/language/cps/closure-conversion.scm (compute-elidable-closures):
New function.
(convert-one, convert-closures): Add ability to set "self" variable of
$kfun to $f, hopefully avoiding passing that argument in some cases.
* module/language/cps/compile-bytecode.scm (compile-function): Pass the
has-closure? bit on through to the assembler.
* module/system/vm/assembler.scm (begin-standard-arity)
(begin-opt-arity, begin-kw-arity): Only reserve space for the closure
as appropriate.
* module/language/cps/slot-allocation.scm (allocate-args)
(compute-defs-and-uses, compute-needs-slot)
(compute-var-representations): Allow for closure slot allocation
differences.
* module/language/cps/cse.scm (compute-defs):
* module/language/cps/dce.scm (compute-live-code):
* module/language/cps/renumber.scm (renumber, compute-renaming):
(allocate-args):
* module/language/cps/specialize-numbers.scm (compute-significant-bits):
(compute-defs):
* module/language/cps/split-rec.scm (compute-free-vars):
* module/language/cps/types.scm (infer-types):
* module/language/cps/utils.scm (compute-max-label-and-var):
* module/language/cps/verify.scm (check-distinct-vars):
(compute-available-definitions): Allow closure to be #f.
* libguile/jit.c (compile_alloc_frame): Stop initializing locals.
(compile_bind_rest): Use emit_alloc_frame.
* libguile/vm-engine.c (assert_nargs_ee_locals, allocate_frame): Don't
initialize locals.
(bind_rest): Don't initialize locals, and assert that the locals count
has a minimum.
* libguile/intrinsics.h (SCM_FOR_ALL_VM_INTRINSICS):
* libguile/intrinsics.c (scm_bootstrap_intrinsics): Remove the atomic
intrinsics, as the JIT no longer needs them.
* libguile/vm-engine.c (VM_NAME): Inline the intrinsics.
These operations emit the same code that GCC does for corresponding
operations under the sequential consistency memory model. It would be
possible to relax to acquire/release or something in the future.
* libguile/atomics-internal.h (scm_atomic_compare_and_swap_scm): Change
to use same API as atomic-box-compare-and-swap!.
* libguile/intrinsics.c (atomic_compare_and_swap_scm): Remove loop, as
we're using the strong variant now.
* libguile/async.c (scm_i_async_pop, scm_i_async_push): Adapt to new API
of scm_atomic_compare_and_swap_scm.
* libguile/i18n.c (SCM_MAX_ALLOCA): New macro.
(SCM_STRING_TO_U32_BUF): Accept an additional variable to remember
whether we used malloc to allocate the buffer. Use malloc if the
allocation size is greater than SCM_MAX_ALLOCA.
(SCM_CLEANUP_U32_BUF): New macro.
(compare_u32_strings, compare_u32_strings_ci, str_to_case): Adapt.
* libguile/strings.c (SCM_MAX_ALLOCA): New macro.
(normalize_str, unistring_escapes_to_r6rs_escapes): Use malloc if the
allocation size is greater than SCM_MAX_ALLOCA.
* test-suite/tests/i18n.test, test-suite/tests/strings.test: Add tests.
Previously, 'put-u8' used textual I/O to write a single character,
relying on the usual practice of setting the port encoding to ISO-8859-1
for binary ports.
* libguile/r6rs-ports.c (scm_put_u8): Use 'scm_c_write', not 'scm_putc'.
* libguile/numbers.c (scm_exact_integer_sqrt, scm_sqrt)
(exact_integer_is_perfect_square, exact_integer_floor_square_root):
Where it is trivial to do so, use GMP's low-level mpn functions to
avoid heap allocation.
* test-suite/tests/r6rs-ports.test ("put-bytevector [2 args]")
("put-bytevector [3 args]", "put-bytevector [4 args]"): Set the default
port encoding instead of setting the locale.
This reverts the change to SCM_MAKE_CHAR made in the previous commit
63818453ad, which used an arithmetic trick
to avoid evaluating its argument more than once.
Here, we restore the previous implementation of SCM_MAKE_CHAR, which
evaluates its argument twice. Instead, we introduce a new inlinable
function 'scm_c_make_char' and replace uses of SCM_MAKE_CHAR with calls
to 'scm_c_make_char' where appropriate.
* libguile/chars.h (scm_c_make_char): New inline function.
* libguile/inline.c: Include chars.h.
* libguile/srfi-13.c (REF_IN_CHARSET, scm_string_any, scm_string_every)
(scm_string_trim, scm_string_trim_right, scm_string_trim_both)
(scm_string_index, scm_string_index_right, scm_string_skip)
(scm_string_skip_right, scm_string_count, string_titlecase_x)
(string_reverse_x, scm_string_fold, scm_string_fold_right)
(scm_string_for_each, scm_string_filter, scm_string_delete):
Use 'scm_c_make_char' instead of 'SCM_MAKE_CHAR' in cases where the
argument calls a function.
* libguile/chars.c (scm_char_upcase, scm_char_downcase, scm_char_titlecase),
libguile/ports.c (scm_port_decode_char),
libguile/print.c (scm_simple_format),
libguile/read.c (scm_read_character),
libguile/strings.c (scm_string_ref, scm_c_string_ref),
The motivation for this change is that SCM_MAKE_CHAR is sometimes passed
an expression that involves a procedure call that is not always trivial.
In other cases, the results are not guaranteed to be the same both
times, which could lead to the creation of invalid SCM objects.
* libguile/chars.h (SCM_MAKE_CHAR): Reimplement.
* libguile/scmsigs.c (signal_delivery_thread): Call scm_async_tick to
give any pending asyncs a chance to run before we block indefinitely
waiting for a signal to arrive.