Implements wishlist item <https://debbugs.gnu.org/18592>.
Requested by Frank Terbeck <ft@bewatermyfriend.org>.
Based on a proposed patch by Nala Ginrut <nalaginrut@gmail.com>.
Patch ported to 2.2 by Andy Wingo <wingo@pobox.com>.
* libguile/foreign.c (cif_to_procedure): Add 'with_errno' argument.
If true, truncate result to only one return value.
(scm_i_foreign_call): Separate the arguments. Always return errno.
(pointer_to_procedure): New static function.
(scm_pointer_to_procedure_with_errno): New C API function, implemented
in terms of 'pointer_to_procedure'.
(scm_pointer_to_procedure): Reimplement in terms of
'pointer_to_procedure', no longer bound to "pointer->procedure". See
below.
(scm_i_pointer_to_procedure): New C function bound to
"pointer->procedure" which now accepts the optional #:return-errno?
keyword argument, implemented in terms of 'pointer_to_procedure'.
(k_return_errno): New keyword #:return-errno?.
* libguile/foreign.h (scm_pointer_to_procedure_with_errno): Add prototype.
* doc/ref/api-foreign.texi (Dynamic FFI): Adjust documentation.
* libguile/vm-engine.c (foreign-call): Return two values.
There are two goals: one, to use less memory per dynamic state in order
to allow millions of dynamic states to be allocated in light-weight
threading scenarios. The second goal is to prevent dynamic states from
being actively mutated in two threads at once. This second goal does
mean that dynamic states object that escape into scheme are now copies
that won't receive further updates; an incompatible change, but one
which we hope doesn't affect anyone.
* libguile/cache-internal.h: New file.
* libguile/fluids.c (is_dynamic_state, get_dynamic_state)
(save_dynamic_state, restore_dynamic_state, add_entry)
(copy_value_table): New functions.
(scm_i_fluid_print, scm_i_dynamic_state_print): Move up.
(new_fluid): No need for a number.
(scm_fluid_p: scm_is_fluid): Inline IS_FLUID uses.
(fluid_set_x, fluid_ref): Adapt to dynamic state changes.
(scm_fluid_set_x, scm_fluid_unset_x): Call fluid_set_x.
(scm_swap_fluid): Rewrite in terms of fluid_ref and fluid_set.
(swap_fluid): Use internal fluid_set_x.
(scm_i_make_initial_dynamic_state): Adapt to dynamic state
representation change.
(scm_dynamic_state_p, scm_is_dynamic_state): Use new accessors.
(scm_current_dynamic_state): Use make_dynamic_state.
(scm_dynwind_current_dynamic_state): Use new accessor.
* libguile/fluids.h: Remove internal definitions. Add new struct
definition.
* libguile/threads.h (scm_i_thread): Use scm_t_dynamic_state for dynamic
state.
* libguile/threads.c (guilify_self_1, guilify_self_2):
(scm_i_init_thread_for_guile, scm_init_guile):
(scm_call_with_new_thread):
(scm_init_threads, scm_init_threads_default_dynamic_state): Adapt to
scm_i_thread change.
(scm_i_with_guile, with_guile): Remove "and parent" suffix.
(scm_i_reset_fluid): Remove unneeded function.
* doc/ref/api-scheduling.texi (Fluids and Dynamic States): Remove
scm_make_dynamic_state docs. Update current-dynamic-state docs.
* libguile/vm-engine.c (vm_engine): Update fluid-ref and fluid-set!
inlined fast paths for dynamic state changes.
* libguile/vm.c (vm_error_unbound_fluid): Remove now-unused function.
* NEWS: Update.
* module/ice-9/deprecated.scm (make-dynamic-state): New definition.
* libguile/deprecated.h:
* libguile/deprecated.c (scm_make_dynamic_state): Move here.
* libguile/__scm.h (scm_t_dynamic_state): New typedef.
* libguile/dynstack.h:
* libguile/dynstack.c (scm_dynstack_push_fluid):
(scm_dynstack_unwind_fluid): Take raw dynstate in these internal
functions.
* libguile/throw.c (catch): Adapt to dynstack changes.
* libguile/vm-engine.c (fluid-set!): Fix name of opcode to correspond
with name of Tree-IL primitive. Fixes compilation of fluid-set! to
actually use the fluid-set! opcode.
* doc/ref/vm.texi (Dynamic Environment Instructions): Update.
* module/language/cps/compile-bytecode.scm (compile-function): Add
fluid-set! case.
* module/system/vm/assembler.scm: Update export name for
emit-fluid-set!.
* libguile/foreign.c (CODE, get_foreign_stub_code): Add explicit
handle-interrupts and return-values calls, as foreign-call will fall
through.
* libguile/gsubr.c (A, B, C, AB, AC, BC, ABC, SUBR_STUB_CODE)
(scm_i_primitive_call_ip): Same.
* libguile/vm-engine.c (VM_HANDLE_INTERRUPTS): Inline into
handle-interrupts.
(RETURN_ONE_VALUE, RETURN_VALUE_LIST): Inline into callers, and fall
through instead of returning.
(BR_BINARY, BR_UNARY, BR_ARITHMETIC, BR_U64_ARITHMETIC): Remove
conditional VM_HANDLE_INTERRUPTS, as the compiler already inserted the
handle-interrupts calls if needed.
(vm_engine): Remove VM_HANDLE_INTERRUPTS invocations except in the
handle-interrupts instruction.
* libguile/__scm.h (SCM_TICK): Always define as scm_async_tick().
* libguile/error.c (scm_syserror, scm_syserror_msg):
* libguile/fports.c (fport_read, fport_write):
* libguile/_scm.h (SCM_SYSCALL): Replace SCM_ASYNC_TICK with
scm_async_tick ().
(SCM_ASYNC_TICK, SCM_ASYNC_TICK_WITH_CODE)
(SCM_ASYNC_TICK_WITH_GUARD_CODE): Remove internal definitions. We
inline into vm-engine.c, the only place where it matters.
* libguile/async.h:
* libguile/async.c (scm_async_tick, scm_i_setup_sleep):
(scm_i_reset_sleep, scm_system_async_mark_for_thread):
* libguile/threads.h (struct scm_thread_wake_data):
* libguile/threads.h (scm_i_thread):
* libguile/threads.c (block_self, guilify_self_1, scm_std_select):
Rewrite to use sequentially-consistent atomic references.
* libguile/atomics-internal.h (scm_atomic_set_pointer):
(scm_atomic_ref_pointer): New definitions.
* libguile/finalizers.c (queue_finalizer_async): We can allocate, so
just use scm_system_async_mark_for_thread instead of the set-cdr!
shenanigans.
* libguile/scmsigs.c (take_signal):
* libguile/gc.c (queue_after_gc_hook): Adapt to new asyncs mechanism.
Can't allocate but we're just manipulating the current thread when no
other threads are running so we should be good.
* libguile/vm-engine.c (VM_HANDLE_INTERRUPTS): Inline the async_tick
business.
* doc/ref/vm.texi (Inlined Atomic Instructions): New section.
* libguile/vm-engine.c (VM_VALIDATE_ATOMIC_BOX, make-atomic-box)
(atomic-box-ref, atomic-box-set!, atomic-box-swap!)
(atomic-box-compare-and-swap!): New instructions.
* libguile/vm.c: Include atomic and atomics-internal.h.
(vm_error_not_a_atomic_box): New function.
* module/ice-9/atomic.scm: Register primitives with the compiler.
* module/language/cps/compile-bytecode.scm (compile-function): Add
support for atomic ops.
* module/language/cps/effects-analysis.scm: Add comment about why no
effects analysis needed.
* module/language/cps/reify-primitives.scm (primitive-module): Add case
for (ice-9 atomic).
* module/language/tree-il/primitives.scm (*effect-free-primitives*):
(*effect+exception-free-primitives*): Add atomic-box?.
* module/system/vm/assembler.scm: Add new instructions.
* test-suite/tests/atomic.test: Test with compilation and
interpretation.
* doc/ref/vm.texi (Top-Level Environment Instructions): Update
documentation.
* libguile/_scm.h (SCM_OBJCODE_MINOR_VERSION): Bump, sadly.
* module/system/vm/assembler.scm (*bytecode-minor-version*): Bump.
* libguile/vm-engine.c (define!): Change to store variable in dst slot.
* module/language/tree-il/compile-cps.scm (convert):
* module/language/cps/compile-bytecode.scm (compile-function): Adapt to
define! change.
* module/language/cps/effects-analysis.scm (current-module): Fix define!
effects. Incidentally here was the bug: in Guile 2.2 you can't have
effects on different object kinds in one instruction, without
reverting to &unknown-memory-kinds.
* test-suite/tests/compiler.test ("regression tests"): Add a test.
* libguile/vm-engine.c (SP_REF_SLOT, SP_SET_SLOT): New defines.
(push, pop, mov, long-mov): Move full slots. Fixes 32-bit with
unboxed 64-bit stack values; before when shuffling these values
around, we were only shuffling the lower 32 bits on 32-bit platforms.
* libguile/vm-engine.c (VM_VALIDATE): Refactor some type-related
assertions to use a common macro.
(vector-length, vector-set!/immediate): Fix the proc mentioned in the
error message.
* libguile/vm-engine.c (BR_U64_SCM_COMPARISON): New helper.
(br-if-u64-<=-scm, br-if-u64-<-scm, br-if-u64-=-scm)
(br-if-u64->-scm, br-if-u64->=-scm): New instructions, to compare an
untagged u64 with a tagged SCM. Avoids many u64->scm operations.
* module/language/cps/compile-bytecode.scm (compile-function):
* module/language/cps/effects-analysis.scm:
* module/language/cps/type-fold.scm:
* module/system/vm/assembler.scm:
* module/system/vm/disassembler.scm (code-annotation, compute-labels):
* module/language/cps/primitives.scm (*branching-primcall-arities*): Add
support for new opcodes.
* module/language/cps/specialize-numbers.scm
(specialize-u64-scm-comparison): New helper.
* module/language/cps/specialize-numbers.scm (specialize-operations):
Specialize u64 comparisons.
* module/language/cps/types.scm (true-comparison-restrictions): New helper.
(define-comparison-inferrer): Use the new helper. Add support for
u64-<-scm et al.
* libguile/vm-engine.c (logsub): New op.
* module/language/cps/effects-analysis.scm (logsub):
* module/language/cps/types.scm (logsub):
* module/system/vm/assembler.scm (system): Add support for the new op.
* module/language/tree-il/compile-cps.scm (canonicalize):
Rewrite (logand x (lognot y)) to (logsub x y).
* libguile/vm-engine.c (bv-s8-ref, bv-s16-ref, bv-s32-ref, bv-s64-ref):
Unbox index and return unboxed S32 value.
(bv-s8-set!, bv-s16-set!, bv-s32-set!, bv-s64-set!): Unbox index and
take unboxed S32 value.
(bv-u8-ref, bv-u16-ref, bv-u32-ref, bv-u64-ref)
(bv-s8-set!, bv-s16-set!, bv-s32-set!, bv-s64-set!): Likewise, but
with unsigned values.
(bv-f32-ref, bv-f32-set!, bv-f64-ref, bv-f64-set!): Use memcpy to
access the value so we don't have to think about alignment. GCC will
inline this to a single instruction on architectures that support
unaligned access.
* libguile/vm.c (vm_error_out_of_range_uint64)
(vm_error_out_of_range_int64): New helpers.
* module/language/cps/slot-allocation.scm (compute-var-representations):
All bytevector ref operations produce untagged values.
* module/language/cps/types.scm (define-bytevector-accessors): Update
for bytevector untagged indices and values.
* module/language/cps/utils.scm (compute-constant-values): Fix s64
case.
* module/language/tree-il/compile-cps.scm (convert): Box results of all
bytevector accesses, and unbox incoming indices and values.
* libguile/instructions.c (FOR_EACH_INSTRUCTION_WORD_TYPE): Add word
types for immediate f64 and u64 values.
(TYPE_WIDTH): Bump up by a bit, now that we have 32 word types.
(NOP, parse_instruction): Use 64-bit meta type.
* libguile/vm-engine.c (load-f64, load-u64): New instructions.
* module/language/bytecode.scm (compute-instruction-arity): Add parser
for new instruction word types.
* module/language/cps/compile-bytecode.scm (compile-function): Add
special-cased assemblers for new instructions, and also for scm->u64
and u64->scm which I missed before.
* module/language/cps/effects-analysis.scm (load-f64, load-u64): New
instructions.
* module/language/cps/slot-allocation.scm (compute-needs-slot): load-f64
and load-u64 don't need slots.
(compute-var-representations): Update for new instructions.
* module/language/cps/specialize-primcalls.scm (specialize-primcalls):
Specialize scm->f64 and scm->u64 to make-f64 and make-u64.
* module/language/cps/types.scm (load-f64, load-u64): Wire up to type
inference, though currently type inference only runs before
specialization.
* module/language/cps/utils.scm (compute-defining-expressions): For some
reason I don't understand, it's possible to see two definitions that
are equal but not equal? here. Allow for now.
(compute-constant-values): Punch through type conversions to get
constant u64/f64 values.
* module/system/vm/assembler.scm (assembler): Support for new word
types. Export the new assemblers.
* libguile/vm-engine.c (add/immediate, sub/immediate)
(uadd/immediate, usub/immediate, umul/immediate): New instructions.
* module/language/cps/compile-bytecode.scm (compile-function):
* module/language/cps/slot-allocation.scm (compute-needs-slot):
* module/language/cps/types.scm:
* module/system/vm/assembler.scm (system):
* module/language/cps/effects-analysis.scm: Support
for new instructions.
* module/language/cps/optimize.scm (optimize-first-order-cps): Move
primcall specialization to the last step -- the only benefit of doing
it earlier was easier reasoning about side effects, and we're already
doing that in a more general way with (language cps types).
* module/language/cps/specialize-primcalls.scm (specialize-primcalls):
Specialize add and sub to add/immediate and sub/immediate, and
specialize u64 addition as well. U64 specialization doesn't work now
though because computing constant values doesn't work for U64s; oh
well.
* libguile/vm-engine.c: Remove add1 and sub1 instructions. Will replace
with add/immediate and sub/immediate.
* module/language/tree-il/peval.scm (peval): If we reify a new
<primcall>, expand it. Removes 1- and similar primcalls.
* module/language/tree-il/primitives.scm: Don't specialize (+ x 1) to 1+.
(expand-primcall): New export, does a single primcall expansion.
(expand-primitives): Use the new helper.
* module/language/cps/effects-analysis.scm:
* module/language/cps/primitives.scm:
* module/language/cps/types.scm:
* module/system/vm/assembler.scm: Remove support for add1 and sub1 CPS
primitives.
* test-suite/tests/peval.test ("partial evaluation"): Adapt tests that
expect 1+/1- to expect +/-.
* module/language/tree-il/compile-cps.scm (convert): bv-f32-ref,
bv-f32-set!, bv-f64-ref, and bv-f64-set! take the index as an untagged
u64 value.
* module/language/cps/types.scm (define-bytevector-uaccessors): New
helper, used while migrating bytevectors to take unboxed indexes.
Adapt f32/f64 accessors to use this definition helper.
* libguile/vm-engine.c (BV_FLOAT_REF, BV_FLOAT_SET): The index is
unboxed.
* libguile/vm-engine.c (vm_engine)
* libguile/vm.c (vm_apply_non_program_code): Arrange so that the code to
apply a non-program has its own IP, so that frame-instruction-pointer
for a non-program application doesn't point into the previously active
frame.
* libguile/vm-engine.c (fadd, fsub, fmul, fdiv): New instructions.
* module/language/cps/effects-analysis.scm:
* module/language/cps/types.scm: Wire up support for new instructions.
* module/system/vm/assembler.scm: Export emit-fadd and friends.
* module/language/tree-il/compile-cps.scm (convert): Box results of
bv-f32-ref and bv-f64-ref. Unbox the argument to bv-f32-set! and
bv-f64-set!.
* libguile/vm-engine.c (bv-f32-ref, bv-f64-ref): Results are raw.
(bv-f32-set!, bv-f64-set!): Take unboxed arguments.
* module/system/vm/assembler.scm (emit-scm->f64, emit-f64->scm):
Export.
* module/language/cps/compile-bytecode.scm (compile-function):
* module/language/cps/effects-analysis.scm: Add support for scm->f64 and
f64->scm.
* module/language/cps/slot-allocation.scm (compute-var-representations):
Add cases for primops returning raw values.
* module/language/cps/types.scm (bv-f32-ref, bv-f32-set!)
(bv-f64-ref, bv-f64-set!): Deal in &f64 values instead of reals.
* libguile/vm-engine.c (return-values): Change to also reset the frame,
if nlocals is nonzero.
* doc/ref/vm.texi (Procedure Call and Return Instructions): Updated
docs.
* module/language/cps/compile-bytecode.scm (compile-function): Adapt to
call emit-return-values with the right number of arguments.
* libguile/gsubr.c (scm_apply_subr): New internal helper.
* libguile/vm-engine.c (subr-call): Call out to scm_apply_subr.
* doc/ref/vm.texi (subr-call): Don't specify how the foreign pointer is
obtained.
* libguile/vm-engine.c: S24/S12/S8 operands addressed relative to the
SP, not the FP. Cache the SP instead of a FP-relative locals
pointer. Further cleanups to follow.
* libguile/vm.c (vm_builtin_call_with_values_code): Adapt to mov operand
addresing change.
* module/language/cps/compile-bytecode.scm (compile-function): Reify
SP-relative local indexes where appropriate.
* module/system/vm/assembler.scm (emit-fmov*): New helper, exported as
emit-fmov.
(shuffling-assembler, define-shuffling-assembler): Rewrite to shuffle
via push/pop/drop.
(standard-prelude, opt-prelude, kw-prelude): No need to provide for
shuffling args.
* test-suite/tests/rtl.test: Update.
* module/language/cps/slot-allocation.scm: Don't reserve slots 253-255.
* libguile/vm-engine.c: Renumber opcodes, and take the opportunity to
fold recent additions into more logical places. Be more precise when
describing the encoding of operands, to shuffle local references only
and not constants, immediates, or other such values.
(SP_REF, SP_SET): New helpers.
(BR_BINARY, BR_ARITHMETIC): Take full 24-bit operands. Our shuffle
strategy is to emit push when needed to bring far locals near, then
pop afterwards, shuffling away far destination values as needed; but
that doesn't work for conditionals, unless we introduce a trampoline.
Let's just do the simple thing for now. Native compilation will use
condition codes.
(push, pop, drop): Back from the dead! We'll only use these for
temporary shuffling though, when an opcode can't address the full
24-bit range.
(long-fmov): New instruction, like long-mov but relative to the frame
pointer.
(load-typed-array, make-array): Don't use a compressed encoding so
that we can avoid the shuffling case. It would be a pain, given that
they have so many operands already.
* module/language/bytecode.scm (compute-instruction-arity): Update for
new instrution word encodings.
* module/system/vm/assembler.scm: Update to expose some opcodes
directly, without the need for shuffling wrappers. Adapt to
instruction word encodings change.
* module/system/vm/disassembler.scm (disassembler): Adapt to instruction
coding change.
* libguile/vm-engine.c (vm_engine): Cache the address of local 0 instead
of the FP. This makes locals access a bit cheaper, but we still have
to negate the index. The right fix is to index relative to the SP
instead. That's a more involved change, so we punt until later.
Adapt VM stack to grow downward. This will make native compilation look
more like the VM code, as we will be able to use native CALL
instructions, taking proper advantage of the return address buffer.
* libguile/continuations.c (scm_i_continuation_to_frame): Record offsets
from stack top.
* libguile/control.c (scm_i_prompt_pop_abort_args_x): Adapt for reversed
order of arguments, and instead of relying on the abort to push on the
number of arguments, make the caller save the stack depth, which
allows us to compute the number of arguments ourselves.
(reify_partial_continuation, scm_c_abort): Adapt to reversed stack
order.
* libguile/dynstack.c (scm_dynstack_wind_prompt): Since we wind the
stack in a downward direction, subtract the reloc instead of adding
it.
* libguile/dynstack.h (SCM_F_DYNSTACK_PROMPT_ESCAPE_ONLY): Remove flag;
instead rely on prompt-establishing code to save the stack depth.
* libguile/eval.c (eval): Remove extraneous "volatile" declarations for
variables that are not re-set between the setjmp and any longjmp.
Adapt to save stack depth before instating the prompt.
* libguile/foreign.c (scm_i_foreign_call): Adapt to receive arguments in
reverse order.
* libguile/frames.c (frame_stack_top, scm_i_frame_stack_top): Adapt to
compute stack top instead of stack bottom.
(scm_c_frame_closure): Adapt to stack growth change.
(scm_frame_num_locals, scm_frame_local_ref, scm_frame_set_x): Use
union data type to access stack.
(RELOC): Reformat.
(scm_c_frame_previous): Adapt to stack growth change.
* libguile/frames.h: Adapt stack diagram to indicate that the stack
grows up.
(union scm_vm_stack_element): New data type used to access items on
the stack.
(SCM_FRAME_PREVIOUS_SP)
(SCM_FRAME_RETURN_ADDRESS, SCM_FRAME_SET_RETURN_ADDRESS)
(SCM_FRAME_DYNAMIC_LINK, SCM_FRAME_SET_DYNAMIC_LINK)
(SCM_FRAME_LOCAL, SCM_FRAME_NUM_LOCALS): Adapt to stack representation
change.
(SCM_FRAME_SLOT): New helper.
(SCM_VM_FRAME_FP, SCM_VM_FRAME_SP): Adapt to stack growth change.
* libguile/stacks.c (scm_make_stack): Record offsets from top of stack.
* libguile/throw.c (catch): Adapt to scm_i_prompt_pop_abort_args_x
change.
* libguile/vm-engine.c (ALLOC_FRAME, RESET_FRAME):
(FRAME_LOCALS_COUNT_FROM): Adapt to stack growth change.
(LOCAL_ADDRESS): Use SCM_FRAME_SLOT to get the address as the proper
data type.
(RETURN_ONE_VALUE, RETURN_VALUE_LIST): Adapt to stack growth change.
(apply): Shuffling up the SMOB apply args can cause the stack to
expand, so use ALLOC_FRAME instead of RESET_FRAME.
(vm_engine): Adapt for stack growth change.
* libguile/vm.c (vm_increase_sp, vm_push_sp, vm_restore_sp): Adapt to
stack representation change.
(scm_i_vm_cont_to_frame): Adapt to take offsets from the top.
(scm_i_vm_capture_stack): Adapt to capture from the top.
(vm_return_to_continuation_inner): Adapt for data type changes.
(vm_return_to_continuation): Likewise, and instead of looping, just
splat the saved arguments on with memcpy.
(vm_dispatch_hook): Adapt to receive arguments in the reverse order.
Adapt callers.
(vm_abort): There is never a tail argument. Adapt to stack
representation change.
(vm_reinstate_partial_continuation)
(vm_reinstate_partial_continuation_inner): Adapt to stack growth
change.
(allocate_stack, free_stack): Adapt to data type change.
(expand_stack): Don't try to mremap(), as you can't grow a mapping
from the bottom. Without knowing that there's a free mapping space
right below the old stack, which there usually isn't on Linux, we have
to copy. We can't use MAP_GROWSDOWN because Linux is buggy.
(make_vm): Adapt to stack representation changes.
(return_unused_stack_to_os): Round down instead of up, as the stack
grows down.
(scm_i_vm_mark_stack): Adapt to walk up the stack.
(scm_i_vm_free_stack): Adapt to scm_vm changes.
(vm_expand_stack_inner, reset_stack_limit, vm_expand_stack): Adapt to
the stack growing down.
(scm_call_n): Adapt to the stack growing down. Don't allow argv to
point into the stack.
* libguile/vm.h (struct scm_vm, struct scm_vm_cont): Adapt to hold the
stack top and bottom.
* libguile/vm-engine.c (RETURN_ONE_VALUE, RETURN_VALUE_LIST): These
helpers, used in subr-call and the like, might not actually have
enough space to push the return values. Use ALLOC_FRAME instead of
RESET_FRAME, for that reason.