* ice-9/boot-9.scm (compile-time-environment): New function, with
documentation. The trick is that the compiler recognizes calls to
(compile-time-environment) and replaces it with a representation of the
*available* lexicals. Note that this might not be all the lexicals;
only the heap-allocated ones are returned.
* module/language/scheme/translate.scm (custom-transformer-table):
Compile `compile-time-environment' to <ghil-reified-env>.
* module/system/il/compile.scm (codegen): Add <ghil-reified-env> clause,
which calls ghil-env-reify.
* module/system/il/ghil.scm (ghil-env-reify): New procedure, returns a
list of (NAME . EXTERNAL-INDEX).
(<ghil>): Add <ghil-reified-env> object.
* module/system/base/syntax.scm (define-type): Rework to not require the
`|', which confuses Emacs.
* module/system/il/ghil.scm (<ghil>):
* module/system/il/glil.scm (<glil>): Adapt to define-type changes.
* libguile/threads.h (held_mutex): New field.
* libguile/threads.c (enqueue, remqueue, dequeue): Use critical
section to protect access to the queue.
(guilify_self_1): Initialize held_mutex field.
(on_thread_exit): If held_mutex non-null, unlock it.
(fat_mutex_unlock, fat_cond_free, scm_make_condition_variable,
fat_cond_signal, fat_cond_broadcast): Delete now unnecessary uses
of c->lock.
(fat_mutex_unlock): Pass m->lock to block_self() instead of
c->lock; move scm_i_pthread_mutex_unlock(m->lock) call from before
block_self() to after.
(scm_pthread_cond_wait, scm_pthread_cond_timedwait,
scm_i_thread_sleep_for_gc): Set held_mutex before pthread call;
reset it afterwards.
I was seeing a hang in srfi-18.test, when running make check in master,
in the "exception handler installation is thread-safe" test. It wasn't
100% reproducible, so looked like a race.
The problem is that wait-condition-variable is not actually
atomic in the way that it is supposed to be. It unlocks the mutex,
then starts waiting on the cond var. So it is possible for another
thread to lock the same mutex, and signal the cond var, before the
wait-condition-variable thread starts waiting.
In order for wait-condition-variable to be atomic - e.g. in a race
where thread A holds (Scheme-level) mutex M, and calls
(wait-condition-variable C M), and thread B calls (begin (lock-mutex
M) (signal-condition-variable C)) - it needs to call pthread_cond_wait
with the same underlying mutex as is involved in the `lock-mutex'
call. In terms of the threads.c code, this means that it has to use
M->lock, not C->lock.
block_self() used its mutex arg for two purposes: for protecting
access and changes to the wait queue, and for the pthread_cond_wait
call. But it wouldn't work reliably to use M->lock to protect C's
wait queue, because in theory two threads can call
(wait-condition-variable C M1) and (wait-condition-variable C M2)
concurrently, with M1 and M2 different. So we either have to pass
both C->lock and M->lock into block_self(), or use some other mutex to
protect the wait queue. For this patch, I switched to using the
critical section mutex, because that is a global and so easily
available. (If that turns out to be a problem for performance, we
could make each queue structure have its own mutex, but there's no
reason to believe yet that it is a problem, because the critical
section mutex isn't used much overall.)
So then we call block_self() with M->lock, and move where M->lock is
unlocked to after the block_self() call, instead of before.
That solves the first hang, but introduces a new one, when a SRFI-18
thread is terminated (`thread-terminate!') between being launched
(`make-thread') and started (`thread-start!'). The problem now is
that pthread_cond_wait is a cancellation point (see man
pthread_cancel), so the pthread_cond_wait call is one of the few
places where a thread-terminate! call can take effect. If the thread
is cancelled at that point, M->lock ends up still being locked, and
then when do_thread_exit() tries to lock M->lock again, it hangs.
The fix for that is a new `held_mutex' field in scm_i_thread, which is
set to point to the mutex just before a pthread_cond_(timed)wait call,
and set to NULL again afterwards. If on_thread_exit() finds that
held_mutex is non-NULL, it unlocks that mutex.
A detail is that checking and unlocking held_mutex must be done before
on_thread_exit() calls scm_i_ensure_signal_delivery_thread(), because
the innards of scm_i_ensure_signal_delivery_thread() can do another
pthread_cond_wait() call and so overwrite held_mutex. But that's OK,
because it's fine for the mutex check and unlock to happen outside
Guile mode.
Lastly, C->lock is then not needed, so I've removed it.
* libguile/threads.h (held_mutex): New field.
* libguile/threads.c (enqueue, remqueue, dequeue): Use critical
section to protect access to the queue.
(guilify_self_1): Initialize held_mutex field.
(on_thread_exit): If held_mutex non-null, unlock it.
(fat_mutex_unlock, fat_cond_free, scm_make_condition_variable,
fat_cond_signal, fat_cond_broadcast): Delete now unnecessary uses
of c->lock.
(fat_mutex_unlock): Pass m->lock to block_self() instead of
c->lock; move scm_i_pthread_mutex_unlock(m->lock) call from before
block_self() to after.
(scm_pthread_cond_wait, scm_pthread_cond_timedwait,
scm_i_thread_sleep_for_gc): Set held_mutex before pthread call;
reset it afterwards.
I was seeing a hang in srfi-18.test, when running make check in master,
in the "exception handler installation is thread-safe" test. It wasn't
100% reproducible, so looked like a race.
The problem is that wait-condition-variable is not actually
atomic in the way that it is supposed to be. It unlocks the mutex,
then starts waiting on the cond var. So it is possible for another
thread to lock the same mutex, and signal the cond var, before the
wait-condition-variable thread starts waiting.
In order for wait-condition-variable to be atomic - e.g. in a race
where thread A holds (Scheme-level) mutex M, and calls
(wait-condition-variable C M), and thread B calls (begin (lock-mutex
M) (signal-condition-variable C)) - it needs to call pthread_cond_wait
with the same underlying mutex as is involved in the `lock-mutex'
call. In terms of the threads.c code, this means that it has to use
M->lock, not C->lock.
block_self() used its mutex arg for two purposes: for protecting
access and changes to the wait queue, and for the pthread_cond_wait
call. But it wouldn't work reliably to use M->lock to protect C's
wait queue, because in theory two threads can call
(wait-condition-variable C M1) and (wait-condition-variable C M2)
concurrently, with M1 and M2 different. So we either have to pass
both C->lock and M->lock into block_self(), or use some other mutex to
protect the wait queue. For this patch, I switched to using the
critical section mutex, because that is a global and so easily
available. (If that turns out to be a problem for performance, we
could make each queue structure have its own mutex, but there's no
reason to believe yet that it is a problem, because the critical
section mutex isn't used much overall.)
So then we call block_self() with M->lock, and move where M->lock is
unlocked to after the block_self() call, instead of before.
That solves the first hang, but introduces a new one, when a SRFI-18
thread is terminated (`thread-terminate!') between being launched
(`make-thread') and started (`thread-start!'). The problem now is
that pthread_cond_wait is a cancellation point (see man
pthread_cancel), so the pthread_cond_wait call is one of the few
places where a thread-terminate! call can take effect. If the thread
is cancelled at that point, M->lock ends up still being locked, and
then when do_thread_exit() tries to lock M->lock again, it hangs.
The fix for that is a new `held_mutex' field in scm_i_thread, which is
set to point to the mutex just before a pthread_cond_(timed)wait call,
and set to NULL again afterwards. If on_thread_exit() finds that
held_mutex is non-NULL, it unlocks that mutex.
A detail is that checking and unlocking held_mutex must be done before
on_thread_exit() calls scm_i_ensure_signal_delivery_thread(), because
the innards of scm_i_ensure_signal_delivery_thread() can do another
pthread_cond_wait() call and so overwrite held_mutex. But that's OK,
because it's fine for the mutex check and unlock to happen outside
Guile mode.
Lastly, C->lock is then not needed, so I've removed it.
* oop/goops.scm (define-generic, define-accessor): Define as defmacros. I
find their semantics to be a bit odd, though -- but the test case
checks for this behavior, so we'll follow the test cases.
* oop/goops.scm: Use srfi-1, as util.scm already does.
(kw-do-map): New helper for processing keyword args.
(define-class-pre-definition, define-class): Rework so that
define-class is a defmacro without side effects. There are two
functional differences: we don't check that define-class is called only
at the toplevel, because defining a lexical class might makes sense,
and defmacros don't give us the toplevel check that we would want.
Second in the redefinition case, we don't do a `define', as we don't
actually need a new variable.
(class): Similarly, make `class' a defmacro.
* gdbinit (pp, inst): New commands.
* libguile/vm-engine.c (vm_error_not_a_pair): New error case.
* libguile/vm-i-scheme.c (VM_VALIDATE_CONS): New macro -- use this
instead of SCM_VALIDATE_* because SCM_VALIDATE will exit nonlocally
before we have a chance to sync the regs.
(car, cdr, set-car, set-cdr): Use VM_VALIDATE_CONS.
* libguile/vm-i-system.c (goto/args): Bugfix: when doing a
self-tail-recursion, allocate fresh externals. Fixes use of match.go.
* module/system/vm/assemble.scm (dump-object!): Add some checks that we
aren't dumping out values that the VM can't handle.
* module/system/vm/disasm.scm (disassemble-externals): Fix rotten call to
`print-info'.
* oop/goops/dispatch.scm: Add a FIXME.
* testsuite/Makefile.am (vm_test_files):
* testsuite/t-closure4.scm (extract-symbols): New test, distilled with
much effort out of match.scm.
* ice-9/Makefile.am (NOCOMP_SOURCES): Re-enable compilation of match.scm.
Yay!
For explanation, see comments and text in the new file
libguile/measure-hwm.scm.
* .gitignore: Add libguile/stack-limit-calibration.scm.
* check-guile.in: Load libguile/stack-limit-calibration.scm.
* configure.in: Add AC_CONFIG_FILES to generate test-use-srfi from
test-use-srfi.in.
* libguile/Makefile.am (TESTS, TESTS_ENVIRONMENT,
stack-limit-calibration.scm): New targets, so that `make check'
calibrates the stack limit before running the Guile test suite.
* libguile/measure-hwm.scm: New file, calibrates stack limit for `make
check'.
* libguile/stackchk.c (scm_sys_get_stack_size): New primitive.
* libguile/stackchk.h (scm_sys_get_stack_size): New primitive
(declaration).
* test-suite/standalone/test-use-srfi: Renamed test-use-srfi.in, so
that ./configure can fill in variables in it.
* test-suite/standalone/test-use-srfi.in: Load
libguile/stack-limit-calibration.scm.
* test-suite/tests/elisp.test: If running the '(apply foo nil) test
fails with a vm-error, throw UNRESOLVED. This allows the test suite to
pass in the compiled boot-9.scm while still keeping the elisp apply
issue open.
* test-suite/tests/elisp.test: Enlarge the stack for the duration of the
elisp test. It's a hack, but it at least allows the test to run with a
compiled ice-9.
* module/system/vm/assemble.scm (make-temp-binding, btemp:name)
(btemp:extp, btemp:index): Don't abuse program.scm's make-binding to
make something that actually isn't a binding.
(codegen): Do use program.scm's make-binding to make something that
actually is a binding.
* module/system/vm/program.scm (binding:start, binding:end): New
accessors.
(make-binding): Expand to have the start and end arguments in the
constructor.
I saw this problem when running elisp.test -- it tries to apply a
function to an arglist ending in nil, which obviously is not null.
* libguile/vm-engine.h (PUSH_LIST): New helper macro, pushes the elements
of a list onto the stack. Checks to make sure that the list is proper.
* libguile/vm-i-system.c (list-break, mv-call, apply, goto/apply)
(goto/cc): Use LIST_BREAK.
* libguile/vm-engine.c (vm_error_improper_list): New error case.
* libguile/vm-i-system.c (goto/args): Sync the registers before doing the
SCM_TICK. We probably need a different SCM_TICK that saves the regs
only if necessary. This fixes GC problems with a compiled popen.scm.
* ice-9/Makefile.am: Re-enable popen.scm compilation.
* module/system/vm/disasm.scm (disassemble-program): Fix misunderstanding
of nlocs: the *actual* number of locals is nlocs + nargs, even if the
arg is heap-allocated -- because our calling convention always puts the
initial val on the stack. Also: don't disassemble the objects, they are
now woven into the text.
(code-annotation): Fix external-{ref,set} handling to allow for
referencing externals from enclosed stack frames. Really this should be
statically determined, though. Add late-variable-{ref,set} handling.
* module/system/vm/assemble.scm (pop): Define a pop here too.
(codegen): Rework how bindings are represented in a program's
meta-info, so they declare their range in the binding list instead of
you having to figure out when they end.
* module/system/vm/conv.scm (make-byte-decoder): Return the end-address
as well; requires a change to callers.
* module/system/vm/disasm.scm (disassemble-objcode, disassemble-program)
(disassemble-bytecode, disassemble-objects, disassemble-externals)
(disassemble-meta, source->string, make-int16, code-annotation)
(print-info): Rework to display my domination of `format', and, more
seriously, start to integrate the "subsections" of the disassembly into
the main disassembly text.
* module/system/vm/program.scm (program-bindings-as-lambda-list): Update
for new bindings format; should be more correct.
* libguile/vm-i-system.c (call/cc, goto/cc): Don't assert that ip matches
vp->ip, because vp->ip is not restored by vm_reset_stack, and indeed
it's re-set to 0 by `halt'. But still, perhaps reset_stack and halt
should indeed reset vp->ip.
* ice-9/Makefile.am: Don't compile popen.scm, its behaviour at runtime
is not consistent -- seems to miss some GC references? I suspect a bug
in the compiler. In any case without popen.scm being compiled,
continuations.test, r4rs.tes, and r5rs_pitfall.test do pass.
* libguile/threads.h (scm_i_thread):
* libguile/threads.c (thread_mark, guilify_self_2): Add a field for the
thread's vm. Previously I had this as a fluid, but it seems that newly
created threads share their fluid values from the creator thread; as
expected, I guess. In any case one VM should not be active in two
threads.
* libguile/vm.c (scm_the_vm): Change to access the thread-local vm,
instead of accessing a fluid.
(scm_the_vm_fluid): Removed.
* module/system/vm/vm.scm: Removed *the-vm*.
* libguile/threads.c (scm_threads_mark_stacks): Cast `&t->regs' to
`(void *)' rather than `(SCM_STACKITEM *)' to avoid "warning:
dereferencing type-punned pointer will break strict-aliasing rules"
with GCC 4.2.1 on `i386-unknown-freebsd7.0'.
* libguile/vm-i-system.c (goto/cc): Add some asserts here.
* libguile/vm.c (capture_vm_cont): Add some asserts here too.
(reinstate_vm_cont): Null the correct number of bytes. Add a FIXME.
(vm_reset_stack): Make the code a bit clearer. Null the correct number
of bytes.
* libguile/vm-engine.h (NULLSTACK_FOR_NONLOCAL_EXIT): New macro, handles
a very tricky case that took me days to find! Amply commented. Expands
to nothing in the normal case.
* libguile/vm-i-system.c (call, goto/args, mv-call): Call
NULLSTACK_FOR_NONLOCAL_EXIT in the right places. Fixes
continuations.test.
* libguile/vm-engine.c (vm_error_wrong_num_args): Sync the registers
before calling scm_wrong_num_args. (The other cases are handled more
uniformly.)
* libguile/vm.c (vm_heapify_frames_1): Add a FIXME: I don't think we
should be modifying the stack.
(scm_vm_save_stack): If stack nulling is enabled, verify the stack here
before reifying it.
* module/language/scheme/spec.scm (scheme): Use primitive-eval here
instead of eval, because at the repl we do want to allow evaluations to
have side effects like setting the current module.
* libguile/vm-engine.h (CHECK_STACK_LEAK, NULLSTACK): Add a new mode,
VM_ENABLE_STACK_NULLING, that tries to ensure that all stack data past
the top of the stack is NULL. This helps to verify the VM's
consistency. If VM_ENABLE_STACK_NULLING is not defined, there is no
overhead.
(DROP, DROPN): Hook into NULLSTACK.
(POP_LIST): Hoo, fix a good bug: if CONS triggered a GC, the elements
of the list that had not yet been consed would not be marked, because
the sp was already below them.
(NEXT): Hook into CHECK_STACK_LEAK.
(INIT_ARGS): Add a note that consing the rest arg can cause GC.
(NEW_FRAME): Cons up the external data after initializing the frame, so
that if GC is triggered, the precise marker sees a well-formed frame.
* libguile/vm-i-loader.c (load-program): In the four-integers case, use
the POP macro so that we can hook into NULLSTACK (if necessary).
* libguile/vm-i-scheme.c (ARGS2, ARGS3): Hook into NULLSTACK.
* libguile/vm-i-system.c (halt): Null the nvalues. Rework some asserts
into using ASSERT, and null the stack when we free the frame.
(variable-set): Use DROPN instead of sp -= 2.
(BR): Hook into NULLSTACK.
(goto/args): Hook into NULLSTACK. In the non-self case, delay updating
the frame until after INIT_ARGS so that GC sees a well-formed frame.
Delay consing the externals until after the frame is set up, as in
NEW_FRAME.
(call/cc): Add some asserts.
(return): Rework some asserts into ASSERT, and hook into NULLSTACK.
(return/values): Hook into NULLSTACK, and use ASSERT.
(return/values*) Use ASSERT.
* libguile/vm.c (VM_ENABLE_ASSERTIONS, VM_ENABLE_STACK_NULLING): These
are the variables that control assertions and nulling. Perhaps we can
do these per-engine when we start compiling the debug engine separate
from a speedy engine.
(vm_mark_stack): Add a precise stack marker. Yay!
(vm_cont_mark): Mark the continuation stack precisely.
(capture_vm_cont): Record the difference from the vp's stack_base too,
so that we can translate the dynamic links when marking the
continuation stack. Memset the stack to NULL if we are doing nulling.
(reinstate_vm_cont): If we are nulling, null out the relevant part
of the stack.
(vm_reset_stack): When resetting sp due to a nonlocal exit, null out
the stack too.
(vm_mark): If we are nulling, assert that there are no extra values on
the stack. Mark the stack precisely.
* configure.in:
* gdb-pre-inst-guile.in: Add gdb-pre-inst-guile, because I'm tired of
typos. You can run it just like Guile. For compiling, you might try
GUILE=./gdb-pre-inst-guile scripts/compile foo.scm.
* libguile/vm-engine.c: Call scm_wrong_num_args in the wrong-num-args
case, to be more like the interpreter.
* libguile/vm-engine.h (ASSERT): New macro.
* libguile/vm-i-system.c (apply, goto/apply): Assert that nargs >= 2,
because the compiler should always feed us correct instructions.
(call/cc): If no values are returned to the continuation, signal
no_values instead of wrong_num_args.
* ice-9/i18n.scm: Load the i18n extension when compiling too, so that the
macros that depend on (provided? 'nl-langinfo) actually have
nl-langinfo. Fixes the i18n test.
* libguile/vm-engine.h (POP_CONS_MARK): New macro, analagous to
POP_LIST_MARK; used in quasiquote on improper lists.
* libguile/vm-i-system.c (cons-mark): New instruction. You know the
drill, remove all your .go files please.
* module/system/il/compile.scm (codegen): Compile quasiquoted improper
lists with splices correctly. Additionally check that we don't have
slices in the CDR of an improper list.
* testsuite/t-quasiquote.scm: Add a test for unquote-splicing in improper
lists.
* gdbinit: Update to be a bit more useful.
* libguile/vm-i-system.c: Make sure that arguments to C procedures are
visible on the stack so they get marked. Could be a source for the
missed references.
* oop/goops.scm: Update so as not to use (the-environment), which no
longer exists. I think that the speed characteristics are the same,
broadly speaking.
* libguile/vm-engine.c (vm_run): Add new error case for resolving @ or @@
references, but there is no such module. Possible if
module-public-interface returns #f.
* libguile/vm-i-loader.c (link-now): Allow the stack arg to be a sym, as
before, or a list, indicating an absolute reference. Could be two
separate instructions, but I'm lazy.
* libguile/vm-i-system.c (late-variable-ref, late-variable-set): As in
link-now, allow the lazy reference to be a list, for @ and @@.
* module/language/scheme/translate.scm (custom-transformer-table):
Compile @ and @@, and set! forms for both of them. This will ease the
non-hygienic pain for exported macros.
* module/system/il/compile.scm (make-glil-var): Translate public and
private module variable references into glil-module variables.
* module/system/il/ghil.scm (ghil-var-at-module!): New function, resolves
a variable for @ or @@.
* module/system/il/glil.scm (<glil-module>): Revival of <glil-module>,
this time with the semantics that it really links to a particular
module.
* module/system/vm/assemble.scm (<vlink-now>, <vlink-later>): Redefine as
taking a "key" as the argument, which may be a sym or a list; see the
notes on link-now for more details.
(codegen): Compile <glil-module> appropriately. Some duplication here,
probably could use some cleanup later.