* libguile/smob.c (scm_smob_prehistory): When initializing
SMOB_GC_KIND, pass 1 as the CLEAR_NEW_OBJECTS argument to
`GC_new_kind ()'. Without this, an assertion failure is
triggered in libgc's `reclaim.c'.
The `on_thread_exit ()' function allocates memory via libgc. When
called from the context of a pthread key detructor, the thread is
essentially "dead" already and `GC_lookup_thread ()' returns NULL,
which triggers an assertion in libgc's `thread_local_alloc.c'. This
patch arranges so that `on_thread_exit ()' is called from a suitable
context.
* libguile/threads.c (on_thread_exit): Remove now invalid comment
about access to libgc's TLS.
(init_thread_key): Don't pass `on_thread_exit ()' to
`scm_i_pthread_key_create ()'.
(scm_leave_guile_cleanup): Invoke `do_thread_exit ()'.
(really_launch): Invoke `pthread_exit ()'.
* libguile/Makefile.am (stack-limit-calibration.scm): Use $(srcdir), to
support building in a different directory.
(MOSTLYCLEANFILES): Add stack-limit-calibration.scm.
* libguile/threads.h (held_mutex): New field.
* libguile/threads.c (enqueue, remqueue, dequeue): Use critical
section to protect access to the queue.
(guilify_self_1): Initialize held_mutex field.
(on_thread_exit): If held_mutex non-null, unlock it.
(fat_mutex_unlock, fat_cond_free, scm_make_condition_variable,
fat_cond_signal, fat_cond_broadcast): Delete now unnecessary uses
of c->lock.
(fat_mutex_unlock): Pass m->lock to block_self() instead of
c->lock; move scm_i_pthread_mutex_unlock(m->lock) call from before
block_self() to after.
(scm_pthread_cond_wait, scm_pthread_cond_timedwait,
scm_i_thread_sleep_for_gc): Set held_mutex before pthread call;
reset it afterwards.
I was seeing a hang in srfi-18.test, when running make check in master,
in the "exception handler installation is thread-safe" test. It wasn't
100% reproducible, so looked like a race.
The problem is that wait-condition-variable is not actually
atomic in the way that it is supposed to be. It unlocks the mutex,
then starts waiting on the cond var. So it is possible for another
thread to lock the same mutex, and signal the cond var, before the
wait-condition-variable thread starts waiting.
In order for wait-condition-variable to be atomic - e.g. in a race
where thread A holds (Scheme-level) mutex M, and calls
(wait-condition-variable C M), and thread B calls (begin (lock-mutex
M) (signal-condition-variable C)) - it needs to call pthread_cond_wait
with the same underlying mutex as is involved in the `lock-mutex'
call. In terms of the threads.c code, this means that it has to use
M->lock, not C->lock.
block_self() used its mutex arg for two purposes: for protecting
access and changes to the wait queue, and for the pthread_cond_wait
call. But it wouldn't work reliably to use M->lock to protect C's
wait queue, because in theory two threads can call
(wait-condition-variable C M1) and (wait-condition-variable C M2)
concurrently, with M1 and M2 different. So we either have to pass
both C->lock and M->lock into block_self(), or use some other mutex to
protect the wait queue. For this patch, I switched to using the
critical section mutex, because that is a global and so easily
available. (If that turns out to be a problem for performance, we
could make each queue structure have its own mutex, but there's no
reason to believe yet that it is a problem, because the critical
section mutex isn't used much overall.)
So then we call block_self() with M->lock, and move where M->lock is
unlocked to after the block_self() call, instead of before.
That solves the first hang, but introduces a new one, when a SRFI-18
thread is terminated (`thread-terminate!') between being launched
(`make-thread') and started (`thread-start!'). The problem now is
that pthread_cond_wait is a cancellation point (see man
pthread_cancel), so the pthread_cond_wait call is one of the few
places where a thread-terminate! call can take effect. If the thread
is cancelled at that point, M->lock ends up still being locked, and
then when do_thread_exit() tries to lock M->lock again, it hangs.
The fix for that is a new `held_mutex' field in scm_i_thread, which is
set to point to the mutex just before a pthread_cond_(timed)wait call,
and set to NULL again afterwards. If on_thread_exit() finds that
held_mutex is non-NULL, it unlocks that mutex.
A detail is that checking and unlocking held_mutex must be done before
on_thread_exit() calls scm_i_ensure_signal_delivery_thread(), because
the innards of scm_i_ensure_signal_delivery_thread() can do another
pthread_cond_wait() call and so overwrite held_mutex. But that's OK,
because it's fine for the mutex check and unlock to happen outside
Guile mode.
Lastly, C->lock is then not needed, so I've removed it.
For explanation, see comments and text in the new file
libguile/measure-hwm.scm.
* .gitignore: Add libguile/stack-limit-calibration.scm.
* check-guile.in: Load libguile/stack-limit-calibration.scm.
* configure.in: Add AC_CONFIG_FILES to generate test-use-srfi from
test-use-srfi.in.
* libguile/Makefile.am (TESTS, TESTS_ENVIRONMENT,
stack-limit-calibration.scm): New targets, so that `make check'
calibrates the stack limit before running the Guile test suite.
* libguile/measure-hwm.scm: New file, calibrates stack limit for `make
check'.
* libguile/stackchk.c (scm_sys_get_stack_size): New primitive.
* libguile/stackchk.h (scm_sys_get_stack_size): New primitive
(declaration).
* test-suite/standalone/test-use-srfi: Renamed test-use-srfi.in, so
that ./configure can fill in variables in it.
* test-suite/standalone/test-use-srfi.in: Load
libguile/stack-limit-calibration.scm.
* libguile/threads.c (scm_threads_mark_stacks): Cast `&t->regs' to
`(void *)' rather than `(SCM_STACKITEM *)' to avoid "warning:
dereferencing type-punned pointer will break strict-aliasing rules"
with GCC 4.2.1 on `i386-unknown-freebsd7.0'.
* libguile/threads.c (struct select_args): New.
(do_std_select): New function.
(scm_std_select): Don't use `scm_{leave,enter}_guile ()' since they don't
have any effect; use `scm_without_guile ()' instead.
* libguile/read.c (scm_read_string): Use `scm_i_make_read_only_string ()' to
return a read-only string, as mandated by R5RS. Reported by Bill
Schottstaedt <bil@ccrma.Stanford.EDU>.
* libguile/strings.c (scm_i_make_read_only_string): New function.
(scm_i_shared_substring_read_only): Special-case the empty string
so that the read-only and read-write empty strings are `eq?'. This
optimization is relied on by the `substring/shared' `empty string'
test case in `srfi-13.test'.
* libguile/strings.h (scm_i_make_read_only_string): New declaration.
* test-suite/tests/strings.test ("string-set!")["literal string"]: New test.
* NEWS: Update.
* libguile/strings.c (scm_i_symbol_substring): Return a read-only string
since R5RS requires `symbol->string' to return a read-only string.
Reported by Bill Schottstaedt <bil@ccrma.Stanford.EDU>.
* test-suite/tests/symbols.test: Add `define-module' clause.
(exception:immutable-string): Adjust to current exception.
("symbol->string")["result is an immutable string"]: Use
`pass-if-exception' instead of `expect-fail-exception'.
* NEWS: Update.
(reported by Bill Schottstaedt)
* libguile/numbers.c (scm_gcd): When only one arg given, use scm_abs
to ensure that result is non-negative.
* test-suite/tests/numbers.test ("gcd"): New test, (gcd -2).
* libguile/gc.c (scm_gc): Don't use `scm_gc_running_p' as
an lvalue.
* libguile/gc.h (scm_gc_running_p): Define to 0.
* libguile/threads.h (scm_i_thread)[gc_running_p]: Remove.
Actually, threads would "go to sleep" either by blocking on a heap
allocation, or by noticing `scm_i_thread_go_to_sleep' is set when
running `SCM_TICK', which limits the applicability of this technique
(e.g., it was not appropriate for the shared string code).
* libguile/threads.c (scm_i_thread_go_to_sleep, scm_i_thread_put_to_sleep,
scm_i_thread_invalidate_freelists, scm_i_thread_wake_up,
scm_i_thread_sleep_for_gc): Remove.
* libguile/threads.h (scm_i_thread_go_to_sleep, scm_i_thread_put_to_sleep,
scm_i_thread_invalidate_freelists, scm_i_thread_wake_up,
scm_i_thread_sleep_for_gc): Remove declarations.
(SCM_THREAD_SWITCHING_CODE): Do nothing.
* libguile/strings.c (scm_i_string_writable_chars): Remove use of
`scm_i_thread_put_to_sleep ()'. This leaves a race condition,
which is hopefully not harmful.
* libguile/fluids.c (all_dynamic_states, all_fluids): Remove. Together,
they prevented dynamic states and fluids to be collected. Callers no
longer use them.
(resize_all_states): Remove.
(grow_dynamic_state): New function.
(next_fluid_num): Don't call `resize_all_states ()'.
(scm_i_fluid_num, scm_i_fast_fluid_ref, scm_i_fast_fluid_set_x): Remove,
as they broke encapsulation and would have needed duplication of the lazy
dynamic state growing code.
(scm_fluid_ref, scm_fluid_set_x): Lazily grow the dynamic state's fluid
vector.
(scm_fluids_prehistory): Don't set an `scm_after_sweep_c_hook'.
* libguile/fluids.h (SCM_FLUID_NUM, SCM_FAST_FLUID_REF, SCM_FAST_FLUID_SET_X,
scm_i_fluid_num, scm_i_fast_fluid_set_x, scm_i_fast_fluid_ref): Remove.
* libguile/load.c (the_reader_fluid_num): Remove.
(scm_primitive_load): Use `scm_fluid_ref ()' instead of
`SCM_FAST_FLUID_REF ()'.
(scm_init_load): Likewise.
Idea and original patch were by Ludovic Courts, this is Neil Jerram's
reworking of it.
* libguile/srfi-4.c (scm_uniform_vector_read_x): Use scm_c_read,
instead of equivalent code here.
* libguile/ports.c (scm_fill_input): Add assertion that read
buffer is empty when called.
(port_and_swap_buffer, swap_buffer): New, for...
(scm_c_read): Use caller's buffer for reading, to avoid making N
1-byte low-level read calls, in the case where the port is
unbuffered (or has a very small buffer).