1
Fork 0
mirror of https://git.savannah.gnu.org/git/guile.git synced 2025-05-02 04:40:29 +02:00
guile/libguile/threads.h
Neil Jerram 78aa4a8850 Fix continuation problems on IA64.
* Specific problems in IA64 make check

** test-unwind

Representation of the relevant dynamic context:

                  non-rewindable
           catch      frame       make cont.
  o----o-----a----------b-------------c
        \
         \             call cont.
          o-----o-----------d

A continuation is captured at (c), with a non-rewindable frame in the
dynamic context at (b).  If a rewind through that frame was attempted,
Guile would throw to the catch at (a).  Then the context unwinds back
past (a), then winds forwards again, and the captured continuation is
called at (d).

We should end up at the catch at (a).  On ia64, we get an "illegal
instruction".

The problem is that Guile does not restore the ia64 register backing
store (RBS) stack (which is saved off when the continuation is
captured) until all the unwinding and rewinding is done.  Therefore,
when the rewind code (scm_i_dowinds) hits the non-rewindable frame at
(b), the RBS stack hasn't yet been restored.  The throw finds the
jmp_buf (for the catch at (a)) correctly from the dynamic context, and
jumps back to (a), but the RBS stack is invalid, hence the illegal
instruction.

This could be fixed by restoring the RBS stack earlier, at the same
point (copy_stack) where the normal stack is restored.  But that
causes a problem in the next test...

** continuations.test

The dynamic context diagram for this case is similar:

                   non-rewindable
  catch                 frame       make cont.
    a----x-----o----------b-------------c
          \
           \    call cont.
            o-------d

The only significant difference is that the catch point (a) is
upstream of where the dynamic context forks.  This means that the RBS
stack at (d) already contains the correct RBS contents for throwing
back to (a), so it doesn't matter whether the RBS stack that was saved
off with the continuation gets restored.

This test passes with the Guile 1.8.4 code, but fails (with an
"illegal instruction") when the code is changed to restore the RBS
stack earlier as described above.

The problem now is that the RBS stack is being restored _too_ early;
specifically when there is still stuff to do that relies on the old
RBS contents.  When a continuation is called, the sequence of relevant
events is:

  (1) Grow the (normal) stack until it is bigger than the (normal)
      stack saved off in the continuation.  (scm_dynthrow, grow_stack)

  (2) scm_i_dowinds calls itself recursively, such that

      (2.1) for each rewind (from (x) to (c)) that will be needed,
            another frame is added to the stack (both normal and RBS),
            with local variables specifying the required rewind; the
            rewinds don't actually happen yet, they will happen when
            the stack unwinds again through these frames

      (2.2) required unwinds - back from where the continuation was
            called (d) to the fork point (x) - are done immediately.

  (3) The normal (i.e. non-RBS) stack that was stored in the
      continuation is restored (i.e. copied on top of the actual
      stack).

      Note that this doesn't overwrite the frames that were added in
      (2.1), because the growth in (1) ensures that the added frames
      are beyond the end of the restored stack.

  (4) ? Restore the RBS stack here too ?

  (5) Return (from copy_stack) through the (2.1) frames, which means
      that the rewinds now happen.

  (6) setcontext (or longjmp) to the context (c) where the
      continuation was captured.

The trouble is that step (1) does not create space in the RBS stack in
the same kind of way that it does for the normal stack.  Therefore, if
the saved (in the continuation) RBS stack is big enough, it can
overwrite the RBS of the (2.1) frames that still need to complete.
This causes an illegal instruction when we return through those frames
and try to perform the rewinds.

* Fix

The key to the fix is that the saved RBS stack only needs to be
restored at some point before the next setcontext call, and that doing
it as close to the setcontext call as possible will avoid bad
interactions with the pre-setcontext stack.  Therefore we do the
restoration at the last possible point, immediately before the next
setcontext call.

The situation is complicated by there being two ways that the next
setcontext call can happen.

  - If the unwinding and rewinding is all successful, the next
    setcontext will be the one from step (6) above.  This is the
    "normal" continuation invocation case.

  - If one of the rewinds throws an error, the next setcontext will
    come from the throw implementation code.  (And the one in step (6)
    will never happen.)  This is the rewind error case.

In the rewind error case, the code calling setcontext knows nothing
about the continuation.  So to cover both cases, we:

  - copy (in step (4) above) the address and length of the
    continuation's saved RBS stack to the current thread state
    (SCM_I_CURRENT_THREAD)

  - modify all setcontext callers so that they check the current
    thread state for a saved RBS stack, and restore it if so before
    calling setcontext.

* Notes

** I think rewinders cannot rely on using any stack data

Unless it can be guaranteed that the data won't go into a register.
I'm not 100% sure about this, but I think it follows from the fact
that the RBS stack is not restored until after the rewinds have
happened.

Note that this isn't a regression caused by the current fix.  In Guile
1.8.4, the RBS stack was restored _after_ the rewinds, and this is
still the case now.

** Most setcontext calls for `throw' don't need to change the RBS stack

In the absence of continuation invocation, the setcontext call in the
throw implementation code always sets context to a place higher up the
same stack (both normal and RBS), hence no stack restoration is
needed.

* Other changes

** Using setcontext for all non-local jumps (for __ia64__)

Along the way, I read a claim somewhere that setcontext was more
reliable than longjmp, in cases where the stack has been manipulated.

I don't now have any reason to believe this, but it seems reasonable
anyway to leave the __ia64__ code using getcontext/setcontext, instead
of setjmp/longjmp.

(I think the only possible argument against this would be performance -
if getcontext was significantly slower than setjmp.  It that proves to
be the case, we should revisit this.)

** Capping RBS base for non-main threads

Somewhere else along the way, I hit a problem in GC, involving the RBS
stack of a non-main thread.  The problem was, in
SCM_MARK_BACKING_STORE, that scm_ia64_register_backing_store_base was
returning a value that was massively greater than the value of
scm_ia64_ar_bsp, leading to a seg fault.  This is because the
implementation of scm_ia64_register_backing_store_base is only valid
for the main thread.  I couldn't find a neat way of getting the true
RBS base of a non-main thread, but one idea is simply to call
scm_ia64_ar_bsp when guilifying a thread, and use the value returned
as an upper bound for that thread's RBS base.  (Note that the RBS
stack grows upwards.)

(Were it not for scm_init_guile, we could be much more definitive
about this.  We could take the value of scm_ia64_ar_bsp as a
definitive base address for the part of the RBS stack that Guile cares
about.  We could also then discard
scm_ia64_register_backing_store_base.)
2008-05-12 23:06:04 +01:00

223 lines
6.8 KiB
C
Raw Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

/* classes: h_files */
#ifndef SCM_THREADS_H
#define SCM_THREADS_H
/* Copyright (C) 1996,1997,1998,2000,2001, 2002, 2003, 2004, 2006 Free Software Foundation, Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
*/
#include "libguile/__scm.h"
#include "libguile/procs.h"
#include "libguile/throw.h"
#include "libguile/root.h"
#include "libguile/iselect.h"
#include "libguile/dynwind.h"
#include "libguile/continuations.h"
#if SCM_USE_PTHREAD_THREADS
#include "libguile/pthread-threads.h"
#endif
#if SCM_USE_NULL_THREADS
#include "libguile/null-threads.h"
#endif
/* smob tags for the thread datatypes */
SCM_API scm_t_bits scm_tc16_thread;
SCM_API scm_t_bits scm_tc16_mutex;
SCM_API scm_t_bits scm_tc16_condvar;
typedef struct scm_i_thread {
struct scm_i_thread *next_thread;
SCM handle;
scm_i_pthread_t pthread;
SCM join_queue;
SCM result;
int exited;
SCM sleep_object;
scm_i_pthread_mutex_t *sleep_mutex;
scm_i_pthread_cond_t sleep_cond;
int sleep_fd, sleep_pipe[2];
/* This mutex represents this threads right to access the heap.
That right can temporarily be taken away by the GC.
*/
scm_i_pthread_mutex_t heap_mutex;
/* The freelists of this thread. Each thread has its own lists so
that they can all allocate concurrently.
*/
SCM freelist, freelist2;
int clear_freelists_p; /* set if GC was done while thread was asleep */
int gc_running_p; /* non-zero while this thread does GC or a
sweep. */
/* Other thread local things.
*/
SCM dynamic_state;
scm_t_debug_frame *last_debug_frame;
SCM dynwinds;
/* For system asyncs.
*/
SCM active_asyncs; /* The thunks to be run at the next
safe point */
unsigned int block_asyncs; /* Non-zero means that asyncs should
not be run. */
unsigned int pending_asyncs; /* Non-zero means that asyncs might be pending.
*/
/* The current continuation root and the stack base for it.
The continuation root is an arbitrary but unique object that
identifies a dynamic extent. Continuations created during that
extent can also only be invoked during it.
We use pairs where the car is the thread handle and the cdr links
to the previous pair. This might be used for better error
messages but is not essential for identifying continuation roots.
The continuation base is the far end of the stack upto which it
needs to be copied.
*/
SCM continuation_root;
SCM_STACKITEM *continuation_base;
/* For keeping track of the stack and registers. */
SCM_STACKITEM *base;
SCM_STACKITEM *top;
jmp_buf regs;
#ifdef __ia64__
void *register_backing_store_base;
scm_t_contregs *pending_rbs_continuation;
#endif
} scm_i_thread;
#define SCM_I_IS_THREAD(x) SCM_SMOB_PREDICATE (scm_tc16_thread, x)
#define SCM_I_THREAD_DATA(x) ((scm_i_thread *) SCM_SMOB_DATA (x))
#define SCM_VALIDATE_THREAD(pos, a) \
scm_assert_smob_type (scm_tc16_thread, (a))
#define SCM_VALIDATE_MUTEX(pos, a) \
scm_assert_smob_type (scm_tc16_mutex, (a))
#define SCM_VALIDATE_CONDVAR(pos, a) \
scm_assert_smob_type (scm_tc16_condvar, (a))
SCM_API SCM scm_spawn_thread (scm_t_catch_body body, void *body_data,
scm_t_catch_handler handler, void *handler_data);
SCM_API void *scm_without_guile (void *(*func)(void *), void *data);
SCM_API void *scm_with_guile (void *(*func)(void *), void *data);
SCM_API void *scm_i_with_guile_and_parent (void *(*func)(void *), void *data,
SCM parent);
extern int scm_i_thread_go_to_sleep;
void scm_i_thread_put_to_sleep (void);
void scm_i_thread_wake_up (void);
void scm_i_thread_invalidate_freelists (void);
void scm_i_thread_sleep_for_gc (void);
void scm_threads_prehistory (SCM_STACKITEM *);
void scm_threads_init_first_thread (void);
SCM_API void scm_threads_mark_stacks (void);
SCM_API void scm_init_threads (void);
SCM_API void scm_init_thread_procs (void);
SCM_API void scm_init_threads_default_dynamic_state (void);
#define SCM_THREAD_SWITCHING_CODE \
do { \
if (scm_i_thread_go_to_sleep) \
scm_i_thread_sleep_for_gc (); \
} while (0)
SCM_API SCM scm_call_with_new_thread (SCM thunk, SCM handler);
SCM_API SCM scm_yield (void);
SCM_API SCM scm_join_thread (SCM t);
SCM_API SCM scm_make_mutex (void);
SCM_API SCM scm_make_recursive_mutex (void);
SCM_API SCM scm_lock_mutex (SCM m);
SCM_API void scm_dynwind_lock_mutex (SCM mutex);
SCM_API SCM scm_try_mutex (SCM m);
SCM_API SCM scm_unlock_mutex (SCM m);
SCM_API SCM scm_make_condition_variable (void);
SCM_API SCM scm_wait_condition_variable (SCM cond, SCM mutex);
SCM_API SCM scm_timed_wait_condition_variable (SCM cond, SCM mutex,
SCM abstime);
SCM_API SCM scm_signal_condition_variable (SCM cond);
SCM_API SCM scm_broadcast_condition_variable (SCM cond);
SCM_API SCM scm_current_thread (void);
SCM_API SCM scm_all_threads (void);
SCM_API int scm_c_thread_exited_p (SCM thread);
SCM_API SCM scm_thread_exited_p (SCM thread);
SCM_API void scm_dynwind_critical_section (SCM mutex);
#define SCM_I_CURRENT_THREAD \
((scm_i_thread *) scm_i_pthread_getspecific (scm_i_thread_key))
SCM_API scm_i_pthread_key_t scm_i_thread_key;
#define scm_i_dynwinds() (SCM_I_CURRENT_THREAD->dynwinds)
#define scm_i_set_dynwinds(w) (SCM_I_CURRENT_THREAD->dynwinds = (w))
#define scm_i_last_debug_frame() (SCM_I_CURRENT_THREAD->last_debug_frame)
#define scm_i_set_last_debug_frame(f) \
(SCM_I_CURRENT_THREAD->last_debug_frame = (f))
SCM_API scm_i_pthread_mutex_t scm_i_misc_mutex;
/* Convenience functions for working with the pthread API in guile
mode.
*/
#if SCM_USE_PTHREAD_THREADS
SCM_API int scm_pthread_mutex_lock (pthread_mutex_t *mutex);
SCM_API void scm_dynwind_pthread_mutex_lock (pthread_mutex_t *mutex);
SCM_API int scm_pthread_cond_wait (pthread_cond_t *cond,
pthread_mutex_t *mutex);
SCM_API int scm_pthread_cond_timedwait (pthread_cond_t *cond,
pthread_mutex_t *mutex,
const struct timespec *abstime);
#endif
/* More convenience functions.
*/
SCM_API unsigned int scm_std_sleep (unsigned int);
SCM_API unsigned long scm_std_usleep (unsigned long);
#endif /* SCM_THREADS_H */
/*
Local Variables:
c-file-style: "gnu"
End:
*/