Hello,
I'm having a problem with a deadlock when using pthread_cond_signal() and
pthread_cond_wait(). One thread waits for the signal and another signals.
However, the waiting thread does not receive the signal. I tracked it down to
pthread_cond_wait() not returning. My code is posted below to ensure I
didn't do anything silly:
m_Condition is a pthread_cond_t.
CCondition::CCondition()
{
pthread_cond_init(&m_Condition, NULL);
}
CCondition::~CCondition()
{
pthread_cond_destroy(&m_Condition);
}
bool CCondition::Wait()
{
bool bRet;
m_Mutex.Lock();
bRet = pthread_cond_wait(&m_Condition, m_Mutex.GetMutex()) == 0;
return bRet;
}
void CCondition::Signal()
{
pthread_cond_signal(&m_Condition);
}
From what I understood the mutex (wrapped into m_Mutex object) needs to be
locked before calling pthread_cond_wait().
Thanks in advance.
-- John
John | 2/8/2005 10:44:11 AM
"John Smith" <john.smith@x-formation.com> wrote in message
news:cua4ud$81m$1@news.net.uni-c.dk...
> My code is posted below to ensure I
> didn't do anything silly:
The main problem is that you are using a condition variable without an
associated condition. You need to use pthread_cond_wait like this:
pthread_mutex_lock(...)
while(cant_work_now()) pthread_cond_wait(...)
do_work()
pthread_mutex_unlock(...)
This is not the most obvious problem, but it's the one that will get you
next.
> m_Condition is a pthread_cond_t.
>
> CCondition::CCondition()
> {
> pthread_cond_init(&m_Condition, NULL);
> }
>
> CCondition::~CCondition()
> {
> pthread_cond_destroy(&m_Condition);
> }
>
> bool CCondition::Wait()
> {
> bool bRet;
> m_Mutex.Lock();
> bRet = pthread_cond_wait(&m_Condition, m_Mutex.GetMutex()) == 0;
>
> return bRet;
> }
> void CCondition::Signal()
> {
> pthread_cond_signal(&m_Condition);
> }
>
> From what I understood the mutex (wrapped into m_Mutex object) needs to be
> locked before calling pthread_cond_wait().
Correct, otherwise how could you tell whether you needed to wait or not?
Your class, as written, could not possibly be used. How would code know
whether or not to call 'Wait'?
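
As an illustration of what a predicate such as cant_work_now() might be, here is a minimal sketch. The names jobs_mutex, jobs_cond, pending_jobs, submit_job() and take_job() are hypothetical and do not come from the original code; the only point is that the waiter tests real shared state under the mutex instead of waiting for a bare signal.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical shared state: number of jobs waiting to be processed. */
static pthread_mutex_t jobs_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  jobs_cond  = PTHREAD_COND_INITIALIZER;
static int pending_jobs = 0;

/* Producer side: change the state under the mutex, then signal. */
void submit_job(void)
{
    pthread_mutex_lock(&jobs_mutex);
    pending_jobs++;
    pthread_mutex_unlock(&jobs_mutex);
    pthread_cond_signal(&jobs_cond);
}

/* Consumer side: "pending_jobs == 0" plays the role of cant_work_now().
 * Takes one job and returns how many are still pending afterwards. */
int take_job(void)
{
    int remaining;

    pthread_mutex_lock(&jobs_mutex);
    while (pending_jobs == 0)
        pthread_cond_wait(&jobs_cond, &jobs_mutex);
    assert(pending_jobs > 0);   /* guaranteed by the loop above */
    remaining = --pending_jobs;
    pthread_mutex_unlock(&jobs_mutex);
    return remaining;
}
```

With this shape the caller never has to decide whether to wait: take_job() waits only when there is nothing to do.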
DS
David | 2/8/2005 6:24:38 PM
Thank you David.
Assume we have two threads 1 and 2. Both start up and 2 goes into blocking
state and won't wake up until thread 1 notifies it. This is my scenario
which is pretty classic.
> The main problem is that you are using a condition variable without an
> associated condition. You need to use pthread_cond_wait like this:
>
> pthread_mutex_lock(...)
> while(cant_work_now()) pthread_cond_wait(...)
> do_work()
> pthread_mutex_unlock(...)
>
This code is called from thread 2. I'm unsure what cant_work_now() would be.
Essentially I want it to block in pthread_cond_wait() until it receives the
signal.
> > From what I understood the mutex (wrapped into m_Mutex object) needs to be
> > locked before calling pthread_cond_wait().
>
> Correct, otherwise how could you tell whether you needed to wait or not?
> Your class, as written could not possibly be used.
> How would code know whether or not to call 'Wait'?
>
Because thread 1 will signal much later than thread 2 goes into a blocked
state, I see no problem.
I'm still a bit unsure how to model my scenario in code. Maybe there are some
alternative pthread functions I could look into?
Thanks.
-- John
John | 2/8/2005 7:34:43 PM
John Smith wrote:
> Thank you David.
>
> Assume we have two threads 1 and 2. Both start up and 2 goes into blocking
> state and won't wake up until thread 1 notifies it. This is my scenario
> which is pretty classic.
>
>
>> The main problem is that you are using a condition variable without an
>>associated condition. You need to use pthread_cond_wait like this:
>>
>>pthread_mutex_lock(...)
>>while(cant_work_now()) pthread_cond_wait(...)
>>do_work()
>>pthread_mutex_unlock(...)
>>
>
>
> This code is called from thread 2. I'm unsure what cant_work_now() would be.
> Essentially I want it to block in pthread_cond_wait() until it receives the
> signal.
>
>
>>> From what I understood the mutex (wrapped into m_Mutex object) needs to be
>>> locked before calling pthread_cond_wait().
>>
>> Correct, otherwise how could you tell whether you needed to wait or not?
>> Your class, as written could not possibly be used.
>> How would code know whether or not to call 'Wait'?
>>
>
> Because thread 1 will signal much later than thread 2 goes into a blocked
> state I see no problem.
>
> I'm still a bit unsure how to model my scenario into code. Maybe there are some
> alternative pthread functions I could look into?
You may have been misled by the name of pthread_cond_signal().
Perhaps you think "I need to send a signal to a thread; here's a
function with both `thread' and `signal' in its name; that must be
the function I'm looking for."
Here's a way to think of it: Calling pthread_cond_signal()
doesn't mean "Something happened; wake up and attend to it," which
seems to be the kind of "signal" you want to send. Rather, it
means "Something happened; wake up and check whether `something'
was what you were waiting for, and if so attend to it."
Furthermore, pthread_cond_wait() doesn't mean "Sleep until
the thing I'm waiting for happens," it means "Sleep until something
happens that might be what I'm waiting for, or until a passing
dog barks in the night." When pthread_cond_wait() returns, it
does not mean that what you're waiting for has come to pass, it
means that some other thread thought it might be a good idea for
you to wake up and check the state of affairs, and perhaps just
roll over and go back to sleep again. (The curious incident of
the dog in the night refers to the possibility that you might get
awakened even though *no* other thread calls pthread_cond_signal();
pthread_cond_wait() can return "spuriously.")
So: How do you send a "go ahead" signal from one thread to
the other? Here's one way (initialization and error checking
omitted for brevity):
int thread2_can_work = 0;
pthread_mutex_t mutex;
pthread_cond_t condvar;
...
void *thread1(void *arg) {
...
/* Tell thread2 to start */
pthread_mutex_lock(&mutex);
thread2_can_work = 1;
pthread_mutex_unlock(&mutex);
pthread_cond_signal(&condvar);
...
}
...
void *thread2(void *arg) {
...
/* Await permission from thread1 */
pthread_mutex_lock(&mutex);
while (! thread2_can_work)
pthread_cond_wait(&condvar, &mutex);
/* Thread1 has given permission */
pthread_mutex_unlock(&mutex);
...
}
A scheme like this makes your program independent of the
rates of progress of the two threads; you needn't rely on hard-
to-enforce notions like "much later" for correctness. If thread1
happens to run rapidly while thread2 gets bogged down with page
faults or something, your program still works: thread1 sets the
boolean and sends the signal (which no one is waiting for), and
when thread2 tests the boolean it's already been set so thread2
goes straight ahead without waiting. Also, if thread2 goes to
sleep but is awakened by the barking dog before thread1 signals,
the first thing it does is re-test the boolean -- and finding it
still false, thread2 just goes back to sleep again.
The sine qua non of this pattern (which you should learn and
learn well) is the test-and-wait in a loop, all protected by a
single mutex acquisition:
pthread_mutex_lock(...);
while (...)
pthread_cond_wait(...);
You can spell this pattern differently, if you like, using `for'
or `goto' or whatever your heart desires -- but this is, as far
as I know, THE only correct way to sleep on a condvar while
awaiting a state change.
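
To make the "independent of the rates of progress" point concrete, here is a small sketch in which the go-ahead is given before the waiting thread even exists, and the waiter still proceeds. The struct gate type and the functions gate_open(), gate_wait() and demo_signal_before_wait() are hypothetical names chosen for this illustration only.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical one-shot gate: closed until some thread opens it. */
struct gate {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int             open;   /* the predicate, protected by mutex */
};

static struct gate start_gate = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0
};

void gate_open(struct gate *g)
{
    pthread_mutex_lock(&g->mutex);
    g->open = 1;
    pthread_mutex_unlock(&g->mutex);
    pthread_cond_broadcast(&g->cond);   /* wake every waiter, if any */
}

/* Blocks until the gate is open; returns at once if it already is. */
void gate_wait(struct gate *g)
{
    pthread_mutex_lock(&g->mutex);
    while (!g->open)
        pthread_cond_wait(&g->cond, &g->mutex);
    assert(g->open);
    pthread_mutex_unlock(&g->mutex);
}

static void *waiter(void *arg)
{
    gate_wait(arg);
    return arg;
}

/* Opens the gate BEFORE the waiter is created. With a bare signal and no
 * flag the waiter would sleep forever; with the flag it passes through.
 * Returns 1 if the waiter ran to completion, 0 on a thread error. */
int demo_signal_before_wait(void)
{
    pthread_t t;
    void *result = NULL;

    gate_open(&start_gate);
    if (pthread_create(&t, NULL, waiter, &start_gate) != 0)
        return 0;
    if (pthread_join(t, &result) != 0)
        return 0;
    return result == (void *)&start_gate;
}
```

The broadcast here is a design choice for a gate that several threads may wait on; for a single waiter pthread_cond_signal() would do equally well.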
--
Eric.Sosman@sun.com
Eric | 2/8/2005 8:29:23 PM
"John Smith" <john.smith@x-formation.com> wrote in message
news:4209145a$0$33733$edfadb0f@dread16.news.tele.dk...
> Assume we have two threads 1 and 2. Both start up and 2 goes into blocking
> state and won't wake up until thread 1 notifies it. This is my scenario
> which is pretty classic.
>
>> The main problem is that you are using a condition variable without an
>> associated condition. You need to use pthread_cond_wait like this:
>>
>> pthread_mutex_lock(...)
>> while(cant_work_now()) pthread_cond_wait(...)
>> do_work()
>> pthread_mutex_unlock(...)
>>
>
> This code is called from thread 2. I'm unsure what cant_work_now() would be.
> Essentially I want it to block in pthread_cond_wait() until it receives the
> signal.
That is not a sensible thing to do.
>> How would code know whether or not to call 'Wait'?
>>
> Because thread 1 will signal much later than thread 2 goes into a blocked
> state I see no problem.
How do you guarantee that?
> I'm still a bit unsure how to model my scenario into code. Maybe there are
> some alternative pthread functions I could look into?
When you explain what synchronization in your code guarantees that
thread 1 will signal much later than thread 2 goes into a blocked state, I
can answer your question. Until then, it is impossible for me to do so.
Here's some code that does what I think you want to do:
wake_thread is a bool
CCondition::CCondition()
{
pthread_cond_init(&m_Condition, NULL);
wake_thread=false;
}
CCondition::~CCondition()
{
pthread_cond_destroy(&m_Condition);
}
void CCondition::Wait()
{ // call with mutex unlocked
m_Mutex.Lock();
while(!wake_thread) pthread_cond_wait(&m_Condition, m_Mutex.GetMutex());
wake_thread=false;
m_Mutex.Unlock();
}
void CCondition::Signal()
{ // call with mutex unlocked
m_Mutex.Lock();
wake_thread=true;
m_Mutex.Unlock();
pthread_cond_signal(&m_Condition);
}
DS
David | 2/8/2005 8:31:56 PM
The code I pasted 'saves up' signals. If you don't want to do this, code
like this:
sleeping_threads is an int, wake_threads is an int.
CCondition::CCondition()
{
pthread_cond_init(&m_Condition, NULL);
wake_threads=0;
sleeping_threads=0;
}
CCondition::~CCondition()
{
pthread_cond_destroy(&m_Condition);
}
void CCondition::Wait()
{ // call with mutex unlocked
m_Mutex.Lock();
sleeping_threads++;
while(wake_threads==0)
pthread_cond_wait(&m_Condition, m_Mutex.GetMutex());
sleeping_threads--;
wake_threads=0;
m_Mutex.Unlock();
}
void CCondition::SignalOne()
{ // call with mutex unlocked
m_Mutex.Lock();
if (wake_threads<sleeping_threads)
wake_threads++;
m_Mutex.Unlock();
pthread_cond_signal(&m_Condition);
}
void CCondition::SignalAll()
{ // call with mutex unlocked
m_Mutex.Lock();
wake_threads=sleeping_threads;
m_Mutex.Unlock();
pthread_cond_broadcast(&m_Condition);
}
DS
David | 2/8/2005 8:35:51 PM
"David Schwartz" <davids@webmaster.com> wrote in message
news:cub7ra$o8l$1@nntp.webmaster.com...
> void CCondition::Wait()
> { // call with mutex unlocked
> m_Mutex.Lock();
> sleeping_threads++;
> while(wake_threads==0)
> pthread_cond_wait(&m_Condition, m_Mutex.GetMutex());
> sleeping_threads--;
> wake_threads=0;
> m_Mutex.Unlock();
> }
D'oh! That 'wake_threads=0' should be 'wake_threads--'.
DS
David | 2/8/2005 8:36:48 PM
In article <cub7f4$t48$1@news1brm.Central.Sun.COM>, Eric Sosman wrote:
>
>
> John Smith wrote:
>> Thank you David.
>>
>> Assume we have two threads 1 and 2. Both start up and 2 goes into blocking
>> ...
>>>How would code know whether or not to call 'Wait'?
>>>
>>
>> Because thread 1 will signal much later than thread 2 goes into a blocked
>> state I see no problem.
>>
>> I'm still a bit unsure how to model my scenario into code. Maybe there are some
>> alternative pthread functions I could look into?
>
> You may have been misled by the name of pthread_cond_signal().
> Perhaps you think "I need to send a signal to a thread; here's a
> function with both `thread' and `signal' in its name; that must be
> the function I'm looking for."
>
> Here's a way to think of it: Calling pthread_cond_signal()
> doesn't mean "Something happened; wake up and attend to it," which
> seems to be the kind of "signal" you want to send. Rather, it
> means "Something happened; wake up and check whether `something'
> was what you were waiting for, and if so attend to it."
>
> Furthermore, pthread_cond_wait() doesn't mean "Sleep until
> the thing I'm waiting for happens," it means "Sleep until something
> happens that might be what I'm waiting for, or until a passing
> dog barks in the night." When pthread_cond_wait() returns, it
> does not mean that what you're waiting for has come to pass, it
> means that some other thread thought it might be a good idea for
> you to wake up and check the state of affairs, and perhaps just
> roll over and go back to sleep again. (The curious incident of
> the dog in the night refers to the possibility that you might get
> awakened even though *no* other thread calls pthread_cond_signal();
> pthread_cond_wait() can return "spuriously.")
Your explanation is very good. Just a side point: assuming you're
referring to the Sherlock Holmes story, the curious incident of the dog in
the night was that the dog did NOT bark! ("The dog always barks at
strangers. Why didn't he bark?")
Sorry for the digression, but it was bugging me enough to point it out.
Jim | 2/8/2005 9:57:03 PM
Jim Cochrane wrote:
> In article <cub7f4$t48$1@news1brm.Central.Sun.COM>, Eric Sosman wrote:
>> [...]
>> Furthermore, pthread_cond_wait() doesn't mean "Sleep until
>>the thing I'm waiting for happens," it means "Sleep until something
>>happens that might be what I'm waiting for, or until a passing
>>dog barks in the night." [...] (The curious incident of
>>the dog in the night refers to the possibility that you might get
>>awakened even though *no* other thread calls pthread_cond_signal();
>>pthread_cond_wait() can return "spuriously.")
>
> Your explanation is very good - just a side point: Assuming you're
> referring to the Sherlock Holmes story, the curious incident of the dog in
> the night was that the dog did NOT bark! ("The dog always barks at
> strangers. Why didn't he bark?")
Somehow, I suspected someone would call me on that one ...
All right, then, how about if I refer to the dog's barking as
the "incurious incident?"
The next sound you hear will be no dog, but the rush of
English usage zealots, stampeding to point out that I've used
"incurious" incorrectly. It won't be all that loud a sound,
since the zealot herds have dwindled and all but died out;
once they covered the plains from horizon to horizon and their
hoofbeats resounded like the tapping of uncounted millions of
pencils, but they proved unable to withstand the linguistically
transmitted diseases that spread with the Internet.
A government program made an attempt to replenish the
American herd by breeding them with zealots imported from
Canada, where English still prevails in remote locations.
Like many government programs, though, its execution was not
all that might have been desired: the first group of Canadian
zealots released upon the prairie shouted "A bas l'Anglais!"
and moved to New Orleans.
--
Eric.Sosman@sun.com
Eric | 2/8/2005 11:08:26 PM
In article <cubgpa$442$1@news1brm.Central.Sun.COM>, Eric Sosman wrote:
>
>
> Jim Cochrane wrote:
>> In article <cub7f4$t48$1@news1brm.Central.Sun.COM>, Eric Sosman wrote:
>>> [...]
>>> Furthermore, pthread_cond_wait() doesn't mean "Sleep until
>>>the thing I'm waiting for happens," it means "Sleep until something
>>>happens that might be what I'm waiting for, or until a passing
>>>dog barks in the night." [...] (The curious incident of
>>>the dog in the night refers to the possibility that you might get
>>>awakened even though *no* other thread calls pthread_cond_signal();
>>>pthread_cond_wait() can return "spuriously.")
>>
>> Your explanation is very good - just a side point: Assuming you're
>> referring to the Sherlock Holmes story, the curious incident of the dog in
>> the night was that the dog did NOT bark! ("The dog always barks at
>> strangers. Why didn't he bark?")
>
> Somehow, I suspected someone would call me on that one ...
> All right, then, how about if I refer to the dog's barking as
> the "incurious incident?"
>
> The next sound you hear will be no dog, but the rush of
> English usage zealots, stampeding to point out that I've used
> "incurious" incorrectly. It won't be all that loud a sound,
> since the zealot herds have dwindled and all but died out;
> once they covered the plains from horizon to horizon and their
> hoofbeats resounded like the tapping of uncounted millions of
> pencils, but they proved unable to withstand the linguistically
> transmitted diseases that spread with the Internet.
>
> A government program made an attempt to replenish the
> American herd by breeding them with zealots imported from
> Canada, where English still prevails in remote locations.
> Like many government programs, though, its execution was not
> all that might have been desired: the first group of Canadian
> zealots released upon the prairie shouted "A bas l'Anglais!"
> and moved to New Orleans.
Well, as far as I'm concerned, you can mangle words as much as you like.
Just don't mangle literature. :-)
--
Jim Cochrane; jtc@dimensional.com
[When responding by email, include the term non-spam in the subject line to
get through my spam filter.]
Jim | 2/8/2005 11:45:59 PM
Thanks to both of you for your explanations.
The two approaches are very similar, but unfortunately I can't get either to work.
The code was written as advised, but pthread_cond_wait() keeps blocking no
matter what I do.
Could it be because I call it from a signal handler that monitors
SIGINT and SIGKILL? This signal handler then calls pthread_cond_signal().
As a test I tried to use pthread_cond_timedwait() instead to see what would
happen.
struct timespec tTimeout;
dbgprintf(("Cond wait -1"));
m_Mutex.Lock();
dbgprintf(("Cond wait0"));
while (!m_bWakeup)
{
tTimeout.tv_sec = time(NULL) + 5;
tTimeout.tv_nsec = 0;
dbgprintf(("Cond wait"));
int nRet = pthread_cond_timedwait(&m_Condition, m_Mutex.GetMutex(),
&tTimeout);
dbgprintf(("woke up %d", nRet));
}
m_bWakeup = false;
m_Mutex.UnLock();
It keeps writing "woke up 60" every 5 seconds, as expected. However, when I press
ctrl+c and the signal handler gets called, it also chokes. I can see that
pthread_cond_signal reports success, but then nothing more happens.
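
For reference, a timed wait that separates "the flag was set" from "the deadline passed" could be sketched as follows. The value 60 printed above is presumably this platform's ETIMEDOUT (an assumption; the numeric value differs between systems). The names wake_mutex, wake_cond, wake_flag, wait_for_wake() and wake() are hypothetical, and the sketch assumes the condition variable uses the default CLOCK_REALTIME clock.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

static pthread_mutex_t wake_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  wake_cond  = PTHREAD_COND_INITIALIZER;
static int wake_flag = 0;

/* Waits up to timeout_ms milliseconds for wake_flag to be set.
 * Returns 0 if the flag was set (and consumes it), ETIMEDOUT if the
 * deadline passed first, or another error code from the wait call. */
int wait_for_wake(long timeout_ms)
{
    struct timespec deadline;
    int rc = 0;

    /* pthread_cond_timedwait() takes an ABSOLUTE time, not an interval. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }
    assert(deadline.tv_nsec < 1000000000L);

    pthread_mutex_lock(&wake_mutex);
    /* Keep waiting across spurious wake-ups; stop on timeout or error. */
    while (!wake_flag && rc == 0)
        rc = pthread_cond_timedwait(&wake_cond, &wake_mutex, &deadline);
    if (wake_flag) {
        wake_flag = 0;
        rc = 0;
    }
    pthread_mutex_unlock(&wake_mutex);
    return rc;
}

/* Sets the flag under the mutex, then signals. Thread context only. */
void wake(void)
{
    pthread_mutex_lock(&wake_mutex);
    wake_flag = 1;
    pthread_mutex_unlock(&wake_mutex);
    pthread_cond_signal(&wake_cond);
}
```

Computing the deadline once, outside the loop, keeps the total wait bounded even if the thread is woken several times without the flag being set.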
Similarly, I tried pthread_cond_broadcast() as an alternative, but with the same
result. It also prints the result and then nothing more happens:
dbgprintf(("signal 0"));
m_Mutex.Lock();
dbgprintf(("signal 1"));
m_bWakeup = true;
m_Mutex.UnLock();
int nRet = pthread_cond_broadcast(&m_Condition);
dbgprintf(("signal error code: %d", nRet));
Very very weird.
Is there something obvious I'm missing?
-- John
John | 2/9/2005 12:10:16 PM
> Could it be because I call it from a signal handler which will monitor on
> SIGINT and SIGKILL? This signal handler then calls pthread_cond_signal().
> As a test I tried to use pthread_cond_timedwait() instead to see what would
> happen.
>
I found out it does have something to do with the signal handler. I tried to
spawn a thread and signal it from the main thread, and it went fine with the
provided code. However, from the signal handler it still chokes on
pthread_cond_wait(). I even tried starting a new thread from the signal handler,
but for some reason it still chokes.
This doesn't really make sense to me. Does anyone have an explanation?
-- John
John | 2/9/2005 1:06:10 PM
In article <cub7f4$t48$1@news1brm.Central.Sun.COM>, eric.sosman@sun.com says...
>
> You may have been misled by the name of pthread_cond_signal().
> Perhaps you think "I need to send a signal to a thread; here's a
> function with both `thread' and `signal' in its name; that must be
> the function I'm looking for."
[rest snipped]
You have managed to describe the issues with CVs more clearly and
concisely than I have ever seen it done before. Somebody should take this
text (and attributions of course) and tuck it into the man pages somehow.
Very well done.
--
Randy Howard (2reply remove FOOBAR)
"Making it hard to do stupid things often makes it hard
to do smart ones too." -- Andrew Koenig
Randy | 2/9/2005 1:22:41 PM
> This doesn't really make sense to me. Does anyone have an explanation?
>
Damn, I hate incomplete documentation. The man page I read didn't say
anything about signal safety. However, another man page found on Google said
exactly what I experienced: threads may deadlock when calling
pthread_cond_signal().
So the question is what alternatives there are.
-- John
John | 2/9/2005 3:14:39 PM
"John Smith" <john.smith@x-formation.com> wrote in message
news:420a0acf$0$33724$edfadb0f@dread16.news.tele.dk...
> I found out it does have something to do with the signal handler. I tried to
> spawn a thread and signal to it from main thread and it went fine with the
> provided code. However from signal handler it still chokes on
> pthread_cond_wait(). I even tried starting a new thread from signal handler
> but for some reason it still chokes.
>
> This doesn't really make sense to me. Does anyone have an explanation?
Yes, you are calling functions that are not async-signal-safe from an
asynchronous signal handler. This is illegal. Spawn an extra thread just for
signals, and have it block synchronously on the signals you are trying to
catch. That thread can then call any pthreads functions you need.
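
A minimal sketch of such a dedicated signal thread, assuming sigwait() is used for the synchronous blocking, might look like this. The names block_handled_signals(), signal_thread(), wait_for_stop() and stop_requested are hypothetical. The sketch handles SIGINT and SIGTERM; SIGKILL is left out because it cannot be caught or blocked.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>

static pthread_mutex_t stop_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  stop_cond  = PTHREAD_COND_INITIALIZER;
static int stop_requested = 0;      /* 0, or the signal number received */
static sigset_t handled_signals;

/* Call once in main() BEFORE creating any thread, so that every thread
 * inherits a mask with these signals blocked. Returns 0 on success. */
int block_handled_signals(void)
{
    sigemptyset(&handled_signals);
    sigaddset(&handled_signals, SIGINT);
    sigaddset(&handled_signals, SIGTERM);
    return pthread_sigmask(SIG_BLOCK, &handled_signals, NULL);
}

/* Body of the dedicated thread. It runs in ordinary thread context, not
 * in a signal handler, so mutexes and condition variables are allowed. */
void *signal_thread(void *arg)
{
    int signo = 0;

    (void)arg;
    if (sigwait(&handled_signals, &signo) != 0)
        return NULL;

    pthread_mutex_lock(&stop_mutex);
    stop_requested = signo;
    pthread_mutex_unlock(&stop_mutex);
    pthread_cond_broadcast(&stop_cond);
    return NULL;
}

/* Called by a worker: blocks until a handled signal has arrived and
 * returns its number. */
int wait_for_stop(void)
{
    int signo;

    pthread_mutex_lock(&stop_mutex);
    while (stop_requested == 0)
        pthread_cond_wait(&stop_cond, &stop_mutex);
    signo = stop_requested;
    pthread_mutex_unlock(&stop_mutex);
    assert(signo != 0);
    return signo;
}
```

No signal handler is installed at all in this arrangement; the signals stay blocked everywhere and are only ever consumed by sigwait().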
DS
David | 2/9/2005 7:49:55 PM
"John Smith" <john.smith@x-formation.com> wrote in message
news:420a28e5$0$33654$edfadb0f@dread16.news.tele.dk...
>> This doesn't really make sense to me. Does anyone have an explanation?
> Damn I hate incomplete documentation. The man page I read didn't say
> anything about signal safety. However another man page found on google said
> exactly what I experienced that threads may deadlock when calling
> pthread_cond_signal().
>
> So the question is what alternatives there are.
Asynchronous signals and threads don't go well together. You should
avoid them if you can. For signals like SIGKILL and SIGINT, you can solve
this in two ways. One way is to have the signal handler do nothing but
change the state of a volatile variable. The thread can then check the
volatile variable periodically and then invoke the real signal handler code
synchronously. Another is to create one thread whose sole purpose is to
block for these signals and handle them. All other threads should mask them.
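
The first approach, where the handler only sets a flag, can be sketched like this. The names interrupted, install_handler() and poll_interrupted() are hypothetical, and the flag is declared volatile sig_atomic_t, the type the C standard provides for variables written from a signal handler. Note that SIGKILL cannot be caught, so in practice this applies to signals such as SIGINT or SIGTERM.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Written by the handler, read by normal code. */
static volatile sig_atomic_t interrupted = 0;

/* The handler does nothing except record which signal arrived. */
static void on_signal(int signo)
{
    interrupted = signo;
}

/* Installs on_signal() for the given signal. Returns 0 on success. */
int install_handler(int signo)
{
    struct sigaction sa;

    assert(signo > 0);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Polled periodically from ordinary thread context. Returns the pending
 * signal number and clears the flag, or returns 0 if nothing arrived.
 * A signal landing between the read and the clear would be lost; that is
 * acceptable for a "please shut down" request but not for counting. */
int poll_interrupted(void)
{
    int signo = interrupted;

    if (signo != 0)
        interrupted = 0;
    return signo;
}
```

The code that calls poll_interrupted() is free to use pthread_cond_signal() and friends, because it runs outside the handler.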
DS
David | 2/9/2005 7:51:49 PM
> You have managed to describe the issues with CVs more clearly and
> concisely than I have ever seen it done before. Somebody should take this
> text (and attributions of course) and tuck it into the man pages somehow.
Do you like the material from the discussion "spurious wakeup", too?
http://groups.google.de/groups?threadm=40ed1d8f.0411191313.4dff837c@posting.google.com
Regards,
Markus
Markus | 2/9/2005 10:21:17 PM
> Damn I hate incomplete documentation. The man page I read didn't say
> anything about signal safety. However another man page found on google said
> exactly what I experienced that threads may deadlock when calling
> pthread_cond_signal().
>
> So the question is what alternatives there are.
- Does the discussion "async-signal safety issue" point to things that are useful for you?
http://groups.google.de/groups?threadm=40f6a255%40usenet01.boi.hp.com
- http://en.wikipedia.org/wiki/Lock-free_and_wait-free_algorithms
Regards,
Markus
Markus | 2/9/2005 10:36:36 PM