ltp
ltp copied to clipboard
Possible communication deadlocks (Discussion)
Hi, in the below code, it is possible for the thread to signal first (pthread_cond_broadcast at Line 108) and, later, the thread goes to the waiting site forever (at Line 50 in the function t1_func) without signal notified anymore.
We notice that there is a sleep (sleep(2); at Line 96) to ensure the main thread go the broadcast site "later". However, the child thread may still go the wait site later, in spite of the lower possibilitiy.
I also noticed that there are other places in the project using the pthread_cond_wait similar to this.
https://github.com/linux-test-project/ltp/blob/3e0e60074495e1c350b634848e5dd1f7993ec2ef/testcases/open_posix_testsuite/conformance/interfaces/pthread_cond_wait/3-1.c#L49-L54
https://github.com/linux-test-project/ltp/blob/3e0e60074495e1c350b634848e5dd1f7993ec2ef/testcases/open_posix_testsuite/conformance/interfaces/pthread_cond_wait/3-1.c#L96-L110
To avoid this problem, the recommended usage of thread_cond_signal and thread_cond_wait would be
pthread_mutex_t count_lock;
pthread_cond_t count_nonzero;
unsigned count;
decrement_count()
{
pthread_mutex_lock(&count_lock);
while (count == 0)
pthread_cond_wait(&count_nonzero, &count_lock);
count = count - 1;
pthread_mutex_unlock(&count_lock);
}
increment_count()
{
pthread_mutex_lock(&count_lock);
if (count == 0)
pthread_cond_signal(&count_nonzero);
count = count + 1;
pthread_mutex_unlock(&count_lock);
}
Synchronization by sleep() is of course broken by design and needs to be fixed. However I do not get at all what are you recommending, looks like the piece of code was copied from a textbook and it's not obvious how it should be applied to the test if at all.
I think that obvious solution here is actually loop and retry the pthread_cond_broadcast() with a small sleep (something as 10 or 100ms) between retries until the child wakes up or until we are out of time.
Hi, thank you for your reply. Let me explain the recommended code. The idea is to use an additional variable to ensure that, if the thread has sent the signal, the other thread should not enter the waiting loop, thereby avoiding the deadlock (waiting forever).
More specifically, if the thread (executing increment_count ) has reached the signal site, the count would be plus 1. As a result, the other thread (executing decrement_count) cannot enter the while loop and wait. I indeed copyed the recommended pattern from this.
I agree with you that there are many ways to avoid the deadlock situations.
I roughly point out some other similar places, which also use sleep to synchronize. I hope this might help you.
https://github.com/linux-test-project/ltp/blob/3e0e60074495e1c350b634848e5dd1f7993ec2ef/testcases/open_posix_testsuite/conformance/interfaces/pthread_cond_wait/1-1.c#L50-L57
https://github.com/linux-test-project/ltp/blob/3e0e60074495e1c350b634848e5dd1f7993ec2ef/testcases/open_posix_testsuite/conformance/interfaces/pthread_cond_broadcast/2-1.c#L45-L50
https://github.com/linux-test-project/ltp/blob/3e0e60074495e1c350b634848e5dd1f7993ec2ef/testcases/open_posix_testsuite/conformance/interfaces/pthread_cond_wait/2-1.c#L50-L55
https://github.com/linux-test-project/ltp/blob/3e0e60074495e1c350b634848e5dd1f7993ec2ef/testcases/open_posix_testsuite/conformance/interfaces/pthread_cond_signal/2-1.c#L60-L66
https://github.com/linux-test-project/ltp/blob/3e0e60074495e1c350b634848e5dd1f7993ec2ef/testcases/open_posix_testsuite/conformance/interfaces/pthread_cond_broadcast/1-1.c#L44-L52
https://github.com/linux-test-project/ltp/blob/3e0e60074495e1c350b634848e5dd1f7993ec2ef/testcases/open_posix_testsuite/conformance/interfaces/pthread_cond_broadcast/4-1.c#L44-L50
https://github.com/linux-test-project/ltp/blob/3e0e60074495e1c350b634848e5dd1f7993ec2ef/testcases/kernel/fs/openfile/openfile.c#L247-L252
https://github.com/linux-test-project/ltp/blob/3e0e60074495e1c350b634848e5dd1f7993ec2ef/testcases/realtime/func/pi-tests/testpi-4.c#L97-L99
https://github.com/linux-test-project/ltp/blob/3e0e60074495e1c350b634848e5dd1f7993ec2ef/testcases/realtime/func/prio-wake/prio-wake.c#L152-L162
https://github.com/linux-test-project/ltp/blob/3e0e60074495e1c350b634848e5dd1f7993ec2ef/testcases/realtime/func/pi-tests/testpi-2.c#L97-L99
https://github.com/linux-test-project/ltp/blob/3e0e60074495e1c350b634848e5dd1f7993ec2ef/testcases/realtime/func/pi-tests/testpi-1.c#L95-L97