race condition with scheduler initialization
A student of mine, @darshand15, has been doing some testing with ParlayLib. We think we've found a race condition in the scheduler initialization.
Suppose you have 2 threads.
- At the moment of scheduler initialization, we have
num_awake_workersset to 2. - Immediately after thread 0 finishes initializing the scheduler, thread 1 is "awake" but may not be scheduled by the OS right away.
- Suppose thread 0 now executes
pardoand thereforescheduler.spawn(&right_job); from here it executesif (first) wake_up_a_worker()but this does not doatomic_notify_onebecause the current number of awake threads is set to 2 (equal tonum_threads). - Now, thread 1 starts executing. It enters
worker()and thenwait_for_work(), and therefore immediately goes to sleep, waiting on thewake_up_counter.
The problem is that we have now entered a situation where thread 0 has a spare job in its deque, but thread 1 is not awake to steal it.
We were able to reproduce this race condition in practice with a simple program that looks like:
void loop() {
// just loop for a while
}
int main() {
loop();
pardo(
[](){ loop(); },
[](){ loop(); }
);
}
In this example, the scheduler gets initialized (lazily) at the beginning of the pardo. When we run this program on 2 threads, we almost always see sequential performance (no speedup). Actually, I don't think we've yet managed to see any speedup on 2 threads for this example program.
@darshand15, perhaps you could copy/paste the minimal reproducible example here? Just the actual source for the program I described above.
Yes, please find below the source code for the example mentioned above.
#include <iostream>
#include <parlay/parallel.h>
#define NUM_ITER 1000000000
using namespace std;
double seq_loop(double num)
{
double acc = 0;
double prod = 0;
for(int i = 0; i<NUM_ITER; ++i)
{
prod = num*i;
acc = acc + prod;
}
return acc;
}
int main(int argc, char* argv[])
{
if(argc < 2)
{
cout << "Enter command line args\n";
}
double ans1 = seq_loop(atoi(argv[1]));
double ans2 = 0;
double ans3 = 0;
parlay::par_do(
[&]() { ans2 = seq_loop(ans1); },
[&]() { ans3 = seq_loop(ans1 + atoi(argv[1])); }
);
double ans4 = seq_loop(ans2 + ans3);
cout << ans4 << "\n";
return 0;
}
Thanks for the report. We are busy with the SPAA deadline but will look at this ASAP.
Guy
On Wed, Feb 26, 2025 at 8:42 PM darshand15 @.***> wrote:
Yes, please find below the source code for the example mentioned above.
#include
#include <parlay/parallel.h> #define NUM_ITER 1000000000 using namespace std; double seq_loop(double num) { double acc = 0; double prod = 0; for(int i = 0; i<NUM_ITER; ++i) { prod = num*i; acc = acc + prod; }
return acc;} int main(int argc, char* argv[]) { if(argc < 2) { cout << "Enter command line args\n"; }
double ans1 = seq_loop(atoi(argv[1])); double ans2 = 0; double ans3 = 0; parlay::par_do( [&]() { ans2 = seq_loop(ans1); }, [&]() { ans3 = seq_loop(ans1 + atoi(argv[1])); } ); double ans4 = seq_loop(ans2 + ans3); cout << ans4 << "\n"; return 0;}
— Reply to this email directly, view it on GitHub https://github.com/cmuparlay/parlaylib/issues/83#issuecomment-2686592726, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABTIVAFDTAVHSLPG5RAH4T32RZUPFAVCNFSM6AAAAABX6HSPUOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMOBWGU4TENZSGY . You are receiving this because you are subscribed to this thread.Message ID: @.***> [image: darshand15]darshand15 left a comment (cmuparlay/parlaylib#83) https://github.com/cmuparlay/parlaylib/issues/83#issuecomment-2686592726
Yes, please find below the source code for the example mentioned above.
#include
#include <parlay/parallel.h> #define NUM_ITER 1000000000 using namespace std; double seq_loop(double num) { double acc = 0; double prod = 0; for(int i = 0; i<NUM_ITER; ++i) { prod = num*i; acc = acc + prod; }
return acc;} int main(int argc, char* argv[]) { if(argc < 2) { cout << "Enter command line args\n"; }
double ans1 = seq_loop(atoi(argv[1])); double ans2 = 0; double ans3 = 0; parlay::par_do( [&]() { ans2 = seq_loop(ans1); }, [&]() { ans3 = seq_loop(ans1 + atoi(argv[1])); } ); double ans4 = seq_loop(ans2 + ans3); cout << ans4 << "\n"; return 0;}
— Reply to this email directly, view it on GitHub https://github.com/cmuparlay/parlaylib/issues/83#issuecomment-2686592726, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABTIVAFDTAVHSLPG5RAH4T32RZUPFAVCNFSM6AAAAABX6HSPUOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMOBWGU4TENZSGY . You are receiving this because you are subscribed to this thread.Message ID: @.***>
Thanks for the report. We are busy with the SPAA deadline but will look at this ASAP.
Guy …
Okay, thank you!
@darshand15 @shwestrick I can reproduce the example. Do you have a quick fix to it? Since parallel_for is dependent on par_do, all parallel primitives would run in sequential mode when the number of threads is set to 2..?
The problem isn't too severe because the scheduler will "fix itself" as soon as another task spawn occurs, at a par_do or parallel_for or otherwise. Any program with a large amount of parallelism (lots of calls to par_do) probably won't have any issues.
A quick workaround is to force the scheduler to be initialized earlier, before the main parallel part of the program. A dummy call to par_do at the top of main seems to do the trick.
I've been chatting with Guy and Daniel about an idea for a more robust fix, and @darshand15 and I are looking into it.