parlaylib icon indicating copy to clipboard operation
parlaylib copied to clipboard

race condition with scheduler initialization

Open shwestrick opened this issue 10 months ago • 6 comments

A student of mine, @darshand15, has been doing some testing with ParlayLib. We think we've found a race condition in the scheduler initialization.

Suppose you have 2 threads.

  1. At the moment of scheduler initialization, we have num_awake_workers set to 2.
  2. Immediately after thread 0 finishes initializing the scheduler, thread 1 is "awake" but may not be scheduled by the OS right away.
  3. Suppose thread 0 now executes pardo and therefore scheduler.spawn(&right_job); from here it executes if (first) wake_up_a_worker() but this does not do atomic_notify_one because the current number of awake threads is set to 2 (equal to num_threads).
  4. Now, thread 1 starts executing. It enters worker() and then wait_for_work(), and therefore immediately goes to sleep, waiting on the wake_up_counter.

The problem is that we have now entered a situation where thread 0 has a spare job in its deque, but thread 1 is not awake to steal it.

We were able to reproduce this race condition in practice with a simple program that looks like:

void loop() {
  // just loop for a while
}

int main() {
  loop();
  pardo(
    [](){ loop(); },
    [](){ loop(); }
  );
}

In this example, the scheduler gets initialized (lazily) at the beginning of the pardo. When we run this program on 2 threads, we almost always see sequential performance (no speedup). Actually, I don't think we've yet managed to see any speedup on 2 threads for this example program.

shwestrick avatar Feb 26 '25 21:02 shwestrick

@darshand15, perhaps you could copy/paste the minimal reproducible example here? Just the actual source for the program I described above.

shwestrick avatar Feb 26 '25 21:02 shwestrick

Yes, please find below the source code for the example mentioned above.

#include <iostream>
#include <parlay/parallel.h>

#define NUM_ITER 1000000000

using namespace std;

double seq_loop(double num)
{
    double acc = 0;
    double prod = 0;
    for(int i = 0; i<NUM_ITER; ++i)
    {
        prod = num*i;
        acc = acc + prod;
    }

    return acc;
}

int main(int argc, char* argv[])
{
    if(argc < 2)
    {
        cout << "Enter command line args\n";
    }

    double ans1 = seq_loop(atoi(argv[1]));
    double ans2 = 0;
    double ans3 = 0;

    parlay::par_do(
      [&]() { ans2 = seq_loop(ans1); },
      [&]() { ans3 = seq_loop(ans1 + atoi(argv[1])); }
    );

    double ans4 = seq_loop(ans2 + ans3);
    cout << ans4 << "\n";

    return 0;
}

darshand15 avatar Feb 27 '25 01:02 darshand15

Thanks for the report. We are busy with the SPAA deadline but will look at this ASAP.

Guy

On Wed, Feb 26, 2025 at 8:42 PM darshand15 @.***> wrote:

Yes, please find below the source code for the example mentioned above.

#include #include <parlay/parallel.h>

#define NUM_ITER 1000000000 using namespace std; double seq_loop(double num) { double acc = 0; double prod = 0; for(int i = 0; i<NUM_ITER; ++i) { prod = num*i; acc = acc + prod; }

return acc;

} int main(int argc, char* argv[]) { if(argc < 2) { cout << "Enter command line args\n"; }

double ans1 = seq_loop(atoi(argv[1]));
double ans2 = 0;
double ans3 = 0;

parlay::par_do(
  [&]() { ans2 = seq_loop(ans1); },
  [&]() { ans3 = seq_loop(ans1 + atoi(argv[1])); }
);

double ans4 = seq_loop(ans2 + ans3);
cout << ans4 << "\n";

return 0;

}

— Reply to this email directly, view it on GitHub https://github.com/cmuparlay/parlaylib/issues/83#issuecomment-2686592726, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABTIVAFDTAVHSLPG5RAH4T32RZUPFAVCNFSM6AAAAABX6HSPUOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMOBWGU4TENZSGY . You are receiving this because you are subscribed to this thread.Message ID: @.***> [image: darshand15]darshand15 left a comment (cmuparlay/parlaylib#83) https://github.com/cmuparlay/parlaylib/issues/83#issuecomment-2686592726

Yes, please find below the source code for the example mentioned above.

#include #include <parlay/parallel.h>

#define NUM_ITER 1000000000 using namespace std; double seq_loop(double num) { double acc = 0; double prod = 0; for(int i = 0; i<NUM_ITER; ++i) { prod = num*i; acc = acc + prod; }

return acc;

} int main(int argc, char* argv[]) { if(argc < 2) { cout << "Enter command line args\n"; }

double ans1 = seq_loop(atoi(argv[1]));
double ans2 = 0;
double ans3 = 0;

parlay::par_do(
  [&]() { ans2 = seq_loop(ans1); },
  [&]() { ans3 = seq_loop(ans1 + atoi(argv[1])); }
);

double ans4 = seq_loop(ans2 + ans3);
cout << ans4 << "\n";

return 0;

}

— Reply to this email directly, view it on GitHub https://github.com/cmuparlay/parlaylib/issues/83#issuecomment-2686592726, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABTIVAFDTAVHSLPG5RAH4T32RZUPFAVCNFSM6AAAAABX6HSPUOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMOBWGU4TENZSGY . You are receiving this because you are subscribed to this thread.Message ID: @.***>

gblelloch avatar Feb 27 '25 02:02 gblelloch

Thanks for the report. We are busy with the SPAA deadline but will look at this ASAP.

Guy

Okay, thank you!

darshand15 avatar Feb 27 '25 11:02 darshand15

@darshand15 @shwestrick I can reproduce the example. Do you have a quick fix to it? Since parallel_for is dependent on par_do, all parallel primitives would run in sequential mode when the number of threads is set to 2..?

WhateverLiu avatar Apr 10 '25 04:04 WhateverLiu

The problem isn't too severe because the scheduler will "fix itself" as soon as another task spawn occurs, at a par_do or parallel_for or otherwise. Any program with a large amount of parallelism (lots of calls to par_do) probably won't have any issues.

A quick workaround is to force the scheduler to be initialized earlier, before the main parallel part of the program. A dummy call to par_do at the top of main seems to do the trick.

I've been chatting with Guy and Daniel about an idea for a more robust fix, and @darshand15 and I are looking into it.

shwestrick avatar Apr 10 '25 13:04 shwestrick