
Can using task_arena and task_group create a deadlock?

Open fdiedler opened this issue 9 months ago • 7 comments

Hi,

Can the following pattern create a deadlock?

tbb::task_group taskGroup;
tbb::task_arena taskArena1(6), taskArena2(6); // Two tbb::task_arena with a maximum of 6 threads each

// Create tasks
taskGroup.run([&]{
    taskArena1.execute([&]{
        // work here in arena 1
    });
});

taskGroup.run([&]{
    taskArena2.execute([&]{
        // work here in arena 2
    });
});

// Wait from the taskArena1
taskArena1.execute([&]{
    taskGroup.wait();
});

// Execution of the rest of the code

There is an atomic flag: when the first task finishes, it cancels the other task. Example:

  • The task inside taskArena1 finishes
  • The flag to cancel the taskArena2 task is set
  • The task inside taskArena2 finishes
  • The taskGroup.wait() function returns and the execution of the code continues

In rare situations it seems that I get a deadlock, but I am not sure.

Thanks,

fdiedler avatar Mar 24 '25 14:03 fdiedler

@fdiedler From what I've seen in your example, you don't actually submit work into taskArena1 and taskArena2 when using taskGroup. In this scenario the work is submitted into the internal task arena representation. But then you wait for taskGroup from taskArena1, which itself doesn't have any work at this point. That might lead to a hang if worker threads are not available for some reason. I would suggest rewriting the code (simply remove the taskArena1.execute at the end) so the external thread can join the execution of the work:

// task_group task is submitted into the internal task arena representation
taskGroup.run([&]{
    taskArena1.execute([&]{
        // work here in arena 1
    });
});
// task_group task is submitted into the internal task arena representation
taskGroup.run([&]{
    taskArena2.execute([&]{
        // work here in arena 2
    });
});

// Join the execution as an external thread
taskGroup.wait();

isaevil avatar Mar 25 '25 10:03 isaevil

Funny how the timing worked out here, because we were looking at a very similar issue on our end. In our case we found out that the problem is that my_pool_state in the arena is not fully synchronized with the amount of work actually in the arena (the state of the task queues in the slots). In other words, there is a race condition between spawning a new task (through task_group::run, for example) and the last thread belonging to an arena leaving that arena. After the race, my_pool_state can still be UNSET even though the new task was added to a slot. Because of this, no new thread is requested to actually deal with the work and the task stays in the arena forever. In our case the "external thread owner" of the arena never joined the arena; I am not sure whether that would make a difference.

We misused TBB a bit to get to this point and we can fix that, but I am not sure if this is a state that should be possible. Looking at the code comments in advertise_new_work, it seems that this was at least partially a conscious choice. In that case, I think it would be good to make this clearer in the docs. From an outside perspective, it is very subtle that task_group::wait does not make the thread work on the tasks in the group if they were spawned in the wrong arena.

griebelsi avatar Mar 27 '25 09:03 griebelsi

Hi @griebelsi @isaevil

I am sorry, I don't understand why this code would solve my issue:

// Join the execution as an external thread
taskGroup.wait();

I don't understand why my code could create a deadlock with threads. I suppose there is something I don't understand inside the Intel TBB library.

// Wait from the taskArena1
taskArena1.execute([&]{
    taskGroup.wait();
});

But then you wait for taskGroup from taskArena1, which itself doesn't have any work at this point

I also don't understand why taskArena1 has no work at this point. I send work to taskArena1 in these lines:

taskGroup.run([&]{
    taskArena1.execute([&]{
        // work here in arena 1
    });
});

Or is there something I don't understand?

Thanks,

fdiedler avatar Mar 27 '25 10:03 fdiedler

Each thread (that is not part of the TBB worker pool) has its own implicit arena, so your main thread has one. If you now execute taskGroup.run, it will schedule a task in that arena. This task then executes taskArena1.execute, which will schedule the work in taskArena1. Visualized:

taskGroup.run([&]{
    // executed in the implicit arena of the main thread
    taskArena1.execute([&]{
        // executed in arena 1
    });
});

If you run into the same problem that I observed, then what happens is that one of the outer tasks executed in the main thread's arena gets stuck, because it is not correctly advertised that the main thread's arena now contains work.

If you want to schedule only on taskArena1, you have to use task_group::defer and enqueue the task handle into the arena (sketched after the snippet below). Or potentially (but I am not 100% certain that it would work) use:

taskArena1.execute([&]{
    taskGroup.run([&]{
    });
});
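
A minimal sketch of the defer/enqueue variant, assuming your oneTBB version provides task_group::defer and the task_arena::enqueue overload that accepts a task_handle:

#include <oneapi/tbb/task_arena.h>
#include <oneapi/tbb/task_group.h>
#include <utility>

int main() {
    tbb::task_arena taskArena1(6);
    tbb::task_group taskGroup;

    // defer() creates a task owned by the task_group without running it...
    tbb::task_handle h = taskGroup.defer([] {
        // work here, guaranteed to execute inside taskArena1
    });

    // ...and enqueue() submits it into the explicit arena. Unlike run(),
    // enqueue guarantees that a worker thread will eventually join the
    // arena, so the task cannot be stranded.
    taskArena1.enqueue(std::move(h));

    // Wait on the task_group itself, as recommended earlier in this thread.
    taskGroup.wait();
    return 0;
}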

griebelsi avatar Mar 27 '25 10:03 griebelsi

To explain the error better: if you have the same issue as we do, then

taskGroup.run([&]{ // <--- This gets stuck due to a bug in tbb
    taskArena2.execute([&]{ // <--- And this is never executed
        // work here in arena 2
    });
});

and then you wait forever on the taskGroup.

griebelsi avatar Mar 27 '25 11:03 griebelsi

@fdiedler From task_arena spec:

Each user thread that invokes any parallel construction outside an explicit task_arena uses an implicit task arena representation object associated with the calling thread.

In your code sample you invoke task_group::run(...) outside both of the explicitly created task_arenas, so the work is actually submitted into the implicit one. But then the following construction happens:

// Enter taskArena1 to execute work from there. No actual work has been submitted there.
taskArena1.execute([&] {
    // Wait for taskGroup to finish. Since all work is in the implicit task arena,
    // we just wait until workers join the implicit arena and execute the work submitted.
    // If workers are not available/couldn't join due to missed wakeups, it will hang.
    taskGroup.wait();
});

@griebelsi Yes, the thread availability guarantee only exists when using task_arena::enqueue. In other cases it is possible that threads miss the wakeup when work is submitted. That is why for task_group you should always call task_group::wait. However, I agree that the interaction between task_arenas and asynchronous APIs such as task_group is not clear in the documentation, which we might want to fix.

By the way, there is an RFC about waiting in a task_arena in which you might be interested.

isaevil avatar Mar 27 '25 11:03 isaevil

The issue with calling wait is that it only ensures that the tasks in the arena from which it is called are run. If you have (by accident) scheduled tasks of the task group in multiple arenas, wait is not enough. So the current task_group interface is only usable if you ensure that you schedule in only one arena, but the task_group class is not aware of the concept of arenas, even though the arena is implicitly part of the task_group invariant. In my mind there are two solutions:

  1. You bind the task_group to a task_arena. This is essentially our fix to the problem: we create a class arena_task_group that stores a task_arena created with attach. All run calls on the task_group then schedule their work to this arena only (with defer + enqueue), so it does not matter from which arena run is called. A sketch of such a wrapper follows this list.
  2. You allow a task_group to schedule to multiple arenas, but then you need to guarantee that tasks are actually run without calling wait. Potentially it would also make sense to add the arena as an (optional) explicit parameter to run.
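
A minimal sketch of such a wrapper, assuming task_group::defer, the task_arena::enqueue overload taking a task_handle, and the tbb::attach tag (older TBB versions spell it task_arena::attach); the details of our actual implementation differ, but the idea is:

#include <oneapi/tbb/task_arena.h>
#include <oneapi/tbb/task_group.h>
#include <utility>

// Wrapper that binds a task_group to a single task_arena, so every run()
// lands in that arena regardless of which thread calls it.
class arena_task_group {
    tbb::task_arena arena_{tbb::attach{}}; // attach to the arena current at construction
    tbb::task_group group_;
public:
    template <typename F>
    void run(F&& f) {
        // defer() creates a task owned by the group; enqueue() guarantees
        // that a worker thread will eventually join the arena to execute it.
        arena_.enqueue(group_.defer(std::forward<F>(f)));
    }
    void wait() {
        // Wait from inside the bound arena so the calling thread can help
        // execute the enqueued tasks.
        arena_.execute([this] { group_.wait(); });
    }
};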

The option where wait ensures that the tasks of the task group are run in all arenas seems too complex to implement from my point of view. Essentially my view of things is exactly the reverse of what is discussed here: https://github.com/uxlfoundation/oneTBB/tree/master/rfcs/proposed/task_arena_waiting#1-simplify-the-use-of-a-task-group. I want to make the task_group aware of arenas. The suggestion in the link makes arenas aware of task groups (which I feel would be the wrong way around). My point of view is more problematic for defer, but I feel that it makes more sense for run. As written, we have a workaround for now, but I feel that the current task_group interface is too error-prone to be used by itself without a wrapper that implements either point 1 or point 2.

griebelsi avatar Mar 28 '25 08:03 griebelsi