
enqueue tasks never starting or taking forever to start/TBB scheduler spinning and doing nothing

Open TrianglesPCT opened this issue 3 years ago • 2 comments

Is there any known reason why tasks started like this:

tbb::task::enqueue(*new(tbb::task::allocate_root()) TBB_Task<task_type>(std::move(task)), tbb::priority_normal);

Would not start for a very long time, or in some cases apparently never start?

The TBB threads are spinning like mad internally and using nearly 100% CPU but not doing any work.

Most seem to be spinning here (various places in receive_or_steal_task):

    [Inline Frame] tbb.dll!tbb::internal::msvc_intrinsics::pause(unsigned __int64 delay) Line 100  C++
    [Inline Frame] tbb.dll!tbb::internal::atomic_backoff::bounded_pause() Line 377  C++
    [Inline Frame] tbb.dll!tbb::internal::prolonged_pause() Line 309  C++
    tbb.dll!tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::receive_or_steal_task(__int64 & completion_ref_count, __int64 isolation) Line 281  C++
    tbb.dll!tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all(tbb::task & parent, tbb::task * child) Line 634  C++
    tbb.dll!tbb::internal::arena::process(tbb::internal::generic_scheduler & s) Line 148  C++
    tbb.dll!tbb::internal::market::process(rml::job & j) Line 711  C++

This is not a normal thing; it used to almost never happen, but lately it is happening more and more. I think it's because I increased the number of tasks I allow to be submitted concurrently and optimized the program, so it now runs much faster and submits more tasks.

It often takes a few hundred thousand of these tasks before it starts happening (all but the most recent ~100 have completed).

TBB does sometimes recover, but it can take a random amount of time, and it doesn't always recover.

I am using an older version of TBB, as the API seems to have changed out from under me, and it looks like a lot of work to update:

    #define TBB_INTERFACE_VERSION 11003
    #define TBB_INTERFACE_VERSION_MAJOR TBB_INTERFACE_VERSION/1000

Is this a known issue that was fixed in more recent versions? If so, I can update... ugh.

Also this is on a 12 core system, TBB has about 24 threads active.

TrianglesPCT avatar Aug 29 '22 20:08 TrianglesPCT

It seems related to task_arena. I had assumed TBB used a single arena internally for this global enqueue, but it appears that it does not; each thread has a separate arena, which TBB apparently does not handle well (seems like a bug, maybe too many spin locks or lock-free methods that never make progress?). Forcing all of these tasks into a single arena seems to have fixed the problem.
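For anyone else hitting this: roughly, the workaround looks like the sketch below (assuming the TBB 2019-era tbb::task_arena::enqueue API; g_arena and submit are placeholder names, not my actual code).

#include <tbb/task_arena.h>
#include <utility>

// One shared arena that all fire-and-forget tasks go into, instead of
// letting each external thread create its own implicit arena.
static tbb::task_arena g_arena;

template <typename F>
void submit(F task) {
    g_arena.enqueue([t = std::move(task)]() mutable { t(); });
}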

TrianglesPCT avatar Aug 29 '22 21:08 TrianglesPCT

Hi @TrianglesPCT, it is strange behavior that an enqueued task never executes at all. First of all, it makes sense to migrate to oneTBB, because we have fixed a lot of bugs. From your description I would guess that you have several external threads in your application. Each external thread that submits work into TBB (and also oneTBB) via parallel_for, task_group, etc. gets an implicit arena inside the library into which it submits work. In that case, old TBB has a possible thread-oversubscription bug that might lead to 24 threads on a 12-core system. It also causes workers to migrate between arenas, which increases overhead. You can instead use one task_arena and submit all work through it (this will not create implicit arenas for each external thread). For example:

#include <tbb/task_arena.h>
#include <tbb/parallel_for.h>
#include <thread>

int main() {
    tbb::task_arena a;  // one explicit arena shared by both external threads

    auto thread_func = [&a] {
        a.execute([] {
            // The parallel_for body takes the loop index.
            tbb::parallel_for(1, 10000, [](int) { /* user lambda */ });
        });
    };
    std::thread t(thread_func);  // second external thread, same arena
    thread_func();               // calling thread joins the same arena too

    t.join();
}

Both threads will submit and execute work in the same arena.
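If oversubscription is a concern, you can also cap the arena's concurrency explicitly. A small sketch (the integer constructor argument sets the arena's maximum concurrency in both old TBB and oneTBB):

#include <tbb/task_arena.h>
#include <thread>

// At most one worker per hardware thread joins this arena, no matter
// how many external threads submit work into it.
tbb::task_arena a(static_cast<int>(std::thread::hardware_concurrency()));

Also note that after migrating to oneTBB, the low-level tbb::task API from your first snippet is gone; fire-and-forget work is submitted via task_arena::enqueue, or through tbb::task_group if you need to wait on it.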

pavelkumbrasev avatar Aug 30 '22 09:08 pavelkumbrasev

@TrianglesPCT is this issue still relevant for you? Could you please respond?

isaevil avatar Oct 05 '22 11:10 isaevil

No, I was able to work around it by using only a single arena.

TrianglesPCT avatar Oct 12 '22 16:10 TrianglesPCT

@TrianglesPCT so I guess this issue can be closed? Feel free to reopen if any questions are left.

isaevil avatar Oct 13 '22 07:10 isaevil