oneTBB callback on idle threads

Hi, I am looking for a way to keep track of when and how long any TBB thread stays idle within (the framework used by) my application.

After reading a bit of the oneTBB code, my understanding is that I can consider a thread to be idle when it is in the stealing loop in src/tbb/task_dispatcher.h (v2021.8.0-rc1, lines 193-232): https://github.com/oneapi-src/oneTBB/blob/bd619c54a6ed475ee404af2dd033f6f353ecf47f/src/tbb/task_dispatcher.h#L193-L232

Is this reasonable? That is,

is it OK to consider a thread as idle, that is, not executing any task, while it is in that loop?
are there other places where an idle thread could spend its time?

As for a way to notify my application when a thread is idle, I have been thinking of extending task_scheduler_observer by adding two methods, on_thread_idle() and on_thread_active().

Does this seem like a good approach?
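
To make the idea concrete, here is a minimal, standalone sketch of how an application could consume such notifications to accumulate idle time. It does not use oneTBB at all: idle_observer is a hypothetical stand-in for an extended task_scheduler_observer, and on_thread_idle() / on_thread_active() are the methods proposed above, not part of any released oneTBB API. The idle_timer class and its members are likewise illustrative names.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical interface: stands in for an extended task_scheduler_observer.
// on_thread_idle() / on_thread_active() are the methods proposed in this
// thread; they are NOT part of the released oneTBB API.
class idle_observer {
public:
    virtual ~idle_observer() = default;
    virtual void on_thread_idle() {}
    virtual void on_thread_active() {}
};

// Accumulates the idle time of a single thread. A real monitoring service
// would keep one instance per thread (for example in thread-local storage),
// because the two callbacks would run on the thread that goes idle.
class idle_timer : public idle_observer {
public:
    using clock = std::chrono::steady_clock;

    void on_thread_idle() override {
        assert(!idle_ && "idle/active notifications are expected in pairs");
        idle_ = true;
        start_ = clock::now();
    }

    void on_thread_active() override {
        assert(idle_ && "idle/active notifications are expected in pairs");
        idle_ = false;
        total_ += clock::now() - start_;
        ++periods_;
    }

    clock::duration total_idle() const { return total_; }
    unsigned periods() const { return periods_; }

private:
    bool idle_ = false;
    clock::time_point start_{};
    clock::duration total_{};
    unsigned periods_ = 0;
};
```

Using a steady (monotonic) clock keeps the measured intervals non-negative even if the system time is adjusted while a thread is idle.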

Thanks for any comments and suggestions - next I'll see if I can implement something along these lines.

.Andrea

Dec 14 '22 23:12 fwyzard

Hi @fwyzard, could you please describe your use case in more detail? Since the steal loop is part of the task-stealing mechanism, it might be considered a working state. In balanced scenarios this might introduce some overhead, because we would insert a virtual call on the hot path.

Dec 15 '22 11:12 pavelkumbrasev

Hi @pavelkumbrasev, of course.

CMSSW is the software used by the CMS experiment at CERN for the simulation, physics reconstruction and analysis of the experimental data, and its framework relies heavily on TBB to implement task-based multithreading.

An optimised application may run on the order of 10k TBB tasks per second per CPU thread.

However, we also have cases where the threads are idle, mostly for two reasons:

1. the input data is available at a lower rate than what the system can process, leading to a situation where there simply aren't any tasks to run;
2. sometimes there are tasks available, but their dependencies are not yet satisfied, so they aren't available to run.

We implemented a "service" inside the application that tracks how much (cpu and real) time is spent inside each "module" of the application (roughly, a module maps to a TBB task). However we don't have a way to measure how much time the threads spend idling, for example because of one of the two reasons above.

Hence my attempt to extend TBB to let it notify us about idling threads :-) Though I haven't yet thought about whether the two cases above (no input data, or unsatisfied dependencies) could be treated separately.

So far, my plan is to generate a pair of calls (idle/active) at most once per call to task_dispatcher::receive_or_steal_task(...), by making the idle call only once the thread reaches the waiter.pause(slot); call for the first time, and making the active call only once the thread exits the loop, and only if it was marked as idle.
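
The "at most one idle/active pair per call" logic can be sketched as follows. This is a heavily simplified, standalone model and not the oneTBB code: task, idle_hooks, receive_or_steal and the two callables are hypothetical names, try_get_task stands in for the various attempts to find work, and pause stands in for waiter.pause(slot), returning false here when the thread should give up and leave.

```cpp
#include <cassert>
#include <functional>

struct task { int id; };  // placeholder for a scheduler task

struct idle_hooks {
    std::function<void()> on_idle;    // stands in for the proposed on_thread_idle()
    std::function<void()> on_active;  // stands in for the proposed on_thread_active()
};

// Simplified model of a stealing loop (NOT the oneTBB implementation).
// Emits at most one on_idle/on_active pair per call.
task* receive_or_steal(const std::function<task*()>& try_get_task,
                       const std::function<bool()>& pause,
                       const idle_hooks& hooks) {
    assert(hooks.on_idle && hooks.on_active);
    bool marked_idle = false;
    task* t = nullptr;
    while ((t = try_get_task()) == nullptr) {
        if (!marked_idle) {
            // First time this call is about to pause: report the thread as idle.
            hooks.on_idle();
            marked_idle = true;
        }
        if (!pause()) {
            break;  // the waiter asked the thread to leave without a task
        }
    }
    if (marked_idle) {
        // Only report "active" if "idle" was reported earlier in this call.
        hooks.on_active();
    }
    return t;
}
```

When a task is found on the first attempt, neither hook fires, so in this model a busy thread pays only for one branch on a local flag.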

Let me know what you think, and if I can provide more information!

Dec 15 '22 11:12 fwyzard

Seems you're right: entry_idle might be called just before waiter.pause(), and the leave call on return. However, the applicability of such an API is rather limited, maybe to statistics. Do you observe that in such scenarios ("1" and "2") the steal loop is important enough to mark it as a separate state? In the described scenarios it will lead to frequent waiter.pause() calls, i.e. 2 * P process pauses + 100 yields, which should take ~100 us in total or even less. After one thread marks the internal arena as empty, the worker threads will leave at that moment.

Dec 15 '22 13:12 pavelkumbrasev

Yes, my goal is to monitor the active vs idle time, not to affect the properties of the threads.

I know that the fraction of time spent idle in scenario 1. can be significant.

I do not know yet what the fraction of time ending up in scenario 2. could be - first I would need to find a way to monitor when it happens :-)

Dec 15 '22 13:12 fwyzard

I can review your PR once you submit it, and after that we could discuss the applicability of this API in the general case. Are you OK with this plan?

Dec 15 '22 13:12 pavelkumbrasev

Sure, thanks.

Though, I have one question first. In order to call the two new task_scheduler_observer methods, on_thread_idle() and on_thread_active(), I think I need to extend the observer_list class and implement the equivalent of do_notify_entry_observers() and do_notify_exit_observers() for them. Looking at the implementation of those two methods, they are similar but not identical. Could you give me a brief explanation of why they are different?

Thank you, .Andrea

Dec 15 '22 14:12 fwyzard

TBH that's pretty complicated logic. At first glance, entry iterates until it reaches last == nullptr, while exit iterates up to the last notified observer, which was stored in TLS. I wonder if you could change the interface to reuse the do_notify_entry_observers and do_notify_exit_observers methods by passing a lambda with the needed call, e.g. tso->on_scheduler_entry(worker); or tso->on_thread_idle(worker); in do_notify_entry_observers, and similarly for do_notify_exit_observers.
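
The "pass a lambda with the needed call" idea can be illustrated with a small standalone sketch. It is not the oneTBB observer_list: the observer struct, notify_observers and counting_observer are hypothetical names, the list is a plain std::vector, and the sketch deliberately leaves out how the real entry and exit routines track the last notified observer.

```cpp
#include <cassert>
#include <vector>

// Hypothetical observer interface mirroring the calls named in this thread;
// on_thread_idle() / on_thread_active() are proposed, not existing, methods.
struct observer {
    virtual ~observer() = default;
    virtual void on_scheduler_entry(bool /*is_worker*/) {}
    virtual void on_scheduler_exit(bool /*is_worker*/) {}
    virtual void on_thread_idle(bool /*is_worker*/) {}
    virtual void on_thread_active(bool /*is_worker*/) {}
};

// One traversal routine shared by every kind of notification: the caller
// supplies the member call as a callable instead of duplicating the loop.
template <typename Call>
void notify_observers(const std::vector<observer*>& observers, Call call) {
    for (observer* o : observers) {
        assert(o != nullptr);
        call(*o);
    }
}

// Example observer that counts two of the notifications it receives.
struct counting_observer : observer {
    int entries = 0;
    int idles = 0;
    void on_scheduler_entry(bool) override { ++entries; }
    void on_thread_idle(bool) override { ++idles; }
};
```

With this shape, adding a new notification only requires a new one-line lambda at the call site, for example `notify_observers(list, [](observer& o) { o.on_thread_idle(true); });`.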

Dec 15 '22 14:12 pavelkumbrasev

@pavelkumbrasev, thanks for the suggestions.

Please find at https://github.com/oneapi-src/oneTBB/pull/995 my first attempt to implement the idle thread notifications.

Dec 20 '22 21:12 fwyzard

It's split into two commits:

the first implements the minimal changes, but with a large amount of code duplication in src/tbb/observer_proxy.h/.cpp;
the second attempts to reduce this duplication, though I'm not very happy about the result.

Dec 20 '22 21:12 fwyzard

From a first round of checks in our application, it seems to behave as intended. Next I'll try to implement some proper monitoring on top of it, and then measure the impact on the application performance.

Dec 20 '22 21:12 fwyzard

Hi @fwyzard, any updates regarding your research?

Jan 09 '23 09:01 pavelkumbrasev

@pavelkumbrasev is this issue still relevant?

Aug 13 '24 08:08 arunparkugan

It's something I haven't actively looked into for a while, but I would like to come back to it.

Aug 13 '24 09:08 fwyzard
