poco
poco copied to clipboard
The load balancing issue in Poco::ActiveThreadPool
Describe the bug
Each thread in Poco::ActiveThreadPool
has its own Poco::NotificationQueue
.
If I have 2 types of Runnable like this:
class LongTimeTask : public Poco::Runnable {
void run() {
sleep(2 * 60 * 1000);
}
};
class ShortTimeTask : public Poco::Runnable {
void run() {
sleep(1);
}
};
int capacity = 2;
Poco::ActiveThreadPool pool(capacity);
for (int i = 0; i < 10; i++) {
pool.start(new LongTimeTask); // [BUG!!!] all long time task will enqueue to thread#1
pool.start(new ShortTimeTask); // [BUG!!!] all short time task will enqueue to thread#2
}
thread#2 will always in idle, and thread#1 will always in busy !!!
Hi! I can't agree with a bug. I verified behaivour on the latest main branch with this code
Poco::AtomicCounter lttCount;
class LongTimeTask : public Poco::Runnable {
Poco::AtomicCounter &_counter;
public:
LongTimeTask(Poco::AtomicCounter &counter) : _counter(counter) {}
void run() override {
_counter++;
puts("ltt task");
Poco::Thread::sleep(2 * 1000);
}
};
Poco::AtomicCounter sttCount;
class ShortTimeTask : public Poco::Runnable {
Poco::AtomicCounter &_counter;
public:
ShortTimeTask(Poco::AtomicCounter &counter) : _counter(counter) {}
void run() override {
_counter++;
puts("stt task");
Poco::Thread::sleep(1);
}
};
int capacity = 2;
Poco::ActiveThreadPool pool(capacity);
for (int i = 0; i < 10; i++) {
LongTimeTask ltt(lttCount);
pool.start(ltt);
ShortTimeTask stt(sttCount);
pool.start(stt);
}
pool.joinAll();
assertEqual(10, lttCount.value());
assertEqual(10, sttCount.value());
Output was
testActiveThreadLoadBalancing: ltt task
stt task
stt task
stt task
stt task
stt task
stt task
stt task
stt task
stt task
stt task
ltt task
ltt task
ltt task
ltt task
ltt task
ltt task
ltt task
ltt task
ltt task
Please see this picture.
If there are 5 LongTime runnables, and 5 ShortTime runnables. and 2 threads in the pool.
Each LongTime runnable sleep 5 seconds,
Each ShortTime runnable sleep 1 seconds,
If the load balancing is good, It will cost 15 seconds to finish all 10 runnables,
but in Poco::ActiveThreadPool
is may cost 25 seconds.
This seems to me like an enhancement request, not a bug.
Implementation currently assigns tasks to thread in round-robin way when adding them.
Possible improvement is to assign them when they actually start running from a single queue..
Hi @siren186 and @matejk ! Please look at my PR, I think that my optimization can enhance some usecases
FYI
capacity = 2 optimization = false ./Foundation-testrunner testActiveThreadLoadBalancing 0.05s user 0.02s system 0% cpu 11.496 total
capacity = 2 optimization = true ./Foundation-testrunner testActiveThreadLoadBalancing 0.06s user 0.01s system 0% cpu 10.545 total
capacity = 4 optimization = false ./Foundation-testrunner testActiveThreadLoadBalancing 0.06s user 0.02s system 1% cpu 6.605 total
capacity = 4 optimization = true ./Foundation-testrunner testActiveThreadLoadBalancing 0.05s user 0.02s system 1% cpu 6.628 total
capacity = 32 optimization = false ./Foundation-testrunner testActiveThreadLoadBalancing 0.06s user 0.02s system 3% cpu 1.797 total
capacity = 32 optimization = true ./Foundation-testrunner testActiveThreadLoadBalancing 0.06s user 0.02s system 4% cpu 1.692 total