Performance fix: replace WorkerPool sleeping with condition variable
I couldn't stare at concurrency control logic anymore this afternoon, so I threw together a quick experiment replacing our WorkerPool's sleep-with-exponential-backoff behavior with C++11's std::condition_variable. This change applies to any WorkerPool created by a MonoQueuePool, which currently covers the query worker threads, the parallel worker threads for codegen, and the Brain worker threads. A rough sketch of the before/after shape of the worker loop is below.
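For anyone skimming, this is a hedged sketch of the pattern, not the actual WorkerPool/MonoQueuePool code; the queue type, loop structure, and names are illustrative. The old loop polls the queue and sleeps with a growing backoff when idle; the new loop blocks on a condition variable until a producer enqueues work and calls notify_one().

```cpp
#include <algorithm>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>

// Before: poll the queue, and when it is empty sleep with exponential backoff.
void PollingWorkerLoop(std::queue<std::function<void()>> &tasks,
                       std::mutex &mutex, const bool &shutting_down) {
  auto backoff = std::chrono::microseconds(10);
  for (;;) {
    std::function<void()> task;
    {
      std::lock_guard<std::mutex> lock(mutex);
      if (shutting_down) return;
      if (!tasks.empty()) {
        task = std::move(tasks.front());
        tasks.pop();
      }
    }
    if (task) {
      task();
      backoff = std::chrono::microseconds(10);  // reset after useful work
    } else {
      std::this_thread::sleep_for(backoff);     // idle: back off and re-poll
      backoff = std::min(backoff * 2, std::chrono::microseconds(1000));
    }
  }
}

// After: block on a condition variable; the producer enqueues under the same
// mutex and calls cv.notify_one(), so the worker wakes as soon as work
// arrives, with no polling interval to tune.
void BlockingWorkerLoop(std::queue<std::function<void()>> &tasks,
                        std::mutex &mutex, std::condition_variable &cv,
                        const bool &shutting_down) {
  for (;;) {
    std::function<void()> task;
    {
      std::unique_lock<std::mutex> lock(mutex);
      cv.wait(lock, [&] { return shutting_down || !tasks.empty(); });
      if (tasks.empty()) return;  // woken for shutdown
      task = std::move(tasks.front());
      tasks.pop();
    }
    task();  // run the task outside the critical section
  }
}
```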
I benchmarked using the same configs from #1401:
- TPC-C: 15 runs, 60 seconds each, 4 terminals, scale factor 4
- YCSB (read-only): 15 runs, 60 seconds each, 4 terminals, scale factor 1000
| Benchmark | master μ | master σ | branch μ | branch σ |
|---|---|---|---|---|
| TPC-C | 329 | 22 | 344 | 11 |
| YCSB | 16580 | 167 | 18186 | 56 |
It seems 5-10% faster in limited testing.
Right now I'm mainly looking for feedback on whether this has already been explored, and on any concerns others might have with this approach. My main question is scalability: I'd like to see how this fares on a machine with multiple sockets and a lot of cores, and whether it falls apart once we get our TPS numbers up to where they should be.
Coverage increased (+0.005%) to 76.371% when pulling 64567a5fcf9afe679c29927f05091e42b9ad429c on mbutrovich:worker_queue into 2406b763b91d9cee2a5d9f4dee01c761b476cef6 on cmu-db:master.
Interesting. The reason we did not use a condition variable is that it requires a mutex, which may result in poor performance at high throughput. I would argue that we hold this change for a while and see whether it still yields a throughput improvement once we reach a higher TPS.
We can hold this until we perform additional measurements. There are few instances where sleep is the right solution; it is usually chosen for its simplicity. If mutex overhead is a concern, or is measured to be a concern, then we should look at minimizing use of the mutex, for instance. Event-driven is the right choice, with suitable optimization as needed.
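One common way to keep the mutex off the producer's hot path, sketched below purely as an illustration (the class and member names are hypothetical, not Peloton's API): track how many workers are parked and skip the lock-free fast path's notify entirely when nobody is waiting, so producers only touch the condition variable when a wakeup is actually needed.

```cpp
#include <atomic>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>

class NotifyWhenNeededQueue {
 public:
  void Submit(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      tasks_.push(std::move(task));
    }
    // Fast path: if no worker is parked, nobody needs a wakeup, so the
    // producer never calls into the condition variable at all.
    if (sleepers_.load() > 0) {
      cv_.notify_one();
    }
  }

  void Shutdown() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      shutting_down_ = true;
    }
    cv_.notify_all();
  }

  void WorkerLoop() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        if (tasks_.empty() && !shutting_down_) {
          // Register as a sleeper before waiting; because both the push and
          // this empty-check happen under the mutex, a producer either sees
          // the sleeper count or the worker sees the new task.
          sleepers_.fetch_add(1);
          cv_.wait(lock, [this] { return shutting_down_ || !tasks_.empty(); });
          sleepers_.fetch_sub(1);
        }
        if (tasks_.empty()) return;  // shutdown with nothing left to run
        task = std::move(tasks_.front());
        tasks_.pop();
      }
      task();  // run outside the critical section
    }
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> tasks_;
  std::atomic<int> sleepers_{0};
  bool shutting_down_ = false;
};
```

Under load the queue is rarely empty, so producers mostly take the short push critical section and never notify; the condition variable only comes into play when workers are genuinely idle, which is exactly when its cost does not matter.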