benchbase: load phase is currently possible only using a single driver

Open mariadb-DmitryVolkov opened this issue 2 years ago • 3 comments

For the TPC-C workload, there is no way to specify the warehouse number to start with (it always starts at 1). This prevents running the load phase from multiple drivers. A single driver becomes a bottleneck due to high CPU usage when loading a large (or distributed) database.

Jul 15 '22 19:07 mariadb-DmitryVolkov

Looks like this is a trivial change: https://github.com/cmu-db/benchbase/blob/f663fe36d0119289b1f12a8e877c080de51db8e1/src/com/oltpbenchmark/benchmarks/tpcc/TPCCBenchmark.java#L75-L96

Instead of hard-coding the start in for (int w = 0; it should be a configuration parameter (0 by default). This would allow creating as many drivers as needed and setting the proper offset for each driver.
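A minimal sketch of the offset idea, assuming a hypothetical per-driver start offset (the class and method names below are illustrative only and are not part of BenchBase). Each driver would be launched with its own offset, so the warehouse ID ranges do not overlap:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: not BenchBase code. Shows how a configurable
// start offset would let several drivers load disjoint warehouse ranges.
final class WarehouseOffsetSketch {

    // startOffset: number of warehouses owned by earlier drivers (0 by default).
    // count: number of warehouses this driver should load.
    static List<Integer> warehouseIds(int startOffset, int count) {
        if (startOffset < 0 || count < 0) {
            throw new IllegalArgumentException("offset and count must be non-negative");
        }
        List<Integer> ids = new ArrayList<>(count);
        for (int w = startOffset; w < startOffset + count; w++) {
            // Warehouse IDs are 1-based, matching "it always starts with 1" above.
            ids.add(w + 1);
        }
        return ids;
    }

    public static void main(String[] args) {
        int warehousesPerDriver = 4; // assumed value, for illustration
        for (int driver = 0; driver < 3; driver++) {
            int offset = driver * warehousesPerDriver;
            System.out.println("driver " + driver + " loads "
                    + warehouseIds(offset, warehousesPerDriver));
        }
    }
}
```

With an offset of 0 the behaviour matches the current single-driver load, which is why 0 makes a safe default.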

Jul 15 '22 20:07 mariadb-DmitryVolkov

I actually think the right fix is to separate scalefactor from the warehouse generation. That way we could increase the # of warehouses (and the range of W_IDs like you need) while also scaling the size of the other tables (DISTRICT, CUSTOMER, etc).
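One possible reading of this proposal, sketched under stated assumptions: the warehouse range (first ID and count) becomes its own setting, and scalefactor only scales the per-warehouse tables. The class, field names, and the choice to scale customers per district are hypothetical and not taken from BenchBase; the constants 10 districts per warehouse and 3000 customers per district are the standard TPC-C cardinalities.

```java
// Hypothetical sketch: not BenchBase code. Separates the warehouse ID
// range from the factor that scales the dependent tables.
final class LoadPlanSketch {

    // Standard TPC-C cardinalities.
    static final int DISTRICTS_PER_WAREHOUSE = 10;
    static final int CUSTOMERS_PER_DISTRICT = 3000;

    final int firstWarehouseId; // assumed new setting: first W_ID for this driver
    final int warehouseCount;   // assumed new setting: warehouses for this driver
    final double scaleFactor;   // scales per-warehouse table sizes only

    LoadPlanSketch(int firstWarehouseId, int warehouseCount, double scaleFactor) {
        if (firstWarehouseId < 1 || warehouseCount < 1 || scaleFactor <= 0) {
            throw new IllegalArgumentException("invalid load plan");
        }
        this.firstWarehouseId = firstWarehouseId;
        this.warehouseCount = warehouseCount;
        this.scaleFactor = scaleFactor;
    }

    // Last W_ID in this driver's inclusive range.
    int lastWarehouseId() {
        return firstWarehouseId + warehouseCount - 1;
    }

    // Customers per district after scaling, never fewer than one.
    int customersPerDistrict() {
        return (int) Math.max(1, Math.round(CUSTOMERS_PER_DISTRICT * scaleFactor));
    }

    // Total CUSTOMER rows this driver would generate.
    long customerRows() {
        return (long) warehouseCount * DISTRICTS_PER_WAREHOUSE * customersPerDistrict();
    }

    public static void main(String[] args) {
        LoadPlanSketch plan = new LoadPlanSketch(101, 100, 0.5);
        System.out.println("W_ID range: " + plan.firstWarehouseId + ".." + plan.lastWarehouseId());
        System.out.println("CUSTOMER rows: " + plan.customerRows());
    }
}
```

Under this reading, a second driver would simply be given a different first warehouse ID, while every driver shares the same scale factor so the dependent tables stay consistent.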

Jul 16 '22 12:07 apavlo

Thanks, @apavlo. Sure, I think you are proposing a bigger change. Unfortunately, I can't help with that bigger change - I don't know the code well enough yet. But I would highly appreciate it if you could implement it. This is a big showstopper for using benchbase for distributed databases.

Jul 17 '22 02:07 mariadb-DmitryVolkov
