benchbase
Currently the load phase can only be run from a single driver
For the TPC-C workload, there is no way to specify the warehouse number to start with (it always starts with 1). This prevents running the load from multiple drivers. With a large (or distributed) database, a single driver becomes a bottleneck due to high CPU usage.
Looks like this is a trivial change: https://github.com/cmu-db/benchbase/blob/f663fe36d0119289b1f12a8e877c080de51db8e1/src/com/oltpbenchmark/benchmarks/tpcc/TPCCBenchmark.java#L75-L96
Instead of the hard-coded start in for (int w = 0; the starting value should be a configuration parameter (0 by default). This would allow creating as many drivers as needed and setting the proper offset for each driver.
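To make the idea concrete, here is a minimal standalone sketch of how a per-driver start offset could split the warehouse range across several load drivers. It is not BenchBase code: the class name WarehouseOffsetSketch, the method sliceForDriver, and the notion of a configurable start offset are all hypothetical and only illustrate the arithmetic each driver would need.

```java
public class WarehouseOffsetSketch {

    /**
     * Hypothetical helper, not part of BenchBase.
     * Splits totalWarehouses as evenly as possible across numDrivers and
     * returns {startOffset, count} for the given driver, where startOffset
     * is the 0-based value the load loop would start from.
     */
    static int[] sliceForDriver(int driverIndex, int numDrivers, int totalWarehouses) {
        if (numDrivers <= 0 || driverIndex < 0 || driverIndex >= numDrivers) {
            throw new IllegalArgumentException("invalid driver index or driver count");
        }
        int base = totalWarehouses / numDrivers;   // warehouses every driver gets
        int extra = totalWarehouses % numDrivers;  // the first 'extra' drivers get one more
        int count = base + (driverIndex < extra ? 1 : 0);
        int startOffset = driverIndex * base + Math.min(driverIndex, extra);
        return new int[] {startOffset, count};
    }

    public static void main(String[] args) {
        int numDrivers = 3;
        int totalWarehouses = 10;
        for (int d = 0; d < numDrivers; d++) {
            int[] slice = sliceForDriver(d, numDrivers, totalWarehouses);
            // Assumed loader loop shape: w runs from the offset instead of always from 0.
            for (int w = slice[0]; w < slice[0] + slice[1]; w++) {
                int warehouseId = w + 1; // W_IDs are 1-based in this sketch
                System.out.println("driver " + d + " loads warehouse " + warehouseId);
            }
        }
    }
}
```

With an offset of 0 and a count equal to the total, this reduces to the current single-driver behavior, which is why 0 is a natural default.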
I actually think the right fix is to separate scalefactor from the warehouse generation. That way we could increase the # of warehouses (and the range of W_IDs like you need) while also scaling the size of the other tables (DISTRICT, CUSTOMER, etc).
Thanks, @apavlo. Sure, I think you are proposing a bigger change. Unfortunately, I can't help with that bigger change - I don't know the code well enough yet. But I would highly appreciate it if you could implement it. This is a big showstopper for using benchbase for distributed databases.