ClickBench

Add DataFusion

Open waitingkuo opened this issue 2 years ago • 14 comments

Add DataFusion. I have not yet found a "c6a.4xlarge, 500gb gp2" VM to test this on.

waitingkuo avatar Aug 02 '22 20:08 waitingkuo

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Aug 02 '22 20:08 CLAassistant

Thank you! I'm very interested in this.

alexey-milovidov avatar Aug 06 '22 20:08 alexey-milovidov

I tried to run it on c6a.4xlarge

[2.638, 0.331, 0.305],
[0.516, 0.362, 0.362],
[1.312, 0.938, 0.888],
[1.427, 0.483, 0.487],
[3.006, 2.958, 2.953],
[null, null, null],
[0.479, 0.358, 0.353],
[0.450, 0.362, 0.367],
[3.952, 3.568, 3.370],
[6.014, 5.381, 5.570],
[2.301, 1.797, 1.726],
[2.810, 2.234, 2.139],
[null, null, null],
[null, null, null],
[null, null, null],
[4.722, 4.707, 4.469],
[null, null, null],
[null, null, null],
[null, null, null],
[0.786, 0.510, 0.494],
[14.218, 8.460, 8.409],
[17.441, 11.445, 11.307],
[41.716, 28.732, 28.761],
[129.533, 95.898, 95.534],
[7.422, 5.185, 5.246],
[10.154, 9.899, 10.094],
[13.262, 11.534, 11.464],
[18.279, 13.104, 14.166],
[225.516, 232.958, 234.723],
[2.859, 3.152, 3.202],
[8.917, 6.679, 6.655],
[11.631, 7.802, 7.855],
[0.081, 0.054, 0.056],
[null, null, null],
[null, null, null],
[4.864, 4.564, 4.709],
[null, null, null],
[null, null, null],
[null, null, null],
[null, null, null],
[null, null, null],
[null, null, null],
[null, null, null],

qoega avatar Aug 09 '22 12:08 qoega
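In the results above, each row holds the three timings in seconds for one query, and a row of nulls marks a query that did not complete; 16 of the 43 rows are null in this run. A minimal sketch of how such rows could be parsed to list the failing queries, using only the first six rows posted above (the function names `parse_results` and `failed_queries` are illustrative, not part of ClickBench):

```python
import json

def parse_results(text: str) -> list:
    """Parse rows such as '[2.638, 0.331, 0.305],' into Python lists.

    Each row is valid JSON once the trailing comma is removed,
    so JSON null becomes None.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip().rstrip(",")
        if line:
            rows.append(json.loads(line))
    return rows

def failed_queries(rows: list) -> list:
    """Return the 1-based positions of rows where every run is None."""
    return [i for i, runs in enumerate(rows, start=1)
            if all(r is None for r in runs)]

# First six rows of the run posted above.
sample = """
[2.638, 0.331, 0.305],
[0.516, 0.362, 0.362],
[1.312, 0.938, 0.888],
[1.427, 0.483, 0.487],
[3.006, 2.958, 2.953],
[null, null, null],
"""

rows = parse_results(sample)
print(failed_queries(rows))  # [6]
```

Positions here are simply row numbers in the pasted list, starting at 1.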

@waitingkuo Let's submit these results and then improve?

alexey-milovidov avatar Aug 10 '22 00:08 alexey-milovidov

@alexey-milovidov We haven't fixed some issues in our master branch yet. I'll modify the installation part to build the latest version and then submit it.

waitingkuo avatar Aug 10 '22 10:08 waitingkuo

@qoega thank you!

waitingkuo avatar Aug 10 '22 10:08 waitingkuo

@qoega I fixed some issues and all the test cases pass now 😃 I've added the results (the benchmark ran on Azure; I tried my best to find a similar VM there).

@alexey-milovidov it's ready to merge, thank you~

waitingkuo avatar Aug 10 '22 15:08 waitingkuo

Closes apache/arrow-datafusion#2902 and apache/arrow-datafusion#3048

waitingkuo avatar Aug 10 '22 15:08 waitingkuo

Current version

[2.629, 0.334, 0.308],
[0.501, 0.361, 0.369],
[1.268, 0.874, 0.862],
[1.380, 0.484, 0.479],
[2.993, 3.000, 3.005],
[5.118, 4.026, 4.026],
[0.502, 0.358, 0.347],
[0.444, 0.359, 0.367],
[4.121, 3.484, 3.534],
[5.929, 5.335, 5.379],
[1.842, 1.357, 1.336],
[2.241, 1.614, 1.722],
[5.574, 4.465, 4.589],
[7.704, 6.852, 6.775],
[5.938, 5.027, 4.977],
[5.022, 4.762, 4.682],
[10.981, 9.900, 9.915],
[8.992, 7.361, 7.306],
[21.124, 17.996, 17.917],
[1.180, 0.443, 0.416],
[13.902, 8.194, 7.936],
[16.771, 10.375, 10.443],
[40.808, 27.454, 27.461],
[115.434, 82.709, 82.571],
[7.043, 4.855, 4.761],
[10.034, 10.077, 9.909],
[13.047, 11.289, 11.196],
[13.947, 7.928, 7.885],
[223.222, 229.755, 234.819],
[1.829, 2.235, 2.208],
[8.378, 6.282, 6.490],
[10.424, 7.384, 7.216],
[0.055, 0.055, 0.056],
[26.220, 20.760, 20.288],
[27.056, 20.990, 20.964],
[4.797, 4.557, 4.788],
[0.414, 0.270, 0.274],
[0.297, 0.225, 0.220],
[0.267, 0.241, 0.219],
[0.698, 0.533, 0.518],
[0.147, 0.087, 0.089],
[0.127, 0.085, 0.091],
[0.112, 0.076, 0.075],

qoega avatar Aug 10 '22 16:08 qoega
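Compared with the earlier run on c6a.4xlarge, this list no longer contains null rows. A rough sketch of how two such runs could be compared to find the queries that went from failing to passing, using rows 5 to 7 of each posted list (the helper `newly_passing` and its `first_query` parameter are hypothetical names introduced for illustration):

```python
def newly_passing(before: list, after: list, first_query: int = 1) -> list:
    """Return positions of rows that were all None before and fully timed after.

    `first_query` is the position of the first row in the slices passed in.
    """
    fixed = []
    for i, (old, new) in enumerate(zip(before, after), start=first_query):
        was_failing = all(t is None for t in old)
        now_passing = all(t is not None for t in new)
        if was_failing and now_passing:
            fixed.append(i)
    return fixed

# Rows 5-7 of the earlier run (None stands for null).
before = [
    [3.006, 2.958, 2.953],
    [None, None, None],
    [0.479, 0.358, 0.353],
]

# Rows 5-7 of the current run.
after = [
    [2.993, 3.000, 3.005],
    [5.118, 4.026, 4.026],
    [0.502, 0.358, 0.347],
]

print(newly_passing(before, after, first_query=5))  # [6]
```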

@qoega thank you

Unfortunately, I just found an issue here. I'll make it right again as soon as possible.

waitingkuo avatar Aug 10 '22 16:08 waitingkuo

No problem. I just have that instance ready for benchmarks and can pull and run anytime.

qoega avatar Aug 10 '22 17:08 qoega

@qoega @alexey-milovidov I submitted an issue about the Parquet data: #18

hits.parquet and hits_{n}.parquet are slightly different.

waitingkuo avatar Aug 10 '22 17:08 waitingkuo

DataFusion's Parquet importer doesn't support specifying a schema for now; the schema is inferred directly from the Parquet metadata.

waitingkuo avatar Aug 10 '22 18:08 waitingkuo
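Because the schema comes from the file metadata, a difference between hits.parquet and hits_{n}.parquet can surface as a difference in column types. One way to check this, sketched under the assumption that pyarrow is installed: `load_schema` reads only the Parquet metadata, and `diff_schemas` (a hypothetical helper) compares two name-to-type mappings. The column names and types in the demo are made up and do not describe the actual difference between the files.

```python
def load_schema(path: str) -> dict:
    """Map column name to type string, read from Parquet metadata only.

    Assumes pyarrow is available; it is imported lazily so that
    diff_schemas can be used without it.
    """
    import pyarrow.parquet as pq
    schema = pq.read_schema(path)
    return {field.name: str(field.type) for field in schema}

def diff_schemas(a: dict, b: dict) -> dict:
    """Return columns whose type differs or that exist in only one schema."""
    diff = {}
    for name in sorted(set(a) | set(b)):
        if a.get(name) != b.get(name):
            diff[name] = (a.get(name), b.get(name))
    return diff

# Hypothetical schemas, for illustration only.
single = {"col_a": "int64", "col_b": "binary"}
partitioned = {"col_a": "int64", "col_b": "string"}

print(diff_schemas(single, partitioned))  # {'col_b': ('binary', 'string')}
```

With the real files, the call would look like `diff_schemas(load_schema("hits.parquet"), load_schema("hits_0.parquet"))`.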

I ran the benchmark again and it works now. I thought there were some issues, but it turned out that I had used hits_0.parquet for the quick test. It's ready to be merged again :D

waitingkuo avatar Aug 10 '22 18:08 waitingkuo

@qoega @alexey-milovidov I've brought the results up to date. Please let me know if there's anything I need to improve. Thanks~

waitingkuo avatar Aug 16 '22 15:08 waitingkuo

@waitingkuo Thank you! I also wanted to re-run on AWS for better comparison and to check for reproducibility, but let's first merge as is.

alexey-milovidov avatar Aug 17 '22 05:08 alexey-milovidov