big_data_benchmarks
Only aggregations
This is a somewhat fairer comparison, and it lets dask run. Instead of materializing a column, we now aggregate (take the mean), and we don't ask dask to materialize the filtered dataframe.
These are my results using vaex-hdf5 (parquet is much slower):
And vaex:
With parquet it is slightly faster, but I cannot run the filtered part.