JuliaDB.jl
groupby operations slower on JuliaDB compared to DataFrames
On the dataset defined here, a groupby operation is much slower on JuliaDB than on DataFrames: about 9 times slower in the benchmark below (6.9 s versus 0.74 s).
The following benchmark was run on a dataset of size N = 1e8 from the link above.
Grouping by one column and calculating the sum of another:
On JuliaDB
@btime groupby(sum, df, :id1, select=:v1);
6.908 s (1710 allocations: 1.68 GiB)
On DataFrames
@btime combine(groupby(df, :id1), :v1=>sum)
743.827 ms (222 allocations: 762.96 MiB)
These timings were measured on Julia 1.4, DataFrames v0.21.0, and JuliaDB v0.13.0.
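
For anyone who wants to try this without downloading the full dataset, a scaled-down reproduction might look like the sketch below. It is not the original benchmark script: the data is synthetic, the size `N = 10^6` and the value ranges are assumptions, and only the column names `id1` and `v1` are taken from the calls above. Both packages export `groupby`, so the sketch uses `import` and qualified names to keep the two apart.

```julia
# Scaled-down reproduction sketch; the data is synthetic, not the linked dataset.
import JuliaDB, DataFrames   # `import` avoids the clash between the two exported `groupby`s
using BenchmarkTools

N = 10^6                     # assumed size; the report above used N = 1e8
id1 = rand(1:100, N)         # hypothetical grouping key
v1 = rand(1:5, N)            # hypothetical value column

t = JuliaDB.table((id1 = id1, v1 = v1))        # JuliaDB table
df = DataFrames.DataFrame(id1 = id1, v1 = v1)  # DataFrame with the same columns

# Run each operation once so both results can be compared by hand.
res_db = JuliaDB.groupby(sum, t, :id1, select = :v1)
res_df = DataFrames.combine(DataFrames.groupby(df, :id1), :v1 => sum)

# Interpolate with `$` so global-variable overhead is not part of the timing.
@btime JuliaDB.groupby(sum, $t, :id1, select = :v1);
@btime DataFrames.combine(DataFrames.groupby($df, :id1), :v1 => sum);
```

Timings from this smaller input will not match the numbers above, and the gap between the two packages may differ with the size of the data and the type of the key column.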