DataFrames.jl
DataFrames.jl copied to clipboard
Use threading in joins
A separate issue to keep track of this specific issue.
The thing we could do:
- always precompute hashes (even for one col)
- precompute hashes using threading
- then split shorter table into shards matching number of available threads and process each shard separately.
- it should be very cheap to combine the results.