JuliaDB.jl
chunks argument to loadtable not working as expected
using Distributed
addprocs(4)
@everywhere using JuliaDB
loadtable("some.csv", chunks=4)
# distributed table with 1 chunk
I was expecting more than one chunk.
As mentioned on Slack.
I've noticed that the number of chunks is capped at the number of files being ingested. I'm not sure that's true in all cases, but it certainly has been with my code.
I think one of the authors said that later on Slack.
That's correct. Files aren't split into multiple chunks. Each chunk is at least one file.
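Given that, one possible workaround is to split the CSV into several files before loading it, so that there are enough files for the number of chunks you want. The sketch below is only an illustration: `split_csv` is a hypothetical helper, not part of JuliaDB, and the output names (`part_1.csv`, ...) are assumptions. It reads the whole file into memory and splits on line boundaries, so it will not handle quoted fields that contain newlines.

```julia
# Hypothetical helper (not part of JuliaDB): split one CSV file into at most
# `nparts` smaller files, repeating the header line at the top of each part.
function split_csv(path::AbstractString, outdir::AbstractString, nparts::Integer)
    lines = readlines(path)                  # whole file in memory; fine for a sketch
    header, rows = lines[1], lines[2:end]
    mkpath(outdir)
    per_part = max(1, cld(length(rows), nparts))   # data rows per output file
    outputs = String[]
    for (i, part) in enumerate(Iterators.partition(rows, per_part))
        out = joinpath(outdir, "part_$(i).csv")
        open(out, "w") do io
            println(io, header)
            foreach(row -> println(io, row), part)
        end
        push!(outputs, out)
    end
    return outputs                           # paths of the files that were written
end

# Usage (file and directory names are placeholders):
# files = split_csv("some.csv", "some_parts", 4)
# loadtable(files)   # several input files, so more than one chunk becomes possible
```

Because the row count is rounded up per part, the helper can produce fewer than `nparts` files when the rows do not divide evenly.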
Maybe we need to add some rechunking mechanism so that a large file can be split into smaller chunks.
https://github.com/JuliaComputing/JuliaDB.jl/pull/288