JuliaDB.jl

chunks argument to loadtable not working as expected

Open cmcaine opened this issue 6 years ago • 5 comments

using Distributed
addprocs(4)
@everywhere using JuliaDB
loadtable("some.csv", chunks=4)
# distributed table with 1 chunk

I was expecting more than one chunk.

As mentioned on Slack.

cmcaine avatar Feb 26 '19 23:02 cmcaine

I've noticed that the number of chunks is capped at the number of files being ingested. Not sure if that's 100% true in all cases, but it certainly has been with my code.

versipellis avatar Jun 18 '19 15:06 versipellis

I think one of the authors said that later on Slack.

cmcaine avatar Jun 27 '19 12:06 cmcaine

That's correct. Files aren't split into multiple chunks. Each chunk is at least one file.

joshday avatar Jun 27 '19 13:06 joshday
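
Given that constraint, one possible workaround is to split the CSV into several smaller files before loading it, so that there are enough files to form multiple chunks. The sketch below uses only Julia's standard library. `split_csv` is a hypothetical helper, not part of JuliaDB, and it assumes a single header line and no quoted fields containing embedded newlines. It writes at most `nparts` files, and fewer when the row count does not divide that way.

```julia
# Hypothetical helper (not part of JuliaDB): split one CSV file into at most
# `nparts` smaller CSV files, repeating the header line in each part.
# Assumes one header line and no quoted fields with embedded newlines.
function split_csv(path::AbstractString, outdir::AbstractString, nparts::Integer)
    nparts >= 1 || throw(ArgumentError("nparts must be at least 1"))
    lines = readlines(path)          # reads the whole file into memory; fine for a sketch
    isempty(lines) && return String[]
    header, rows = lines[1], lines[2:end]
    mkpath(outdir)
    per_part = max(1, cld(length(rows), nparts))   # rows per output file, rounded up
    parts = String[]
    for (i, block) in enumerate(Iterators.partition(rows, per_part))
        out = joinpath(outdir, "part_$(i).csv")
        open(out, "w") do io
            println(io, header)                    # every part gets the header
            foreach(row -> println(io, row), block)
        end
        push!(parts, out)
    end
    return parts
end

# Possible usage, assuming JuliaDB is loaded on all workers as in the original report:
# files = split_csv("some.csv", "some_parts", 4)
# t = loadtable(files; chunks = 4)
```

With four input files, `chunks = 4` should be able to produce up to four chunks, since each chunk can then hold one file. I have not checked this against every JuliaDB version.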

Maybe we need to add a rechunking mechanism so that a large file can be split into smaller chunks.

xiaodaigh avatar Jul 19 '19 14:07 xiaodaigh

https://github.com/JuliaComputing/JuliaDB.jl/pull/288

jpsamaroo avatar Jul 19 '19 14:07 jpsamaroo