JuliaDB.jl

Docs should reflect changed indexcols and datacols behavior when filenamecol is active

Open MaximilianJHuber opened this issue 6 years ago • 4 comments
If I call loadtable with filenamecol = :source_file, this creates a new first column.

Then indexcols and datacols need to be shifted by one index, which is a bit impractical. So I suggest that the filenamecol be appended as the last column of the resulting table.

MaximilianJHuber · Dec 28 '18 17:12

That seems like a good change to me.

joshday · Dec 28 '18 17:12

First column in an ndsparse is the longest "stride". That's why I thought it made sense...

shashi · Dec 31 '18 05:12

The filenamecol is typically the first index column, but in practice I often try loading one file before loading multiple. Since filenamecol shifts all the column numbers, it messes up the arguments I had previously set for indexcols and datacols.

When you start with this:

loadtable(one_file; indexcols = (1,2), datacols = (3,4,5))

Loading multiple files becomes:

# proposed change
loadtable(many_files; indexcols = (6,1,2), datacols = (3,4,5), filenamecol = :File)

# current behavior
loadtable(many_files; indexcols = (1,2,3), datacols = (4,5,6), filenamecol = :File)

I think I prefer the proposed change, but at worst the current behavior is only a slight inconvenience.
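
With the current behavior, the shifted positions can be computed from the single-file arguments instead of being retyped by hand. This is a minimal sketch, assuming integer column positions and that the filename column lands at position 1 as described above. `shift_cols` is a hypothetical helper, not part of JuliaDB:

```julia
# Hypothetical helper (not a JuliaDB function): shift integer column
# positions by `offset`, leaving any non-integer entries untouched.
shift_cols(cols, offset = 1) = map(c -> c isa Integer ? c + offset : c, cols)

# Arguments that worked when loading a single file.
indexcols = (1, 2)
datacols = (3, 4, 5)

# With filenamecol set, the filename column takes position 1,
# so it goes first in the index and everything else moves up by one.
multi_indexcols = (1, shift_cols(indexcols)...)
multi_datacols = shift_cols(datacols)

# These would then be passed along the lines of:
# loadtable(many_files; indexcols = multi_indexcols,
#           datacols = multi_datacols, filenamecol = :File)
```

Here `multi_indexcols` and `multi_datacols` reproduce the positions from the "current behavior" call above.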

joshday · Jan 02 '19 14:01

The behavior is still unchanged, so I guess the matter is decided. I think the docs should mention this quirk!

MaximilianJHuber · Jun 27 '19 14:06