JuliaDB.jl
Docs should reflect changed indexcols and datacols behavior when filenamecol is active
If I call loadtable with filenamecol = :source_file, it creates a new first column.
The indices in indexcols and datacols then need to be shifted by one, which is a bit impractical. So I suggest that the filenamecol be appended as the last column of the resulting table instead.
That seems like a good change to me.
The first column in an ndsparse has the longest "stride". That's why I thought it made sense...
The filenamecol is typically the first index column, but in practice I often try loading one file before loading multiple. Since filenamecol shifts all the column numbers, it messes up the arguments I had previously set for indexcols and datacols.
When you start with this:
loadtable(one_file; indexcols = (1,2), datacols = (3,4,5))
Loading multiple files becomes:
# proposed change
loadtable(many_files; indexcols = (6,1,2), datacols = (3,4,5), filenamecol = :File)
# current behavior
loadtable(many_files; indexcols = (1,2,3), datacols = (4,5,6), filenamecol = :File)
I think I prefer the proposed change, but at worst the current behavior is only a slight inconvenience.
The behavior is still unchanged, so I guess the matter is decided. I think the docs should mention this quirk!
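Until the docs cover this, one way to avoid rewriting the column numbers by hand is to compute the shifted positions from the single-file ones. This is only a sketch in plain Julia: shift_cols is a hypothetical helper, not part of JuliaDB, and it assumes the current behavior described above (the filename column is inserted at position 1).

```julia
# Hypothetical helper, not part of JuliaDB: shift 1-based column positions
# to account for the column that `filenamecol` prepends.
shift_cols(cols, offset = 1) = map(c -> c + offset, cols)

# Column positions that worked when loading a single file.
single_index = (1, 2)
single_data = (3, 4, 5)

# With `filenamecol` active, the filename column sits at position 1,
# followed by the original columns, each moved one place to the right.
multi_index = (1, shift_cols(single_index)...)
multi_data = shift_cols(single_data)
```

The resulting tuples match the "current behavior" call above and could be passed as indexcols and datacols along with filenamecol = :File.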