CSV.jl
Yet another feature request: Chunks but for groups
Assuming that my CSV file is already sorted by the grouping column, it would be great to be able to iterate through the groups (I'm imagining effectively doing groupby(mycsvfile,"col1")) in order to do aggregation-type operations without loading the entire file first. So, basically a "smart" CSV.Chunks that splits on group boundaries.
Perhaps this would require an initial complete pass over the grouping columns to get the proper row indices?
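To make the request concrete, here is a minimal sketch of the iteration pattern being asked for, written in plain Julia against any row iterator. `foreach_group` and `keyfn` are hypothetical names used for illustration and are not part of CSV.jl. The sketch assumes the input is already sorted by the key, so each group is a run of consecutive rows.

```julia
# Call `f(key, rows)` once per run of consecutive rows that share the same key.
# Only the current group is buffered, so the input must already be sorted
# (or at least clustered) by the key.
function foreach_group(f, rows, keyfn)
    current_key = nothing
    buffer = Any[]
    for row in rows
        k = keyfn(row)
        # A change of key closes the current group.
        if !isempty(buffer) && !isequal(k, current_key)
            f(current_key, buffer)
            buffer = Any[]
        end
        current_key = k
        push!(buffer, row)
    end
    # Flush the final group, if any.
    isempty(buffer) || f(current_key, buffer)
    return nothing
end

# Usage with in-memory rows standing in for a lazily read file.
rows = [(col1 = "a", x = 1), (col1 = "a", x = 2), (col1 = "b", x = 5)]
sums = Pair{String,Int}[]
foreach_group(rows, r -> r.col1) do key, group
    push!(sums, key => sum(r.x for r in group))
end
```

In principle the same function could be fed a lazy row iterator such as `CSV.Rows(path)` in place of the vector, as long as rows are not reused between iterations. That pairing is an assumption here and is untested.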
Certainly interesting to think about; it's related to CSV.Chunks, but would require a quite different implementation. I agree that it would likely require a pass to find the right byte positions for each "grouping".
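As a rough illustration of what such an indexing pass might look like, the sketch below scans an `IO` once and records the byte offsets at which each group starts and ends. It is not how CSV.jl works internally. `group_byte_ranges` is a hypothetical helper, and it assumes a header line, a comma delimiter, the grouping column in first position, and no quoted fields or embedded newlines.

```julia
# Scan the stream once and return `key => (start, stop)` pairs, where the
# group's bytes are the half-open, zero-based offset range [start, stop).
# Assumes: one header line, ',' delimiter, key in the first column,
# no quoted fields or embedded newlines.
function group_byte_ranges(io::IO)
    readline(io)                      # skip the header line
    ranges = Pair{String,Tuple{Int,Int}}[]
    start = position(io)              # offset where the current group begins
    key = nothing
    while !eof(io)
        pos = position(io)            # offset of the line about to be read
        line = readline(io)
        k = String(first(split(line, ',')))
        if key !== nothing && k != key
            push!(ranges, key => (start, pos))
            start = pos
        end
        key = k
    end
    # Close the final group at end of stream.
    key === nothing || push!(ranges, key => (start, position(io)))
    return ranges
end

io = IOBuffer("col1,x\na,1\na,2\nb,5\n")
ranges = group_byte_ranges(io)

# Read back only the bytes of the first group.
(start, stop) = last(ranges[1])
seek(io, start)
first_group = String(read(io, stop - start))
```

With these offsets in hand, each group's bytes (with the header prepended) could then be parsed on their own, which is roughly the "smart chunk" behaviour discussed above. A real implementation would need proper handling of quoting and multi-column keys.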