
Yet another feature request: Chunks but for groups

Open • tbeason opened this issue 3 years ago • 1 comment

Assuming that my CSV file is properly sorted, it would be great to be able to iterate through the groups (I'm imagining effectively doing groupby(mycsvfile,"col1")) in order to do aggregation-type operations without loading the entire file first. So, basically a "smart" CSV.Chunks that splits on group boundaries.

Perhaps this would require an initial complete pass over the grouping columns to get the proper row indices?
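To make the request concrete, here is a minimal sketch of the behavior in Python (standard library only), not CSV.jl code. It assumes the file is already sorted on the grouping column, in which case group boundaries can be detected while streaming and no separate pre-pass is needed. The name `iter_groups` and the sample data are hypothetical, made up for illustration.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter


def iter_groups(lines, key):
    """Yield (key_value, rows) for each run of consecutive rows sharing `key`.

    `lines` is any iterable of text lines (e.g. an open file).
    Assumes the input is sorted (or at least clustered) on `key`.
    """
    reader = csv.DictReader(lines)
    for value, rows in groupby(reader, key=itemgetter(key)):
        # Only the rows of the current group are materialized.
        yield value, list(rows)


# Hypothetical sample data standing in for a large sorted file.
sample = io.StringIO("col1,x\na,1\na,2\nb,3\n")

# Aggregate per group without holding the whole file in memory.
sums = {k: sum(int(r["x"]) for r in rows) for k, rows in iter_groups(sample, "col1")}
```

If the file is not actually sorted, the same key will simply show up as several separate groups, so the sortedness assumption matters.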

tbeason, Jun 07 '21 06:06

Certainly interesting to think about; it's related to CSV.Chunks, but would require quite a different implementation. I agree that it would probably require a pass to find the right byte positions for each "grouping".
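The byte-position pass could look roughly like the sketch below, again in plain Python rather than CSV.jl internals. It scans the file once, records the start and end byte offset of each consecutive run of equal keys, and later seeks straight to a group. The function name `group_byte_ranges` is hypothetical, and the sketch assumes UTF-8 input, a single header line, and no quoted fields containing embedded newlines.

```python
import csv
import io


def group_byte_ranges(f, key_index=0):
    """Return [(key, start, end)] byte ranges for consecutive runs of equal keys.

    `f` is a seekable file object opened in binary mode, positioned at the start.
    Assumption: every record fits on one physical line.
    """
    f.readline()  # skip the header line
    ranges = []
    current, start = None, f.tell()
    while True:
        pos = f.tell()
        line = f.readline()
        if not line:
            break  # end of file
        if not line.strip():
            continue  # ignore blank lines
        # Parse just this line to extract the grouping column.
        key = next(csv.reader([line.decode("utf-8")]))[key_index]
        if key != current:
            if current is not None:
                ranges.append((current, start, pos))
            current, start = key, pos
    if current is not None:
        ranges.append((current, start, f.tell()))
    return ranges


# Hypothetical sample data standing in for a file on disk.
buf = io.BytesIO(b"col1,x\na,1\na,2\nb,3\n")
ranges = group_byte_ranges(buf)

# Jump directly to the second group using its recorded byte range.
key, start, end = ranges[1]
buf.seek(start)
chunk = buf.read(end - start)
```

An index like this costs one full scan, but afterwards each group can be read, or handed to a separate worker, independently.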

quinnj, Jun 17 '21 04:06