orc
What is the benefit of iterating over the stripes?
I'm looking at the example in the README, and the loop looks like this:
...
// Iterate over each stripe in the file.
for c.Stripes() {
	// Iterate over each row in the stripe.
	for c.Next() {
		// Retrieve a slice of interface values for the current row.
		log.Println(c.Row())
	}
}
...
Why does the user have to iterate over the stripes? It feels very low-level. An alternative would be to simplify the API to:
...
// Iterate over each row in the file.
for c.Next() {
	// Retrieve a slice of interface values for the current row.
	log.Println(c.Row())
}
...
...and have the cursor handle the stripes internally. Thoughts?
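To make the proposal concrete, here is a rough sketch of a wrapper that hides the stripe loop behind a single Next call. The stripeCursor interface only mirrors the three methods used in the README loop; flatCursor and memCursor are hypothetical names made up for this illustration and are not part of the library.

```go
package main

import "fmt"

// stripeCursor mirrors the three methods used in the README loop.
// It is a stand-in interface for this sketch, not the library's actual type.
type stripeCursor interface {
	Stripes() bool
	Next() bool
	Row() []interface{}
}

// flatCursor (hypothetical) hides stripe boundaries behind a single Next method.
type flatCursor struct {
	c       stripeCursor
	started bool
}

// Next advances to the next row, moving on to the next stripe
// whenever the current one is exhausted.
func (f *flatCursor) Next() bool {
	if !f.started {
		f.started = true
		if !f.c.Stripes() {
			return false
		}
	}
	for {
		if f.c.Next() {
			return true
		}
		if !f.c.Stripes() {
			return false
		}
	}
}

// Row returns the current row of the underlying cursor.
func (f *flatCursor) Row() []interface{} { return f.c.Row() }

// memCursor is a toy in-memory cursor used only to exercise flatCursor.
type memCursor struct {
	stripes [][][]interface{}
	s, r    int
}

func newMemCursor(stripes [][][]interface{}) *memCursor {
	return &memCursor{stripes: stripes, s: -1, r: -1}
}

// Stripes moves to the next stripe and resets the row position.
func (m *memCursor) Stripes() bool {
	m.s++
	m.r = -1
	return m.s < len(m.stripes)
}

// Next moves to the next row within the current stripe.
func (m *memCursor) Next() bool {
	if m.s < 0 || m.s >= len(m.stripes) {
		return false
	}
	m.r++
	return m.r < len(m.stripes[m.s])
}

func (m *memCursor) Row() []interface{} { return m.stripes[m.s][m.r] }

func main() {
	// Two stripes of rows; the caller never sees the boundary.
	f := &flatCursor{c: newMemCursor([][][]interface{}{
		{{"a", 1}, {"b", 2}},
		{{"c", 3}},
	})}
	for f.Next() {
		fmt.Println(f.Row())
	}
}
```

With something like this, the nested loop could remain available for callers who need stripe-level control, while the flat form covers the common case.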
That could be a nice simplification. My original thinking was to make it easy to skip stripes based on their column statistics, but that could be achieved in other ways.
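As an illustration of the stripe-skipping idea, a minimal sketch follows. It assumes each stripe carries a min/max summary for one integer column; columnStats, mayContain, and stripesToRead are hypothetical names for this example and do not reflect the library's API.

```go
package main

import "fmt"

// columnStats is a hypothetical per-stripe summary for one integer column.
type columnStats struct {
	Min, Max int64
}

// mayContain reports whether a stripe with stats s could hold
// a value in the inclusive range [lo, hi].
func mayContain(s columnStats, lo, hi int64) bool {
	return s.Max >= lo && s.Min <= hi
}

// stripesToRead returns the indexes of stripes that cannot be ruled out
// by their statistics; every other stripe can be skipped without reading it.
func stripesToRead(stats []columnStats, lo, hi int64) []int {
	var keep []int
	for i, s := range stats {
		if mayContain(s, lo, hi) {
			keep = append(keep, i)
		}
	}
	return keep
}

func main() {
	stats := []columnStats{{Min: 0, Max: 9}, {Min: 10, Max: 19}, {Min: 20, Max: 29}}
	// Only stripes whose range overlaps [12, 15] need to be read.
	fmt.Println(stripesToRead(stats, 12, 15))
}
```

A check like this could run inside the cursor as well, for example driven by a predicate the caller supplies, so it does not by itself require exposing the stripe loop.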
Another point, I think: if stripes stay exposed like this, we can read them in parallel, which would make reading faster.
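A rough sketch of the parallel idea, assuming each stripe can be read independently of the others. processStripes and the callback it takes are hypothetical names for this illustration; the callback stands in for whatever would actually decode a stripe.

```go
package main

import (
	"fmt"
	"sync"
)

// processStripes runs fn once per stripe index, each in its own goroutine,
// and returns the results in stripe order.
func processStripes(n int, fn func(stripe int) int) []int {
	out := make([]int, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// Each goroutine writes only to its own slot, so no lock is needed.
			out[i] = fn(i)
		}(i)
	}
	wg.Wait()
	return out
}

func main() {
	// Pretend row counts per stripe; a real callback would read the stripe.
	rowCounts := []int{100, 250, 75}
	counts := processStripes(len(rowCounts), func(stripe int) int {
		return rowCounts[stripe]
	})
	total := 0
	for _, c := range counts {
		total += c
	}
	fmt.Println(total)
}
```

For files with many stripes, a bounded worker pool would probably be a better fit than one goroutine per stripe.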