
What is the benefit of iterating over the stripes?

Open JensRantil opened this issue 4 years ago • 2 comments

I'm looking at the example in the README, and the loop looks like this:

...

// Iterate over each stripe in the file.
for c.Stripes() {

    // Iterate over each row in the stripe.
    for c.Next() {

        // Retrieve a slice of interface values for the current row.
        log.Println(c.Row())

    }

}

...

What is the reason for the user having to iterate over the stripes? It feels very low-level. An alternative would be to simplify the API to

...

// Iterate over each row in the file.
for c.Next() {

    // Retrieve a slice of interface values for the current row.
    log.Println(c.Row())

}

...

...and have the cursor handle the stripes internally. Thoughts?
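
To make the proposal concrete, here is a minimal sketch of a wrapper that advances through stripes internally. The `stripeCursor` interface, `flatCursor`, and the in-memory `memCursor` are hypothetical names for illustration only; they assume nothing more than the `Stripes()`, `Next()`, and `Row()` calls from the README example and are not part of the library.

```go
package main

import "fmt"

// stripeCursor is a hypothetical stand-in for the two-level API in the
// README example: Stripes() moves to the next stripe, Next() to the next
// row within the current stripe.
type stripeCursor interface {
	Stripes() bool
	Next() bool
	Row() []interface{}
}

// flatCursor hides the stripe loop and exposes only Next and Row.
type flatCursor struct {
	inner    stripeCursor
	inStripe bool
}

// Next advances to the next row, moving on to the following stripe
// (and skipping empty stripes) when the current one is exhausted.
func (f *flatCursor) Next() bool {
	for {
		if f.inStripe && f.inner.Next() {
			return true
		}
		if !f.inner.Stripes() {
			f.inStripe = false
			return false
		}
		f.inStripe = true
	}
}

func (f *flatCursor) Row() []interface{} { return f.inner.Row() }

// memCursor is an in-memory stripeCursor used only for this demo.
type memCursor struct {
	stripes [][][]interface{}
	s, r    int
}

func newMemCursor(stripes [][][]interface{}) *memCursor {
	return &memCursor{stripes: stripes, s: -1, r: -1}
}

func (m *memCursor) Stripes() bool {
	m.s++
	m.r = -1
	return m.s < len(m.stripes)
}

func (m *memCursor) Next() bool {
	if m.s < 0 || m.s >= len(m.stripes) {
		return false
	}
	m.r++
	return m.r < len(m.stripes[m.s])
}

func (m *memCursor) Row() []interface{} { return m.stripes[m.s][m.r] }

// drain collects every row the flat cursor yields, in order.
func drain(f *flatCursor) [][]interface{} {
	var rows [][]interface{}
	for f.Next() {
		rows = append(rows, f.Row())
	}
	return rows
}

func main() {
	// Three stripes, the middle one empty.
	c := &flatCursor{inner: newMemCursor([][][]interface{}{{{1}, {2}}, {}, {{3}}})}
	for _, row := range drain(c) {
		fmt.Println(row)
	}
}
```

A wrapper like this keeps the stripe-level methods available for callers who want them, while the common case becomes a single loop.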

JensRantil avatar Oct 27 '20 15:10 JensRantil

That could be a nice simplification. My original thinking was to facilitate skipping stripes based on column statistics, but that could be achieved in other ways.
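
As a rough illustration of the stripe-skipping idea, the sketch below picks which stripes to read from per-stripe min/max values for one column. The `stripeStats` type and both functions are hypothetical and do not reflect how the library exposes statistics.

```go
package main

import "fmt"

// stripeStats is a hypothetical min/max summary of one integer column
// within a single stripe.
type stripeStats struct{ Min, Max int64 }

// mayContain reports whether the stripe could hold a value in [lo, hi].
func mayContain(s stripeStats, lo, hi int64) bool {
	return s.Max >= lo && s.Min <= hi
}

// stripesToRead returns the indexes of stripes whose value range overlaps
// [lo, hi]; every other stripe can be skipped without reading its rows.
func stripesToRead(stats []stripeStats, lo, hi int64) []int {
	var keep []int
	for i, s := range stats {
		if mayContain(s, lo, hi) {
			keep = append(keep, i)
		}
	}
	return keep
}

func main() {
	stats := []stripeStats{{0, 99}, {100, 199}, {200, 299}}
	fmt.Println(stripesToRead(stats, 150, 210))
}
```

A filter of this kind could run inside the cursor, which is one way the skipping could survive a single-loop API.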

scritchley avatar Nov 24 '20 12:11 scritchley

Another point, I think, is that with stripe-level iteration we can read stripes in parallel, which would make reading faster.
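
A minimal sketch of what parallel stripe reads might look like, assuming some function can return all rows of a stripe given its index. The `stripeReader` type and `readAllParallel` are hypothetical; the library may not offer per-stripe access in this form.

```go
package main

import (
	"fmt"
	"sync"
)

// stripeReader is a hypothetical function that returns all rows of the
// stripe at index i.
type stripeReader func(i int) ([][]interface{}, error)

// readAllParallel reads n stripes concurrently, one goroutine per stripe,
// and returns the rows in stripe order.
func readAllParallel(n int, read stripeReader) ([][]interface{}, error) {
	perStripe := make([][][]interface{}, n)
	errs := make([]error, n)

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// Each goroutine writes only to its own slot, so no lock is needed.
			perStripe[i], errs[i] = read(i)
		}(i)
	}
	wg.Wait()

	var rows [][]interface{}
	for i := 0; i < n; i++ {
		if errs[i] != nil {
			return nil, errs[i]
		}
		rows = append(rows, perStripe[i]...)
	}
	return rows, nil
}

func main() {
	// Fake reader: stripe i holds a single row containing i.
	read := func(i int) ([][]interface{}, error) {
		return [][]interface{}{{i}}, nil
	}
	rows, err := readAllParallel(4, read)
	fmt.Println(rows, err)
}
```

Whether this is actually faster depends on the underlying reader supporting concurrent access and on where the time goes (I/O versus decoding).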

Zhile avatar Oct 14 '21 02:10 Zhile