parquet-go

Handle io.EOF errors returned by ReadAt when opening files

Open metalmatze opened this issue 2 years ago • 4 comments

Hey,

we're opening parquet files with an object storage client and not directly via os.File. This currently fails.

The underlying io.ReaderAt interface allows an implementation to return io.EOF. Currently, opening a file fails for us: when reading the footer, the object storage client reads up to the end of the file and returns io.EOF, even though the data was written to the buffer just fine.

Now, we can fix this in our FrostDB code. However, we think that, generally speaking, FrostDB's object storage client is compliant with the ReaderAt interface.

What is this project's stance on handling the io.EOF here?

metalmatze, Feb 16 '23 13:02

This makes sense to me for reading the footer where you would likely get valid data and an EOF, but do we need this for the column and offset indexes?

joe-elliott, Mar 01 '23 16:03

Maybe we can use a helper function to handle unexpected conditions?

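// readFullAt reads exactly len(b) bytes from r at offset off.
// Any error is dropped when the buffer was filled completely;
// a short read is always reported as an error.
func readFullAt(r io.ReaderAt, b []byte, off int64) (int, error) {
  n, err := r.ReadAt(b, off)
  if n == len(b) {
    err = nil
  } else {
    switch err {
    case nil:
      err = io.ErrNoProgress
    case io.EOF:
      err = io.ErrUnexpectedEOF
    }
  }
  return n, err
}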

  • we don't make assumptions about the underlying io.ReaderAt implementation
  • we silence errors if we read all the bytes we were looking for, as they don't impact correctness

achille-roussel, Mar 01 '23 16:03

I'd be happy with such a helper function.

metalmatze, Mar 02 '23 14:03

Apologies for making more work for you, but we've decided to move development on this project to a new organization at https://github.com/parquet-go/parquet-go to ensure its long-term success. We appreciate your contribution and would appreciate it if you could reopen this PR there if it is still relevant.

kevinburkesegment, Jul 12 '23 18:07