kafka-connect-fs

Fix batching on ParquetFileReader

Open jakedorne opened this issue 2 years ago • 0 comments

fixes #100

Currently, the Parquet file batcher calls hasNext while seeking the file, and hasNext itself checks whether seeked == true. As a result, the file reader reads the second batch over and over and never completes. Using the existing hasNextRecord instead fixes this, and I assume it was what was originally intended here.
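To illustrate the failure mode, here is a minimal toy model, not the actual kafka-connect-fs code. The class `ToyBatchReader`, its fields, and both seek methods are hypothetical, and the exact batch-boundary condition inside `hasNext` is an assumption based on the description above. It only shows why a seek loop driven by a batch-aware `hasNext` can stop at a batch boundary, while one driven by `hasNextRecord` reaches the requested offset.

```java
// Toy model only: names and logic are illustrative assumptions,
// not the real kafka-connect-fs implementation.
class ToyBatchReader {
    private final int totalRecords;
    private final int batchSize;
    private int offset = 0;
    private boolean seeked = false;

    ToyBatchReader(int totalRecords, int batchSize) {
        this.totalRecords = totalRecords;
        this.batchSize = batchSize;
    }

    int offset() {
        return offset;
    }

    // True while the underlying file still has records left.
    boolean hasNextRecord() {
        return offset < totalRecords;
    }

    // Batch-aware check (assumed shape): reports "no more" at a batch
    // boundary unless a seek has just happened, then clears the flag.
    boolean hasNext() {
        boolean atBoundary = batchSize > 0 && offset > 0 && offset % batchSize == 0;
        boolean result = (!atBoundary || seeked) && hasNextRecord();
        seeked = false;
        return result;
    }

    // Problematic variant: the loop ends at the first batch boundary,
    // so the reader never gets past it.
    void seekUsingHasNext(int target) {
        while (hasNext() && offset < target) {
            offset++; // stands in for reading and discarding one record
        }
        seeked = true;
    }

    // Fixed variant: only the end of the file can stop the seek early.
    void seekUsingHasNextRecord(int target) {
        while (hasNextRecord() && offset < target) {
            offset++;
        }
        seeked = true;
    }

    public static void main(String[] args) {
        ToyBatchReader broken = new ToyBatchReader(6, 2);
        broken.seekUsingHasNext(4);
        System.out.println("seek via hasNext lands at offset " + broken.offset());

        ToyBatchReader fixed = new ToyBatchReader(6, 2);
        fixed.seekUsingHasNextRecord(4);
        System.out.println("seek via hasNextRecord lands at offset " + fixed.offset());
    }
}
```

In this toy, with six records and a batch size of two, asking to resume at offset 4 leaves the first reader at offset 2, so the next poll would return the second batch again. The second reader reaches offset 4 as requested.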

This PR doesn't contain tests, sorry. To reproduce this in tests I had to replace the mocking with stubs, which broke other tests, and fixing those would be a bigger change than I think this fix warrants. Here is a commit showing what I did to reproduce.

Thanks!

jakedorne, Nov 06 '22 22:11