CSV Parallel Reading Validation

Open pdet opened this issue 4 months ago • 0 comments

This PR reintroduces CSV validation and extends the newline-finder algorithm to handle cases where we previously could not read a CSV file correctly in parallel.

The validation checks that every thread that began reading the CSV file at an arbitrary position inside a buffer actually started at the correct place, i.e. at the beginning of a row. This ensures that no data is duplicated and no data is lost.
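To make the idea concrete, here is a minimal sketch of such a check. It is not DuckDB's implementation: `ChunkResult`, `guess_row_start`, `read_row_end`, `scan_chunk`, `validate`, and `check` are hypothetical names, and the row-start guess is deliberately naive (it jumps to the byte after the next newline). Each worker records where its first row started and where its last row ended; the file was read correctly only if those ranges tile the file with no gap and no overlap.

```python
from dataclasses import dataclass


@dataclass
class ChunkResult:
    first_row_start: int  # byte offset where this worker's first row begins
    last_row_end: int     # byte offset just past the end of its last row


def guess_row_start(data: bytes, offset: int) -> int:
    """Naive guess (an assumption of this sketch): a row starts after a newline."""
    if offset == 0:
        return 0
    # Look back one byte so an offset already on a row boundary is kept.
    nl = data.find(b"\n", offset - 1)
    return len(data) if nl == -1 else nl + 1


def read_row_end(data: bytes, pos: int) -> int:
    """Return the offset just past the row at `pos`, honoring double quotes."""
    in_quotes = False
    i = pos
    while i < len(data):
        c = data[i:i + 1]
        if c == b'"':
            in_quotes = not in_quotes
        elif c == b"\n" and not in_quotes:
            return i + 1
        i += 1
    return len(data)


def scan_chunk(data: bytes, start: int, end: int) -> ChunkResult:
    """Read every row whose (guessed) first byte lies in [start, end)."""
    first = guess_row_start(data, start)
    pos = first
    while pos < end and pos < len(data):
        pos = read_row_end(data, pos)
    return ChunkResult(first, pos)


def validate(results: list[ChunkResult], file_size: int) -> bool:
    """Each worker must start exactly where the previous one stopped."""
    expected = 0
    for r in results:
        if r.first_row_start != expected:
            return False  # gap (data loss) or overlap (data duplication)
        expected = r.last_row_end
    return expected == file_size


def check(data: bytes, boundaries: list[int]) -> bool:
    """Split `data` at `boundaries`, scan each piece, and validate the result."""
    edges = [0, *boundaries, len(data)]
    results = [scan_chunk(data, s, e) for s, e in zip(edges, edges[1:])]
    return validate(results, len(data))
```

With the input `b'a,b\n1,"x\ny"\n2,z\n'`, `check(data, [12])` passes because the split lands on a row boundary, while `check(data, [8])` fails: the second worker starts inside the quoted field, so its guessed row start (offset 9) does not match where the first worker stopped (offset 12). A failed check is the signal that the start position has to be found some other way, or that the file needs a fallback read.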

pdet, Oct 18 '24 12:10