
Add Dialect Detection as the First Step of CSV Sniffing

Open SterlingT3485 opened this issue 4 months ago • 9 comments

Description

Implemented dialect detection as the first step of CSV sniffing.

Introduced a new dialectDetectionDriver that inherits from serialParsingDriver.

Added two new member functions to SerialCSVReader: resetReaderState and detectDialect.

detectDialect uses parseCSV to try every possible combination of dialect options on the CSV, then picks the most suitable dialect based on the number of valid rows each combination produces.
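To illustrate the idea for reviewers, here is a minimal standalone sketch of the "try every combination and count valid rows" approach. It is not the code in this PR: Dialect, countFields, scoreDialect and detectDialectSketch are hypothetical names, the candidate delimiters and quote characters are assumed, and "valid" is simplified to mean "has the same number of fields as the first row".

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical dialect description for this sketch; not Kuzu's actual struct.
struct Dialect {
    char delimiter;
    char quote;
};

// Count the fields on one line, treating delimiters inside quotes as data.
static std::size_t countFields(const std::string& line, const Dialect& d) {
    std::size_t fields = 1;
    bool inQuotes = false;
    for (char c : line) {
        if (c == d.quote) {
            inQuotes = !inQuotes;
        } else if (c == d.delimiter && !inQuotes) {
            ++fields;
        }
    }
    return fields;
}

// Score a dialect: the number of non-empty rows whose field count matches
// the first row. A dialect that yields a single column scores 0.
static std::size_t scoreDialect(const std::string& sample, const Dialect& d) {
    std::istringstream in(sample);
    std::string line;
    std::size_t expected = 0;
    std::size_t valid = 0;
    while (std::getline(in, line)) {
        if (line.empty()) {
            continue;
        }
        std::size_t n = countFields(line, d);
        if (expected == 0) {
            expected = n; // the first row fixes the expected column count
        }
        if (n == expected) {
            ++valid;
        }
    }
    return expected > 1 ? valid : 0;
}

// Try every delimiter/quote combination (assumed candidate sets) and keep
// the first one with the highest score. Falls back to ',' and '"'.
Dialect detectDialectSketch(const std::string& sample) {
    const std::vector<char> delimiters = {',', ';', '\t', '|'};
    const std::vector<char> quotes = {'"', '\''};
    Dialect best{',', '"'};
    std::size_t bestScore = 0;
    for (char delim : delimiters) {
        for (char quote : quotes) {
            Dialect candidate{delim, quote};
            std::size_t score = scoreDialect(sample, candidate);
            if (score > bestScore) {
                bestScore = score;
                best = candidate;
            }
        }
    }
    return best;
}
```

In this sketch ties go to the first candidate tried, so a file with no quote characters at all keeps the default double quote. The real implementation runs each candidate through parseCSV, so it can apply stricter validity checks than a per-line field count.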

Made minor changes to parseCSV to handle errors encountered while parsing the CSV.

Made a minor change to sniffCSVTypeandNameDriver:addValue so that a trailing delimiter can be skipped.

Fixes #1448

Contributor agreement

SterlingT3485, Sep 26 '24 21:09