Gertjan van den Burg comments

Results: 48 comments of Gertjan van den Burg

Header Detection Improvement

Thanks for opening an issue on this and creating a PR @ben-bitdotio! The header detection code could definitely be improved, but I've been waiting until I have a dataset to...

Handling text fragments in the first few rows of a CSV file

Hi @RahulSinghYYC, thanks for your question. This depends a bit on whether you're reading the file as a list of lists or as a dataframe. If you're using the ``read_table``...

Handling text fragments in the first few rows of a CSV file

Hi @RahulSinghYYC, CleverCSV doesn't currently have support for detecting the table area automatically. I know there is some research on this problem (see, e.g. [hypoparsr](https://github.com/tdoehmen/hypoparsr) and [Pytheas](https://github.com/cchristodoulaki/Pytheas/)), but there are...

Handling text fragments in the first few rows of a CSV file

Thanks for offering a suggestion @lcnittl, very nice of you to help! :+1: Just to offer another work-around: one of the main approaches that CleverCSV takes in detecting the dialect...

clevercsv sniffer slows to a crawl on large-ish files (e.g. FEC data)

Hi @jlumbroso, thanks for the detailed bug report! You're describing an issue that I've been thinking about for a while, but never had the time to seriously investigate, so I'm...

clevercsv sniffer slows to a crawl on large-ish files (e.g. FEC data)

Hi @jlumbroso, this took a bit longer than expected, but I've now added a comparison study to the repo (see [here](https://github.com/alan-turing-institute/CleverCSV/tree/comparison/comparison)). This experiment evaluates the accuracy and runtime of dialect...

clevercsv sniffer slows to a crawl on large-ish files (e.g. FEC data)

clevercsv sniffer slows to a crawl on large-ish files (e.g. FEC data)

delimiter detection error

delimiter detection error