
suggestion for FPAT section: csvquote

Open dbro opened this issue 3 years ago • 1 comments

Hi Sundeep, thanks, this is a great resource!

In the section about Field Separators (FPAT) there is a warning note saying that FPAT will not work for csv files. That's true, and it is why csvquote exists. In contrast to xsv (which you mention) and other csv-processing suites such as miller and csvtools, the goal of csvquote is to provide just enough csv awareness to allow any text-processing tool to work effectively with problematic csv data. There are still two caveats to be aware of, although they are uncommon: the data is assumed not to already contain two specific nonprinting ASCII characters (0x1e and 0x1f), and the awk script should not do any matching based on the embedded commas and newlines.

It also works for TSV and other data that follow RFC 4180.
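
To illustrate the idea, here is a minimal Python sketch of the substitution approach. It is not csvquote's actual code. The function names `sanitize` and `restore` are made up for this example, and which of the two nonprinting characters stands in for commas versus newlines is an assumption. The sketch also shows the first caveat: if the input already contained 0x1e or 0x1f, `restore` would wrongly turn them into commas or newlines.

```python
# Illustrative sketch only; not csvquote's implementation.
# Assumed mapping: 0x1f replaces a quoted comma, 0x1e a quoted newline.
FIELD_SUB = "\x1f"
RECORD_SUB = "\x1e"

def sanitize(text, delim=",", quote='"'):
    """Replace delimiters and newlines that occur inside quoted fields."""
    out = []
    in_quotes = False
    for ch in text:
        if ch == quote:
            # An escaped quote ("") toggles twice, so the state stays correct.
            in_quotes = not in_quotes
            out.append(ch)
        elif in_quotes and ch == delim:
            out.append(FIELD_SUB)
        elif in_quotes and ch == "\n":
            out.append(RECORD_SUB)
        else:
            out.append(ch)
    return "".join(out)

def restore(text, delim=","):
    """Put the original delimiters and newlines back."""
    return text.replace(FIELD_SUB, delim).replace(RECORD_SUB, "\n")

data = 'id,name,note\n1,"Doe, Jane","line one\nline two"\n'
safe = sanitize(data)

# After sanitizing, naive splitting on newline and comma is safe.
# split("\n") is used because str.splitlines() also breaks on 0x1e.
names = [restore(line.split(",")[1]) for line in safe.split("\n") if line]
```

Here `names` holds the second column with the embedded comma intact, and `restore(safe)` gives back the original text. A plain-text tool such as awk with a comma field separator would sit in the middle of such a pipeline in place of the naive split.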

dbro avatar Jan 31 '22 11:01 dbro

Thanks, I'll mention it in the next update.

I do mention this SO thread (https://stackoverflow.com/questions/45420535/whats-the-most-robust-way-to-efficiently-parse-csv-using-awk) as well as an alternate option, but yours is designed to work with other tools as well, so it'll be good to mention it.

learnbyexample avatar Jan 31 '22 13:01 learnbyexample

Added in version 2.0

learnbyexample avatar Aug 22 '23 03:08 learnbyexample