csv-nix-tools
Error when a column name contains a colon (:)
I am trying to use csv-sqlite to perform some cleanup on a CSV file, and I'm having trouble loading a CSV with a colon in a column name:
# no issue here
echo "ID,Title,test" > a.csv
cat a.csv | csv-sqlite "select * from input" -s
# error here
echo "ID,Title,test:" > a.csv
cat a.csv | csv-sqlite "select * from input" -s
# error here
echo "ID,Title,\"test:\"" > a.csv
cat a.csv | csv-sqlite "select * from input" -s
It prints the message:
unsupported type '"' EOF while reading header
It then exits with exit code 2. I tried this with the current trunk (a39e4e6652213d769c58a105302bba58f1e509ea).
Right, that's sort of by design: csv-nix-tools assumes an extension to the CSV format in which a colon in the column name acts as a type delimiter. If there's no type after the colon, it doesn't know what to do. This extension is crucial for many things to work (like sorting numerically or filtering numerical columns with, e.g., the greater-than operator), so I'm not sure what should be done about it.
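To make the failure mode concrete, here is a minimal sketch of how a `name:type` header field could be split. It is not the csv-nix-tools implementation: `split_field` is a hypothetical helper, splitting at the last colon is an assumption, and `int` is used only as an example type name.

```shell
# Hypothetical helper, not csv-nix-tools code: split one header field
# into a name and a type at its last colon (an assumption).
split_field() {
    field=$1
    case $field in
        *:*) printf 'name=%s type=%s\n' "${field%:*}" "${field##*:}" ;;
        *)   printf 'name=%s type=\n' "$field" ;;   # no colon, so no type given
    esac
}

split_field "ID:int"   # name=ID type=int
split_field "test:"    # name=test type=
```

With `test:` the part after the colon is empty, which is the situation the tool rejects as an unsupported type.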
One way would be to add a command-line switch that disables type parsing, but that doesn't really solve the real issue. Ignoring type parsing problems (here, the type is just an empty string) could be an option, but I don't like it when tools fail silently... Maybe a better error message could be a solution? Or some combination of all 3 options?
Sorry for taking so much time to reply. I'm in the middle of a sort of hurricane right now ;).
Hi @mslusarz, thanks for your answer; I hope you're fine. I had never heard of that CSV extension until now. I looked around and only found this, but it seems to add the types in the cells as well. What I would do (of course, this is only a suggestion) is:
- Validate the csv headers to see if it is using colons as data type specifiers, then
- If format is OK, proceed as you are currently doing
- If format is non-compliant, issue a warning and fall back to the non-extended format (having a flag to force non-extended would be cool)
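The steps above can be sketched roughly as follows. This is only an illustration of the proposed logic, not code from csv-nix-tools: `header_mode`, `is_known_type`, the `force-untyped` argument, and the accepted type names (`int`, `string`) are all assumptions, and quoted header fields are not handled.

```shell
# Hypothetical sketch of the proposed validation; not csv-nix-tools code.

# Assumed set of type names, for illustration only.
is_known_type() {
    case $1 in
        int|string) return 0 ;;
        *)          return 1 ;;
    esac
}

# Prints "typed" or "untyped" for a header line.
# A second argument "force-untyped" (hypothetical flag) skips validation.
# The body runs in a subshell so the IFS and globbing changes do not leak.
header_mode() (
    header=$1
    if [ "${2:-}" = "force-untyped" ]; then
        echo untyped
        return 0
    fi
    IFS=,
    set -f            # disable globbing while splitting
    set -- $header    # split on commas (no quoted-field handling)
    for field in "$@"; do
        case $field in
            *:*)
                if ! is_known_type "${field##*:}"; then
                    echo "warning: column '$field' has no valid type, falling back to untyped header" >&2
                    echo untyped
                    return 0
                fi
                ;;
        esac
    done
    echo typed
)

header_mode "ID:int,Title:string"   # prints: typed
header_mode "ID,Title,test:"        # warns on stderr, prints: untyped
```

In the fallback case a column such as `test:` would keep its name verbatim, colon included, and the warning on stderr keeps the fallback from being silent.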
I implemented it like you proposed in #12. I still have to write more tests for that, so it's only a PR for now.