xsv
Is there an option to write without CSV escapes?
A question: Is there an option to perform output without the CSV escape syntax? This would be to generate a more strict TSV format, without escapes.
I don't see this, and the documentation is pretty good. I'd just like to make sure I haven't missed something. There are a number of options to the fmt command that provide control over the escaping used, but I didn't see one turning it off.
Some examples:
$ # fmt -t will change the delimiter and drop surrounding quotes (without --quote-always)
$ echo '"abc","def"' | xsv fmt -t $'\t'
abc def
$ # Escapes are generated if a field contains a quote
$ echo '"abc","d""ef"' | xsv fmt -t $'\t'
abc "d""ef"
$ # In tsv the result would be:
$ # abc d"ef
$ # Similarly with embedded field and record separators (tab/newline).
$ # In TSV they are disallowed, and might be replaced by a space when encountered.
$ echo $'"abc","d\tef"' | xsv fmt -t $'\t'
abc "d ef"
$ # In the above, the embedded tab character was retained.
Again, I'm only asking if there is an option I haven't found. In the examples above the fmt command is doing exactly what it says, which is to change the CSV delimiter character.
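For illustration only, the strict TSV output described in the examples above could be approximated outside xsv. The following is a minimal sketch, not an xsv feature: `csv_to_strict_tsv` is a hypothetical helper name, it assumes comma-delimited input with `"` as the quote character and `""` as the escaped quote, and it picks the "replace with a space" behavior for embedded tabs and newlines. It does not handle CRLF line endings specially.

```shell
# Hypothetical helper (not part of xsv): read CSV on stdin, write strict TSV
# on stdout. Quotes are dropped, "" becomes a literal quote, and tabs or
# newlines inside a field are replaced by a single space.
csv_to_strict_tsv() {
    awk '
    BEGIN { inq = 0; out = "" }
    {
        line = $0
        n = length(line)
        i = 1
        while (i <= n) {
            c = substr(line, i, 1)
            if (inq) {
                if (c == "\"") {
                    if (substr(line, i + 1, 1) == "\"") {
                        out = out "\""   # doubled quote: emit one literal quote
                        i++
                    } else {
                        inq = 0          # closing quote of the field
                    }
                } else if (c == "\t") {
                    out = out " "        # embedded tab becomes a space
                } else {
                    out = out c
                }
            } else {
                if (c == "\"") inq = 1               # opening quote
                else if (c == ",") out = out "\t"    # field separator
                else if (c == "\t") out = out " "    # stray tab in unquoted field
                else out = out c
            }
            i++
        }
        if (inq) {
            out = out " "    # record continues: embedded newline becomes a space
        } else {
            print out
            out = ""
        }
    }
    END { if (inq) print out }   # flush an unterminated quoted field
    '
}
```

With this sketch, `echo '"abc","d""ef"' | csv_to_strict_tsv` yields the two fields abc and d"ef separated by a tab, with no CSV escapes in the output.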
@jondegenhardt Thanks for the detailed question! I do not believe there is any such option. In fact, the underlying CSV writer doesn't support it, so that's how I know there isn't any such option. The CSV writer options are here: https://docs.rs/csv/1.0.0-beta.5/csv/struct.WriterBuilder.html. We might consider changing escape to accept an Option<u8>, and when it and double_quote are disabled, then no escaping is performed. We would also need to add a --quote-never option I suppose.
The last bit is silently changing \t and \n into something else, which gets more complicated.
My estimation is that this is a bit of an awkward fit for xsv at the moment.
Very good, thanks for the detailed response. The CSV doc reference is helpful.
I was about to open a new issue about this, cf. comments from https://github.com/BurntSushi/xsv/issues/67#issuecomment-480218068 and down, but I see this has been closed already. @jondegenhardt if you still have a need for --quote-never, xsv 0.13.0 seems to do this if you pass in the ASCII character 1, though as it's not documented anywhere I guess it comes with no guarantees :-)
$ printf 'user\tutterance\njoe\tSay "hi"\n'|xsv select -d $'\t' utterance \
| xsv fmt -t $'\t'
utterance
"Say ""hi"""
$ printf 'user\tutterance\njoe\tSay "hi"\n'|xsv select -d $'\t' utterance \
| xsv fmt --quote $'\1' -t $'\t'
utterance
Say "hi"
$ printf 'user\tutterance\njoe\tSay "hi"\n'|xsv select -d $'\t' utterance \
| xsv fmt --quote $'\1' -t $'\t' \
| grep -c $'\1'
0
@unhammer Just to clarify, using the ASCII byte 1 only works because it presumably does not appear in your input anywhere. If it did, then it would need to be quoted. Moreover, if your input contained a field that spanned multiple lines, then it would also need to be quoted.
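The first caveat can be checked mechanically before relying on the trick. This is a minimal sketch under assumptions: `count_soh` and `safe_unquoted_tsv` are hypothetical helper names, not xsv commands, and the xsv invocation simply reuses the undocumented `--quote` usage shown earlier in this thread. It only guards against byte 1 appearing in the input; fields containing tabs or spanning multiple lines would still be quoted by xsv.

```shell
# Hypothetical helper: count occurrences of the byte 0x01 on stdin.
count_soh() {
    # Delete every byte except 0x01, then count what is left.
    LC_ALL=C tr -cd '\001' | wc -c | tr -d ' '
}

# Hypothetical wrapper: refuse to use 0x01 as the quote character
# when the input file already contains that byte.
safe_unquoted_tsv() {
    input=$1
    if [ "$(count_soh < "$input")" -ne 0 ]; then
        echo "input contains byte 0x01; not using it as the quote character" >&2
        return 1
    fi
    # Same undocumented trick as above: quote character 0x01, tab delimiter.
    xsv fmt --quote "$(printf '\001')" -t "$(printf '\t')" "$input"
}
```

The wrapper fails early instead of silently emitting fields wrapped in byte 1.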
I'm not sure why this was closed. The underlying CSV writer does support it, so I think this is as easy as adding a new --quote-never flag and hooking it up.
Aha, thanks for the clarification, good to know the exact dangers involved.
Reason I closed it was that my question had been answered. Didn't mean to suggest the feature would not be useful.
Hi, I think --quote-never would help in the aforementioned cases. Do you have a plan to implement this?
Thanks!
(and congrats for the tool, it's great)
A question: Is there an option to perform output without the CSV escape syntax? This would be to generate a more strict TSV format, without escapes.
What’s the behavior when a tab or newline is encountered in the data? Are they just converted to spaces (the example says “and might be replaced by a space when encountered”)? Or should the program error out?