rust-csv

Quotes included despite using QuoteStyle::Never

Open hniksic opened this issue 2 years ago • 4 comments

The docs for QuoteStyle::Never say:

This never writes quotes, even if it would produce invalid CSV data.

Based on that, I would expect the following program to output a line with foo followed by an empty line, followed by a line with bar,baz:

fn main() {
    let mut writer = csv::WriterBuilder::new()
        .quote_style(csv::QuoteStyle::Never)
        .from_path("/dev/stdout")
        .unwrap();
    writer.write_record(["foo"]).unwrap();
    writer.write_record([""]).unwrap();
    writer.write_record(["bar,baz"]).unwrap();
}

Expected output:

foo

bar,baz

But actual output is:

foo
""
bar,baz

Is there a way to prevent the writer from generating quotes?

I am using csv 1.1.6.

hniksic, Apr 25 '22 11:04

My initial impression here is that this is a bug, since the Never option explicitly says "never add quotes" and "even if this would sacrifice the integrity of the CSV data." And indeed, without the empty quotes here, the record would effectively disappear because the parser in this crate ignores empty lines. That could be pretty surprising, but it seems to me like the QuoteStyle::Never docs contain sufficient warning.

After a brief skim of the code, I couldn't immediately identify where the quotes are being inserted. But I think I'd accept a patch that fixes this.

With that said, there is a possible work-around here: simply do not write empty records. Unless you're feeding the data to some other parser---as I mentioned above---the parser in this crate will just ignore empty lines anyway. So it will be as if the record was never written at all.
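To illustrate why dropping the quotes makes the record disappear for a parser that skips empty lines, here is a toy sketch. The `parse_records` function is a hypothetical, heavily simplified stand-in written for this example. It is not the csv crate's parser and only models the two behaviors mentioned above: empty lines are skipped, and `""` reads as one empty field.

```rust
/// Toy line-based parser (hypothetical, not the csv crate's parser).
/// It skips empty lines and strips surrounding quotes from each field.
fn parse_records(input: &str) -> Vec<Vec<String>> {
    input
        .lines()
        // Empty lines are ignored entirely, as described above.
        .filter(|line| !line.is_empty())
        .map(|line| {
            line.split(',')
                // Crude unquoting: good enough to read `""` as an empty field.
                .map(|field| field.trim_matches('"').to_string())
                .collect()
        })
        .collect()
}

fn main() {
    // With the placeholder quotes, the empty record survives a round trip.
    let quoted = "foo\n\"\"\nbar\n";
    // Without them, the empty line is skipped and one record is lost.
    let unquoted = "foo\n\nbar\n";
    println!("records with quotes:    {}", parse_records(quoted).len());
    println!("records without quotes: {}", parse_records(unquoted).len());
}
```

Under this model the quoted input yields three records and the unquoted input yields two, which is the surprise the QuoteStyle::Never docs warn about.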

BurntSushi, Apr 25 '22 12:04

> With that said, there is a possible work-around here: simply do not write empty records.

I need the empty line because the output will be parsed by another tool, not necessarily a CSV parser, that doesn't mind empty fields. This already works when there is more than one field in the record:

writer.write_record(["a", "b"]).unwrap();
writer.write_record(["c", ""]).unwrap();
writer.write_record(["", "d"]).unwrap();
writer.write_record(["", ""]).unwrap();

This produces the following, as expected:

a,b
c,
,d
,

But if the record has only one field, an empty field gets quoted. I don't have access to the writer's output, so I cannot inject the newline myself.

The intended use is to pipe the output into tools like cut, which don't understand quoting but cope fine with empty fields.
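For cases where the code does control the output stream, the quote-free format that tools like cut expect can be produced without the CSV writer at all. The following is a minimal sketch under that assumption. `write_unquoted_record` is a hypothetical helper, not part of the csv crate, and it performs no escaping whatsoever.

```rust
use std::io::{self, Write};

/// Hypothetical helper (not part of the csv crate): writes one record with
/// no quoting or escaping, so a single empty field becomes an empty line.
fn write_unquoted_record<W: Write>(out: &mut W, fields: &[&str]) -> io::Result<()> {
    // Join the fields with the delimiter. Nothing is quoted, even if a
    // field itself contains a comma or a newline.
    out.write_all(fields.join(",").as_bytes())?;
    out.write_all(b"\n")
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    let mut out = stdout.lock();
    write_unquoted_record(&mut out, &["foo"])?;
    write_unquoted_record(&mut out, &[""])?;
    write_unquoted_record(&mut out, &["bar,baz"])?;
    Ok(())
}
```

This gives the output expected in the original report, at the cost of losing everything else the CSV writer provides, such as configurable delimiters and terminators.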

hniksic, Apr 25 '22 12:04

> After a brief skim of the code, I couldn't immediately identify where the quotes are being inserted. But I think I'd accept a patch that fixes this.

Looking at the code, this line could be the origin of the quotes:

    pub fn finish(&mut self, mut output: &mut [u8]) -> (WriteResult, usize) {
        let mut nout = 0;
        if self.state.record_bytes == 0 && self.state.in_field {
            assert!(!self.state.quoting);
            let (res, o) = self.write(&[self.quote, self.quote], output);

Would you accept a PR that modifies this to write the quotes only if the quoting style is something other than QuoteStyle::Never?
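The shape of the change I have in mind can be sketched as a standalone model. This is not the csv-core code: `finish_record` and its parameters are hypothetical, the `QuoteStyle` enum is redefined locally (mirroring the variant names of csv::QuoteStyle) so the sketch compiles on its own, and only the condition from the excerpt is modeled.

```rust
/// Local copy of the variant names of csv::QuoteStyle, redefined here so
/// the sketch is self-contained.
#[derive(Clone, Copy, Debug, PartialEq)]
enum QuoteStyle {
    Always,
    Necessary,
    NonNumeric,
    Never,
}

/// Hypothetical model of the end-of-record step. `record_bytes` and
/// `in_field` stand in for the state fields seen in the excerpt.
fn finish_record(
    style: QuoteStyle,
    record_bytes: usize,
    in_field: bool,
    quote: u8,
    out: &mut Vec<u8>,
) {
    // A record consisting of a single empty field has written no bytes.
    let empty_single_field = record_bytes == 0 && in_field;
    // Proposed change: skip the `""` placeholder when quoting is disabled.
    if empty_single_field && style != QuoteStyle::Never {
        out.extend_from_slice(&[quote, quote]);
    }
}

fn main() {
    let styles = [
        QuoteStyle::Always,
        QuoteStyle::Necessary,
        QuoteStyle::NonNumeric,
        QuoteStyle::Never,
    ];
    for style in styles {
        let mut out = Vec::new();
        finish_record(style, 0, true, b'"', &mut out);
        println!("{:?} wrote {} placeholder byte(s)", style, out.len());
    }
}
```

In this model only QuoteStyle::Never leaves the record empty; every other style still gets the two placeholder quotes. Whether the real code has another reason for writing them is a separate question that this sketch does not answer.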

hniksic, Apr 25 '22 14:04

I think so, yes, but I'll need to think on it. In particular, the line right above

https://github.com/BurntSushi/rust-csv/blob/41c71ed353a71526c52633d854466c1619dacae4/csv-core/src/writer.rs#L234

suggests that I might have added this explicit quoting because of some other reason that I can no longer remember. And I wish I had left a comment.

BurntSushi, Apr 25 '22 14:04