ConfigurationFunctions.ShouldQuote should not check for CsvConfiguration.Delimiter when CsvConfiguration.DetectDelimiter is true
Describe the bug
ConfigurationFunctions.ShouldQuote should not check for CsvConfiguration.Delimiter when CsvConfiguration.DetectDelimiter is true
To Reproduce Steps to reproduce the behavior:
- use
CsvConfiguration config = new CsvConfiguration(CultureInfo.InvariantCulture){DetectDelimiter = true}; - use
using (CsvReader csv = new CsvReader(streamReader, config)) - read a TSV file that is not quoted and contains a comma in the second column of the second row
- when using
CsvWriterto write the content back to a TSV,ConfigurationFunctions.ShouldQuotedoesn't check for the detected delimiter\tbut the default,
Expected behavior
ConfigurationFunctions.ShouldQuote should check for the detected delimiter \t instead of the default , so that the TSV generated with CsvWriter is identical to the original TSV read with CsvReader.
ShouldQuote is only used when writing. DetectDelimiter is only used when reading. There is something else going on for you.
Can you give an example that shows this? Here is what I have based on your reproduction steps.
void Main()
{
var s = new StringBuilder();
s.Append("Id\tName\r\n");
s.Append("1\tone\r\n");
s.Append("2\t,two\r\n");
s.Append("3\tthree\r\n");
var config = new CsvConfiguration(CultureInfo.InvariantCulture)
{
Delimiter = "\t",
DetectDelimiter = true,
};
using (var reader = new StringReader(s.ToString()))
using (var csv = new CsvReader(reader, config))
{
csv.GetRecords<Foo>().ToList().Dump();
}
}
private class Foo
{
public int Id { get; set; }
public string Name { get; set; }
}

Here is what I have based on your reproduction steps.
You didn't follow my reproduction steps.
new CsvConfiguration(CultureInfo.InvariantCulture)
{
Delimiter = "\t",
DetectDelimiter = true,
};
is not the same as new CsvConfiguration(CultureInfo.InvariantCulture){DetectDelimiter = true};
If I use Delimiter = "\t" then it also works for me. But I don't know the delimiter in advance.
It has been 3 months since I was using this, so I don't remember the problem exactly.
Oops. Just remove the Delimiter = "\t" line. Same result.
The next release that will be coming out probably today will have delimiter detection fixes. If these don't fix your issue, please post here and I can open this again.
I forgot what the original problem was, but I looked into the source code just now.
The problem is in public static bool ShouldQuote(ShouldQuoteArgs args)
https://github.com/JoshClose/CsvHelper/blob/master/src/CsvHelper/Configuration/ConfigurationFunctions.cs#L103
in line || (config.Delimiter.Length > 0 && args.Field.Contains(config.Delimiter)) // Contains delimiter
Because I used DetectDelimiter = true, the detected delimiter was \t, but line 103 does not use the detected delimiter, but the default delimiter config.Delimiter - that causes a problem when writing the content back.
I don't remember what I was doing, but I guess I was trying to use a method that calls ShouldQuote.
My guess would be that I was trying to use public CsvWriter(TextWriter writer, CsvConfiguration configuration) that calls ShouldQuote in shouldQuote = configuration.ShouldQuote;
So the actual problem comes when writing this - the output TSV is not identical to the input TSV.
I am sorry that I was not specific enough.
@JoshClose I updated the original comment.
Oh... You read the data then use that same config to write and are expecting the delimiter from the detection to be used.
The problem here is that IParserConfiguration is passed into CsvParser where the delimiter detection is called and it's readonly, so we need to think if a different way to handle this. CsvReader.Parser.Delimiter is the updated value. After reading, you could set this value to the configuration before you pass it into the CsvWriter constructor.
Here is an example.
void Main()
{
var s = new StringBuilder();
s.Append("Id;Name\r\n");
s.Append("1;one\r\n");
List<Foo> records;
var config = new CsvConfiguration(CultureInfo.InvariantCulture)
{
DetectDelimiter = true,
};
using (var reader = new StringReader(s.ToString()))
using (var csv = new CsvReader(reader, config))
{
csv.Context.RegisterClassMap<FooMap>();
records = csv.GetRecords<Foo>().ToList();
config.Delimiter = csv.Parser.Delimiter;
}
using (var writer = new StringWriter())
using (var csv = new CsvWriter(writer, config))
{
csv.WriteRecords(records);
csv.Flush();
writer.Dump();
}
}
private class Foo
{
public int Id { get; set; }
public string Name { get; set; }
}
private class FooMap : ClassMap<Foo>
{
public FooMap()
{
Map(m => m.Id);
Map(m => m.Name);
}
}

I see. Thank you!
I remember now that I was trying to read a TSV and write it back the same way, without quotes and with automatically detected delimiter, but the generated TSV always had quotes for all the cells with a comma in the middle, even though the original didn't have quotes.
Perhaps reading a TSV and writing it back the same way could be explained in the Readme :)
It actually has nothing to do with a TSV. It will happen with any delimiter. Delimiter detection is the issue.
I believe this is the first time I've heard of someone needing to write a file in the same way it was ready. It's usually the other way around.
I'll mark this as documentation.