CsvHelper
CsvHelper copied to clipboard
LineBreakInQuotedFieldIsBadData Not Working since 28.0.0
I am trying to upgrade from version 27.2.1 to the latest and am running into an issue that was introduced in 28.0.0. I think it may be a bug.
I am reading using reader.GetRecordsAsync
My configuration is
var config = new CsvConfiguration(CultureInfo.InvariantCulture)
{
Delimiter = ",",
HasHeaderRecord = true,
TrimOptions = TrimOptions.Trim,
MissingFieldFound = ConfigurationFunctions.MissingFieldFound,
BadDataFound = ConfigurationFunctions.BadDataFound,
LineBreakInQuotedFieldIsBadData = true,
};
The case is if a record is cut off in the middle of a quotes field, such as:
Field1, Field2, Field3, Field4, Field5
1,2,3, “Text 1”,”Text
1,2,3, “Text 3”,”Text 4”
In 27.2.1, this gets a BadDataException.
In 28.0.0, no exception is thrown. On the first record Field5 gets set to mull, then on the second record all fields are set to null (all that data is lost)
I would like an exception thrown instead of succeeding with lost data. I can’t simply check for nulls values as that is a valid use case.
Can this be fixed back to 27.2.1 behavior? Or is there another option for detecting this?
Thanks
In your example, those characters aren't quotes. Looks like left and right quotes or something.
If I change them to quotes it throws a BadDataException
for me.
void Main()
{
var s = new StringBuilder();
s.Append("Field1, Field2, Field3, Field4, Field5\r\n");
s.Append("1,2,3, \"Text 1\",\"Text\r\n");
s.Append("1,2,3, \"Text 3\",\"Text 4\"\r\n");
var config = new CsvConfiguration(CultureInfo.InvariantCulture)
{
TrimOptions = TrimOptions.Trim,
};
using (var reader = new StringReader(s.ToString()))
using (var csv = new CsvParser(reader, config))
{
while (csv.Read())
{
csv.Record.Dump();
}
}
}
Thanks for the quick reply. Sorry for the issue with the quotes, but that is not the issue. It must have gotten mangled as I moved the example to the machine I could use to post here.
Here is a Console app version of a complete example that illustrates the issue. I am using CsvReader, whereas you used CsvParser. The only differences from my actual code are I substituted a MemoryStream, and save the records to a List instead of doing a yield return.
using CsvHelper;
using CsvHelper.Configuration;
using System.Globalization;
using System.Text;
public class Program
{
static async Task Main()
{
Console.WriteLine("Starting");
var s = new StringBuilder();
s.Append("Field1, Field2, Field3, Field4, Field5\r\n");
s.Append("1,2,3, \"Text 1\",\"Text\r\n");
s.Append("1,2,3, \"Text 3\",\"Text 4\"\r\n");
var config = new CsvConfiguration(CultureInfo.InvariantCulture)
{
Delimiter = ",",
HasHeaderRecord = true,
TrimOptions = TrimOptions.Trim,
MissingFieldFound = ConfigurationFunctions.MissingFieldFound,
BadDataFound = ConfigurationFunctions.BadDataFound,
LineBreakInQuotedFieldIsBadData = true,
};
using var dataStream = new MemoryStream(Encoding.UTF8.GetBytes(s.ToString()));
using var textReader = (TextReader)new StreamReader(dataStream, new UTF8Encoding());
using var reader = new CsvReader(textReader, config);
reader.Context.TypeConverterOptionsCache.GetOptions<string>().NullValues.Add("NULL");
reader.Context.TypeConverterOptionsCache.GetOptions<string>().NullValues.Add("null");
reader.Context.TypeConverterOptionsCache.GetOptions<string>().NullValues.Add("Null");
reader.Context.TypeConverterOptionsCache.GetOptions<string>().NullValues.Add(string.Empty);
var records = new List<IDictionary<string, object>>();
try
{
await foreach (var record in reader.GetRecordsAsync<dynamic>().ConfigureAwait(false))
{
records.Add((IDictionary<string, object>)record);
}
}
catch
{
Console.WriteLine("Exception thrown in version 27.2.1. This is the behavior I expect");
throw;
}
Console.WriteLine("Exception not thrown in version 28.0.0. I would expect this would throw an exception");
}
}
This regressed in 542e301a. The reason is twofold:
-
DynamicRecordCreator
usesReader.TryGetField
when there is a header record, which swallows all exceptions (in particular, theBadDataException
) fromGetField
https://github.com/JoshClose/CsvHelper/blob/9848510ed0fc0980bda687882c5ea123b7d7d3ce/src/CsvHelper/Expressions/DynamicRecordCreator.cs#L43 - The
BadDataException
was previously (also) throwing from the invocation ofparser.Record
in https://github.com/JoshClose/CsvHelper/blob/ee3f8f20f4450032e78b3fee26eb2d6ba873b372/src/CsvHelper/CsvReader.cs#L245 which no longer gets called
Possible solutions:
- Use
Reader.GetField
inDynamicRecordCreator
instead ofReader.TryGetField
(breaking change, would require settingconfig.MissingFieldFound
to not throw in order to maintain behaviour) - Use
Reader.Parser[]
directly inDynamicRecordCreator
instead ofReader.TryGetField
(breaking change,MissingFieldFound
would no longer be fired) - Add an (internal)
GetField
which has a catch-all around theMissingFieldFound
invocation but allows theCsvParser
exceptions to bubble up
(1) makes the most sense to me but is probably the most breaking. (3) is probably the least breaking but the most fudge-y
Please fix this bug, or show me how to work around, hic