GetValue of CsvDataReader doesn't support TypeConverter
GetValue of CsvDataReader doesn't support TypeConverter and always returns a string. When it is used along with SqlBulkCopy, there is no way to do type conversion of incoming values, so, for example, a CSV that uses non-standard date values causes an issue. The ability to pass an index-based type converter would be really useful; for example, it could take a collection of index-based converters, a column map, etc. In this scenario it would be a very big overhead to use POCOs to map data types, and even that, as I understand it, is not supported in the current version.
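To illustrate, usage of such an overload might look roughly like this. This is purely hypothetical; the converter-map constructor does not exist in CsvHelper today, and csvReader/sqlBulkCopy are assumed to be set up elsewhere.

using System.Collections.Generic;
using CsvHelper;
using CsvHelper.TypeConversion;

// Hypothetical proposed API: an index-based converter map that
// CsvDataReader would consult inside GetValue.
var converters = new Dictionary<int, ITypeConverter>
{
    [2] = new DateTimeConverter(), // e.g. column 2 holds non-standard dates
};

using var dataReader = new CsvDataReader(csvReader, converters); // proposed overload, does not exist today
sqlBulkCopy.WriteToServer(dataReader);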
Are you passing in an IDataReader and not a DataTable to SqlBulkCopy.WriteToServer?
That's correct. The goal of using CsvDataReader is to avoid reading the whole file into memory, as the file can be very large.
Ok. In that case we have no clue what the types are. There will need to be some way to specify a mapping or definition of indexes/names to types. Do you know what methods get called during a bulk copy besides object GetValue(int index)?
My observation was that only object GetValue(int index) is called, and obviously Read. I would say an index-based dictionary with type converters would do the job in the beginning.
I'm trying to think of the best way to do this. If it doesn't use int GetOrdinal(string name), it would be pretty simple to just pass in an IDictionary<int, Type> to the constructor that maps the index to the type. If the name is used, it would need to be a little more complex. I don't have SQL Server installed... I looked at the source and I'm only seeing FieldCount, Read and GetValue. There could be others though. Would you be able to create an IDataReader class and see what is called? Just leave the implementations as throws and fill them in as they get called. You could also just take the CsvDataReader source and set breakpoints.
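A rough sketch of that probing idea, in case it saves someone typing. The class name is made up; the three members we already expect are delegated and logged, and everything else throws so any additional call surfaces immediately.

using System;
using System.Data;

// Hypothetical diagnostic wrapper: log the members SqlBulkCopy actually
// calls, and throw on everything else so unexpected calls surface fast.
public sealed class ProbingDataReader : IDataReader
{
    private readonly IDataReader inner;
    public ProbingDataReader(IDataReader inner) => this.inner = inner;

    private T Trace<T>(string member, Func<T> call)
    {
        Console.WriteLine($"called: {member}");
        return call();
    }

    // Expected calls are delegated (and logged):
    public int FieldCount => Trace(nameof(FieldCount), () => inner.FieldCount);
    public bool Read() => Trace(nameof(Read), inner.Read);
    public object GetValue(int i) => Trace($"GetValue({i})", () => inner.GetValue(i));

    // Cleanup still delegates so the wrapper can be disposed normally.
    public void Close() => inner.Close();
    public void Dispose() => inner.Dispose();

    // Everything else throws; fill members in here as they show up.
    public object this[int i] => throw new NotImplementedException("this[int]");
    public object this[string name] => throw new NotImplementedException("this[string]");
    public int Depth => throw new NotImplementedException(nameof(Depth));
    public bool IsClosed => throw new NotImplementedException(nameof(IsClosed));
    public int RecordsAffected => throw new NotImplementedException(nameof(RecordsAffected));
    public DataTable GetSchemaTable() => throw new NotImplementedException(nameof(GetSchemaTable));
    public bool NextResult() => throw new NotImplementedException(nameof(NextResult));
    public bool GetBoolean(int i) => throw new NotImplementedException(nameof(GetBoolean));
    public byte GetByte(int i) => throw new NotImplementedException(nameof(GetByte));
    public long GetBytes(int i, long fieldOffset, byte[] buffer, int bufferOffset, int length) => throw new NotImplementedException(nameof(GetBytes));
    public char GetChar(int i) => throw new NotImplementedException(nameof(GetChar));
    public long GetChars(int i, long fieldOffset, char[] buffer, int bufferOffset, int length) => throw new NotImplementedException(nameof(GetChars));
    public IDataReader GetData(int i) => throw new NotImplementedException(nameof(GetData));
    public string GetDataTypeName(int i) => throw new NotImplementedException(nameof(GetDataTypeName));
    public DateTime GetDateTime(int i) => throw new NotImplementedException(nameof(GetDateTime));
    public decimal GetDecimal(int i) => throw new NotImplementedException(nameof(GetDecimal));
    public double GetDouble(int i) => throw new NotImplementedException(nameof(GetDouble));
    public Type GetFieldType(int i) => throw new NotImplementedException(nameof(GetFieldType));
    public float GetFloat(int i) => throw new NotImplementedException(nameof(GetFloat));
    public Guid GetGuid(int i) => throw new NotImplementedException(nameof(GetGuid));
    public short GetInt16(int i) => throw new NotImplementedException(nameof(GetInt16));
    public int GetInt32(int i) => throw new NotImplementedException(nameof(GetInt32));
    public long GetInt64(int i) => throw new NotImplementedException(nameof(GetInt64));
    public string GetName(int i) => throw new NotImplementedException(nameof(GetName));
    public int GetOrdinal(string name) => throw new NotImplementedException(nameof(GetOrdinal));
    public string GetString(int i) => throw new NotImplementedException(nameof(GetString));
    public int GetValues(object[] values) => throw new NotImplementedException(nameof(GetValues));
    public bool IsDBNull(int i) => throw new NotImplementedException(nameof(IsDBNull));
}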
I can confirm that, from my observation, only FieldCount, Read and GetValue are being used.
As for providing converters: since we have the facility in CsvReader to read the header and identify columns, we could have another overload that takes column names and converters and then resolves them to their respective indexes...
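Something along these lines, perhaps. This is just a sketch; csvReader is assumed to be set up already, the column name is an example, and depending on the CsvHelper version the header may live on csvReader.Context.HeaderRecord instead.

using System;
using System.Collections.Generic;
using System.Linq;
using CsvHelper.TypeConversion;

// Resolve a name-based converter map to an index-based one using
// the header that CsvReader has already parsed.
csvReader.Read();
csvReader.ReadHeader();

var byName = new Dictionary<string, ITypeConverter>
{
    ["CreatedOn"] = new DateTimeConverter(), // example column name
};

var byIndex = byName.ToDictionary(
    pair => Array.IndexOf(csvReader.HeaderRecord, pair.Key),
    pair => pair.Value);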
I'm also interested in this feature. I have a 4 GB CSV file I'm importing using CsvDataReader and SqlBulkCopy, and some rows may have bad data. Any update or workarounds?
Did you end up coming up with a solution? I have the exact same issue with using CsvDataReader and SqlBulkCopy. I may end up writing an adapter around CsvDataReader like so:
public CsvDataReaderAdapter(CsvDataReader reader, IDictionary<string, Type> nameToTypeMapper)
and then handle checking the type and converting to the correct object in public object GetValue(int i)
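Roughly like this (a sketch under the assumption, from the observations above, that SqlBulkCopy only calls FieldCount, Read and GetValue; everything else simply delegates to the wrapped reader, and treating an empty string as NULL is my own choice):

using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;
using CsvHelper;

public sealed class CsvDataReaderAdapter : IDataReader
{
    private readonly CsvDataReader reader;
    private readonly IDictionary<string, Type> nameToType;

    public CsvDataReaderAdapter(CsvDataReader reader, IDictionary<string, Type> nameToTypeMapper)
    {
        this.reader = reader;
        this.nameToType = nameToTypeMapper;
    }

    // The conversion SqlBulkCopy actually sees:
    public object GetValue(int i)
    {
        var raw = reader.GetValue(i); // CsvDataReader hands back the raw string
        if (raw is string s && nameToType.TryGetValue(reader.GetName(i), out var type))
        {
            if (s.Length == 0) return DBNull.Value; // assumption: empty field means NULL
            return Convert.ChangeType(s, type, CultureInfo.InvariantCulture);
        }
        return raw;
    }

    // Everything else delegates to the wrapped CsvDataReader.
    public int FieldCount => reader.FieldCount;
    public bool Read() => reader.Read();
    public object this[int i] => GetValue(i);
    public object this[string name] => GetValue(reader.GetOrdinal(name));
    public int Depth => reader.Depth;
    public bool IsClosed => reader.IsClosed;
    public int RecordsAffected => reader.RecordsAffected;
    public void Close() => reader.Close();
    public void Dispose() => reader.Dispose();
    public DataTable GetSchemaTable() => reader.GetSchemaTable();
    public bool NextResult() => reader.NextResult();
    public bool GetBoolean(int i) => reader.GetBoolean(i);
    public byte GetByte(int i) => reader.GetByte(i);
    public long GetBytes(int i, long fieldOffset, byte[] buffer, int bufferOffset, int length) => reader.GetBytes(i, fieldOffset, buffer, bufferOffset, length);
    public char GetChar(int i) => reader.GetChar(i);
    public long GetChars(int i, long fieldOffset, char[] buffer, int bufferOffset, int length) => reader.GetChars(i, fieldOffset, buffer, bufferOffset, length);
    public IDataReader GetData(int i) => reader.GetData(i);
    public string GetDataTypeName(int i) => reader.GetDataTypeName(i);
    public DateTime GetDateTime(int i) => reader.GetDateTime(i);
    public decimal GetDecimal(int i) => reader.GetDecimal(i);
    public double GetDouble(int i) => reader.GetDouble(i);
    public Type GetFieldType(int i) => reader.GetFieldType(i);
    public float GetFloat(int i) => reader.GetFloat(i);
    public Guid GetGuid(int i) => reader.GetGuid(i);
    public short GetInt16(int i) => reader.GetInt16(i);
    public int GetInt32(int i) => reader.GetInt32(i);
    public long GetInt64(int i) => reader.GetInt64(i);
    public string GetName(int i) => reader.GetName(i);
    public int GetOrdinal(string name) => reader.GetOrdinal(name);
    public string GetString(int i) => reader.GetString(i);
    public int GetValues(object[] values) => reader.GetValues(values);
    public bool IsDBNull(int i) => reader.IsDBNull(i);
}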
I would be curious if you found a better way to do this?
I'm curious as well.
It turns out that in order to correctly specify types, you just need to return a DataTable containing the destination schema from the GetSchemaTable method of IDataReader. I've submitted PR #1435, which allows a schema DataTable to be specified in the constructor, and that seems to do the trick.
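For anyone finding this later, a minimal sketch of what that can look like (assuming a CsvHelper version that includes the schema overload from the PR; the column names and types are examples, and csvReader/bulkCopy are assumed to be set up elsewhere):

using System;
using System.Data;
using CsvHelper;

// Build a schema table describing the destination columns; each row
// describes one column using the standard ColumnName/DataType layout.
var schema = new DataTable();
schema.Columns.Add("ColumnName", typeof(string));
schema.Columns.Add("DataType", typeof(Type));
schema.Rows.Add("Id", typeof(int));
schema.Rows.Add("IsActive", typeof(bool));
schema.Rows.Add("Name", typeof(string));

using var dataReader = new CsvDataReader(csvReader, schema);
bulkCopy.WriteToServer(dataReader);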
PR 1435 helps a lot, but it still doesn't account for type conversions. Maybe it's the version of SqlBulkCopy that I'm using, but it does not call any of the Get(type) methods. Instead, it only ever calls GetValue(ordinal).
I was able to fix this for my situation (loading 0/1 into bit fields as false/true) by copying the CsvDataReader class and changing the GetValue method to:
public object GetValue(int i)
{
    if (IsDBNull(i)) return DBNull.Value;
    // Look up the column type from the schema supplied in the constructor
    // and call the matching typed getter so the type converters run.
    if (GetFieldType(i) == typeof(int)) return GetInt32(i);
    if (GetFieldType(i) == typeof(bool)) return GetBoolean(i);
    return _csv.GetField(i) ?? string.Empty; // fall back to the raw string
}
This looks up the type from the schema provided in the constructor and calls the appropriate Get(type), which then allows the underlying CsvReader to use the type converters and value lists (e.g., I had to add 0 to BooleanFalseValues and 1 to BooleanTrueValues). Without the change to GetValue, adding those things to the TypeConverterOptionsCache alone did not work.
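For reference, the option changes mentioned above look roughly like this (assuming a recent CsvHelper where the cache lives on csv.Context; on older versions it sits on csv.Configuration instead):

// Teach the bool converter to accept 0/1 in addition to true/false.
var boolOptions = csv.Context.TypeConverterOptionsCache.GetOptions<bool>();
boolOptions.BooleanTrueValues.Add("1");
boolOptions.BooleanFalseValues.Add("0");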