Sylvan.Data.Excel icon indicating copy to clipboard operation
Sylvan.Data.Excel copied to clipboard

Using Excel library and Csv library to read .csv, .xls, and .xlsx files respectively, the experience is inconsistent, and there are even logic errors.

Open lileyzhao opened this issue 3 months ago • 4 comments

Using only a simple two-row, two-column table, create .xlsx, .xls, and .csv files. e.g.

name age
nike 18

Here are the reading codes and output results for the three formats.: default options: xls file

using ExcelDataReader xls = ExcelDataReader.Create("d://dowhile.xls");
do
{
    var name = xls.GetString(0);
    var age = xls.GetString(1);
    Console.WriteLine($"{name}-{age}");
} while (xls.Read());

xls file output

nike 18
nike 18

xlsx file

using ExcelDataReader xls = ExcelDataReader.Create("d://dowhile.xlsx");
do
{
    var name = xls.GetString(0);
    var age = xls.GetString(1);
    Console.WriteLine($"{name} {age}");
} while (xls.Read());

xlsx file output

name age
nike 18

csv file

using var csv = CsvDataReader.Create("d://dowhile.csv");
do
{
    var name = csv.GetString(0);
    var age = csv.GetString(1);
    Console.WriteLine($"{name} {age}");
} while (csv.Read());

csv file output

nike 18
nike 18

if option hasHeader=false Another scenario.

xls file

using ExcelDataReader xls = ExcelDataReader.Create("d://dowhile.xls", new ExcelDataReaderOptions { Schema = ExcelSchema.NoHeaders });
do
{
    var name = xls.GetString(0);
    var age = xls.GetString(1);
    Console.WriteLine($"{name}-{age}");
} while (xls.Read());

xls file output

name age
name age
nike 18

xlsx file

using ExcelDataReader xls = ExcelDataReader.Create("d://dowhile.xlsx", new ExcelDataReaderOptions { Schema = ExcelSchema.NoHeaders });
do
{
    var name = xls.GetString(0);
    var age = xls.GetString(1);
    Console.WriteLine($"{name} {age}");
} while (xls.Read());

xlsx file output

name age
nike 18

csv file

using var csv = CsvDataReader.Create("d://dowhile.csv", new CsvDataReaderOptions { HasHeaders = false });
do
{
    var name = csv.GetString(0);
    var age = csv.GetString(1);
    Console.WriteLine($"{name} {age}");
} while (csv.Read());

csv file output

name age
name age
nike 18

lileyzhao avatar Mar 29 '24 06:03 lileyzhao

You're observations are probably correct, but based on unexpected/incorrect usage of the library. You should always call Read before accessing fields with GetString, etc. If you change all the code examples from do/while to while(reader.Read()) the behavior should be consistent.

I should modify all the libraries to throw an InvalidOperationException if any field is accessed before Read is called, to indicate that this is not an expected access pattern.

MarkPflug avatar Mar 29 '24 14:03 MarkPflug

You are right, I have now switched to using while(read()) to resolve the inconsistency issue. In fact, initially I was using while(read()), but I tried using do-while due to another very exceptional problem. I will submit another very exceptional problem shortly, please wait.

lileyzhao avatar Mar 29 '24 14:03 lileyzhao

I will leave this issue open as a reminder that I should make GetString et al throw when the reader hasn't yet been advanced (before Read) or after the end of the data has been reader when Read returns false.

MarkPflug avatar Mar 29 '24 14:03 MarkPflug

I have submitted a new issue, please take a look there for the root cause of the problem.

lileyzhao avatar Mar 29 '24 15:03 lileyzhao

This is fixed in 0.4.20, just pushed to nuget.

MarkPflug avatar Apr 25 '24 22:04 MarkPflug