DataFrameCsvReader should ignore spaces after commas in CSV files

Open olekscode opened this issue 6 years ago • 2 comments

The following line of a CSV file:

first_name, last_name, order_date, amount

will be parsed as:

#('first_name' ' last_name' ' order_date' ' amount')

(with each string except the first one starting with a space).

Spaces after commas in CSV files should be ignored.
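To illustrate the behaviour outside of Pharo, here is a minimal Python sketch. It is not the DataFrameCsvReader code; it only mimics a reader that splits on a bare comma, and then shows the trimmed result the issue asks for. The variable names are made up for the example.

```python
# Illustration only: a naive reader that splits on "," keeps the
# space that follows each comma as part of the next column name.
header = "first_name, last_name, order_date, amount"

raw = header.split(",")
print(raw)      # every name except the first starts with a space

# Trimming each piece gives the column names a user would expect.
trimmed = [name.strip() for name in raw]
print(trimmed)
```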

olekscode avatar Aug 02 '19 15:08 olekscode

Shouldn't the user just use ", " (a comma followed by a space) as the separator?

AtharvaKhare avatar Aug 02 '19 15:08 AtharvaKhare

I think the result should be trimmed on both sides, because sometimes even the data inside a single file is inconsistent.

In most cases, CSV files separate values with commas and TSV files separate them with tabs. But then some people add extra spaces:

Oleks, 25, true

And others don't:

Oleks,25,true
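A small Python sketch (illustrative only, not the DataFrame API) of why a fixed ", " separator does not cover both styles, while splitting on a bare comma and trimming does. The variable names are hypothetical.

```python
# Two rows holding the same data, written in the two styles above.
rows = ["Oleks, 25, true", "Oleks,25,true"]

# Using ", " as the separator only works for the first style:
# the second row is not split at all.
by_comma_space = [row.split(", ") for row in rows]

# Splitting on "," and trimming each value handles both styles.
by_comma_then_trim = [[value.strip() for value in row.split(",")] for row in rows]
```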

It can be even more troublesome with tabs, where an extra space can be invisible.

Users will then run into all kinds of problems. For example, ' 25' cannot be parsed as a number because of the leading space.

So I think it's better to trim whitespace characters from the left and right of each value when reading from a CSV, but only when the value is not quoted. If a file contains something like "Oleks", " 25 " then the client may want to keep the spaces inside the quotes.
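The rule proposed above could be sketched roughly as follows. This is a hand-written Python illustration under stated assumptions, not the DataFrameCsvReader implementation: the function name split_csv_line is hypothetical, it handles a single line only, it treats a doubled quote inside a quoted value as an escaped quote, and it ignores any stray characters between a closing quote and the next delimiter.

```python
def split_csv_line(line, delimiter=","):
    """Split one CSV line, trimming whitespace around unquoted values only."""
    fields = []
    i, n = 0, len(line)
    while True:
        # Skip spaces and tabs before the value (but never the delimiter itself,
        # so that a tab can still be used as the delimiter).
        while i < n and line[i] in " \t" and line[i] != delimiter:
            i += 1
        if i < n and line[i] == '"':
            # Quoted value: keep everything between the quotes verbatim.
            i += 1
            chars = []
            while i < n:
                if line[i] == '"':
                    if i + 1 < n and line[i + 1] == '"':
                        chars.append('"')  # doubled quote = escaped quote
                        i += 2
                        continue
                    i += 1  # closing quote
                    break
                chars.append(line[i])
                i += 1
            fields.append("".join(chars))
            # Move to the next delimiter (normally only spaces are skipped here).
            while i < n and line[i] != delimiter:
                i += 1
        else:
            # Unquoted value: read up to the delimiter, then trim both sides.
            start = i
            while i < n and line[i] != delimiter:
                i += 1
            fields.append(line[start:i].strip())
        if i >= n:
            break
        i += 1  # step over the delimiter
    return fields
```

With this sketch, both 'Oleks, 25, true' and 'Oleks,25,true' produce the same three values, while the quoted value in '"Oleks", " 25 "' keeps its surrounding spaces.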

olekscode avatar Jul 26 '21 13:07 olekscode