PapaParse

Data is surrounded by Quotes

Open FirstVertex opened this issue 5 years ago • 4 comments

I'm not sure what I'm doing wrong, but the parsed fields come out still enclosed in quotes.

In my input file (see below), as you can see, all the strings are enclosed in quotes.

Here's my code:

  readCSV(file: File) {
    Papa.parse(file, {
      header: true,
      complete(results) {
        console.log(results.data);
      }
    });
  }

Here are the records that are output:

[{Name: "Adam Donachie", " "Team"": " "BAL"", " "Position"": " "Catcher"", " "Height(inches)"": " 74", " "Weight(lbs)"": " 180", …}
{Name: "Paul Bako", " "Team"": " "BAL"", " "Position"": " "Catcher"", " "Height(inches)"": " 74", " "Weight(lbs)"": " 215", …}
{Name: "Ramon Hernandez", " "Team"": " "BAL"", " "Position"": " "Catcher"", " "Height(inches)"": " 72", " "Weight(lbs)"": " 210", …}
{Name: "Kevin Millar", " "Team"": " "BAL"", " "Position"": " "First Baseman"", " "Height(inches)"": " 72", " "Weight(lbs)"": " 210", …}
{Name: "Chris Gomez", " "Team"": " "BAL"", " "Position"": " "First Baseman"", " "Height(inches)"": " 73", " "Weight(lbs)"": " 188", …}
{Name: "Brian Roberts", " "Team"": " "BAL"", " "Position"": " "Second Baseman"", " "Height(inches)"": " 69", " "Weight(lbs)"": " 176", …}
{Name: "Miguel Tejada", " "Team"": " "BAL"", " "Position"": " "Shortstop"", " "Height(inches)"": " 69", " "Weight(lbs)"": " 209", …}
{Name: "Melvin Mora", " "Team"": " "BAL"", " "Position"": " "Third Baseman"", " "Height(inches)"": " 71", " "Weight(lbs)"": " 200", …}
{Name: "Aubrey Huff", " "Team"": " "BAL"", " "Position"": " "Third Baseman"", " "Height(inches)"": " 76", " "Weight(lbs)"": " 231", …}
{Name: "Adam Stern", " "Team"": " "BAL"", " "Position"": " "Outfielder"", " "Height(inches)"": " 71", " "Weight(lbs)"": " 180", …}]

Note that the Name column is correct. All the other string columns, and their header keys, still have their quotes and a leading space.

Here's the input file:

"Name", "Team", "Position", "Height(inches)", "Weight(lbs)", "Age"
"Adam Donachie", "BAL", "Catcher", 74, 180, 22.99
"Paul Bako", "BAL", "Catcher", 74, 215, 34.69
"Ramon Hernandez", "BAL", "Catcher", 72, 210, 30.78
"Kevin Millar", "BAL", "First Baseman", 72, 210, 35.43
"Chris Gomez", "BAL", "First Baseman", 73, 188, 35.71
"Brian Roberts", "BAL", "Second Baseman", 69, 176, 29.39
"Miguel Tejada", "BAL", "Shortstop", 69, 209, 30.77
"Melvin Mora", "BAL", "Third Baseman", 71, 200, 35.07
"Aubrey Huff", "BAL", "Third Baseman", 76, 231, 30.19
"Adam Stern", "BAL", "Outfielder", 71, 180, 27.05

FirstVertex avatar Oct 30 '19 20:10 FirstVertex

I just discovered this bug, too. Those are probably not valid CSV files (with spaces after the comma), but here's a workaround. I use a call like this:

      const data = Papa.parse(text, {
        transform: datum => {
          // Hack: Papa doesn't remove quotes on fields with leading spaces.
          datum = datum.trim()
          // Strip the outer quotes only if the whole trimmed field is quoted.
          if (datum.length >= 2 && datum[0] === '"' && datum[datum.length - 1] === '"') {
            datum = datum.slice(1, -1)
          }
          return datum
        }
      }).data

cpryland avatar Nov 01 '19 12:11 cpryland

It'd be great if there were an option to handle this case, or if it were handled automatically (there's really no other interpretation one can make).

My only fear with this workaround is that quoted quotes (doubled) won't be handled properly.
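For the doubled-quote concern, here is a minimal sketch of a field cleaner that also collapses `""` into `"` after stripping the outer quotes. The name `unquoteField` is hypothetical and not part of PapaParse; the sketch assumes the parser hands the field over as a raw string such as ` "BAL"`.

```typescript
// Hypothetical helper (not part of PapaParse): undo the quoting that the
// parser left in place because the field started with whitespace.
function unquoteField(raw: string): string {
  const trimmed = raw.trim();
  const isQuoted =
    trimmed.length >= 2 && trimmed.startsWith('"') && trimmed.endsWith('"');
  if (!isQuoted) {
    // Unquoted fields (e.g. " 74") are only trimmed.
    return trimmed;
  }
  // Drop the outer quotes, then collapse doubled quotes ("") into one (").
  return trimmed.slice(1, -1).replace(/""/g, '"');
}
```

A function like this could be passed as `transform`, and probably as `transformHeader` too when `header: true` is set, since the header keys in the output above carry the same stray quotes. It cannot repair a quoted field that contains the delimiter: by the time the callback runs, the parser has already split that field at the comma.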

cpryland avatar Nov 01 '19 12:11 cpryland

This is the same issue as https://github.com/mholt/PapaParse/issues/330, which was closed without a fix. I agree this is a common enough use case that having a configuration option to trim whitespace around delimiters would be worthwhile - especially if each value is quoted.
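Until such an option exists, one possible stopgap is to clean the text before parsing. The sketch below is a hypothetical pre-processing step, not a PapaParse feature: it drops spaces and tabs that directly follow a delimiter, but only outside quoted fields. It assumes a single-character delimiter and that the input is already a string (a `File` would have to be read into text first).

```typescript
// Hypothetical pre-processing step (not a PapaParse option): remove spaces and
// tabs that directly follow a delimiter, but only outside quoted fields.
function stripSpaceAfterDelimiter(text: string, delimiter = ","): string {
  let out = "";
  let inQuotes = false;
  let skipping = false; // true right after a delimiter, while dropping blanks
  for (const ch of text) {
    if (skipping && (ch === " " || ch === "\t")) {
      continue; // padding between the delimiter and the next field
    }
    skipping = false;
    if (ch === '"') {
      // A doubled quote ("") toggles twice, so the state stays correct.
      inQuotes = !inQuotes;
    } else if (ch === delimiter && !inQuotes) {
      skipping = true;
    }
    out += ch;
  }
  return out;
}
```

Because the cleanup happens before parsing, quoted fields that contain commas stay intact. Whitespace before a delimiter (as in `"a" ,`) is not handled by this sketch.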

drc-nloftsgard avatar Mar 17 '22 17:03 drc-nloftsgard