How to parse and unparse an array of strings that contain quotes and commas

Open kaedub opened this issue 5 years ago • 3 comments

I have an array of titles that I need to parse from a single column. These fields can contain quotes and/or commas. I'm not really sure what the best approach is.

When using unparse I have seen two results here depending on the input.

data;
//  [ { name: 'Jim', titles: [ 'Jim "Jimson" Smith', 'Jim, The Great' ] } ]

let parsedWithoutJSON = Papa.unparse(data);
//  'name,titles\r\nJim,"Jim ""Jimson"" Smith,Jim, The Great"'

data[0].titles = JSON.stringify(data[0].titles);
data;
// [
//   {
//     name: 'Jim',
//     titles: '["Jim \\"Jimson\\" Smith","Jim, The Great"]'
//   }
// ]

let parsedWithJSON = Papa.unparse(data);
// 'name,titles\r\nJim,"[""Jim \\""Jimson\\"" Smith"",""Jim, The Great""]"'

However, when I parse the CSV string that unparse produced, parse returns something different from what I expected.

Papa.parse(parsedWithJSON)
{
  data: [
    [ 'name', 'titles' ],
    [ 'Jim', '["Jim \\"Jimson\\" Smith","Jim, The Great"]' ]
  ],
  errors: [],
  ...
}
// expected [ 'Jim', [ 'Jim "Jimson" Smith', 'Jim, The Great' ] ]

Papa.parse(parsedWithoutJSON)
{
  data: [
    [ 'name', 'titles' ],
    [ 'Jim', 'Jim "Jimson" Smith,Jim, The Great' ]
  ],
  errors: [],
  ...
}
// expected [ 'Jim', [ 'Jim "Jimson" Smith', 'Jim, The Great' ] ]

Is this possible or is there a better way to do this? I've tried playing around with different config objects but haven't really gotten anywhere.

Thanks

kaedub avatar Aug 06 '20 18:08 kaedub

It is not clear from the docs, but PapaParse can only unparse JSON objects that are not nested.

Works

{
  "Column 1": "foo",
  "Column 2": "bar"
},
{
  "Column 1": "foo2",
  "Column 2": "bar2"
}

Does not work

{
  "Column 1": "foo",
  "Column 2": {
     "Column 3": "bar"
  }
},
{
  "Column 1": "foo",
  "Column 2": [ "nested"]
}

What you are seeing is that the array is converted to a string (the value of the titles column). Because that string also contains quotes, the field must be quoted and the quotes inside it must be escaped.

Parsing does not restore nesting either: each field comes back as a plain string.
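
One possible workaround, sketched here under assumptions: serialize the nested column to JSON text before calling unparse, and decode it again after calling parse. The helpers `encodeNested` and `decodeNested` are hypothetical names and not part of PapaParse; the PapaParse calls are shown only in comments so the snippet runs on its own.

```javascript
// Hypothetical helpers (not part of PapaParse).
// encodeNested: turn the listed fields into JSON text so each one is a flat string.
function encodeNested(rows, fields) {
  return rows.map((row) => {
    const copy = { ...row };
    for (const field of fields) {
      if (field in copy) copy[field] = JSON.stringify(copy[field]);
    }
    return copy;
  });
}

// decodeNested: reverse step, to be applied to the rows that parse returns.
function decodeNested(rows, fields) {
  return rows.map((row) => {
    const copy = { ...row };
    for (const field of fields) {
      if (field in copy) copy[field] = JSON.parse(copy[field]);
    }
    return copy;
  });
}

const data = [{ name: 'Jim', titles: ['Jim "Jimson" Smith', 'Jim, The Great'] }];

const flat = encodeNested(data, ['titles']);
// With PapaParse loaded, the CSV round trip would look like:
//   const csv = Papa.unparse(flat);
//   const rows = Papa.parse(csv, { header: true }).data;
// `flat` stands in for `rows` here, because parse hands back the same JSON string
// (see the parsedWithJSON output in the question).
const restored = decodeNested(flat, ['titles']);
```

The decoding has to be done by the caller because the CSV itself carries no type information. PapaParse's `transform` parse option may be another place to hook in the `JSON.parse` step, but that is untested here.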

janisdd avatar Jun 12 '21 17:06 janisdd

@janisdd : What should I do if I have JSON sub-fields? Convert them?

pedrobbarbosa avatar Nov 21 '21 03:11 pedrobbarbosa

It depends on the structure and application.

{
  "Column 1": "foo",
  "Column 2": {
     "Column 3": "bar"
  }
}

could be transformed to

{
  "Column 1": "foo",
  "Column 2.Column 3": "bar"
}

or something similar, but this probably does not work for more complicated structures.
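
A minimal sketch of that transformation, assuming nested values are plain objects and that a dot is an acceptable key separator. `flattenRow` is a hypothetical helper, not a PapaParse function. Arrays are left untouched, so they would still need separate handling (for example JSON encoding) before unparsing.

```javascript
// Hypothetical helper: flatten nested plain objects into dotted keys.
// Arrays and primitive values are copied through unchanged.
function flattenRow(value, prefix = '', out = {}) {
  for (const [key, child] of Object.entries(value)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (child !== null && typeof child === 'object' && !Array.isArray(child)) {
      flattenRow(child, path, out); // recurse into nested objects
    } else {
      out[path] = child;
    }
  }
  return out;
}

const flatRow = flattenRow({
  'Column 1': 'foo',
  'Column 2': { 'Column 3': 'bar' },
});
// flatRow is { 'Column 1': 'foo', 'Column 2.Column 3': 'bar' }
```

Note that keys which already contain a dot become ambiguous with this scheme, and empty nested objects disappear from the output.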

Maybe CSV is not the right choice here? A JSON-based tool such as https://json-editor.github.io/json-editor/ (source: https://github.com/json-editor/json-editor) might fit better.

janisdd avatar Nov 21 '21 09:11 janisdd