PapaParse
How to parse and unparse an array of strings that contain quotes and commas
I have an array of titles that I need to parse from a single column. These fields can contain quotes and/or commas. I'm not really sure what the best approach is.
When using unparse I have seen two results here depending on the input.
data;
// [ { name: 'Jim', titles: [ 'Jim "Jimson" Smith', 'Jim, The Great' ] } ]
let parsedWithoutJSON = Papa.unparse(data);
// 'name,titles\r\nJim,"Jim ""Jimson"" Smith,Jim, The Great"'
data[0].titles = JSON.stringify(data[0].titles);
data;
// [
// {
// name: 'Jim',
// titles: '["Jim \\"Jimson\\" Smith","Jim, The Great"]'
// }
// ]
let parsedWithJSON = Papa.unparse(data);
// 'name,titles\r\nJim,"[""Jim \\""Jimson\\"" Smith"",""Jim, The Great""]"'
However, when I try to parse the CSV string that unparse produced, parse returns something different from what I expected.
Papa.parse(parsedWithJSON)
{
data: [
[ 'name', 'titles' ],
[ 'Jim', '["Jim \\"Jimson\\" Smith","Jim, The Great"]' ]
],
errors: [],
...
}
// expected [ 'Jim', [ 'Jim "Jimson" Smith', 'Jim, The Great' ] ]
Papa.parse(parsedWithoutJSON)
{
data: [
[ 'name', 'titles' ],
[ 'Jim', 'Jim "Jimson" Smith,Jim, The Great' ]
],
errors: [],
...
}
// expected [ 'Jim', [ 'Jim "Jimson" Smith', 'Jim, The Great' ] ]
Is this possible or is there a better way to do this? I've tried playing around with different config objects but haven't really gotten anywhere.
Thanks
It is not clear from the docs, but PapaParse can only unparse JSON objects that are not nested.
Works
{
"Column 1": "foo",
"Column 2": "bar"
},
{
"Column 1": "foo2",
"Column 2": "bar2"
}
Does not work
{
"Column 1": "foo",
"Column 2": {
"Column 3": "bar"
}
}
{
"Column 1": "foo",
"Column 2": [ "nested"]
}
What you are seeing is that the array is converted to a string (the value of the titles column). Because that string also contains quotes, it must be quoted and the quotes inside it must be escaped.
Unparsing also does not work with nesting.
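One possible workaround, sketched here under the assumption that you know in advance which columns hold arrays: serialize those columns with JSON.stringify before calling unparse, and run JSON.parse on them after calling parse. The helper names `encodeColumns` and `decodeColumns` are hypothetical and not part of PapaParse.

```javascript
// Hypothetical helpers; they are not part of PapaParse.
// Serialize the listed columns to JSON strings so that every row is flat.
function encodeColumns(rows, columns) {
  return rows.map((row) => {
    const flat = { ...row };
    for (const col of columns) {
      flat[col] = JSON.stringify(row[col]);
    }
    return flat;
  });
}

// Reverse step: turn the JSON strings back into arrays or objects.
function decodeColumns(rows, columns) {
  return rows.map((row) => {
    const nested = { ...row };
    for (const col of columns) {
      nested[col] = JSON.parse(row[col]);
    }
    return nested;
  });
}

const data = [{ name: 'Jim', titles: ['Jim "Jimson" Smith', 'Jim, The Great'] }];

// Each row in flatRows now has only string values.
const flatRows = encodeColumns(data, ['titles']);

// Decoding the flat rows restores the original nested shape.
const restored = decodeColumns(flatRows, ['titles']);
```

In practice `flatRows` would go to `Papa.unparse(flatRows)`, and the `data` array returned by `Papa.parse(csv, { header: true })` would go to `decodeColumns`. The sketch assumes every listed column is present in every row; a missing value would make `JSON.parse` throw.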
@janisdd: What should I do if I have JSON sub-fields? Convert them?
It depends on the structure and application.
{
"Column 1": "foo",
"Column 2": {
"Column 3": "bar"
}
}
could be transformed to
{
"Column 1": "foo",
"Column 2.Column 3": "bar"
}
or something like this. But this probably does not work for more complicated structures.
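The transformation above could be done with a small recursive function. This is only a sketch: `flattenObject` is a hypothetical helper, not a PapaParse function, and it assumes that keys do not themselves contain dots.

```javascript
// Hypothetical helper: flattens nested plain objects into dotted keys.
// Arrays and primitive values are kept as they are.
function flattenObject(obj, prefix = '') {
  const out = {};
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    const isNestedObject =
      value !== null && typeof value === 'object' && !Array.isArray(value);
    if (isNestedObject) {
      // Recurse and merge the child's dotted keys into the result.
      Object.assign(out, flattenObject(value, path));
    } else {
      out[path] = value;
    }
  }
  return out;
}

const flat = flattenObject({
  'Column 1': 'foo',
  'Column 2': { 'Column 3': 'bar' },
});
```

Arrays are left in place on purpose, since there is no obvious column name for each element; they would still need separate handling, such as JSON.stringify, before unparsing.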
Maybe CSV is not the right choice here? Something like https://json-editor.github.io/json-editor/ (from https://github.com/json-editor/json-editor) might fit better.