node-csv-parse
trim() doesn't remove all whitespace characters
Describe the bug
In the documentation for trim(), it's mentioned that the function will trim all whitespace around delimiters identically to \s:
The characters interpreted as whitespaces are identical to the \s meta character in regular expressions
However, in practice, and from inspecting the source of __isCharTrimable(), I can see that it only trims the 5 example characters from the docs, rather than the full list of characters that \s actually matches. According to the MDN JavaScript docs, and in practice (see the Node.js example below), some whitespace characters should be matched and removed, but aren't.
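To make the gap concrete, here is a minimal sketch that lists the code points matched by \s but missing from a five-character set. The set below (space, tab, line feed, carriage return, form feed) is my assumption about which five characters are meant, not something copied from the library, and findUntrimmedWhitespace is a hypothetical helper name.

```javascript
// ASSUMPTION: the five characters the parser trims. Adjust this set if the
// library's actual list differs; it is not copied from csv-parse.
const assumedTrimmable = new Set([' ', '\t', '\n', '\r', '\f']);

// Hypothetical helper: returns every BMP code point that /\s/ matches
// but that is absent from the assumed set, formatted as U+XXXX.
function findUntrimmedWhitespace() {
  const missed = [];
  for (let cp = 0; cp <= 0xFFFF; cp++) {
    const ch = String.fromCharCode(cp);
    if (/\s/.test(ch) && !assumedTrimmable.has(ch)) {
      missed.push('U+' + cp.toString(16).toUpperCase().padStart(4, '0'));
    }
  }
  return missed;
}

console.log(findUntrimmedWhitespace());
```

Under that assumption, the printed list includes both U+3000 and U+200A, the two characters used in the reproduction below.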
To Reproduce
This example shows that whitespace characters not listed in the trim() docs aren't removed from the values by the parser, but are removed when using the regex \s.
const csvParse = require('csv-parse');

/**
 * First whitespace is 'IDEOGRAPHIC SPACE' (U+3000)
 * Second whitespace is 'HAIR SPACE' (U+200A)
 * Both are written as escapes so they survive copy and paste.
 */
const input = `ID,First,Last
1234,\u3000John,Smith
5678,Jane,\u200ADoe`;

csvParse(input, {
  columns: true,
  trim: true
}, (err, res) => {
  if (err) { throw err; }
  // Records as parsed with trim: true
  console.log(res);
  // Same records after stripping leading/trailing \s matches by hand
  const trimmed = res.map((r) => {
    return {
      ...r,
      First: r.First.replace(/^\s+|\s+$/g, ''),
      Last: r.Last.replace(/^\s+|\s+$/g, '')
    };
  });
  console.log(trimmed);
});
Additional context
This isn't a case where the code is necessarily broken or incorrect, as it's correctly matching the 5 whitespace characters mentioned in the docs. However, the line in the docs, "identical to the \s meta character in regular expressions", incorrectly describes what the output will be.
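For anyone hitting this in the meantime, a possible workaround is to trim each record after parsing. The following is only a sketch: trimRecord is a hypothetical helper, not part of csv-parse, and the sample records are hand-written stand-ins for parser output. It relies on String.prototype.trim(), which strips the same whitespace and line-terminator characters that \s matches.

```javascript
// Hypothetical helper (not part of csv-parse): returns a copy of the record
// with leading/trailing whitespace removed from every string field.
function trimRecord(record) {
  const out = {};
  for (const [key, value] of Object.entries(record)) {
    out[key] = typeof value === 'string' ? value.trim() : value;
  }
  return out;
}

// Hand-written stand-ins for what the parser returns with columns: true.
const records = [
  { ID: '1234', First: '\u3000John', Last: 'Smith' },
  { ID: '5678', First: 'Jane', Last: '\u200ADoe' }
];

const cleaned = records.map(trimRecord);
console.log(cleaned);
```

Inside the parse callback this would be applied as res.map(trimRecord), which covers every column instead of naming First and Last individually.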
This was affecting me as well. I resolved it with my own trim, but then found this issue.