node-csv-stream
Added support for multiple line endings in a single file
Hey!
I had a use case where a user uploads a CSV file, which I then parse and convert to JSON. Since a user might submit either a Windows CSV (\r\n line endings) or a *nix CSV (\n line endings), I needed to allow for more than one possibility when determining the value of this.endLine.
Therefore I introduced a bit of code to allow this: endLine can now be an array. To maintain backward compatibility, passing an array is not mandatory; a single string still works. I found this useful and thought perhaps it should be included.
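As a rough illustration of the idea (a minimal sketch, not the library's actual code; `normalizeEndLine` and `splitLines` are hypothetical names), accepting `endLine` as either a string or an array might look like this:

```javascript
// Hypothetical sketch: treat `endLine` as either a single string or an
// array of possible line endings, for backward compatibility.
function normalizeEndLine(endLine) {
  // Wrap a single string in an array so downstream code handles
  // both cases uniformly.
  return Array.isArray(endLine) ? endLine : [endLine];
}

function splitLines(text, endLine) {
  const endings = normalizeEndLine(endLine);
  const lines = [];
  let current = '';
  for (const ch of text) {
    if (endings.indexOf(ch) !== -1) {
      // Skipping empty segments means \r\n (two characters) does not
      // produce a spurious empty line.
      if (current.length > 0) lines.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  if (current.length > 0) lines.push(current);
  return lines;
}

// Mixed endings in one input are handled the same way.
console.log(splitLines('a,b\nc,d\re,f', ['\n', '\r']));
```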
Cheers
Hi @Lukazar,
Sorry for the delay. I had a quick look, and it seems that the change makes the parsing really slow compared to before.
before:
lbdremy@precision:~/workspace/nodejs/csv-stream$ node bench/parse.js
232 ms
43107.59482758621 bytes/ms
226 ms
44252.04424778761 bytes/ms
226 ms
44252.04424778761 bytes/ms
227 ms
44057.101321585906 bytes/ms
226 ms
44252.04424778761 bytes/ms
225 ms
44448.72 bytes/ms
227 ms
44057.101321585906 bytes/ms
229 ms
43672.323144104805 bytes/ms
226 ms
after:
lbdremy@precision:~/workspace/nodejs/csv-stream$ node bench/parse.js
1364 ms
7332.08357771261 bytes/ms
1363 ms
7337.462949376376 bytes/ms
1362 ms
7342.850220264318 bytes/ms
1363 ms
7337.462949376376 bytes/ms
1365 ms
7326.712087912088 bytes/ms
1364 ms
7332.08357771261 bytes/ms
1361 ms
I need to spend more time on it to figure out how to fix it, but in the meantime, if you want to have fun with it, you can have a look at what I did the first time to track down performance issues in this gist: https://gist.github.com/lbdremy/3805481. It might help.
Ah, right you are! It is almost assuredly the Array functions. I bet switching the lookup around line 82 to a hash will get some of that time back. I'll play around with it at some point. Thanks for the perf dump; that will help!
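The hash idea above could be sketched roughly like this (illustrative only; the variable names are made up, and this isn't the library's code). The point is that `Array.prototype.indexOf` scans the array on every per-character check, while a plain-object property lookup is effectively constant time:

```javascript
// Hypothetical comparison: per-character membership test via Array#indexOf
// versus a plain-object "hash" lookup.
const endings = ['\n', '\r'];

// Build an object keyed by the ending characters once, up front.
const endingsHash = {};
for (const e of endings) endingsHash[e] = true;

function isEndingArray(ch) {
  // Linear scan of the array on every call.
  return endings.indexOf(ch) !== -1;
}

function isEndingHash(ch) {
  // Single property access instead of a scan.
  return endingsHash[ch] === true;
}

// Both return the same answers; only the lookup cost differs.
const sample = 'a,b,c\nd,e,f\r';
let countArray = 0, countHash = 0;
for (const ch of sample) {
  if (isEndingArray(ch)) countArray++;
  if (isEndingHash(ch)) countHash++;
}
console.log(countArray, countHash);
```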
Cheers
I feel like a necromancer reviving this thread, but will it ever be implemented?