PapaParse
Skip lines feature (Feature Suggestion)
It would be great to have a "skip N lines" option, since CSV files often contain uncommented header lines. Can this be implemented? For example:
This is a data file generated by some old software.
Next line will contain a headers of parameters.
Temperature, Humidity, Voltage
22.5, 45.5, 220
23.0, 44.0, 219
...
In such a case, "skipLines: 2" could be added to the config.
Thanks!
Yes, it makes sense to me. It would be great if you could provide a PR adding support for it. Please also include a test case to ensure the behaviour works correctly.
My English is not good enough for me to be sure that I understood you correctly. I'm not a public person and my PR is limited to my friends only. What did you mean by "include a test case"? Fork PapaParse and add an automated test case? I have no git tools at hand right now, but, just to be sure I'm going in the right direction, should I add the following code to tests/test-cases.js and suggest a patch?
// DamirKh
{
  description: "Skipped line at beginning",
  input: 'Skipped_line\na,b,c',
  config: { skipLines: 1 },
  expected: {
    data: [['a', 'b', 'c']],
    errors: []
  }
},
{
  description: "Two skipped lines at beginning",
  input: 'Skipped_line1\nSkipped_line2\na,b,c',
  config: { skipLines: 2 },
  expected: {
    data: [['a', 'b', 'c']],
    errors: []
  }
},
{
  description: "Skipped line at beginning followed by comment line",
  input: 'Skipped_line\n# Comment\na,b,c',
  config: {
    skipLines: 1,
    comments: true
  },
  expected: {
    data: [['a', 'b', 'c']],
    errors: []
  }
},
Yes, you should include this kind of test in tests/test-cases.js.
I would only suggest adding two of the three suggested tests, as the first and the second are testing the same thing; only the value used changes.
Why not just parse everything and then remove the first N lines of the result?
Here is a one-line solution:
var result = Papa.parse('the csv data');
var N = 2;
result.data.splice(0, N); // splice(0, N) removes the first N rows in place; pop() would remove the last ones
// result.data now has the first N=2 rows removed
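A variant of the same idea is to trim the raw text before it reaches the parser, so that the first remaining line can still be used as the column names. This is only a minimal sketch: `dropLeadingLines` is a hypothetical helper, not part of PapaParse, and it assumes the whole file is already in memory as a string and that the skipped lines contain no quoted fields with embedded newlines.

```javascript
// Hypothetical helper (not part of PapaParse): remove the first n lines
// from raw CSV text. Recognises \r\n, \n and \r line endings.
function dropLeadingLines(text, n) {
  var pos = 0;
  for (var i = 0; i < n; i++) {
    var m = /\r\n|\n|\r/.exec(text.slice(pos));
    if (!m) return ''; // fewer than n line breaks: nothing is left to parse
    pos += m.index + m[0].length;
  }
  return text.slice(pos);
}

var raw =
  'This is a data file generated by some old software.\n' +
  'Next line will contain a headers of parameters.\n' +
  'Temperature,Humidity,Voltage\n' +
  '22.5,45.5,220\n' +
  '23.0,44.0,219';

var cleaned = dropLeadingLines(raw, 2);
// cleaned now starts at the "Temperature,Humidity,Voltage" line and could be
// handed to the parser, e.g. Papa.parse(cleaned, { header: true })
```

Because the lines are removed before parsing, the result does not need to be patched afterwards. It does not help when the file is streamed, since the input is then never available as a single string.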
@jscheid, 1) To skip a "header" of N lines. 2) To parse correctly with the "header" option. 3) For big files with streaming.
Right now I have a large file whose first 4 rows are a "header" (1 row of column names, 3 rows of properties about the columns), followed by the data.
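For the streaming case, one possible stopgap is to count rows inside the per-row callback and ignore the first N. The sketch below is an assumption-laden illustration, not the proposed feature: `skipFirstRows` is a hypothetical helper, and it assumes the `step` callback is called once per row with that row in `results.data`. It skips parsed rows rather than raw lines, so it does not make the "header" option pick up the right line.

```javascript
// Hypothetical helper (not part of PapaParse): wrap a row callback so that
// the first n rows it receives are ignored.
function skipFirstRows(n, onRow) {
  var seen = 0;
  return function (results, parser) {
    seen += 1;
    if (seen <= n) return; // still inside the leading block of rows
    onRow(results.data, parser);
  };
}

// Simulate a parser invoking the callback once per row.
var kept = [];
var step = skipFirstRows(4, function (row) { kept.push(row); });
[
  ['Temperature', 'Humidity', 'Voltage'],
  ['prop 1'], ['prop 2'], ['prop 3'],
  ['22.5', '45.5', '220']
].forEach(function (row) {
  step({ data: row, errors: [], meta: {} }, null);
});
// kept now contains only the data row.
// With PapaParse the wrapper would be passed as: Papa.parse(file, { step: step })
```

The counter lives in a closure, so each call to `skipFirstRows` yields an independent callback that can be used for one parse run.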
Will it be merged & published?
@Lonli-Lokli No, I won't merge the PR unless it is green and updated to the latest version, so it needs some work before it can be merged.
Implemented in #1021