Skip lines feature (Feature Suggestion)

Open DamirKh opened this issue 5 years ago • 7 comments

It would be great to have a "skip N lines" option, since CSV files often contain uncommented header lines. Can this be implemented? For example:

This is a data file generated by some old software.
Next line will contain a headers of parameters.
Temperature, Humidity, Voltage
22.5, 45.5, 220
23.0, 44.0, 219
...

In such a case "skipLines: 2" could be added to the config.

Thanks!
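
Until such an option exists, the same effect can be approximated by trimming the text before it reaches the parser. The following is only a minimal sketch: `skipLines` here is a hypothetical standalone helper, not a PapaParse function or option, and it assumes the whole file is already in memory as a string with `\n` or `\r\n` line endings.

```javascript
// Hypothetical helper (not part of PapaParse): drop the first `n` lines of `text`.
function skipLines(text, n) {
  let start = 0;
  for (let i = 0; i < n; i++) {
    const idx = text.indexOf('\n', start);
    if (idx === -1) {
      // Fewer than n complete lines: nothing is left to parse.
      return '';
    }
    start = idx + 1; // continue right after the line break
  }
  return text.slice(start);
}

// Sample file from the request above: two free-text lines, then the CSV.
const raw = [
  'This is a data file generated by some old software.',
  'Next line will contain a headers of parameters.',
  'Temperature, Humidity, Voltage',
  '22.5, 45.5, 220',
  '23.0, 44.0, 219',
].join('\n');

const csv = skipLines(raw, 2);
console.log(csv); // first printed line is the "Temperature, Humidity, Voltage" row
```

The trimmed string can then be handed to `Papa.parse(csv, { header: true })`, so that the first remaining row is treated as the column names.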

DamirKh avatar Nov 14 '19 13:11 DamirKh

Yes, it makes sense to me. It would be great if you could provide a PR adding support for it. Please also include a test case to ensure the behaviour works correctly.

pokoli avatar Nov 14 '19 14:11 pokoli

My English is not good enough for me to be sure that I understood you correctly. I'm not a public person and my PR is limited to my friends only. What did you mean by "include a test case"? Fork PapaParse and add an automatic test case? I don't have any git tools at hand right now, but, just to be sure I'm going in the right direction, should I add the following code to tests/test-cases.js and suggest a patch?

	// DamirKh
	{
		description: "Skipped line at beginning",
		input: 'Skipped_line\na,b,c',
		config: { skipLines: 1 },
		expected: {
			data: [['a', 'b', 'c']],
			errors: []
		}
	},
	{
		description: "Two skipped lines at beginning",
		input: 'Skipped_line1\nSkipped_line2\na,b,c',
		config: { skipLines: 2 },
		expected: {
			data: [['a', 'b', 'c']],
			errors: []
		}
	},
	{
		description: "Skipped line at beginning followed by comment line",
		input: 'Skipped_line\n# Comment\na,b,c',
		config: {
			skipLines: 1,
			comments: true
		},
		expected: {
			data: [['a', 'b', 'c']],
			errors: []
		}
	},

DamirKh avatar Nov 15 '19 05:11 DamirKh

Yes, you should include this kind of test in tests/test-cases.js.

I would only suggest adding two of the three suggested tests, as the first and the second test the same behaviour; only the value used changes.

pokoli avatar Nov 15 '19 08:11 pokoli

Why not just parse everything and then remove the first N rows of the result?

Here is a one-line solution

var result = Papa.parse('the csv data');
var N = 2;

// splice removes rows from the start of the array (pop would remove them from the end)
result.data.splice(0, N);

// result.data now has the first N=2 rows removed

janisdd avatar Jun 12 '21 17:06 janisdd

@jscheid, 1) To skip a "header" (N lines). 2) To parse correctly with the "header" option. 3) For big files with streaming.

Right now I have a large file whose first 4 rows are a "header" (1 row of column names, 3 rows of column properties), followed by the data.
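
For the streaming case, removing rows after parsing is not enough, because the leading lines have to be dropped before the parser sees them and they may be split across chunk boundaries. A rough sketch of that idea follows; `createLineSkipper` is a hypothetical helper, not a PapaParse API, and it assumes text chunks with `\n` line endings (a `\r\n` pair still works because the `\r` is discarded with the skipped line).

```javascript
// Hypothetical helper (not part of PapaParse): returns a function that
// drops the first `n` lines from a sequence of text chunks.
function createLineSkipper(n) {
  let remaining = n; // lines that still have to be dropped
  return function skip(chunk) {
    let start = 0;
    while (remaining > 0) {
      const idx = chunk.indexOf('\n', start);
      if (idx === -1) {
        // The line being skipped continues in the next chunk: drop the rest.
        return '';
      }
      start = idx + 1;
      remaining -= 1;
    }
    return chunk.slice(start);
  };
}

// Two leading lines to skip, with the second one split across chunks.
const chunks = ['meta1\nme', 'ta2\nname,value\n', 'a,1\n'];
const skip = createLineSkipper(2);
const cleaned = chunks.map(skip).join('');
console.log(JSON.stringify(cleaned)); // "name,value\na,1\n"
```

Each cleaned chunk could then be forwarded to whatever consumes the stream. Skipping rows that sit between the column-name row and the data, as in the 4-row header described above, would need an additional step that this sketch does not cover.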

websharik avatar Jun 23 '21 15:06 websharik

Will it be merged & published?

Lonli-Lokli avatar May 17 '22 09:05 Lonli-Lokli

@Lonli-Lokli No, I won't merge the PR unless it is green and updated to the latest version. So it needs some work before it can be merged.

pokoli avatar May 17 '22 09:05 pokoli

Implemented in #1021

pokoli avatar Oct 09 '23 14:10 pokoli