node-csv-parse

column headers do not appear to be used if from_line > 1

Open phillip-odam opened this issue 3 years ago • 4 comments

When from_line is greater than 1 and columns is set to true, the column headers aren't used as keys in the objects returned by the read method.

Here's the test script I used to look into the issue. Are there more properties that need to be passed to parse so that each returned record includes the column headers?

"use strict";

const parse = require("csv-parse");
const fs = require("fs");

var csvParser = parse({
  "delimiter": ",",
  "quote": "\"",
  "record_delimiter": "\n",
  "from_line": 2,
  "columns": true
});

var initialExecution = true;

csvParser.on("readable", function () {
  if (initialExecution) {
    initialExecution = false;
    var record;
    // Only the first record is needed to inspect its keys.
    while (record = csvParser.read()) {
      console.debug({ record: record });
      console.debug(Object.keys(record));
      break;
    }
  }
});

csvParser.on("error", function (err) {
  console.error({ error: err });
});

var readStream = fs.createReadStream("**SOME-CSV**");

readStream.on("open", function () {
  readStream.pipe(csvParser);
});

readStream.on("error", function (err) {
  console.error({ error: err });
});

If from_line in the script above is changed from 2 to 1, everything works as desired, but if it is 2 or greater, the columns aren't autodetected from the header line. Is the idea that the skipped lines aren't even considered valid CSV, i.e. this parser can be used to pull a CSV out of a larger file that isn't itself a CSV?

phillip-odam • Jun 11 '21 17:06

> Is the idea that the skipped lines aren't even being considered as valid CSV ie. this parser can be used to pull out a CSV from a greater file that isn't a CSV?

To be honest, I don't know if this is a feature or a bug. Some people expect the CSV to start at a specific line (the current behavior); others expect a valid CSV with a header at the beginning, and want to start parsing records from a specific line (your case).

I am afraid that by fixing this issue as you suggest, we might break some code. On the other hand, your request is valid. The solution is probably to introduce a new option, something like header_in_first_line, or something even more flexible like header_at_line. If anyone comes across this exchange and has an opinion, please share it with us.

wdavidw • Jun 11 '21 21:06
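
To make the two readings above concrete, here is a minimal, dependency-free sketch. It does not use csv-parse: `parseSimple` is a toy line splitter with no quote handling, and `headerAtLine` is only an illustration of the option floated above, which does not exist in the library. The sample input is made up.

```javascript
"use strict";

// Toy parser: splits on "\n" and ",", with no quote or escape handling.
// `headerAtLine` is hypothetical; it is NOT a csv-parse option.
function parseSimple(text, { fromLine = 1, headerAtLine } = {}) {
  // Drop a single trailing newline so it does not produce an empty last line.
  const lines = text.replace(/\n$/, "").split("\n");
  // Line numbers are 1-based, like from_line.
  // Without headerAtLine, the line at fromLine supplies the keys.
  const headerLine = headerAtLine === undefined ? fromLine : headerAtLine;
  const firstRecordLine = Math.max(fromLine, headerLine + 1);
  const keys = lines[headerLine - 1].split(",");
  return lines.slice(firstRecordLine - 1).map((line) => {
    const values = line.split(",");
    return Object.fromEntries(keys.map((key, i) => [key, values[i]]));
  });
}

const input = "id,name\n1,alice\n2,bob\n3,carol\n";

// Reading 1 (the behaviour described in this thread): the CSV is taken
// to start at line 2, so line 2 becomes the header.
const current = parseSimple(input, { fromLine: 2 });

// Reading 2 (the request): keys come from line 1, records start at line 3.
const proposed = parseSimple(input, { fromLine: 3, headerAtLine: 1 });

console.log(current);
console.log(proposed);
```

Under the first reading the keys are the values of line 2 ("1" and "alice"); under the second they are "id" and "name".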

I tend to say, if it is not tested, it is not a feature. Well, it is tested so it is officially supported and we shall stick to it.

wdavidw • Jun 11 '21 21:06

It is also documented: https://csv.js.org/parse/options/from_line/#simple-example-with-inferred-column-names

wdavidw • Jun 11 '21 21:06
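
Since the current behavior is tested and documented, one possible workaround for the original use case is to do the keying after parsing. The sketch below is only an illustration: it assumes the parser was run without columns and without from_line, so each record arrives as a plain array, and `recordsFromLine` is a hypothetical helper, not part of csv-parse.

```javascript
"use strict";

// Hypothetical helper, not part of csv-parse.
// `rows` is an array of arrays whose first entry is the header line.
// `fromLine` is 1-based and counts the header as line 1.
function recordsFromLine(rows, fromLine) {
  const [header, ...body] = rows;
  // Line 1 is the header, so line N of the file sits at index N - 2 of `body`.
  const start = Math.max(fromLine - 2, 0);
  return body.slice(start).map((row) =>
    Object.fromEntries(header.map((key, i) => [key, row[i]]))
  );
}

// Made-up sample data standing in for parsed output.
const rows = [
  ["id", "name"],
  ["1", "alice"],
  ["2", "bob"],
  ["3", "carol"],
];

// Keep the header from line 1 but only emit records from line 3 onwards.
const fromThirdLine = recordsFromLine(rows, 3);
console.log(fromThirdLine);
```

This keeps every row in memory before slicing, so it suits small files better than large streams.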

Thanks for the prompt feedback... was wondering at the time whether to raise as bug vs feature, almost want a third "I have a question" :)

phillip-odam • Jun 12 '21 00:06