PapaParse

[Feature Request] Add a way to retrieve the line number and original line from row/error objects

Open jsg2021 opened this issue 5 years ago • 4 comments

I have additional object validation, and I would like to report the line number of any row that fails it, even though the row is otherwise a valid CSV line. I also have to collect all the error lines and provide them in a report (preferably as-is) so the user can diagnose why each line was a problem. I'm currently using a step() function, relying on it being called for every row, and unparse() to regenerate the line. Ideally, I could move to the chunk() callback, which is much faster, and just inspect each row/error for those values.

I was thinking that, for rows that are returned as objects ({}), a non-enumerable meta property could be added with a value like {row: 1, rawText: ''}, where row is the line number in the file and rawText is the line including the line delimiter at the end. Error objects could just gain the rawText property.
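To make the proposal concrete, here is a minimal sketch of how a non-enumerable meta property behaves on a plain row object. The helper name attachMeta is hypothetical and nothing like it exists in PapaParse today; only Object.defineProperty is standard JavaScript.

```javascript
// Hypothetical helper: attaches a hidden `meta` property to a parsed row.
// Neither `attachMeta` nor this `meta` shape is part of PapaParse.
function attachMeta(row, lineNumber, rawText) {
  Object.defineProperty(row, "meta", {
    value: { row: lineNumber, rawText },
    enumerable: false, // hidden from Object.keys, for...in and JSON.stringify
    writable: false,
    configurable: false,
  });
  return row;
}

const row = attachMeta({ name: "Ada", age: "36" }, 2, "Ada,36\r\n");

console.log(Object.keys(row));                 // [ 'name', 'age' ]
console.log(JSON.stringify(row));              // {"name":"Ada","age":"36"}
console.log(row.meta.row);                     // 2
console.log(JSON.stringify(row.meta.rawText)); // "Ada,36\r\n"
```

Because the property is non-enumerable, existing code that iterates over row keys or serializes rows would see no difference.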

Thoughts?

jsg2021 avatar Dec 21 '18 20:12 jsg2021

Is it not possible to determine the line number from the row's position in the returned array?

pokoli avatar Dec 27 '18 10:12 pokoli

Not reliably, especially when dealing with errors, comments, etc.

jsg2021 avatar Dec 27 '18 21:12 jsg2021

This is legitimate. There are a lot of things we do in this library that can result in the original line numbers not being preserved.

That said, I think it's a pretty big change to the row output, and I don't think it's safe to add any type of arbitrary key that might conflict with a header being imported. In order to support this, we'd need to change the row output to something like:

{
  "meta": {},
  "data": {}|[]
}

We could add an enableMeta option or something that would cause this type of output, but it definitely might result in some end-user confusion with the payload difference.

Thoughts? @pokoli @jsg2021
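As a rough sketch of the wrapped shape proposed in this comment, the function below post-processes already-parsed rows. The names shapeRows and lineNumbers are hypothetical, and enableMeta is only the option name floated here, not an existing PapaParse setting.

```javascript
// Hypothetical post-processing step, not PapaParse code.
// `lineNumbers[i]` is assumed to hold the source line of `rows[i]`.
function shapeRows(rows, lineNumbers, options = {}) {
  if (!options.enableMeta) {
    return rows; // default: payload unchanged
  }
  return rows.map((data, i) => ({
    meta: { row: lineNumbers[i] },
    data, // an object when headers are enabled, an array otherwise
  }));
}

// A CSV column that happens to be called "meta" no longer collides.
const rows = [{ meta: "a column called meta", id: "1" }];
const wrapped = shapeRows(rows, [2], { enableMeta: true });

console.log(wrapped[0].meta.row);  // 2
console.log(wrapped[0].data.meta); // a column called meta
```

Keeping the parser's metadata and the imported data under separate keys is what avoids the header conflict, at the cost of a different payload for anyone who opts in.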

dboskovic avatar Jan 06 '19 21:01 dboskovic

Plus one for this.

Some additional reasons why the data / errors indices might not correspond to actual line numbers:

  • Lines being skipped due to being empty
  • beforeFirstChunk transformation (this one could be tricky)
  • Quoted fields containing an unescaped linebreak (see screenshots)
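The first and third cases can be illustrated without the library. The scanner below is a simplified stand-in, not PapaParse's parser: it assumes "\n" or "\r\n" line endings and double-quote quoting, and the name recordStartLines is made up for this sketch.

```javascript
// Illustrative scanner: records the physical line each CSV record starts on.
// A doubled quote ("") toggles the state twice, so it is handled implicitly.
function recordStartLines(text) {
  const starts = [];
  let line = 1;        // current physical line (1-based)
  let recordStart = 1; // line on which the current record began
  let inQuotes = false;
  let hasContent = false;

  for (const ch of text) {
    if (ch === '"') {
      inQuotes = !inQuotes;
      hasContent = true;
    } else if (ch === "\n") {
      line += 1;
      if (!inQuotes) {
        if (hasContent) starts.push(recordStart); // empty lines are skipped
        recordStart = line;
        hasContent = false;
      }
    } else if (ch !== "\r") {
      hasContent = true;
    }
  }
  if (hasContent) starts.push(recordStart);
  return starts;
}

// Line 2 holds a quoted field that spans two lines; line 4 is empty.
const csv = 'id,note\n1,"first\nsecond"\n\n2,plain\n';
console.log(recordStartLines(csv)); // [ 1, 2, 5 ]
```

The third record sits at array index 2 but starts on line 5 of the file, so an index-based guess would be off by two lines.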

[two screenshots, not preserved here]

flakey-bit avatar May 24 '22 20:05 flakey-bit