line number in problems not correct after commented rows.

Open jimhester opened this issue 4 years ago • 1 comments

Because we aren't keeping track of how many lines were skipped.

jimhester, Jan 26 '21 16:01

I think I hit this too, but it plays into a slightly larger issue which makes interpreting the output of problems more difficult.

For example:


library(tidyverse)
library(vroom)
tibble(
  A=c(1,"two"),
  B=c(1,2)
) %>%
  write_csv("import_bug.csv")

vroom(
  "import_bug.csv", 
  col_types=cols(A=col_double()))-> data

problems(data)

data%>%slice(3)

data%>%slice(2)

Here problems reports a problem on row 3, but row 3 of the data does not exist, let alone contain a problem. That's because the reported number counts the header line: the failure is actually on row 2 of the tibble, which is line 3 of the file.

tibble(
  A=c("#skip",1,"two"),
  B=c(0,1,2)
) %>%
  write_csv("import_bug2.csv")

vroom(
  "import_bug2.csv", 
  col_types=cols(A=col_double()),
  comment="#")-> data2

problems(data2)

data2%>%slice(3)
data2%>%slice(2)

In this instance problems still reports the issue on row 3, but now 3 is neither the row in the tibble (row 2) nor the line in the file (line 4, because the commented line is skipped).

Ideally, problems would report both the row (the row in the returned tibble) and the line (the line in the parsed file) where the parsing failure occurred.

Note that the reported positions are also thrown off by the skip_empty_rows and skip arguments.
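
To make the row/line distinction concrete, here is a rough base R sketch of the bookkeeping involved. It is not vroom's implementation: row_line_map() is a hypothetical helper, and it assumes one record per line, a header on the first kept line, and comments only at the start of a line. The input lines mirror what import_bug2.csv contains.

```r
# Hypothetical helper (not part of vroom): map each data row to the
# physical line of the file it came from.
# Assumptions: one record per line, the first kept line is the header,
# and a comment marker only counts at the very start of a line.
row_line_map <- function(lines, comment = "#", skip = 0, skip_empty_rows = TRUE) {
  line_no <- seq_along(lines)

  # Drop the first `skip` physical lines.
  keep <- line_no > skip

  # Drop comment lines.
  if (nzchar(comment)) {
    keep <- keep & !startsWith(lines, comment)
  }

  # Drop empty (whitespace-only) lines if requested.
  if (skip_empty_rows) {
    keep <- keep & nzchar(trimws(lines))
  }

  kept <- line_no[keep]

  # The first kept line is the header; the rest are data rows 1, 2, ...
  data_lines <- kept[-1]
  data.frame(row = seq_along(data_lines), line = data_lines)
}

# Same content as import_bug2.csv in the example above.
lines <- c("A,B", "#skip,0", "1,1", "two,2")
map <- row_line_map(lines)
print(map)
```

With this mapping the failing value "two" is at row 2 and line 4, which are the two numbers I would expect problems to report.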

s-andrews, Oct 08 '21 13:10