vroom
line number in problems not correct after commented rows.
Because we aren't keeping track of how many lines were skipped.
I think I hit this too, but it plays into a slightly larger issue which makes interpreting the output of problems more difficult.
For example:
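library(tidyverse)
library(vroom)
# Write a two-row CSV whose second value in column A is not numeric
tibble(
  A = c(1, "two"),
  B = c(1, 2)
) %>%
  write_csv("import_bug.csv")
# Force A to be parsed as double so that "two" triggers a parsing problem
data <- vroom(
  "import_bug.csv",
  col_types = cols(A = col_double())
)
problems(data)
data %>% slice(3)
data %>% slice(2)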
Here problems() reports a problem on row 3, but the tibble has no problem at row 3 (it only has two rows). That's because the reported number counts the header line: the failure is actually in row 2 of the tibble, which is line 3 of the file.
# Same data, preceded by a row that will be skipped as a comment
tibble(
  A = c("#skip", 1, "two"),
  B = c(0, 1, 2)
) %>%
  write_csv("import_bug2.csv")
data2 <- vroom(
  "import_bug2.csv",
  col_types = cols(A = col_double()),
  comment = "#"
)
problems(data2)
data2 %>% slice(3)
data2 %>% slice(2)
In this instance problems() still reports the issue on row 3, but now 3 is neither the row in the tibble (row 2) nor the line in the file (line 4).
Ideally, problems() would report both the row (the row in the returned tibble) and the line (the line in the parsed file) where the parsing failure occurred.
Note that the rows reported by problems() are also thrown off by the skip_empty_rows and skip arguments.
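To make the row/line distinction concrete, here is a rough base R sketch of the bookkeeping involved. It is only an illustration, not how vroom does or should implement it: row_to_line() is a hypothetical helper, and it assumes comments always start at the beginning of a line, fields never contain embedded newlines, and the first kept line is the header. The input lines are typed in by hand to mirror the layout of import_bug2.csv.

```r
# Hypothetical helper (not part of vroom): map each row of the parsed
# data to the line of the file it came from.
row_to_line <- function(lines, comment = "", skip = 0,
                        skip_empty_rows = TRUE, header = TRUE) {
  line_no <- seq_along(lines)
  # Drop the first `skip` lines
  keep <- line_no > skip
  # Drop whole-line comments (assumption: comments start in column 1)
  if (nzchar(comment)) {
    keep <- keep & !startsWith(lines, comment)
  }
  # Drop empty lines
  if (skip_empty_rows) {
    keep <- keep & nzchar(trimws(lines))
  }
  kept <- line_no[keep]
  # The first remaining line is the header, not a data row
  if (header && length(kept) > 0) {
    kept <- kept[-1]
  }
  data.frame(row = seq_along(kept), line = kept)
}

# Same layout as import_bug2.csv, written out by hand
lines <- c("A,B", "#skip,0", "1,1", "two,2")
map <- row_to_line(lines, comment = "#")
map
```

With this mapping, the unparseable value "two" sits in row 2 of the tibble and on line 4 of the file, which are the two numbers a reader would want to see side by side in the problems output.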