readr
read_delim parsing issue with compressed file
Reading the attached file (f.log.gz) with read_delim produces the following parsing warning:
r$> a = read_delim('f.log.gz', delim=' | ',col_names=F,col_types='cccc')
Warning message:
One or more parsing issues, call `problems()` on your data frame for details, e.g.:
dat <- vroom(...)
problems(dat)
r$> problems(a)
# A tibble: 1 × 5
row col expected actual file
<int> <int> <chr> <chr> <chr>
1 1494 3 4 columns 3 columns ""
r$> a[1494,]
# A tibble: 1 × 4
X1 X2 X3 X4
<chr> <chr> <chr> <chr>
1 15:26 07 | 三特東喰赤 いのけん(+33) まあぷ(-10) 陸奥陽之助(-23) NA
However, the indicated line does have 4 columns (note the ` | ` delimiter left unsplit inside the X2 value), and if I uncompress the file before calling read_delim, it parses fine.
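A minimal sketch of that comparison, assuming f.log.gz sits in the working directory. The helpers `gunzip_to_temp` and `read_log` are hypothetical names introduced only for this illustration and are not part of readr; decompression uses base R connections so that readr only ever sees a plain-text file in the second call.

```r
library(readr)

# Hypothetical helper (not part of readr): decompress a .gz file byte for byte
# into a temporary plain-text file, using base R connections only.
gunzip_to_temp <- function(path) {
  tmp <- tempfile(fileext = ".log")
  src <- gzfile(path, open = "rb")
  dst <- file(tmp, open = "wb")
  on.exit({
    close(src)
    close(dst)
  })
  repeat {
    chunk <- readBin(src, what = "raw", n = 65536L)
    if (length(chunk) == 0L) break
    writeBin(chunk, dst)
  }
  tmp
}

# Hypothetical wrapper: the same read_delim arguments as in the report.
read_log <- function(path) {
  read_delim(path, delim = " | ", col_names = FALSE, col_types = "cccc")
}

# Only runs when the attached file is present.
if (file.exists("f.log.gz")) {
  a_gz    <- read_log("f.log.gz")                  # compressed input
  a_plain <- read_log(gunzip_to_temp("f.log.gz"))  # decompressed copy

  print(problems(a_gz))     # parsing issues reported for the compressed read
  print(problems(a_plain))  # parsing issues reported for the plain-text read
  print(identical(dim(a_gz), dim(a_plain)))
}
```

Reading the decompressed copy this way may also serve as a temporary workaround, under the assumption that the plain-text read stays free of parsing issues as described above.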
I was not able to reduce the file further than this and still reproduce the issue, so it seems the issue is not related to that specific line.
Env info: Linux R 4.3.3 readr 2.1.5