Why does "1176413S03" get converted to numeric when using `readr::type_convert`?
Why do these values get converted to numeric when using readr::type_convert? I would expect them to stay character.
x <- c("1176413S03", "1176413S06", "1176413S02", "1176413S08", "1176413S05", "1176413S04")
df <- data.frame(x)
str(df)
'data.frame': 6 obs. of 1 variable:
$ x: chr "1176413S03" "1176413S06" "1176413S02" "1176413S08" ...
df1 <- readr::type_convert(df)
str(df1)
'data.frame': 6 obs. of 1 variable:
$ x: num 1.18e+09 1.18e+12 1.18e+08 1.18e+14 1.18e+11 ...
This behavior appears to happen only for the letters D, E, F, L and S (upper or lower case). E makes sense, since it marks the exponent in scientific notation.
res <- purrr::map_chr(
  .x = LETTERS,
  .f = \(x) {
    d <- readr::type_convert(
      df = data.frame(
        x = glue::glue("1176413{letter}03", letter = x)
      )
    )
    class(d$x)
  }
)
setNames(res, LETTERS)
> setNames(res,LETTERS)
A B C D E F
"character" "character" "character" "numeric" "numeric" "numeric"
G H I J K L
"character" "character" "character" "character" "character" "numeric"
M N O P Q R
"character" "character" "character" "character" "character" "character"
S T U V W X
"numeric" "character" "character" "character" "character" "character"
Y Z
"character" "character"
https://github.com/tidyverse/readr/blob/96ddac314b47402bc63e1f81c149c463cf58e3da/src/QiParsers.h#L157-L181
And vroom doesn't use QiParsers.
I think these may come from C floating-point constants: https://en.cppreference.com/w/c/language/floating_constant
e is the exponent for decimal floating point; p is the exponent for hex floating point (rare); f is the suffix for float; l is the suffix for long double (rare).
I don't know what s is, and a parser should really only be using e or no letters.
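Whatever the origin of the letter, the output in the question looks consistent with S being read the same way as an exponent marker. A rough check in base R, assuming the seven digits before the letter are the mantissa and the two digits after it are the exponent (that split is my reading of the data, not something taken from the readr source):

```r
# Digits before the letter, taken from the question's data
mantissa <- 1176413

# Digits after the letter: S03, S06, S02, S08, S05, S04
exponents <- c(3, 6, 2, 8, 5, 4)

# Values you would get if "S" were read like "e", i.e. mantissa * 10^exponent
as_if_exponent <- mantissa * 10^exponents

# The same values obtained by literally swapping "S" for "e" and parsing in base R
swapped <- as.numeric(sprintf("%de%02d", mantissa, exponents))

# Print with three significant digits to compare against the str() output
format(as_if_exponent, digits = 3, scientific = TRUE)
```

The first five formatted values can be compared directly with the ones str(df1) printed in the question.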
At the very least it should be documented in ?type_convert, and there should be some reference to this behavior in ?read_csv and friends as well.
I agree, but looking at the state of the issue tracker and the other reports listed, I don't think it will be addressed anytime soon.
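In the meantime, one way to sidestep the guessing is to state the column type yourself through the col_types argument of type_convert. A minimal sketch, assuming the column is called x as in the question and that keeping it as character is what you want:

```r
library(readr)

x <- c("1176413S03", "1176413S06", "1176413S02")
df <- data.frame(x, stringsAsFactors = FALSE)

# Option 1: declare x as character so no guessing happens for that column
df1 <- type_convert(df, col_types = cols(x = col_character()))

# Option 2: make character the default and opt in to other types per column
df2 <- type_convert(df, col_types = cols(.default = col_character()))

str(df1)
```

Option 1 still lets readr guess any other columns, while option 2 turns guessing off entirely unless you list a column explicitly, so which one fits depends on how much of the automatic conversion you want to keep.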