Guess big integers?

Open rjake opened this issue 1 year ago • 0 comments

Our data warehouse started using integer64 values for our keys, and unless the column type is specified, vroom reads them as col_double(). We want to reduce errors when saving and reading the data, so I wanted to ask whether vroom could guess the col_big_integer() column type for us. I'm afraid folks on the team won't remember to specify it, and distinct keys will then collapse into duplicate values in their analyses. In the example below, dplyr::n_distinct(visit_key) would return 1 even though the data contains two distinct keys.

x <- 
  I(
    "visit_key, name
    100000000000000100, A
    100000000000000101, B"
    #              ---
  )

vroom::vroom(x) |>
  dplyr::pull(visit_key)
#> 100000000000000096
#> 100000000000000096
#>                ---


vroom::vroom(
  x, 
  col_types = vroom::cols("visit_key" = vroom::col_big_integer())
) |> 
  dplyr::pull(visit_key)
#> 100000000000000100 
#> 100000000000000101
#>                ---
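
The collapse in the first call can be reproduced in base R without vroom. This is a minimal sketch, assuming only that R stores numeric literals as IEEE 754 doubles; the variable names are illustrative and not part of the report above.

```r
# Numeric literals in R are IEEE 754 doubles with a 53-bit significand,
# so not every integer above 2^53 can be stored exactly.
key_a <- 100000000000000100
key_b <- 100000000000000101

# Near 1e17, adjacent doubles are 16 apart, so both keys round to the same value.
same_double <- key_a == key_b
n_unique <- length(unique(c(key_a, key_b)))

print(same_double)             # TRUE
print(n_unique)                # 1
print(sprintf("%.0f", key_a))  # the value that is actually stored
```

Any integer key above 2^53 (about 9.007e15) is exposed to the same rounding, which is why the column has to be read as col_big_integer() rather than col_double().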

rjake, Jun 19 '23 16:06