Integers not supported in Snowflake
Issue Description and Expected Result
On the R side, all Snowflake integers are reported as SQL_DECIMAL(38,0) and therefore read back as numeric. Can integers in Snowflake be translated to integers in R, and big integers in Snowflake to int64 in R?
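For context, base R integers are only 32-bit, so the "int64" mapping requested above would presumably use the integer64 class from the bit64 package. A minimal illustration of why that matters, not current odbc behavior:
library(bit64)
.Machine$integer.max                 # 2147483647, the largest base-R integer
as.integer64("9007199254740993")     # exact as integer64; not exactly representable as a double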
Database
Snowflake
Reproducible Example
require(odbc)
#> Loading required package: odbc
#> Warning: package 'odbc' was built under R version 3.6.3
connString <- "driver=SnowflakeDSIIDriver;database=;
server=;
uid=;pwd=;
schema=;role=\"\";
warehouse="
connection = dbConnect(odbc(), .connection_string = connString)
name <- 'MDO_CUSTOM_DATA.TEMPDB.TESTINT'
splitName <- strsplit(name, ".", fixed=TRUE)[[1]]
id <- DBI::Id(database = splitName[[1]], schema = splitName[[2]], table = splitName[[3]])
value <- data.frame(intval = 1L, numericval = 1.0)
dbWriteTable(connection, name=id, value=value, temporary=TRUE, overwrite=TRUE)
dbGetQuery(conn=connection, sprintf("show columns in table %s", name))
#> table_name schema_name column_name
#> 1 TESTINT TEMPDB intval
#> 2 TESTINT TEMPDB numericval
#> data_type null? default
#> 1 {"type":"FIXED","precision":38,"scale":0,"nullable":true} true
#> 2 {"type":"REAL","nullable":true} true
#> kind expression comment database_name autoincrement
#> 1 COLUMN MDO_CUSTOM_DATA
#> 2 COLUMN MDO_CUSTOM_DATA
devtools::session_info()
#> - Session info ---------------------------------------------------------------
#> setting value
#> version R version 3.6.2 (2019-12-12)
#> os Windows 10 x64
#> system x86_64, mingw32
#> ui RTerm
#> language (EN)
#> collate English_United States.1252
#> ctype English_United States.1252
#> tz America/Chicago
#> date 2021-10-05
#>
#> - Packages -------------------------------------------------------------------
#> package * version date lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.1)
#> backports 1.1.5 2019-10-02 [1] CRAN (R 3.6.1)
#> callr 3.7.0 2021-04-20 [1] CRAN (R 3.6.3)
#> cli 3.0.1 2021-07-17 [1] CRAN (R 3.6.2)
#> crayon 1.3.4 2017-09-16 [1] CRAN (R 3.6.1)
#> desc 1.2.0 2018-05-01 [1] CRAN (R 3.6.1)
#> devtools 2.3.2 2020-09-18 [1] CRAN (R 3.6.3)
#> digest 0.6.23 2019-11-23 [1] CRAN (R 3.6.2)
#> ellipsis 0.3.1 2020-05-15 [1] CRAN (R 3.6.3)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 3.6.1)
#> fs 1.3.1 2019-05-06 [1] CRAN (R 3.6.1)
#> glue 1.3.1 2019-03-12 [1] CRAN (R 3.6.1)
#> highr 0.8 2019-03-20 [1] CRAN (R 3.6.1)
#> htmltools 0.4.0 2019-10-04 [1] CRAN (R 3.6.2)
#> knitr 1.27 2020-01-16 [1] CRAN (R 3.6.2)
#> lifecycle 0.1.0 2019-08-01 [1] CRAN (R 3.6.2)
#> magrittr 1.5 2014-11-22 [1] CRAN (R 3.6.1)
#> memoise 1.1.0 2017-04-21 [1] CRAN (R 3.6.1)
#> pkgbuild 1.2.0 2020-12-15 [1] CRAN (R 3.6.3)
#> pkgload 1.1.0 2020-05-29 [1] CRAN (R 3.6.3)
#> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 3.6.2)
#> processx 3.5.2 2021-04-30 [1] CRAN (R 3.6.3)
#> ps 1.5.0 2020-12-05 [1] CRAN (R 3.6.3)
#> purrr 0.3.3 2019-10-18 [1] CRAN (R 3.6.2)
#> R6 2.4.1 2019-11-12 [1] CRAN (R 3.6.2)
#> Rcpp 1.0.3 2019-11-08 [1] CRAN (R 3.6.2)
#> remotes 2.2.0 2020-07-21 [1] CRAN (R 3.6.3)
#> reprex 2.0.1 2021-08-05 [1] CRAN (R 3.6.2)
#> rlang 0.4.9 2020-11-26 [1] CRAN (R 3.6.3)
#> rmarkdown 2.1 2020-01-20 [1] CRAN (R 3.6.2)
#> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 3.6.1)
#> rstudioapi 0.13 2020-11-12 [1] CRAN (R 3.6.3)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.6.1)
#> stringi 1.4.4 2020-01-09 [1] CRAN (R 3.6.2)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 3.6.1)
#> testthat 3.0.1 2020-12-17 [1] CRAN (R 3.6.3)
#> usethis 2.0.0 2020-12-10 [1] CRAN (R 3.6.3)
#> withr 2.3.0 2020-09-22 [1] CRAN (R 3.6.3)
#> xfun 0.12 2020-01-13 [1] CRAN (R 3.6.2)
#> yaml 2.2.0 2018-07-25 [1] CRAN (R 3.6.2)
#>
#> [1] C:/Users/patri/OneDrive/Documents/R/win-library/3.6
#> [2] C:/Program Files/R/R-3.6.2/library
Looks like no one is working on anything tagged as snowflake. I am happy to help here.
Is this just some type of data type mapping exercise between R and Snowflake? If that's the case, I can provide the data type mappings, just let me know where I should add the code.
Okay, I tried to make a fix with no success. The issue is not in the dbWriteTable() method. In Snowflake, INTEGER and BIGINT columns are stored as DECIMAL(38,0) in the database, and Snowflake itself treats DECIMAL(38,0) as an integer type, as you can see from the example below. Is there a way to automatically convert DECIMAL(38,0) to integer in the dbGetQuery() method?
Here is an example showing that Snowflake treats DECIMAL(38,0) as an integer type:
CREATE OR REPLACE TEMPORARY TABLE "PUBLIC"."TESTINT" (
INTVAL DECIMAL(38,0),
INTVAL2 INTEGER
);
INSERT INTO "PUBLIC"."TESTINT" VALUES(1, 1);
select is_integer(INTVAL), is_integer(INTVAL2) FROM "PUBLIC"."TESTINT"
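In the meantime, a manual workaround sketch (not an odbc feature; it assumes the connection object from the reprex above can see the PUBLIC schema, and the table and column names match the SQL just shown) is to coerce the columns by hand after reading:
res <- dbGetQuery(connection, 'SELECT * FROM "PUBLIC"."TESTINT"')
res$INTVAL  <- as.integer(res$INTVAL)            # fits in a 32-bit integer
res$INTVAL2 <- bit64::as.integer64(res$INTVAL2)  # use integer64 for values beyond the 32-bit range
vapply(res, class, character(1))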
I'm able to reproduce your results, @patrickhowerter!
This does indeed seem to be an artifact of how Snowflake stores data:
Type | Notes |
---|---|
NUMBER | Default precision and scale are (38,0). |
DECIMAL, NUMERIC | Synonymous with NUMBER. |
INT, INTEGER, BIGINT, SMALLINT, TINYINT, BYTEINT | Synonymous with NUMBER except precision and scale cannot be specified. |
odbc is able to write such that Snowflake interprets integers properly, but that information is obscured on read:
library(DBI)
library(odbc)
con <- dbConnect(odbc(), dsn = "Snowflake", pwd = Sys.getenv("snowflakePass"))
con
#> <OdbcConnection> Simon@Snowflake
#> Snowflake Version: 7.44.2
# make a test tbl and write it to snowflake:
tbl <- data.frame(integ = 1L, numer = 1, decim = 1.1)
vapply(tbl, class, character(1))
#> integ numer decim
#> "integer" "numeric" "numeric"
dbWriteTable(con, "tbl", tbl, overwrite = TRUE)
# see if writing obscured the information needed to discern... it did not:
dbGetQuery(con,
'select is_integer("integ"), is_integer("numer"), is_integer("decim")
from "tbl";')
#> IS_INTEGER("INTEG") IS_INTEGER("NUMER") IS_INTEGER("DECIM")
#> 1 TRUE FALSE FALSE
# on read, though:
vapply(dbReadTable(con, "tbl"), class, character(1))
#> integ numer decim
#> "numeric" "numeric" "numeric"
Created on 2023-12-21 with reprex v2.0.2
odbc defines the dbGetQuery() method used under the hood in dbReadTable.DBIConnection(), so we could have some sort of:
setMethod("dbGetQuery", c("Snowflake", "character"),
function(conn,
statement,
n = -1,
params = NULL,
immediate = is.null(params),
...) {
res <- callNextMethod(conn,
statement,
n = n,
params = params,
immediate = immediate,
...)
# correct col types
res
}
)
But the logic to figure out which columns to find types for (and how to find those types) for a generic query seems brittle. Perhaps odbc could get in the business of defining dbReadTable() methods, though I'm not sure that's in scope for the package.
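For instance, a rough sketch of a Snowflake-specific dbReadTable() (this is not odbc's API; the information_schema lookup, the uppercase result names, and the scale-0 rule are all assumptions of the sketch) could read the table with the default method and then coerce the NUMBER(38,0) columns:
setMethod("dbReadTable", c("Snowflake", "character"),
  function(conn, name, ...) {
    res <- callNextMethod(conn, name, ...)
    # find the columns Snowflake stores with scale 0, i.e. its integer types
    cols <- dbGetQuery(conn, paste0(
      "SELECT column_name, numeric_scale FROM information_schema.columns ",
      "WHERE table_name = '", toupper(name), "'"
    ))
    int_cols <- cols$COLUMN_NAME[!is.na(cols$NUMERIC_SCALE) & cols$NUMERIC_SCALE == 0]
    for (col in intersect(int_cols, names(res))) {
      res[[col]] <- bit64::as.integer64(res[[col]])
    }
    res
  }
)
Even then, this would only help dbReadTable(); a generic dbGetQuery() would still need per-query metadata, which is where it gets brittle.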
I think this is probably out of scope for odbc, since we're relying on the driver to give us correct types. It's probably worth double-checking that https://github.com/r-dbi/odbc/blob/main/src/odbc_result.cpp#L600-L683 is doing the right thing with the column types we're getting from the Snowflake driver, but I suspect this is either a Snowflake or driver issue and there's nothing we can do here.
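One complementary check from the R side (a sketch that assumes the con connection and the "tbl" table from the reprex above) is to look at what R type odbc ends up assigning to each column:
qry <- dbSendQuery(con, 'SELECT * FROM "tbl"')
dbColumnInfo(qry)   # the column names and the R types odbc chose for them
dbClearResult(qry)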
Heard. From my (limited) understanding of odbc_result.cpp, I think there's not much we can do here; will go ahead and close.