odbc icon indicating copy to clipboard operation
odbc copied to clipboard

datetime2 are not parsed when reading from MS-SQL Server with `SQL Server` ODBC Driver

Open holgerbrandl opened this issue 4 years ago • 9 comments

Schema and example data

create table Custom.WipForecast
(
    OperationId           bigint         not null
    StartTimelineMinuteId datetime2      not null,
    ForecastWindowMinutes bigint         not null,
    ForecastValue         decimal(18, 2) not null,
    ModelName             nvarchar(50)
)

INSERT INTO Custom.WipForecast (OperationId, StartTimelineMinuteId, ForecastWindowMinutes, ForecastValue, ModelName) VALUES (16786238, N'2020-08-01 11:00:00.0000000', 60, 0.13, N'last_value');
INSERT INTO Custom.WipForecast (OperationId, StartTimelineMinuteId, ForecastWindowMinutes, ForecastValue, ModelName) VALUES (16786238, N'2020-08-01 11:06:00.0000000', 60, 2.40, N'last_value');
INSERT INTO Custom.WipForecast (OperationId, StartTimelineMinuteId, ForecastWindowMinutes, ForecastValue, ModelName) VALUES (16786238, N'2020-08-01 11:12:00.0000000', 60, 2.40, N'last_value');
INSERT INTO Custom.WipForecast (OperationId, StartTimelineMinuteId, ForecastWindowMinutes, ForecastValue, ModelName) VALUES (16786238, N'2020-08-01 11:18:00.0000000', 60, 2.40, N'last_value');
INSERT INTO Custom.WipForecast (OperationId, StartTimelineMinuteId, ForecastWindowMinutes, ForecastValue, ModelName) VALUES (16786238, N'2020-08-01 11:24:00.0000000', 60, 2.40, N'last_value');
INSERT INTO Custom.WipForecast (OperationId, StartTimelineMinuteId, ForecastWindowMinutes, ForecastValue, ModelName) VALUES (16786238, N'2020-08-01 11:30:00.0000000', 60, 2.40, N'last_value');
INSERT INTO Custom.WipForecast (OperationId, StartTimelineMinuteId, ForecastWindowMinutes, ForecastValue, ModelName) VALUES (16786238, N'2020-08-01 11:36:00.0000000', 60, 2.40, N'last_value');
INSERT INTO Custom.WipForecast (OperationId, StartTimelineMinuteId, ForecastWindowMinutes, ForecastValue, ModelName) VALUES (16786238, N'2020-08-01 11:42:00.0000000', 60, 2.40, N'last_value');
INSERT INTO Custom.WipForecast (OperationId, StartTimelineMinuteId, ForecastWindowMinutes, ForecastValue, ModelName) VALUES (16786238, N'2020-08-01 11:48:00.0000000', 60, 2.40, N'last_value');
INSERT INTO Custom.WipForecast (OperationId, StartTimelineMinuteId, ForecastWindowMinutes, ForecastValue, ModelName) VALUES (16786238, N'2020-08-01 11:54:00.0000000', 60, 2.40, N'last_value');

Example Script

pacman::p_load(tidyverse, magrittr,  lubridate)
pacman::p_load(DBI, odbc, dbplyr)

sessionInfo()

localCon <- dbConnect(odbc(),
Driver = "SQL Server",
Server = "localhost\\sql2019",
Database = "db_name",
Trusted_Connection = "True"
)

tbl(localCon, in_schema("Custom", "WipForecast")) %>% head(5) %>% collect()

Result vs Expected

> tbl(localCon, in_schema("Custom", "WipForecast")) %>% head(5) %>% collect() 
Observations: 5
Variables: 5
$ OperationId           <int64> 16786238, 16786238, 16786238, 16786238, 167...
$ StartTimelineMinuteId <chr> "2020-08-01 11:00:00.0000000", "2020-08-01 11...
$ ForecastWindowMinutes <int64> 60, 60, 60, 60, 60
$ ForecastValue         <dbl> 0.13, 2.40, 2.40, 2.40, 2.40
$ ModelName             <chr> "last_value", "last_value", "last_value", "la...

Expected: The column type of StartTimelineMinuteId should be ddtm

Obtained: The column type of StartTimelineMinuteId is chr.

Database

SQL Server 64bit 15.0.4063.15

Session Info

R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19041)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] dbplyr_2.0.0    odbc_1.3.0      DBI_1.1.0       lubridate_1.7.4
 [5] magrittr_1.5    forcats_0.4.0   stringr_1.4.0   dplyr_1.0.0
 [9] purrr_0.3.3     readr_1.3.1     tidyr_1.0.0     tibble_2.1.3
[13] ggplot2_3.2.1   tidyverse_1.3.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.3       cellranger_1.1.0 pillar_1.4.3     compiler_3.6.1
 [5] tools_3.6.1      bit_1.1-14       jsonlite_1.6     lifecycle_0.2.0
 [9] nlme_3.1-140     gtable_0.3.0     lattice_0.20-38  pkgconfig_2.0.3
[13] rlang_0.4.6      reprex_0.3.0     cli_2.0.2        rstudioapi_0.10
[17] haven_2.2.0      withr_2.1.2      xml2_1.2.2       httr_1.4.1
[21] fs_1.3.1         generics_0.0.2   vctrs_0.3.1      hms_0.5.2
[25] bit64_0.9-7      grid_3.6.1       tidyselect_1.1.0 glue_1.4.1
[29] R6_2.4.1         fansi_0.4.1      readxl_1.3.1     pacman_0.5.3
[33] blob_1.2.0       modelr_0.1.5     backports_1.1.5  scales_1.1.0
[37] rvest_0.3.5      assertthat_0.2.1 colorspace_1.4-1 utf8_1.1.4
[41] stringi_1.4.3    lazyeval_0.2.2   munsell_0.5.0    broom_0.5.2
[45] crayon_1.3.4

holgerbrandl avatar Nov 13 '20 10:11 holgerbrandl

I found a workaround: There are multiple odbc drivers on my system:

> odbc::odbcListDrivers()
                                name        attribute value
1                         SQL Server         APILevel     2
2                         SQL Server ConnectFunctions   YYY
3                         SQL Server        CPTimeout    60
4                         SQL Server    DriverODBCVer 03.50
5                         SQL Server        FileUsage     0
6                         SQL Server         SQLLevel     1
7                         SQL Server       UsageCount     1
8      SQL Server Native Client 11.0       UsageCount     1
9      SQL Server Native Client 11.0         APILevel     2
10     SQL Server Native Client 11.0 ConnectFunctions   YYY
11     SQL Server Native Client 11.0        CPTimeout    60
12     SQL Server Native Client 11.0    DriverODBCVer 03.80
13     SQL Server Native Client 11.0        FileUsage     0
14     SQL Server Native Client 11.0         SQLLevel     1
15 SQL Server Native Client RDA 11.0       UsageCount     1
16 SQL Server Native Client RDA 11.0         APILevel     2
17 SQL Server Native Client RDA 11.0 ConnectFunctions   YYY
18 SQL Server Native Client RDA 11.0        CPTimeout    60
19 SQL Server Native Client RDA 11.0    DriverODBCVer 03.80
20 SQL Server Native Client RDA 11.0        FileUsage     0
21 SQL Server Native Client RDA 11.0         SQLLevel     1
22     ODBC Driver 17 for SQL Server       UsageCount     1
23     ODBC Driver 17 for SQL Server         APILevel     2
24     ODBC Driver 17 for SQL Server ConnectFunctions   YYY
25     ODBC Driver 17 for SQL Server        CPTimeout    60
26     ODBC Driver 17 for SQL Server    DriverODBCVer 03.80
27     ODBC Driver 17 for SQL Server        FileUsage     0
28     ODBC Driver 17 for SQL Server         SQLLevel     1

Not sure who or what installed these different versions. But if I use in the example above the driver ODBC Driver 17 for SQL Server instead of SQL Server dates are parsed correctly. There is on downside: This driver does not seem to support Trusted_Connection = "True" but relies on a user-name and password.

So I have a workaround, and I'm not sure if this still qualifies as a bug. The problem still applies when using SQL Server which is what the official (?) docs recommend, see https://db.rstudio.com/databases/microsoft-sql-server/#ms-sql-express

holgerbrandl avatar Nov 13 '20 11:11 holgerbrandl

In contast to how SQL Server driver requires a Trusted_Connection = "True" argument we need to use Trusted_Connection = "Yes" when using ODBC Driver 17 for SQL Server.

Since the latter is the most modern driver, I think this would be the best practice approach.

The ticket is still valid, as date parsing does not work when using SQL Server Odbc driver.

holgerbrandl avatar Nov 13 '20 16:11 holgerbrandl

Thanks, seeing the same with the FreeTDS driver.

krlmlr avatar Dec 23 '20 06:12 krlmlr

Confirming that the issue no longer occurs after switching to MSSQL driver.

Should we raise this on https://github.com/FreeTDS/freetds?

krlmlr avatar Dec 23 '20 06:12 krlmlr

@krlmlr I am struggling to reproduce with freeTDS 1.2.11 - in that it seems to me DATETIME2 is coming back as POSIXct (maybe i misunderstood the issue).

There I think you need to make sure you have selected an appropriate version of the TDS protocol - in particular looks like 7.3 or newer supports DATETIME2. I usually do this setting the TDSVER environment variable.

> odbcConnectionColumns(conn = con, name = table_id)$field.type
[1] "datetime2"
> str(dbReadTable(con, table_id))
'data.frame':   1 obs. of  1 variable:
 $ c1: POSIXct, format: "2020-12-23 17:17:59"
> con@info$drivername
[1] "libtdsodbc.so"
> Sys.getenv("TDSVER")
[1] "7.4"

detule avatar Dec 23 '20 17:12 detule

This is about DATE, not DATETIME2 .

krlmlr avatar Dec 24 '20 07:12 krlmlr

...at least my issue is, the original problem is instead about DATETIME2 . :facepalm:

I'm using TDS version 8.0 .

krlmlr avatar Dec 24 '20 07:12 krlmlr

We are off-topic here from the original issue, but for me both DATE and DATETIME2 come back appropriately class-ed with FreeTDS:

> values <- data.frame(c1 = Sys.Date(), c2 = Sys.time())
> dbWriteTable(con, "test_date_time2", values, field.types = list(c1 = "DATE", c2= "DATETIME2"), overwrite = TRUE)
> odbcConnectionColumns(con, "test_date_time2")$field.type
[1] "date"      "datetime2"
> str(DBI::dbReadTable(con, "test_date_time2"))
'data.frame':   1 obs. of  2 variables:
 $ c1: Date, format: "2020-12-24"
 $ c2: POSIXct, format: "2020-12-24 14:07:33"
> con@info$drivername
[1] "libtdsodbc.so"

For what is worth I think TDS version 8.0 is deprecated / was intended as a placeholder for 7.1 before the protocol version was announced.

detule avatar Dec 24 '20 14:12 detule

TDS 7.0 < 8.0 < 7.1 < 7.x :astonished:

Confirming that the test now passes. Seeing another problem with the next test, opening a new issue.

krlmlr avatar Dec 25 '20 07:12 krlmlr

Since it sounds like a driver issue, I'm closing here.

hadley avatar Apr 24 '23 15:04 hadley