A new GiBleed dataset is always downloaded, breaking unit tests on R 4.4.0

Open mvankessel-EMC opened this issue 1 year ago • 3 comments

A new GiBleed dataset is always downloaded when getting the Eunomia connection details with Eunomia::getEunomiaConnectionDetails(), even though both the zip file and the SQLite file already exist.

Running the function also throws an additional error when it is run in an enclosed environment (unit test, GitHub Actions, reprex):

#> Error: table person already exists

as it is trying to overwrite the existing CDM.
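
The error text can be reproduced outside Eunomia. The sketch below is not Eunomia's loading code; it assumes that the extraction step issues a CREATE TABLE statement against a SQLite database that already contains the table, and uses an in-memory database from DBI and RSQLite as a stand-in for the cached file.

```r
# Sketch only: an in-memory SQLite database stands in for the cached CDM file.
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

# The first load creates the table, as the initial extraction would.
DBI::dbExecute(con, "CREATE TABLE person (person_id INTEGER)")

# A second load into the same database tries to create the table again.
secondLoad <- tryCatch(
  {
    DBI::dbExecute(con, "CREATE TABLE person (person_id INTEGER)")
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(secondLoad)

DBI::dbDisconnect(con)
```

If the second statement fails as expected, secondLoad holds the SQLite error message instead of "no error".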

file.exists(file.path(Sys.getenv("EUNOMIA_DATA_FOLDER"), "GiBleed_5.3.sqlite"))
#> [1] TRUE

connectionDetails <- Eunomia::getEunomiaConnectionDetails()
#> attempting to download GiBleed
#> attempting to extract and load: D:/Users/mvankessel/Documents/EunomiaCache/GiBleed_5.3.zip to: D:/Users/mvankessel/Documents/EunomiaCache/GiBleed_5.3.sqlite
#> Error: table person already exists

Created on 2024-05-22 with reprex v2.1.0

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.0 (2024-04-24 ucrt)
#>  os       Windows 11 x64 (build 22631)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  Dutch_Netherlands.utf8
#>  ctype    Dutch_Netherlands.utf8
#>  tz       Europe/Amsterdam
#>  date     2024-05-22
#>  pandoc   3.1.11 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  ! package           * version date (UTC) lib source
#>    backports           1.4.1   2021-12-13 [1] CRAN (R 4.4.0)
#>    bit                 4.0.5   2022-11-15 [1] CRAN (R 4.4.0)
#>    bit64               4.0.5   2020-08-30 [1] CRAN (R 4.4.0)
#>    blob                1.2.4   2023-03-17 [1] CRAN (R 4.4.0)
#>    cachem              1.1.0   2024-05-16 [1] CRAN (R 4.4.0)
#>    checkmate           2.3.1   2023-12-04 [1] CRAN (R 4.4.0)
#>    cli                 3.6.2   2023-12-11 [1] CRAN (R 4.4.0)
#>    CommonDataModel     0.2.0   2024-02-07 [1] CRAN (R 4.4.0)
#>    DatabaseConnector   6.3.2   2023-12-11 [1] CRAN (R 4.4.0)
#>    DBI                 1.2.2   2024-02-16 [1] CRAN (R 4.4.0)
#>    digest              0.6.35  2024-03-11 [1] CRAN (R 4.4.0)
#>    Eunomia             2.0.0   2024-04-23 [1] CRAN (R 4.4.0)
#>    evaluate            0.23    2023-11-01 [1] CRAN (R 4.4.0)
#>    fansi               1.0.6   2023-12-08 [1] CRAN (R 4.4.0)
#>    fastmap             1.2.0   2024-05-15 [1] CRAN (R 4.4.0)
#>    fs                  1.6.4   2024-04-25 [1] CRAN (R 4.4.0)
#>    glue                1.7.0   2024-01-09 [1] CRAN (R 4.4.0)
#>    hms                 1.1.3   2023-03-21 [1] CRAN (R 4.4.0)
#>    htmltools           0.5.8.1 2024-04-04 [1] CRAN (R 4.4.0)
#>    knitr               1.46    2024-04-06 [1] CRAN (R 4.4.0)
#>    lifecycle           1.0.4   2023-11-07 [1] CRAN (R 4.4.0)
#>    magrittr            2.0.3   2022-03-30 [1] CRAN (R 4.4.0)
#>    memoise             2.0.1   2021-11-26 [1] CRAN (R 4.4.0)
#>    pillar              1.9.0   2023-03-22 [1] CRAN (R 4.4.0)
#>    pkgconfig           2.0.3   2019-09-22 [1] CRAN (R 4.4.0)
#>    R6                  2.5.1   2021-08-19 [1] CRAN (R 4.4.0)
#>    readr               2.1.5   2024-01-10 [1] CRAN (R 4.4.0)
#>    reprex              2.1.0   2024-01-11 [1] CRAN (R 4.4.0)
#>  D rJava               1.0-11  2024-01-26 [1] CRAN (R 4.4.0)
#>    rlang               1.1.3   2024-01-10 [1] CRAN (R 4.4.0)
#>    rmarkdown           2.27    2024-05-17 [1] CRAN (R 4.4.0)
#>    RSQLite             2.3.6   2024-03-31 [1] CRAN (R 4.4.0)
#>    rstudioapi          0.16.0  2024-03-24 [1] CRAN (R 4.4.0)
#>    sessioninfo         1.2.2   2021-12-06 [1] CRAN (R 4.4.0)
#>    SqlRender           1.17.0  2024-03-20 [1] CRAN (R 4.4.0)
#>    tibble              3.2.1   2023-03-20 [1] CRAN (R 4.4.0)
#>    tzdb                0.4.0   2023-05-12 [1] CRAN (R 4.4.0)
#>    utf8                1.2.4   2023-10-22 [1] CRAN (R 4.4.0)
#>    vctrs               0.6.5   2023-12-01 [1] CRAN (R 4.4.0)
#>    withr               3.0.0   2024-01-16 [1] CRAN (R 4.4.0)
#>    xfun                0.44    2024-05-15 [1] CRAN (R 4.4.0)
#>    yaml                2.3.8   2023-12-11 [1] CRAN (R 4.4.0)
#> 
#>  [1] C:/R/R-4.4.0/library
#> 
#>  D ── DLL MD5 mismatch, broken installation.
#> 
#> ──────────────────────────────────────────────────────────────────────────────

The following code should resolve this:

getEunomiaConnectionDetails <- function(databaseFile = tempfile(fileext = ".sqlite"), dbms = "sqlite") {
  if (interactive() && !("DatabaseConnector" %in% rownames(utils::installed.packages()))) {
    message("The DatabaseConnector package is required but not installed.")
    if (!isTRUE(utils::askYesNo("Would you like to install DatabaseConnector?"))) {
      return(invisible(NULL))
    } else {
      utils::install.packages("DatabaseConnector")
    }
  }
  
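  # Point at the cached database by default so datasetLocation is always defined;
  # only download and extract when the cached file is missing.
  datasetLocation <- file.path(Sys.getenv("EUNOMIA_DATA_FOLDER"), "GiBleed_5.3.sqlite")
  if (!file.exists(datasetLocation)) {
    datasetLocation <- getDatabaseFile(datasetName = "GiBleed", dbms = dbms, databaseFile = databaseFile)
  }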
  DatabaseConnector::createConnectionDetails(dbms = dbms, server = datasetLocation)
}

file.exists(file.path(Sys.getenv("EUNOMIA_DATA_FOLDER"), "GiBleed_5.3.sqlite"))
#> [1] TRUE

connectionDetails <- getEunomiaConnectionDetails()

Created on 2024-05-22 with reprex v2.1.0

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.0 (2024-04-24 ucrt)
#>  os       Windows 11 x64 (build 22631)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  Dutch_Netherlands.utf8
#>  ctype    Dutch_Netherlands.utf8
#>  tz       Europe/Amsterdam
#>  date     2024-05-22
#>  pandoc   3.1.11 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  ! package           * version date (UTC) lib source
#>    bit                 4.0.5   2022-11-15 [1] CRAN (R 4.4.0)
#>    bit64               4.0.5   2020-08-30 [1] CRAN (R 4.4.0)
#>    cli                 3.6.2   2023-12-11 [1] CRAN (R 4.4.0)
#>    DatabaseConnector   6.3.2   2023-12-11 [1] CRAN (R 4.4.0)
#>    DBI                 1.2.2   2024-02-16 [1] CRAN (R 4.4.0)
#>    digest              0.6.35  2024-03-11 [1] CRAN (R 4.4.0)
#>    evaluate            0.23    2023-11-01 [1] CRAN (R 4.4.0)
#>    fastmap             1.2.0   2024-05-15 [1] CRAN (R 4.4.0)
#>    fs                  1.6.4   2024-04-25 [1] CRAN (R 4.4.0)
#>    glue                1.7.0   2024-01-09 [1] CRAN (R 4.4.0)
#>    htmltools           0.5.8.1 2024-04-04 [1] CRAN (R 4.4.0)
#>    knitr               1.46    2024-04-06 [1] CRAN (R 4.4.0)
#>    lifecycle           1.0.4   2023-11-07 [1] CRAN (R 4.4.0)
#>    reprex              2.1.0   2024-01-11 [1] CRAN (R 4.4.0)
#>  D rJava               1.0-11  2024-01-26 [1] CRAN (R 4.4.0)
#>    rlang               1.1.3   2024-01-10 [1] CRAN (R 4.4.0)
#>    rmarkdown           2.27    2024-05-17 [1] CRAN (R 4.4.0)
#>    rstudioapi          0.16.0  2024-03-24 [1] CRAN (R 4.4.0)
#>    sessioninfo         1.2.2   2021-12-06 [1] CRAN (R 4.4.0)
#>    withr               3.0.0   2024-01-16 [1] CRAN (R 4.4.0)
#>    xfun                0.44    2024-05-15 [1] CRAN (R 4.4.0)
#>    yaml                2.3.8   2023-12-11 [1] CRAN (R 4.4.0)
#> 
#>  [1] C:/R/R-4.4.0/library
#> 
#>  D ── DLL MD5 mismatch, broken installation.
#> 
#> ──────────────────────────────────────────────────────────────────────────────

mvankessel-EMC • May 22 '24 08:05

Is this only happening for the default dataset or is it also occurring on other datasets specified by name?

fdefalco • May 24 '24 13:05

So far I have only noticed it when using getEunomiaConnectionDetails(), where the GiBleed dataset is hardcoded.

mvankessel-EMC • May 24 '24 13:05

Thanks, yes, I can confirm I'm seeing the same behavior. I think the root cause is that getEunomiaConnectionDetails calls getDatabaseFile, which defaults overwrite to TRUE, and getEunomiaConnectionDetails doesn't provide any way to specify different behavior. I think the desired behavior would be to use an existing copy unless otherwise specified. So I can add an overwrite parameter to getEunomiaConnectionDetails with a default of FALSE and pass it along to getDatabaseFile. Does that work for you?
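
A minimal sketch of the "reuse unless overwrite is requested" behavior described above. The names buildDataset and getCachedDataset are hypothetical stand-ins, not Eunomia functions, and the build step only writes a placeholder file instead of downloading and extracting anything.

```r
# Counts how often the expensive step runs, so reuse can be observed.
buildCount <- 0

# Hypothetical stand-in for the download-and-extract step.
buildDataset <- function(path) {
  buildCount <<- buildCount + 1
  writeLines("placeholder dataset", path)
  invisible(path)
}

# Return the dataset path, rebuilding only when the file is missing
# or when the caller explicitly passes overwrite = TRUE.
getCachedDataset <- function(path, overwrite = FALSE) {
  if (overwrite || !file.exists(path)) {
    buildDataset(path)
  }
  path
}

cacheFile <- tempfile(fileext = ".sqlite")
first <- getCachedDataset(cacheFile)                     # file missing: builds
second <- getCachedDataset(cacheFile)                    # file present: reuses
third <- getCachedDataset(cacheFile, overwrite = TRUE)   # forced rebuild
```

With overwrite defaulting to FALSE, repeated calls in a test suite hit the existing file, and callers who need a fresh copy opt in explicitly.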

fdefalco • May 24 '24 14:05

Yes that would work, thank you.

mvankessel-EMC • May 27 '24 07:05

The fix is now in the develop branch. If you would like to test it, that would be great.

fdefalco • May 29 '24 20:05

Seems to work, thank you.

# Install OHDSI/Eunomia@develop
# remotes::install_github("OHDSI/Eunomia@develop")
file.exists(file.path(Sys.getenv("EUNOMIA_DATA_FOLDER"), "GiBleed_5.3.sqlite"))
#> [1] TRUE

connectionDetails <- Eunomia::getEunomiaConnectionDetails()

Created on 2024-05-30 with reprex v2.1.0

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.0 (2024-04-24 ucrt)
#>  os       Windows 11 x64 (build 22631)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  Dutch_Netherlands.utf8
#>  ctype    Dutch_Netherlands.utf8
#>  tz       Europe/Amsterdam
#>  date     2024-05-30
#>  pandoc   3.1.11 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  ! package           * version date (UTC) lib source
#>    bit                 4.0.5   2022-11-15 [1] CRAN (R 4.4.0)
#>    bit64               4.0.5   2020-08-30 [1] CRAN (R 4.4.0)
#>    blob                1.2.4   2023-03-17 [1] CRAN (R 4.4.0)
#>    cachem              1.1.0   2024-05-16 [1] CRAN (R 4.4.0)
#>    callr               3.7.6   2024-03-25 [1] CRAN (R 4.4.0)
#>    cli                 3.6.2   2023-12-11 [1] CRAN (R 4.4.0)
#>    curl                5.2.1   2024-03-01 [1] CRAN (R 4.4.0)
#>    DatabaseConnector   6.3.2   2023-12-11 [1] CRAN (R 4.4.0)
#>    DBI                 1.2.2   2024-02-16 [1] CRAN (R 4.4.0)
#>    desc                1.4.3   2023-12-10 [1] CRAN (R 4.4.0)
#>    digest              0.6.35  2024-03-11 [1] CRAN (R 4.4.0)
#>    Eunomia             2.0.0   2024-05-30 [1] Github (OHDSI/Eunomia@f016f27)
#>    evaluate            0.23    2023-11-01 [1] CRAN (R 4.4.0)
#>    fansi               1.0.6   2023-12-08 [1] CRAN (R 4.4.0)
#>    fastmap             1.2.0   2024-05-15 [1] CRAN (R 4.4.0)
#>    fs                  1.6.4   2024-04-25 [1] CRAN (R 4.4.0)
#>    glue                1.7.0   2024-01-09 [1] CRAN (R 4.4.0)
#>    hms                 1.1.3   2023-03-21 [1] CRAN (R 4.4.0)
#>    htmltools           0.5.8.1 2024-04-04 [1] CRAN (R 4.4.0)
#>  V knitr               1.46    2024-05-29 [1] CRAN (R 4.4.0) (on disk 1.47)
#>    lifecycle           1.0.4   2023-11-07 [1] CRAN (R 4.4.0)
#>    magrittr            2.0.3   2022-03-30 [1] CRAN (R 4.4.0)
#>    memoise             2.0.1   2021-11-26 [1] CRAN (R 4.4.0)
#>    pillar              1.9.0   2023-03-22 [1] CRAN (R 4.4.0)
#>    pkgbuild            1.4.4   2024-03-17 [1] CRAN (R 4.4.0)
#>    pkgconfig           2.0.3   2019-09-22 [1] CRAN (R 4.4.0)
#>    processx            3.8.4   2024-03-16 [1] CRAN (R 4.4.0)
#>    ps                  1.7.6   2024-01-18 [1] CRAN (R 4.4.0)
#>    purrr               1.0.2   2023-08-10 [1] CRAN (R 4.4.0)
#>    R.cache             0.16.0  2022-07-21 [1] CRAN (R 4.4.0)
#>    R.methodsS3         1.8.2   2022-06-13 [1] CRAN (R 4.4.0)
#>    R.oo                1.26.0  2024-01-24 [1] CRAN (R 4.4.0)
#>    R.utils             2.12.3  2023-11-18 [1] CRAN (R 4.4.0)
#>    R6                  2.5.1   2021-08-19 [1] CRAN (R 4.4.0)
#>    readr               2.1.5   2024-01-10 [1] CRAN (R 4.4.0)
#>    remotes             2.5.0   2024-03-17 [1] CRAN (R 4.4.0)
#>    reprex              2.1.0   2024-01-11 [1] CRAN (R 4.4.0)
#>  D rJava               1.0-11  2024-01-26 [1] CRAN (R 4.4.0)
#>    rlang               1.1.3   2024-01-10 [1] CRAN (R 4.4.0)
#>    rmarkdown           2.27    2024-05-17 [1] CRAN (R 4.4.0)
#>    RSQLite             2.3.7   2024-05-27 [1] CRAN (R 4.4.0)
#>    rstudioapi          0.16.0  2024-03-24 [1] CRAN (R 4.4.0)
#>    sessioninfo         1.2.2   2021-12-06 [1] CRAN (R 4.4.0)
#>    styler              1.10.3  2024-04-07 [1] CRAN (R 4.4.0)
#>    tibble              3.2.1   2023-03-20 [1] CRAN (R 4.4.0)
#>    tzdb                0.4.0   2023-05-12 [1] CRAN (R 4.4.0)
#>    utf8                1.2.4   2023-10-22 [1] CRAN (R 4.4.0)
#>    vctrs               0.6.5   2023-12-01 [1] CRAN (R 4.4.0)
#>    withr               3.0.0   2024-01-16 [1] CRAN (R 4.4.0)
#>    xfun                0.44    2024-05-15 [1] CRAN (R 4.4.0)
#>    yaml                2.3.8   2023-12-11 [1] CRAN (R 4.4.0)
#> 
#>  [1] C:/R/R-4.4.0/library
#> 
#>  V ── Loaded and on-disk version mismatch.
#>  D ── DLL MD5 mismatch, broken installation.
#> 
#> ──────────────────────────────────────────────────────────────────────────────

The previously failing unit tests in my code also pass now.

mvankessel-EMC • May 30 '24 07:05