Use of formals() in CohortDiagnostics::executeDiagnostics

Open mvankessel-EMC opened this issue 1 year ago • 3 comments

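When executeDiagnostics() parses its arguments, it reads them with formals(), which returns the defaults declared in the function definition rather than the values supplied in the call. As a result, only default parameters are passed to the variable callingArgsJson.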

Here is an example showing this using an example function foo():

foo <- function(bar = 10, baz = 20) {
  args <- formals(foo)
  return(list(bar = args$bar, baz = args$baz))
}

foo()
#> $bar
#> [1] 10
#> 
#> $baz
#> [1] 20

foo(1, 2)
#> $bar
#> [1] 10
#> 
#> $baz
#> [1] 20

Created on 2023-08-18 with reprex v2.0.2

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.1 (2023-06-16 ucrt)
#>  os       Windows 11 x64 (build 22621)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  Dutch_Netherlands.utf8
#>  ctype    Dutch_Netherlands.utf8
#>  tz       Europe/Amsterdam
#>  date     2023-08-18
#>  pandoc   3.1.1 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  cli           3.6.1   2023-03-23 [1] CRAN (R 4.3.1)
#>  digest        0.6.33  2023-07-07 [1] CRAN (R 4.3.1)
#>  evaluate      0.21    2023-05-05 [1] CRAN (R 4.3.1)
#>  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.1)
#>  fs            1.6.3   2023-07-20 [1] CRAN (R 4.3.1)
#>  glue          1.6.2   2022-02-24 [1] CRAN (R 4.3.1)
#>  htmltools     0.5.5   2023-03-23 [1] CRAN (R 4.3.1)
#>  knitr         1.43    2023-05-25 [1] RSPM (R 4.3.0)
#>  lifecycle     1.0.3   2022-10-07 [1] CRAN (R 4.3.1)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.1)
#>  purrr         1.0.1   2023-01-10 [1] CRAN (R 4.3.1)
#>  R.cache       0.16.0  2022-07-21 [1] CRAN (R 4.3.1)
#>  R.methodsS3   1.8.2   2022-06-13 [1] CRAN (R 4.3.0)
#>  R.oo          1.25.0  2022-06-12 [1] CRAN (R 4.3.0)
#>  R.utils       2.12.2  2022-11-11 [1] CRAN (R 4.3.1)
#>  reprex        2.0.2   2022-08-17 [1] CRAN (R 4.3.1)
#>  rlang         1.1.1   2023-04-28 [1] CRAN (R 4.3.1)
#>  rmarkdown     2.23    2023-07-01 [1] CRAN (R 4.3.1)
#>  rstudioapi    0.15.0  2023-07-07 [1] CRAN (R 4.3.1)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.1)
#>  styler        1.10.1  2023-06-05 [1] CRAN (R 4.3.1)
#>  vctrs         0.6.3   2023-06-14 [1] CRAN (R 4.3.1)
#>  withr         2.5.0   2022-03-03 [1] CRAN (R 4.3.1)
#>  xfun          0.39    2023-04-20 [1] CRAN (R 4.3.1)
#>  yaml          2.3.7   2023-01-23 [1] CRAN (R 4.3.0)
#> 
#>  [1] C:/R/R-4.3.1/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────
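For contrast, here is a minimal sketch of one way the same toy function could report the values it was actually called with. It still uses formals(), but only for the argument names, and looks the values up in the function's own frame with mget(). The name fooFixed is hypothetical and this is not code from CohortDiagnostics.

```r
# Hypothetical variant of foo() from the example above.
fooFixed <- function(bar = 10, baz = 20) {
  # formals() supplies the argument names only; mget() reads the current
  # values of those names from this call's evaluation frame.
  args <- mget(names(formals(fooFixed)), envir = environment())
  return(list(bar = args$bar, baz = args$baz))
}

defaults <- fooFixed()      # bar is 10, baz is 20
supplied <- fooFixed(1, 2)  # bar is 1, baz is 2
str(supplied)
```

Unlike the original foo(), the two calls now return different lists.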

In the following example I've simplified the definition of executeDiagnostics() so that it only produces the JSON, and cut the arguments down to those that are read from formals().

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

# Simplified dummy function definition
executeDiagnostics <- function(runInclusionStatistics = TRUE,
                               runIncludedSourceConcepts = TRUE,
                               runOrphanConcepts = TRUE,
                               runTimeSeries = FALSE,
                               runVisitContext = TRUE,
                               runBreakdownIndexEvents = TRUE,
                               runIncidenceRate = TRUE,
                               runCohortRelationship = TRUE,
                               runTemporalCohortCharacterization = TRUE,
                               minCellCount = 5,
                               minCharacterizationMean = 0.01,
                               incremental = FALSE
                               ) {

  callingArgs <- formals(executeDiagnostics)
  callingArgsJson <-
    list(
      runInclusionStatistics = callingArgs$runInclusionStatistics,
      runIncludedSourceConcepts = callingArgs$runIncludedSourceConcepts,
      runOrphanConcepts = callingArgs$runOrphanConcepts,
      runTimeSeries = callingArgs$runTimeSeries,
      runVisitContext = callingArgs$runVisitContext,
      runBreakdownIndexEvents = callingArgs$runBreakdownIndexEvents,
      runIncidenceRate = callingArgs$runIncidenceRate,
      runTemporalCohortCharacterization = callingArgs$runTemporalCohortCharacterization,
      minCellCount = callingArgs$minCellCount,
      minCharacterizationMean = callingArgs$minCharacterizationMean,
      incremental = callingArgs$incremental
    ) %>%
    RJSONIO::toJSON(digits = 23, pretty = TRUE)
  return(callingArgsJson)
}

# Running dummy with defaults
res1 <- executeDiagnostics()

# Running with flipped defaults
res2 <- executeDiagnostics(runInclusionStatistics = !TRUE,
                   runIncludedSourceConcepts = !TRUE,
                   runOrphanConcepts = !TRUE,
                   runTimeSeries = !FALSE,
                   runVisitContext = !TRUE,
                   runBreakdownIndexEvents = !TRUE,
                   runIncidenceRate = !TRUE,
                   runCohortRelationship = !TRUE,
                   runTemporalCohortCharacterization = !TRUE,
                   minCellCount = -5,
                   minCharacterizationMean = -0.01,
                   incremental = !FALSE)

res1
#> [1] "{\n\t\"runInclusionStatistics\" : true,\n\t\"runIncludedSourceConcepts\" : true,\n\t\"runOrphanConcepts\" : true,\n\t\"runTimeSeries\" : false,\n\t\"runVisitContext\" : true,\n\t\"runBreakdownIndexEvents\" : true,\n\t\"runIncidenceRate\" : true,\n\t\"runTemporalCohortCharacterization\" : true,\n\t\"minCellCount\" : 5,\n\t\"minCharacterizationMean\" : 0.010000000000000000208167,\n\t\"incremental\" : false\n}"

res2
#> [1] "{\n\t\"runInclusionStatistics\" : true,\n\t\"runIncludedSourceConcepts\" : true,\n\t\"runOrphanConcepts\" : true,\n\t\"runTimeSeries\" : false,\n\t\"runVisitContext\" : true,\n\t\"runBreakdownIndexEvents\" : true,\n\t\"runIncidenceRate\" : true,\n\t\"runTemporalCohortCharacterization\" : true,\n\t\"minCellCount\" : 5,\n\t\"minCharacterizationMean\" : 0.010000000000000000208167,\n\t\"incremental\" : false\n}"

# Check if results are identical
identical(res1, res2)
#> [1] TRUE

Created on 2023-08-18 with reprex v2.0.2

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.1 (2023-06-16 ucrt)
#>  os       Windows 11 x64 (build 22621)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  Dutch_Netherlands.utf8
#>  ctype    Dutch_Netherlands.utf8
#>  tz       Europe/Amsterdam
#>  date     2023-08-18
#>  pandoc   3.1.1 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  cli           3.6.1   2023-03-23 [1] CRAN (R 4.3.1)
#>  digest        0.6.33  2023-07-07 [1] CRAN (R 4.3.1)
#>  dplyr       * 1.1.2   2023-04-20 [1] CRAN (R 4.3.1)
#>  evaluate      0.21    2023-05-05 [1] CRAN (R 4.3.1)
#>  fansi         1.0.4   2023-01-22 [1] CRAN (R 4.3.1)
#>  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.1)
#>  fs            1.6.3   2023-07-20 [1] CRAN (R 4.3.1)
#>  generics      0.1.3   2022-07-05 [1] CRAN (R 4.3.1)
#>  glue          1.6.2   2022-02-24 [1] CRAN (R 4.3.1)
#>  htmltools     0.5.5   2023-03-23 [1] CRAN (R 4.3.1)
#>  knitr         1.43    2023-05-25 [1] RSPM (R 4.3.0)
#>  lifecycle     1.0.3   2022-10-07 [1] CRAN (R 4.3.1)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.1)
#>  pillar        1.9.0   2023-03-22 [1] CRAN (R 4.3.1)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.3.1)
#>  purrr         1.0.1   2023-01-10 [1] CRAN (R 4.3.1)
#>  R.cache       0.16.0  2022-07-21 [1] CRAN (R 4.3.1)
#>  R.methodsS3   1.8.2   2022-06-13 [1] CRAN (R 4.3.0)
#>  R.oo          1.25.0  2022-06-12 [1] CRAN (R 4.3.0)
#>  R.utils       2.12.2  2022-11-11 [1] CRAN (R 4.3.1)
#>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.3.1)
#>  reprex        2.0.2   2022-08-17 [1] CRAN (R 4.3.1)
#>  RJSONIO       1.3-1.8 2023-01-31 [1] CRAN (R 4.3.0)
#>  rlang         1.1.1   2023-04-28 [1] CRAN (R 4.3.1)
#>  rmarkdown     2.23    2023-07-01 [1] CRAN (R 4.3.1)
#>  rstudioapi    0.15.0  2023-07-07 [1] CRAN (R 4.3.1)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.1)
#>  styler        1.10.1  2023-06-05 [1] CRAN (R 4.3.1)
#>  tibble        3.2.1   2023-03-20 [1] CRAN (R 4.3.1)
#>  tidyselect    1.2.0   2022-10-10 [1] CRAN (R 4.3.1)
#>  utf8          1.2.3   2023-01-31 [1] CRAN (R 4.3.1)
#>  vctrs         0.6.3   2023-06-14 [1] CRAN (R 4.3.1)
#>  withr         2.5.0   2022-03-03 [1] CRAN (R 4.3.1)
#>  xfun          0.39    2023-04-20 [1] CRAN (R 4.3.1)
#>  yaml          2.3.7   2023-01-23 [1] CRAN (R 4.3.0)
#> 
#>  [1] C:/R/R-4.3.1/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

Original post in the DARWIN fork

mvankessel-EMC avatar Aug 29 '23 11:08 mvankessel-EMC

@mvankessel-EMC thanks for this - I'm not quite sure what we use the json for?

@gowthamrao is it used in the metadata to check what the calling arguments set by the user were? If so, calling as.list(environment()) %>% RJSONIO::toJSON(digits = 23, pretty = TRUE) would achieve this more elegantly.

azimov avatar Aug 30 '23 16:08 azimov
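A rough sketch of the as.list(environment()) suggestion, applied to a cut-down function with only three of the arguments. The name executeDiagnosticsSketch is hypothetical and stands in for the real function. One thing to watch: the capture has to happen before any local variables are created, because environment() is the whole evaluation frame and not just the arguments.

```r
# Hypothetical, cut-down stand-in for executeDiagnostics(); not package code.
executeDiagnosticsSketch <- function(runTimeSeries = FALSE,
                                     minCellCount = 5,
                                     incremental = FALSE) {
  # Capture the frame first: a local variable assigned before this line
  # would also end up in the list.
  callingArgs <- as.list(environment())
  return(callingArgs)
}

res <- executeDiagnosticsSketch(runTimeSeries = TRUE, minCellCount = 10)
str(res)

# Serialise only if RJSONIO is installed, using the call from the comment above.
if (requireNamespace("RJSONIO", quietly = TRUE)) {
  cat(RJSONIO::toJSON(res, digits = 23, pretty = TRUE), "\n")
}
```

Here res holds the supplied values for runTimeSeries and minCellCount and the default for incremental.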

@azimov I took the liberty to trace down the path callingArgsJson is used for. From what I can gather it follows this path:

  1. executeDiagnostics() RunDiagnostics.R
     a. L217-L232
     b. L962-L1014, specifically L973
     c. L1015-L1020
     d. L1021-L1025
  2. makeDataExportable() Private.R L121-L250
  3. enforceMinCellValueDataframe() Private.R L252-L273
  4. enforceMinCellValue() Private.R L57-L80

mvankessel-EMC avatar Aug 31 '23 09:08 mvankessel-EMC

Yes - it looks like this is just stored in the metadata result - we just need to create a list that stores the relevant arguments (and doesn't leak user data, e.g. connectionDetails)

azimov avatar Aug 31 '23 15:08 azimov
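One possible shape for such a list, as a sketch only: capture the call's values and then keep an explicit allowlist of argument names, so that anything not named (such as connectionDetails) never reaches the JSON. The function name, the reduced argument set, and the allowlist below are all assumptions for illustration, not the package's actual implementation.

```r
# Hypothetical stand-in for executeDiagnostics() with a sensitive argument.
executeDiagnosticsSafeSketch <- function(connectionDetails = NULL,
                                         runTimeSeries = FALSE,
                                         minCellCount = 5,
                                         incremental = FALSE) {
  # Values as supplied in this call (defaults where nothing was supplied).
  allArgs <- as.list(environment())

  # Assumed allowlist: only these names are ever stored in the metadata.
  safeNames <- c("runTimeSeries", "minCellCount", "incremental")
  return(allArgs[safeNames])
}

safeArgs <- executeDiagnosticsSafeSketch(
  connectionDetails = list(user = "someUser", password = "secret"),
  minCellCount = 10
)
names(safeArgs)
```

An allowlist is used here rather than dropping connectionDetails by name, so that an argument added to the function later is excluded until someone deliberately lists it.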