teal icon indicating copy to clipboard operation
teal copied to clipboard

479 add `renv::snapshot()` for `.lockfile` with `future` + `shiny::ExtendedTask`

Open m7pr opened this issue 1 year ago • 18 comments

Fixes https://github.com/insightsengineering/coredev-tasks/issues/479 Alternative for https://github.com/insightsengineering/teal/pull/1224

This PR includes new imports

  • {renv}
    • so that we can use a function that creates a .lockfile for future reproducibility
  • {future}
    • so that we can run pak::lockfile_create() as a shiny::ExtendedTask()$new in a parallel process to the shiny session

Tested with

app <- init(
  data = teal_data(iris = iris),
  modules = example_module(label = "example teal module")
)
if (interactive()) {
  shinyApp(app$ui, app$server)
}

m7pr avatar May 22 '24 20:05 m7pr

Created a small video during which the lockfile creation took 21 seconds, but during this time it was possible to work with the app.

✔ Created lockfile 'session.lock' [21.5s]

https://github.com/insightsengineering/teal/assets/133694481/8841c710-c0de-4225-93ec-c0a8e2359c70

m7pr avatar May 22 '24 21:05 m7pr

Unit Tests Summary

  1 files   30 suites   5m 20s :stopwatch: 241 tests 241 :white_check_mark: 0 :zzz: 0 :x: 507 runs  507 :white_check_mark: 0 :zzz: 0 :x:

Results for commit 560a1ff2.

:recycle: This comment has been updated with latest results.

github-actions[bot] avatar May 22 '24 21:05 github-actions[bot]

Unit Test Performance Difference

Test Suite $Status$ Time on main $±Time$ $±Tests$ $±Skipped$ $±Failures$ $±Errors$
utils 💔 $0.35$ $+63.29$ $+2$ $0$ $0$ $0$
Additional test case details
Test Suite $Status$ Time on main $±Time$ Test Case
utils 👶 $+63.31$ create_renv_lockfile_creates_a_lock_file_during_the_execution

Results for commit 2d8be64408804b95c37b32137eb492daed88acfb

♻️ This comment has been updated with latest results.

github-actions[bot] avatar May 22 '24 21:05 github-actions[bot]

some tests fail for shiny::testServer() so I think we can reuse the approach we created for logger::log_shiny_input_changes where we skip this functionality if session inherits(session, "MockShinySession")

m7pr avatar May 22 '24 22:05 m7pr

But how to include that in teal::init if the there is no shiny session yet : ) I guess we could use identical(Sys.getenv("TESTTHAT"), "true") to detect process is in the testing mode

m7pr avatar May 22 '24 22:05 m7pr

badge

Code Coverage Summary

Filename                          Stmts    Miss  Cover    Missing
------------------------------  -------  ------  -------  --------------------------------------------------------------------------------------------------------------------------------------------------
R/dummy_functions.R                  36      25  30.56%   21-37, 40-47
R/get_rcode_utils.R                  31       1  96.77%   50
R/include_css_js.R                   22      17  22.73%   12-38, 76-82
R/init.R                             87      31  64.37%   108-115, 164-165, 167, 179-200, 231-232, 234
R/landing_popup_module.R             25      25  0.00%    61-87
R/module_bookmark_manager.R         158     125  20.89%   42-43, 57-59, 70-83, 93-143, 148-149, 189, 224-301
R/module_filter_manager.R            84      19  77.38%   38-42, 157, 162-175
R/module_nested_tabs.R              168      67  60.12%   40-122, 138, 190, 212, 234, 242, 246
R/module_snapshot_manager.R         241     178  26.14%   95-107, 136-139, 143-144, 159-169, 173-188, 190-198, 205-220, 224-228, 230-236, 239-252, 255-273, 282-298, 313-336, 339-350, 353-359, 373, 394-418
R/module_tabs_with_filters.R         79      36  54.43%   34-74, 106, 122
R/module_teal_with_splash.R         114      34  70.18%   66-101, 116, 137, 203-204
R/module_teal.R                     123      86  30.08%   50-119, 152-153, 159-162, 175, 189-227
R/module_wunder_bar.R                60      39  35.00%   23-41, 55-64, 68-77
R/modules.R                         159      23  85.53%   127-130, 147-151, 206-209, 291-292, 344, 356-364
R/reporter_previewer_module.R        19       2  89.47%   30, 34
R/show_rcode_modal.R                 24      24  0.00%    17-42
R/tdata.R                            53       1  98.11%   154
R/teal_data_module-eval_code.R       27       0  100.00%
R/teal_data_module-within.R           7       0  100.00%
R/teal_data_module.R                  6       0  100.00%
R/teal_lockfile.R                    56      22  60.71%   34-38, 43-51, 72, 91, 94-102, 109
R/teal_reporter.R                    68       6  91.18%   69, 77, 125-126, 129, 146
R/teal_slices-store.R                29       0  100.00%
R/teal_slices.R                      59      12  79.66%   137-150
R/TealAppDriver.R                   324     324  0.00%    43-671
R/utils.R                           185       1  99.46%   278
R/validate_inputs.R                  32       0  100.00%
R/validations.R                      58      37  36.21%   110-377
R/zzz.R                              12       8  33.33%   3-15
TOTAL                              2346    1143  51.28%

Diff against main

Filename             Stmts    Miss  Cover
-----------------  -------  ------  -------
R/init.R                +1       0  +0.41%
R/module_teal.R         +3      +2  +0.08%
R/teal_lockfile.R      +56     +22  +60.71%
TOTAL                  +60     +24  +0.23%

Results for commit: 560a1ff2893e6692e5699814c31ed69344465125

Minimum allowed coverage is 80%

:recycle: This comment has been updated with latest results

github-actions[bot] avatar May 22 '24 22:05 github-actions[bot]

Hey team (cc @pawelru @gogonzo @vedhav @donyunardi) based on the points and the discussion in here I personally think there is a lot of issues with running pak::lockfile_create() on our end, because

  • it basically does not create a lockfile for a project (but for a package) so we need to create lockfile specification on our own
  • takes long time to compute, so we need a parallel process, which is discouraged to use in package code
  • brings dependencies on 3 new packages - pak, future and sessionifno
  • is extremely hard to test because there is gazillion ways of how you can install a package

Should we switch to renv::snapshot() or move to a solution where user passes the lockfile to init, stated by @vedhav in here https://github.com/insightsengineering/teal/pull/1224#issuecomment-2124913037

m7pr avatar May 23 '24 11:05 m7pr

@pawelru as a team we were aligned yesterday during daily that it will be the simplest to allow the user to compute and pass his/her own lockfile. If you do think the approach with shiny::ExtendedTask and pak::lockfile_create has too many obstacles and dark wholes, we are more than happy to close this one and move with the simpler solution.

m7pr avatar May 23 '24 15:05 m7pr

@pawelru as a team we were aligned yesterday during daily that it will be the simplest to allow the user to compute and pass his/her own lockfile. If you do think the approach with shiny::ExtendedTask and pak::lockfile_create has too many obstacles and dark wholes, we are more than happy to close this one and move with the simpler solution.

Hmmm... I'm still not very convinced for the reasons I outlined here. It is true that it would be easier to implement as you will essentially delegate the whole thing to the app developer. My hope is that the ease of implementation was not the main reason behind this decision. Recently, we discussed with POs the high level strategy for teal and other products and one of the key items was simplification (of any kind) for developers. This is very much against it and that's why I have problem with it. As a teal app developer, I think it's perfectly reasonable to expect teal to create the lockfile on its own. This of course will (and should) depend on the way how the environment was set up and, in extreme cases, it will produce not restorable file. If it's not restorable then it is how it is. I think that it has to be acknowledged by everyone (developers and users) that some stuff is beyond teal here.

pawelru avatar May 23 '24 15:05 pawelru

@pawelru this is proper attitude to state that some cases might be beyond teal. If we stick to this motto, I'm fine creating this feature on our side, knowing that testing of all test cases will be hard but it's not crucial to cover each edge case but rather to support great majority of typical use cases.

m7pr avatar May 27 '24 09:05 m7pr

Hey, thanks for the feedback. Finally in this PR I decided to go with renv::snapshot() solution due to motivation I stated in here https://github.com/insightsengineering/teal/pull/1232#discussion_r1616785540 I think pak is great for package installation, but for the lockfile creation I reckon renv does more automated things for us.

I applied feedback and created a button that allows to download the lockfile. . We ended up using the future::plan() for the asynchronous job.

I allowed the user to pass his own .lockfile, use DESCRIPTION file as a base for .lockfile generation or use default renv::snapshot() behavior to track all packages included in the working directory. You can read the documentation I provided to understand the mechanism. Feel free to test with below code


# pass your own .lockfile
# options(teal.renv.lockfile = "session.lock")

# allow to create the .lockfile based on DESCRIPTION file
# renv::settings$snapshot.type("explicit")

app <- init(
  data = teal_data(iris = iris),
  modules = example_module(label = "example teal module")
)
if (interactive()) {
  shinyApp(app$ui, app$server)
}

m7pr avatar May 28 '24 08:05 m7pr

Maybe one more thing would be capturing the output of renv::snapshot() so it's not pasted into the console. And maybe handling cases when it produces errors, however with the force = TRUE it will produce lockfile even though there are errors.

m7pr avatar May 28 '24 11:05 m7pr

urlchecker is failining in the container, but not on my machine locally - will need to pull the container and get inside to verify

m7pr avatar May 28 '24 14:05 m7pr

App fails when I try to start a new (concurent) shiny session.

image

Also project.lock contains packages which definitely are not associated with the app (teal.modules.clinical, teal.modules.general). I guess it reads from files in a working directory (in my case it is ~/.)

app code
pkgload::load_all("teal")

app <- init(
  data = teal_data(iris = iris),
  modules = example_module(label = "example teal module")
)
if (interactive()) {
  shinyApp(app$ui, app$server)
}

From the other PR

Because some packages could be loaded on require by some modules, it means that session might constantly change (more and more packages will be added), we recommend then a mechanism to update a lockfile when needed.

This is not needed as renv::snapshot() for the default type = "implicit" takes all packages from the files in current project directory. For the type = "explicit", it takes only the specs from DESCRIPTION file (the same as pak::lockfile_create() does) so it's not working on an actual R session but on a list of files, that do not change.

If it is true that renv::dependencies() can conclude all packages properly then we don't need to run lockfile_create() every shiny session but probably we can set future worker in init (so we have one for the whole app).

gogonzo avatar May 28 '24 14:05 gogonzo

Also project.lock contains packages which definitely are not associated with the app (teal.modules.clinical, teal.modules.general). I guess it reads from files in a working directory (in my case it is ~/.)

@gogonzo yes it does. Please see ?create_renv_lockfile to see it takes from project working directory by default. You can set this up to take from DESCRIPTION file or from custom lockfile. This is the same as renv::snapshot() is set up - they dont take from the session - they take from the working directory or from the DESCRIPTION file. There is also one extensive version to take from all installed files : ) - check renv::snapshot() type parameter documentation.

m7pr avatar May 28 '24 14:05 m7pr

If it is true that renv::dependencies() can conclude all packages properly then we don't need to run lockfile_create() every shiny session but probably we can set future worker in init (so we have one for the whole app).

That would be great - this requires the srv_teal_with_splash and srv_teal to pass lockfile/future object through their parameters.

m7pr avatar May 28 '24 14:05 m7pr

Hey @gogonzo thanks to your suggestion I moved back lockfile_task creation to init(). It has been there before we provided changes for this suggestion (point 3 https://github.com/insightsengineering/teal/pull/1224#issuecomment-2131296005).

Now the lockfile creation task is invoked once and is shared across the app for all shiny sessions. I reckon this is good to go. I added also some monitoring logs. Will schedule a meeting with the team to present the solution and to get this over the finish lines.

m7pr avatar May 29 '24 12:05 m7pr

We are seeing an error when running examples during CI that

> cleanEx()
Error: connections left open:
	<-localhost:37159 (sockconn)
	<-localhost:37159 (sockconn)

We may want to setup the parallel connection once the lockfile computations are done.

m7pr avatar May 29 '24 13:05 m7pr

@averissimo @kartikeyakirar @vedhav thanks for joining the meeting on Monday where I outlined the status of this feature.

The feature is finalized and ready for a quick review. Below you can find the summary of the feature.

We allow to download the .lockfile through teal app.

.lockfile

  • can be pre-computed by the user - if this is the case specify the path to the file through options(teal.renv.lockfile = "")
  • can be computed out of a current working directory - default behavior
  • can be computed out of a DESCRIPTION file from a current working directory - then run renv::settings$snapshot.type("explicit") before starting the app
  • can contain custom packages (out of custom files) - run renv::settings$snapshot.type("explicit") and specify files as a result of option(renv.snapshot.filter = fun1()) where fun1() returns the list of packages

Process

  • .lockfile creation is executed as a separate parallel task with future::plan(multisession, workers = 2) backend. If there was already any parallel backend when the teal::init was called, then the backend is reused. If there was no backend, then we create a new one and close it as soon as possible.
  • .lockfile is created as a file in working directory. It can not be created as a temp file in a parallel task that gets closed, because this file gets deleted.
  • We are using shiny::ExtendedTask and promises::future_promise to execute the parallel task. shiny::ExtendedTask is designed to work with parallel long-lasting process in shiny. It also has a cool feature of synchronization with bslib buttons, but we are not utilizing it. More in here https://shiny.posit.co/r/articles/improve/nonblocking/ for the future improvements.
  • The parallel task is started in teal::init so there is only one task per app, not a task per each session.
  • .lockfile is not created during: mocked shiny sessions, testthat process and R CMD CHECK. It is hard to control the parallel process in the setup of those mocked or controlled semi-R processes.

Code

Test with

# pass your own .lockfile
# options(teal.renv.lockfile = "session.lock")

# allow to create the .lockfile based on DESCRIPTION file
# renv::settings$snapshot.type("explicit")

app <- init(
  data = teal_data(iris = iris),
  modules = example_module(label = "example teal module")
)
if (interactive()) {
  shinyApp(app$ui, app$server)
}

m7pr avatar Jun 04 '24 10:06 m7pr

Thanks for the idea for the test - I edited it slighty so it returns back the same parallel background that was set before create_renv_lockfile call. It was an internal function so I did not plan on exhaustive testing but this is great.

testthat::test_that("create_renv_lockfile creates a lock file during the execution", {
  old_plan <- future::plan(future::sequential)
  withr::defer(future::plan(old_plan))
  
  renv_file_name <- "teal_renv_lock.lock"
  withr::defer(file.remove(renv_file_name))
  promise <- create_renv_lockfile(TRUE)
  
  testthat::expect_true(file.exists(renv_file_name))
})

I wonder if there is a way to check if shiny::ExtendedTask() in teal::app is created, however we specifically omit the creation of it during tests

m7pr avatar Jun 07 '24 12:06 m7pr

Thanks @kartikeyakirar and @averissimo for the review! Included comments and suggestions in recent commits

m7pr avatar Jun 07 '24 12:06 m7pr

@pawelru any last words on this one? This was a bumpy ride, and even though the research phase took a while I think we did huge team-work on this

m7pr avatar Jun 17 '24 13:06 m7pr

@pawelru I think you can consider yourself as an author :D

m7pr avatar Jun 19 '24 10:06 m7pr

@ruckip @m7pr I just had a moment of reflection on the sofa. And yes, I know it is in already. I just want to know you thoughts

Since:

  • renv::snapshot is used in a way that doesn't need a current R session (it doesn't compute from attached packages but scans files) Then:
  • future might be en overkill here. I think that callr::r_bg could simplify the code and make it more robust in the same time (no future::plan change)

gogonzo avatar Jun 20 '24 13:06 gogonzo

I have no strong preference nor opinion on that to be honest. I'm fine with whatever not blocking shiny session. Of course the simpler the better.

pawelru avatar Jun 20 '24 13:06 pawelru

I created an alternate PR where we use callr instead of future and promises. https://github.com/insightsengineering/teal/pull/1255 Testing it now

m7pr avatar Jul 01 '24 10:07 m7pr

The simplified version that does not use future+promise combination can be found in here and is ready for testing https://github.com/insightsengineering/teal/pull/1255 Waaay more simplified, however we lost track of the possibility to track the outcome of lockfile creation

m7pr avatar Jul 01 '24 13:07 m7pr