Callr module sourcing

Open ElianHugh opened this issue 3 years ago • 8 comments

I've been having trouble working with callr and box. Specifically, sourcing modules via `box::use(./module)` fails, as it appears that callr attempts to find them in a /tmp/ folder, rather than the working directory.

Might be related to #49?

Background info: I've come across this when using the {targets} package, where I have to specify callr_function = NULL in order to run the pipeline

ElianHugh avatar Apr 25 '21 03:04 ElianHugh

Ah, that’s unfortunate. Apparently ‘callr’ uses a temporary file to invoke code via R -f ‹tmpfile›, which ‘box’ uses to figure out the location of the calling code.

Unfortunately inside the subprocess ‘box’ has no way of knowing that it was invoked via ‘callr’, and from where. I think fixing this would require support from the ‘callr’ package, some way of hooking into the call and passing information into the child process.

To work around this, you can explicitly set the script path via box::set_script_path inside the subprocess. Note that this function expects the full path of a module, not of the containing directory. However, the script path doesn’t actually have to exist; it just needs to point to a location whose parent directory will subsequently be searched for modules.

Here’s an example:

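box::use(callr)

callr$r(function (mod) {
    # Tell 'box' where the calling code lives; local modules are then
    # searched for in the parent directory of this path.
    box::set_script_path(mod)
    box::use(./module)
    # '_' below is a dummy file name; the file itself need not exist.
}, args = list(mod = file.path(box::file(), '_')))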

klmr avatar Apr 25 '21 14:04 klmr

Actually it might be possible to handle this transparently without ‘callr’ cooperation by computing the script path when ‘box’ is loaded, and storing it in an environment variable (unless that variable is already set; in that case, assume that we’re inside a subprocess, e.g. via ‘callr’). Then, when script_path is called internally, that environment variable is used instead of recomputing the path.

The environment variable would need to be updated during the loading of a module (and reset afterwards) to support ‘callr’ invocations inside module source code.
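A rough sketch of the idea described above, in base R only. This is not how ‘box’ is implemented: the variable name and all three helper functions are hypothetical, and `compute` merely stands in for whatever path detection would really be used.

```r
# Hypothetical environment variable name; 'box' defines no such variable.
script_path_var = 'HYPOTHETICAL_SCRIPT_PATH'

# Helper: set the environment variable whose name is stored in a string.
set_var = function (value) {
    do.call(Sys.setenv, stats::setNames(list(value), script_path_var))
}

# Run once at load time. If the variable is already set, assume we are
# inside a subprocess (e.g. via 'callr') and keep the inherited value.
init_script_path = function (compute) {
    if (! nzchar(Sys.getenv(script_path_var))) set_var(compute())
    invisible(Sys.getenv(script_path_var))
}

# Later lookups read the stored value instead of recomputing the path.
script_path = function () Sys.getenv(script_path_var)

# While a module is being loaded, point the variable at that module,
# then restore the previous value afterwards.
with_script_path = function (path, code) {
    old = Sys.getenv(script_path_var)
    on.exit(set_var(old))
    set_var(path)
    code
}

# Usage with made-up paths:
init_script_path(function () '/project/main.r')
init_script_path(function () '/tmp/Rtmp/file')  # ignored: already set
script_path()
with_script_path('/project/mod/a.r', script_path())
```

Because the value lives in the process environment, a child process would inherit it without any cooperation from ‘callr’.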


Edit: Nope, that fundamentally doesn’t work: normally, when the user invokes an external R script, they want that script to be standalone, i.e. act as its own module. ‘callr’ is special because it is (from the user perspective) invoked with an R function, not a script.

klmr avatar Apr 25 '21 14:04 klmr

> Edit: Nope, that fundamentally doesn’t work: normally, when the user invokes an external R script, they want that script to be standalone, i.e. act as its own module. ‘callr’ is special because it is (from the user perspective) invoked with an R function, not a script.

Ah, that's a pity. Thank you for looking into this! The workaround you posted works nicely, so I can use that for the time being - thanks!

ElianHugh avatar Apr 26 '21 02:04 ElianHugh

Hello, I'm sorry but I still don't know how to use the workaround with {targets}. Could you please give some example code?

phineas-pta avatar Jun 19 '22 22:06 phineas-pta

This is still unclear to me and breaks targets pipelines when I try to use them with box.

Namely, box doesn't search the box.path I've set for modules. When I run targets::tar_visnetwork() I get this error:

Error:
! Error running targets::tar_visnetwork()
  Target errors: targets::tar_meta(fields = error, complete_only = TRUE)
  Tips: https://books.ropensci.org/targets/debugging.html
  Last error: unable to load module “preprocessing/preprocessing”; not found in “/nfsdata/projects/petar/fgf1/renv/profiles/ygg/renv/library/R-4.1/x86_64-conda-linux-gnu/box/mod”, “/scratch/nmq407/R_tmp//Rtmp0WS3Dk”
Traceback:

1. targets::tar_visnetwork()
2. callr_outer(targets_function = tar_visnetwork_inner, targets_arguments = targets_arguments, 
 .     callr_function = callr_function, callr_arguments = callr_arguments, 
 .     envir = envir, script = script, store = store, fun = "tar_visnetwork")
3. if_any(inherits(out, "error"), callr_error(condition = out, fun = fun), 
 .     out)
4. callr_error(condition = out, fun = fun)
5. tar_throw_run(message)
6. tar_error(message = paste0(...), class = c("tar_condition_run", 
 .     "tar_condition_targets"))
7. rlang::abort(message = message, class = class, call = tar_empty_envir)
8. signal_abort(cnd, .file)

The targets pipeline is attempting to import the module, which I can import in an interactive R session using the following code:

box::use(preprocessing/preprocessing)

pvtodorov avatar Sep 27 '22 13:09 pvtodorov

I have continued a targets discussion re: compatibility with box here: https://github.com/ropensci/targets/discussions/936#discussioncomment-3745040

pvtodorov avatar Sep 27 '22 17:09 pvtodorov

@pvtodorov Unfortunately I don’t currently have the time to debug this properly but my suspicion is that ~/.Rprofile isn’t being properly sourced by ‘callr’, and that the R options of the current session aren’t inherited. If that’s the case, you can try using the R_BOX_PATH environment variable instead of the box.path R option: it’s treated as equivalent by ‘box’, and as an environment variable its value should definitely be inherited by a child process.

To test this, include the following code in either your calling script (_targets.R, I presume?) or your ~/.Rprofile:

Sys.setenv(R_BOX_PATH = '/projects/petar/fgf1/code/')
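One way to check this suspicion is to set both the environment variable and the option in the parent session, then ask a fresh ‘callr’ subprocess what it sees. This is only a minimal sketch: the path is a placeholder, and the option may still show up in the child if a profile file that the child loads sets it.

```r
# Placeholder module search path, for illustration only.
Sys.setenv(R_BOX_PATH = '/path/to/modules')
options(box.path = '/path/to/modules')

# Ask a fresh 'callr' subprocess what it can see.
seen = callr::r(function () {
    list(
        env = Sys.getenv('R_BOX_PATH'),
        opt = getOption('box.path')
    )
})

seen$env  # inherited through the process environment
seen$opt  # options set in the parent session are not copied to the child
```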

Regarding the incorrect dependency tracking of ‘targets’, this is presumably due to the fact that ‘targets’ internally resolves names using the ‘codetools’ package. Now, ‘codetools’ doesn’t know about the special meaning of box::use declarations and therefore doesn’t handle them correctly. I’d love to add support for ‘box’ to this package but it’s a nontrivial addition, and I am not sure that its maintainer would accept such changes anyway.

klmr avatar Sep 27 '22 21:09 klmr

Thanks for the quick reply and explanation.

Here are the results for setting box.path:

✅ setting it as R_BOX_PATH = '/projects/petar/fgf1/code/' in .Renviron works
✅ setting it as Sys.setenv(R_BOX_PATH = '/projects/petar/fgf1/code/') in the .Rprofile works
✅ setting it as Sys.setenv(R_BOX_PATH = '/projects/petar/fgf1/code/') in the _targets.R works
🚫 setting it as an option in .Rprofile does not
✅ setting it as an option in _targets.R does

on the dependency tracking side, is there anything I can do as a user to wrap my box code, or run after it to accomplish the behavior I get with Example 3?

pvtodorov avatar Sep 29 '22 09:09 pvtodorov