software-review
rixpress: Reproducible Analytical Pipelines with Nix
Submitting Author Name: Bruno Rodrigues
Submitting Author Github Handle: @b-rodrigues
Repository: https://github.com/b-rodrigues/rixpress
Version submitted: 0.2.0
Submission type: Standard
Editor: @ldecicco-USGS
Reviewers: TBD
Archive: TBD
Version accepted: TBD
Language: en
- Paste the full DESCRIPTION file inside a code block below:
Package: rixpress
Title: Build Reproducible Analytical Pipelines With Nix
Version: 0.2.0
Authors@R:
    person("Bruno", "Rodrigues", , "[email protected]", role = c("aut", "cre"))
Description: Streamlines the creation of reproducible analytical pipelines using
    `default.nix` expressions generated via `{rix}` for reproducibility. Define
    derivations in R or Python, chain them into a composition of pure functions
    and build the resulting pipeline using `Nix` as the underlying end-to-end build
    tool. Functions to plot a DAG representation of the pipeline are included,
    as well as functions to load and inspect intermediary results for interactive
    analysis. User experience heavily inspired by the `{targets}` package.
License: GPL (>= 3)
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
URL: https://github.com/b-rodrigues/rixpress/, https://b-rodrigues.github.io/rixpress/
BugReports: https://github.com/b-rodrigues/rixpress/issues
Depends:
    R (>= 4.1.0)
Imports:
    igraph,
    jsonlite,
    processx
RoxygenNote: 7.3.2
Suggests:
    dplyr,
    ggdag,
    ggplot2,
    knitr,
    mockery,
    reticulate,
    rix,
    rmarkdown,
    testthat (>= 3.0.0),
    usethis,
    visNetwork
Config/testthat/edition: 3
VignetteBuilder: knitr
Scope
- Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):
- [ ] data retrieval
- [ ] data extraction
- [ ] data munging
- [ ] data deposition
- [ ] data validation and testing
- [x] workflow automation
- [ ] version control
- [ ] citation management and bibliometrics
- [ ] scientific software wrappers
- [ ] field and lab reproducibility tools
- [ ] database software bindings
- [ ] geospatial data
- Explain how and why the package falls under these categories (briefly, 1-2 sentences):
This package is intended to help users set up reproducible pipelines using the Nix programming language for enhanced reproducibility.
- Who is the target audience and what are scientific applications of this package?
The target audience is anyone wanting to switch from "script-based workflows" to build automation. rixpress generates valid Nix expressions from simple R functions to define reproducible pipelines, and is heavily inspired by {targets}. The main difference between {targets} and this package is that the "heavy lifting" is performed by Nix, and it works very closely with my previous package, {rix}, which allows data scientists to set up reproducible environments using Nix. Also, because the underlying engine is Nix, it is language-agnostic, so it is possible to define steps that use Python. These steps written in Python are not executed with {reticulate}, but instead run in a dedicated Python environment. Data transfer between Python and R is facilitated with {reticulate}, though.
- Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category?
The main inspiration for this package is {targets}, and in combination with {rix}, one could set up a pipeline in a reproducible environment as well.
- (If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research?
- If you made a pre-submission inquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.
Link to presubmission: https://github.com/ropensci/software-review/issues/699
@maurolepore
- Explain reasons for any pkgcheck items which your package is unable to pass.
Because this package relies heavily on side effects, unit tests are quite cumbersome to write, so I set up this other repository: https://github.com/b-rodrigues/rixpress_demos which contains many example pipelines that run on each push to {rixpress}'s repository. Thanks to LLMs I was able to improve test coverage to 67% (see https://github.com/b-rodrigues/rixpress/actions/runs/14971090564).
Technical checks
Confirm each of the following by checking the box.
- [x] I have read the rOpenSci packaging guide.
- [x] I have read the author guide and I expect to maintain this package for at least 2 years or to find a replacement.
This package:
- [x] does not violate the Terms of Service of any service it interacts with.
- [ ] has a CRAN and OSI accepted license.
- [x] contains a README with instructions for installing the development version.
- [x] includes documentation with examples for all functions, created with roxygen2.
- [x] contains a vignette with examples of its essential functions and uses.
- [x] has a test suite.
- [x] has continuous integration, including reporting of test coverage.
Publication options
- [x] Do you intend for this package to go on CRAN?
- [ ] Do you intend for this package to go on Bioconductor?
- [ ] Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:
MEE Options
- [ ] The package is novel and will be of interest to the broad readership of the journal.
- [ ] The manuscript describing the package is no longer than 3000 words.
- [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
- (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.)
- (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
- (Please do not submit your package separately to Methods in Ecology and Evolution)
Code of conduct
- [x] I agree to abide by rOpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.
Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.
:rocket:
Editor check started
:wave:
Checks for rixpress (v0.2.0)
git hash: dbdc68c8
- :heavy_check_mark: Package name is available
- :heavy_check_mark: has a 'codemeta.json' file.
- :heavy_check_mark: has a 'contributing' file.
- :heavy_multiplication_x: The following functions have no documented return values: [export_nix_archive, import_nix_archive, print.derivation, rxp_init]
- :heavy_check_mark: uses 'roxygen2'.
- :heavy_check_mark: 'DESCRIPTION' has a URL field.
- :heavy_check_mark: 'DESCRIPTION' has a BugReports field.
- :heavy_check_mark: Package has at least one HTML vignette
- :heavy_multiplication_x: These functions do not have examples: [export_nix_archive, import_nix_archive, print.derivation, rxp_common_setup, rxp_file_common, rxp_inspect, rxp_list_logs, rxp_make, rxp_py_file, rxp_r_file].
- :heavy_check_mark: Package has continuous integration checks.
- :heavy_multiplication_x: Package coverage is 67% (should be at least 75%).
- :heavy_check_mark: R CMD check found no errors.
- :heavy_check_mark: R CMD check found no warnings.
- :eyes: Function names are duplicated in other packages
Important: All failing checks above must be addressed prior to proceeding
(Checks marked with :eyes: may be optionally addressed.)
Package License: GPL (>= 3)
1. Package Dependencies
Details of Package Dependency Usage (click to open)
The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.
| type | package | ncalls |
|---|---|---|
| internal | base | 328 |
| internal | rixpress | 53 |
| internal | stats | 8 |
| internal | graphics | 6 |
| internal | utils | 1 |
| imports | jsonlite | 5 |
| imports | igraph | 4 |
| imports | processx | 3 |
| suggests | ggplot2 | 9 |
| suggests | ggdag | 3 |
| suggests | dplyr | NA |
| suggests | knitr | NA |
| suggests | mockery | NA |
| suggests | reticulate | NA |
| suggests | rix | NA |
| suggests | rmarkdown | NA |
| suggests | testthat | NA |
| suggests | usethis | NA |
| suggests | visNetwork | NA |
| linking_to | NA | NA |
Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(<path/to/repo>)', and examining the 'external_calls' table.
base
list (31), sapply (24), sprintf (20), paste0 (15), file.path (14), c (12), deparse1 (12), substitute (12), grep (11), lapply (10), for (9), gsub (9), list.files (8), paste (8), readLines (8), length (7), file (6), data.frame (5), match (5), regmatches (5), unlist (5), args (4), character (4), grepl (4), basename (3), Filter (3), format (3), gregexpr (3), pretty (3), seq_along (3), strsplit (3), sub (3), subset (3), unique (3), vapply (3), any (2), append (2), if (2), lengths (2), setdiff (2), stdout (2), system2 (2), tryCatch (2), which (2), as.character (1), cat (1), col (1), deparse (1), dirname (1), do.call (1), drop (1), file.info (1), getwd (1), I (1), identity (1), is.list (1), is.null (1), names (1), Negate (1), nrow (1), numeric (1), readline (1), readRDS (1), Reduce (1), regexec (1), rep (1), return (1), round (1), source (1), stop (1), Sys.time (1), system.file (1), vector (1)
rixpress
cb (3), get_need_py (3), get_need_r (3), gen_flat_pipeline (2), gen_pipeline (2), generate_configurePhase (2), load_line (2), parse_nix_envs (2), parse_packages (2), parse_rpkgs_git (2), rxp_inspect (2), rxp_list_logs (2), rxp_read_load_setup (2), unnest_all_columns (2), add_import (1), adjust_import (1), adjust_py_packages (1), confirm (1), dag_for_ci (1), export_nix_archive (1), generate_dag (1), generate_libraries_from_nix (1), generate_libraries_script (1), generate_py_libraries_from_nix (1), generate_r_libraries_from_nix (1), generate_r_or_py_libraries_from_nix (1), get_nodes_edges (1), import_formatter_py (1), import_formatter_r (1), import_nix_archive (1), print.derivation (1), rixpress (1), rxp_common_setup (1), rxp_copy (1), rxp_file_common (1), rxp_ga (1)
ggplot2
aes (7), scale_fill_manual (1), scale_shape_manual (1)
stats
df (5), var (2), line (1)
graphics
lines (6)
jsonlite
write_json (3), fromJSON (1), read_json (1)
igraph
write_graph (2), graph_from_data_frame (1), V (1)
ggdag
geom_dag_node (2), as_tidy_dagitty (1)
processx
run (3)
utils
timestamp (1)
2. Statistical Properties
This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.
Details of statistical properties (click to open)
The package has:
- code in R (100% in 12 files) and
- 1 authors
- 7 vignettes
- no internal data file
- 3 imported packages
- 29 exported functions (median 26 lines of code)
- 70 non-exported functions in R (median 30 lines of code)
Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages The following terminology is used:
- loc = "Lines of Code"
- fn = "function"
- exp/not_exp = exported / not exported
All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the checks_to_markdown() function
The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.
| measure | value | percentile | noteworthy |
|---|---|---|---|
| files_R | 12 | 63.4 | |
| files_vignettes | 7 | 98.0 | |
| files_tests | 11 | 88.5 | |
| loc_R | 1926 | 81.4 | |
| loc_vignettes | 1257 | 93.2 | |
| loc_tests | 1119 | 85.4 | |
| num_vignettes | 7 | 98.4 | TRUE |
| n_fns_r | 99 | 74.8 | |
| n_fns_r_exported | 29 | 76.9 | |
| n_fns_r_not_exported | 70 | 74.4 | |
| n_fns_per_file_r | 5 | 68.3 | |
| num_params_per_fn | 3 | 29.3 | |
| loc_per_fn_r | 28 | 74.0 | |
| loc_per_fn_r_exp | 26 | 57.3 | |
| loc_per_fn_r_not_exp | 30 | 78.2 | |
| rel_whitespace_R | 15 | 77.2 | |
| rel_whitespace_vignettes | 25 | 91.4 | |
| rel_whitespace_tests | 16 | 80.6 | |
| doclines_per_fn_exp | 25 | 23.9 | |
| doclines_per_fn_not_exp | 0 | 0.0 | TRUE |
| fn_call_network_size | 37 | 58.8 | |
2a. Network visualisation
Click to see the interactive network visualisation of calls between objects in package
3. goodpractice and other checks
Details of goodpractice checks (click to open)
3a. Continuous Integration Badges
GitHub Workflow Results
| id | name | conclusion | sha | run_number | date |
|---|---|---|---|---|---|
| 14971213883 | anthophilic-walkingstick: linux, macos, macos-arm64, windows, ubuntu-next, ubuntu-release, gcc14 | success | dbdc68 | 367 | 2025-05-12 |
| 14968903732 | crabby-dromaeosaur: linux, macos, macos-arm64, windows, ubuntu-next, ubuntu-release, gcc14 | failure | ebdacd | 364 | 2025-05-12 |
| 14971161307 | devtools-tests-via-r-nix | success | dbdc68 | 395 | 2025-05-12 |
| 14971138662 | divinatory-neonredguppy: linux, macos, macos-arm64, windows, ubuntu-next, ubuntu-release, gcc14 | success | 1c903f | 366 | 2025-05-12 |
| 14969011823 | lousy-mice: linux, macos, macos-arm64, windows, ubuntu-next, ubuntu-release, gcc14 | success | 5d36a0 | 365 | 2025-05-12 |
| 14971227259 | pages build and deployment | success | 6f3863 | 347 | 2025-05-12 |
| 14971161309 | pkgdown.yaml | success | dbdc68 | 403 | 2025-05-12 |
| 14971161323 | run-rhub-checks | success | dbdc68 | 370 | 2025-05-12 |
| 14968514429 | skeletonlike-wombat: linux, macos, macos-arm64, windows, ubuntu-next, ubuntu-release, gcc14 | failure | 433cb9 | 363 | 2025-05-12 |
| 14971161311 | Test coverage | success | dbdc68 | 143 | 2025-05-12 |
| 14971161310 | Trigger Demo Actions | success | dbdc68 | 236 | 2025-05-12 |
3b. goodpractice results
R CMD check with rcmdcheck
rcmdcheck found no errors, warnings, or notes
Test coverage with covr
Package coverage: 67.05
The following files are not completely covered by tests:
| file | coverage |
|---|---|
| R/generate_dag.R | 58.33% |
| R/plot_dag.R | 36.42% |
| R/rxp_copy.R | 27.78% |
| R/rxp_ga.R | 66.67% |
| R/rxp_make.R | 0% |
| R/rxp_read_load.R | 0% |
Cyclocomplexity with cyclocomp
The following functions have cyclocomplexity >= 15:
| function | cyclocomplexity |
|---|---|
| gen_pipeline | 33 |
| generate_dag | 25 |
Static code analyses with lintr
lintr found the following 148 potential issues:
| message | number of times |
|---|---|
| Avoid 1:nrow(...) expressions, use seq_len. | 1 |
| Avoid changing the working directory, or restore it in on.exit | 11 |
| Avoid library() and require() calls in packages | 20 |
| Avoid using sapply, consider vapply instead, that's type safe | 24 |
| Lines should not be more than 80 characters. This line is 101 characters. | 1 |
| Lines should not be more than 80 characters. This line is 102 characters. | 1 |
| Lines should not be more than 80 characters. This line is 104 characters. | 1 |
| Lines should not be more than 80 characters. This line is 105 characters. | 2 |
| Lines should not be more than 80 characters. This line is 106 characters. | 1 |
| Lines should not be more than 80 characters. This line is 107 characters. | 1 |
| Lines should not be more than 80 characters. This line is 109 characters. | 1 |
| Lines should not be more than 80 characters. This line is 113 characters. | 4 |
| Lines should not be more than 80 characters. This line is 117 characters. | 1 |
| Lines should not be more than 80 characters. This line is 125 characters. | 2 |
| Lines should not be more than 80 characters. This line is 138 characters. | 2 |
| Lines should not be more than 80 characters. This line is 159 characters. | 1 |
| Lines should not be more than 80 characters. This line is 169 characters. | 1 |
| Lines should not be more than 80 characters. This line is 171 characters. | 2 |
| Lines should not be more than 80 characters. This line is 173 characters. | 2 |
| Lines should not be more than 80 characters. This line is 174 characters. | 1 |
| Lines should not be more than 80 characters. This line is 193 characters. | 1 |
| Lines should not be more than 80 characters. This line is 197 characters. | 3 |
| Lines should not be more than 80 characters. This line is 203 characters. | 1 |
| Lines should not be more than 80 characters. This line is 205 characters. | 1 |
| Lines should not be more than 80 characters. This line is 281 characters. | 1 |
| Lines should not be more than 80 characters. This line is 310 characters. | 2 |
| Lines should not be more than 80 characters. This line is 357 characters. | 1 |
| Lines should not be more than 80 characters. This line is 362 characters. | 1 |
| Lines should not be more than 80 characters. This line is 373 characters. | 1 |
| Lines should not be more than 80 characters. This line is 380 characters. | 1 |
| Lines should not be more than 80 characters. This line is 399 characters. | 1 |
| Lines should not be more than 80 characters. This line is 415 characters. | 1 |
| Lines should not be more than 80 characters. This line is 426 characters. | 1 |
| Lines should not be more than 80 characters. This line is 429 characters. | 1 |
| Lines should not be more than 80 characters. This line is 450 characters. | 1 |
| Lines should not be more than 80 characters. This line is 482 characters. | 1 |
| Lines should not be more than 80 characters. This line is 526 characters. | 1 |
| Lines should not be more than 80 characters. This line is 597 characters. | 1 |
| Lines should not be more than 80 characters. This line is 81 characters. | 4 |
| Lines should not be more than 80 characters. This line is 82 characters. | 5 |
| Lines should not be more than 80 characters. This line is 83 characters. | 8 |
| Lines should not be more than 80 characters. This line is 84 characters. | 1 |
| Lines should not be more than 80 characters. This line is 85 characters. | 5 |
| Lines should not be more than 80 characters. This line is 86 characters. | 4 |
| Lines should not be more than 80 characters. This line is 87 characters. | 1 |
| Lines should not be more than 80 characters. This line is 88 characters. | 3 |
| Lines should not be more than 80 characters. This line is 92 characters. | 7 |
| Lines should not be more than 80 characters. This line is 93 characters. | 2 |
| Lines should not be more than 80 characters. This line is 94 characters. | 1 |
| Lines should not be more than 80 characters. This line is 95 characters. | 1 |
| Lines should not be more than 80 characters. This line is 96 characters. | 2 |
| Lines should not be more than 80 characters. This line is 97 characters. | 1 |
| unexpected end of input | 1 |
| unexpected symbol | 1 |
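
Two of the lintr messages above (prefer `vapply` over `sapply`, and prefer `seq_len` over `1:nrow(...)`) concern edge cases with empty inputs. The following is a minimal base R sketch of why they matter; the inputs are hypothetical and not taken from the rixpress sources.

```r
# Hypothetical inputs for illustration only.
empty <- list()

# sapply() chooses its return type from the data: an empty input gives list().
sapply_result <- sapply(empty, nchar)

# vapply() takes a template for each result, so the type is fixed up front:
# an empty input still gives an integer vector.
vapply_result <- vapply(empty, nchar, integer(1))

# With zero rows, 1:nrow(df) evaluates to c(1, 0) and a loop would run twice;
# seq_len() gives an empty sequence, so the loop body is skipped.
df <- data.frame(x = numeric(0))
colon_index <- 1:nrow(df)
safe_index <- seq_len(nrow(df))
```

Code that later assumes a character or integer vector can fail in confusing ways when `sapply()` silently returns a list, which is the reason the linter calls `vapply()` type safe.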
4. Other Checks
Details of other checks (click to open)
:heavy_multiplication_x: The following function name is duplicated in other packages:
- get_nodes_edges from malan
Package Versions
| package | version |
|---|---|
| pkgstats | 0.2.0.54 |
| pkgcheck | 0.1.2.126 |
Editor-in-Chief Instructions:
Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.
Thanks @b-rodrigues, can you please address the three failing checks:
- ✖️ The following functions have no documented return values: [export_nix_archive, import_nix_archive, print.derivation, rxp_init]
- ✖️ These functions do not have examples: [export_nix_archive, import_nix_archive, print.derivation, rxp_common_setup, rxp_file_common, rxp_inspect, rxp_list_logs, rxp_make, rxp_py_file, rxp_r_file].
- ✖️ Package coverage is 67% (should be at least 75%).
I also note that the function with a duplicated name is get_nodes_edges(), which is likely overly generic. I see you've prepended many functions with rxp_ - perhaps you could also do the same with that function? I also see you don't currently use our pkgcheck action. That might help to ensure everything is okay, or if you'd rather not, you can check locally, and then once you confirm all is ✔ , feel free to call @ropensci-review-bot check package. Thanks!
Ok, so I've implemented the changes, except for the unit test coverage. As explained, the package relies a lot on side effects, so increasing coverage to 75% will be quite difficult, especially because the functions that are not tested are those that would require build artifacts in the Nix store. Mocking that would be a pain in the bottom. As a compromise, I set up this repo: https://github.com/b-rodrigues/rixpress_demos with complete pipelines that test these functions.
Would this be ok?
> I also note that the function with a duplicated name is get_nodes_edges(), which is likely overly generic.
This function was being exported by mistake, I don't export it anymore, so the clash shouldn't cause any issue.
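On the coverage point above, one possible way to unit test code that shells out, without needing real build artifacts, is to pass the command runner in as an argument and substitute a fake in tests. This is only a sketch under stated assumptions: `build_pipeline()`, `default_runner()`, and `fake_runner()` are hypothetical names, not rixpress functions, and the `nix-build` call is an assumption about what such a helper might run.

```r
# Hypothetical default runner built on base R's system2().
# Assumption: the command (for example nix-build) is available on the PATH.
default_runner <- function(command, args) {
  out <- suppressWarnings(system2(command, args, stdout = TRUE, stderr = TRUE))
  status <- attr(out, "status")
  list(
    status = if (is.null(status)) 0L else status,
    stdout = out,
    stderr = paste(out, collapse = "\n")
  )
}

# Hypothetical helper, not part of rixpress: the side effect is isolated
# behind `runner`, so tests can replace it.
build_pipeline <- function(nix_file, runner = default_runner) {
  result <- runner("nix-build", c(nix_file))
  if (result$status != 0) {
    stop("Build failed: ", result$stderr)
  }
  invisible(result$stdout)
}

# In a test, the fake runner records the call and returns a canned result.
calls <- list()
fake_runner <- function(command, args) {
  calls[[length(calls) + 1]] <<- list(command = command, args = args)
  list(status = 0L, stdout = "/nix/store/fake-output", stderr = "")
}

out <- build_pipeline("pipeline.nix", runner = fake_runner)
```

This covers argument construction and error handling; whether the real build works still has to be checked by end-to-end runs such as the demo pipelines.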
@ropensci-review-bot check package
Thanks, about to send the query.
:rocket:
Editor check started
:wave:
Checks for rixpress (v0.2.0)
git hash: 8e396034
- :heavy_check_mark: Package name is available
- :heavy_check_mark: has a 'codemeta.json' file.
- :heavy_check_mark: has a 'contributing' file.
- :heavy_check_mark: uses 'roxygen2'.
- :heavy_check_mark: 'DESCRIPTION' has a URL field.
- :heavy_check_mark: 'DESCRIPTION' has a BugReports field.
- :heavy_check_mark: Package has at least one HTML vignette
- :heavy_check_mark: All functions have examples.
- :heavy_check_mark: Package has continuous integration checks.
- :heavy_multiplication_x: Package coverage is 67.5% (should be at least 75%).
- :heavy_check_mark: R CMD check found no errors.
- :heavy_check_mark: R CMD check found no warnings.
Important: All failing checks above must be addressed prior to proceeding
Package License: GPL (>= 3)
1. Package Dependencies
Details of Package Dependency Usage (click to open)
The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.
| type | package | ncalls |
|---|---|---|
| internal | base | 331 |
| internal | rixpress | 52 |
| internal | stats | 8 |
| internal | graphics | 6 |
| internal | utils | 1 |
| imports | jsonlite | 5 |
| imports | igraph | 4 |
| imports | processx | 3 |
| suggests | ggplot2 | 9 |
| suggests | ggdag | 3 |
| suggests | dplyr | NA |
| suggests | knitr | NA |
| suggests | mockery | NA |
| suggests | reticulate | NA |
| suggests | rix | NA |
| suggests | rmarkdown | NA |
| suggests | testthat | NA |
| suggests | usethis | NA |
| suggests | visNetwork | NA |
| linking_to | NA | NA |
Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(<path/to/repo>)', and examining the 'external_calls' table.
base
list (31), sprintf (20), paste0 (15), file.path (14), vapply (13), c (12), deparse1 (12), substitute (12), grep (11), sapply (11), lapply (10), character (9), for (9), gsub (9), list.files (8), paste (8), readLines (8), length (7), file (6), data.frame (5), match (5), regmatches (5), unlist (5), args (4), grepl (4), basename (3), Filter (3), format (3), gregexpr (3), pretty (3), seq_along (3), strsplit (3), sub (3), subset (3), unique (3), any (2), append (2), if (2), lengths (2), setdiff (2), stdout (2), system2 (2), tryCatch (2), which (2), as.character (1), cat (1), col (1), deparse (1), dirname (1), do.call (1), drop (1), file.info (1), getwd (1), I (1), identity (1), is.list (1), is.null (1), logical (1), names (1), Negate (1), nrow (1), numeric (1), readline (1), readRDS (1), Reduce (1), regexec (1), rep (1), return (1), round (1), source (1), stop (1), Sys.time (1), system.file (1), vector (1)
rixpress
cb (3), get_need_py (3), get_need_r (3), gen_flat_pipeline (2), gen_pipeline (2), generate_configurePhase (2), parse_nix_envs (2), parse_packages (2), parse_rpkgs_git (2), rxp_inspect (2), rxp_list_logs (2), rxp_read_load_setup (2), unnest_all_columns (2), add_import (1), adjust_import (1), adjust_py_packages (1), confirm (1), dag_for_ci (1), export_nix_archive (1), generate_dag (1), generate_libraries_from_nix (1), generate_libraries_script (1), generate_py_libraries_from_nix (1), generate_r_libraries_from_nix (1), generate_r_or_py_libraries_from_nix (1), get_nodes_edges (1), import_formatter_py (1), import_formatter_r (1), import_nix_archive (1), load_line (1), print.derivation (1), rixpress (1), rxp_common_setup (1), rxp_copy (1), rxp_file_common (1), rxp_ga (1)
ggplot2
aes (7), scale_fill_manual (1), scale_shape_manual (1)
stats
df (5), var (2), line (1)
graphics
lines (6)
jsonlite
write_json (3), fromJSON (1), read_json (1)
igraph
write_graph (2), graph_from_data_frame (1), V (1)
ggdag
geom_dag_node (2), as_tidy_dagitty (1)
processx
run (3)
utils
timestamp (1)
2. Statistical Properties
This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.
Details of statistical properties (click to open)
The package has:
- code in R (100% in 12 files) and
- 1 authors
- 7 vignettes
- no internal data file
- 3 imported packages
- 29 exported functions (median 26 lines of code)
- 70 non-exported functions in R (median 30 lines of code)
Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages The following terminology is used:
- loc = "Lines of Code"
- fn = "function"
- exp/not_exp = exported / not exported
All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the checks_to_markdown() function
The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.
| measure | value | percentile | noteworthy |
|---|---|---|---|
| files_R | 12 | 63.4 | |
| files_vignettes | 7 | 98.0 | |
| files_tests | 11 | 88.5 | |
| loc_R | 1948 | 81.6 | |
| loc_vignettes | 1257 | 93.2 | |
| loc_tests | 1119 | 85.4 | |
| num_vignettes | 7 | 98.4 | TRUE |
| n_fns_r | 99 | 74.8 | |
| n_fns_r_exported | 29 | 76.9 | |
| n_fns_r_not_exported | 70 | 74.4 | |
| n_fns_per_file_r | 5 | 68.3 | |
| num_params_per_fn | 3 | 29.3 | |
| loc_per_fn_r | 28 | 74.0 | |
| loc_per_fn_r_exp | 26 | 57.3 | |
| loc_per_fn_r_not_exp | 30 | 78.2 | |
| rel_whitespace_R | 15 | 77.2 | |
| rel_whitespace_vignettes | 25 | 91.4 | |
| rel_whitespace_tests | 16 | 80.6 | |
| doclines_per_fn_exp | 29 | 31.4 | |
| doclines_per_fn_not_exp | 0 | 0.0 | TRUE |
| fn_call_network_size | 37 | 58.8 | |
2a. Network visualisation
Click to see the interactive network visualisation of calls between objects in package
3. goodpractice and other checks
Details of goodpractice checks (click to open)
3a. Continuous Integration Badges
GitHub Workflow Results
| id | name | conclusion | sha | run_number | date |
|---|---|---|---|---|---|
| 14976041988 | acidophilic-americancreamdraft: linux, macos, macos-arm64, windows, ubuntu-next, ubuntu-release, gcc14 | success | 0c0291 | 375 | 2025-05-12 |
| 14976311675 | devtools-tests-via-r-nix | success | 8e3960 | 406 | 2025-05-12 |
| 14976376869 | pages build and deployment | success | ced3fb | 358 | 2025-05-12 |
| 14976311672 | pkgcheck | NA | 8e3960 | 8 | 2025-05-12 |
| 14976311670 | pkgdown.yaml | success | 8e3960 | 414 | 2025-05-12 |
| 14976311683 | run-rhub-checks | success | 8e3960 | 381 | 2025-05-12 |
| 14976363240 | serpentine-xoloitzcuintli: linux, macos, macos-arm64, windows, ubuntu-next, ubuntu-release, gcc14 | NA | 8e3960 | 378 | 2025-05-12 |
| 14976311680 | Test coverage | success | 8e3960 | 154 | 2025-05-12 |
| 14976108488 | timeconsuming-limpkin: linux, macos, macos-arm64, windows, ubuntu-next, ubuntu-release, gcc14 | success | 932bbf | 376 | 2025-05-12 |
| 14976197952 | transcendentalistic-lowchen: linux, macos, macos-arm64, windows, ubuntu-next, ubuntu-release, gcc14 | success | 932bbf | 377 | 2025-05-12 |
| 14976311678 | Trigger Demo Actions | success | 8e3960 | 247 | 2025-05-12 |
3b. goodpractice results
R CMD check with rcmdcheck
rcmdcheck found no errors, warnings, or notes
Test coverage with covr
Package coverage: 67.5
The following files are not completely covered by tests:
| file | coverage |
|---|---|
| R/generate_dag.R | 58.33% |
| R/plot_dag.R | 36.42% |
| R/rxp_copy.R | 27.78% |
| R/rxp_ga.R | 66.67% |
| R/rxp_make.R | 0% |
| R/rxp_read_load.R | 0% |
Cyclocomplexity with cyclocomp
The following functions have cyclocomplexity >= 15:
| function | cyclocomplexity |
|---|---|
| gen_pipeline | 33 |
| generate_dag | 25 |
Static code analyses with lintr
lintr found the following 134 potential issues:
| message | number of times |
|---|---|
| Avoid 1:nrow(...) expressions, use seq_len. | 1 |
| Avoid changing the working directory, or restore it in on.exit | 11 |
| Avoid library() and require() calls in packages | 20 |
| Avoid using sapply, consider vapply instead, that's type safe | 10 |
| Lines should not be more than 80 characters. This line is 101 characters. | 1 |
| Lines should not be more than 80 characters. This line is 102 characters. | 1 |
| Lines should not be more than 80 characters. This line is 104 characters. | 1 |
| Lines should not be more than 80 characters. This line is 105 characters. | 2 |
| Lines should not be more than 80 characters. This line is 106 characters. | 1 |
| Lines should not be more than 80 characters. This line is 107 characters. | 1 |
| Lines should not be more than 80 characters. This line is 109 characters. | 1 |
| Lines should not be more than 80 characters. This line is 113 characters. | 4 |
| Lines should not be more than 80 characters. This line is 117 characters. | 1 |
| Lines should not be more than 80 characters. This line is 125 characters. | 2 |
| Lines should not be more than 80 characters. This line is 138 characters. | 2 |
| Lines should not be more than 80 characters. This line is 159 characters. | 1 |
| Lines should not be more than 80 characters. This line is 169 characters. | 1 |
| Lines should not be more than 80 characters. This line is 171 characters. | 2 |
| Lines should not be more than 80 characters. This line is 173 characters. | 2 |
| Lines should not be more than 80 characters. This line is 174 characters. | 1 |
| Lines should not be more than 80 characters. This line is 193 characters. | 1 |
| Lines should not be more than 80 characters. This line is 197 characters. | 3 |
| Lines should not be more than 80 characters. This line is 203 characters. | 1 |
| Lines should not be more than 80 characters. This line is 205 characters. | 1 |
| Lines should not be more than 80 characters. This line is 281 characters. | 1 |
| Lines should not be more than 80 characters. This line is 310 characters. | 2 |
| Lines should not be more than 80 characters. This line is 357 characters. | 1 |
| Lines should not be more than 80 characters. This line is 362 characters. | 1 |
| Lines should not be more than 80 characters. This line is 373 characters. | 1 |
| Lines should not be more than 80 characters. This line is 380 characters. | 1 |
| Lines should not be more than 80 characters. This line is 399 characters. | 1 |
| Lines should not be more than 80 characters. This line is 415 characters. | 1 |
| Lines should not be more than 80 characters. This line is 426 characters. | 1 |
| Lines should not be more than 80 characters. This line is 429 characters. | 1 |
| Lines should not be more than 80 characters. This line is 450 characters. | 1 |
| Lines should not be more than 80 characters. This line is 482 characters. | 1 |
| Lines should not be more than 80 characters. This line is 526 characters. | 1 |
| Lines should not be more than 80 characters. This line is 597 characters. | 1 |
| Lines should not be more than 80 characters. This line is 81 characters. | 4 |
| Lines should not be more than 80 characters. This line is 82 characters. | 5 |
| Lines should not be more than 80 characters. This line is 83 characters. | 8 |
| Lines should not be more than 80 characters. This line is 84 characters. | 1 |
| Lines should not be more than 80 characters. This line is 85 characters. | 3 |
| Lines should not be more than 80 characters. This line is 86 characters. | 4 |
| Lines should not be more than 80 characters. This line is 87 characters. | 3 |
| Lines should not be more than 80 characters. This line is 88 characters. | 3 |
| Lines should not be more than 80 characters. This line is 92 characters. | 7 |
| Lines should not be more than 80 characters. This line is 93 characters. | 2 |
| Lines should not be more than 80 characters. This line is 94 characters. | 1 |
| Lines should not be more than 80 characters. This line is 95 characters. | 1 |
| Lines should not be more than 80 characters. This line is 96 characters. | 2 |
| Lines should not be more than 80 characters. This line is 97 characters. | 1 |
| unexpected end of input | 1 |
| unexpected symbol | 1 |
Package Versions
| package | version |
|---|---|
| pkgstats | 0.2.0.54 |
| pkgcheck | 0.1.2.126 |
Editor-in-Chief Instructions:
Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.
Preliminary Editor checks:
- [ ] Documentation: The package has sufficient documentation available online (README, pkgdown docs) to allow for an assessment of functionality and scope without installing the package. In particular,
- [x] Is the case for the package well made?
- [ ] Is the reference index page clear (grouped by topic if necessary)?
- [x] Are vignettes readable, sufficiently detailed and not just perfunctory?
- [x] Fit: The package meets criteria for fit and overlap.
- [x] Installation instructions: Are installation instructions clear enough for human users?
- [x] Tests: If the package has some interactivity / HTTP / plot production etc. are the tests using state-of-the-art tooling?
- [ ] Contributing information: Is the documentation for contribution clear enough e.g. tokens for tests, playgrounds?
- [x] License: The package has a CRAN or OSI accepted license.
- [x] Project management: Are the issue and PR trackers in a good shape, e.g. are there outstanding bugs, is it clear when feature requests are meant to be tackled?
Editor comments
Thanks for your submission @b-rodrigues, which looks like a very useful extension of {rix}. I expect we'll proceed soon, but note first a couple of very minor issues from the checks above:
- The package reference page has all functions together. Could you please structure the reference index by adding {roxygen2} `@family` tags, as described in this section of our Dev Guide?
- Your extended checks repository in https://github.com/b-rodrigues/rixpress_demos is a great solution to testing, and definitely satisfactory for us. In order to satisfy the second missing item in the checklist above, could you please:
  - Add a bit more detail to your current `CONTRIBUTING.md`, especially including description of how {rixpress-demo} is used in tests; and
  - Explicitly reference `CONTRIBUTING.md` somewhere in your readme, with brief instructions on how to contribute.
  - Not necessary now, but good to keep in mind: Issue templates would provide a great way to ensure all who wanted to contribute were aware of {rixpress-demo}, and understood the relationship between the two repos.
Let us know when those points have been addressed, and we'll proceed from there. Thanks :+1:
hi @mpadge thanks for your feedback! I've addressed your suggestions.
@b-rodrigues Sorry for slight delay here, we're still trying to find and assign an editor to handle this. Should be assigned soon.
no worries :)
@ropensci-review-bot assign @ldecicco-USGS as editor
Assigned! @ldecicco-USGS is now the editor
Editor checks:
- [x] Documentation: The package has sufficient documentation available online (README, pkgdown docs) to allow for an assessment of functionality and scope without installing the package. In particular,
- [x] Is the case for the package well made?
- [x] Is the reference index page clear (grouped by topic if necessary)?
- [x] Are vignettes readable, sufficiently detailed and not just perfunctory?
- [x] Fit: The package meets criteria for fit and overlap.
- [x] Installation instructions: Are installation instructions clear enough for human users?
- [x] Tests: If the package has some interactivity / HTTP / plot production etc. are the tests using state-of-the-art tooling?
- [x] Contributing information: Is the documentation for contribution clear enough e.g. tokens for tests, playgrounds?
- [x] License: The package has a CRAN or OSI accepted license.
- [x] Project management: Are the issue and PR trackers in a good shape, e.g. are there outstanding bugs, is it clear when feature requests are meant to be tackled?
Editor comments
Looks great as usual.
@ropensci-review-bot seeking reviewers
Please add this badge to the README of your package repository:
[![Status at rOpenSci Software Peer Review](https://badges.ropensci.org/706_status.svg)](https://github.com/ropensci/software-review/issues/706)
Furthermore, if your package does not have a NEWS.md file yet, please create one to capture the changes made during the review process. See https://devguide.ropensci.org/releasing.html#news
@ropensci-review-bot assign @wlandau as reviewer
@wlandau added to the reviewers list. Review due date is 2025-07-28. Thanks @wlandau for accepting to review! Please refer to our reviewer guide.
rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.
@wlandau: If you haven't done so, please fill this form for us to update our reviewers records.
@ropensci-review-bot assign @amart90 as reviewer
@amart90 added to the reviewers list. Review due date is 2025-07-28. Thanks @amart90 for accepting to review! Please refer to our reviewer guide.
rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.
@amart90: If you haven't done so, please fill this form for us to update our reviewers records.
Package Review
- Briefly describe any working relationship you have (had) with the package authors.
Bruno and I follow each other's work as members of the R community. We have not yet worked together directly on a project.
- [x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (if you are unsure whether you are in conflict, please speak to your editor before starting your review).
As the author of targets, I took a careful look at the COI guidelines:
The potential editor or reviewer has a conflict of interest if:...The potential reviewer/editor has significantly contributed to a competitor project.
There is obvious overlap, but I would not say rixpress is a competitor. rixpress has a niche outside the scope of targets:
- `nix-store` as the engine to run pipelines and store data.
- Polyglot pipelines where Python and Julia are first-class citizens alongside R.
- Multi-environment pipelines.
I checked with @ldecicco-USGS, who agreed.
Documentation
The package includes all the following forms of documentation:
- [x] A statement of need: clearly stating problems the software is designed to solve and its target audience in README
- [x] Installation instructions: for the development version of package and any non-standard dependencies in README
- [x] Vignette(s): demonstrating major functionality that runs successfully locally
- [x] Function Documentation: for all exported functions
- [x] Examples: (that run successfully locally) for all exported functions
- [x] Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with `URL`, `BugReports` and `Maintainer` (which may be autogenerated via `Authors@R`).
Functionality
- [x] Installation: Installation succeeds as documented.
- [x] Functionality: Any functional claims of the software have been confirmed.
- [x] Performance: Any performance claims of the software have been confirmed.
- [x] Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
- [x] Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.
Estimated hours spent reviewing: 4
- [x] Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.
Review Comments
Overview
rixpress is an excellent prospective addition to rOpenSci. It fills a valuable niche in reproducible computation, the engineering is fantastic, and the documentation is comprehensive. Because the quality is already so high, I did not need to spend much time checking package development minutiae. I spent most of my review time on high-level issues and my own experience as a new user.
Scope of this review
I reviewed:
- The documentation at https://b-rodrigues.github.io/rixpress/.
- Examples `basic_r`, `r_multi_envs`, and `r_py_xgboost` from https://github.com/b-rodrigues/rixpress_demos.
- The `rixpress` source code and test suite.
Scope of rixpress
Arguably the most essential but most difficult part of developing any tool is establishing a clear and crisp set of requirements. Explicit pre-specified boundaries help prevent scope creep and ensure a package's priorities succeed long-term. For pipeline tools, scope is even more essential and even more challenging than usual, both because of the many different opinions about what a pipeline tool should do, and because of the huge variety of pipelines users routinely create.
I developed targets as a highly opinionated tool with R-focused research-oriented scenarios in mind. This vision was somewhat implicit, and I did not have enough experience then to completely spell it out. I regularly hear from people who use it for cases I did not consider: simple ETL operations on big data, database query workflows, daily pipelines where historical runs matter, etc. Some users even approach targets as an Airflow-like tool rather than a Make-like one, and they are looking for a feature set closer to what maestro provides.
You might have the same experience with rixpress. For example, users who switch from targets to rixpress may ask you to support branching, alternative DSLs, interactive debugging, fancy progress monitoring, alternative storage options, computing on clusters, alternative DAG visualizations, etc.
For rixpress, I would like to understand what the package may cover in the future, and what it definitely will not support. I think a dedicated section on scope in the documentation (possibly linked from the issue templates) will help set expectations for users who request features, and it will help you maintain rixpress for years to come.
At this early stage, the main areas of focus seem to be as follows (please correct me if I am wrong):
1. Bringing pipeline functionality from `nix-store` to R.
2. Interactive read-only inspection: visualization, reading from the data store, and historical runs.
3. R/Python/Julia interoperability.
4. Portability: through Nix itself, and continuous integration.
(1) seems like a promising direction because it invests in the qualities that make rixpress most unique.
An aside: if you intend to expand on (1), e.g. alternative store types, you might also consider writing a low-level Nix client to facilitate the implementation, kind of like gert for Git, cmdstanr for CmdStan, or paws for AWS. This might even help you maintain rix.
Visualization
DAG visualizations greatly improve the user experience, but they are also a Pandora's Box of scope creep. rixpress already supports 3 backends for graphs: visNetwork, ggdag, and GraphViz (DOT; for CI). And each one is a magnet for feature requests.
To simplify the visualization feature set, what about using mermaid.js instead of ggdag or GraphViz? Mermaid graphs are just text, and they are very easy to generate without any additional R packages. For CI, you could use https://github.com/AlexanderGrooff/mermaid-ascii, which I think would produce graphs that are more readable and visually appealing than GraphViz can render (e.g. https://github.com/b-rodrigues/rixpress_demos/actions/runs/16252270236/job/45883546684#step:9:11).
visNetwork might only be necessary if you expect enormous pipelines whose graphs can only be explored interactively. If you do decide to keep rixpress::rxp_visnetwork(), I suggest keeping the feature set simple and tightly scoped. (Maybe it would also be a good idea to disable physics to improve rendering performance for large graphs.) visNetwork is great at zooming in and out of graphs of pretty much any size, but from experience developing targets::tar_visnetwork(), I have found it does not excel at creating nice-looking polished graphs. I think mermaid.js is much better at feature-rich pretty graphs.
Workflow functions
From the examples, I see two patterns for setting up and running pipelines. In simple cases:
list(
rxp_r(...)
) |>
rixpress()
but for Python projects such as r_py_xgboost:
list(
rxp_py(...)
) |>
rixpress(build = FALSE)
adjust_import(...)
rxp_make()
I think it would be clearer and more consistent to create a separate function (maybe rxp_populate()) which runs the equivalent of rixpress(build = FALSE). Then, if rixpress() itself is still needed, it could serve as the equivalent of rxp_populate() + rxp_make().
add_import() and adjust_import() feel a bit awkward as separate steps. You might instead consider an interface like:
list(
rxp_py(...)
) |>
rxp_populate(
derivations,
py_imports = c(
numpy = "from numpy import array, loadtxt",
xgboost = "from xgboost import XGBClassifier"
)
)
Names for functions and classes
I have minor suggestions to make the names of functions more internally consistent:
- `export_nix_archive()` => `rxp_export_nix_archive()`
- `import_nix_archive()` => `rxp_import_nix_archive()`
- `generate_dag()` => `rxp_generate_dag()` or `rxp_write_dag()` or `rxp_save_dag()`
(You may not need dag_for_ci(), add_import(), or adjust_import() if you agree with my suggestions from earlier.)
In addition, functions like rxp_r() produce an object of class "derivation". I suggest renaming it to something like "rxp_derivation" so it does not conflict with e.g. mathematical packages with their own kinds of "derivations".
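To make the naming concern concrete, here is a minimal base-R sketch. The constructor `new_rxp_derivation()` and its fields are hypothetical and do not describe rixpress internals; the only point is that a package-prefixed S3 class keeps method dispatch separate from any other package that defines a "derivation" class.

```r
# Hypothetical constructor: not part of rixpress, for illustration only.
new_rxp_derivation <- function(name, snippet) {
  structure(
    list(name = name, snippet = snippet),
    class = "rxp_derivation" # package-prefixed class name
  )
}

# Methods are registered on the prefixed class, so a print.derivation()
# method from another package would never be dispatched for this object.
print.rxp_derivation <- function(x, ...) {
  cat("<rxp_derivation>", x$name, "\n")
  invisible(x)
}

d <- new_rxp_derivation("mtcars_head", "head(mtcars)")
print(d)
inherits(d, "rxp_derivation")
```

If existing user code checks `inherits(x, "derivation")`, a deprecation period with both class names would be one way to soften the change.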
Installation experience
I am new to Nix, rix, and rixpress, and I began by installing the toolchain from scratch on an M2 MacBook Pro with macOS 15.5. This is my work computer, so it has more security restrictions than a regular personal computer.
I followed the rix setup guide for macOS, which was clear and comprehensive. The curl command successfully downloaded the Determinate Systems installer, but the installer itself failed. First I realized I needed to run it with sudo, but even that failed. Nix installed successfully when I navigated a browser to https://docs.determinate.systems/, manually downloaded the installer, and double-clicked it to run it. Maybe consider updating the vignette to mention that the point-and-click route is possible?
Afterwards, I installed cachix and ran cachix use rstats-on-nix. library(rix) initially showed these warning messages, but rix::setup_cachix() silenced them. The next library(rix) gave me a warning about an incomplete final line in ~/.config/nix/nix.conf, which I solved by manually opening the text file and adding a line break. I suggest ensuring rix::setup_cachix() leaves a terminating newline character in ~/.config/nix/nix.conf.
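The terminating-newline suggestion could be implemented along these lines. This is a minimal base-R sketch under the assumption that the file is simply missing its final `"\n"`; the helper name `ensure_final_newline()` is hypothetical, how `rix::setup_cachix()` writes the file is not shown here, and the file content is a placeholder.

```r
# Hypothetical helper: append a final newline only when the file lacks one.
ensure_final_newline <- function(path) {
  size <- file.size(path)
  if (is.na(size) || size == 0) {
    return(invisible(FALSE)) # missing or empty file: nothing to fix
  }
  bytes <- readBin(path, what = "raw", n = size)
  if (bytes[length(bytes)] == as.raw(0x0a)) {
    return(invisible(FALSE)) # already ends with "\n"
  }
  con <- file(path, open = "ab") # append in binary mode
  on.exit(close(con))
  writeBin(as.raw(0x0a), con)
  invisible(TRUE)
}

# Demonstrate on a temporary file rather than the real ~/.config/nix/nix.conf.
conf <- tempfile(fileext = ".conf")
writeBin(charToRaw("experimental-features = nix-command flakes"), conf)
ensure_final_newline(conf)
readLines(conf) # reads without an "incomplete final line" warning
unlink(conf)
```

Calling the helper a second time on the same file is a no-op, so it would be safe to run after every write.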
Storage
I really like the build logs feature you describe in https://b-rodrigues.github.io/rixpress/articles/g-logs.html. Over multiple pipeline runs, however, storage may accumulate, especially because Nix uses content-addressable storage (by hash). It may help to describe in that vignette how users can leverage the garbage collection features of Nix to clear out the data that is no longer needed.
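As a possible starting point for such a section, the sketch below calls Nix's garbage collector from R. `nix-store --gc` and its `--print-dead` option are existing Nix commands; the helper `nix_gc_args()` is hypothetical and is not part of rixpress or rix. Only store paths that are no longer reachable from a garbage-collection root are affected.

```r
# Hypothetical helper: build arguments for `nix-store --gc`.
# With `--print-dead`, Nix only lists the store paths that would be deleted.
nix_gc_args <- function(dry_run = TRUE) {
  c("--gc", if (dry_run) "--print-dead")
}

nix_gc_args()                # dry run: list dead paths
nix_gc_args(dry_run = FALSE) # arguments that would actually delete them

# Run the dry run only when Nix is installed on this machine.
if (nzchar(Sys.which("nix-store"))) {
  dead <- system2("nix-store", nix_gc_args(), stdout = TRUE)
  message(length(dead), " store paths are eligible for deletion")
}
```

Defaulting to a dry run is a deliberate choice in this sketch, so that nothing is deleted until the user has seen what would go.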
Multi-line expressions
The following pipeline succeeds:
library(rixpress)
list(
rxp_r(
name = derivation,
expr = 1 + 1
)
) |>
rixpress()
But a similar one fails:
library(rixpress)
list(
rxp_r(
name = derivation,
expr = {
message("Running derivation")
1 + 1
}
)
) |>
rixpress()
with the error message:
Error: unexpected numeric constant in " derivation <- { message('Running derivation') 1
The same happens if I add a semicolon after the message() statement. I suspect that rxp_r() etc. do not support multi-line expressions. I would suggest either adding this support or requiring expressions to be pure function calls.
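One plausible cause, offered as an assumption rather than a reading of the rixpress source, is that the captured expression is deparsed and its lines are joined with spaces before being written into the generated script. The base-R sketch below reproduces the same kind of parse error that way, and shows that joining with newlines keeps the braced block valid.

```r
# Capture a braced, multi-line expression without evaluating it.
expr <- quote({
  message("Running derivation")
  1 + 1
})

lines <- deparse(expr) # one string per line of code

# Joining with spaces drops the statement separators inside the braces.
flat <- paste(lines, collapse = " ")
flat_ok <- tryCatch(
  {
    parse(text = flat)
    TRUE
  },
  error = function(e) FALSE
)
flat_ok # FALSE: the parser sees `message(...) 1 + 1` as one statement

# Joining with newlines preserves the separators, so the code parses and runs.
multi <- paste(lines, collapse = "\n")
result <- eval(parse(text = multi))
result
```

If this is indeed the cause, a semicolon in the user's code would not help, because `deparse()` emits one statement per line and does not keep semicolons.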
Testing
I recommend including skip_if_not_installed() statements in tests where Suggests: packages are used (such as mockery and reticulate). In addition, when I ran the tests locally, one test threw a warning:
Warning (test-generate_libraries_from_nix.R:42:3): generate_py_libraries_from_nix: generate Py script by parsing default.nix
Python packages have been requested, but 'reticulate' is not in your list of R packages. If you want to handle Python objects from your R session, consider adding 'reticulate' to the list of R packages.
Backtrace
▆
1. └─rix::rix(...) at test-generate_libraries_from_nix.R:42:3
Adding r_pkgs = reticulate to rix::rix() in https://github.com/b-rodrigues/rixpress/blob/ea052f4a024bb47705bf186541380d5febac279e/tests/testthat/test-generate_libraries_from_nix.R#L42 should remove it.
Test coverage from covr is lower than I normally see in packages, but I really like your approach to offload to https://github.com/b-rodrigues/rixpress_demos. If the number of projects in that repo grows unmanageable at some point, you might consider creating a new GitHub org for them like https://github.com/nf-core does for Nextflow.
Checks
When I ran devtools::check() locally, I saw the note:
✔ checking for non-standard things in the check directory
N checking for detritus in the temp directory
Found the following files/directories:
‘RtmptcpOyn_repo_hash_url_jnlhe’
I have been flagged for this before when trying to submit packages to CRAN.
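This note usually means that a test or example created something under `tempdir()` and did not remove it. A minimal base-R sketch of the cleanup pattern follows; the helper `with_scratch_dir()` is hypothetical, and where the leftover directory actually originates (rixpress, rix, or a test) is not established here.

```r
# Hypothetical helper: run `code` with a scratch directory that is always removed.
with_scratch_dir <- function(code) {
  dir <- tempfile(pattern = "scratch_")
  dir.create(dir)
  # Remove the directory on exit, even if `code` throws an error.
  on.exit(unlink(dir, recursive = TRUE, force = TRUE), add = TRUE)
  code(dir)
}

leftover <- with_scratch_dir(function(dir) {
  writeLines("x", file.path(dir, "default.nix"))
  dir # return the path so it can be checked afterwards
})
dir.exists(leftover) # FALSE: nothing is left behind in tempdir()
```

Inside testthat tests, `withr::local_tempdir()` gives the same guarantee with less code.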
Lints
On my local machine, devtools::lint() shows many lints, including:
data-raw/gen_pipeline.R:6:3: style: [quotes_linter] Only use double-quotes.
'mtcars.csv',
^~~~~~~~~~~~
data-raw/gen_pipeline.R:43:2: style: [commented_code_linter] Remove commented code.
#rxp_make()
^~~~~~~~~~
data-raw/jl_example/functions.R:1:34: style: [brace_linter] There should be a space before an opening curly brace.
prepare_data <- function(laplace){
^
I have never used the Air formatter, and I do see you have https://github.com/b-rodrigues/rixpress/blob/main/.github/workflows/style-with-air.yaml, so please disregard if there is an inherent conflict between Air and lintr.
devtools::spell_check() has many findings, including:
> devtools::spell_check()
DESCRIPTION does not contain 'Language' field. Defaulting to 'en-US'.
WORD FOUND IN
’s README.md:18
al a-intro-concepts.Rmd:124
Analysing d-polyglot.Rmd:34
autoplay b-core-functions.Rmd:272
buildInputs make_derivation_snippet.Rd:20
cachix d-polyglot.Rmd:46
Cachix d-polyglot.Rmd:62
cancelled rxp_init.Rd:17
ci dag_for_ci.Rd:36,40
generate_dag.Rd:29,33
rxp_ga.Rd:29,33
cmdstanr f-cmdstanr.Rmd:2
configurePhase make_derivation_snippet.Rd:20
cryptographic a-intro-concepts.Rmd:164,171,214,230
CTRL c-tutorial.Rmd:105
d-polyglot.Rmd:107
d2-polyglot-julia.Rmd:82
deriv rixpress.Rd:18
...
You can exclude specific false positives in inst/WORDLIST.
On my machine, urlchecker::url_check() shows:
✖ Error: vignettes/a-intro-concepts.Rmd:32:21
403: Forbidden
spectrum/continuum](https://www.researchgate.net/figure/Reproducibility-spectrum-as-Peng-2011-stated_fig1_354765302),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This could be because of the extra network security in my work environment, or it could be because ResearchGate has checks for bots. In the latter case, CRAN might flag the URL.
Miscellaneous suggestions
- In `rxp_copy()`, I see `Sys.chmod(all_files, mode = "777")`, which could be risky on shared file systems. Is there a more restricted permission set that would still work?
- For `rxp_r_file()`, the implicit `roxygen2` `@title` tag is `"rxp_r_file"`. I suggest a more descriptive name. Same for `rxp_py_file()`.
- Please consider changing the name of the default branch from `"master"` to `"main"` in https://github.com/b-rodrigues/rixpress_demos.
- `nix-store --realize` has many options for `--verbose`. I suggest making the `verbose` argument of `rxp_make()` an integer to support this existing functionality. There are many more features you could consider for helping users monitor pipelines, some of which are more feasible than others, and this one seems like the lowest-hanging fruit.
- In `rixpress_demos/r_multi_envs`, I recommend a more formal/safe choice for the meme image.
- Instead of prefixes to control the order vignettes are listed, you could consider relying on pkgdown yaml for this, e.g. https://github.com/wlandau/crew/blob/728c45536d58faf1794e2a16c469fdce4a815176/_pkgdown.yml#L6-L19.
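On the `Sys.chmod()` point in the first bullet above, one narrower option would be to grant write access to the owner only. The sketch below assumes that what `rxp_copy()` needs is writable copies of read-only store files, which is a guess; the helper `relax_permissions()` is hypothetical, the modes `644` and `755` are illustrative choices, and `Sys.chmod()` has limited effect on Windows.

```r
# Hypothetical helper: make copied paths owner-writable without using 777.
relax_permissions <- function(paths) {
  is_dir <- dir.exists(paths)
  # Directories need the execute bit to be traversable.
  if (any(is_dir)) {
    Sys.chmod(paths[is_dir], mode = "755", use_umask = FALSE)
  }
  # Plain files become read/write for the owner, read-only for others.
  # Note: this drops the execute bit from any executable files.
  if (any(!is_dir)) {
    Sys.chmod(paths[!is_dir], mode = "644", use_umask = FALSE)
  }
  invisible(paths)
}

# Demonstrate on a read-only temporary file, similar to a file copied
# out of the Nix store.
f <- tempfile()
writeLines("data", f)
Sys.chmod(f, mode = "444", use_umask = FALSE)
relax_permissions(f)
format(file.mode(f))
unlink(f)
```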
Many thanks @wlandau for your review!
I'm currently on holidays without access to a computer so I'll only be able to address your comments in two weeks' time. Just wanted to let you know 😁
Package Review
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
- Briefly describe any working relationship you have (had) with the package authors.
- While I have followed the author's work and read his book, I have not worked with the package author.
- [x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (if you are unsure whether you are in conflict, please speak to your editor before starting your review).
Documentation
The package includes all the following forms of documentation:
- [x] A statement of need: clearly stating problems the software is designed to solve and its target audience in README
- [x] Installation instructions: for the development version of package and any non-standard dependencies in README
  - There are instructions for installing `rixpress` in the README. This package is somewhat unique in that, while its installation is straightforward, to use it as intended a fairly involved installation process must be completed. There is a link to these instructions, which are included as a part of the `rix` package.
- [x] Vignette(s): demonstrating major functionality that runs successfully locally
- [x] Function Documentation: for all exported functions
- [x] Examples: (that run successfully locally) for all exported functions
- [x] Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with `URL`, `BugReports` and `Maintainer` (which may be autogenerated via `Authors@R`).
Functionality
- [x] Installation: Installation succeeds as documented.
  - While I was able to successfully install `rixpress` on my native OS (Windows) and within the Nix shell, I was unsuccessful in getting some aspects of the `rix` installation completed, including IDE integration. Because of the security requirements of my work computer, I think this is not a `rix` problem but my own IT issue. I wanted to include that information to provide context for what I reviewed; however, I think it is out of the scope of the review of `rixpress`.
- [x] Functionality: Any functional claims of the software have been confirmed.
- [x] Performance: Any performance claims of the software have been confirmed.
- [x] Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
- [x] Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.
Estimated hours spent reviewing: 14
- [x] Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.
Review Comments
I, similar to Will, am familiar with the quality of your work. I did a relatively quick look through the source code but I focused my effort on the usability, particularly as a user that is new to rixpress, rix, and Nix. I will try to ensure I don't have comments that overlap with Will's.
I had some difficulty getting rix installed through WSL on my machine and I was never able to successfully use an IDE (either through a native installation or a Nix-managed installation). These issues are likely due, at least partially, to the enhanced security requirements of my work computer. I won't go into any further details about this here, however, because they are related to Nix, WSL, and other things outside of the scope of this review.
General impressions
While I am a regular user of reproducible pipelines, my experience comes as a user of targets. rixpress is different, not just in the implementation, but in the paradigm as well. While targets offers some gains in reproducibility (primarily in the isolation of runtime environments), rixpress offers several additional layers of reproducibility through the tracking of not just the input files and code, but the full environment: R and R packages, system-level software dependencies, environment variables, etc. Additionally, it supports a truly polyglot pipeline that isn't reliant on reticulate, which can be brittle, to execute Python code. This is sure to be an important contribution to the R community.
This package is conceptualized and implemented thoughtfully. As usual, your documentation is thorough and well written, which allows for the success of this package despite the complex concepts and details required to implement this. You have done a good job tucking away the heavy technical details from users who want this to "just work," while providing the details in the many vignettes for advanced users who want or need this information.
However, this does create a lot of opportunities for users to get lost when considering "multiple points of entry." There are many cases where information that would be helpful for the user to know is described in a vignette, but not in the function documentation. While I understand that a balance must be struck between completeness of documentation and brevity, there are opportunities for improving the completeness of the documentation, even if it is at the cost of repeating information. I include some examples in the Documentation section below, but a thoughtful review of all function documentation would make it much easier for a new user to begin using rixpress.
Packaging
- I built rixpress locally and did not receive any errors, warnings or notes from R CMD check.
- All expected components of the package are present.
Code and testing
- I appreciate the automatic code formatting with Air. It makes scanning the code predictable and easy.
- I executed all the tests I could on my native system (without having to execute them in a Nix environment) and they all passed.
- While test coverage is reported as being low (`covr::package_coverage()` reports 46.64% coverage overall), I found testing to be reasonably comprehensive. I don't know all the specifics of how `covr` works, but there must be some disconnect between the way your tests are written and the way it assesses coverage. For example, it evaluated `R/generate_dag.R` as having 0.0% test coverage. However, I found `tests/testthat/test-generate_dag.R` to reasonably cover `generate_dag()`. One can always add more tests (and even superficial tests to inflate test coverage), but the current coverage seems appropriate to me despite low coverage reports.
- `testthat::skip_on_cran()` is used on the test `"rxp_init creates expected files"` and I am not sure that this needs to be skipped. There are other tests (e.g., `test-rxp_copy.R`) where files are created but the test can be completed on CRAN. Evaluate if this is necessary.
User interface
- I had a similar suggestion to Will about the `rixpress()` function. It doesn't follow the pattern used throughout most of the function names, which have an `rxp_` prefix. At the very least, I think it could benefit from a more intuitive name that begins with `rxp_` and is followed by a verb (as discussed in Tidy design principles). Even better might be removing `rixpress` altogether and having one function that builds the pipeline plan and another that executes the pipeline. This is largely what Will said, but I include it because I think it is an important improvement to the interface.
- I found error messages to be helpful and adequately descriptive in my testing.
Documentation
- There is a missing word in the description field in the documentation for `rixpress()`. I think the word built needs to be added to the sentence "By default, the pipeline is also immediately built after being generated...".
- The documentation for `rxp_r/py_file()` includes three methods to read data in. In the Examples, you demonstrate methods 1 and 3. It might also be worthwhile to include method 2. While there has to be a balance between completeness and brevity in the Examples, I think it is a worthwhile addition.
- In `vignettes/b-core-functions.Rmd`, I think the third paragraph would make more sense if it started with "`read_function` requires an R function with a single argument..." since `rxp_r_file()` has three required arguments, only one of which requires an R function.
- At the end of the Generating the pipeline section of `b-core-functions.Rmd`, you provide a bulleted list of actions that `rixpress()` performs. I found this quite helpful and I think the function documentation for `rixpress()` would benefit from these bullets (or something similarly concise and explicit).
- In Vignette C: Tutorial, in regards to `rxp_inspect()`, you mention "... and an object you didn’t define called all-derivations. This last object is mostly for internal rixpress use, and you can safely ignore it." I found this helpful context that I wish I had seen earlier in my use of `rxp_inspect()`. Consider adding a note about `all-derivations` in the function documentation for `rxp_inspect()`.
- In Vignette D: Polyglot pipelines you give an example where serialize and unserialize functions are passed as characters rather than expressions for both `rxp_py()` and `rxp_r()`. I think Vignette F: cmdstanr suggests that custom functions defined in `functions.R` should be passed as characters, but in either case, it is not clear to me when a character should be passed rather than a function for `rxp_r()`. It would be helpful to update the documentation for `rxp_r()` to describe all the acceptable input types and when they should be used.
- In Vignette D: Polyglot pipelines, you mention "In the future, other languages could be added to rixpress, notably Julia." However, it appears that Julia is already supported. Consider clarifying this.
- Vignette G: cached artifacts is helpful. Consider linking to it in the function documentation for
Functionality
- I am curious about the case where there are multiple different inputs to an `rxp_r/py/jl()` call that need different unserialization functions. For example, if I have `rxp_r(out_df, custom_fn(model = keras_model, data = data_frame))` where `keras_model` should be unserialized with `keras::load_model_hdf5()` and `data_frame` should be unserialized with `readRDS`. Can the `unserialize_function` take multiple functions? If so, it might be helpful to make that explicit in the documentation. If not, is there a plan to support derivations that have multiple inputs that require different unserialization functions? I don't want to contrive a bunch of edge cases, but to me this seems like it would be somewhat common (at least in my workflows).
- Similarly, when I tried to use a custom function defined in "functions.R" (with `additional_files = "functions.R"` specified) as a named function in `serialization_function`, the custom function was not found. It might be nice to be able to write custom (un)serialization functions. If that is not simple to implement, it is probably worth making it explicit in the documentation that this must be a namespace function or anonymous function.
- It seems likely to me that after many repeated builds of the same pipeline, especially one that requires large datasets or many intermediate derivations, storage space could become an issue with all previous artifacts stored (as described in Vignette G: cached artifacts). Currently, the guidance for clearing the store is to call `nix-store --gc` in the terminal. It might be nice to provide an R wrapper for this. And better yet, provide a bit more control besides just clearing all build artifacts; for example, giving the user the option to delete stored artifacts from before a given date. This is a soft suggestion.
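For the first point above, one possible shape for per-input unserialization is a named list that maps each upstream object to its reader. This is a base-R sketch of the idea only; `load_inputs()` and its `unserializers` argument are hypothetical and do not describe how `unserialize_function` currently behaves in rixpress.

```r
# Hypothetical helper: read each input with its own function,
# falling back to readRDS() for inputs that are not listed.
load_inputs <- function(paths, unserializers = list(), default = readRDS) {
  out <- lapply(names(paths), function(nm) {
    reader <- unserializers[[nm]]
    if (is.null(reader)) reader <- default
    reader(paths[[nm]])
  })
  stats::setNames(out, names(paths))
}

# Two upstream artifacts stored in different formats.
rds_path <- tempfile(fileext = ".rds")
txt_path <- tempfile(fileext = ".txt")
saveRDS(head(mtcars), rds_path)
writeLines(c("alpha", "beta"), txt_path)

inputs <- load_inputs(
  paths = c(data_frame = rds_path, notes = txt_path),
  unserializers = list(notes = readLines)
)
str(inputs, max.level = 1)
```

In the `keras_model` example, the mapping would contain an entry such as `keras_model = keras::load_model_hdf5`, while `data_frame` would fall through to the `readRDS` default.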
:calendar: @wlandau you have 2 days left before the due date for your review (2025-07-28).
:calendar: @amart90 you have 2 days left before the due date for your review (2025-07-28).
@ropensci-review-bot submit review https://github.com/ropensci/software-review/issues/706#issuecomment-3109964972 time 6