software-review rixpress: Reproducible Analytical Pipelines with Nix

Submitting Author Name: Bruno Rodrigues
Submitting Author Github Handle: @b-rodrigues
Repository: https://github.com/b-rodrigues/rixpress
Version submitted: 0.2.0
Submission type: Standard
Editor: @ldecicco-USGS
Reviewers: TBD

Archive: TBD
Version accepted: TBD
Language: en

Paste the full DESCRIPTION file inside a code block below:

Package: rixpress
Title: Build Reproducible Analytical Pipelines With Nix
Version: 0.2.0
Authors@R:
    person("Bruno", "Rodrigues", , "[email protected]", role = c("aut", "cre"))
Description: Streamlines the creation of reproducible analytical pipelines using
  `default.nix` expressions generated via `{rix}` for reproducibility. Define
  derivations in R or Python, chain them into a composition of pure functions
  and build the resulting pipeline using `Nix` as the underlying end-to-end build
  tool. Functions to plot a DAG representation of the pipeline are included,
  as well as functions to load and inspect intermediary results for interactive
  analysis. User experience heavily inspired by the `{targets}` package.
License: GPL (>= 3)
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
URL: https://github.com/b-rodrigues/rixpress/, https://b-rodrigues.github.io/rixpress/
BugReports: https://github.com/b-rodrigues/rixpress/issues
Depends:
    R (>= 4.1.0)
Imports:
    igraph,
    jsonlite,
    processx
RoxygenNote: 7.3.2
Suggests:
    dplyr,
    ggdag,
    ggplot2,
    knitr,
    mockery,
    reticulate,
    rix,
    rmarkdown,
    testthat (>= 3.0.0),
    usethis,
    visNetwork
Config/testthat/edition: 3
VignetteBuilder: knitr

Scope

Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):
- [ ] data retrieval
- [ ] data extraction
- [ ] data munging
- [ ] data deposition
- [ ] data validation and testing
- [x] workflow automation
- [ ] version control
- [ ] citation management and bibliometrics
- [ ] scientific software wrappers
- [ ] field and lab reproducibility tools
- [ ] database software bindings
- [ ] geospatial data
Explain how and why the package falls under these categories (briefly, 1-2 sentences):

This package helps users set up reproducible pipelines by generating Nix expressions and building them with Nix.

Who is the target audience and what are scientific applications of this package?

The target audience is anyone wanting to switch from "script-based workflows" to build automation. rixpress generates valid Nix expressions from simple R functions to define reproducible pipelines, and is heavily inspired by {targets}. The main difference between {targets} and this package is that the "heavy lifting" is performed by Nix, and that it works very closely with my previous package, {rix}, which allows data scientists to set up reproducible environments using Nix. Also, because the underlying engine is Nix, it is language-agnostic, so it is possible to define steps that use Python. These Python steps are not executed with {reticulate}, but instead run in a dedicated Python environment. Data transfer between Python and R is facilitated by {reticulate}, though.

Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category?

The main inspiration for this package is {targets}, and in combination with {rix}, one could use {targets} to set up a pipeline in a reproducible environment as well.

(If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research?
If you made a pre-submission inquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

Link to presubmission: https://github.com/ropensci/software-review/issues/699

@maurolepore

Explain reasons for any pkgcheck items which your package is unable to pass.

Because this package relies heavily on side effects, unit tests are quite cumbersome to write, so I set up this other repository: https://github.com/b-rodrigues/rixpress_demos, which contains many example pipelines that run on each push to {rixpress}'s repository. Thanks to LLMs, I was able to improve test coverage to 67% (see https://github.com/b-rodrigues/rixpress/actions/runs/14971090564).

Technical checks

Confirm each of the following by checking the box.

[x] I have read the rOpenSci packaging guide.
[x] I have read the author guide and I expect to maintain this package for at least 2 years or to find a replacement.

This package:

[x] does not violate the Terms of Service of any service it interacts with.
[ ] has a CRAN and OSI accepted license.
[x] contains a README with instructions for installing the development version.
[x] includes documentation with examples for all functions, created with roxygen2.
[x] contains a vignette with examples of its essential functions and uses.
[x] has a test suite.
[x] has continuous integration, including reporting of test coverage.

Publication options

[x] Do you intend for this package to go on CRAN?
[ ] Do you intend for this package to go on Bioconductor?
[ ] Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:

MEE Options

[ ] The package is novel and will be of interest to the broad readership of the journal.
[ ] The manuscript describing the package is no longer than 3000 words.
[ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
(Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.)
(Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
(Please do not submit your package separately to Methods in Ecology and Evolution)

Code of conduct

[x] I agree to abide by rOpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.

May 12 '25 11:05 b-rodrigues

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.

May 12 '25 11:05 ropensci-review-bot

:rocket:

Editor check started

:wave:

May 12 '25 11:05 ropensci-review-bot

Checks for rixpress (v0.2.0)

git hash: dbdc68c8

:heavy_check_mark: Package name is available
:heavy_check_mark: has a 'codemeta.json' file.
:heavy_check_mark: has a 'contributing' file.
:heavy_multiplication_x: The following functions have no documented return values: [export_nix_archive, import_nix_archive, print.derivation, rxp_init]
:heavy_check_mark: uses 'roxygen2'.
:heavy_check_mark: 'DESCRIPTION' has a URL field.
:heavy_check_mark: 'DESCRIPTION' has a BugReports field.
:heavy_check_mark: Package has at least one HTML vignette
:heavy_multiplication_x: These functions do not have examples: [export_nix_archive, import_nix_archive, print.derivation, rxp_common_setup, rxp_file_common, rxp_inspect, rxp_list_logs, rxp_make, rxp_py_file, rxp_r_file].
:heavy_check_mark: Package has continuous integration checks.
:heavy_multiplication_x: Package coverage is 67% (should be at least 75%).
:heavy_check_mark: R CMD check found no errors.
:heavy_check_mark: R CMD check found no warnings.
:eyes: Function names are duplicated in other packages

Important: All failing checks above must be addressed prior to proceeding

(Checks marked with :eyes: may be optionally addressed.)

Package License: GPL (>= 3)

1. Package Dependencies

Details of Package Dependency Usage

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

type	package	ncalls
internal	base	328
internal	rixpress	53
internal	stats	8
internal	graphics	6
internal	utils	1
imports	jsonlite	5
imports	igraph	4
imports	processx	3
suggests	ggplot2	9
suggests	ggdag	3
suggests	dplyr	NA
suggests	knitr	NA
suggests	mockery	NA
suggests	reticulate	NA
suggests	rix	NA
suggests	rmarkdown	NA
suggests	testthat	NA
suggests	usethis	NA
suggests	visNetwork	NA
linking_to	NA	NA

Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(<path/to/repo>)', and examining the 'external_calls' table.

base

list (31), sapply (24), sprintf (20), paste0 (15), file.path (14), c (12), deparse1 (12), substitute (12), grep (11), lapply (10), for (9), gsub (9), list.files (8), paste (8), readLines (8), length (7), file (6), data.frame (5), match (5), regmatches (5), unlist (5), args (4), character (4), grepl (4), basename (3), Filter (3), format (3), gregexpr (3), pretty (3), seq_along (3), strsplit (3), sub (3), subset (3), unique (3), vapply (3), any (2), append (2), if (2), lengths (2), setdiff (2), stdout (2), system2 (2), tryCatch (2), which (2), as.character (1), cat (1), col (1), deparse (1), dirname (1), do.call (1), drop (1), file.info (1), getwd (1), I (1), identity (1), is.list (1), is.null (1), names (1), Negate (1), nrow (1), numeric (1), readline (1), readRDS (1), Reduce (1), regexec (1), rep (1), return (1), round (1), source (1), stop (1), Sys.time (1), system.file (1), vector (1)

rixpress

cb (3), get_need_py (3), get_need_r (3), gen_flat_pipeline (2), gen_pipeline (2), generate_configurePhase (2), load_line (2), parse_nix_envs (2), parse_packages (2), parse_rpkgs_git (2), rxp_inspect (2), rxp_list_logs (2), rxp_read_load_setup (2), unnest_all_columns (2), add_import (1), adjust_import (1), adjust_py_packages (1), confirm (1), dag_for_ci (1), export_nix_archive (1), generate_dag (1), generate_libraries_from_nix (1), generate_libraries_script (1), generate_py_libraries_from_nix (1), generate_r_libraries_from_nix (1), generate_r_or_py_libraries_from_nix (1), get_nodes_edges (1), import_formatter_py (1), import_formatter_r (1), import_nix_archive (1), print.derivation (1), rixpress (1), rxp_common_setup (1), rxp_copy (1), rxp_file_common (1), rxp_ga (1)

ggplot2

aes (7), scale_fill_manual (1), scale_shape_manual (1)

stats

df (5), var (2), line (1)

graphics

lines (6)

jsonlite

write_json (3), fromJSON (1), read_json (1)

igraph

write_graph (2), graph_from_data_frame (1), V (1)

ggdag

geom_dag_node (2), as_tidy_dagitty (1)

processx

run (3)

utils

timestamp (1)

2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties

The package has:

code in R (100% in 12 files) and
1 authors
7 vignettes
no internal data file
3 imported packages
29 exported functions (median 26 lines of code)
70 non-exported functions in R (median 30 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages. The following terminology is used:

loc = "Lines of Code"
fn = "function"
exp/not_exp = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the checks_to_markdown() function

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

measure	value	percentile	noteworthy
files_R	12	63.4
files_vignettes	7	98.0
files_tests	11	88.5
loc_R	1926	81.4
loc_vignettes	1257	93.2
loc_tests	1119	85.4
num_vignettes	7	98.4	TRUE
n_fns_r	99	74.8
n_fns_r_exported	29	76.9
n_fns_r_not_exported	70	74.4
n_fns_per_file_r	5	68.3
num_params_per_fn	3	29.3
loc_per_fn_r	28	74.0
loc_per_fn_r_exp	26	57.3
loc_per_fn_r_not_exp	30	78.2
rel_whitespace_R	15	77.2
rel_whitespace_vignettes	25	91.4
rel_whitespace_tests	16	80.6
doclines_per_fn_exp	25	23.9
doclines_per_fn_not_exp	0	0.0	TRUE
fn_call_network_size	37	58.8

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package

3. `goodpractice` and other checks

Details of goodpractice checks

3a. Continuous Integration Badges

GitHub Workflow Results

id	name	conclusion	sha	run_number	date
14971213883	anthophilic-walkingstick: linux, macos, macos-arm64, windows, ubuntu-next, ubuntu-release, gcc14	success	dbdc68	367	2025-05-12
14968903732	crabby-dromaeosaur: linux, macos, macos-arm64, windows, ubuntu-next, ubuntu-release, gcc14	failure	ebdacd	364	2025-05-12
14971161307	devtools-tests-via-r-nix	success	dbdc68	395	2025-05-12
14971138662	divinatory-neonredguppy: linux, macos, macos-arm64, windows, ubuntu-next, ubuntu-release, gcc14	success	1c903f	366	2025-05-12
14969011823	lousy-mice: linux, macos, macos-arm64, windows, ubuntu-next, ubuntu-release, gcc14	success	5d36a0	365	2025-05-12
14971227259	pages build and deployment	success	6f3863	347	2025-05-12
14971161309	pkgdown.yaml	success	dbdc68	403	2025-05-12
14971161323	run-rhub-checks	success	dbdc68	370	2025-05-12
14968514429	skeletonlike-wombat: linux, macos, macos-arm64, windows, ubuntu-next, ubuntu-release, gcc14	failure	433cb9	363	2025-05-12
14971161311	Test coverage	success	dbdc68	143	2025-05-12
14971161310	Trigger Demo Actions	success	dbdc68	236	2025-05-12

3b. `goodpractice` results

`R CMD check` with rcmdcheck

rcmdcheck found no errors, warnings, or notes

Test coverage with covr

Package coverage: 67.05

The following files are not completely covered by tests:

file	coverage
R/generate_dag.R	58.33%
R/plot_dag.R	36.42%
R/rxp_copy.R	27.78%
R/rxp_ga.R	66.67%
R/rxp_make.R	0%
R/rxp_read_load.R	0%

Cyclocomplexity with cyclocomp

The following functions have cyclocomplexity >= 15:

function	cyclocomplexity
gen_pipeline	33
generate_dag	25

Static code analyses with lintr

lintr found the following 148 potential issues:

message	number of times
Avoid 1:nrow(...) expressions, use seq_len.	1
Avoid changing the working directory, or restore it in on.exit	11
Avoid library() and require() calls in packages	20
Avoid using sapply, consider vapply instead, that's type safe	24
Lines should not be more than 80 characters. This line is 101 characters.	1
Lines should not be more than 80 characters. This line is 102 characters.	1
Lines should not be more than 80 characters. This line is 104 characters.	1
Lines should not be more than 80 characters. This line is 105 characters.	2
Lines should not be more than 80 characters. This line is 106 characters.	1
Lines should not be more than 80 characters. This line is 107 characters.	1
Lines should not be more than 80 characters. This line is 109 characters.	1
Lines should not be more than 80 characters. This line is 113 characters.	4
Lines should not be more than 80 characters. This line is 117 characters.	1
Lines should not be more than 80 characters. This line is 125 characters.	2
Lines should not be more than 80 characters. This line is 138 characters.	2
Lines should not be more than 80 characters. This line is 159 characters.	1
Lines should not be more than 80 characters. This line is 169 characters.	1
Lines should not be more than 80 characters. This line is 171 characters.	2
Lines should not be more than 80 characters. This line is 173 characters.	2
Lines should not be more than 80 characters. This line is 174 characters.	1
Lines should not be more than 80 characters. This line is 193 characters.	1
Lines should not be more than 80 characters. This line is 197 characters.	3
Lines should not be more than 80 characters. This line is 203 characters.	1
Lines should not be more than 80 characters. This line is 205 characters.	1
Lines should not be more than 80 characters. This line is 281 characters.	1
Lines should not be more than 80 characters. This line is 310 characters.	2
Lines should not be more than 80 characters. This line is 357 characters.	1
Lines should not be more than 80 characters. This line is 362 characters.	1
Lines should not be more than 80 characters. This line is 373 characters.	1
Lines should not be more than 80 characters. This line is 380 characters.	1
Lines should not be more than 80 characters. This line is 399 characters.	1
Lines should not be more than 80 characters. This line is 415 characters.	1
Lines should not be more than 80 characters. This line is 426 characters.	1
Lines should not be more than 80 characters. This line is 429 characters.	1
Lines should not be more than 80 characters. This line is 450 characters.	1
Lines should not be more than 80 characters. This line is 482 characters.	1
Lines should not be more than 80 characters. This line is 526 characters.	1
Lines should not be more than 80 characters. This line is 597 characters.	1
Lines should not be more than 80 characters. This line is 81 characters.	4
Lines should not be more than 80 characters. This line is 82 characters.	5
Lines should not be more than 80 characters. This line is 83 characters.	8
Lines should not be more than 80 characters. This line is 84 characters.	1
Lines should not be more than 80 characters. This line is 85 characters.	5
Lines should not be more than 80 characters. This line is 86 characters.	4
Lines should not be more than 80 characters. This line is 87 characters.	1
Lines should not be more than 80 characters. This line is 88 characters.	3
Lines should not be more than 80 characters. This line is 92 characters.	7
Lines should not be more than 80 characters. This line is 93 characters.	2
Lines should not be more than 80 characters. This line is 94 characters.	1
Lines should not be more than 80 characters. This line is 95 characters.	1
Lines should not be more than 80 characters. This line is 96 characters.	2
Lines should not be more than 80 characters. This line is 97 characters.	1
unexpected end of input	1
unexpected symbol	1
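
Three of the non-style messages in the table above (`sapply`, `1:nrow(...)`, and working-directory changes) each have a small base-R remedy. The following is only a minimal sketch of those remedies; the data and the function names `collect_rows()` and `list_files_in()` are hypothetical and are not taken from rixpress.

```r
# Hypothetical example data, for illustration only.
tables <- list(a = data.frame(x = 1:3), b = data.frame(x = integer(0)))

# vapply() declares the expected return type, so a non-integer result
# raises an error instead of silently changing the output's shape.
row_counts <- vapply(tables, nrow, integer(1))

# seq_len(nrow(df)) is empty for a zero-row data frame, whereas
# 1:nrow(df) would iterate over c(1, 0).
collect_rows <- function(df) {
  out <- character(0)
  for (i in seq_len(nrow(df))) {
    out <- c(out, paste0("row ", i))
  }
  out
}

# setwd() returns the previous directory; on.exit() restores it even
# if the function body fails.
list_files_in <- function(dir) {
  old_wd <- setwd(dir)
  on.exit(setwd(old_wd), add = TRUE)
  list.files()
}
```

Each pattern is local to the function it appears in, so the fixes could be applied one lint message at a time.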

4. Other Checks

Details of other checks

:heavy_multiplication_x: The following function name is duplicated in other packages:

- get_nodes_edges from malan

Package Versions

package	version
pkgstats	0.2.0.54
pkgcheck	0.1.2.126

Editor-in-Chief Instructions:

Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.

May 12 '25 12:05 ropensci-review-bot

Thanks @b-rodrigues, can you please address the three failing checks:

✖️ The following functions have no documented return values: [export_nix_archive, import_nix_archive, print.derivation, rxp_init]
✖️ These functions do not have examples: [export_nix_archive, import_nix_archive, print.derivation, rxp_common_setup, rxp_file_common, rxp_inspect, rxp_list_logs, rxp_make, rxp_py_file, rxp_r_file].
✖️ Package coverage is 67% (should be at least 75%).

I also note that the function with a duplicated name is get_nodes_edges(), which is likely overly generic. I see you've prepended many functions with rxp_ - perhaps you could also do the same with that function? I also see you don't currently use our pkgcheck action. That might help to ensure everything is okay, or if you'd rather not, you can check locally, and then once you confirm all is ✔ , feel free to call @ropensci-review-bot check package. Thanks!

May 12 '25 12:05 mpadge

OK, so I've implemented the changes, except for the unit test coverage. As explained, the package relies heavily on side effects, so increasing coverage to 75% will be quite difficult, especially because the functions that are not tested are those that would require build artifacts in the Nix store. Mocking that would be a pain in the bottom. As a compromise, I set up this repo: https://github.com/b-rodrigues/rixpress_demos with complete pipelines that test these functions.

Would this be ok?

> I also note that the function with a duplicated name is get_nodes_edges(), which is likely overly generic.

This function was being exported by mistake, I don't export it anymore, so the clash shouldn't cause any issue.

May 12 '25 15:05 b-rodrigues

@ropensci-review-bot check package

May 12 '25 15:05 b-rodrigues

Thanks, about to send the query.

May 12 '25 15:05 ropensci-review-bot

:rocket:

Editor check started

:wave:

May 12 '25 15:05 ropensci-review-bot

Checks for rixpress (v0.2.0)

git hash: 8e396034

:heavy_check_mark: Package name is available
:heavy_check_mark: has a 'codemeta.json' file.
:heavy_check_mark: has a 'contributing' file.
:heavy_check_mark: uses 'roxygen2'.
:heavy_check_mark: 'DESCRIPTION' has a URL field.
:heavy_check_mark: 'DESCRIPTION' has a BugReports field.
:heavy_check_mark: Package has at least one HTML vignette
:heavy_check_mark: All functions have examples.
:heavy_check_mark: Package has continuous integration checks.
:heavy_multiplication_x: Package coverage is 67.5% (should be at least 75%).
:heavy_check_mark: R CMD check found no errors.
:heavy_check_mark: R CMD check found no warnings.

Important: All failing checks above must be addressed prior to proceeding

Package License: GPL (>= 3)

1. Package Dependencies

Details of Package Dependency Usage

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

type	package	ncalls
internal	base	331
internal	rixpress	52
internal	stats	8
internal	graphics	6
internal	utils	1
imports	jsonlite	5
imports	igraph	4
imports	processx	3
suggests	ggplot2	9
suggests	ggdag	3
suggests	dplyr	NA
suggests	knitr	NA
suggests	mockery	NA
suggests	reticulate	NA
suggests	rix	NA
suggests	rmarkdown	NA
suggests	testthat	NA
suggests	usethis	NA
suggests	visNetwork	NA
linking_to	NA	NA

Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(<path/to/repo>)', and examining the 'external_calls' table.

base

list (31), sprintf (20), paste0 (15), file.path (14), vapply (13), c (12), deparse1 (12), substitute (12), grep (11), sapply (11), lapply (10), character (9), for (9), gsub (9), list.files (8), paste (8), readLines (8), length (7), file (6), data.frame (5), match (5), regmatches (5), unlist (5), args (4), grepl (4), basename (3), Filter (3), format (3), gregexpr (3), pretty (3), seq_along (3), strsplit (3), sub (3), subset (3), unique (3), any (2), append (2), if (2), lengths (2), setdiff (2), stdout (2), system2 (2), tryCatch (2), which (2), as.character (1), cat (1), col (1), deparse (1), dirname (1), do.call (1), drop (1), file.info (1), getwd (1), I (1), identity (1), is.list (1), is.null (1), logical (1), names (1), Negate (1), nrow (1), numeric (1), readline (1), readRDS (1), Reduce (1), regexec (1), rep (1), return (1), round (1), source (1), stop (1), Sys.time (1), system.file (1), vector (1)

rixpress

cb (3), get_need_py (3), get_need_r (3), gen_flat_pipeline (2), gen_pipeline (2), generate_configurePhase (2), parse_nix_envs (2), parse_packages (2), parse_rpkgs_git (2), rxp_inspect (2), rxp_list_logs (2), rxp_read_load_setup (2), unnest_all_columns (2), add_import (1), adjust_import (1), adjust_py_packages (1), confirm (1), dag_for_ci (1), export_nix_archive (1), generate_dag (1), generate_libraries_from_nix (1), generate_libraries_script (1), generate_py_libraries_from_nix (1), generate_r_libraries_from_nix (1), generate_r_or_py_libraries_from_nix (1), get_nodes_edges (1), import_formatter_py (1), import_formatter_r (1), import_nix_archive (1), load_line (1), print.derivation (1), rixpress (1), rxp_common_setup (1), rxp_copy (1), rxp_file_common (1), rxp_ga (1)

ggplot2

aes (7), scale_fill_manual (1), scale_shape_manual (1)

stats

df (5), var (2), line (1)

graphics

lines (6)

jsonlite

write_json (3), fromJSON (1), read_json (1)

igraph

write_graph (2), graph_from_data_frame (1), V (1)

ggdag

geom_dag_node (2), as_tidy_dagitty (1)

processx

run (3)

utils

timestamp (1)

2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties

The package has:

code in R (100% in 12 files) and
1 authors
7 vignettes
no internal data file
3 imported packages
29 exported functions (median 26 lines of code)
70 non-exported functions in R (median 30 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages. The following terminology is used:

loc = "Lines of Code"
fn = "function"
exp/not_exp = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the checks_to_markdown() function

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

measure	value	percentile	noteworthy
files_R	12	63.4
files_vignettes	7	98.0
files_tests	11	88.5
loc_R	1948	81.6
loc_vignettes	1257	93.2
loc_tests	1119	85.4
num_vignettes	7	98.4	TRUE
n_fns_r	99	74.8
n_fns_r_exported	29	76.9
n_fns_r_not_exported	70	74.4
n_fns_per_file_r	5	68.3
num_params_per_fn	3	29.3
loc_per_fn_r	28	74.0
loc_per_fn_r_exp	26	57.3
loc_per_fn_r_not_exp	30	78.2
rel_whitespace_R	15	77.2
rel_whitespace_vignettes	25	91.4
rel_whitespace_tests	16	80.6
doclines_per_fn_exp	29	31.4
doclines_per_fn_not_exp	0	0.0	TRUE
fn_call_network_size	37	58.8

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package

3. `goodpractice` and other checks

Details of goodpractice checks

3a. Continuous Integration Badges

GitHub Workflow Results

id	name	conclusion	sha	run_number	date
14976041988	acidophilic-americancreamdraft: linux, macos, macos-arm64, windows, ubuntu-next, ubuntu-release, gcc14	success	0c0291	375	2025-05-12
14976311675	devtools-tests-via-r-nix	success	8e3960	406	2025-05-12
14976376869	pages build and deployment	success	ced3fb	358	2025-05-12
14976311672	pkgcheck	NA	8e3960	8	2025-05-12
14976311670	pkgdown.yaml	success	8e3960	414	2025-05-12
14976311683	run-rhub-checks	success	8e3960	381	2025-05-12
14976363240	serpentine-xoloitzcuintli: linux, macos, macos-arm64, windows, ubuntu-next, ubuntu-release, gcc14	NA	8e3960	378	2025-05-12
14976311680	Test coverage	success	8e3960	154	2025-05-12
14976108488	timeconsuming-limpkin: linux, macos, macos-arm64, windows, ubuntu-next, ubuntu-release, gcc14	success	932bbf	376	2025-05-12
14976197952	transcendentalistic-lowchen: linux, macos, macos-arm64, windows, ubuntu-next, ubuntu-release, gcc14	success	932bbf	377	2025-05-12
14976311678	Trigger Demo Actions	success	8e3960	247	2025-05-12

3b. `goodpractice` results

`R CMD check` with rcmdcheck

rcmdcheck found no errors, warnings, or notes

Test coverage with covr

Package coverage: 67.5

The following files are not completely covered by tests:

file	coverage
R/generate_dag.R	58.33%
R/plot_dag.R	36.42%
R/rxp_copy.R	27.78%
R/rxp_ga.R	66.67%
R/rxp_make.R	0%
R/rxp_read_load.R	0%

Cyclocomplexity with cyclocomp

The following functions have cyclocomplexity >= 15:

function	cyclocomplexity
gen_pipeline	33
generate_dag	25

Static code analyses with lintr

lintr found the following 134 potential issues:

message	number of times
Avoid 1:nrow(...) expressions, use seq_len.	1
Avoid changing the working directory, or restore it in on.exit	11
Avoid library() and require() calls in packages	20
Avoid using sapply, consider vapply instead, that's type safe	10
Lines should not be more than 80 characters. This line is 101 characters.	1
Lines should not be more than 80 characters. This line is 102 characters.	1
Lines should not be more than 80 characters. This line is 104 characters.	1
Lines should not be more than 80 characters. This line is 105 characters.	2
Lines should not be more than 80 characters. This line is 106 characters.	1
Lines should not be more than 80 characters. This line is 107 characters.	1
Lines should not be more than 80 characters. This line is 109 characters.	1
Lines should not be more than 80 characters. This line is 113 characters.	4
Lines should not be more than 80 characters. This line is 117 characters.	1
Lines should not be more than 80 characters. This line is 125 characters.	2
Lines should not be more than 80 characters. This line is 138 characters.	2
Lines should not be more than 80 characters. This line is 159 characters.	1
Lines should not be more than 80 characters. This line is 169 characters.	1
Lines should not be more than 80 characters. This line is 171 characters.	2
Lines should not be more than 80 characters. This line is 173 characters.	2
Lines should not be more than 80 characters. This line is 174 characters.	1
Lines should not be more than 80 characters. This line is 193 characters.	1
Lines should not be more than 80 characters. This line is 197 characters.	3
Lines should not be more than 80 characters. This line is 203 characters.	1
Lines should not be more than 80 characters. This line is 205 characters.	1
Lines should not be more than 80 characters. This line is 281 characters.	1
Lines should not be more than 80 characters. This line is 310 characters.	2
Lines should not be more than 80 characters. This line is 357 characters.	1
Lines should not be more than 80 characters. This line is 362 characters.	1
Lines should not be more than 80 characters. This line is 373 characters.	1
Lines should not be more than 80 characters. This line is 380 characters.	1
Lines should not be more than 80 characters. This line is 399 characters.	1
Lines should not be more than 80 characters. This line is 415 characters.	1
Lines should not be more than 80 characters. This line is 426 characters.	1
Lines should not be more than 80 characters. This line is 429 characters.	1
Lines should not be more than 80 characters. This line is 450 characters.	1
Lines should not be more than 80 characters. This line is 482 characters.	1
Lines should not be more than 80 characters. This line is 526 characters.	1
Lines should not be more than 80 characters. This line is 597 characters.	1
Lines should not be more than 80 characters. This line is 81 characters.	4
Lines should not be more than 80 characters. This line is 82 characters.	5
Lines should not be more than 80 characters. This line is 83 characters.	8
Lines should not be more than 80 characters. This line is 84 characters.	1
Lines should not be more than 80 characters. This line is 85 characters.	3
Lines should not be more than 80 characters. This line is 86 characters.	4
Lines should not be more than 80 characters. This line is 87 characters.	3
Lines should not be more than 80 characters. This line is 88 characters.	3
Lines should not be more than 80 characters. This line is 92 characters.	7
Lines should not be more than 80 characters. This line is 93 characters.	2
Lines should not be more than 80 characters. This line is 94 characters.	1
Lines should not be more than 80 characters. This line is 95 characters.	1
Lines should not be more than 80 characters. This line is 96 characters.	2
Lines should not be more than 80 characters. This line is 97 characters.	1
unexpected end of input	1
unexpected symbol	1

Package Versions

package	version
pkgstats	0.2.0.54
pkgcheck	0.1.2.126

Editor-in-Chief Instructions:

Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.

May 12 '25 15:05 ropensci-review-bot

Preliminary Editor checks:

[ ] Documentation: The package has sufficient documentation available online (README, pkgdown docs) to allow for an assessment of functionality and scope without installing the package. In particular,
- [x] Is the case for the package well made?
- [ ] Is the reference index page clear (grouped by topic if necessary)?
- [x] Are vignettes readable, sufficiently detailed and not just perfunctory?
[x] Fit: The package meets criteria for fit and overlap.
[x] Installation instructions: Are installation instructions clear enough for human users?
[x] Tests: If the package has some interactivity / HTTP / plot production etc. are the tests using state-of-the-art tooling?
[ ] Contributing information: Is the documentation for contribution clear enough e.g. tokens for tests, playgrounds?
[x] License: The package has a CRAN or OSI accepted license.
[x] Project management: Are the issue and PR trackers in a good shape, e.g. are there outstanding bugs, is it clear when feature requests are meant to be tackled?

Editor comments

Thanks for your submission @b-rodrigues, which looks like a very useful extension of {rix}. I expect we'll proceed soon, but note first a couple of very minor issues from the checks above:

The package reference page has all functions together. Could you please structure the reference index by adding {roxygen2} @family tags, as described in this section of our Dev Guide?
Your extended checks repository in https://github.com/b-rodrigues/rixpress_demos is a great solution to testing, and definitely satisfactory for us. In order to satisfy the second missing item in the checklist above, could you please:
- Add a bit more detail to your current CONTRIBUTING.md, especially including description of how {rixpress-demo} is used in tests; and
- Explicitly reference CONTRIBUTING.md somewhere in your readme, with brief instructions on how to contribute.
- Not necessary now, but good to keep in mind: Issue templates would provide a great way to ensure all who wanted to contribute were aware of {rixpress-demo}, and understood the relationship between the two repos.

Let us know when those points have been addressed, and we'll proceed from there. Thanks :+1:

May 14 '25 08:05 mpadge

hi @mpadge thanks for your feedback! I've addressed your suggestions.

May 14 '25 18:05 b-rodrigues

@b-rodrigues Sorry for slight delay here, we're still trying to find and assign an editor to handle this. Should be assigned soon.

May 20 '25 08:05 mpadge

no worries :)

May 20 '25 09:05 b-rodrigues

@ropensci-review-bot assign @ldecicco-USGS as editor

May 27 '25 13:05 ldecicco-USGS

Assigned! @ldecicco-USGS is now the editor

May 27 '25 13:05 ropensci-review-bot

Editor checks:

[x] Documentation: The package has sufficient documentation available online (README, pkgdown docs) to allow for an assessment of functionality and scope without installing the package. In particular,
- [x] Is the case for the package well made?
- [x] Is the reference index page clear (grouped by topic if necessary)?
- [x] Are vignettes readable, sufficiently detailed and not just perfunctory?
[x] Fit: The package meets criteria for fit and overlap.
[x] Installation instructions: Are installation instructions clear enough for human users?
[x] Tests: If the package has some interactivity / HTTP / plot production etc. are the tests using state-of-the-art tooling?
[x] Contributing information: Is the documentation for contribution clear enough e.g. tokens for tests, playgrounds?
[x] License: The package has a CRAN or OSI accepted license.
[x] Project management: Are the issue and PR trackers in a good shape, e.g. are there outstanding bugs, is it clear when feature requests are meant to be tackled?

Editor comments

Looks great as usual.

Jun 10 '25 21:06 ldecicco-USGS

@ropensci-review-bot seeking reviewers

Jun 10 '25 21:06 ldecicco-USGS

Please add this badge to the README of your package repository:

[![Status at rOpenSci Software Peer Review](https://badges.ropensci.org/706_status.svg)](https://github.com/ropensci/software-review/issues/706)

Furthermore, if your package does not have a NEWS.md file yet, please create one to capture the changes made during the review process. See https://devguide.ropensci.org/releasing.html#news

Jun 10 '25 21:06 ropensci-review-bot

@ropensci-review-bot assign @wlandau as reviewer

Jul 07 '25 12:07 ldecicco-USGS

@wlandau added to the reviewers list. Review due date is 2025-07-28. Thanks @wlandau for accepting to review! Please refer to our reviewer guide.

rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.

Jul 07 '25 12:07 ropensci-review-bot

@wlandau: If you haven't done so, please fill this form for us to update our reviewers records.

Jul 07 '25 12:07 ropensci-review-bot

@ropensci-review-bot assign @amart90 as reviewer

Jul 07 '25 14:07 ldecicco-USGS

@amart90 added to the reviewers list. Review due date is 2025-07-28. Thanks @amart90 for accepting to review! Please refer to our reviewer guide.

rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.

Jul 07 '25 14:07 ropensci-review-bot

@amart90: If you haven't done so, please fill this form for us to update our reviewers records.

Jul 07 '25 14:07 ropensci-review-bot

Package Review

Briefly describe any working relationship you have (had) with the package authors.

Bruno and I follow each other's work as members of the R community. We have not yet worked together directly on a project.

[x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (if you are unsure whether you are in conflict, please speak to your editor before starting your review).

As the author of targets, I took a careful look at the coi guidelines:

The potential editor or reviewer has a conflict of interest if:...The potential reviewer/editor has significantly contributed to a competitor project.

There is obvious overlap, but I would not say rixpress is a competitor. rixpress has a niche outside the scope of targets:

nix-store as the engine to run pipelines and store data.
Polyglot pipelines where Python and Julia are first-class citizens alongside R.
Multi-environment pipelines.

I checked with @ldecicco-USGS, who agreed.

Documentation

The package includes all the following forms of documentation:

[x] A statement of need: clearly stating problems the software is designed to solve and its target audience in README
[x] Installation instructions: for the development version of package and any non-standard dependencies in README
[x] Vignette(s): demonstrating major functionality that runs successfully locally
[x] Function Documentation: for all exported functions
[x] Examples: (that run successfully locally) for all exported functions
[x] Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Functionality

[x] Installation: Installation succeeds as documented.
[x] Functionality: Any functional claims of the software have been confirmed.
[x] Performance: Any performance claims of the software have been confirmed.
[x] Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
[x] Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.

Estimated hours spent reviewing: 4

[x] Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

Overview

rixpress is an excellent prospective addition to rOpenSci. It fills a valuable niche in reproducible computation, the engineering is fantastic, and the documentation is comprehensive. Because the quality is already so high, I did not need to spend much time checking package development minutiae. I spent most of my review time on high-level issues and my own experience as a new user.

Scope of this review

I reviewed:

The documentation at https://b-rodrigues.github.io/rixpress/.
Examples basic_r, r_multi_envs, and r_py_xgboost from https://github.com/b-rodrigues/rixpress_demos.
The rixpress source code and test suite.

Scope of `rixpress`

Arguably the most essential but most difficult part of developing any tool is establishing a clear and crisp set of requirements. Explicit pre-specified boundaries help prevent scope creep and ensure a package's priorities succeed long-term. For pipeline tools, scope is even more essential and even more challenging than usual, both because of the many different opinions about what a pipeline tool should do, and because of the huge variety of pipelines users routinely create.

I developed targets as a highly opinionated tool with R-focused research-oriented scenarios in mind. This vision was somewhat implicit, and I did not have enough experience then to completely spell it out. I regularly hear from people who use it for cases I did not consider: simple ETL operations on big data, database query workflows, daily pipelines where historical runs matter, etc. Some users even approach targets as an Airflow-like tool rather than a Make-like one, and they are looking for a feature set closer to what maestro provides.

You might have the same experience with rixpress. For example, users who switch from targets to rixpress may ask you to support branching, alternative DSLs, interactive debugging, fancy progress monitoring, alternative storage options, computing on clusters, alternative DAG visualizations, etc.

For rixpress, I would like to understand what the package may cover in the future, and what it definitely will not support. I think a dedicated section on scope in the documentation (possibly linked from the issue templates) will help set expectations for users who request features, and it will help you maintain rixpress for years to come.

At this early stage, the main areas of focus seem to be as follows (please correct me if I am wrong):

1. Bringing pipeline functionality from nix-store to R.
2. Interactive read-only inspection: visualization, reading from the data store, and historical runs.
3. R/Python/Julia interoperability.
4. Portability: through Nix itself, and continuous integration.

(1) seems like a promising direction because it invests in the qualities that make rixpress most unique.

An aside: if you intend to expand on (1), e.g. alternative store types, you might also consider writing a low-level Nix client to facilitate the implementation, kind of like gert for Git, cmdstanr for CmdStan, or paws for AWS. This might even help you maintain rix.

Visualization

DAG visualizations greatly improve the user experience, but they are also a Pandora's Box of scope creep. rixpress already supports 3 backends for graphs: visNetwork, ggdag, and GraphViz (DOT; for CI). And each one is a magnet for feature requests.

To simplify the visualization feature set, what about using mermaid.js instead of ggdag or GraphViz? Mermaid graphs are just text, and they are very easy to generate without any additional R packages. For CI, you could use https://github.com/AlexanderGrooff/mermaid-ascii, which I think would produce graphs that are more readable and visually appealing than GraphViz can render (e.g. https://github.com/b-rodrigues/rixpress_demos/actions/runs/16252270236/job/45883546684#step:9:11).
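
To illustrate how little is needed to emit Mermaid text, here is a minimal sketch in base R. The helper name `edges_to_mermaid()` and the `from`/`to` column names are assumptions made for this example and are not part of rixpress.

```r
# Hypothetical helper: turn a data frame of DAG edges into Mermaid
# flowchart text. The `from` and `to` column names are assumptions.
edges_to_mermaid <- function(edges, direction = "LR") {
  stopifnot(is.data.frame(edges), all(c("from", "to") %in% names(edges)))
  # One "  a --> b" line per edge, after the "graph <direction>" header.
  edge_lines <- sprintf("  %s --> %s", edges$from, edges$to)
  c(paste("graph", direction), edge_lines)
}

# Made-up pipeline edges, for illustration only.
edges <- data.frame(
  from = c("mtcars", "mtcars_am", "mtcars_head"),
  to = c("mtcars_am", "mtcars_head", "page"),
  stringsAsFactors = FALSE
)
mermaid_text <- edges_to_mermaid(edges)
writeLines(mermaid_text)
```

The resulting character vector could be written to a file or pasted into any Mermaid renderer, with no plotting package involved.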

visNetwork might only be necessary if you expect enormous pipelines whose graphs can only be explored interactively. If you do decide to keep rixpress::rxp_visnetwork(), I suggest keeping the feature set simple and tightly scoped. (Maybe it would also be a good idea to disable physics to improve rendering performance for large graphs.) visNetwork is great at zooming in and out of graphs of pretty much any size, but from experience developing targets::tar_visnetwork(), I have found it does not excel at creating nice-looking polished graphs. I think mermaid.js is much better at feature-rich pretty graphs.

Workflow functions

From the examples, I see two patterns for setting up and running pipelines. In simple cases:

list(
  rxp_r(...)
) |>
  rixpress()

but for Python projects such as r_py_xgboost:

list(
  rxp_py(...)
) |>
  rixpress(build = FALSE)
  
adjust_import(...)

rxp_make()

I think it would be clearer and more consistent to create a separate function (maybe rxp_populate()) which runs the equivalent of rixpress(build = FALSE). Then, if rixpress() itself is still needed, it could serve as the equivalent of rxp_populate() + rxp_make().
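
A rough sketch of that split, assuming rixpress() keeps its `build` argument and rxp_make() takes no required arguments. The `_sketch` function names and the `generate`/`build` parameters are hypothetical and only exist so the idea can be tried without Nix.

```r
# Hypothetical: generate the pipeline files without building them.
# `generate` defaults to the existing rixpress::rixpress() but can be
# swapped out (the default is only evaluated when it is actually used).
rxp_populate_sketch <- function(derivs, ..., generate = rixpress::rixpress) {
  generate(derivs, build = FALSE, ...)
}

# Hypothetical: the one-step equivalent, populate then build.
rxp_pipeline_sketch <- function(derivs, ...,
                                generate = rixpress::rixpress,
                                build = rixpress::rxp_make) {
  rxp_populate_sketch(derivs, ..., generate = generate)
  build()
}
```

With this layout, a Python project could call the populate step, adjust imports, and then call rxp_make(), while simple projects keep a single call.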

add_import() and adjust_import() feel a bit awkward as separate steps. You might instead consider an interface like:

list(
  rxp_py(...)
) |>
rxp_populate(
  derivations,
  py_imports = c(
    numpy = "from numpy import array, loadtxt",
    xgboost = "from xgboost import XGBClassifier"
  )
)
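
Internally, a `py_imports` argument like this might boil down to a text substitution. The following is only a sketch under the assumption that the generated Python scripts contain plain `import <pkg>` lines; `apply_py_imports()` is a hypothetical helper, not an existing rixpress function.

```r
# Hypothetical helper: rewrite plain `import <pkg>` lines in a generated
# Python script, using a named character vector of replacement statements.
apply_py_imports <- function(script_lines, py_imports) {
  for (pkg in names(py_imports)) {
    # Match only lines that are exactly `import <pkg>` (ignoring whitespace).
    is_plain_import <- trimws(script_lines) == paste("import", pkg)
    script_lines[is_plain_import] <- py_imports[[pkg]]
  }
  script_lines
}

# Made-up script contents, for illustration only.
generated <- c("import pickle", "import numpy", "import xgboost")
adjusted <- apply_py_imports(
  generated,
  c(
    numpy = "from numpy import array, loadtxt",
    xgboost = "from xgboost import XGBClassifier"
  )
)
```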

Names for functions and classes

I have minor suggestions to make the names of functions more internally consistent:

export_nix_archive() => rxp_export_nix_archive()
import_nix_archive() => rxp_import_nix_archive()
generate_dag() => rxp_generate_dag() or rxp_write_dag() or rxp_save_dag()

(You may not need dag_for_ci(), add_import(), or adjust_import() if you agree with my suggestions from earlier.)

In addition, functions like rxp_r() produce an object of class "derivation". I suggest renaming it to something like "rxp_derivation" so it does not conflict with e.g. mathematical packages with their own kinds of "derivations".
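
As a sketch of what a prefixed class could look like, assuming a simple list-based S3 object. The constructor name and the `name`, `snippet`, and `type` fields are made up for illustration and may not match the real object.

```r
# Hypothetical constructor for a prefixed S3 class.
new_rxp_derivation <- function(name, snippet, type = "rxp_r") {
  stopifnot(is.character(name), length(name) == 1L)
  structure(
    list(name = name, snippet = snippet, type = type),
    class = "rxp_derivation"
  )
}

# Methods dispatch on the prefixed class, so they cannot collide with
# another package's methods for a class called "derivation".
print.rxp_derivation <- function(x, ...) {
  cat("<rxp_derivation>", x$name, "of type", x$type, "\n")
  invisible(x)
}

d <- new_rxp_derivation("mtcars_am", "filter(mtcars, am == 1)")
print(d)
```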

Installation experience

I am new to Nix, rix, and rixpress, and I began by installing the toolchain from scratch on an M2 Macbook Pro with OS 15.5. This is my work computer, so it has more security restrictions than a regular personal computer.

I followed the rix setup guide for macOS, which was clear and comprehensive. The curl command successfully downloaded the Determinate Systems installer, but the installer itself failed. First I realized I needed to run it with sudo, but even that failed. Nix installed successfully when I navigated a browser to https://docs.determinate.systems/, manually downloaded the installer, and double-clicked it to run it. Maybe consider updating the vignette to mention that the point-and-click route is possible?

Afterwards, I installed cachix and ran cachix use rstats-on-nix. library(rix) initially showed these warning messages, but rix::setup_cachix() silenced them. The next library(rix) gave me a warning about an incomplete final line in ~/.config/nix/nix.conf, which I solved by manually opening the text file and adding a line break. I suggest ensuring rix::setup_cachix() leaves a terminating newline character in ~/.config/nix/nix.conf.
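
A minimal base R sketch of such a check follows. `ensure_trailing_newline()` is a hypothetical helper, the file contents are example text only, and this is not how rix::setup_cachix() is currently implemented.

```r
# Hypothetical helper: append a newline to `path` if its last byte is not one.
# Returns TRUE (invisibly) when the file was modified.
ensure_trailing_newline <- function(path) {
  size <- file.size(path)
  if (is.na(size) || size == 0) return(invisible(FALSE))
  bytes <- readBin(path, what = "raw", n = size)
  # 0x0a is the line feed character.
  if (bytes[length(bytes)] == as.raw(0x0a)) return(invisible(FALSE))
  cat("\n", file = path, append = TRUE)
  invisible(TRUE)
}

# Demonstration on a temporary file rather than the real nix.conf.
conf <- tempfile(fileext = ".conf")
cat("experimental-features = nix-command flakes", file = conf)
changed <- ensure_trailing_newline(conf)
```

Calling the helper a second time leaves the file untouched, so it would be safe to run at the end of every write.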

Storage

I really like the build logs feature you describe in https://b-rodrigues.github.io/rixpress/articles/g-logs.html. Over multiple pipeline runs, however, storage may accumulate, especially because Nix uses content-addressable storage (by hash). It may help to describe in that vignette how users can leverage the garbage collection features of Nix to clear out the data that is no longer needed.
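
As a rough idea of what such a vignette section could show from R, here is a sketch built on `nix-store --gc`, with `--print-dead` used as a dry run that lists unreachable store paths without deleting them. The function names are hypothetical, and store paths that are still referenced by a GC root (for example a result symlink) would not be collected.

```r
# Hypothetical helper: arguments for a Nix garbage collection run.
# With dry_run = TRUE, `--print-dead` only lists paths that would be deleted.
nix_gc_args <- function(dry_run = TRUE) {
  c("--gc", if (dry_run) "--print-dead")
}

# Hypothetical wrapper: run the collection and return nix-store's output.
run_nix_gc <- function(dry_run = TRUE) {
  if (!nzchar(Sys.which("nix-store"))) {
    stop("nix-store was not found on the PATH")
  }
  system2("nix-store", nix_gc_args(dry_run), stdout = TRUE)
}

# Inspect the command line without running anything.
paste(c("nix-store", nix_gc_args()), collapse = " ")
```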

Multi-line expressions

The following pipeline succeeds:

library(rixpress)
list(
  rxp_r(
    name = derivation,
    expr = 1 + 1
  )
) |>
  rixpress()

But a similar one fails:

library(rixpress)
list(
  rxp_r(
    name = derivation,
    expr = {
      message("Running derivation")
      1 + 1
    }
  )
) |>
  rixpress()

with the error message:

Error: unexpected numeric constant in "  derivation <- {     message('Running derivation')     1

Same if I add a semicolon after the message() statement. I expect this means rxp_r() etc. do not support multi-line expressions. I would suggest either adding this support or requiring expressions to be pure function calls.
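
One plausible cause, which I have not confirmed in the rixpress source, is that the captured expression is deparsed and its lines are joined with spaces. The base R sketch below reproduces that kind of parse error and shows that joining with newlines avoids it. Since deparse() does not keep semicolons, this would also be consistent with the semicolon variant failing.

```r
expr <- quote({
  message("Running derivation")
  1 + 1
})

# Joining the deparsed lines with spaces puts both statements on one line,
# which is not valid R, so parse() signals an error.
one_line <- paste(deparse(expr), collapse = " ")
space_result <- tryCatch(parse(text = one_line), error = function(e) e)

# Joining with newlines keeps the statements separate, so the code parses
# and evaluates normally.
multi_line <- paste(deparse(expr), collapse = "\n")
newline_value <- eval(parse(text = multi_line))
```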

Testing

I recommend including skip_if_not_installed() statements in tests where Suggests: packages are used (such as mockery and reticulate). In addition, when I ran the tests locally, one test threw a warning:

Warning (test-generate_libraries_from_nix.R:42:3): generate_py_libraries_from_nix: generate Py script by parsing default.nix
Python packages have been requested, but 'reticulate' is not in your list of R packages. If you want to handle Python objects from your R session, consider adding 'reticulate' to the list of R packages.
Backtrace
    ▆
 1. └─rix::rix(...) at test-generate_libraries_from_nix.R:42:3

Adding r_pkgs = reticulate to rix::rix() in https://github.com/b-rodrigues/rixpress/blob/ea052f4a024bb47705bf186541380d5febac279e/tests/testthat/test-generate_libraries_from_nix.R#L42 should remove it.

Test coverage from covr is lower than I normally see in packages, but I really like your approach to offload to https://github.com/b-rodrigues/rixpress_demos. If the number of projects in that repo grows unmanageable at some point, you might consider creating a new GitHub org for them like https://github.com/nf-core does for Nextflow.

Checks

When I ran devtools::check() locally, I saw the note:

✔  checking for non-standard things in the check directory
N  checking for detritus in the temp directory
   Found the following files/directories:
     ‘RtmptcpOyn_repo_hash_url_jnlhe’

I have been flagged for this before when trying to submit packages to CRAN.
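
A common way to avoid this note is to remove scratch directories as soon as the code that created them exits. The sketch below assumes the leftover directory is created by code you control; `with_clean_tempdir()` and the `rxp_scratch_` prefix are hypothetical names.

```r
# Hypothetical helper: run `code_fn` with a scratch directory that is
# deleted when the function exits, even if `code_fn` fails.
with_clean_tempdir <- function(code_fn, prefix = "rxp_scratch_") {
  scratch <- tempfile(pattern = prefix)
  dir.create(scratch)
  on.exit(unlink(scratch, recursive = TRUE, force = TRUE), add = TRUE)
  code_fn(scratch)
}

result <- with_clean_tempdir(function(dir) {
  path <- file.path(dir, "repo_hash.txt")
  writeLines("abc123", path)
  readLines(path)
})

# Nothing with the scratch prefix should remain in the session temp directory.
leftovers <- list.files(tempdir(), pattern = "^rxp_scratch_")
```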

Lints

On my local machine, devtools::lint() shows many lints, including:

data-raw/gen_pipeline.R:6:3: style: [quotes_linter] Only use double-quotes.
  'mtcars.csv',
  ^~~~~~~~~~~~
data-raw/gen_pipeline.R:43:2: style: [commented_code_linter] Remove commented code.
#rxp_make()
 ^~~~~~~~~~
data-raw/jl_example/functions.R:1:34: style: [brace_linter] There should be a space before an opening curly brace.
prepare_data <- function(laplace){
                                 ^

I have never used the Air formatter, and I do see you have https://github.com/b-rodrigues/rixpress/blob/main/.github/workflows/style-with-air.yaml, so please disregard if there is an inherent conflict between Air and lintr.

devtools::spell_check() has many findings, including:

> devtools::spell_check()
DESCRIPTION does not contain 'Language' field. Defaulting to 'en-US'.
  WORD                FOUND IN
’s                README.md:18
al                  a-intro-concepts.Rmd:124
Analysing           d-polyglot.Rmd:34
autoplay            b-core-functions.Rmd:272
buildInputs         make_derivation_snippet.Rd:20
cachix              d-polyglot.Rmd:46
Cachix              d-polyglot.Rmd:62
cancelled           rxp_init.Rd:17
ci                  dag_for_ci.Rd:36,40
                    generate_dag.Rd:29,33
                    rxp_ga.Rd:29,33
cmdstanr            f-cmdstanr.Rmd:2
configurePhase      make_derivation_snippet.Rd:20
cryptographic       a-intro-concepts.Rmd:164,171,214,230
CTRL                c-tutorial.Rmd:105
                    d-polyglot.Rmd:107
                    d2-polyglot-julia.Rmd:82
deriv               rixpress.Rd:18
...

You can exclude specific false positives in inst/WORDLIST.

On my machine, urlchecker::url_check() shows:

✖ Error: vignettes/a-intro-concepts.Rmd:32:21 
403: Forbidden
spectrum/continuum](https://www.researchgate.net/figure/Reproducibility-spectrum-as-Peng-2011-stated_fig1_354765302),
                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This could be because of the extra network security in my work environment, or it could be because ResearchGate has checks for bots. In the latter case, CRAN might flag the URL.

Miscellaneous suggestions

In rxp_copy(), I see Sys.chmod(all_files, mode = "777"), which could be risky on shared file systems. Is there a more restricted permission set that would still work?
For rxp_r_file(), the implicit roxygen2 @title tag is "rxp_r_file". I suggest a more descriptive name. Same for rxp_py_file().
Please consider changing the name of the default branch from "master" to "main" in https://github.com/b-rodrigues/rixpress_demos.
nix-store --realize has many options for --verbose. I suggest making the verbose argument of rxp_make() an integer to support this existing functionality. There are many more features you could consider for helping users monitor pipelines, some of which are more feasible than others, and this one seems like the lowest-hanging fruit.
In rixpress_demos/r_multi_envs, I recommend a more formal/safe choice for the meme image.
Instead of prefixes to control the order vignettes are listed, you could consider relying on pkgdown yaml for this, e.g. https://github.com/wlandau/crew/blob/728c45536d58faf1794e2a16c469fdce4a815176/_pkgdown.yml#L6-L19.
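
For the nix-store verbosity suggestion above, an integer `verbose` argument could map onto repeated `--verbose` flags, which Nix accepts to raise the verbosity level. This is only a sketch; `verbosity_flags()` is a hypothetical helper and not part of rxp_make().

```r
# Hypothetical helper: translate an integer verbosity level into the
# corresponding number of `--verbose` flags (0 means no flag at all).
verbosity_flags <- function(verbose = 0L) {
  stopifnot(is.numeric(verbose), length(verbose) == 1L, verbose >= 0)
  rep("--verbose", as.integer(verbose))
}

quiet_flags <- verbosity_flags(0L)
chatty_flags <- verbosity_flags(2L)
```

Accepting `TRUE`/`FALSE` as well would need an extra coercion step, which is left out here to keep the sketch short.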

Jul 21 '25 18:07 wlandau

Many thanks @wlandau for your review!

I'm currently on holidays without access to a computer so I'll only be able to address your comments in 2 weeks time. Just wanted to let you know 😁

Jul 23 '25 08:07 b-rodrigues

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Briefly describe any working relationship you have (had) with the package authors.
- While I have followed the author's work and read his book, I have not worked with the package author.
[x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (if you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

[x] A statement of need: clearly stating problems the software is designed to solve and its target audience in README
[x] Installation instructions: for the development version of package and any non-standard dependencies in README
- There are instructions for installing rixpress in the README. This package is somewhat unique in that, while its installation is straightforward, to use it as intended a fairly involved installation process must be completed. There is a link to these instructions, which are included as a part of the rix package.
[x] Vignette(s): demonstrating major functionality that runs successfully locally
[x] Function Documentation: for all exported functions
[x] Examples: (that run successfully locally) for all exported functions
[x] Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Functionality

[x] Installation: Installation succeeds as documented.
- While I was able to successfully install rixpress on my native OS (Windows) and within the Nix shell, I was unsuccessful getting some aspects of the rix installation completed, including IDE integration. Because of the security requirements of my work computer, I think this is not a rix problem and my own IT issue. I wanted to include that information to provide context for what I reviewed; however, I think it is out of the scope of the review of rixpress.
[x] Functionality: Any functional claims of the software have been confirmed.
[x] Performance: Any performance claims of the software have been confirmed.
[x] Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
[x] Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.

Estimated hours spent reviewing: 14

[x] Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

I, similar to Will, am familiar with the quality of your work. I did a relatively quick look through the source code but I focused my effort on the usability, particularly as a user that is new to rixpress, rix, and Nix. I will try to ensure I don't have comments that overlap with Will's.

I had some difficulty getting rix installed through WSL on my machine and I was never able to successfully use an IDE (either through a native installation or a Nix-managed installation). These are likely issues that are due, at least partially, to the enhanced security requirements of my work computer. I won't go into any further details about this here, however, because they are related to Nix, WSL, and other things outside of the scope of this review.

General impressions

While I am a regular user of reproducible pipelines, my experience comes as a user of targets. rixpress is different, not just in the implementation, but in the paradigm as well. While targets offers some gains in reproducibility (primarily in the isolation of runtime environments), rixpress offers several additional layers of reproducibility through the tracking of not just the input files and code, but the full environment: R and R packages, system-level software dependencies, environmental variables, etc. Additionally, it supports a truly polyglot pipeline that isn't reliant on reticulate to execute Python code, which can be brittle. This is sure to be an important contribution to the R community.

This package is conceptualized and implemented thoughtfully. As usual, your documentation is thorough and well written, which allows for the success of this package despite the complex concepts and details required to implement this. You have done a good job tucking away the heavy technical details from users who want this to "just work," while providing the details in the many vignettes for advanced users who want or need this information.

However, this does create a lot of opportunities for users to get lost when considering "multiple points of entry." There are many cases where information that would be helpful for the user to know is described in a vignette, but not in the function documentation. While I understand that a balance must be struck between completeness of documentation and brevity, there are opportunities for improving the completeness of the documentation, even if it is at the cost of repeating information. I include some examples in the Documentation section below, but a thoughtful review of all function documentation would greatly benefit the ability for a new user to begin using rixpress.

Packaging

I built rixpress locally and did not receive any errors, warnings or notes from R CMD check.
All expected components of the package are present.

Code and testing

I appreciate the automatic code formatting with Air. It makes scanning the code predictable and easy.
I executed all the tests I could on my native system (without having to execute them in a Nix environment) and they all passed.
While test coverage is reported as being low (covr::package_coverage() reports 46.64% coverage overall), I found testing to be reasonably comprehensive. I don't know all the specifics for how covr works, but there must be some disconnect between the way your tests are written and the way it assesses coverage. For example, it evaluated R/generate_dag.R as having 0.0% test coverage. However, I found tests/testthat/test-generate_dag.R to reasonably cover generate_dag(). One can always add more tests (and even superficial tests to inflate test coverage), but the current coverage seems appropriate to me despite low coverage reports.
testthat::skip_on_cran() is used on the test "rxp_init creates expected files" and I am not sure that this needs to be skipped. There are other tests that create files (e.g., test-rxp_copy.R) where files are created but the test can be completed on CRAN. Evaluate if this is necessary.

User interface

I had a similar suggestion to Will about the rixpress() function. It doesn't follow the pattern used throughout most of the function names, which have an rxp_ prefix. At the very least, I think it could benefit from a more intuitive name that begins with rxp_ and is followed by a verb (as discussed in Tidy design principles). Even better might be removing rixpress() altogether and having one function that builds the pipeline plan and another that executes the pipeline. This is largely what Will said, but I include it because I think it is an important improvement to the interface.
I found error messages to be helpful and adequately descriptive in my testing.

Documentation

There is a missing word in the description field in the documentation for rixpress(). I think the word built needs to be added to the sentence "By default, the pipeline is also immediately built after being generated...".
The documentation for rxp_r/py_file() includes three methods to read data in. In the Examples, you demonstrate methods 1 and 3. It might also be worthwhile to include method 2. While there has to be a balance between completeness and brevity in the Examples, I think it is a worthwhile addition.
In vignettes/b-core-functions.Rmd, I think the third paragraph would make more sense if it started with "`read_function` requires an R function with a single argument...", since `rxp_r_file()` has three required arguments, only one of which requires an R function.
At the end of the Generating the pipeline section of b-core-functions.Rmd, you provide a bulleted list of actions that rixpress() performs. I found this quite helpful and I think the function documentation for rixpress() would benefit from these bullets (or something similarly concise and explicit).
In Vignette C: Tutorial, in regards to rxp_inspect(), you mention "... and an object you didn’t define called all-derivations. This last object is mostly for internal rixpress use, and you can safely ignore it." I found this helpful context that I wish I had seen earlier in my use of rxp_inspect(). Consider adding a note about all-derivations in the function documentation for rxp_inspect().
In Vignette D: Polyglot pipelines you give an example where serialize and unserialize functions are passed as characters rather than expressions for both rxp_py() and rxp_r(). I think Vignette F: cmdstanr suggests that custom functions defined in functions.R should be passed as characters, but in either case, it is not clear to me when a character should be passed rather than a function for rxp_r(). It would be helpful to update the documentation for rxp_r() to describe all the acceptable input types and when they should be used.
In Vignette D: Polyglot pipelines, you mention "In the future, other languages could be added to rixpress, notably Julia." However, it appears that Julia is already supported. Consider clarifying this.
Vignette G: cached artifacts is helpful. Consider linking to it in the function documentation for

Functionality

I am curious about the case where there are multiple different inputs to an rxp_r/py/jl() call that need different unserialization functions. For example, suppose I have rxp_r(out_df, custom_fn(model = keras_model, data = data_frame)), where keras_model should be unserialized with keras::load_model_hdf5() and data_frame should be unserialized with readRDS. Can the unserialize_function take multiple functions? If so, it might be helpful to make that explicit in the documentation. If not, is there a plan to support derivations that have multiple inputs that require different unserialization functions? I don't want to contrive a bunch of edge cases, but to me this seems like it would be somewhat common (at least in my workflows).
Similarly, when I tried to use a custom function defined in "functions.R" (with additional_files = "functions.R" specified) as a named function in serialization_function, the custom function was not found. It might be nice to be able to write custom (un)serialization functions. If that is not simple to implement, it is probably worth making it explicit in the documentation that this must be a namespaced function or an anonymous function.
It seems likely to me that after many repeated builds of the same pipeline, especially one that requires large datasets or many intermediate derivations, storage space could become an issue with all previous artifacts stored (as described in Vignette G: cached artifacts). Now, to clear the store the guidance is to call nix-store --gc in the terminal. It might be nice to provide an R wrapper for this. And better yet, provide a bit more control besides just clearing all build artifacts; for example, giving the user the option to delete stored artifacts from before a given date. This is a soft suggestion.
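
For the first point above (inputs that need different unserialization functions), one possible interface is a named character vector that maps input names to function names, falling back to readRDS for everything else. This is a sketch of a hypothetical design, not current rixpress behavior, and `resolve_unserializers()` is an invented helper.

```r
# Hypothetical helper: decide which unserialization function each input uses.
# An unnamed single value applies to all inputs; a named vector overrides
# the readRDS default for the listed inputs only.
resolve_unserializers <- function(inputs, unserialize_function = "readRDS") {
  fn_names <- names(unserialize_function)
  if (is.null(fn_names)) {
    stopifnot(length(unserialize_function) == 1L)
    return(stats::setNames(rep(unserialize_function, length(inputs)), inputs))
  }
  unknown <- setdiff(fn_names, inputs)
  if (length(unknown) > 0) {
    stop("No such input(s): ", paste(unknown, collapse = ", "))
  }
  resolved <- stats::setNames(rep("readRDS", length(inputs)), inputs)
  resolved[fn_names] <- unname(unserialize_function)
  resolved
}

resolved <- resolve_unserializers(
  inputs = c("keras_model", "data_frame"),
  unserialize_function = c(keras_model = "keras::load_model_hdf5")
)
```

Keeping the single unnamed value as the default case means existing pipelines would not need to change.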

Jul 23 '25 20:07 amart90

:calendar: @wlandau you have 2 days left before the due date for your review (2025-07-28).

Jul 26 '25 12:07 ropensci-review-bot

:calendar: @amart90 you have 2 days left before the due date for your review (2025-07-28).

Jul 26 '25 14:07 ropensci-review-bot

@ropensci-review-bot submit review https://github.com/ropensci/software-review/issues/706#issuecomment-3109964972 time 6

Jul 28 '25 13:07 ldecicco-USGS
