software-review Presubmission Inquiry

Submitting Author Name: Paul Govan
Submitting Author Github Handle: @paulgovan
Other Package Authors Github handles: (comma separated, delete if none)
Repository: https://github.com/paulgovan/ReliaGrowR
Submission type: Pre-submission
Language: en

Paste the full DESCRIPTION file inside a code block below:

Package: ReliaGrowR
Title: Reliability Growth Analysis
Version: 0.2
Authors@R: person("Paul", "Govan", email = "[email protected]", 
  role = c("aut", "cre", "cph"), comment = c(ORCID = "0000-0002-1821-8492"))
Description: Modeling and plotting functions for Reliability Growth Analysis (RGA). Models include the Duane (1962) <doi:10.1109/TA.1964.4319640>, Non-Homogeneous Poisson Process (NHPP) by Crow (1975) <https://apps.dtic.mil/sti/citations/ADA020296>, Piecewise Weibull NHPP by Guo et al. (2010) <doi:10.1109/RAMS.2010.5448029>, and Piecewise Weibull NHPP with Change Point Detection based on the 'segmented' package by Muggeo (2024) <https://cran.r-project.org/package=segmented>.
Imports:
  stats,
  graphics,
  segmented
License: CC BY 4.0
Encoding: UTF-8
Roxygen: list (markdown = TRUE, roclets = c ("namespace", "rd", "srr::srr_stats_roclet"))
Suggests: 
    ellmer,
    knitr,
    rmarkdown,
    spelling,
    testthat (>= 3.0.0),
    vdiffr
Language: en-US
URL: https://paulgovan.github.io/ReliaGrowR/, https://github.com/paulgovan/ReliaGrowR
Config/testthat/edition: 3
VignetteBuilder: knitr
BugReports: https://github.com/paulgovan/ReliaGrowR/issues
RoxygenNote: 7.3.3
Depends: 
    R (>= 3.5)
LazyData: true

Scope

Please indicate which category or categories from our package fit policies or statistical package categories this package falls under. (Please check one or more appropriate boxes below):

Data Lifecycle Packages
- [ ] data retrieval
- [ ] data extraction
- [ ] data munging
- [ ] data deposition
- [ ] data validation and testing
- [ ] workflow automation
- [ ] version control
- [ ] citation management and bibliometrics
- [ ] scientific software wrappers
- [ ] field and lab reproducibility tools
- [ ] database software bindings
- [ ] geospatial data
- [ ] translation
Statistical Packages
- [ ] Bayesian and Monte Carlo Routines
- [ ] Dimensionality Reduction, Clustering, and Unsupervised Learning
- [ ] Machine Learning
- [x] Regression and Supervised Learning
- [ ] Exploratory Data Analysis (EDA) and Summary Statistics
- [ ] Spatial Analyses
- [ ] Time Series Analyses
- [ ] Probability Distributions
Explain how and why the package falls under these categories (briefly, 1-2 sentences). Please note any areas you are unsure of:
ReliaGrowR provides classic reliability growth models, including the Duane, Crow-AMSAA, Piecewise NHPP, and Piecewise NHPP with Change Point Detection, fit using MLE and supported by visualization tools.
If submitting a statistical package, have you already incorporated documentation of standards into your code via the srr package?
Yes
Who is the target audience and what are scientific applications of this package?
The target audience includes reliability engineers, data analysts, researchers, and students interested in reliability growth analysis.
Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category?
To the best of my knowledge, no other R packages are specifically dedicated to reliability growth analysis (RGA). A review of CRAN and other repositories identified packages supporting NHPP modeling, but none that directly address RGA.
(If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research?
I do not believe this is applicable.
Any other questions or issues we should be aware of?:

Oct 13 '25 14:10 paulgovan

Thanks for your pre-submission to rOpenSci, our editors will reply soon.

Oct 13 '25 14:10 ropensci-review-bot

@ropensci-review-bot check srr

Oct 13 '25 14:10 mpadge

'srr' standards compliance:

Complied with: 71 / 116 = 61.2% (general: 44 / 68; regression: 27 / 48)
Not complied with: 45 / 116 = 38.8% (general: 24 / 68; regression: 21 / 48)

:heavy_check_mark: This package complies with > 50% of all standards and may be submitted.

Oct 13 '25 14:10 ropensci-review-bot

@ropensci-review-bot check package

Oct 13 '25 14:10 mpadge

Thanks, about to send the query.

Oct 13 '25 14:10 ropensci-review-bot

:rocket:

Editor check started

:wave:

Oct 13 '25 14:10 ropensci-review-bot

Checks for ReliaGrowR (v0.2)

git hash: ca872d5f

:heavy_check_mark: Package is already on CRAN.
:heavy_check_mark: has a 'codemeta.json' file.
:heavy_check_mark: has a 'contributing' file.
:heavy_check_mark: uses 'roxygen2'.
:heavy_check_mark: 'DESCRIPTION' has a URL field.
:heavy_check_mark: 'DESCRIPTION' has a BugReports field.
:heavy_check_mark: Package has at least one HTML vignette
:heavy_check_mark: All functions have examples.
:heavy_check_mark: Package has continuous integration checks.
:heavy_check_mark: Package coverage is 95.4%.
:heavy_check_mark: This is a statistical package which complies with all applicable standards
:heavy_check_mark: R CMD check found no errors.
:heavy_check_mark: R CMD check found no warnings.
:eyes: Some goodpractice linters failed.
:eyes: Function names are duplicated in other packages

(Checks marked with :eyes: may be optionally addressed.)

Package License: CC BY 4.0

1. rOpenSci Statistical Standards (`srr` package)

This package is in the following category:

Regression and Supervised Learning

:heavy_check_mark: All applicable standards [v0.2.0] have been documented in this package (506 complied with; 45 N/A standards)

Click to see the report of author-reported standards compliance of the package with links to associated lines of code, which can be re-generated locally by running the srr_report() function from within a local clone of the repository.

2. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

type	package	ncalls
internal	base	85
internal	ReliaGrowR	6
internal	utils	4
imports	stats	26
imports	segmented	8
imports	graphics	3
suggests	ellmer	NA
suggests	knitr	NA
suggests	rmarkdown	NA
suggests	spelling	NA
suggests	testthat	NA
suggests	vdiffr	NA
linking_to	NA	NA

Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(<path/to/repo>)', and examining the 'external_calls' table.

base

beta (10), c (10), exp (10), log (9), list (6), if (5), cumsum (4), length (4), round (4), data.frame (3), as.numeric (2), ifelse (2), is.list (2), is.matrix (2), sort (2), summary (2), ceiling (1), col (1), is.null (1), labels (1), match.arg (1), merge (1), sum (1), suppressWarnings (1)

stats

predict (6), residuals (5), BIC (4), logLik (4), AIC (3), lm (2), aggregate (1), cor (1)

segmented

intercept (3), segmented (3), slope (2)

ReliaGrowR

duane (1), FUN (1), plot.duane (1), plot.rga (1), ppplot.rga (1), print.duane (1)

utils

data (4)

graphics

lines (2), abline (1)

3. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

code in R (100% in 8 files) and
1 authors
1 vignette
1 internal data file
3 imported packages
11 exported functions (median 45 lines of code)
19 non-exported functions in R (median 50 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages. The following terminology is used:

loc = "Lines of Code"
fn = "function"
exp/not_exp = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the checks_to_markdown() function

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

measure	value	percentile	noteworthy
files_R	8	47.5
files_inst	5	97.4
files_vignettes	1	61.2
files_tests	8	82.7
loc_R	596	50.6
loc_inst	814	75.3
loc_vignettes	120	29.8
loc_tests	1153	85.0
num_vignettes	1	58.2
data_size_total	627	57.5
data_size_median	627	60.7
n_fns_r	30	39.4
n_fns_r_exported	11	48.6
n_fns_r_not_exported	19	38.3
n_fns_per_file_r	1	22.3
num_params_per_fn	3	29.2
loc_per_fn_r	49	89.2
loc_per_fn_r_exp	45	76.4
loc_per_fn_r_not_exp	50	90.0
rel_whitespace_R	16	48.6
rel_whitespace_inst	19	75.9
rel_whitespace_vignettes	46	39.0
rel_whitespace_tests	23	86.3
doclines_per_fn_exp	33	38.9
doclines_per_fn_not_exp	0	0.0	TRUE
fn_call_network_size	0	0.0	TRUE

3a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package

4. `goodpractice` and other checks

Details of goodpractice checks (click to open)

3a. Continuous Integration Badges

GitHub Workflow Results

id	name	conclusion	sha	run_number	date
18451911757	pages build and deployment	success	ce9213	75	2025-10-13
18451863126	pkgcheck	success	ca872d	28	2025-10-13
18451863140	pkgdown.yaml	success	ca872d	74	2025-10-13
18451863129	R-CMD-check.yaml	success	ca872d	30	2025-10-13
18451863151	test-coverage.yaml	success	ca872d	30	2025-10-13

3b. `goodpractice` results

`R CMD check` with rcmdcheck

R CMD check generated the following check_fail:

cyclocomp

Test coverage with covr

Package coverage: 95.37

Cyclocomplexity with cyclocomp

The following functions have cyclocomplexity >= 15:

function	cyclocomplexity
rdt	59
rga	49
weibull_to_rga	47
duane	30
plot.duane	22
plot.rga	21

Static code analyses with lintr

lintr found no issues with this package!

5. Other Checks

Details of other checks (click to open)

:heavy_multiplication_x: The following function name is duplicated in other packages:

- rdt from rankdifferencetest

Package Versions

package	version
pkgstats	0.2.0.68
pkgcheck	0.1.2.233
srr	0.1.4.9

Editor-in-Chief Instructions:

This package is in top shape and may be passed on to a handling editor

Oct 13 '25 15:10 ropensci-review-bot

Thanks @paulgovan for your submission, which definitely seems in scope. Before we proceed, however, a couple of notes which I'll notate to make further discussion easier:

MP1 I note that all of your examples, and indeed the hard-coded notation within your actual code, suggest that you envision exclusive application to time series data. Do you think it would be advantageous to adapt the package to accept time-series classed data as input? The tsbox package would allow arbitrary choice of formats, and that to me would make more sense than effectively hand-coded temporal inputs like those in your examples.

The use of classed inputs would also allow a host of pre-processing data checks and procedures to be applied prior to your main calculations, importantly including imputation of missing values, and checking assumptions regarding regularity. I suspect complying with our standards for time series as well as your current compliance with Regression standards would greatly improve both the robustness and the flexibility of the package.
MP2 I wonder whether you might consider Standard G3.1 to be applicable to your package?

G3.1 Statistical software which relies on covariance calculations should enable users to choose between different algorithms for calculating covariances, and should not rely solely on covariances from the stats::cov function.

Many of your calculations implicitly rely on standard Pearson-type covariances through calls to stats::lm(), and could perhaps be improved through replacing those with methods using more robust covariance calculations like those listed under G3.1. Passing stats::lm() results directly to stats::logLik() seems okay in your case because your models are all univariate (time-only), but the whole pipeline of covariance assumptions may be worth thinking about there?

Note that a detailed answer to that question is likely domain-specific, and many domains may never have considered potential impacts of lack of robustness in covariance calculations. If this is the case for your package, then that's okay. But as with time-series extension above, I suspect that modification to accommodate more flexible assumptions regarding covariance structures is likely to significantly improve the package.
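To make the G3.1 point concrete, here is a minimal base-R sketch, not taken from ReliaGrowR itself. It assumes a power-law growth curve fitted by `stats::lm()` on the log-log scale, uses made-up data, and introduces a hypothetical helper `slope_from_cov()` whose `cov_fn` argument lets the caller swap in a different covariance algorithm.

```r
# Made-up cumulative test times and cumulative failure counts (illustration only).
cum_time <- c(100, 250, 450, 700, 1000, 1400, 1900, 2500)
cum_fail <- c(2, 4, 6, 8, 10, 12, 14, 16)

x <- log(cum_time)
y <- log(cum_fail)

# Ordinary least-squares fit on the log-log scale; the fitted object can be
# passed straight to stats::logLik(), as discussed above.
fit <- stats::lm(y ~ x)
ll <- stats::logLik(fit)

# Hypothetical helper: the simple-regression slope is cov(x, y) / var(x), so it
# can be computed from any function returning a 2 x 2 covariance matrix.
slope_from_cov <- function(x, y, cov_fn = stats::cov) {
  s <- cov_fn(cbind(x, y))
  unname(s[1, 2] / s[1, 1])
}

# With the default stats::cov this reproduces the lm() slope.
slope_default <- slope_from_cov(x, y)
```

A robust estimator could then be supplied by the user, for example `slope_from_cov(x, y, function(m) MASS::cov.rob(m)$cov)` if the MASS package is available. Whether that is appropriate for reliability growth data is a domain question that this sketch does not answer.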

MP3 A more general comment is that I found no indication in the README about envisioned areas of application, and so was confused by Reliability Growth Analysis, which is a term unfamiliar to me. I had to read the vignette to understand what the package was trying to do. It's important that your README should contain sufficient information for anybody to understand what your package does. And I was still a bit unsure in the vignette, and found myself mostly dependent on the actual code examples to understand what ReliaGrowR actually does. I think it would help everybody - and most importantly reviewers - for you to more clearly identify how your package is intended to be applied, and what problems it is intended to solve.

Finally, and this is just a suggestion which you should feel entirely free to ignore:

MP4 Your input checks and assertions are fabulous and very comprehensive. But they all rely on base-R expressions, which may be slower than some alternative approaches to input assertion? Again, I'm not sure of your envisioned or typical area of application, but for high-frequency usage, I personally lean towards using checkmate, as all assertions are direct C-calls, and often faster than alternatives. Benchmark if you like, but I suspect using checkmate would likely speed up your assertions. It can also make reading code easier, as you don't need to hard-code error messages, yet they still retain the full context like your current hard-coded ones.
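As a rough illustration of the difference, the sketch below puts a hand-written base-R check next to an approximately equivalent checkmate call. The function names `check_times_base()` and `check_times_checkmate()` are hypothetical and do not come from ReliaGrowR; the checkmate version only works if that package is installed.

```r
# Base-R assertion with hand-written error messages.
check_times_base <- function(times) {
  if (!is.numeric(times) || length(times) == 0L) {
    stop("'times' must be a non-empty numeric vector.")
  }
  if (anyNA(times) || any(!is.finite(times)) || any(times <= 0)) {
    stop("'times' must contain only finite values greater than zero.")
  }
  invisible(times)
}

# Approximately equivalent checkmate assertion. Note that `lower = 0` is an
# inclusive bound, so this version also accepts zero, unlike the base-R check.
check_times_checkmate <- function(times) {
  checkmate::assert_numeric(
    times,
    lower = 0, finite = TRUE, any.missing = FALSE, min.len = 1
  )
  invisible(times)
}

ok <- check_times_base(c(100, 250, 450))
bad <- try(check_times_base(c(100, NA, 450)), silent = TRUE)
```

To compare speed, both functions could be timed over many repeated calls on typical inputs, for instance with `system.time()`; no timings are claimed here.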

I only make those comments in the hope that they'll help improve your package before we proceed to review. Once I had read enough to understand, I was impressed by the package, and the code looks great!

Oct 18 '25 13:10 mpadge

Hi @mpadge, thanks for taking the time to go over the package. I appreciate the detailed notes and suggestions.

MP1: I’m not familiar with tsbox, but I’ll take some time to explore it. One consideration is that reliability data generally includes both a time component and a coupled failure component. In most use cases, I expect users to manage their data in a standard tabular format (e.g., data.frame, CSV file), so it’s not entirely clear whether handling time-series data separately from failure data would add much benefit. That said, my goal is definitely to make data entry as easy as possible. At the same time, I try to keep dependencies to a minimum for better portability (see also my response to MP4). A potential compromise could be to include some guidance on preparing or cleaning data prior to running an RGA.
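A minimal sketch of what such data-preparation guidance might look like, using only base R. The column names `times` and `failures`, the made-up values, and the choice to simply drop incomplete rows are all assumptions for illustration; the package documentation should be checked for the input format it actually expects.

```r
# Hypothetical raw test log: one row per reporting interval.
raw <- data.frame(
  times    = c(100, 150, NA, 200, 250),  # interval lengths, e.g. test hours
  failures = c(2, 1, 3, 0, 1)            # failures observed in each interval
)

# Drop rows with missing values, then rows with non-positive interval lengths.
# Dropping (rather than imputing) is a simplification made for this sketch.
clean <- raw[stats::complete.cases(raw), ]
clean <- clean[clean$times > 0, ]

# Derive cumulative test time and cumulative failure counts.
clean$cum_time <- cumsum(clean$times)
clean$cum_failures <- cumsum(clean$failures)

clean
```

Keeping the time and failure columns together in one data.frame reflects the point above that the two components are coupled and are usually managed in a tabular format.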

MP2: That’s a great point — especially if the package is ever extended to handle covariates, which is less common in practice, but still a possible use case. I’ll take another look at G3.1 and consider options for including more robust covariance calculations.

MP3: Agreed. I originally wrote the README with the reliability community in mind, but I can see how that may be unclear to new users. I’ll plan to add a short introduction explaining what RGA is, along with more context in the introductory example.

MP4: I wasn’t familiar with checkmate, but I’ll definitely look into it. As I mentioned in MP1, I try to minimize dependencies to keep the package lightweight and portable, but I’m open to adding them when the performance or readability benefits are significant. The idea of faster assertions is appealing, so I’ll experiment with it.

Oct 20 '25 14:10 paulgovan

Hi @paulgovan. Because this package falls within our statistical packages, we need to find a statistics-specific editor to make the final call. We're working on that, but at the moment our stats editors are overbooked. I just wanted to make you aware that this might take a bit longer than usual (sorry about that!), but we are trying to get someone assigned.

Oct 21 '25 17:10 ldecicco-USGS

Thanks for the heads up @ldecicco-USGS. In the meantime, I hope to address @mpadge's previous comments.

Oct 21 '25 18:10 paulgovan

Hi @mpadge, I wanted to follow up on your earlier points:

MP1 I explored using tsbox, but I’m still not convinced it would add much benefit since the package doesn’t operate strictly on time-series data. That said, I’ve added a short vignette (link) that demonstrates some common data manipulation tasks. While some examples are specific to reliability data, others are more general to help users who may have limited experience with R.

MP2 I’ve updated the documentation for G3.1 to clarify that this standard would apply if the package is extended in the future to include models with covariates.

MP3 The README now includes a brief introduction to Reliability Growth Analysis (RGA) and a revised example with more context to clarify the intended application.

MP4 I agree that checkmate is a strong option. For now, I kept the base-R assertions since I value their explicitness, but I may revisit this in the future if performance becomes a bottleneck.

Thanks again for your feedback!

Oct 31 '25 16:10 paulgovan
