hdcuremodels
Submitting Author Name: Kellie J. Archer
Submitting Author Github Handle: @kelliejarcher
Repository: https://github.com/kelliejarcher/hdcuremodels
Version submitted: 0.0.2
Submission type: Stats
Badge grade: bronze/silver/gold (select one)
Editor: @tdhock
Reviewers: TBD
Archive: TBD
Version accepted: TBD
Language: en
- Paste the full DESCRIPTION file inside a code block below:
Package: hdcuremodels
Title: Penalized Mixture Cure Models for High-Dimensional Data
Version: 0.0.2
Date: 2025-03-11
Authors@R:
c(person("Han", "Fu", role = "aut"), person(c("Kellie J."), "Archer", email=
"[email protected]", role = c("aut","cre"), comment = c(ORCID="0000-0003-1555-5781")))
Description: Provides functions for fitting various penalized parametric and semi-parametric mixture cure models with different penalty functions, testing for a significant cure fraction, and testing for sufficient follow-up as described in Fu et al (2022)<doi:10.1002/sim.9513> and Archer et al (2024)<doi:10.1186/s13045-024-01553-6>. False discovery rate controlled variable selection is provided using model-X knock-offs.
License: MIT + file LICENSE
Encoding: UTF-8
Depends: R (>= 4.2.0)
Imports: doParallel,
flexsurv,
flexsurvcure,
foreach,
ggplot2,
ggpubr,
glmnet,
knockoff,
mvnfast,
parallel,
plyr,
methods,
survival
Roxygen: list(markdown = TRUE, roclets = c ("namespace", "rd", "srr::srr_stats_roclet"))
RoxygenNote: 7.3.2
Suggests:
knitr,
rmarkdown,
roxygen2,
testthat (>= 3.0.0)
VignetteBuilder: knitr
LazyData: true
URL: https://github.com/kelliejarcher/hdcuremodels
BugReports: https://github.com/kelliejarcher/hdcuremodels/issues
Config/testthat/edition: 3
Scope
- Please indicate which of our statistical package categories this package falls under. (Please check one or more appropriate boxes below):
Statistical Packages
- [ ] Bayesian and Monte Carlo Routines
- [ ] Dimensionality Reduction, Clustering, and Unsupervised Learning
- [ ] Machine Learning
- [x] Regression and Supervised Learning
- [ ] Exploratory Data Analysis (EDA) and Summary Statistics
- [ ] Spatial Analyses
- [ ] Time Series Analyses
- [ ] Probability Distributions
Pre-submission Inquiry
- [x] A pre-submission inquiry has been approved in issue#690
General Information
- Who is the target audience and what are scientific applications of this package? Analysts who model time-to-event outcomes when some subjects either experience long-term survival or are not susceptible to the event of interest (simplistically, cured).
- Paste your responses to our General Standard G1.1, describing whether your software is:
- The first implementation of a novel algorithm; or
- The first implementation within R of an algorithm which has previously been implemented in other languages or contexts; or
- An improvement on other implementations of similar algorithms in R.
Please include hyperlinked references to all other relevant software.
- (If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research? Not applicable.
Badging
- What grade of badge are you aiming for? Silver
- If aiming for silver or gold, describe which of the four aspects listed in the Guide for Authors chapter the package fulfils (at least one aspect for silver; three for gold): Have a demonstrated generality of usage beyond one single envisioned use case.
Technical checks
Confirm each of the following by checking the box.
- [x] I have read the rOpenSci packaging guide.
- [x] I have read the author guide and I expect to maintain this package for at least 2 years or have another maintainer identified.
- [x] I/we have read the Statistical Software Peer Review Guide for Authors.
- [x] I/we have run `autotest` checks on the package, and ensured no tests fail.
- [x] The `srr_stats_pre_submit()` function confirms this package may be submitted.
- [x] The `pkgcheck()` function confirms this package may be submitted - alternatively, please explain reasons for any checks which your package is unable to pass.
This package:
- [x] does not violate the Terms of Service of any service it interacts with.
- [x] has a CRAN and OSI accepted license.
- [x] contains a README with instructions for installing the development version.
Publication options
- [x] Do you intend for this package to go on CRAN? I submitted version 0.0.1 of hdcuremodels to CRAN last June and then learned about rOpenSci. I will not submit a new version to CRAN until after the rOpenSci review.
- [ ] Do you intend for this package to go on Bioconductor?
Code of conduct
- [x] I agree to abide by rOpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.
Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.
:rocket:
The following problem was found in your submission template:
- 'statsgrade' variable must be one of [bronze, silver, gold]

Editors: Please ensure these problems with the submission template are rectified. Package checks have been started regardless.
:wave:
Checks for hdcuremodels (v0.0.2)
git hash: c7a555d9
- :heavy_check_mark: Package is already on CRAN.
- :heavy_check_mark: has a 'codemeta.json' file.
- :heavy_check_mark: has a 'contributing' file.
- :heavy_check_mark: uses 'roxygen2'.
- :heavy_check_mark: 'DESCRIPTION' has a URL field.
- :heavy_check_mark: 'DESCRIPTION' has a BugReports field.
- :heavy_check_mark: Package has at least one HTML vignette
- :heavy_check_mark: All functions have examples.
- :heavy_check_mark: Package has continuous integration checks.
- :heavy_check_mark: Package coverage is 77.5%.
- :heavy_check_mark: R CMD check found no errors.
- :heavy_check_mark: R CMD check found no warnings.
Package License: MIT + file LICENSE
1. rOpenSci Statistical Standards (srr package)
This package is in the following category:
- Regression and Supervised Learning
:heavy_check_mark: All applicable standards [v0.2.0] have been documented in this package (283 complied with; 61 N/A standards)
Click to see the report of author-reported standards compliance of the package with links to associated lines of code, which can be re-generated locally by running the srr_report() function from within a local clone of the repository.
2. Package Dependencies
Details of Package Dependency Usage (click to open)
The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.
| type | package | ncalls |
|---|---|---|
| internal | base | 1268 |
| internal | stats | 281 |
| internal | hdcuremodels | 70 |
| internal | graphics | 36 |
| internal | utils | 19 |
| imports | methods | 20 |
| imports | knockoff | 13 |
| imports | flexsurv | 7 |
| imports | survival | 7 |
| imports | glmnet | 4 |
| imports | mvnfast | 3 |
| imports | parallel | 2 |
| imports | flexsurvcure | 1 |
| imports | ggpubr | 1 |
| imports | doParallel | NA |
| imports | foreach | NA |
| imports | ggplot2 | NA |
| imports | plyr | NA |
| suggests | knitr | NA |
| suggests | rmarkdown | NA |
| suggests | roxygen2 | NA |
| suggests | testthat | NA |
| linking_to | NA | NA |
Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(<path/to/repo>)', and examining the 'external_calls' table.
base
ncol (90), list (82), rep (79), exp (70), drop (67), c (58), matrix (50), sum (46), log (42), which (42), length (40), is.null (32), if (30), dim (25), data.frame (23), return (20), abs (18), nrow (18), t (17), T (17), cbind (16), max (16), sample (16), gamma (15), sapply (15), replace (14), for (13), names (13), which.max (12), pmax (11), as.numeric (10), seq_len (9), subset (9), apply (8), cumsum (8), match.call (8), mean (8), strsplit (8), substitute (8), colSums (7), F (7), pmin (7), which.min (7), ifelse (6), parse (6), paste (6), seq_along (6), summary (6), attr (5), diag (5), eval (5), grep (5), merge (5), rbind (5), sqrt (5), as.character (4), as.data.frame (4), as.list (4), call (4), dimnames (4), match (4), nchar (4), numeric (4), order (4), parent.frame (4), rowMeans (4), substr (4), trimws (4), unique (4), missing (3), rank (3), round (3), rowSums (3), table (3), try (3), as.vector (2), by (2), colMeans (2), diff (2), sort (2), unname (2), choose (1), colnames (1), environment (1), expand.grid (1), gsub (1), warning (1)
stats
time (115), coef (24), optim (16), AIC (13), uniroot (13), df (12), BIC (11), sigma (9), sd (7), step (6), family (5), formula (5), as.formula (4), model.matrix (4), model.response (4), offset (4), rnorm (4), var (4), aggregate (3), glm (3), rexp (3), dist (2), knots (2), model.frame (2), runif (2), rbinom (1), rweibull (1), splinefun (1), terms (1)
hdcuremodels
self_scale (16), l1_negloglik_inc (6), cure_estimate (3), exp_cure (3), exp_negloglik_lat (3), mcp_scad_negloglik_inc (3), weib.cure.negloglik (3), AUC_msi (2), cox_l1 (2), cure.em (2), exp_negloglik (2), extract_rhs_values (2), get_cox_lambda_max (2), select_model (2), auc_mcm (1), C.stat (1), concordance_mcm (1), cureem (1), curegmifs (1), cv_cureem (1), cv_curegmifs (1), cv.em.fdr (1), cv.em.inner (1), cv.em.nofdr (1), cv.gmifs.fdr (1), cv.gmifs.inner (1), cv.gmifs.nofdr (1), exp_update (1), generate_cure_data (1), mcp_penalty (1), mcp_scad_negloglik_lat (1), sim_cure (1), weib.cure.update (1)
graphics
par (19), text (14), frame (3)
methods
is (20)
utils
data (19)
knockoff
create.second_order (7), knockoff.threshold (6)
flexsurv
pgengamma (5), rgompertz (2)
survival
coxph (3), survfit (2), Surv (1), survreg (1)
glmnet
glmnet (4)
mvnfast
rmvn (3)
parallel
makeCluster (2)
flexsurvcure
flexsurvcure (1)
ggpubr
ggarrange (1)
NOTE: Some imported packages appear to have no associated function calls; please ensure with author that these 'Imports' are listed appropriately.
3. Statistical Properties
This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.
Details of statistical properties (click to open)
The package has:
- code in R (100% in 18 files) and
- 2 authors
- 1 vignette
- 2 internal data files
- 13 imported packages
- 15 exported functions (median 96 lines of code)
- 117 non-exported functions in R (median 32 lines of code)
Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages

The following terminology is used:
- `loc` = "Lines of Code"
- `fn` = "function"
- `exp`/`not_exp` = exported / not exported
All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the checks_to_markdown() function
The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.
| measure | value | percentile | noteworthy |
|---|---|---|---|
| files_R | 18 | 76.8 | |
| files_vignettes | 1 | 61.7 | |
| files_tests | 18 | 93.9 | |
| loc_R | 3786 | 91.7 | |
| loc_vignettes | 276 | 58.3 | |
| loc_tests | 1088 | 85.0 | |
| num_vignettes | 1 | 58.7 | |
| data_size_total | 842035 | 93.9 | |
| data_size_median | 421017 | 96.6 | TRUE |
| n_fns_r | 132 | 81.1 | |
| n_fns_r_exported | 15 | 58.5 | |
| n_fns_r_not_exported | 117 | 84.9 | |
| n_fns_per_file_r | 4 | 64.0 | |
| num_params_per_fn | 4 | 51.1 | |
| loc_per_fn_r | 36 | 82.2 | |
| loc_per_fn_r_exp | 96 | 91.5 | |
| loc_per_fn_r_not_exp | 32 | 80.2 | |
| rel_whitespace_R | 3 | 52.6 | |
| rel_whitespace_vignettes | 36 | 60.5 | |
| rel_whitespace_tests | 6 | 60.8 | |
| doclines_per_fn_exp | 51 | 63.8 | |
| doclines_per_fn_not_exp | 0 | 0.0 | TRUE |
| fn_call_network_size | 101 | 78.4 | |
3a. Network visualisation
Click to see the interactive network visualisation of calls between objects in package
4. goodpractice and other checks
Details of goodpractice checks (click to open)
3a. Continuous Integration Badges
GitHub Workflow Results
| id | name | conclusion | sha | run_number | date |
|---|---|---|---|---|---|
| 13799638498 | pkgcheck | success | c7a555 | 8 | 2025-03-11 |
| 13799638500 | R-CMD-check.yaml | success | c7a555 | 5 | 2025-03-11 |
3b. goodpractice results
R CMD check with rcmdcheck
R CMD check generated the following check_fails:
- cyclocomp
- no_description_date
- no_import_package_as_a_whole
Test coverage with covr
Package coverage: 77.51
Cyclocomplexity with cyclocomp
The following functions have cyclocomplexity >= 15:
| function | cyclocomplexity |
|---|---|
| cv_cureem | 57 |
| coef.mixturecure | 52 |
| plot.mixturecure | 46 |
| cureem | 43 |
| cv.em.nofdr | 41 |
| cv_curegmifs | 39 |
| concordance_mcm | 37 |
| curegmifs | 37 |
| inits_check | 28 |
| cv.gmifs.nofdr | 26 |
| cox_l1 | 25 |
| cox_mcp_scad | 24 |
| predict.mixturecure | 24 |
| select_model | 24 |
| nonzerocure_test | 22 |
| C.stat | 17 |
| cure.em | 17 |
| exp_EM | 17 |
| weib_EM | 17 |
| generate_cure_data | 15 |
Static code analyses with lintr
lintr found the following 585 potential issues:
| message | number of times |
|---|---|
| Avoid 1:length(...) expressions, use seq_len. | 2 |
| Avoid 1:ncol(...) expressions, use seq_len. | 19 |
| Avoid 1:nrow(...) expressions, use seq_len. | 1 |
| Avoid library() and require() calls in packages | 2 |
| Avoid using sapply, consider vapply instead, that's type safe | 11 |
| Lines should not be more than 80 characters. This line is 100 characters. | 4 |
| Lines should not be more than 80 characters. This line is 101 characters. | 1 |
| Lines should not be more than 80 characters. This line is 102 characters. | 6 |
| Lines should not be more than 80 characters. This line is 103 characters. | 9 |
| Lines should not be more than 80 characters. This line is 104 characters. | 13 |
| Lines should not be more than 80 characters. This line is 106 characters. | 4 |
| Lines should not be more than 80 characters. This line is 107 characters. | 5 |
| Lines should not be more than 80 characters. This line is 108 characters. | 12 |
| Lines should not be more than 80 characters. This line is 109 characters. | 5 |
| Lines should not be more than 80 characters. This line is 110 characters. | 18 |
| Lines should not be more than 80 characters. This line is 111 characters. | 2 |
| Lines should not be more than 80 characters. This line is 113 characters. | 3 |
| Lines should not be more than 80 characters. This line is 114 characters. | 1 |
| Lines should not be more than 80 characters. This line is 115 characters. | 9 |
| Lines should not be more than 80 characters. This line is 116 characters. | 8 |
| Lines should not be more than 80 characters. This line is 117 characters. | 3 |
| Lines should not be more than 80 characters. This line is 118 characters. | 14 |
| Lines should not be more than 80 characters. This line is 119 characters. | 1 |
| Lines should not be more than 80 characters. This line is 120 characters. | 1 |
| Lines should not be more than 80 characters. This line is 121 characters. | 11 |
| Lines should not be more than 80 characters. This line is 122 characters. | 1 |
| Lines should not be more than 80 characters. This line is 123 characters. | 1 |
| Lines should not be more than 80 characters. This line is 124 characters. | 1 |
| Lines should not be more than 80 characters. This line is 127 characters. | 1 |
| Lines should not be more than 80 characters. This line is 128 characters. | 2 |
| Lines should not be more than 80 characters. This line is 129 characters. | 7 |
| Lines should not be more than 80 characters. This line is 130 characters. | 7 |
| Lines should not be more than 80 characters. This line is 131 characters. | 4 |
| Lines should not be more than 80 characters. This line is 132 characters. | 3 |
| Lines should not be more than 80 characters. This line is 133 characters. | 1 |
| Lines should not be more than 80 characters. This line is 135 characters. | 1 |
| Lines should not be more than 80 characters. This line is 136 characters. | 5 |
| Lines should not be more than 80 characters. This line is 139 characters. | 1 |
| Lines should not be more than 80 characters. This line is 141 characters. | 10 |
| Lines should not be more than 80 characters. This line is 142 characters. | 7 |
| Lines should not be more than 80 characters. This line is 143 characters. | 1 |
| Lines should not be more than 80 characters. This line is 144 characters. | 2 |
| Lines should not be more than 80 characters. This line is 149 characters. | 3 |
| Lines should not be more than 80 characters. This line is 150 characters. | 11 |
| Lines should not be more than 80 characters. This line is 151 characters. | 17 |
| Lines should not be more than 80 characters. This line is 153 characters. | 1 |
| Lines should not be more than 80 characters. This line is 154 characters. | 1 |
| Lines should not be more than 80 characters. This line is 157 characters. | 1 |
| Lines should not be more than 80 characters. This line is 158 characters. | 1 |
| Lines should not be more than 80 characters. This line is 160 characters. | 2 |
| Lines should not be more than 80 characters. This line is 161 characters. | 7 |
| Lines should not be more than 80 characters. This line is 162 characters. | 12 |
| Lines should not be more than 80 characters. This line is 163 characters. | 5 |
| Lines should not be more than 80 characters. This line is 167 characters. | 2 |
| Lines should not be more than 80 characters. This line is 171 characters. | 1 |
| Lines should not be more than 80 characters. This line is 172 characters. | 2 |
| Lines should not be more than 80 characters. This line is 173 characters. | 2 |
| Lines should not be more than 80 characters. This line is 176 characters. | 2 |
| Lines should not be more than 80 characters. This line is 178 characters. | 1 |
| Lines should not be more than 80 characters. This line is 182 characters. | 3 |
| Lines should not be more than 80 characters. This line is 183 characters. | 1 |
| Lines should not be more than 80 characters. This line is 184 characters. | 2 |
| Lines should not be more than 80 characters. This line is 191 characters. | 5 |
| Lines should not be more than 80 characters. This line is 195 characters. | 6 |
| Lines should not be more than 80 characters. This line is 197 characters. | 1 |
| Lines should not be more than 80 characters. This line is 205 characters. | 1 |
| Lines should not be more than 80 characters. This line is 207 characters. | 1 |
| Lines should not be more than 80 characters. This line is 212 characters. | 1 |
| Lines should not be more than 80 characters. This line is 217 characters. | 2 |
| Lines should not be more than 80 characters. This line is 218 characters. | 1 |
| Lines should not be more than 80 characters. This line is 225 characters. | 1 |
| Lines should not be more than 80 characters. This line is 226 characters. | 2 |
| Lines should not be more than 80 characters. This line is 227 characters. | 1 |
| Lines should not be more than 80 characters. This line is 228 characters. | 2 |
| Lines should not be more than 80 characters. This line is 242 characters. | 1 |
| Lines should not be more than 80 characters. This line is 248 characters. | 1 |
| Lines should not be more than 80 characters. This line is 249 characters. | 1 |
| Lines should not be more than 80 characters. This line is 253 characters. | 1 |
| Lines should not be more than 80 characters. This line is 263 characters. | 1 |
| Lines should not be more than 80 characters. This line is 265 characters. | 2 |
| Lines should not be more than 80 characters. This line is 269 characters. | 5 |
| Lines should not be more than 80 characters. This line is 278 characters. | 1 |
| Lines should not be more than 80 characters. This line is 279 characters. | 1 |
| Lines should not be more than 80 characters. This line is 281 characters. | 1 |
| Lines should not be more than 80 characters. This line is 283 characters. | 9 |
| Lines should not be more than 80 characters. This line is 286 characters. | 1 |
| Lines should not be more than 80 characters. This line is 291 characters. | 5 |
| Lines should not be more than 80 characters. This line is 293 characters. | 5 |
| Lines should not be more than 80 characters. This line is 305 characters. | 5 |
| Lines should not be more than 80 characters. This line is 316 characters. | 3 |
| Lines should not be more than 80 characters. This line is 317 characters. | 1 |
| Lines should not be more than 80 characters. This line is 320 characters. | 1 |
| Lines should not be more than 80 characters. This line is 321 characters. | 1 |
| Lines should not be more than 80 characters. This line is 330 characters. | 2 |
| Lines should not be more than 80 characters. This line is 333 characters. | 5 |
| Lines should not be more than 80 characters. This line is 337 characters. | 1 |
| Lines should not be more than 80 characters. This line is 343 characters. | 1 |
| Lines should not be more than 80 characters. This line is 357 characters. | 1 |
| Lines should not be more than 80 characters. This line is 362 characters. | 1 |
| Lines should not be more than 80 characters. This line is 387 characters. | 1 |
| Lines should not be more than 80 characters. This line is 391 characters. | 1 |
| Lines should not be more than 80 characters. This line is 428 characters. | 1 |
| Lines should not be more than 80 characters. This line is 439 characters. | 1 |
| Lines should not be more than 80 characters. This line is 81 characters. | 33 |
| Lines should not be more than 80 characters. This line is 82 characters. | 31 |
| Lines should not be more than 80 characters. This line is 83 characters. | 10 |
| Lines should not be more than 80 characters. This line is 84 characters. | 15 |
| Lines should not be more than 80 characters. This line is 85 characters. | 13 |
| Lines should not be more than 80 characters. This line is 86 characters. | 12 |
| Lines should not be more than 80 characters. This line is 87 characters. | 12 |
| Lines should not be more than 80 characters. This line is 88 characters. | 8 |
| Lines should not be more than 80 characters. This line is 89 characters. | 6 |
| Lines should not be more than 80 characters. This line is 90 characters. | 13 |
| Lines should not be more than 80 characters. This line is 91 characters. | 5 |
| Lines should not be more than 80 characters. This line is 92 characters. | 2 |
| Lines should not be more than 80 characters. This line is 93 characters. | 2 |
| Lines should not be more than 80 characters. This line is 94 characters. | 5 |
| Lines should not be more than 80 characters. This line is 95 characters. | 9 |
| Lines should not be more than 80 characters. This line is 96 characters. | 14 |
| Lines should not be more than 80 characters. This line is 97 characters. | 1 |
| Lines should not be more than 80 characters. This line is 98 characters. | 2 |
| Lines should not be more than 80 characters. This line is 99 characters. | 3 |
| Use <-, not =, for assignment. | 7 |
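The first five lintr messages in the table concern two base R idioms. The sketch below shows why they matter, using made-up objects (`m`, `empty`) that are not taken from the package:

```r
# Illustrative objects only; nothing here comes from hdcuremodels.
m <- matrix(1:6, nrow = 2, ncol = 3)
empty <- m[, integer(0), drop = FALSE]   # a matrix with zero columns

# 1:ncol(x) counts down when there are no columns ...
bad_index <- 1:ncol(empty)               # c(1L, 0L)
# ... whereas seq_len() gives an empty index, so a loop body is skipped.
good_index <- seq_len(ncol(empty))       # integer(0)

# sapply() picks its return type from the data; with no input it
# returns an empty list instead of a numeric vector.
loose <- sapply(good_index, function(j) mean(empty[, j]))

# vapply() declares the expected type and length of each result.
strict <- vapply(good_index, function(j) mean(empty[, j]), numeric(1))
col_means <- vapply(seq_len(ncol(m)), function(j) mean(m[, j]), numeric(1))
```

With `1:ncol(...)`, a loop over a zero-column matrix would run twice with invalid indices, which is the failure mode the linter is guarding against.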
Package Versions
| package | version |
|---|---|
| pkgstats | 0.2.0.54 |
| pkgcheck | 0.1.2.122 |
| srr | 0.1.3.26 |
Editor-in-Chief Instructions:
This package is in top shape and may be passed on to a handling editor
@kelliejarcher thanks so much for following up with a full submission.
Here I share some preliminary checks to support conversations with potential handling editors. If any of this feedback feels useful, now is a great time to incorporate it.
We'll come back ASAP.
EiC pre-checks
Documentation: The package has sufficient documentation available online (README, pkgdown docs) to allow for an assessment of functionality and scope without installing the package. In particular,
- [x] Is the case for the package well made?
- [ ] Is the reference index page clear (grouped by topic if necessary)?
Lacks a pkgdown website. The file names in R/ suggest an alpha sorted reference might not group functionality in a meaningful way.
- [x] Are vignettes readable, sufficiently detailed and not just perfunctory?
Consider renaming the file to match the package name so that pkgdown automatically adds a "Get started" tab on the website:
mv vignettes/hdcuremodels-vignette.Rmd vignettes/hdcuremodels.Rmd
- [x] Fit: The package meets criteria for fit and overlap.
- [X] Installation instructions: Are installation instructions clear enough for human users?
Consider adding instructions to install the development version
- [x] Tests: If the package has some interactivity / HTTP / plot production etc. are the tests using state-of-the-art tooling?
I see `set.seed()` and no cleanup. Consider using `withr::local_seed()` for auto-cleanup. I see multiple calls to `expect_error()` with no string, class, or snapshot to match. Consider making the expectation more specific.
- [x] Contributing information: Is the documentation for contribution clear enough e.g. tokens for tests, playgrounds?
- [x] License: The package has a CRAN or OSI accepted license.
- [x] Project management: Are the issue and PR trackers in a good shape, e.g. are there outstanding bugs, is it clear when feature requests are meant to be tackled?
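The two testing suggestions in the checklist above could look roughly like the sketch below. `simulate_times()` is a hypothetical stand-in, not a function from hdcuremodels; the sketch assumes the testthat and withr packages are installed.

```r
library(testthat)

# Hypothetical function under test; not part of hdcuremodels.
simulate_times <- function(n) {
  if (!is.numeric(n) || length(n) != 1 || n < 1) {
    stop("`n` must be a single positive number")
  }
  rexp(n)
}

test_that("simulation is reproducible without touching the global seed", {
  withr::local_seed(123)          # previous RNG state is restored on exit
  first <- simulate_times(5)
  withr::local_seed(123)
  second <- simulate_times(5)
  expect_identical(first, second)
})

test_that("invalid input fails with an informative message", {
  # Matching on the message makes the expectation specific.
  expect_error(simulate_times(-1), "single positive number")
})
```

Unlike a bare `set.seed()`, `withr::local_seed()` scopes the seed to the enclosing test, so later tests do not silently depend on it.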
This is my first ROpenSci submission. I previously made some revisions based on comments below. Can you please explain what the next steps will be?
Thanks for any information, Kellie
@kelliejarcher thanks for following up. I re-announced the package among our editors. Once we find an available handling editor they will guide you through the rest of the process.
BTW, ideally, to keep this thread easier to follow it's best to try to respond directly through the GitHub interface. Responding to the email notification duplicates the message you're responding to 😸
Here I mark the end of my EiC rotation and leave a short note for the next EiC.
- This is a stats submission.
- The package was announced twice but still lacks a handling editor.
I have been waiting to submit a peer-reviewed manuscript describing this R package to incorporate feedback from ROpenSci. After almost 3 months, I have not heard whether anyone has volunteered to review this package. Can you please let me know if there is something else I need to do or if I should simply forgo the ROpenSci review? I am new to this forum so am unclear on the process. Thanks for any information.
@kelliejarcher I'm so sorry, this one somehow slipped through the gaps. I've taken over from @maurolepore for this EiC rotation, and will get things happening here asap.
@ropensci-review-bot assign @tdhock as editor
Assigned! @tdhock is now the editor
@ropensci-review-bot seeking reviewers
Please add this badge to the README of your package repository:
[](https://github.com/ropensci/software-review/issues/692)
Furthermore, if your package does not have a NEWS.md file yet, please create one to capture the changes made during the review process. See https://devguide.ropensci.org/releasing.html#news
When I added this badge and updated GitHub, pkgcheck failed, but it seems to be due to an R package that I am not explicitly using. The error is: ✖ Failed to build Rdsdp 1.0.6 (3.1s). Do you know how I can correct this? I am using R 4.5.0 and I checked that Rdsdp 1.0.6 is installed on my machine. Sorry for the naive question, I am new to pkgcheck and rOpenSci processes.
Package Review
- Briefly describe any working relationship you have (had) with the package authors.
- [x] As the reviewer I confirm that there are no conflicts of interest for me to review this work.
Documentation
The package includes all the following forms of documentation:
- [x] A statement of need: clearly stating problems the software is designed to solve and its target audience in README
- [x] Installation instructions: for the development version of package and any non-standard dependencies in README
- [x] Vignette(s): demonstrating major functionality that runs successfully locally
- [x] Function Documentation: for all exported functions
- [x] Examples: (that run successfully locally) for all exported functions
- [x] Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with `URL`, `BugReports` and `Maintainer` (which may be autogenerated via `Authors@R`).
Functionality
- [x] Installation: Installation succeeds as documented.
- [x] Functionality: Any functional claims of the software have been confirmed.
- [x] Performance: Any performance claims of the software have been confirmed.
- [x] Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
- [x] Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.
Estimated hours spent reviewing:
- [x] Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.
Review Comments
- The functions in hdcuremodels-vignette are well documented, but I find for each function, the long explanation paragraph of parameters hard to follow and would personally prefer bullet points.
- In the introduction of hdcuremodels-vignette, the cure survival function could be explained more clearly and in detail. For example, in $S_u(t, \mathbf{w} | Y=1)$, I assume $Y$ is the indicator {0,1} for {cured, susceptible}. It would also help to show the size of each component, like the sizes of $\mathbf{x}$ and $\mathbf{w}$.
- In the data examples of hdcuremodels-vignette, it would be helpful to explain the hyperparameters like `a` and `rho` more clearly, maybe including their ranges and effects (for example, what happens when `rho` is 0 or 1, what happens when `a` is 0).
- The `generate_cure_data` function requires that the total number of covariates $j$ is divisible by the number of true relevant covariates $n_{\text{true}}$, otherwise it causes an error. This limitation should be clearly documented or handled within the function for better usability.
- When I use function `generate_cure_data`, I notice columns like X1.1 and X2.1 always appear. Could you please clarify why? I was expecting the dataset to have shape `(n, j + 2)`.
- In the lollipop plot (Beta vs. Step), I guess each line represents a coefficient path that eventually becomes stable as a straight line. However, if I want to focus on specific coefficients, it’s hard to identify them by color in the plot. Could there be a way to highlight or label particular coefficients?
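For readers following the notation question in the review comments, a mixture cure model is commonly written as $S(t) = (1 - p) + p\,S_u(t \mid Y = 1)$, where $p = P(Y = 1)$ is the probability of being susceptible. The sketch below evaluates that expression assuming a logistic incidence model and a Weibull latency distribution; the function name and parameter values are illustrative and are not taken from the package or its vignette.

```r
# Population survival under a mixture cure model (illustrative sketch).
# p_susceptible: P(Y = 1), the probability of being susceptible (not cured)
# shape, scale:  assumed Weibull parameters for the latency distribution
mixture_cure_survival <- function(t, p_susceptible, shape, scale) {
  s_u <- exp(-(t / scale)^shape)            # S_u(t | Y = 1) for susceptibles
  (1 - p_susceptible) + p_susceptible * s_u # cured subjects never fail
}

# Assumed logistic incidence with a linear predictor of 0.5.
p <- plogis(0.5)

times <- c(0, 1, 5, 100)
surv <- mixture_cure_survival(times, p, shape = 1.2, scale = 2)
```

The curve starts at 1 and levels off at the cure fraction `1 - p` rather than at 0, which is the plateau that tests for a non-zero cure fraction look for.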
@kelliejarcher Those errors clearly come from the {Rdsdp} package itself, which is also generating lots of current warnings on CRAN. Since it's not even a direct dependency of your package, and does not have a public-facing git instance, you'll have to wait for a CRAN update for that to be fixed. (For reference: current version = 1.0.6).
@lamtung16 Thank you for your review of the hdcuremodels package. I have made your requested changes which are now on github. A few comments/questions:
- I added your name as a reviewer in the DESCRIPTION file. Please verify I have your name and e-mail correctly identified.
- For longer parameter descriptions in each help file, I changed to bulleted lists. Let me know if you would prefer the short lists (when there are 2 to 3 options) in any function also changed to a bulleted list.
- I added a paragraph to the beginning of the vignette to more clearly define the notation being used.
- In the vignette I explain what a and rho represent and what it means when they are 0 or rho is near 1.
- I modified the generate_cure_data function to (a) not require the total number of covariates j to be divisible by the number of true relevant covariates n_true and (b) rename the columns to clearly identify which variables are the unpenalized covariates (prefix is U - the nonp default is 2) and which variables are penalized.
- The plot function now has a label parameter; when it is TRUE, the variable labels are displayed in a legend for the trace plot.
Let me know if you have further comments. Thanks again!
That’s a good improvement 👍😃 — thank you! I have a few comments:
- In the DESCRIPTION file, my email should be [email protected], not [email protected].
- In generate_cure_data, nonp = 0 does not work. Is that the intended behavior?
- In generate_cure_data, is the total number of features j + nonp? Are U1, U2, ... noise?
- When I run data <- generate_cure_data(n = 20, j = 3, n_true = 2), it doesn’t work. I’m not sure why.
- When I try the following code:
data <- generate_cure_data(n = 1000, j = 4, n_true = 2, nonp = 1)
fitem <- cureem(Surv(Time, Censor) ~ .,
data = data$training,
x_latency = data$training
)
plot(fitem, label = TRUE)
I'm not quite sure how to interpret this plot (can I say X1 and X2 are true features and X3 and X4 are noise variables?). It would be really helpful if you could provide a few simple example plots like this one, along with some explanation. For instance, something like: "This dataset has 2 true features and 2 noise variables. As the penalty increases, the coefficients for the noise variables shrink to zero, while the true features remain. This shows the model's ability to distinguish signal from noise." Having that kind of context would make the plot much more intuitive.
Honestly, I've been trying to build an example dataset with 3 features (2 true features and 1 noise variable) so that, in the coefficient plot, one coefficient shrinks to 0 under a large regularization parameter while the other two stay non-zero.
But overall, you guys have done a great job 👍
Thanks again for your review of our package and helpful comments to improve the functions. The following changes have been made:
- Your email address has been updated in the DESCRIPTION file.
- The <generate_cure_data> function has been revised to allow <nonp = 0> and to work for any combination of its arguments, including <n_true>; it now checks that <j> is greater than <n_true>.
- The help for <generate_cure_data> has been updated to better explain that covariates prefixed with "U" are unpenalized covariates (for example, suppose you want to coerce age and sex into the model but penalize all gene expression values). The vignette has also been expanded to fully explain all parameters of the <generate_cure_data> function. The "U" covariates are not associated with the outcome so it is correct that they are noise, but so are j - n_true of the penalized covariates.
- Regarding your code example, the help file and vignette now better explain how to identify which penalized predictors are truly associated with the outcome. The colors and line types in the plot function were changed so that it is easier to identify the incidence ("I_") and latency ("L_") associations for each variable.
- I added a few more generic methods to extract some interesting components from the resulting mixturecure object including dim, family, formula, logLik, and nobs.
- This package passes devtools::check(), but when uploaded to GitHub it still gets the "Failed to build Rdsdp 1.0.6" error. That seems to be due to a problem with Rdsdp 1.0.6, though I noticed that https://cran.r-project.org/web/checks/check_results_Rdsdp.html indicates everything is OK with that package.
I checked, they look great 😃
Thanks for your review, I really appreciate your comments that improved this package. Is the package ready to submit to CRAN and the manuscript to a journal?
I don't have the right to decide, it's up to the editor, but I think it's good for first manuscript submission 😺.
We are supposed to find two different reviewers, and I am still trying to find a second one. Please tell me if you have any suggestions of colleagues who you think may be willing and available to review.
I doubt if any of these have reviewed for ROpenSci but here are a few names:
Panagiotis Papastamoulis, Sharon Xiangwen Xie, Yingwei Paul Peng
Hope that helps, Kellie
@tdhock Any progress on finding another reviewer here?
I asked a few others, and I was expecting a review a few weeks ago, so I sent an email to ask again.
Package Review
- [x] As the reviewer I confirm that there are no conflicts of interest for me to review this work.
Documentation
The package includes all the following forms of documentation:
- [x] A statement of need: clearly stating problems the software is designed to solve and its target audience in README
- [x] Installation instructions: for the development version of package and any non-standard dependencies in README
- [x] Vignette(s): demonstrating major functionality that runs successfully locally
- [x] Function Documentation: for all exported functions
- [x] Examples: (that run successfully locally) for all exported functions
- [x] Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).
Functionality
- [ ] Installation: Installation succeeds as documented.
- [x] Functionality: Any functional claims of the software have been confirmed.
- [x] Performance: Any performance claims of the software have been confirmed.
- [x] Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
- [x] Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.
Please, see the Review Comments section for a problem I encountered during the installation on my Linux/Ubuntu machine.
Estimated hours spent reviewing:
- [x] Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.
Review Comments
Congratulations on the package; I really appreciate the effort invested. The package is useful since it can handle a large number of covariates in a mixture cure rate model. Below I list some issues I experienced during the installation, followed by some general comments that may help the authors further expand the utility of the package.
- Regarding the installation, there was a problem installing the package dependencies on Ubuntu version 24, because the compiler search paths do not include /usr/share/R/include, which is where Ubuntu 24.04 puts the file R.h. In particular, the problem was due to the Rdsdp package. To install this dependency on my Ubuntu 24 machine, I first had to run in a terminal
export CPATH=/usr/share/R/include:$CPATH
export LIBRARY_PATH=$(R RHOME)/lib:$LIBRARY_PATH
and then the hdcuremodels package was successfully installed. This is unusual, because it is the first time I had to do something similar for a package that compiles C or Fortran code. The log file of the unsuccessful installation is attached below in the file output_Rdsdp.log.
- When running devtools::check(), various warnings are generated, like:
ℹ Updating hdcuremodels documentation
ℹ Loading hdcuremodels
✖ auc_mcm.R:44: @srrstats is not a known tag.
✖ auc_mcm.R:45: @srrstats is not a known tag.
✖ auc_mcm.R:46: @srrstats is not a known tag.
(output truncated); otherwise the check is successful.
- When running devtools::test(), the following warning is generated:
Warning (test-formula.R:11:3): formula function works correctly
`failure_message` is missing, with no default.
Backtrace:
▆
1. └─testthat::expect(is.call(formula(fit))) at test-formula.R:11:3
- The package provides a generate_cure_data() function to simulate synthetic datasets. It would be helpful to return the true (latent) status of each observation (cured or susceptible).
- The log-likelihood surface of cure rate models may exhibit multiple modes, see e.g. Papastamoulis and Milienos (Test, 2024), and this in turn may result in sub-optimal inferences. I was wondering whether the package can take this into account (possibly by running the EM algorithm from multiple starts and selecting the best one, by means of the criteria used by the authors).
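The multiple-start idea in the last comment can be sketched in base R on a toy problem. The example below is only an illustration under stated assumptions: it fits a two-component Gaussian mixture with a shared standard deviation, not a mixture cure model, and the names mix_loglik and em_mixture are hypothetical helpers that are not part of hdcuremodels. It runs EM from several random starting means and keeps the run with the largest log-likelihood.

```r
# Toy illustration only: EM for a two-component Gaussian mixture with a
# shared standard deviation. This is NOT the EM algorithm used by hdcuremodels.
mix_loglik <- function(x, p, mu, sigma) {
  sum(log(p * dnorm(x, mu[1], sigma) + (1 - p) * dnorm(x, mu[2], sigma)))
}

em_mixture <- function(x, mu, sigma = sd(x), p = 0.5,
                       max_iter = 500, tol = 1e-8) {
  loglik <- mix_loglik(x, p, mu, sigma)
  for (iter in seq_len(max_iter)) {
    # E-step: posterior probability that each point comes from component 1
    d1 <- p * dnorm(x, mu[1], sigma)
    d2 <- (1 - p) * dnorm(x, mu[2], sigma)
    w <- d1 / (d1 + d2)
    # M-step: update mixing proportion, component means, and shared sd
    p <- mean(w)
    mu <- c(sum(w * x) / sum(w), sum((1 - w) * x) / sum(1 - w))
    sigma <- sqrt(mean(w * (x - mu[1])^2 + (1 - w) * (x - mu[2])^2))
    # Stop when the log-likelihood no longer improves noticeably
    loglik_new <- mix_loglik(x, p, mu, sigma)
    converged <- abs(loglik_new - loglik) < tol
    loglik <- loglik_new
    if (converged) break
  }
  list(p = p, mu = mu, sigma = sigma, loglik = loglik, iterations = iter)
}

set.seed(1)
x <- c(rnorm(150, mean = 0), rnorm(150, mean = 4))  # simulated toy data

# Run EM from 10 random starting values for the two means
fits <- lapply(1:10, function(s) em_mixture(x, mu = runif(2, min(x), max(x))))
logliks <- vapply(fits, function(f) f$loglik, numeric(1))

# Keep the run with the highest log-likelihood
best <- fits[[which.max(logliks)]]
```

Comparing the spread of logliks across starts gives a quick indication of whether the runs agree; for a penalized model the selection criterion would presumably be the penalized objective rather than the raw log-likelihood.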
Thank you for reviewing our package. Would you please provide how you would like your name to appear in the DESCRIPTION file along with your email address, to acknowledge your review?
Here is an itemized response to your comments.
- This installation problem is rather confusing because this package issues no notes or warnings on CRAN (https://cran.r-project.org/web/packages/hdcuremodels/index.html). Nevertheless, I tried to remedy this by creating a src directory with a Makevars file that includes the lines
export CPATH=/usr/share/R/include:$CPATH
export LIBRARY_PATH=$(R RHOME)/lib:$LIBRARY_PATH
but because I am not using any external code myself (it is only called by the knockoff package, which depends on Rdsdp), my package does not pass devtools::check() with this addition. Do you know if there is a change I can make to my package so it passes this check but avoids this problem for Ubuntu 24 users? I thought about adding the line
SystemRequirements: DSDP library needed when using the optional FDR control via knockoff R package (bundled in Rdsdp)
but wasn't sure if there is something else that can be done.
- These are Roxygen tags that do not involve any code. The warnings were likely generated because the srr package was not in Suggests in the DESCRIPTION file and does not exist on CRAN. I have added the following to the DESCRIPTION file:
Suggests: srr (among the others)
Remotes: ropensci-review-tools/srr
and I added a utils.R file containing
#' @keywords internal
"_PACKAGE"
if (getRversion() >= "2.15.1") {
  utils::globalVariables(c("srrstats", "srrstatsNA", "srrstatsTODO"))
}
so these are defined as global variables and R won't complain. Let me know if this message still persists.
- That line in test-formula has been corrected to expect_true(is.call(formula(fit))).
- Great idea, we should have thought of that! We now output the true underlying cure status for the training and testing data.frames as vectors training_y and testing_y in the object returned from generate_cure_data(). The help file and vignette have been updated to reflect this change.
- We used our generate_cure_data function to generate cure status data using a sample size of 625, allocating 80% to the training data so that the training data consists of 500 observations, with a cure rate of ~35% and 50 covariates, 3 of which are truly related to incidence and latency, selected independently. No unpenalized covariates were included. We then used cureem to fit a penalized Cox model and varied the starting values for itct (intercept of incidence portion) and survprob (numeric vector for the latency survival probabilities). The coefficient of variation for the coefficient estimates was very small, indicating that the algorithm is insensitive to starting values. The code is
set.seed(16)
data <- generate_cure_data(n = 625, j = 50, nonp = 0, n_true = 3, train_prop = 0.80, a = 1, rho = 0.4, itct_mean = 1)
training <- data$training
testing <- data$testing

fit_b0 <- numeric()
fit_b <- matrix(nrow = 100, ncol = 50)
fit_beta <- matrix(nrow = 100, ncol = 50)
for (i in 1:100) {
  fit <- cureem(Surv(Time, Censor) ~ ., data = training, x_latency = training,
                inits = list(itct = runif(1, -1, 1), survprob = runif(500, 0, 1)))
  fit_coef <- coef(fit)
  fit_b0[i] <- fit_coef$b0
  fit_b[i, ] <- fit_coef$beta_inc
  fit_beta[i, ] <- fit_coef$beta_lat
  print(i)
}

cv <- function(x) {
  sd(x) / mean(x)
}
data$parameters$nonzero_b
cv(fit_b[, 15])
cv(fit_b[, 25])
cv(fit_b[, 45])
data$parameters$nonzero_beta
cv(fit_beta[, 11])
cv(fit_beta[, 31])
cv(fit_beta[, 38])
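The coefficient-of-variation check above can be summarized for all columns at once. The following is a minimal base R sketch, assuming a matrix of estimates with one row per restart; the matrix est here is simulated as a hypothetical stand-in and is not output from cureem. It divides by the absolute mean (a small change from the cv function above, so the sign of the coefficient does not matter) and shows that the ratio is undefined for coefficients estimated as exactly zero in every run, which is why the standard deviation is reported alongside it.

```r
set.seed(42)

# Hypothetical stand-in for a 100 x 50 matrix of coefficient estimates
# (one row per restart); these values are simulated, not produced by cureem.
n_runs <- 100
n_coef <- 50
truth <- c(rep(0, n_coef - 3), 1.0, -0.8, 0.5)
est <- sapply(truth, function(b) b + rnorm(n_runs, sd = 1e-4 * (b != 0)))

# Coefficient of variation relative to the absolute mean
cv <- function(x) sd(x) / abs(mean(x))

col_sd <- apply(est, 2, sd)
col_cv <- apply(est, 2, cv)

summary_tab <- data.frame(
  coef = seq_len(n_coef),
  mean = colMeans(est),
  sd   = col_sd,
  cv   = col_cv
)

# Columns that were zero in every run give 0 / 0 = NaN, so restrict the
# CV summary to coefficients with a non-zero mean estimate.
nonzero <- which(abs(summary_tab$mean) > 0)
summary_tab[nonzero, ]
```

Reporting the standard deviation for every column and the CV only for the non-zero ones avoids reading an undefined ratio as evidence of instability.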
I am satisfied with the responses received.
Panagiotis Papastamoulis, papapast [at] yahoo.gr
Thank you so much. I have updated the DESCRIPTION file to acknowledge your review.