software-review hdcuremodels

Submitting Author Name: Kellie J. Archer
Submitting Author Github Handle: @kelliejarcher
Repository: https://github.com/kelliejarcher/hdcuremodels
Version submitted: 0.0.2
Submission type: Stats
Badge grade: bronze/silver/gold (select one)
Editor: @tdhock
Reviewers: TBD

Archive: TBD
Version accepted: TBD
Language: en

Paste the full DESCRIPTION file inside a code block below:

Package: hdcuremodels
Title: Penalized Mixture Cure Models for High-Dimensional Data
Version: 0.0.2
Date: 2025-03-11
Authors@R: 
    c(person("Han", "Fu", role = "aut"), person(c("Kellie J."), "Archer", email=
    "[email protected]", role = c("aut","cre"), comment = c(ORCID="0000-0003-1555-5781")))
Description: Provides functions for fitting various penalized parametric and semi-parametric mixture cure models with different penalty functions, testing for a significant cure fraction, and testing for sufficient follow-up as described in Fu et al (2022)<doi:10.1002/sim.9513> and Archer et al (2024)<doi:10.1186/s13045-024-01553-6>. False discovery rate controlled variable selection is provided using model-X knock-offs. 
License: MIT + file LICENSE
Encoding: UTF-8
Depends: R (>= 4.2.0)
Imports: doParallel,
         flexsurv,
         flexsurvcure,
         foreach,
         ggplot2,
         ggpubr,
         glmnet,
         knockoff,
         mvnfast,
         parallel,
         plyr,
         methods,
         survival
Roxygen: list(markdown = TRUE, roclets = c ("namespace", "rd", "srr::srr_stats_roclet"))
RoxygenNote: 7.3.2
Suggests: 
    knitr,
    rmarkdown,
    roxygen2,
    testthat (>= 3.0.0)
VignetteBuilder: knitr
LazyData: true
URL: https://github.com/kelliejarcher/hdcuremodels
BugReports: https://github.com/kelliejarcher/hdcuremodels/issues
Config/testthat/edition: 3

Scope

Please indicate which of our statistical package categories this package falls under. (Please check one or more appropriate boxes below):

Statistical Packages
- [ ] Bayesian and Monte Carlo Routines
- [ ] Dimensionality Reduction, Clustering, and Unsupervised Learning
- [ ] Machine Learning
- [x] Regression and Supervised Learning
- [ ] Exploratory Data Analysis (EDA) and Summary Statistics
- [ ] Spatial Analyses
- [ ] Time Series Analyses
- [ ] Probability Distributions

Pre-submission Inquiry

[x] A pre-submission inquiry has been approved in issue#690

General Information

Who is the target audience and what are scientific applications of this package? Analysts who model time-to-event outcomes when some subjects either experience long-term survival or are not susceptible to the event of interest (simplistically, cured).
Paste your responses to our General Standard G1.1 The first implementation of a novel algorithm, describing whether your software is:
- The first implementation of a novel algorithm; or
- The first implementation within R of an algorithm which has previously been implemented in other languages or contexts; or
- An improvement on other implementations of similar algorithms in R.
Please include hyperlinked references to all other relevant software.
(If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research? Not applicable.

Badging

What grade of badge are you aiming for? (silver) Silver
If aiming for silver or gold, describe which of the four aspects listed in the Guide for Authors chapter the package fulfils (at least one aspect for silver; three for gold) Have a demonstrated generality of usage beyond one single envisioned use case.

Technical checks

Confirm each of the following by checking the box.

[x] I have read the rOpenSci packaging guide.
[x] I have read the author guide and I expect to maintain this package for at least 2 years or have another maintainer identified.
[x] I/we have read the Statistical Software Peer Review Guide for Authors.
[x] I/we have run autotest checks on the package, and ensured no tests fail.
[x] The srr_stats_pre_submit() function confirms this package may be submitted.
[x] The pkgcheck() function confirms this package may be submitted - alternatively, please explain reasons for any checks which your package is unable to pass.

This package:

[x] does not violate the Terms of Service of any service it interacts with.
[x] has a CRAN and OSI accepted license.
[x] contains a README with instructions for installing the development version.

Publication options

[x] Do you intend for this package to go on CRAN? I submitted version 0.0.1 of hdcuremodels last June and then learned about rOpenSci. I will not submit a new version to CRAN until after the rOpenSci review.
[ ] Do you intend for this package to go on Bioconductor?

Code of conduct

[x] I agree to abide by rOpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.

Mar 12 '25 02:03 kelliejarcher

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.

Mar 12 '25 02:03 ropensci-review-bot

:rocket:

The following problem was found in your submission template:

'statsgrade' variable must be one of [bronze, silver, gold] Editors: Please ensure these problems with the submission template are rectified. Package checks have been started regardless.

:wave:

Mar 12 '25 02:03 ropensci-review-bot

Checks for hdcuremodels (v0.0.2)

git hash: c7a555d9

:heavy_check_mark: Package is already on CRAN.
:heavy_check_mark: has a 'codemeta.json' file.
:heavy_check_mark: has a 'contributing' file.
:heavy_check_mark: uses 'roxygen2'.
:heavy_check_mark: 'DESCRIPTION' has a URL field.
:heavy_check_mark: 'DESCRIPTION' has a BugReports field.
:heavy_check_mark: Package has at least one HTML vignette
:heavy_check_mark: All functions have examples.
:heavy_check_mark: Package has continuous integration checks.
:heavy_check_mark: Package coverage is 77.5%.
:heavy_check_mark: R CMD check found no errors.
:heavy_check_mark: R CMD check found no warnings.

Package License: MIT + file LICENSE

1. rOpenSci Statistical Standards (`srr` package)

This package is in the following category:

Regression and Supervised Learning

:heavy_check_mark: All applicable standards [v0.2.0] have been documented in this package (283 complied with; 61 N/A standards)

Click to see the report of author-reported standards compliance of the package with links to associated lines of code, which can be re-generated locally by running the srr_report() function from within a local clone of the repository.

2. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

type	package	ncalls
internal	base	1268
internal	stats	281
internal	hdcuremodels	70
internal	graphics	36
internal	utils	19
imports	methods	20
imports	knockoff	13
imports	flexsurv	7
imports	survival	7
imports	glmnet	4
imports	mvnfast	3
imports	parallel	2
imports	flexsurvcure	1
imports	ggpubr	1
imports	doParallel	NA
imports	foreach	NA
imports	ggplot2	NA
imports	plyr	NA
suggests	knitr	NA
suggests	rmarkdown	NA
suggests	roxygen2	NA
suggests	testthat	NA
linking_to	NA	NA

Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(<path/to/repo>)', and examining the 'external_calls' table.

base

ncol (90), list (82), rep (79), exp (70), drop (67), c (58), matrix (50), sum (46), log (42), which (42), length (40), is.null (32), if (30), dim (25), data.frame (23), return (20), abs (18), nrow (18), t (17), T (17), cbind (16), max (16), sample (16), gamma (15), sapply (15), replace (14), for (13), names (13), which.max (12), pmax (11), as.numeric (10), seq_len (9), subset (9), apply (8), cumsum (8), match.call (8), mean (8), strsplit (8), substitute (8), colSums (7), F (7), pmin (7), which.min (7), ifelse (6), parse (6), paste (6), seq_along (6), summary (6), attr (5), diag (5), eval (5), grep (5), merge (5), rbind (5), sqrt (5), as.character (4), as.data.frame (4), as.list (4), call (4), dimnames (4), match (4), nchar (4), numeric (4), order (4), parent.frame (4), rowMeans (4), substr (4), trimws (4), unique (4), missing (3), rank (3), round (3), rowSums (3), table (3), try (3), as.vector (2), by (2), colMeans (2), diff (2), sort (2), unname (2), choose (1), colnames (1), environment (1), expand.grid (1), gsub (1), warning (1)

stats

time (115), coef (24), optim (16), AIC (13), uniroot (13), df (12), BIC (11), sigma (9), sd (7), step (6), family (5), formula (5), as.formula (4), model.matrix (4), model.response (4), offset (4), rnorm (4), var (4), aggregate (3), glm (3), rexp (3), dist (2), knots (2), model.frame (2), runif (2), rbinom (1), rweibull (1), splinefun (1), terms (1)

hdcuremodels

self_scale (16), l1_negloglik_inc (6), cure_estimate (3), exp_cure (3), exp_negloglik_lat (3), mcp_scad_negloglik_inc (3), weib.cure.negloglik (3), AUC_msi (2), cox_l1 (2), cure.em (2), exp_negloglik (2), extract_rhs_values (2), get_cox_lambda_max (2), select_model (2), auc_mcm (1), C.stat (1), concordance_mcm (1), cureem (1), curegmifs (1), cv_cureem (1), cv_curegmifs (1), cv.em.fdr (1), cv.em.inner (1), cv.em.nofdr (1), cv.gmifs.fdr (1), cv.gmifs.inner (1), cv.gmifs.nofdr (1), exp_update (1), generate_cure_data (1), mcp_penalty (1), mcp_scad_negloglik_lat (1), sim_cure (1), weib.cure.update (1)

graphics

par (19), text (14), frame (3)

methods

is (20)

utils

data (19)

knockoff

create.second_order (7), knockoff.threshold (6)

flexsurv

pgengamma (5), rgompertz (2)

survival

coxph (3), survfit (2), Surv (1), survreg (1)

glmnet

glmnet (4)

mvnfast

rmvn (3)

parallel

makeCluster (2)

flexsurvcure

flexsurvcure (1)

ggpubr

ggarrange (1)

NOTE: Some imported packages appear to have no associated function calls; please ensure with author that these 'Imports' are listed appropriately.

3. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

code in R (100% in 18 files) and
2 authors
1 vignette
2 internal data files
13 imported packages
15 exported functions (median 96 lines of code)
117 non-exported functions in R (median 32 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages. The following terminology is used:

loc = "Lines of Code"
fn = "function"
exp/not_exp = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the checks_to_markdown() function

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

measure	value	percentile	noteworthy
files_R	18	76.8
files_vignettes	1	61.7
files_tests	18	93.9
loc_R	3786	91.7
loc_vignettes	276	58.3
loc_tests	1088	85.0
num_vignettes	1	58.7
data_size_total	842035	93.9
data_size_median	421017	96.6	TRUE
n_fns_r	132	81.1
n_fns_r_exported	15	58.5
n_fns_r_not_exported	117	84.9
n_fns_per_file_r	4	64.0
num_params_per_fn	4	51.1
loc_per_fn_r	36	82.2
loc_per_fn_r_exp	96	91.5
loc_per_fn_r_not_exp	32	80.2
rel_whitespace_R	3	52.6
rel_whitespace_vignettes	36	60.5
rel_whitespace_tests	6	60.8
doclines_per_fn_exp	51	63.8
doclines_per_fn_not_exp	0	0.0	TRUE
fn_call_network_size	101	78.4

3a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package

4. `goodpractice` and other checks

Details of goodpractice checks (click to open)

3a. Continuous Integration Badges

GitHub Workflow Results

id	name	conclusion	sha	run_number	date
13799638498	pkgcheck	success	c7a555	8	2025-03-11
13799638500	R-CMD-check.yaml	success	c7a555	5	2025-03-11

3b. `goodpractice` results

`R CMD check` with rcmdcheck

R CMD check generated the following check_fails:

cyclocomp
no_description_date
no_import_package_as_a_whole

Test coverage with covr

Package coverage: 77.51

Cyclocomplexity with cyclocomp

The following functions have cyclocomplexity >= 15:

function	cyclocomplexity
cv_cureem	57
coef.mixturecure	52
plot.mixturecure	46
cureem	43
cv.em.nofdr	41
cv_curegmifs	39
concordance_mcm	37
curegmifs	37
inits_check	28
cv.gmifs.nofdr	26
cox_l1	25
cox_mcp_scad	24
predict.mixturecure	24
select_model	24
nonzerocure_test	22
C.stat	17
cure.em	17
exp_EM	17
weib_EM	17
generate_cure_data	15

Static code analyses with lintr

lintr found the following 585 potential issues:

message	number of times
Avoid 1:length(...) expressions, use seq_len.	2
Avoid 1:ncol(...) expressions, use seq_len.	19
Avoid 1:nrow(...) expressions, use seq_len.	1
Avoid library() and require() calls in packages	2
Avoid using sapply, consider vapply instead, that's type safe	11
Lines should not be more than 80 characters. This line is 100 characters.	4
Lines should not be more than 80 characters. This line is 101 characters.	1
Lines should not be more than 80 characters. This line is 102 characters.	6
Lines should not be more than 80 characters. This line is 103 characters.	9
Lines should not be more than 80 characters. This line is 104 characters.	13
Lines should not be more than 80 characters. This line is 106 characters.	4
Lines should not be more than 80 characters. This line is 107 characters.	5
Lines should not be more than 80 characters. This line is 108 characters.	12
Lines should not be more than 80 characters. This line is 109 characters.	5
Lines should not be more than 80 characters. This line is 110 characters.	18
Lines should not be more than 80 characters. This line is 111 characters.	2
Lines should not be more than 80 characters. This line is 113 characters.	3
Lines should not be more than 80 characters. This line is 114 characters.	1
Lines should not be more than 80 characters. This line is 115 characters.	9
Lines should not be more than 80 characters. This line is 116 characters.	8
Lines should not be more than 80 characters. This line is 117 characters.	3
Lines should not be more than 80 characters. This line is 118 characters.	14
Lines should not be more than 80 characters. This line is 119 characters.	1
Lines should not be more than 80 characters. This line is 120 characters.	1
Lines should not be more than 80 characters. This line is 121 characters.	11
Lines should not be more than 80 characters. This line is 122 characters.	1
Lines should not be more than 80 characters. This line is 123 characters.	1
Lines should not be more than 80 characters. This line is 124 characters.	1
Lines should not be more than 80 characters. This line is 127 characters.	1
Lines should not be more than 80 characters. This line is 128 characters.	2
Lines should not be more than 80 characters. This line is 129 characters.	7
Lines should not be more than 80 characters. This line is 130 characters.	7
Lines should not be more than 80 characters. This line is 131 characters.	4
Lines should not be more than 80 characters. This line is 132 characters.	3
Lines should not be more than 80 characters. This line is 133 characters.	1
Lines should not be more than 80 characters. This line is 135 characters.	1
Lines should not be more than 80 characters. This line is 136 characters.	5
Lines should not be more than 80 characters. This line is 139 characters.	1
Lines should not be more than 80 characters. This line is 141 characters.	10
Lines should not be more than 80 characters. This line is 142 characters.	7
Lines should not be more than 80 characters. This line is 143 characters.	1
Lines should not be more than 80 characters. This line is 144 characters.	2
Lines should not be more than 80 characters. This line is 149 characters.	3
Lines should not be more than 80 characters. This line is 150 characters.	11
Lines should not be more than 80 characters. This line is 151 characters.	17
Lines should not be more than 80 characters. This line is 153 characters.	1
Lines should not be more than 80 characters. This line is 154 characters.	1
Lines should not be more than 80 characters. This line is 157 characters.	1
Lines should not be more than 80 characters. This line is 158 characters.	1
Lines should not be more than 80 characters. This line is 160 characters.	2
Lines should not be more than 80 characters. This line is 161 characters.	7
Lines should not be more than 80 characters. This line is 162 characters.	12
Lines should not be more than 80 characters. This line is 163 characters.	5
Lines should not be more than 80 characters. This line is 167 characters.	2
Lines should not be more than 80 characters. This line is 171 characters.	1
Lines should not be more than 80 characters. This line is 172 characters.	2
Lines should not be more than 80 characters. This line is 173 characters.	2
Lines should not be more than 80 characters. This line is 176 characters.	2
Lines should not be more than 80 characters. This line is 178 characters.	1
Lines should not be more than 80 characters. This line is 182 characters.	3
Lines should not be more than 80 characters. This line is 183 characters.	1
Lines should not be more than 80 characters. This line is 184 characters.	2
Lines should not be more than 80 characters. This line is 191 characters.	5
Lines should not be more than 80 characters. This line is 195 characters.	6
Lines should not be more than 80 characters. This line is 197 characters.	1
Lines should not be more than 80 characters. This line is 205 characters.	1
Lines should not be more than 80 characters. This line is 207 characters.	1
Lines should not be more than 80 characters. This line is 212 characters.	1
Lines should not be more than 80 characters. This line is 217 characters.	2
Lines should not be more than 80 characters. This line is 218 characters.	1
Lines should not be more than 80 characters. This line is 225 characters.	1
Lines should not be more than 80 characters. This line is 226 characters.	2
Lines should not be more than 80 characters. This line is 227 characters.	1
Lines should not be more than 80 characters. This line is 228 characters.	2
Lines should not be more than 80 characters. This line is 242 characters.	1
Lines should not be more than 80 characters. This line is 248 characters.	1
Lines should not be more than 80 characters. This line is 249 characters.	1
Lines should not be more than 80 characters. This line is 253 characters.	1
Lines should not be more than 80 characters. This line is 263 characters.	1
Lines should not be more than 80 characters. This line is 265 characters.	2
Lines should not be more than 80 characters. This line is 269 characters.	5
Lines should not be more than 80 characters. This line is 278 characters.	1
Lines should not be more than 80 characters. This line is 279 characters.	1
Lines should not be more than 80 characters. This line is 281 characters.	1
Lines should not be more than 80 characters. This line is 283 characters.	9
Lines should not be more than 80 characters. This line is 286 characters.	1
Lines should not be more than 80 characters. This line is 291 characters.	5
Lines should not be more than 80 characters. This line is 293 characters.	5
Lines should not be more than 80 characters. This line is 305 characters.	5
Lines should not be more than 80 characters. This line is 316 characters.	3
Lines should not be more than 80 characters. This line is 317 characters.	1
Lines should not be more than 80 characters. This line is 320 characters.	1
Lines should not be more than 80 characters. This line is 321 characters.	1
Lines should not be more than 80 characters. This line is 330 characters.	2
Lines should not be more than 80 characters. This line is 333 characters.	5
Lines should not be more than 80 characters. This line is 337 characters.	1
Lines should not be more than 80 characters. This line is 343 characters.	1
Lines should not be more than 80 characters. This line is 357 characters.	1
Lines should not be more than 80 characters. This line is 362 characters.	1
Lines should not be more than 80 characters. This line is 387 characters.	1
Lines should not be more than 80 characters. This line is 391 characters.	1
Lines should not be more than 80 characters. This line is 428 characters.	1
Lines should not be more than 80 characters. This line is 439 characters.	1
Lines should not be more than 80 characters. This line is 81 characters.	33
Lines should not be more than 80 characters. This line is 82 characters.	31
Lines should not be more than 80 characters. This line is 83 characters.	10
Lines should not be more than 80 characters. This line is 84 characters.	15
Lines should not be more than 80 characters. This line is 85 characters.	13
Lines should not be more than 80 characters. This line is 86 characters.	12
Lines should not be more than 80 characters. This line is 87 characters.	12
Lines should not be more than 80 characters. This line is 88 characters.	8
Lines should not be more than 80 characters. This line is 89 characters.	6
Lines should not be more than 80 characters. This line is 90 characters.	13
Lines should not be more than 80 characters. This line is 91 characters.	5
Lines should not be more than 80 characters. This line is 92 characters.	2
Lines should not be more than 80 characters. This line is 93 characters.	2
Lines should not be more than 80 characters. This line is 94 characters.	5
Lines should not be more than 80 characters. This line is 95 characters.	9
Lines should not be more than 80 characters. This line is 96 characters.	14
Lines should not be more than 80 characters. This line is 97 characters.	1
Lines should not be more than 80 characters. This line is 98 characters.	2
Lines should not be more than 80 characters. This line is 99 characters.	3
Use <-, not =, for assignment.	7
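
The `seq_len` and `vapply` messages above point at edge cases rather than style alone. A minimal base R sketch of why they matter, using a made-up zero-column matrix and an empty list (neither comes from the package):

```r
# Toy zero-column matrix standing in for a design matrix; not package data.
x <- matrix(numeric(0), nrow = 3, ncol = 0)

# 1:ncol(x) counts down from 1 to 0 when there are no columns,
# so a loop over it would run twice with invalid indices.
bad_index <- 1:ncol(x)

# seq_len() returns an empty sequence instead, so a loop body is skipped.
good_index <- seq_len(ncol(x))

# sapply() picks its return type from the input; on empty input it gives a list.
empty_sapply <- sapply(list(), length)

# vapply() always returns the declared type, here an integer vector.
empty_vapply <- vapply(list(), length, integer(1))
```

The same reasoning applies to `1:nrow(...)` and `1:length(...)`, which have `seq_len(nrow(...))` and `seq_along(...)` as drop-in replacements.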

Package Versions

package	version
pkgstats	0.2.0.54
pkgcheck	0.1.2.122
srr	0.1.3.26

Editor-in-Chief Instructions:

This package is in top shape and may be passed on to a handling editor

Mar 12 '25 02:03 ropensci-review-bot

@kelliejarcher thanks so much for following up with a full submission.

Here I share some preliminary checks to support conversations with potential handling editors. If any of this feedback feels useful, now is a great time to incorporate it.

We'll come back ASAP.

EiC pre-checks

Documentation: The package has sufficient documentation available online (README, pkgdown docs) to allow for an assessment of functionality and scope without installing the package. In particular,

[x] Is the case for the package well made?
[ ] Is the reference index page clear (grouped by topic if necessary)?

Lacks a pkgdown website. The file names in R/ suggest an alphabetically sorted reference might not group functionality in a meaningful way.

[x] Are vignettes readable, sufficiently detailed and not just perfunctory?

Consider renaming the file to match the package name so that pkgdown automatically adds a "Get started" tab on the website: mv vignettes/hdcuremodels-vignette.Rmd vignettes/hdcuremodels.Rmd

[x] Fit: The package meets criteria for fit and overlap.
[X] Installation instructions: Are installation instructions clear enough for human users?

Consider adding instructions to install the development version

[x] Tests: If the package has some interactivity / HTTP / plot production etc. are the tests using state-of-the-art tooling?

I see set.seed() and no cleanup. Consider using withr::local_seed() for auto-cleanup. I see multiple calls to expect_error() with no string, class, or snapshot to match. Consider making the expectation more specific.
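
A minimal sketch of what these two suggestions could look like in a test file. The helper `draw_times()` is hypothetical and exists only for illustration; it is not a function in hdcuremodels, and the error message is likewise made up.

```r
library(testthat)

# Hypothetical helper used only for illustration; not part of hdcuremodels.
draw_times <- function(n) {
  if (!is.numeric(n) || length(n) != 1L || n < 1) {
    stop("`n` must be a single positive number.", call. = FALSE)
  }
  rexp(n)
}

test_that("draw_times is reproducible under a local seed", {
  # local_seed() sets the seed and restores the previous RNG state
  # automatically when the test block exits.
  withr::local_seed(123)
  first <- draw_times(5)
  withr::local_seed(123)
  second <- draw_times(5)
  expect_equal(first, second)
})

test_that("draw_times rejects invalid input with a specific message", {
  # Matching on the message keeps the test from passing on an unrelated error.
  expect_error(draw_times(-1), "single positive number")
})
```

Matching on a condition class (the `class` argument of `expect_error()`) is an alternative when the package signals classed errors.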

[x] Contributing information: Is the documentation for contribution clear enough e.g. tokens for tests, playgrounds?
[x] License: The package has a CRAN or OSI accepted license.
[x] Project management: Are the issue and PR trackers in a good shape, e.g. are there outstanding bugs, is it clear when feature requests are meant to be tackled?

Mar 20 '25 13:03 maurolepore

This is my first rOpenSci submission. I previously made some revisions based on comments below. Can you please explain what the next steps will be?

Thanks for any information, Kellie

Apr 22 '25 18:04 kelliejarcher

@kelliejarcher thanks for following up. I re-announced the package among our editors. Once we find an available handling editor they will guide you through the rest of the process.

BTW, to keep this thread easier to follow, it's best to respond directly through the GitHub interface. Responding to the email notification duplicates the message you're responding to 😸

Apr 24 '25 01:04 maurolepore

Here I mark the end of my EiC rotation and leave a short note for the next EiC.

This is a stats submission.
The package was announced twice but still lacks a handling editor.

May 04 '25 22:05 maurolepore

I have been waiting to submit a peer-reviewed manuscript describing this R package so that I can incorporate feedback from rOpenSci. After almost 3 months, I have not heard whether anyone has volunteered to review this package. Can you please let me know if there is something else I need to do or if I should simply forgo the rOpenSci review? I am new to this forum, so I am unclear on the process. Thanks for any information.

Jun 02 '25 18:06 kelliejarcher

@kelliejarcher I'm so sorry, this one somehow slipped through the gaps. I've taken over from @maurolepore for this EiC rotation, and will get things happening here asap.

Jun 03 '25 06:06 mpadge

@ropensci-review-bot assign @tdhock as editor

Jun 04 '25 08:06 mpadge

Assigned! @tdhock is now the editor

Jun 04 '25 08:06 ropensci-review-bot

@ropensci-review-bot seeking reviewers

Jun 06 '25 07:06 tdhock

Please add this badge to the README of your package repository:

[![Status at rOpenSci Software Peer Review](https://badges.ropensci.org/692_status.svg)](https://github.com/ropensci/software-review/issues/692)

Furthermore, if your package does not have a NEWS.md file yet, please create one to capture the changes made during the review process. See https://devguide.ropensci.org/releasing.html#news

Jun 06 '25 07:06 ropensci-review-bot

When I added this badge and updated GitHub, pkgcheck failed, but it seems to be due to an R package that I am not explicitly using. The error is: ✖ Failed to build Rdsdp 1.0.6 (3.1s). Do you know how I can correct this? I am using R 4.5.0 and I checked that Rdsdp 1.0.6 is installed on my machine. Sorry for the naive question, I am new to pkgcheck and rOpenSci processes.

Jun 09 '25 13:06 kelliejarcher

Package Review

Briefly describe any working relationship you have (had) with the package authors.
[x] As the reviewer I confirm that there are no conflicts of interest for me to review this work.

Documentation

The package includes all the following forms of documentation:

[x] A statement of need: clearly stating problems the software is designed to solve and its target audience in README
[x] Installation instructions: for the development version of package and any non-standard dependencies in README
[x] Vignette(s): demonstrating major functionality that runs successfully locally
[x] Function Documentation: for all exported functions
[x] Examples: (that run successfully locally) for all exported functions
[x] Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Functionality

[x] Installation: Installation succeeds as documented.
[x] Functionality: Any functional claims of the software have been confirmed.
[x] Performance: Any performance claims of the software have been confirmed.
[x] Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
[x] Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.

Estimated hours spent reviewing:

[x] Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

The functions in hdcuremodels-vignette are well documented, but for each function I find the long paragraph explaining the parameters hard to follow and would personally prefer bullet points.
In the introduction of hdcuremodels-vignette, the cure survival function could be explained more clearly and in more detail. For example, in $S_u(t, \mathbf{w} | Y=1)$, I assume $Y$ is the indicator {0,1} for {cured, susceptible}. It would also help to state the dimensions of each component, such as the sizes of $\mathbf{x}$ and $\mathbf{w}$.
In the data examples of hdcuremodels-vignette, it would be helpful to explain hyperparameters such as a and rho more clearly, perhaps including their ranges and effects (for example, what happens when rho is 0 or 1, and what happens when a is 0).
The generate_cure_data function requires that the total number of covariates $j$ be divisible by the number of true relevant covariates $n_{\text{true}}$; otherwise it throws an error. This limitation should be clearly documented or handled within the function for better usability.
When I use the generate_cure_data function, I notice that columns like X1.1 and X2.1 always appear. Could you please clarify why? I was expecting the dataset to have shape (n, j + 2).
In the lollipop plot (Beta vs. Step), I guess each line represents a coefficient path that eventually stabilizes into a straight line. However, if I want to focus on specific coefficients, it’s hard to identify them by color in the plot. Could there be a way to highlight or label particular coefficients?
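
For reference, the reading of the notation assumed in the second comment corresponds to the usual mixture cure formulation. A sketch, assuming the vignette follows the standard model and writing $p(\mathbf{x})$ for the probability of being susceptible (a symbol introduced here, not taken from the vignette):

$$
S(t \mid \mathbf{x}, \mathbf{w}) = 1 - p(\mathbf{x}) + p(\mathbf{x})\, S_u(t \mid \mathbf{w}, Y = 1), \qquad p(\mathbf{x}) = P(Y = 1 \mid \mathbf{x}).
$$

Under that reading, $Y = 1$ marks a susceptible subject and $Y = 0$ a cured one, $\mathbf{x}$ holds the covariates for the cure (incidence) part, and $\mathbf{w}$ holds the covariates for the survival (latency) part of the susceptible subjects.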

Jun 09 '25 15:06 lamtung16

@kelliejarcher Those errors clearly come from the {Rdsdp} package itself, which is also generating lots of current warnings on CRAN. Since it's not even a direct dependency of your package, and does not have a public-facing git instance, you'll have to wait for a CRAN update for that to be fixed. (For reference: current version = 1.0.6).
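
One way to see which direct dependency pulls in an indirect one is to walk the recursive dependency trees with `tools::package_dependencies()`. The sketch below is an illustration only: `direct_deps_reaching()` is a hypothetical helper, and the toy database uses made-up package names so that it runs without network access.

```r
# Return the direct dependencies whose recursive dependency tree contains
# `target`. `db` is a package database with the same columns as the matrix
# returned by available.packages().
direct_deps_reaching <- function(direct, target, db) {
  strong <- c("Depends", "Imports", "LinkingTo")
  trees <- tools::package_dependencies(direct, db = db, which = strong,
                                       recursive = TRUE)
  hits <- vapply(trees, function(deps) target %in% deps, logical(1))
  names(hits)[hits]
}

# Toy database with made-up package names, used only to show the call shape.
toy_db <- rbind(
  c(Package = "pkgA", Depends = NA, Imports = "pkgC", LinkingTo = NA),
  c(Package = "pkgB", Depends = NA, Imports = NA, LinkingTo = NA),
  c(Package = "pkgC", Depends = NA, Imports = "pkgD", LinkingTo = NA),
  c(Package = "pkgD", Depends = NA, Imports = NA, LinkingTo = NA)
)

# Which of pkgA and pkgB depends, directly or indirectly, on pkgD?
reaching <- direct_deps_reaching(c("pkgA", "pkgB"), "pkgD", toy_db)
```

With a real database, `db <- available.packages()`, the package's Imports as `direct`, and `"Rdsdp"` as `target` would be the natural inputs.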

Jun 10 '25 10:06 mpadge

@lamtung16 Thank you for your review of the hdcuremodels package. I have made your requested changes which are now on github. A few comments/questions:

I added your name as a reviewer in the DESCRIPTION file. Please verify I have your name and e-mail correctly identified.
For longer parameter descriptions in each help file, I changed to bulleted lists. Let me know if you would also prefer that the short lists (those with 2 to 3 options) in any function be changed to bulleted lists.
I added a paragraph to the beginning of the vignette to more clearly define the notation being used.
In the vignette I explain what a and rho represent and what it means when they are 0 or rho is near 1.
I modified the generate_cure_data function to (a) not require the total number of covariates j to be divisible by the number of true relevant covariates n_true and (b) rename the columns to clearly identify which variables are the unpenalized covariates (prefix is U - the nonp default is 2) and which variables are penalized.
The plot function now has a label parameter; when it is TRUE, the variable labels are displayed in a legend for the trace plot.

Let me know if you have further comments. Thanks again!

Jun 23 '25 16:06 kelliejarcher

That’s a good improvement 👍😃 — thank you! I have a few comments:

In the DESCRIPTION file, my email should be [email protected], not [email protected].
In generate_cure_data, nonp = 0 does not work? Is it supposed to be not working?
In generate_cure_data, the total number of features is j + nonp? Are U1, U2, ... noise?
When I run data <- generate_cure_data(n = 20, j = 3, n_true = 2), it doesn’t work. I’m not sure why.
When I try the following code:

data <- generate_cure_data(n = 1000, j = 4, n_true = 2, nonp = 1)
fitem <- cureem(Surv(Time, Censor) ~ .,
  data = data$training,
  x_latency = data$training
)
plot(fitem, label = TRUE)

I'm not quite sure how to interpret this plot (can I say X1 and X2 are true features and X3 and X4 are noise?). It would be really helpful if you could provide a few simple example plots like this one, along with some explanation, for instance: "This dataset has 2 true features and 2 noise variables. As the penalty increases, the coefficients for the noise variables shrink to zero, while the true features remain. This shows the model's ability to distinguish signal from noise." Having that kind of context would make the plot much more intuitive.

Honestly, I've been trying to make an example of a dataset having 3 features (2 true features and 1 noise). Then from the coefficients plot, when I have the big regularization parameter(s), I can see one coef shrinks to 0, the other two reach non-zero.

But overall, you guys have done a great job 👍

Jun 30 '25 15:06 lamtung16

Thanks again for your review of our package and helpful comments to improve the functions. The following changes have been made:

Your email address has been updated in the DESCRIPTION file.
The <generate_cure_data> function has been revised to allow <nonp = 0> and also to ensure it works for any combination of , and <n_true> and performs a check that is greater than <n_true>.
The help for <generate_cure_data> has been updated to better explain that covariates prefixed with "U" are unpenalized covariates (for example, suppose you want to coerce age and sex into the model but penalize all gene expression values). The vignette has also been expanded to fully explain all parameters of the <generate_cure_data> function. The "U" covariates are not associated with the outcome so it is correct that they are noise, but so are j - n_true of the penalized covariates.
Regarding your code example, the help file and vignette now better explain how to identify which penalized predictors are truly associated with the outcome. The colors and line types in the plot function were changed so that it is easier to identify the incidence ("I_") and latency ("L_") associations for each variable.
I added a few more generic methods to extract some interesting components from the resulting mixturecure object including dim, family, formula, logLik, and nobs.
This package passes devtools::check(), but when uploading to GitHub it still reports "Failed to build Rdsdp 1.0.6". That seems to be a problem with Rdsdp 1.0.6 itself, although https://cran.r-project.org/web/checks/check_results_Rdsdp.html indicates everything is OK with that package.
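
On the question of which covariates are signal and which are noise, here is a minimal sketch in base R of how true-signal indices could be turned into a readable lookup table. The helper `label_signals()` and the `X` prefix are hypothetical names used for illustration, and the index vectors are made up rather than produced by the package.

```r
# Hypothetical helper: flag which of j penalized covariates are true signals,
# given the positions of the true incidence and latency signals.
label_signals <- function(j, nonzero_inc, nonzero_lat, prefix = "X") {
  data.frame(
    variable  = paste0(prefix, seq_len(j)),
    incidence = seq_len(j) %in% nonzero_inc,
    latency   = seq_len(j) %in% nonzero_lat
  )
}

# Illustrative indices only (assumed, not output from generate_cure_data())
truth <- label_signals(j = 4, nonzero_inc = c(1, 3), nonzero_lat = c(2, 3))
print(truth)

# Covariates that are noise for both parts of the model
noise_both <- truth$variable[!truth$incidence & !truth$latency]
print(noise_both)
```

With the package itself, the indices would presumably come from `data$parameters$nonzero_b` (incidence) and `data$parameters$nonzero_beta` (latency) in the object returned by `generate_cure_data()`, assuming these hold column positions among the penalized covariates. The two sets can differ, so a covariate may be signal for one part of the model and noise for the other.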

Jul 03 '25 13:07 kelliejarcher

I checked, they look great 😃

Jul 09 '25 11:07 lamtung16

Thanks for your review, I really appreciate your comments that improved this package. Is the package ready to submit to CRAN and the manuscript to a journal?

Jul 09 '25 11:07 kelliejarcher

I don't have the right to decide, it's up to the editor, but I think it's good for first manuscript submission 😺.

Jul 09 '25 11:07 lamtung16

We are supposed to find two different reviewers, and I am still trying to find a second one. Please tell me if you have any suggestions of colleagues who you think may be willing and available to review.

Jul 10 '25 02:07 tdhock

I doubt if any of these have reviewed for ROpenSci but here are a few names:

Panagiotis Papastamoulis, Sharon Xiangwen Xie, Yingwei Paul Peng

Hope that helps, Kellie

Jul 10 '25 11:07 kelliejarcher

@tdhock Any progress on finding another reviewer here?

Aug 29 '25 08:08 mpadge

I asked a few others, and I was expecting a review a few weeks ago, so I sent an email to ask again.

Aug 29 '25 14:08 tdhock

Package Review

[x] As the reviewer I confirm that there are no conflicts of interest for me to review this work.

Documentation

The package includes all the following forms of documentation:

[x] A statement of need: clearly stating problems the software is designed to solve and its target audience in README
[x] Installation instructions: for the development version of package and any non-standard dependencies in README
[x] Vignette(s): demonstrating major functionality that runs successfully locally
[x] Function Documentation: for all exported functions
[x] Examples: (that run successfully locally) for all exported functions
[x] Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Functionality

[ ] Installation: Installation succeeds as documented.
[x] Functionality: Any functional claims of the software have been confirmed.
[x] Performance: Any performance claims of the software have been confirmed.
[x] Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
[x] Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.

Please see the Review Comments section for a problem I encountered during installation on my Linux/Ubuntu machine.

Estimated hours spent reviewing:

[x] Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

Congratulations on the package; I really appreciate the effort invested. The package is useful because it can handle a large number of covariates in a mixture cure rate model. Below I list some issues I experienced during installation, followed by some general comments that may help the authors further expand the utility of the package.

Regarding installation, there was a problem installing the package dependencies on Ubuntu 24.04 because the compiler search paths do not include /usr/share/R/include, which is where Ubuntu 24.04 puts the file R.h. In particular, the problem was with the Rdsdp package. To install this dependency on my machine, I first had to run in a terminal

   export CPATH=/usr/share/R/include:$CPATH
   export LIBRARY_PATH=$(R RHOME)/lib:$LIBRARY_PATH

and then the hdcuremodels package was successfully installed. This is unusual, because it is the first time I had to do something similar for a package that compiles C or Fortran code. The log file of the unsuccessful installation is attached below in the file output_Rdsdp.log.
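
One way to check whether a machine is affected is to ask R where its C headers live and compare that with the compiler search path. The following is a small diagnostic sketch in base R, not a fix; it only reports whether `R.h` is present and whether its directory already appears in the `CPATH` environment variable.

```r
# Directory where this R installation keeps its C headers (R.h, Rinternals.h, ...)
inc_dir <- R.home("include")
has_header <- file.exists(file.path(inc_dir, "R.h"))

# Directories currently listed in CPATH (none if the variable is unset)
cpath <- Sys.getenv("CPATH")
cpath_dirs <- if (nzchar(cpath)) {
  strsplit(cpath, .Platform$path.sep, fixed = TRUE)[[1]]
} else {
  character(0)
}
cpath_dirs <- cpath_dirs[nzchar(cpath_dirs)]  # drop empty entries such as a trailing ":"

# Is the header directory already on CPATH?
in_cpath <- normalizePath(inc_dir, mustWork = FALSE) %in%
  normalizePath(cpath_dirs, mustWork = FALSE)

cat("R include directory:", inc_dir, "\n")
cat("R.h found there:    ", has_header, "\n")
cat("Directory in CPATH: ", in_cpath, "\n")
```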

When running devtools::check() various warnings are generated, like:

ℹ Updating hdcuremodels documentation
ℹ Loading hdcuremodels
✖ auc_mcm.R:44: @srrstats is not a known tag.
✖ auc_mcm.R:45: @srrstats is not a known tag.
✖ auc_mcm.R:46: @srrstats is not a known tag.

(output truncated); otherwise the check is successful.

When running devtools::test() the following warning is generated

Warning (test-formula.R:11:3): formula function works correctly
`failure_message` is missing, with no default.
Backtrace:
    ▆
 1. └─testthat::expect(is.call(formula(fit))) at test-formula.R:11:3
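
The warning points at the low-level `testthat::expect()`, which takes a `failure_message` argument in addition to the condition. A minimal sketch of two ways to write the same check follows; `lm()` on the built-in `cars` data is only a stand-in for a fitted model from the package, an assumption made to keep the example self-contained.

```r
library(testthat)

# Stand-in model; the package test uses its own fitted object instead
fit <- lm(dist ~ speed, data = cars)

test_that("formula() returns a call", {
  # Higher-level expectation: builds its own failure message
  expect_true(is.call(formula(fit)))

  # Low-level expectation: the failure message is supplied explicitly
  expect(
    is.call(formula(fit)),
    failure_message = "formula(fit) did not return a call"
  )
})
```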

The package provides a generate_cure_data() function to simulate synthetic datasets. It would be helpful to also return the true (latent) status of each observation (cured or susceptible).
The log-likelihood surface of cure rate models may exhibit multiple modes, see e.g. Papastamoulis and Milienos (Test, 2024), and this in turn may result in sub-optimal inferences. I was wondering whether the package can take this into account (possibly by running the EM algorithm from multiple starts and selecting the best run, by means of the criteria used by the authors).
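
The multiple-starts idea can be sketched generically in base R. This is not how the package works; it is a toy illustration in which `optim()` stands in for the EM fit, a one-dimensional function with two local minima stands in for the negative log-likelihood, and `multi_start()` is a hypothetical helper name.

```r
# Toy objective with two local minima; stands in for a negative log-likelihood
objective <- function(x) (x^2 - 1)^2 + 0.3 * x

# Hypothetical helper: run the optimizer from each start and keep the best run
multi_start <- function(fn, starts) {
  fits <- lapply(starts, function(s) optim(par = s, fn = fn, method = "BFGS"))
  values <- vapply(fits, function(f) f$value, numeric(1))
  list(best = fits[[which.min(values)]], values = values)
}

starts <- c(-2, -0.5, 0.5, 2)
res <- multi_start(objective, starts)

print(res$values)    # objective value reached from each start
print(res$best$par)  # location of the best local minimum found
```

If the values reached from different starts disagree, the surface has more than one mode in the explored region, and keeping only a single run would risk reporting the worse one.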

output_Rdsdp.log

Sep 07 '25 18:09 mqbssppe

Thank you for reviewing our package. Would you please provide how you would like your name to appear in the DESCRIPTION file along with your email address, to acknowledge your review?

Here is an itemized response to your comments.

This installation problem is rather confusing because the package issues no notes or warnings on CRAN (https://cran.r-project.org/web/packages/hdcuremodels/index.html). Nevertheless, I tried to remedy it by creating a src directory with a Makevars file containing the lines

   export CPATH=/usr/share/R/include:$CPATH
   export LIBRARY_PATH=$(R RHOME)/lib:$LIBRARY_PATH

but because I am not using any external code directly (Rdsdp is only called by the knockoff package, which depends on it), my package does not pass devtools::check() with this addition. Do you know if there is a change I can make to my package so that it passes this check but avoids this problem for Ubuntu 24 users? I thought about adding the line

   SystemRequirements: DSDP library needed when using the optional FDR control via knockoff R package (bundled in Rdsdp)

but wasn't sure if there is something else that can be done.
These are Roxygen tags that do not involve any code. The warnings generated likely resulted because the srr package was not in Suggests in the DESCRIPTION file and this package does not exist on CRAN. I have added the following to the DESCRIPTION file:

Suggests: srr (among the others)
Remotes: ropensci-review-tools/srr

and I added a utils.R file containing

#' @keywords internal
"_PACKAGE"

if (getRversion() >= "2.15.1") {
  utils::globalVariables(c("srrstats", "srrstatsNA", "srrstatsTODO"))
}

so these are defined as global variables and R won't complain. Let me know if this message still persists.

That line in test-formula has been corrected to expect_true(is.call(formula(fit)))
Great idea, we should have thought of that! We now output the true underlying cure status for the training and testing data.frames as vectors training_y and testing_y in the object returned from generate_cure_data(). The help file and vignette have been updated to reflect this change.
We used our generate_cure_data function to generate cure status data using a sample size of 625 allocating 80% to the training data so that the training data consists of 500 observations, with a cure rate of ~35% and 50 covariates, 3 of which are truly related to incidence and latency, selected independently. No unpenalized covariates were included. We then used cureem to fit a penalized Cox model and varied the starting values for itct (intercept of incidence portion) and survprob (numeric vector for the latency survival probabilities). The coefficient of variation for the coefficient estimates was very small, indicating that the algorithm is insensitive to starting values. The code is

set.seed(16)
data <- generate_cure_data(n = 625, j = 50, nonp = 0, n_true = 3,
                           train_prop = 0.80, a = 1, rho = 0.4, itct_mean = 1)
training <- data$training
testing <- data$testing

fit_b0 <- numeric()
fit_b <- matrix(nrow = 100, ncol = 50)
fit_beta <- matrix(nrow = 100, ncol = 50)
for (i in 1:100) {
  fit <- cureem(Surv(Time, Censor) ~ ., data = training, x_latency = training,
                inits = list(itct = runif(1, -1, 1), survprob = runif(500, 0, 1)))
  fit_coef <- coef(fit)
  fit_b0[i] <- fit_coef$b0
  fit_b[i, ] <- fit_coef$beta_inc
  fit_beta[i, ] <- fit_coef$beta_lat
  print(i)
}

cv <- function(x) { sd(x) / mean(x) }
data$parameters$nonzero_b
cv(fit_b[, 15])
cv(fit_b[, 25])
cv(fit_b[, 45])
data$parameters$nonzero_beta
cv(fit_beta[, 11])
cv(fit_beta[, 31])
cv(fit_beta[, 38])

Sep 12 '25 15:09 kelliejarcher

I am satisfied with the responses received.

Panagiotis Papastamoulis, papapast [at] yahoo.gr

Sep 16 '25 08:09 mqbssppe

Thank you so much. I have updated the DESCRIPTION file to acknowledge your review.

Sep 16 '25 11:09 kelliejarcher
