software-review galamm: Generalized Additive Latent and Mixed Models

Open osorensen opened this issue 10 months ago • 10 comments

Submitting Author Name: Øystein Sørensen
Submitting Author Github Handle: @osorensen
Repository: https://github.com/LCBC-UiO/galamm
Version submitted: 0.1.1.9000
Submission type: Stats
Badge grade: gold
Editor: TBD
Reviewers: TBD

Archive: TBD
Version accepted: TBD
Language: en

Paste the full DESCRIPTION file inside a code block below:

Package: galamm
Title: Generalized Additive Latent and Mixed Models
Version: 0.1.1.9000
Authors@R: c(
    person(given = "Øystein",
           family = "Sørensen",
           role = c("aut", "cre"),
           email = "[email protected]",
           comment = c(ORCID = "0000-0003-0724-3542")),
    person(given = "Douglas", family = "Bates", role = "ctb"),       
    person(given = "Ben", family = "Bolker", role = "ctb"),
    person(given = "Martin", family = "Maechler", role = "ctb"),
    person(given = "Allan", family = "Leal", role = "ctb"),
    person(given = "Fabian", family = "Scheipl", role = "ctb"),
    person(given = "Steven", family = "Walker", role = "ctb"),
    person(given = "Simon", family = "Wood", role = "ctb")
           )
Description: Estimates generalized additive latent and
    mixed models using maximum marginal likelihood, 
    as defined in Sorensen et al. (2023) 
    <doi:10.1007/s11336-023-09910-z>, which is an extension of Rabe-Hesketh and
    Skrondal (2004)'s unifying framework for multilevel latent variable 
    modeling <doi:10.1007/BF02295939>. Efficient computation is done using sparse 
    matrix methods, Laplace approximation, and automatic differentiation. The 
    framework includes generalized multilevel models with heteroscedastic 
    residuals, mixed response types, factor loadings, smoothing splines, 
    crossed random effects, and combinations thereof. Syntax for model 
    formulation is close to 'lme4' (Bates et al. (2015) 
    <doi:10.18637/jss.v067.i01>) and 'PLmixed' (Rockwood and Jeon (2019) 
    <doi:10.1080/00273171.2018.1516541>).
License: GPL (>= 3)
URL: https://github.com/LCBC-UiO/galamm, https://lcbc-uio.github.io/galamm/
BugReports: https://github.com/LCBC-UiO/galamm/issues
Encoding: UTF-8
Imports: 
    lme4,
    Matrix,
    memoise,
    methods,
    mgcv,
    nlme,
    Rcpp,
    Rdpack,
    stats
Depends:
    R (>= 3.5.0)
LinkingTo:
    Rcpp,
    RcppEigen
LazyData: true
Roxygen: list(markdown = TRUE, roclets = c ("namespace", "rd", "srr::srr_stats_roclet"))
RoxygenNote: 7.2.3
Suggests:
    covr,
    gamm4,
    knitr,
    PLmixed,
    rmarkdown,
    testthat (>= 3.0.0)
Config/testthat/edition: 3
VignetteBuilder: knitr, rmarkdown
RdMacros: Rdpack
NeedsCompilation: yes
SystemRequirements: C++17

Scope

Please indicate which of our statistical package categories this package falls under. (Please check one appropriate box below):

Statistical Packages
- [ ] Bayesian and Monte Carlo Routines
- [ ] Dimensionality Reduction, Clustering, and Unsupervised Learning
- [ ] Machine Learning
- [x] Regression and Supervised Learning
- [ ] Exploratory Data Analysis (EDA) and Summary Statistics
- [ ] Spatial Analyses
- [ ] Time Series Analyses

Pre-submission Inquiry

[x] A pre-submission inquiry has been approved in issue 614

General Information

Who is the target audience and what are scientific applications of this package? The target audience is applied statisticians and quantitative scientists, particularly those working on the social sciences. The package is motivated by longitudinal studies in cognitive neuroscience, but it is applicable wherever a measurement model (of factor analysis type) needs to be combined with hierarchical modeling.
Paste your responses to our General Standard G1.1 here, describing whether your software is:

This is the first implementation of the algorithm developed in Sørensen, Fjell, and Walhovd (2023).

(If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research? Not applicable.

Badging

What grade of badge are you aiming for? (bronze, silver, gold) gold
If aiming for silver or gold, describe which of the four aspects listed in the Guide for Authors chapter the package fulfils (at least one aspect for silver; three for gold)

"Compliance with a good number of standards beyond those identified as minimally necessary.": I have attempted to comply with all standards for regression software outlined in the Online Book for Statistical Software. I have used srr to point out which parts of the code I think address each of the standards.
"Demonstrating excellence in compliance with multiple standards from at least two broad sub-categories.": I have tried to comply with all the standards in 6.1.1 - 6.1.5 of the Standards Chapter.
"Have a demonstrated generality of usage beyond one single envisioned use case.": The software supports generality of usage, and the vignettes describe several such use cases.

Technical checks

Confirm each of the following by checking the box.

[x] I have read the rOpenSci packaging guide.
[x] I have read the author guide and I expect to maintain this package for at least 2 years or have another maintainer identified.
[x] I/we have read the Statistical Software Peer Review Guide for Authors.
[ ] I/we have run autotest checks on the package, and ensured no tests fail. Running autotest gives some errors, but they were waived in the pre-review issue.
[x] The srr_stats_pre_submit() function confirms this package may be submitted.
[x] The pkgcheck() function confirms this package may be submitted - alternatively, please explain reasons for any checks which your package is unable to pass.

This package:

[x] does not violate the Terms of Service of any service it interacts with.
[x] has a CRAN and OSI accepted license.
[x] contains a README with instructions for installing the development version.

Publication options

[x] Do you intend for this package to go on CRAN? The package is already on CRAN. I am aware that rOpenSci recommends waiting to submit to CRAN, but the package already has some users, and pre-compiled binaries on CRAN make it easier for them to install, since they do not have to set up the toolchain required to install from source. I therefore opted to submit it to CRAN.
[ ] Do you intend for this package to go on Bioconductor?

Code of conduct

[x] I agree to abide by rOpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.

Oct 20 '23 08:10 osorensen

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.

Oct 20 '23 08:10 ropensci-review-bot

:rocket:

Editor check started

:wave:

Oct 20 '23 08:10 ropensci-review-bot

Checks for galamm (v0.1.1.9000)

git hash: 26cfad15

:heavy_check_mark: Package is already on CRAN.
:heavy_check_mark: has a 'codemeta.json' file.
:heavy_check_mark: has a 'contributing' file.
:heavy_check_mark: uses 'roxygen2'.
:heavy_check_mark: 'DESCRIPTION' has a URL field.
:heavy_check_mark: 'DESCRIPTION' has a BugReports field.
:heavy_check_mark: Package has at least one HTML vignette
:heavy_check_mark: All functions have examples.
:heavy_check_mark: Package has continuous integration checks.
:heavy_check_mark: Package coverage is 98.4%.
:heavy_check_mark: R CMD check found no errors.
:heavy_check_mark: R CMD check found no warnings.
:eyes: Function names are duplicated in other packages

(Checks marked with :eyes: may be optionally addressed.)

Package License: GPL (>= 3)

1. rOpenSci Statistical Standards (`srr` package)

:heavy_check_mark: All applicable standards [v0.2.0] have been documented in this package (217 complied with; 0 N/A standards)

2. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

type	package	ncalls
internal	base	439
internal	galamm	76
internal	utils	30
internal	graphics	6
imports	stats	79
imports	lme4	11
imports	Matrix	11
imports	mgcv	8
imports	methods	3
imports	nlme	3
imports	memoise	1
imports	Rcpp	NA
imports	Rdpack	NA
suggests	covr	NA
suggests	gamm4	NA
suggests	knitr	NA
suggests	PLmixed	NA
suggests	rmarkdown	NA
suggests	testthat	NA
linking_to	Rcpp	NA
linking_to	RcppEigen	NA

Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(<path/to/repo>)', and examining the 'external_calls' table.

base

list (31), seq_along (22), for (19), lapply (19), length (18), c (17), names (15), ncol (15), attr (14), seq_len (14), vapply (14), if (13), drop (11), rep (11), as.numeric (10), is.null (9), nrow (9), integer (8), unlist (8), factor (7), paste (7), qr (7), diff (6), max (6), seq (6), all.vars (5), any (5), matrix (5), numeric (5), cbind (4), colnames (4), logical (4), sqrt (4), beta (3), eval (3), grepl (3), levels (3), Map (3), match.call (3), Reduce (3), return (3), row.names (3), scale (3), unique (3), by (2), col (2), data.frame (2), diag (2), do.call (2), ifelse (2), lengths (2), order (2), parent.frame (2), qr.R (2), rank (2), rbind (2), abs (1), array (1), as.character (1), as.integer (1), as.logical (1), as.matrix (1), assign (1), backsolve (1), deparse (1), deparse1 (1), dim (1), environment (1), getOption (1), inherits (1), intersect (1), is.infinite (1), is.nan (1), min (1), parse (1), pmax (1), qr.qty (1), regexpr (1), rep.int (1), rowSums (1), setdiff (1), split (1), sum (1), t (1), tabulate (1), which (1)

stats

deviance (9), pf (8), formula (6), as.formula (4), BIC (4), family (4), logLik (4), model.matrix (4), weights (4), quantile (3), terms (3), AIC (2), nobs (2), rf (2), terms.formula (2), contrasts (1), D (1), delete.response (1), df (1), gaussian (1), getCall (1), model.frame (1), model.response (1), na.action (1), optim (1), pchisq (1), pnorm (1), qnorm (1), reformulate (1), smooth (1), start (1), update (1), vcov (1)

galamm

extractor (3), factor_finder (3), find_parm_inds (3), fn (3), gr (3), mlwrapper (3), define_factor_mappings (2), extend_lambda (2), extract_name (2), find_k (2), gam.setup (2), gamm4 (2), gamm4.setup (2), interpret.gam0 (2), set_initial_values (2), setup_factor (2), anova.galamm (1), coef.galamm (1), confint.galamm (1), deviance.galamm (1), extract_optim_parameters (1), extract_optim_parameters.galamm (1), factor_loadings (1), factor_loadings.galamm (1), family.galamm (1), fitted.galamm (1), fixef.galamm (1), formula.galamm (1), galamm (1), galamm_control (1), gam.side (1), gamm4.wrapup (1), llikAIC (1), logLik.galamm (1), mappingunwrapping (1), marginal_likelihood (1), new_galamm_control (1), nobs.galamm (1), plot_smooth (1), plot_smooth.galamm (1), plot.galamm (1), predict.galamm (1), print.summary.galamm (1), print.VarCorr.galamm (1), ranef.galamm (1), release_questions (1), residuals.galamm (1), setup_family (1), setup_response_object (1), sl (1), squeeze_mappings (1), t2l (1), VarCorr.galamm (1), variable.summary (1)

utils

data (30)

lme4

findbars (3), nobars (3), lFormula (2), mkReTrms (2), .prt.VC (1)

Matrix

t (4), chol (2), Matrix (2), solve (2), Diagonal (1)

mgcv

new.name (2), smooth2random (2), Rrank (1), s (1), smoothCon (1), t2 (1)

graphics

par (3), abline (2), text (1)

methods

as (3)

nlme

fixef (1), ranef (1), VarCorr (1)

memoise

memoise (1)
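
The `pkgstats` call quoted above the tallies can be wrapped roughly as follows. This is a minimal sketch, not part of the bot output: the helper name `external_call_table` and the `repo_path` argument are hypothetical, and it assumes a local clone of the repository plus a working `pkgstats` installation (which has system requirements of its own).

```r
# Hypothetical helper around the call quoted by the bot:
#   s <- pkgstats::pkgstats(<path/to/repo>)
# Returns the 'external_calls' table, or NULL if pkgstats is unavailable.
external_call_table <- function(repo_path) {
  if (!requireNamespace("pkgstats", quietly = TRUE)) {
    message("pkgstats is not installed; nothing to tabulate")
    return(NULL)
  }
  s <- pkgstats::pkgstats(repo_path)
  s$external_calls
}

# Usage (not run here; the path is a placeholder):
# calls <- external_call_table("path/to/galamm")
# str(calls)
```

Inspecting the result with `str()` first avoids relying on particular column names, which are not listed in this report.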

NOTE: Some imported packages appear to have no associated function calls; please ensure with author that these 'Imports' are listed appropriately.

3. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

code in C++ (4% in 2 files), C/C++ Header (66% in 18 files) and R (29% in 30 files)
1 authors
9 vignettes
8 internal data files
9 imported packages
31 exported functions (median 6 lines of code)
81 non-exported functions in R (median 16 lines of code)
618 C/C++ functions (median 4 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages The following terminology is used:

loc = "Lines of Code"
fn = "function"
exp/not_exp = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the checks_to_markdown() function

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

measure	value	percentile	noteworthy
files_R	30	89.3
files_src	2	79.1
files_inst	18	99.6
files_vignettes	9	99.2
files_tests	10	90.7
loc_R	1777	81.8
loc_src	252	31.9
loc_inst	4014	86.1
loc_vignettes	1732	96.3	TRUE
loc_tests	2479	95.4	TRUE
num_vignettes	9	99.6	TRUE
data_size_total	265405	88.8
data_size_median	13688	80.9
n_fns_r	112	79.1
n_fns_r_exported	31	79.2
n_fns_r_not_exported	81	79.5
n_fns_src	618	96.1	TRUE
n_fns_per_file_r	2	39.7
n_fns_per_file_src	24	95.1	TRUE
num_params_per_fn	2	11.9
loc_per_fn_r	12	36.1
loc_per_fn_r_exp	6	10.5
loc_per_fn_r_not_exp	16	52.7
loc_per_fn_src	4	2.0	TRUE
rel_whitespace_R	18	80.9
rel_whitespace_src	14	29.1
rel_whitespace_inst	24	85.7
rel_whitespace_vignettes	51	99.2	TRUE
rel_whitespace_tests	11	88.9
doclines_per_fn_exp	42	52.8
doclines_per_fn_not_exp	0	0.0	TRUE
fn_call_network_size	1302	98.5	TRUE
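
The "noteworthy" rule stated above (values in the upper or lower 5th percentile) can be illustrated on a handful of rows copied from the table. This is only a sketch: the strict inequalities are an assumption, and the bot evidently applies further rules not described here, since `files_inst` (99.6) and `files_vignettes` (99.2) are left unflagged in the table.

```r
# Percentiles copied from selected rows of the table above.
percentile <- c(
  loc_R = 81.8,
  loc_vignettes = 96.3,
  num_params_per_fn = 11.9,
  loc_per_fn_src = 2.0,
  doclines_per_fn_not_exp = 0.0,
  fn_call_network_size = 98.5
)

# Flag measures in the lower or upper 5% tail.
# Strict inequalities are an assumption, not taken from the report.
noteworthy <- percentile < 5 | percentile > 95

# Names of the flagged measures among the selected rows.
names(percentile)[noteworthy]
```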

3a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package

4. `goodpractice` and other checks

Details of goodpractice checks (click to open)

3a. Continuous Integration Badges

GitHub Workflow Results

id	name	conclusion	sha	run_number	date
6584930507	lint	success	26cfad	495	2023-10-20
6584967181	pages build and deployment	success	2fba91	144	2023-10-20
6584930514	pkgdown	success	26cfad	350	2023-10-20
6584930510	R-CMD-check	success	26cfad	558	2023-10-20
6584930523	test-coverage	success	26cfad	248	2023-10-20

3b. `goodpractice` results

`R CMD check` with rcmdcheck

R CMD check generated the following note:

checking installed package size ... NOTE installed size is 34.0Mb sub-directories of 1Mb or more: doc 2.0Mb libs 30.7Mb

R CMD check generated the following check_fail:

rcmdcheck_reasonable_installed_size

Test coverage with covr

Package coverage: 98.35

Cyclocomplexity with cyclocomp

The following functions have cyclocomplexity >= 15:

function	cyclocomplexity
galamm	45
gam.setup	44
gamm4.wrapup	44
interpret.gam0	29
define_factor_mappings	17
galamm_control	17

Static code analyses with lintr

lintr found the following 296 potential issues:

message	number of times
Avoid library() and require() calls in packages	10
Lines should not be more than 80 characters.	286
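
Almost all of the messages above concern line length. The following base-R sketch locates such lines in a single file; it is not how lintr works internally, and the helper name `find_long_lines` is hypothetical.

```r
# Return the line numbers in `path` that exceed `limit` characters.
find_long_lines <- function(path, limit = 80L) {
  lines <- readLines(path, warn = FALSE)
  which(nchar(lines, type = "chars", allowNA = TRUE) > limit)
}

# Self-contained demonstration on a temporary file:
# line 1 is short, line 2 has 81 characters, line 3 has exactly 80.
demo_file <- tempfile(fileext = ".R")
writeLines(c("x <- 1", strrep("a", 81), strrep("b", 80)), demo_file)
long <- find_long_lines(demo_file)
```

Applied over the files in a package's `R/` directory (for example with `lapply()` on `list.files()`), this gives a quick overview before running the full lintr checks.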

5. Other Checks

Details of other checks (click to open)

:heavy_multiplication_x: The following 2 function names are duplicated in other packages:

- plot_smooth from itsadug
- sl from reinsureR

Package Versions

package	version
pkgstats	0.1.3.9
pkgcheck	0.1.2.10
srr	0.0.1.194

Editor-in-Chief Instructions:

This package is in top shape and may be passed on to a handling editor

Oct 20 '23 09:10 ropensci-review-bot

👋 @noamross, I just wanted to ask: what's the status of this submission? Is rOpenSci interested in reviewing it?

Dec 21 '23 10:12 osorensen

@osorensen, thanks for following up. My apologies, I think this package fell through the cracks in our editor hand-off. I'll follow up later today.

Dec 21 '23 11:12 noamross

@osorensen Apologies once again, with recent organisational changes this once again fell through the cracks. We are now finally on it. How are you positioned if we finally get the process started now?

Apr 25 '24 07:04 mpadge

@mpadge, a paper describing the package is currently under review at a journal, so I think my best option now is to withdraw the submission to rOpenSci. Maybe I can just do that by closing this issue?

Apr 25 '24 08:04 osorensen

@osorensen We'd still like to work with you to get this through our review process. How about one of the following options:

1. We put the submission on hold here, until you let us know when your paper has passed through review. We'll then re-start the official review process straight after then. Or,
2. We start the review process anyway. Our reviews are generally completed within just a few months at most. Especially given our slow response thus far, we'd ensure that our process would be completed well before a typical manuscript review. (Some good examples for recent stats submissions are #550 and #571, both done in around 2 months.) All changes and software improvements during our review could be used to support your responses to manuscript reviews.

Note that if your submission is to Journal of Statistical Software, then our system has been developed in collaboration with their processes, and they would likely welcome you using the results of a review here to support their own process.

Apr 25 '24 09:04 mpadge

Thanks @mpadge, I'll go for option 1 then, and will ping you here once I've got a final decision on the paper.

Apr 27 '24 10:04 osorensen

> @osorensen We'd still like to work with you to get this through our review process. How about one of the following options:
>
> We put the submission on hold here, until you let us know when your paper has passed through review. We'll then re-start the official review process straight after then. Or,
>
> We start the review process anyway. Our reviews are generally completed within just a few months at most. Especially given our slow response thus far, we'd ensure that our process would be completed well before a typical manuscript review. (Some good examples for recent stats submissions are Submission - melt: Multiple Empirical Likelihood Tests #550 and waywiser: Ergonomic Methods for Assessing Spatial Models #571, both done in around 2 months.) All changes and software improvements during our review could be used to support your responses to manuscript reviews.
>
> Note that if your submission is to Journal of Statistical Software, then our system has been developed in collaboration with their processes, and they would likely welcome you using the results of a review here to support their own process.

@mpadge, I just stumbled upon this notification. Thank you for mentioning my package review case. The peer review process has significantly enhanced the package's quality in a very short period of time, and I also believe this has expedited its review for publication in the Journal of Statistical Software.

May 03 '24 22:05 markean

@mpadge, the software paper is now published in Multivariate Behavioral Research, https://doi.org/10.1080/00273171.2024.2385336. We can therefore start this review now.

Aug 14 '24 19:08 osorensen
