Contributions
Contributions copied to clipboard
Submission of scDiagnostics to Bioconductor
Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor
- Repository: https://github.com/ccb-hms/scDiagnostics
Confirm the following by editing each check box to '[x]'
-
[x ] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
-
[x ] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
-
[x ] I understand Bioconductor Package Naming Policy and acknowledge Bioconductor may retain use of package name.
-
[x ] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.
-
[x ] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.
-
[x ] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.
-
[x ] I am familiar with the Bioconductor code of conduct and agree to abide by it.
I am familiar with the essential aspects of Bioconductor software management, including:
- [ x] The 'devel' branch for new packages and features.
- [ x] The stable 'release' branch, made available every six months, for bug fixes.
- [x ] Bioconductor version control using Git (optionally via GitHub).
For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.
Hi @AnthonyChristidis
Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.
The DESCRIPTION file for this package is:
Type: Package
Package: scDiagnostics
Title: Cell type annotation diagnostics
Version: 0.99.0
Authors@R: c(
person("Anthony", "Christidis", role = c("aut", "cre"),
email = "[email protected]"),
person("Andrew", "Ghazi", role = "aut"),
person("Smriti", "Chawla", role = "aut"),
person("Nitesh", "Turaga", role = "ctb"),
person("Ludwig", "Geistlinger", role = "aut"),
person("Robert", "Gentleman", role = "aut")
)
Description: The scDiagnostics package provides diagnostic plots to
assess the quality of cell type assignments from single cell gene
expression profiles. The implemented functionality allows to
assess the reliability of cell type annotations, investigate gene
expression patterns, and explore relationships between different
cell types in query and reference datasets allowing users to
detect potential misalignments between reference and query
datasets. The package also provides visualization capabilities for
diagnositics purposes.
License: Artistic-2.0
URL: https://github.com/ccb-hms/scDiagnostics
BugReports: https://github.com/ccb-hms/scDiagnostics/issues
Depends:
R (>= 4.4.0)
Imports:
SingleCellExperiment,
isotree,
methods,
ggplot2,
RColorBrewer,
gridExtra,
SummarizedExperiment,
stats,
utils,
ranger,
BiocNeighbors,
Hotelling,
rlang
Suggests:
AUCell,
BiocStyle,
corrplot,
knitr,
Matrix,
rmarkdown,
scran,
scRNAseq,
SingleR,
celldex,
ComplexHeatmap,
scuttle,
scater,
testthat (>= 3.0.0)
VignetteBuilder:
knitr
biocViews:
Annotation,
Classification,
Clustering,
GeneExpression,
RNASeq,
SingleCell,
Software,
Transcriptomics
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.3.1
Config/testthat/edition: 3
Hi we would like to request that this submission is assigned to @HelenaLC as reviewer, if possible, given that she is an expert in single-cell analysis and has written diagnostic functionality for CyTOF data that seems highly relevant here.
Also: this package has been accepted for a package demo at the upcoming Bioc2024 conference. We understand that there are several constraints to the review process, but it would be great if we would be able to proceed with the review process in a timely manner so that the package would be available from Bioconductor in time for the package demo at the conference.
I'm getting an ERROR trying to R CMD build ...
4: do.call(FUN, c(list(x = x, k = k, nu = nu, nv = nv, center = center, scale = scale, BPPARAM = BPPARAM, ...), ARGS(BSPARAM)))
5: runSVD(x, k = k, nu = nu, nv = nv, center = center, scale = scale, BPPARAM = BPPARAM, ..., BSPARAM = BSPARAMFUN(deferred = bsdeferred(BSPARAM), fold = bsfold(BSPARAM)))
6: runSVD(x, k = k, nu = nu, nv = nv, center = center, scale = scale, BPPARAM = BPPARAM, ..., BSPARAM = BSPARAMFUN(deferred = bsdeferred(BSPARAM), fold = bsfold(BSPARAM)))
7: runSVD(x, k = rank, nu = ifelse(get.pcs, rank, 0), nv = ifelse(get.rotation, rank, 0), center = center, scale = scale, ...)
8: runSVD(x, k = rank, nu = ifelse(get.pcs, rank, 0), nv = ifelse(get.rotation, rank, 0), center = center, scale = scale, ...)
9: .local(x, ...)
10: runPCA(x, rank = ncomponents, BSPARAM = BSPARAM, BPPARAM = BPPARAM)
11: runPCA(x, rank = ncomponents, BSPARAM = BSPARAM, BPPARAM = BPPARAM)
12: .calculate_pca(mat, transposed = !is.null(dimred), ...)
13: .local(x, ...)
14: calculatePCA(y, ...)
15: calculatePCA(y, ...)
16: .local(x, ...)
17: runPCA(ref_data)
18: runPCA(ref_data)
19: eval(expr, envir, enclos)
20: eval(expr, envir, enclos)
21: eval_with_user_handlers(expr, envir, enclos, user_handlers)
22: withVisible(eval_with_user_handlers(expr, envir, enclos, user_handlers))
23: withCallingHandlers(withVisible(eval_with_user_handlers(expr, envir, enclos, user_handlers)), warning = wHandler, error = eHandler, message = mHandler)
24: handle(ev <- withCallingHandlers(withVisible(eval_with_user_handlers(expr, envir, enclos, user_handlers)), warning = wHandler, error = eHandler, message = mHandler))
25: timing_fn(handle(ev <- withCallingHandlers(withVisible(eval_with_user_handlers(expr, envir, enclos, user_handlers)), warning = wHandler, error = eHandler, message = mHandler)))
26: evaluate_call(expr, parsed$src[[i]], envir = envir, enclos = enclos, debug = debug, last = i == length(out), use_try = stop_on_error != 2L, keep_warning = keep_warning, keep_message = keep_message, log_echo = log_echo, log_warning = log_warning, output_handler = output_handler, include_timing = include_timing)
27: evaluate::evaluate(...)
28: evaluate(code, envir = env, new_device = FALSE, keep_warning = if (is.numeric(options$warning)) TRUE else options$warning, keep_message = if (is.numeric(options$message)) TRUE else options$message, stop_on_error = if (is.numeric(options$error)) options$error else { if (options$error && options$include) 0L else 2L }, output_handler = knit_handlers(options$render, options))
29: in_dir(input_dir(), expr)
30: in_input_dir(evaluate(code, envir = env, new_device = FALSE, keep_warning = if (is.numeric(options$warning)) TRUE else options$warning, keep_message = if (is.numeric(options$message)) TRUE else options$message, stop_on_error = if (is.numeric(options$error)) options$error else { if (options$error && options$include) 0L else 2L }, output_handler = knit_handlers(options$render, options)))
31: eng_r(options)
32: block_exec(params)
33: call_block(x)
34: process_group(group)
35: withCallingHandlers(if (tangle) process_tangle(group) else process_group(group), error = function(e) if (xfun::pkg_available("rlang", "1.0.0")) rlang::entrace(e))
36: xfun:::handle_error(withCallingHandlers(if (tangle) process_tangle(group) else process_group(group), error = function(e) if (xfun::pkg_available("rlang", "1.0.0")) rlang::entrace(e)), function(e, loc) { setwd(wd) write_utf8(res, output %n% stdout()) paste0("\nQuitting from lines ", loc) }, if (labels[i] != "") sprintf(" [%s]", labels[i]), get_loc)
37: process_file(text, output)
38: knitr::knit(knit_input, knit_output, envir = envir, quiet = quiet)
39: rmarkdown::render(file, encoding = encoding, quiet = quiet, envir = globalenv(), output_dir = getwd(), ...)
40: vweave_rmarkdown(...)
41: engine$weave(file, quiet = quiet, encoding = enc)
42: doTryCatch(return(expr), name, parentenv, handler)
43: tryCatchOne(expr, names, parentenv, handlers[[1L]])
44: tryCatchList(expr, classes, parentenv, handlers)
45: tryCatch({ engine$weave(file, quiet = quiet, encoding = enc) setwd(startdir) output <- find_vignette_product(name, by = "weave", engine = engine) if (!have.makefile && vignette_is_tex(output)) { texi2pdf(file = output, clean = FALSE, quiet = quiet) output <- find_vignette_product(name, by = "texi2pdf", engine = engine) } outputs <- c(outputs, output)}, error = function(e) { thisOK <<- FALSE fails <<- c(fails, file) message(gettextf("Error: processing vignette '%s' failed with diagnostics:\n%s", file, conditionMessage(e)))})
46: tools::buildVignettes(dir = ".", tangle = TRUE)
An irrecoverable exception occurred. R is aborting now ...
Segmentation fault (core dumped)
I was able to build with R 4.4, Bioc 3.19. Will try with 3.20 container.
Your package has been added to git.bioconductor.org to continue the pre-review process. A build report will be posted shortly. Please fix any ERROR and WARNING in the build report before a reviewer is assigned or provide a justification on why you feel the ERROR or WARNING should be granted an exception.
IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. All changes should be pushed to git.bioconductor.org moving forward. It is required to push a version bump to git.bioconductor.org to trigger a new build report.
Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to active your Bioconductor Git Credentials Account
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): scDiagnostics_0.99.0.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
[email protected]:packages/scDiagnostics to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: c812f48e7a068bfc6d0ac6653578cf18d4d7d4ff
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: macOS 12.7.1 Monterey: scDiagnostics_0.99.1.tar.gz Linux (Ubuntu 22.04.3 LTS): scDiagnostics_0.99.1.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
[email protected]:packages/scDiagnostics to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: 2455219c38148c30adae28bdf11ab2982d08a5b4
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: macOS 12.7.1 Monterey: scDiagnostics_0.99.2.tar.gz Linux (Ubuntu 22.04.3 LTS): scDiagnostics_0.99.2.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
[email protected]:packages/scDiagnostics to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: e4ddfd6b9de05731c56f89479292851fbbabd45d
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): scDiagnostics_0.99.3.tar.gz macOS 12.7.1 Monterey: scDiagnostics_0.99.3.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
[email protected]:packages/scDiagnostics to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: 81ce49044abd984190b8eb5381e3734f3fff55e1
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): scDiagnostics_0.99.4.tar.gz macOS 12.7.1 Monterey: scDiagnostics_0.99.4.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
[email protected]:packages/scDiagnostics to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: 5280d5240b366d6a5174f506e951cd7c0200c3d9
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "WARNINGS, skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): scDiagnostics_0.99.5.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
[email protected]:packages/scDiagnostics to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: 516db8afee63ef5a91fabf4e22dc7987d2626bc0
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: macOS 12.7.1 Monterey: scDiagnostics_0.99.6.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
[email protected]:packages/scDiagnostics to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: macOS 12.7.1 Monterey: scDiagnostics_0.99.6.tar.gz Linux (Ubuntu 22.04.3 LTS): scDiagnostics_0.99.6.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
[email protected]:packages/scDiagnostics to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
A reviewer has been assigned to your package for an indepth review. Please respond accordingly to any further comments from the reviewer.
Hi there, glad to see this coming through; here my first impressions. While these are mostly stylistic comments and should be simple enough to address, I feel the "crucial" bit is simplifying examples (see below). I think these should let the user have a quick peak at how to execute the function, what it returns etc., but they currently take quite some time to execute, and require installing most "suggested" packages... the vignette can be used for a more lengthly (biologically meaningful) demo, which it does nicely already. Please report back with what has/not been address how/why not, and any questions you might have; thanks!
misc
-
[ ] please consider implementing unit test; we strongly encourage them!
-
[ ] consider adding a package help page (accessible via
?scDiagnostics) with an overview of key functionality and links to main functions -
[ ] there are some inconsistencies in vertical code alignment vs. 4-space tab indentation; this is generally fine, however, to ease readability, it'd be nice to stick to an 80-character limit in examples especially, specifically when variable & function names are long (e.g., the
calculateAveragePairwiseCorrelationexample doesn't show with my standard RStudio panel sizing) -
[ ] it would be nice to start out with a clean-ish CHECK/BiocCheck report (tends to only get worse over time...); in most cases, R CMD CHECK NOTES stating "no visible binding" can be addressed by using
aes(.data$x, ...)(instead ofaes(x)) and adding@importFrom ggplot2 .data
code
-
[ ] many examples have quite long runtimes (>5s & up to 1min), largely attributable to loading the data (from online) and some heavy-ish computations; perhaps worth considering adding example data under
inst/extdata, alongside a piece of documentation and a script underinst/scriptsthat recapitulates how that was retrieved/subsetted/processed? I personally find it quite tedious to have to download something and wait many seconds to see how a function works/what it returns (unless this is really necessary)... e.g., stashing a reference & query scRNA-seq data (subset) and correspondingSingleRresults would suffice for most examples, I think? And then, perhaps, only showing the full workflow (normalization, transfer, selection, reduction, ...) for selected examples but not, say, every visualization? Importantly, this would also help to not require users to install variousSuggests:to run examples (c.f.scuttle/scaterexamples which rely on toy data mostly) -
[ ] avoid
1:...; useseq_len/along()instead -
[ ] not sure where all this applies, but when matching an argument against a predefined set of character strings (e.g.,
plot.plotGeneSetScores.R,projectPCA.Retc.), code can be simplified usingmatch.arg()and (optionally)switch(), e.g.:
foo <- \(se_object, method=c("PCA", "TSNE", "UMAP")) {
arg <- match.arg(arg) # aleady checks validity
mtx <- switch(arg,
PCA={...}, # do something special
reducedDim(obj, arg) # otherwise just this
})
...
}
-
[ ] not sure where/if this would break any function, but maybe worth adding a more validity checks on "simple" function arguments everywhere (e.g., is scalar, is logical/character/numeric etc.); e.g., passing 2 values to
xxx_colinplotQCvsAnnotation()would break line 114 and the plotting due to renaming at line 117 -
[ ] fwiw...
# plotQCvsAnnotation.R:
qc_stats <- colData(query_data)[, qc_col]
cell_type_scores <- colData(query_data)[, score_col]
cell_labels <- colData(query_data)[[label_col]]
data <- data.frame(
QCStats = qc_stats,
Scores = cell_type_scores,
CellType = cell_labels)
# can be written as...
qc_stats <- query_data[[qc_col]]
cell_type_scores <- query_data[[score_col]]
cell_labels <- query_data[[label_col]]
data <- ...
# or...
cols <- c(qc_col, score_col, label_col)
data <- colData(query_data)[, cols]
names(data) <- c("QCStats", "Scores", "CellType")
- preallocate-and-fill should try and use
v/m/lapply()rather than afor-loop; e.g.,calculateHotellingPValue():
# Create a list to store p-values for each cell type
p_values <- rep(NA, length(unique_cell_types))
names(p_values) <- unique_cell_types
for (cell_type in unique_cell_types) {
# Subset principal component scores for current cell type
ref_subset_scores <- pca_output$ref[which(cell_type == reference_data[[ref_cell_type_col]]), pc_subset]
query_subset_scores <- pca_output$query[which(cell_type == query_data[[query_cell_type_col]]), pc_subset]
# Calculate the p-value
hotelling_output <- Hotelling::hotelling.test(x = ref_subset_scores, y = query_subset_scores)
# Store the result
p_values[cell_type] <- hotelling_output$pval
}
# can be written as... (mock code)
names(unique_cell_types) <- unique_cell_types
idx_ref <- split(ncol(ref), ref[[ref_typ_col]])
idx_que <- split(ncol(que), que[[que_typ_col]])
p_values <- vapply(unique_cell_types, \(.) {
ref_sco <- pca_out$ref[idx_ref[[.]], pc_subset]
que_sco <- pca_out$que[idx_que[[.]], pc_subset]
hotelling_output <- Hotelling::hotelling.test(x=ref_sco, y=que_sco)
return(hotelling_output$pval)
}, numeric(1))
docs
- [ ] repeating affiliations can be implemented as:
author:
- name: Peter Pan
affiliation:
- &foo Neverland
- name: Captain Hook
affiliation:
- *foo
-
[ ] please replace GitHub by normal Bioc installation instructions, i.e.,
BiocManager::install("ccb-hms/scDiagnostics")>"scDiagnostics"(also in the README!) -
[ ] code chunk at lines 320+ is largely commented out code; also line 842? Commented out code should be removed for release
The following are mostly stylistic comments, which I'm aware are not strictly required; however, I feel that proper usage of text-style options and code formatting aren't negligible for legibility...
-
[ ] you can hyperlink external packages using
BiocStylefunctions, e.g., "[remotes](https://...)" > "`r BiocStyle::CRANpkg("remotes")`" and "AUCell" > "`r BiocStyle::Biocpkg("AUCell")`" -
[ ] "scDiagnostics" (the package) is sometimes in code-style and sometimes in plain text; be consistent
-
[ ] when introducing example datasets (lines 102+ and 165+), perhaps worth hyperlinking to the source and/or DOI and/or adding proper citations (@AuthorYear), to make things easier for the reader?
-
[ ] to ease readability, I'd suggests using code-style (and maybe parentheses) to distinguish variable/class/slot/function/argument etc. names from plain text; e.g., "colData of" >
colData"plotGeneExpressionDimred" >plotGeneExpressionDimred() -
[ ] for readability, especially in the vignette, I'd strongly recommend sticking to the recommended 80-character limit and refraining from vertical code alignment (using 4-space tab indentation instead), as the latter introduces a lot of superfluous white space, especially when variable and function names are lengthy; note that this will mostly be displayed online (in the browser), so it's nice to not require horizontal scrolling (e.g., code chunk at lines 434+ is hard to parse)
Thank you so much, @HelenaLC, for your prompt and thorough review - these comments are very useful and addressing them will only make the package better.
I was already aware of some of the issues you brought up and had been working on them for a while, so I hope to be able to submit the update shortly with all these points addressed.
I will let you know if there is anything I would like to consult with you before submitting the updated package for your follow-up review.
Any update on this? Thanks.
Hi @HelenaLC, thanks for reaching out. I recently came back from the BioC2024 conference in Grand Rapids (MI). Ludwig and I will meet shortly to discuss some final touch-ups based on the feedback we received before re-submitting the package for final review.
From my end, I incorporated all of your feedback and improved the functionality of the package further. I am also in the process of finalizing the multiple vignettes for the package (one "Get Started" and a few specialized vignettes that bunches functions with a similar purpose together).
I expect that I will submit the revised version of the package next week by Friday latest. I will keep in touch and notify you when I submit to provide some specifications/details of the revised submission.
Hi @HelenaLC. Just letting you know I will be submitting next week, by Friday latest. Sorry for the delay and thanks for your patience on this!
Received a valid push on git.bioconductor.org; starting a build for commit id: dbfc2e5cf1656506243ce4d969c2b611cf166c87
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): scDiagnostics_0.99.7.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
[email protected]:packages/scDiagnostics to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Hi @HelenaLC , I'm glad to finally announce that I submitted the revised version of scDiagnostics, in which I addressed all of your comments to the best of my knowledge.
misc:
- I have now added datasets to the package that have been curated/processed and the user can now use them to test the functions in a much more accessible manner (they already contain all the data/metadata to run the examples, i.e. the reducedDims data, the colData, etc). So this should also have simplified the examples for each function by reducing the preamble code of the examples.
- I have implemented unit tests for every single exported function in the package, including the S3 plot methods.
- I have added a ?scDiagnostics help page which now contains an overview of key functionality and also some links to main functions.
- I have fixed most of the 4-spac tab indentation issues, although some remain.
- I fixed all "no visible binding" issues through the use of the "rlang" package.
code:
- I have fixed the code examples such that all are executed below 10 seconds on any reasonable machine.
- I have removed the use of 1:... whenever possible. Some remain where the code would be odd if seq_len()/along() would be used.
- I have used match.arg() whenever possible to match an argument against a predefined set of character string.
- On top of adding more validity checks, I have created an internal argumentCheck() function that is recycled for almost every function in the package to check every single argument passed by the user.
docs:
- Please see the pkgdown website for an overview of the package docs.
- For now, since scDiagnostics is not on Bioconductor yet, I still recommend to install the package from the ccb-hms GitHub repository, but as soon as you let me know I can change it to Bioconductor, I will do so.
- Note that there are now seven vignettes, including one "Getting Started with scDiagnostics". Each of the other vignettes bunch the functions together around a common theme. In the pkgdown website I also organized the functions in the Reference tab by common purpose/theme.
- I added hyperlinks whenever possible throughout the documentation.
- I added citations for the documentation of the datasets, also also references/citations for the key statistical methods/implementations used throughout the package.
- I incorporated BioC-style code as much as possible, including style to distinguish variables, classes, functions, slots, etc.
- As mentioned I tried to fix the 80-character limit for each line as much as possible. There are still some improvements that can be made, for for the sake of not making some of the functions too long, I kept some of these long lines as is (but let me know if you would still prefer to prioritize line length over function length).
Thanks again for your thorough review and all your help on this. I look forward to your feedback and eventual publication of this work on Bioconductor.
Thanks Anthony, it's come along nicely -- build report is much cleaner, very much appreciate putting validity checks & unit tests in place, and examples run a whole lot better (with relevant data in place) -- couple minor points below ... also, not sure I'm missing something, but I don't see any of my previous comments under docs having been addressed? E.g., link to datasets, BiocStyle, affiliations -- all still looks much the same to me?
-
[ ]
data-rawscripts should go underinst/script/according to guidelines -
[ ] re
seq_len/along()-- I see your point with1:10(fixed number), but something like1:n_top_varsshould useseq_len(); please double check for such cases (package-wide search for "1:[a-z]" should do the trick) -
[ ] re
match.arg()-- you can avoidif (!x %in% c(...)), this is already checked for (e.g.,plotGeneSetScores() lines 51+) ... also here, for dimension reductions, might be worth adding one check that it's inreducedDimNames()upfront rather than 3 independent checks & stop messages... -
(this isn't a real review comment, rather perhaps helping with better coding practice in the future :) to given one example, in
calculateHotellingPValue(), when you repeat the same (long) sequence of characters > 2 times, code is a lot less cluttered using an intermediate variable... e.g.,
observed_t2 <- hotellingT2(
as.matrix(cell_list[[cell_type]][ref_ind, paste0("PC", pc_subset)]),
as.matrix(cell_list[[cell_type]][!ref_ind, paste0("PC", pc_subset)]))
...
perm_t2[perm_id] <- hotellingT2(
as.matrix(cell_list[[cell_type]][ref_sample_id, paste0("PC", pc_subset)]),
as.matrix(cell_list[[cell_type]][-ref_sample_id, paste0("PC", pc_subset)]))
# could be:
x <- as.matrix(cell_list[[cell_type]][, paste0("PC", pc_subset)])
observed_t2 <- hotellingT2(x[ref_ind, ], x[!ref_ind, ])
...
perm_t2[perm_id] <- hotellingT2(x[ref_sample_id, ], x[-ref_sample_id, ])