Contributions icon indicating copy to clipboard operation
Contributions copied to clipboard

CDI

Open jfanglovestats opened this issue 4 years ago • 50 comments
trafficstars

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

  • Repository: https://github.com/jichunxie/CDI

Confirm the following by editing each check box to '[x]'

  • [x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.

  • [x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.

  • [x] I understand Bioconductor Package Naming Policy and acknowledge Bioconductor may retain use of package name.

  • [x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.

  • [x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.

  • [x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.

  • [x] I am familiar with the Bioconductor code of conduct and agree to abide by it.

I am familiar with the essential aspects of Bioconductor software management, including:

  • [x] The 'devel' branch for new packages and features.
  • [x] The stable 'release' branch, made available every six months, for bug fixes.
  • [x] Bioconductor version control using Git (optionally via GitHub).

For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.

jfanglovestats avatar Sep 17 '21 21:09 jfanglovestats

Hi @jfanglovestats

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: CDI
Version: 0.99.0
Date: 2021-07-06
Title: Clustering Deviation Index (CDI)
Description: Single-cell RNA-sequencing (scRNA-seq) is widely used to explore cellular variation. The analysis of scRNA-seq data often starts from clustering cells into subpopulations. This initial step has a high impact on downstream analyses, and hence it is important to be accurate. However, there have not been unsupervised metric designed for scRNA-seq to evaluate clustering performance. Hence, we propose clustering deviation index (CDI), an unsupervised metric based on the modeling of scRNA-seq UMI counts to evaluate clustering of cells.  
Authors@R: c(person("Jiyuan", "Fang", email = "[email protected]", role = c("cre", "aut")), person("Jichun", "Xie", email = "[email protected]", role = c("ctb")))
biocViews: SingleCell, Software, Clustering, Visualization, Sequencing
URL: https://gitlab.oit.duke.edu/jichunxie/xie-lab-software_cdi
BugReports: https://gitlab.oit.duke.edu/jichunxie/xie-lab-software_cdi/-/issues
Imports: matrixStats, Seurat, stats, BiocParallel, ggplot2, reshape2, stringr, grDevices, ggsci
RoxygenNote: 7.1.1
Depends: R(>= 3.4)
Suggests: knitr, rmarkdown
VignetteBuilder: knitr
License: GPL-3
LazyData: true
Encoding: UTF-8

bioc-issue-bot avatar Sep 17 '21 21:09 bioc-issue-bot

A reviewer has been assigned to your package. Learn what to expect during the review process.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. It is required to push a version bump to git.bioconductor.org to trigger a new build.

Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to active your Bioconductor Git Credentials Account

bioc-issue-bot avatar Sep 22 '21 18:09 bioc-issue-bot

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/CDI to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot avatar Sep 22 '21 18:09 bioc-issue-bot

Received a valid push on git.bioconductor.org; starting a build for commit id: c68dbd7561e63ea73954a06cd3f7b0e2d71ba312

bioc-issue-bot avatar Oct 03 '21 00:10 bioc-issue-bot

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/CDI to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot avatar Oct 03 '21 00:10 bioc-issue-bot

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/CDI to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot avatar Oct 05 '21 17:10 bioc-issue-bot

Hi @jfanglovestats

Thanks for submitting CDI :tada:! Below is my review of your package. Please reply here if anything is unclear or needs any further explanation.

What next?

Please address the comments as best as you can. When you are ready for me to check the package again please reply to let me know with a summary of changes you have made or any other responses.

Luke

Review

Key: :rotating_light: Required :warning: Recommended :green_circle: Optional

General package development

  • [ ] :warning: Please check the notes from BiocCheck::BiocCheck() in the build report and address as many as possible

DESCRIPTION file

  • [ ] :green_circle: It is recommended to add ORCiDs to the Authors@R field
  • [ ] :warning: Bioconductor recommends setting LazyData: FALSE
  • [ ] :warning: Bioconductor recommends depending on the current R version (R (>=4.1))
  • [ ] :green_circle: You might also want to add "RNASeq" to your biocViews field

NAMESPACE file

  • [ ] :rotating_light: Function names should be camelCase (starting with a lowercase letter) or snake_case. UpperCamelCase is usually reserved for object classes. (CDILinePlot() can be an exception because the starting captial is part of an acronym)
  • [ ] :rotating_light: Please only import the required functions from {ggplot2} and {ggsci} using importFrom() rather then the whole packages using import()

Package data

  • [ ] :rotating_light: The dataset documentation is very minimal. Please make sure it includes what the data is, where it comes from and the data structure.
  • [ ] :green_circle: It is common to put all the data documentation in a single data.R file. Not essential but might make things a bit tidier.

Documentation

Vignette

  • [ ] :rotating_light: Please use the BiocStyle package template for formatting
  • [ ] :rotating_light: Please rename the Abstract section to Introduction
  • [ ] :rotating_light: Please add an Installation section with Bioconductor installation instructions. This code chunk should NOT be evaluated.
  • [ ] :rotating_light: Please delete the .html file from the vignettes/ directory
  • [ ] :warning: It is recommended to only use a single core in vignettes as you don't know what system the user will be running it on.
  • [ ] :green_circle: There are a lot of blank lines in the .Rmd file, you might want to tidy some of these up.

Man pages

  • [ ] :warning: It is recommended to add a package man page
  • [ ] :rotating_light: The first line of your {roxygen2} function documentation is the function title. This should be one (or a few) words naming the function rather than a whole sentence.
  • [ ] :green_circle: Your internal functions currently don't have any documentation. This doesn't affect users but it can be very useful to have some for maintaining the package.
  • [ ] :green_circle: You might want to think about using the @details tag in your function documentation when you want to explain more about how the function works or what parameters do (it is great to have all this information though :+1:)

Unit tests

  • [ ] :rotating_light: There are currently no unit tests. Please add some to the package.

Code

R

  • [ ] :warning: Please check that you use names for indexing rather than numbers whenever possible
  • [ ] :rotating_light: Please use <- instead of = for assignment in the code, examples and vignette
  • [ ] :rotating_light: Please use message(), warning() or stop() where appropriate instead of cat() or print()
  • [ ] :warning: Currently the package does not interact with any standard Bioconductor data structures. I would strongly recommend supporting the SingleCellExperiment object which is the standard Bioconductor container for storing scRNA-seq data. This would make your package compatible with other scRNA-seq analysis packages. You might also want to consider storing the example data for the vignette as a SingleCellExperiment.
  • [ ] :warning: It is recommended to check that arguments for exported functions are valid. This can avoid a lot of issues for users.
  • [ ] :rotating_light: I think the for loops in CalculateCDI() and CandidateBenchmarkHeatmap() can be replaced with an apply function (please check for others as well)
  • [ ] :rotating_light: Rather than having an argument for the number of cores you should let users supply a BiocParallel::Param() object directly. This lets them have more control over how the parallelisation is done.
  • [ ] :green_circle: I found some of the code confusing because of long lines, complex statements in conditionals and nested if statements. You may want to see if any of it can be rewritten to be a bit clearer. For example in CalculateCDI() all of the code is inside an if statement where the else part just prints a message. It would be clearer to have the argument check at the top of the function with an error if it is incorrect, then the rest of the function could follow (I can explain more if that doesn't make sense).

lazappi avatar Oct 07 '21 18:10 lazappi

Hi Luke,

Thanks for your detailed comments. I will work on them soon and reply to you.

Best, Jiyuan

From: Luke Zappia @.> Reply-To: Bioconductor/Contributions @.> Date: Thursday, October 7, 2021 at 2:11 PM To: Bioconductor/Contributions @.> Cc: jfanglovestats @.>, Mention @.***> Subject: Re: [Bioconductor/Contributions] CDI (#2297)

Hi @jfanglovestatshttps://github.com/jfanglovestats

Thanks for submitting CDI 🎉! Below is my review of your package. Please reply here if anything is unclear or needs any further explanation.

What next?

Please address the comments as best as you can. When you are ready for me to check the package again please reply to let me know with a summary of changes you have made or any other responses.

Luke

Review

Key: 🚨 Required ⚠️ Recommended 🟢 Optional

General package development

  • ⚠️ Please check the notes from BiocCheck::BiocCheck() in the build report and address as many as possible

DESCRIPTION file

  • 🟢 It is recommended to add ORCiDs to the @.*** field
  • ⚠️ Bioconductor recommends setting LazyData: FALSE
  • ⚠️ Bioconductor recommends depending on the current R version (R (>=4.1))
  • 🟢 You might also want to add "RNASeq" to your biocViews field

NAMESPACE file

  • 🚨 Function names should be camelCase (starting with a lowercase letter) or snake_case. UpperCamelCase is usually reserved for object classes. (CDILinePlot() can be an exception because the starting captial is part of an acronym)
  • 🚨 Please only import the required functions from {ggplot2} and {ggsci} using importFrom() rather then the whole packages using import()

Package data

  • 🚨 The dataset documentation is very minimal. Please make sure it includes what the data is, where it comes from and the data structure.
  • 🟢 It is common to put all the data documentation in a single data.R file. Not essential but might make things a bit tidier.

Documentation Vignette

  • 🚨 Please use the BiocStyle package template for formatting
  • 🚨 Please rename the Abstract section to Introduction
  • 🚨 Please add an Installation section with Bioconductor installation instructions. This code chunk should NOT be evaluated.
  • 🚨 Please delete the .html file from the vignettes/ directory
  • ⚠️ It is recommended to only use a single core in vignettes as you don't know what system the user will be running it on.
  • 🟢 There are a lot of blank lines in the .Rmd file, you might want to tidy some of these up.

Man pages

  • ⚠️ It is recommended to add a package man page
  • 🚨 The first line of your {roxygen2} function documentation is the function title. This should be one (or a few) words naming the function rather than a whole sentence.
  • 🟢 Your internal functions currently don't have any documentation. This doesn't affect users but it can be very useful to have some for maintaining the package.
  • 🟢 You might want to think about using the @details tag in your function documentation when you want to explain more about how the function works or what parameters do (it is great to have all this information though 👍)

Unit tests

  • 🚨 There are currently no unit tests. Please add some to the package.

Code R

  • ⚠️ Please check that you use names for indexing rather than numbers whenever possible
  • 🚨 Please use <- instead of = for assignment in the code, examples and vignette
  • 🚨 Please use message(), warning() or stop() where appropriate instead of cat() or print()
  • ⚠️ Currently the package does not interact with any standard Bioconductor data structures. I would strongly recommend supporting the SingleCellExperiment object which is the standard Bioconductor container for storing scRNA-seq data. This would make your package compatible with other scRNA-seq analysis packages. You might also want to consider storing the example data for the vignette as a SingleCellExperiment.
  • ⚠️ It is recommended to check that arguments for exported functions are valid. This can avoid a lot of issues for users.
  • 🚨 I think the for loops in CalculateCDI() and CandidateBenchmarkHeatmap() can be replaced with an apply function (please check for others as well)
  • 🚨 Rather than having an argument for the number of cores you should let users supply a BiocParallel::Param() object directly. This lets them have more control over how the parallelisation is done.
  • 🟢 I found some of the code confusing because of long lines, complex statements in conditionals and nested if statements. You may want to see if any of it can be rewritten to be a bit clearer. For example in CalculateCDI() all of the code is inside an if statement where the else part just prints a message. It would be clearer to have the argument check at the top of the function with an error if it is incorrect, then the rest of the function could follow (I can explain more if that doesn't make sense).

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/Bioconductor/Contributions/issues/2297#issuecomment-938035098, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AE2C5LYZZYC2A5FAA35QOPLUFXPDXANCNFSM5EIMMWMQ. Triage notifications on the go with GitHub Mobile for iOShttps://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Androidhttps://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

jfanglovestats avatar Oct 07 '21 18:10 jfanglovestats

Hi Luke,

Thanks for the review of CDI package. I almost finish revising the package but do have one problem. You mentioned:

🚨 Rather than having an argument for the number of cores you should let users supply a BiocParallel::Param() object directly. This lets them have more control over how the parallelisation is done.

I changed my main function calculate_CDI to have one argument of bioc_parallel_obj:

calculate_CDI( sub_gcmat, cand_lab_df, cell_size_factor, bioc_parallel_obj = MulticoreParam(1), batch_label = NULL, lrt_pval_threshold = 0.01, clustering_method = NULL )

The problem is, if I restart R, only library(CDI) not library(BiocParallel), then the function only works if I don’t change the argument bioc_parallel_obj. If I try to change it to, for example, MulticoreParam(2), it shows

Error in h(simpleError(msg, call)) : error in evaluating the argument 'BPPARAM' in selecting a method for function 'bplapply': could not find function "MulticoreParam".

The BiocParallel is in DESCRIPTION Imports. Adding it to “Suggests” doesn’t help. Do you have any idea how to fix it?

Best, Jiyuan

From: Luke Zappia @.> Reply-To: Bioconductor/Contributions @.> Date: Thursday, October 7, 2021 at 2:11 PM To: Bioconductor/Contributions @.> Cc: jfanglovestats @.>, Mention @.***> Subject: Re: [Bioconductor/Contributions] CDI (#2297)

Hi @jfanglovestatshttps://github.com/jfanglovestats

Thanks for submitting CDI 🎉! Below is my review of your package. Please reply here if anything is unclear or needs any further explanation.

What next?

Please address the comments as best as you can. When you are ready for me to check the package again please reply to let me know with a summary of changes you have made or any other responses.

Luke

Review

Key: 🚨 Required ⚠️ Recommended 🟢 Optional

General package development

  • ⚠️ Please check the notes from BiocCheck::BiocCheck() in the build report and address as many as possible

DESCRIPTION file

  • 🟢 It is recommended to add ORCiDs to the @.*** field
  • ⚠️ Bioconductor recommends setting LazyData: FALSE
  • ⚠️ Bioconductor recommends depending on the current R version (R (>=4.1))
  • 🟢 You might also want to add "RNASeq" to your biocViews field

NAMESPACE file

  • 🚨 Function names should be camelCase (starting with a lowercase letter) or snake_case. UpperCamelCase is usually reserved for object classes. (CDILinePlot() can be an exception because the starting captial is part of an acronym)
  • 🚨 Please only import the required functions from {ggplot2} and {ggsci} using importFrom() rather then the whole packages using import()

Package data

  • 🚨 The dataset documentation is very minimal. Please make sure it includes what the data is, where it comes from and the data structure.
  • 🟢 It is common to put all the data documentation in a single data.R file. Not essential but might make things a bit tidier.

Documentation Vignette

  • 🚨 Please use the BiocStyle package template for formatting
  • 🚨 Please rename the Abstract section to Introduction
  • 🚨 Please add an Installation section with Bioconductor installation instructions. This code chunk should NOT be evaluated.
  • 🚨 Please delete the .html file from the vignettes/ directory
  • ⚠️ It is recommended to only use a single core in vignettes as you don't know what system the user will be running it on.
  • 🟢 There are a lot of blank lines in the .Rmd file, you might want to tidy some of these up.

Man pages

  • ⚠️ It is recommended to add a package man page
  • 🚨 The first line of your {roxygen2} function documentation is the function title. This should be one (or a few) words naming the function rather than a whole sentence.
  • 🟢 Your internal functions currently don't have any documentation. This doesn't affect users but it can be very useful to have some for maintaining the package.
  • 🟢 You might want to think about using the @details tag in your function documentation when you want to explain more about how the function works or what parameters do (it is great to have all this information though 👍)

Unit tests

  • 🚨 There are currently no unit tests. Please add some to the package.

Code R

  • ⚠️ Please check that you use names for indexing rather than numbers whenever possible
  • 🚨 Please use <- instead of = for assignment in the code, examples and vignette
  • 🚨 Please use message(), warning() or stop() where appropriate instead of cat() or print()
  • ⚠️ Currently the package does not interact with any standard Bioconductor data structures. I would strongly recommend supporting the SingleCellExperiment object which is the standard Bioconductor container for storing scRNA-seq data. This would make your package compatible with other scRNA-seq analysis packages. You might also want to consider storing the example data for the vignette as a SingleCellExperiment.
  • ⚠️ It is recommended to check that arguments for exported functions are valid. This can avoid a lot of issues for users.
  • 🚨 I think the for loops in CalculateCDI() and CandidateBenchmarkHeatmap() can be replaced with an apply function (please check for others as well)
  • 🚨 Rather than having an argument for the number of cores you should let users supply a BiocParallel::Param() object directly. This lets them have more control over how the parallelisation is done.
  • 🟢 I found some of the code confusing because of long lines, complex statements in conditionals and nested if statements. You may want to see if any of it can be rewritten to be a bit clearer. For example in CalculateCDI() all of the code is inside an if statement where the else part just prints a message. It would be clearer to have the argument check at the top of the function with an error if it is incorrect, then the rest of the function could follow (I can explain more if that doesn't make sense).

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/Bioconductor/Contributions/issues/2297#issuecomment-938035098, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AE2C5LYZZYC2A5FAA35QOPLUFXPDXANCNFSM5EIMMWMQ. Triage notifications on the go with GitHub Mobile for iOShttps://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Androidhttps://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

jfanglovestats avatar Nov 03 '21 17:11 jfanglovestats

I think you probably need to prefix the function with the package name (eg. bioc_parallel_obj = BiocParallel::MulticoreParam(1)). A couple of other minor suggestions:

  • I would use SerialParam() as the default rather than MulticoreParam(1), this is probably the clearest way and should avoid any overhead with setting up parallel processing when you are only using one core
  • Arguments for this are commonly called BPPARAM rather than bioc_parallel_obj, not a big deal but might help for consistency

lazappi avatar Nov 04 '21 15:11 lazappi

Received a valid push on git.bioconductor.org; starting a build for commit id: 821426a5613f5569af14d39d0ef4c3c4f3961ba2

bioc-issue-bot avatar Dec 06 '21 21:12 bioc-issue-bot

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/CDI to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot avatar Dec 06 '21 21:12 bioc-issue-bot

@lazappi The windows ERROR is on Bioconductor side. Please ignore this ERROR when re-reviewing the package since it does not appear on the other platforms.

lshep avatar Dec 07 '21 13:12 lshep

Hi Luke,

Thanks a lot for your helpful comments and suggestions! We followed them to revise our package. Here we provide a point-to-point response document. Our responses are colored as blue under your comment. If the comment is addressed, we check the box in front of the comment.

Thanks!

Best, @jfanglovestats

Review

Key: 🚨 Required ⚠️ Recommended 🟢 Optional ☑️Revised in the code ✍️reply to the comments

General package development

  • ☑️⚠️ Please check the notes from BiocCheck::BiocCheck() in the build report and address as many as possible DESCRIPTION file

  • ✍️🟢 It is recommended to add ORCiDs to the @.*** field Response: Some of the authors don’t have ORCiDs.

  • ✍️ ⚠️ Bioconductor recommends setting LazyData: FALSE Response: As our datasets are only used in vignette, changing to” LazyData: FALSE” generate warning in “check()” as the following: Variables with usage in documentation object 'one_batch_matrix' but not in code: ‘one_batch_matrix’.

  • ☑️⚠️ Bioconductor recommends depending on the current R version (R (>=4.1))

  • ☑️🟢 You might also want to add "RNASeq" to your biocViews field NAMESPACE file

  • ☑️ 🚨 Function names should be camelCase (starting with a lowercase letter) or snake_case. UpperCamelCase is usually reserved for object classes. (CDILinePlot() can be an exception because the starting capitial is part of an acronym)

  • ☑️ 🚨 Please only import the required functions from {ggplot2} and {ggsci} using importFrom() rather than the whole packages using import() Package data

  • ☑️🚨 The dataset documentation is very minimal. Please make sure it includes what the data is, where it comes from and the data structure. Response: added a @Note tag to describe the simulation setting; included more information in the @format tag.

  • ☑️🟢 It is common to put all the data documentation in a single data.R file. Not essential but might make things a bit tidier. Documentation Vignette

  • ☑️🚨 Please use the BiocStyle package template for formatting

  • ☑️🚨 Please rename the Abstract section to Introduction

  • ☑️🚨 Please add an Installation section with Bioconductor installation instructions. This code chunk should NOT be evaluated.

  • ☑️🚨 Please delete the .html file from the vignettes/ directory

  • ☑️⚠️ It is recommended to only use a single core in vignettes as you don't know what system the user will be running it on.

  • ☑️🟢 There are a lot of blank lines in the .Rmd file, you might want to tidy some of these up. Man pages

  • ☑️⚠️ It is recommended to add a package man page

  • ☑️🚨 The first line of your {roxygen2} function documentation is the function title. This should be one (or a few) words naming the function rather than a whole sentence.

  • ☑️✍️🟢 Your internal functions currently don't have any documentation. This doesn't affect users but it can be very useful to have some for maintaining the package. Response: We added comment lines at the beginning for each internal function to describe what the function is written for.

  • ✍️🟢 You might want to think about using the @details tag in your function documentation when you want to explain more about how the function works or what parameters do (it is great to have all this information though 👍)

  • Response: Thanks for your suggestion! I didn’t do it because @details will put information of all parameters together as paragraphs. I would hope to put the description of each parameter separately. Unit tests

  • ☑️🚨 There are currently no unit tests. Please add some to the package. Code R

  • ☑️✍️ ⚠️ Please check that you use names for indexing rather than numbers whenever possible Response: heatmap axes have been changed from 1,2,…, 5 to cluster1, cluster2, …, cluster5. For input labels, as we deal with standard outputs from clustering methods, which are usually number indices, we still allow such inputs.

  • ☑️🚨 Please use <- instead of = for assignment in the code, examples and vignette

  • ☑️🚨 Please use message(), warning() or stop() where appropriate instead of cat() or print()

  • ☑️⚠️ Currently the package does not interact with any standard Bioconductor data structures. I would strongly recommend supporting the SingleCellExperiment object which is the standard Bioconductor container for storing scRNA-seq data. This would make your package compatible with other scRNA-seq analysis packages. You might also want to consider storing the example data for the vignette as a SingleCellExperiment. Response: the main functions calculate_CDI() and feature_gene_selection() now allow inputs of Seurat object.

  • ☑️⚠️ It is recommended to check that arguments for exported functions are valid. This can avoid a lot of issues for users.

  • ☑️✍️🚨 I think the for loops in CalculateCDI() and CandidateBenchmarkHeatmap() can be replaced with an apply function (please check for others as well) Response: I changed for loop to apply for CalculateCDI(). For CandidateBenchmarkHeatmap(), the for loop is much easier to write because each step I need to use the result from the previous step. In addition, as mentioned in https://www.bioconductor.org/developers/how-to/efficient-code/, if the code takes a fraction of a second to evaluate, there is no sense in trying for further improvement.

  • ☑️✍️🚨 Rather than having an argument for the number of cores you should let users supply a BiocParallel::Param() object directly. This lets them have more control over how the parallelization is done. More: I think you probably need to prefix the function with the package name (eg. bioc_parallel_obj = BiocParallel::MulticoreParam(1)). A couple of other minor suggestions:

 *   I would use SerialParam() as the default rather than MulticoreParam(1), this is probably the clearest way and should avoid any overhead with setting up parallel processing when you are only using one core
 *   Arguments for this are commonly called BPPARAM rather than bioc_parallel_obj, not a big deal but might help for consistency

Response: To pass BiocParallel::Param() as an argument of our function calculate_CDI(), users need to first install BiocParallel. Such implicit dependence might lead to undesired error messages. To make the parallel setting flexible, I added “…” after “ncore” to pass additional arguments to MulticoreParam(). User can still control the parallel computing flexibly, and they do not need to install BiocParallel outside. When MulticoreParam is created/used in Windows it defaults to SerialParam which is the equivalent of using a single worker. (https://www.rdocumentation.org/packages/BiocParallel/versions/1.6.2/topics/MulticoreParam-class). I have changed the name to BPPARAM.

  • ☑️🟢 I found some of the code confusing because of long lines, complex statements in conditionals and nested if statements. You may want to see if any of it can be rewritten to be a bit clearer. For example in CalculateCDI() all of the code is inside an if statement where the else part just prints a message. It would be clearer to have the argument check at the top of the function with an error if it is incorrect, then the rest of the function could follow (I can explain more if that doesn't make sense).

jfanglovestats avatar Dec 07 '21 13:12 jfanglovestats

Hi @jfanglovestats. This is looking pretty good but I have a few more comments.

  • It is great to support Seurat objects but {Seurat} is not a Bioconductor package. To be compatible with the Bioconductor ecosystem you should really support SingleCellExperiment objects.
  • I'm not sure I understand your argument. If you are already using MulticoreParam you should be depending on {BiocParallel} so users will already have it installed (I'm not sure how checks are currently passing without you importing {BiocParallel}). Letting the user supply the param object means they can choose exactly how parallel processing is done, not just be limited to MultiCoreParam.
  • If your example dataset is simulated using R code it could be helpful to include the code itself in the documentation. Another way to do this is using a data-raw folder, see https://r-pkgs.org/data.html.
  • It would be great to have some more tests. Ideally every exported function should be tested for valid input as well as things you know should cause an error.
  • There is still a note from {BiocCheck} about long lines. Not a big deal but maybe worth checking to see if you can tidy any of those up.

lazappi avatar Dec 10 '21 15:12 lazappi

Hi @lazappihttps://github.com/lazappi,

Thanks a lot for your reply. May I ask some questions to clarify the comments?

  1. SingleCellExperiment object.

I tried to incorporate this type of object first. However, I met a problem when extracting the desired parts of the object. It seems SingleCellExperiment object has a flexible format, so different users could store the raw counts and gene names in different slots. For Seurat, it is quite standard to store the raw counts in @.@. Do you have some suggestions to address this problem without using Seurat?

  1. MulticoreParam. Sorry I didn’t make it clear. Importing BiocParallel is necessary. What I am concerned is the following. Suppose CDI has a function argument PARAM taking the BiocParallel object. Suppose a user has installed CDI package in their computer but haven’t run library(BiocParallel). Then the following code will generate an error:
Calculate_CDI(sub_gcmat = X, cand_lab_df = labs, BPPARAM = SerialParam())

Even though CDI has imported BiocParallel, passing SerialParam() as an argument needs the user to install and library BiocParallel package. What I wrote in the current version does not need users to explicitly “library(BioParallel)”. Users can still pass all MulticoreParam arguments to the function. For MulticoreParam, I use it because from BiocParallel document, it is “the most efficient and least troublesome way to parallelize code. MulticoreParam uses ’forked’ processes with ’copy-on-change’ semantics – memory is only copied when it is changed.” When it is not available (like for Windows or when ncore = 1), MulticoreParam dispatches to Serial Param.

  1. Include R code for simulation Thanks for your suggestions! I was looking for a place to put the codes for simulation.

  2. Long lines

Thanks for your suggestions. I will try to make them shorter. I use long variable names, so it might be hard to make all lines short.

Best, Jiyuan

From: Luke Zappia @.> Reply-To: Bioconductor/Contributions @.> Date: Friday, December 10, 2021 at 10:36 AM To: Bioconductor/Contributions @.> Cc: jfanglovestats @.>, Mention @.***> Subject: Re: [Bioconductor/Contributions] CDI (#2297)

Hi @jfanglovestatshttps://github.com/jfanglovestats. This is looking pretty good but I have a few more comments.

  • It is great to support Seurat objects but {Seurat} is not a Bioconductor package. To be compatible with the Bioconductor ecosystem you should really support SingleCellExperiment objects.
  • I'm not sure I understand your argument. If you are already using MulticoreParam you should be depending on {BiocParallel} so users will already have it installed (I'm not sure how checks are currently passing without you importing {BiocParallel}). Letting the user supply the param object means they can choose exactly how parallel processing is done, not just be limited to MultiCoreParam.
  • If your example dataset is simulated using R code it could be helpful to include the code itself in the documentation. Another way to do this is using a data-raw folder, see https://r-pkgs.org/data.html.
  • It would be great to have some more tests. Ideally every exported function should be tested for valid input as well as things you know should cause an error.
  • There is still a note from {BiocCheck} about long lines. Not a big deal but maybe worth checking to see if you can tidy any of those up.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/Bioconductor/Contributions/issues/2297#issuecomment-991072416, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AE2C5L75I774CKV4LL6QNYLUQIM7PANCNFSM5EIMMWMQ. Triage notifications on the go with GitHub Mobile for iOShttps://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Androidhttps://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

jfanglovestats avatar Dec 10 '21 17:12 jfanglovestats

I tried to incorporate this type of object first. However, I met a problem when extracting the desired parts of the object. It seems SingleCellExperiment object has a flexible format, so different users could store the raw counts and gene names in different slots. For Seurat, it is quite standard to store the raw counts in @.@. Do you have some suggestions to address this problem without using Seurat?

Expression matrices will be stored in the assays slot of a SingleCellExperiment. These can be accessed with assay(sce, "assay_name"). There are also special accessors for when the assay is called "counts" (counts(sce)) or "logcounts" (logcounts(sce)). The best practice is to let the user select which assay they want to use by providing an argument to your functions. You will first need to import the relevant functions from the {SingleCellExperiment} or {SummarziedExperiment} packages. The gene/feature names can be accessed simply using rownames(sce).

Sorry I didn’t make it clear. Importing BiocParallel is necessary. What I am concerned is the following. Suppose CDI has a function argument PARAM taking the BiocParallel object. Suppose a user has installed CDI package in their computer but haven’t run library(BiocParallel). Then the following code will generate an error: Calculate_CDI(sub_gcmat = X, cand_lab_df = labs, BPPARAM = SerialParam()) Even though CDI has imported BiocParallel, passing SerialParam() as an argument needs the user to install and library BiocParallel package. What I wrote in the current version does not need users to explicitly “library(BioParallel)”. Users can still pass all MulticoreParam arguments to the function. For MulticoreParam, I use it because from BiocParallel document, it is “the most efficient and least troublesome way to parallelize code. MulticoreParam uses ’forked’ processes with ’copy-on-change’ semantics – memory is only copied when it is changed.” When it is not available (like for Windows or when ncore = 1), MulticoreParam dispatches to Serial Param.

To avoid this error you should add the package prefix with BiocParallel::, so something like:

 Calculate_CDI <- function(sub_gcmat, cand_lab_df, BPPARAM = BiocParallel::SerialParam()) {
     ...
 }

That will let the user use the function without having to run library(BiocParallel) first (as long as it is included as a dependency and installed).

Hope that clears things up a bit. Please ask if you have any further questions.

lazappi avatar Dec 13 '21 08:12 lazappi

@jfanglovestats May we expect any updates soon? We like to see updates in a few weeks time to keep the submission processing moving forward.

lshep avatar Jan 14 '22 14:01 lshep

Hi Ishep,

Thanks for checking. I will update this week.

Best, Jiyuan

From: lshep @.> Reply-To: Bioconductor/Contributions @.> Date: Friday, January 14, 2022 at 9:45 AM To: Bioconductor/Contributions @.> Cc: jfanglovestats @.>, Mention @.***> Subject: Re: [Bioconductor/Contributions] CDI (#2297)

@jfanglovestatshttps://github.com/jfanglovestats May we expect any updates soon? We like to see updates in a few weeks time to keep the submission processing moving forward.

— Reply to this email directly, view it on GitHubhttps://github.com/Bioconductor/Contributions/issues/2297#issuecomment-1013188143, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AE2C5L74WTOETJMLYMYT7F3UWAZHLANCNFSM5EIMMWMQ. Triage notifications on the go with GitHub Mobile for iOShttps://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Androidhttps://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub. You are receiving this because you were mentioned.Message ID: @.***>

jfanglovestats avatar Jan 17 '22 03:01 jfanglovestats

Please request this issue be reopened when you are ready with changes and to continue with the review process.

lshep avatar Feb 04 '22 12:02 lshep

This issue is being closed because there has been no progress for an extended period of time. You may reopen the issue when you have the time to actively participate in the review / submission process. Please also keep in mind that a package accepted to Bioconductor requires a commitment on your part to ongoing maintenance.

Thank you for your interest in Bioconductor.

bioc-issue-bot avatar Feb 04 '22 12:02 bioc-issue-bot

@lshep Hope you are doing well. Sorry for the delay in updates. I was distracted by other stuff. I have updated the package to my GitHub repository according to the comments. Could you reopen the issue CDI #2297 so that I can resubmit it to Bioconductor?

jfanglovestats avatar Jun 29 '22 19:06 jfanglovestats

Dear @jfanglovestats ,

We have reopened the issue to continue the review process. Please remember to push a version bump to git.bioconductor.org to trigger a new build.

bioc-issue-bot avatar Jun 30 '22 11:06 bioc-issue-bot

Received a valid push on git.bioconductor.org; starting a build for commit id: 7a2929e02c1a04b75df4836b1268d134d59ebdd8

bioc-issue-bot avatar Jun 30 '22 14:06 bioc-issue-bot

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/CDI to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot avatar Jun 30 '22 14:06 bioc-issue-bot

Received a valid push on git.bioconductor.org; starting a build for commit id: 2c322afc6ca4df1a903c9bd490f84fd98847d427

bioc-issue-bot avatar Jun 30 '22 15:06 bioc-issue-bot

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/CDI to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot avatar Jun 30 '22 15:06 bioc-issue-bot

Received a valid push on git.bioconductor.org; starting a build for commit id: 227c0ea9ee56336708c05f3f53c899c22117906f

bioc-issue-bot avatar Jun 30 '22 15:06 bioc-issue-bot

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/CDI to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot avatar Jun 30 '22 15:06 bioc-issue-bot

Received a valid push on git.bioconductor.org; starting a build for commit id: 2a45189d954e01823249943dfc9ac81d866e30c1

bioc-issue-bot avatar Jun 30 '22 15:06 bioc-issue-bot