Contributions icon indicating copy to clipboard operation
Contributions copied to clipboard

bamSliceR: Slicing and Tallied BAMs from GDC and local for Variants Analysis

Open huangyizR opened this issue 1 year ago • 12 comments

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

  • Repository: https://github.com/trichelab/bamSliceR

Confirm the following by editing each check box to '[x]'

  • [x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.

  • [x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.

  • [x] I understand Bioconductor Package Naming Policy and acknowledge Bioconductor may retain use of package name.

  • [x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.

  • [x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.

  • [x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.

  • [x] I am familiar with the Bioconductor code of conduct and agree to abide by it.

I am familiar with the essential aspects of Bioconductor software management, including:

  • [x] The 'devel' branch for new packages and features.
  • [x] The stable 'release' branch, made available every six months, for bug fixes.
  • [x] Bioconductor version control using Git (optionally via GitHub).

For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.

huangyizR avatar Sep 20 '24 22:09 huangyizR

Hi @huangyizR

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: bamSliceR
Type: Package
Title: Slicing and Tallied BAMs from GDC and local for Variants Analysis
Version: 0.99.0
Authors@R: c( 
        person("Peter", "Huang", role=c("aut", "cre"), 
      email="[email protected]",
      comment=c(ORCID="0009-0004-5689-0193")),
        person("Lauren", "Harmon", role=c("ctb"),
      email="[email protected]",
      comment=c(ORCID="0000-0002-2248-060")), 
        person("Timothy", "Triche", role=c("ctb"),
      comment=c(ORCID="0000-0001-5665-946X")))
Description: A standardized pipeline for extracting, annotating, and analyzing genomic variants across large controlled-access repositories.
Depends: R (> 4.0), 
         gmapR,
         VariantAnnotation,
         GenomicRanges,
         plyranges,
         GenomicDataCommons, 
         Rsamtools,
         rtracklayer
License: GPL-3
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.3.2
VignetteBuilder: knitr
biocViews: DataImport, Sequencing
Suggests: 
    knitr,
    rmarkdown,
    pkgdown
Imports: 
    BiocParallel,
    biomaRt,
    Homo.sapiens,
    httr,
    jsonlite,
    stringr,
    BSgenome.Hsapiens.UCSC.hg38,
    TxDb.Hsapiens.UCSC.hg38.knownGene,
    VariantTools,
    data.table,
    maftools, 
    RColorBrewer,
    IRanges,
    S4Vectors,
    methods

bioc-issue-bot avatar Sep 20 '24 22:09 bioc-issue-bot

Your package has been added to git.bioconductor.org to continue the pre-review process. A build report will be posted shortly. Please fix any ERROR and WARNING in the build report before a reviewer is assigned or provide a justification on why you feel the ERROR or WARNING should be granted an exception.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. All changes should be pushed to git.bioconductor.org moving forward. It is required to push a version bump to git.bioconductor.org to trigger a new build report.

Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to active your Bioconductor Git Credentials Account

bioc-issue-bot avatar Sep 24 '24 16:09 bioc-issue-bot

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: ERROR before build products produced.

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/bamSliceR to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot avatar Sep 24 '24 16:09 bioc-issue-bot

Received a valid push on git.bioconductor.org; starting a build for commit id: 61e4a81edc7f338778877e712dd7c7ddda978565

bioc-issue-bot avatar Sep 30 '24 01:09 bioc-issue-bot

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 24.04.1 LTS): bamSliceR_0.99.1.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/bamSliceR to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot avatar Sep 30 '24 01:09 bioc-issue-bot

Received a valid push on git.bioconductor.org; starting a build for commit id: ee927c0acfb2f20eca5aa7d134ecbb024a5bcfe2

bioc-issue-bot avatar Sep 30 '24 02:09 bioc-issue-bot

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 24.04.1 LTS): bamSliceR_0.99.2.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/bamSliceR to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot avatar Sep 30 '24 02:09 bioc-issue-bot

You currently write a file, INPUT_VCF_FILE.vcf, to a users home directory by default. Please use tempdir() instead.

lshep avatar Sep 30 '24 11:09 lshep

Received a valid push on git.bioconductor.org; starting a build for commit id: 5f1a0934ecc58e4af9e15457e60c123f4fb11f9e

bioc-issue-bot avatar Oct 08 '24 01:10 bioc-issue-bot

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 24.04.1 LTS): bamSliceR_0.99.3.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/bamSliceR to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot avatar Oct 08 '24 01:10 bioc-issue-bot

A reviewer has been assigned to your package for an indepth review. Please respond accordingly to any further comments from the reviewer.

bioc-issue-bot avatar Oct 16 '24 14:10 bioc-issue-bot

The submitted package is for a useful scenario of extracting portions of BAM files and computing variant read suport. However, many aspects of it do not meet the Bioconductor Developer's Guide which is in the package submission checklist and was ticked but seems to have not been looked at. Major revisions are required.

image

  • The installation process downloads the hg38 reference genome sequence because BSgenome.Hsapiens.UCSC.hg38 is in Imports field of DESCRIPTION. The scope of the package is unclear "Slicing and Tallied BAMs from GDC and local for Variants Analysis". Is it trying to only work with GDC data or also with local BAM files of any reference genome? It would be very limited in functionality if it could not work with any other data than GDC-hosted data. BSgenome.Hsapiens.UCSC.hg38 should be in Suggests and conditionally loaded if it is installed with requireNamespace.
  • Description field in DESCRIPTION file is only one sentence. It needs to be a paragraph. Please refer to Description for the requirements.
  • Inconsistent coding style. Functions use camelCase e.g. annotateWithBAMinfo but parameters use snake case e.g. tallied_reads. Please read Variable Names and make it consistently camelCase.
  • Do not use = for assignment. Use <-. Please read Code Syntax and Efficiency.
  • Examples need to be meaningful, not just for the sole purpose of circumventing BiocCheck.
#' @examples
#' x = 1+1

please read Examples.

  • getGRangesGivenGeneNames ignores a user's input.
getGRangesGivenGeneNames = function(genes = "", exons = TRUE, genome = "hg38", as.character = FALSE, reduce = FALSE, 
                                    txdb = "TxDb.Hsapiens.UCSC.hg38.knownGene", orgdb = "Homo.sapiens")
{
    txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene
    TxDb(Homo.sapiens) <- TxDb.Hsapiens.UCSC.hg38.knownGene
  • The default is to arbitarily discard a cancer sample if a patient has more than one. This isn't a sensible default.
keepSampleType = function (gr, sample_type = "primary", keepUniquePatient = TRUE)
   ...
whichdup = duplicated(index)
return (patient_gr[!whichdup] )
  • Why does tallyReads have a parameter BPPARAM and another parallelOnRangesBPPARAM? Explain in tallyReads.Rd why they might need to be different.

  • Every data set in extdata needs to be documented. See Documentation Minimal Requirements. Justify why you need so many RDS files and you don't use ExperimentHub or other Bioconductor packages with BAM files instead.

  • Vignette name is BamSliceR.Bioc2024.Rmd. Please remove the reference to the conference name.

  • A vignette (which is not simply a conference package demo) needs to have an Installation section at the beginning. See Installation.

  • A vignette (which is not simply a conference package demo) can't have any code chunks that are eval = FALSE except in very rare cases. Please see Evaluated Code Chunks. There are six code chunks with eval = FALSE.

  • bamSliceR doesn't work on Windows because gmapR does not support it. Justify the dependency on gmapR. Why are you depending on a primarily read alignment package? Looking at NAMESPACE file, you even import the entire package: import(gmapR). Please refer to Imported Functions and don't do this. Why not Rsamtools pileup instead? It works on all operating systems.

pileup uses PileupParam and ScanBamParam objects to calculate pileup statistics for a BAM file. The result is a data.frame with columns summarizing counts of reads overlapping each genomic position, optionally differentiated on nucleotide, strand, and position within read.

  • Function descriptions are inadequate. They are just copy-and-paste of the title. For example,
\title{Plotting Distribution of Variant Allele Frequency of Variants}
\description{Plotting Distribution of Variant Allele Frequency of Variants}

The title should be one sentence, and it is. But, a Description should expand on it and be one paragraph.

  • The package has a lot of functions but no unit tests to check their correctness. Please see Unit Tests chapter.

DarioS avatar Oct 31 '24 06:10 DarioS

@huangyizR will you be responding to the review?

lshep avatar Jan 14 '25 19:01 lshep

@lshep Yes! Thanks for reminding me about the review. I addressed some of the issues locally. I haven't pushed yet. I will focus on revising in the coming weeks. Thanks!

huangyizR avatar Jan 14 '25 20:01 huangyizR

This issue is being closed because there has been no progress for an extended period of time. You may reopen the issue when you have the time to actively participate in the review / submission process. Please also keep in mind that a package accepted to Bioconductor requires a commitment on your part to ongoing maintenance.

Thank you for your interest in Bioconductor.

bioc-issue-bot avatar Mar 28 '25 12:03 bioc-issue-bot