
ELViS

Open JYLeeBioinfo opened this issue 1 year ago • 1 comments

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

  • Repository: https://github.com/hyochoi/ELViS

Confirm the following by editing each check box to '[x]'

  • [x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.

  • [x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.

  • [x] I understand Bioconductor Package Naming Policy and acknowledge Bioconductor may retain use of package name.

  • [x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.

  • [x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.

  • [x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.

  • [x] I am familiar with the Bioconductor code of conduct and agree to abide by it.

I am familiar with the essential aspects of Bioconductor software management, including:

  • [x] The 'devel' branch for new packages and features.
  • [x] The stable 'release' branch, made available every six months, for bug fixes.
  • [x] Bioconductor version control using Git (optionally via GitHub).

For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.

JYLeeBioinfo avatar Oct 30 '24 13:10 JYLeeBioinfo

Hi @JYLeeBioinfo

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: ELViS
Title: An R Package for Estimating Copy Number Levels of Viral Genome Segments Using Base-Resolution Read Depth Profile
Version: 0.99.0
Authors@R: c(
    person("Hyo Young", "Choi", , "[email protected]", role = c("aut", "cph"),
 comment = c(ORCID = "0000-0002-7627-8493")),
    person("Jin-Young", "Lee", , "[email protected]", role = c("aut", "cre", "cph"),
 comment = c(ORCID = "0000-0002-5366-7488")),
    person("Xiaobei", "Zhao", , "[email protected]", role = "ctb",
 comment = c(ORCID = "0000-0002-5277-0846")),
    person("Jeremiah R.", "Holt", , "[email protected]", role = "ctb",
 comment = c(ORCID = "0000-0002-5201-5015")),
    person("Katherine A.", "Hoadley", , "[email protected]", role = "aut",
 comment = c(ORCID = "0000-0002-1216-477X")),
    person("D. Neil", "Hayes", , "[email protected]", role = c("aut", "fnd", "cph"),
 comment = c(ORCID = "0000-0001-6203-7771"))
  )
Description: Base-resolution copy number analysis of viral genome. Utilizes base-resolution read depth data over viral genome to find copy number segments with two-dimensional segmentation approach. Provides publish-ready figures, including histograms of read depths, coverage line plots over viral genome annotated with copy number change events and viral genes, and heatmaps showing multiple types of data with integrative clustering of samples.
License: MIT + file LICENSE
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.2
VignetteBuilder: knitr
biocViews: CopyNumberVariation, Coverage, GenomicVariation, BiomedicalInformatics, Sequencing, Normalization, Visualization, Clustering
LazyData: false
BugReports: https://github.com/hyochoi/ELViS/issues
URL: https://github.com/hyochoi/ELViS
Config/testthat/edition: 3
Imports: 
    circlize,
    ComplexHeatmap,
    data.table,
    dplyr,
    GenomicFeatures,
    GenomicRanges,
    ggplot2,
    glue,
    graphics,
    grDevices,
    igraph,
    knitr,
    magrittr,
    memoise,
    parallel,
    patchwork,
    reticulate,
    rmarkdown,
    scales,
    segclust2d,
    stats,
    stringr,
    txdbmaker,
    utils,
    uuid,
    zoo
Depends: R (>= 4.3)
Suggests: 
    Rsamtools,
    BiocManager,
    testthat (>= 3.0.0)

bioc-issue-bot avatar Oct 30 '24 13:10 bioc-issue-bot

asset is not a standard top level directory. What is this and what does it contain?

Bioconductor recommends the use of basilisk over reticulate. Please update.

Vignettes should use tempdir() for executing their examples and for saving any files, not a home directory.

vignettes/ELViSPrecisely_Toy_Example.Rmd:68:analysis_dir = "~/ELViS"
vignettes/ELViSPrecisely_Toy_Example.Rmd:206:saveRDS(base_resol_depth,"~/base_resol_depth.rds")

lshep avatar Nov 21 '24 14:11 lshep

Hi @lshep, I appreciate you reaching out to me to check on these issues.

  1. asset directory
  • The asset directory contains code snippets that I used during package development; it should be excluded when distributing this package.
  • I'll remove it.
  2. reticulate
  • Thank you for introducing basilisk, a great package! I'll rework the code to use basilisk.
  3. analysis_dir
  • I'll change this from "~/ELViS" to tempdir().
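
For reference, the adjusted vignette setup will look roughly like this (a sketch rather than the exact vignette code):

# Write vignette outputs under the session temporary directory
analysis_dir <- file.path(tempdir(), "ELViS")
dir.create(analysis_dir, recursive = TRUE, showWarnings = FALSE)
# base_resol_depth is computed earlier in the vignette
saveRDS(base_resol_depth, file.path(analysis_dir, "base_resol_depth.rds"))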

Thank you again for kindly informing me of the adjustments needed. If there's anything that needs correction, please feel free to let me know.

Thank you, Jay

JYLeeBioinfo avatar Nov 21 '24 17:11 JYLeeBioinfo

I have finished 1 and 3, but it will take some time for "2. reticulate -> basilisk". I'll let you know right after 2 is finished.

JYLeeBioinfo avatar Nov 21 '24 19:11 JYLeeBioinfo

I completely replaced reticulate with basilisk in Process_Bam.R (item 2). Also, reticulate is no longer in the Imports section of the DESCRIPTION file.
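
Roughly, the new code follows the standard basilisk pattern. Below is a simplified sketch, not the exact Process_Bam.R code; the environment name and the pinned samtools version match the package defaults mentioned elsewhere in this thread.

library(basilisk)

# Declare a conda environment pinned to a specific samtools build
samtools_env <- BasiliskEnvironment(
    envname = "env_samtools",
    pkgname = "ELViS",
    channels = c("conda-forge", "bioconda"),
    packages = "samtools==1.21"
)

# Resolve (and, if necessary, create) the environment, then locate the samtools binary
env_dir <- obtainEnvironmentPath(samtools_env)
samtools_bin <- file.path(env_dir, "bin", "samtools")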

Thank you again for guiding us to make our package comply with the Bioconductor recommendations.

JYLeeBioinfo avatar Nov 22 '24 04:11 JYLeeBioinfo

Please also provide an inst/scripts directory that describes how the data in inst/extdata was generated. It can be code, pseudo-code, or text but should minimally list any source or licensing information.

You still have a reference in the vignette to

tmpdir="./tmpdir"
dir.create(tmpdir,recursive = TRUE)

which will create a persistent tmpdir directory in the current directory. Please use tempdir() or tempfile().

> system.time({
    mtrx_samtools_reticulate__example = 
        get_depth_matrix(
            bam_files = bam_files,target_virus_name = target_virus_name
            ,mode = "samtools_basilisk"
            ,N_cores = N_cores
            ,min_mapq = 30
            ,tmpdir=tmpdir
            ,condaenv = "env_samtools"
            ,condaenv_samtools_version="1.21"
        )
})
The path to samtools not provided.
Default samtools is used : /home/lorikern/.cache/R/basilisk/1.19.0/ELViS/0.99.0/env_samtools/bin/samtools
The path to samtools not provided.
Default samtools is used : /home/lorikern/.cache/R/basilisk/1.19.0/ELViS/0.99.0/env_samtools/bin/samtools
   user  system elapsed 
  0.013   0.029   0.060 
Warning message:
In dir.create(tmpdir) : './tmpdir' already exists

It also seems like your function tries to create the directory and produces a warning that it already exists. In your function, you likely need to check whether the directory exists and only create it if it doesn't.

lshep avatar Nov 27 '24 19:11 lshep

Your package has been added to git.bioconductor.org to continue the pre-review process. A build report will be posted shortly. Please fix any ERROR and WARNING in the build report before a reviewer is assigned or provide a justification on why you feel the ERROR or WARNING should be granted an exception.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. All changes should be pushed to git.bioconductor.org moving forward. It is required to push a version bump to git.bioconductor.org to trigger a new build report.

Bioconductor utilized your GitHub SSH keys for git.bioconductor.org access. To manage keys and future access, you may want to activate your Bioconductor Git Credentials Account.

bioc-issue-bot avatar Nov 27 '24 19:11 bioc-issue-bot

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 24.04.1 LTS): ELViS_0.99.0.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/ELViS to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot avatar Nov 27 '24 20:11 bioc-issue-bot

Hi @lshep, Thank you again for letting us know the points to improve.

1. tmpdir

  1. In the example of the function get_depth_matrix, I changed this code

tmpdir="./tmpdir"
dir.create(tmpdir,recursive = TRUE)

to the following code.

tmpdir <- tempdir()

  2. In the same function, I changed this code

dir.create(tmpdir)

to the following code.

if(!dir.exists(tmpdir)){ dir.create(tmpdir) }

2. inst/scripts

  Following your comments, I added a file named README_extdata.txt to the inst/scripts directory.

Thank you again for helping us to improve this package! Jay

p.s. I'll fix the errors in the build report and push the repository again. Actually, we checked this package with check() and BiocCheck() both on our server and in a GitHub workflow, and BiocGenerics and IRanges did not need to be listed in the Imports section of the DESCRIPTION file. But I'll try adding them to Imports and see if the errors go away.

JYLeeBioinfo avatar Nov 27 '24 20:11 JYLeeBioinfo

Hi @lshep, I applied the changes I mentioned in the earlier comment and fixed the errors that appeared in the build report. Then I pushed to git.bioconductor.org:packages/ELViS.git


But, bioc-issue-bot doesn't seem to respond...

Am I doing something wrong..?

JYLeeBioinfo avatar Nov 28 '24 01:11 JYLeeBioinfo

Did you do a valid version bump to 0.99.1? Bioconductor will only recognize changes with a valid version bump

lshep avatar Nov 29 '24 13:11 lshep

Received a valid push on git.bioconductor.org; starting a build for commit id: 6f70f192eeb16411ad90df26fd0bc6bed72039cc

bioc-issue-bot avatar Dec 05 '24 19:12 bioc-issue-bot

Thank you for pointing out the version bump issue. I corrected the version, and it seems the bot is responding now.

JYLeeBioinfo avatar Dec 05 '24 19:12 JYLeeBioinfo

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 24.04.1 LTS): ELViS_0.99.1.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/ELViS to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot avatar Dec 05 '24 19:12 bioc-issue-bot

Received a valid push on git.bioconductor.org; starting a build for commit id: cf2ef205f9af4cee057976ff1525098fce6c45f3

bioc-issue-bot avatar Dec 05 '24 19:12 bioc-issue-bot

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 24.04.1 LTS): ELViS_0.99.2.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/ELViS to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot avatar Dec 05 '24 20:12 bioc-issue-bot

Received a valid push on git.bioconductor.org; starting a build for commit id: e3036cab51e968ef67f22eb15a55f9e79bbbe9a4

bioc-issue-bot avatar Dec 05 '24 23:12 bioc-issue-bot

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 24.04.1 LTS): ELViS_0.99.3.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/ELViS to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot avatar Dec 05 '24 23:12 bioc-issue-bot

A reviewer has been assigned to your package for an in-depth review. Please respond accordingly to any further comments from the reviewer.

bioc-issue-bot avatar Dec 10 '24 12:12 bioc-issue-bot

Package 'ELViS' Review

Thank you for submitting your package to Bioconductor. The package passed check and build; however, there are several things that need to be fixed. Please try to answer the comments line by line when you are ready for a second review. Key: Note: please consider; Important: must be addressed.

The NAMESPACE file

  • [ ] Important: Use selective imports with importFrom instead of importing everything with import.

    • in line 15 import(ComplexHeatmap)
    • in line 18 import(RBGL, except=c(transitivity,bfs,dfs))
    • in line 19 import(basilisk)
    • in line 20 import(circlize, except=c(degree))
    • in line 25 import(glue, except=c(trim))
    • in line 27 import(knitr)
    • in line 29 import(memoise)
    • in line 30 import(parallel)
    • in line 31 import(patchwork)
    • in line 32 import(rmarkdown)
    • in line 33 import(scales)
    • in line 34 import(segclust2d)
    • in line 35 import(stringr)
    • in line 36 import(uuid)
    • in line 37 import(zoo, except=c(index,yearmon,yearqtr,"index<-"))
  • [ ] Important: Function names should use camelCase or snake_case and should not include '.'.

    • in line 11 export(norm.fun)

General package development

  • [ ] Important: Consider adding more unit tests. Current unit tests only covered 0.215%
  • [ ] Important: Consider adding input checking. We strongly encourage this. See https://contributions.bioconductor.org/r-code.html#function-arguments
  • [ ] Important: Consider adding instructions for downloading or creating the extra data.

R code

  • [ ] Important: Use is() or inherits() instead of class().
    • In file R/main.R:
      • at line 1132 found ' if (sum(class(variable) %in% c("double","numeric")) > 0) {'
      • at line 1146 found ' if (sum(class(variable) %in% "factor") == 0) variable <- factor(variable)'
      • at line 1155 found ' if (sum(class(variable) %in% c("ordered")) > 0) {'
  • [ ] NOTE: expose overwrite parameter.
    • In file R/main.R:
      • at line 807 found ' ,overwrite=FALSE'
  • [ ] NOTE: :: is not suggested in source code unless you can make sure all the packages are imported. Some people think it is better to keep ::. However, please be aware that you will need to manually double-check the imported items if you make any changes to the DESCRIPTION file during development. My suggestion is to remove one or two repetitions to trigger the dependency check.
  • [ ] NOTE: Vectorize: for loops are present; try to replace them with *apply functions.
    • In file R/main.R:
      • at line 200 found ' for (sam in seq_len(n) ) {'
      • at line 502 found ' for (sam in which(segupdated_data$segment.K==0)) {'
      • at line 674 found ' for (js in seq_len(nseg)) {'
      • at line 754 found ' for (sam in seq_len(ncol(X))) {'
      • at line 1284 found ' for (js in seq_len(nseg)) {'
      • at line 1384 found ' for( k_tmp in i_grouped_lst[order(vapply(i_grouped_lst,min,0))]){'
      • at line 1442 found ' for (clique in ordered_cliques) {'
      • at line 1515 found ' for (js in state) {'
    • In file R/misc.R:
      • at line 224 found ' for(pkg in c("nord","wesanderson","RColorBrewer")){'
    • In file R/Plotting.R:
      • at line 62 found ' for(ovlp in ovlp_status %>% dplyr::group_split(.data$queryHits)){'
    • In file R/Process_Bam.R:
      • at line 301 found ' for (part in parts) {'
      • at line 331 found ' for(s in all_strings){ sanity_check(s) }'
  • [ ] Important: Use file.path to replace paste
    • In file R/Process_Bam.R:
      • at line 112 found ' PATH = paste(env_dir,"bin",sep="/"),'
      • at line 113 found ' LD_LIBRARY_PATH = paste(env_dir,"lib",sep="/")'
      • at line 579 found ' PATH = paste(env_dir,"bin",sep="/"),'
      • at line 580 found ' LD_LIBRARY_PATH = paste(env_dir,"lib",sep="/")'
  • [ ] Important: Remove unused code.
    • In file R/Plotting.R:
      • at line 199 found ' theme(legend.position = "none") #+ xlim(0,NROW(mtrx_for_plotting)+1)'
  • [ ] Important: Need an explanation of the reason for using head or tail in the following code.
    • In file R/main.R:
      • at line 1185 found ' rollYquant <- c(rep(rollYquant0[1],half.width), rollYquant0, rep(tail(rollYquant0,1),half.width-1))'
  • [ ] NOTE: Try to check the edge condition when using seq.int or seq_len. For example using seq.int(min(5, nrow(data))) to replace seq.int(5)
    • In file R/main.R:
      • at line 1393 found ' segtable2[nseg2,seq_len(3)] <- segment.table[nseg2,seq_len(3)]'
  • [ ] NOTE: Functional programming: code repetition.
    • repetition in coord_to_lst, get_depth_matrix_Rsamtools_gp, and get_depth_samtools_gp
      • in coord_to_lst
        • line 1:{
        • line 2: coord_lst <- coord %>% str_replace_all(",", "") %>% str_split(":|-",
        • line 3: simplify = TRUE) %>% as.list() %>% structure(names = c("chr",
        • line 4: "start", "end")) %>% within({
        • line 5: start <- as.numeric(start)
        • line 6: end <- as.numeric(end)
        • line 7: })
      • in get_depth_matrix_Rsamtools_gp
        • line 2: min_mapq = 30, min_base_quality = 0)
        • line 3: {
        • line 4: coord_lst <- coord %>% str_replace_all(",", "") %>% str_split(":|-",
        • line 5: simplify = TRUE) %>% as.list() %>% structure(names = c("chr",
        • line 6: "start", "end")) %>% within({
        • line 7: start <- as.numeric(start)
        • line 8: end <- as.numeric(end)
        • line 9: })
      • in get_depth_samtools_gp
        • line 2: bash_script_base, samtools = NULL)
        • line 3:{
        • line 4: coord_lst <- coord %>% str_replace_all(",", "") %>% str_split(":|-",
        • line 5: simplify = TRUE) %>% as.list() %>% structure(names = c("chr",
        • line 6: "start", "end")) %>% within({
        • line 7: start <- as.numeric(start)
        • line 8: end <- as.numeric(end)
        • line 9: })
    • repetition in detect_bp__update_ref__sub_updateY, and get_BPsegment_v2
      • in detect_bp__update_ref__sub_updateY
        • line 12: BPs.sam <- length(BPs)
        • line 13: if (BPs.sam > 0) {
        • line 14: if (BPs.sam%%2 > 0) {
        • line 15: BPs.sam <- BPs.sam + 1
        • line 16: newBP <- c(seq_along(Ydiff_sam)[-BPs])[which.max(abs(Ydiff_sam[-BPs]))]
        • line 17: BPs <- c(BPs, newBP)
        • line 18: }
        • line 19: nseg <- length(BPs)
        • line 20: nseg2 <- nseg + 1
        • line 21: srt.BPs <- sort(BPs)
        • line 22: segment.table <- data.frame(seg = seq_len(nseg2),
        • line 23: begin = c(1, srt.BPs + 1), end = c(srt.BPs, d))
        • line 24: segbps <- lapply(seq_len(nseg), FUN = function(i) segment.table$begin[i]:segment.table$end[i])
        • line 25: segbps[[1]] <- c(segbps[[1]], segment.table$begin[nseg2]:segment.table$end[nseg2])
        • line 26: Y.tmp <- Y
        • line 27: Q1.zscore <- max.zscore <- mean.zscore.outwinreg <- rep(1000,
        • line 28: nseg)
        • line 29: med.outwinreg <- len.outwinreg <- rep(0, nseg)
        • line 30: names(mean.zscore.outwinreg) <- seq_len(nseg)
        • line 31: for (js in seq_len(nseg)) {
        • line 32: outwinreg <- segbps[[js]]
        • line 33: len.outwinreg[js] <- length(outwinreg)
        • line 34: newmed <- median(Y[outwinreg, sam])
        • line 35: med.outwinreg[js] <- newmed
        • line 36: if (newmed > 0.05) {
        • line 37: Y.tmp[, sam] <- Y[, sam]/newmed
        • line 38: Z.tmp <- t(apply(Y.tmp, 1, function(x) pd.rate.hy(x,
        • line 39: qrsc = TRUE)))
        • line 40: mean.zscore.outwinreg[js] <- mean(abs(Z.tmp[outwinreg,
        • line 41: sam]))
        • line 42: Q1.zscore[js] <- quantile(abs(Z.tmp[, sam]),
        • line 43: probs = 0.25)
        • line 44: max.zscore[js] <- max(abs(Z.tmp[, sam]))
        • line 45: }
        • line 46: }
        • line 47: ireg <- which((max.zscore < 30) & (len.outwinreg >
        • line 48: 500))
        • line 49: if (length(ireg) > 0) {
        • line 50: baseseg <- ireg[which.min(Q1.zscore[ireg])]
        • line 51: }
        • line 52: else {
        • line 53: ireg <- which(max.zscore < 30)
        • line 54: baseseg <- ireg[which.min(Q1.zscore[ireg])]
        • line 55: }
        • line 56: baseseg
      • in get_BPsegment_v2
        • line 29: BPs.sam <- length(BPs)
        • line 30: if (BPs.sam > 0) {
        • line 31: if (BPs.sam%%2 > 0) {
        • line 32: BPs.sam <- BPs.sam + 1
        • line 33: newBP <- c(c(seq_len(length(ydiff)))[-BPs])[which.max(abs(ydiff[-BPs]))]
        • line 34: BPs <- c(BPs, newBP)
        • line 35: }
        • line 37: nseg <- length(BPs)
        • line 38: nseg2 <- nseg + 1
        • line 39: srt.BPs <- sort(BPs)
        • line 40: segment.table <- data.frame(seg = seq_len(nseg2), begin = c(1,
        • line 41: srt.BPs + 1), end = c(srt.BPs, d))
        • line 42: segbps <- lapply(seq_len(nseg), FUN = function(i) segment.table$begin[i]:segment.table$end[i])
        • line 43: segbps[[1]] <- c(segbps[[1]], segment.table$begin[nseg2]:segment.table$end[nseg2])
        • line 44: Y.tmp <- Y
        • line 45: Q1.zscore <- max.zscore <- mean.zscore.outwinreg <- rep(1000,
        • line 46: nseg)
        • line 47: med.outwinreg <- len.outwinreg <- rep(0, nseg)
        • line 48: names(mean.zscore.outwinreg) <- seq_len(nseg)
        • line 49: for (js in seq_len(nseg)) {
        • line 50: outwinreg <- segbps[[js]]
        • line 51: len.outwinreg[js] <- length(outwinreg)
        • line 52: newmed <- median(Y[outwinreg, sam])
        • line 53: med.outwinreg[js] <- newmed
        • line 54: if (newmed > 0.05) {
        • line 55: Y.tmp[, sam] <- Y[, sam]/newmed
        • line 56: Z.tmp <- t(apply(Y.tmp, 1, function(x) pd.rate.hy(x,
        • line 57: qrsc = TRUE)))
        • line 58: mean.zscore.outwinreg[js] <- mean(abs(Z.tmp[outwinreg,
        • line 59: sam]))
        • line 60: Q1.zscore[js] <- quantile(abs(Z.tmp[, sam]),
        • line 61: probs = 0.25)
        • line 62: max.zscore[js] <- max(abs(Z.tmp[, sam]))
        • line 63: }
        • line 64: }
        • line 65: ireg <- which((max.zscore < 30) & (len.outwinreg > 500))
        • line 66: if (length(ireg) > 0) {
        • line 67: baseseg <- ireg[which.min(Q1.zscore[ireg])]
        • line 68: }
        • line 69: else {
        • line 70: ireg <- which(max.zscore < 30)
        • line 71: baseseg <- ireg[which.min(Q1.zscore[ireg])]
        • line 72: }
    • repetition in finalize_segments_and_clusters, get_segments_and_clusters, and update_reference_segments
      • in finalize_segments_and_clusters
        • line 13: if (K > 1) {
        • line 14: clust_seg <- segclust(testdata, lmin = 300, Kmax = 10,
        • line 15: ncluster = (2:K), seg.var = c("z"), scale.variable = FALSE,
        • line 16: subsample_by = 60)
        • line 17: rescued_data$clust.list[[sam]] <- clust_seg
      • in get_segments_and_clusters
        • line 14: output <- tryCatch({
        • line 15: K <- segment.K_initial[sam]
        • line 16: if (K > 1) {
        • line 17: shift_seg <- segmentation(testdata, lmin = 300,
        • line 18: Kmax = 10, seg.var = c("z", "y"), subsample_by = 60,
        • line 19: scale.variable = FALSE)
        • line 20: K <- shift_seg$Kopt.lavielle
        • line 21: if (K > 1) {
        • line 22: clust_seg <- segclust(testdata, lmin = 300,
        • line 23: Kmax = 10, ncluster = (2:K), seg.var = c("z",
        • line 24: "y"), scale.variable = FALSE, subsample_by = 60)
        • line 25: out <- list(K = K, clust = clust_seg)
        • line 33: }
        • line 34: msg <- paste0(sam, "| done")
        • line 35: message(msg)
        • line 36: out
        • line 37: }, error = function(err) {
        • line 38: msg <- paste0(sam, "|", err)
        • line 39: message(msg)
        • line 41: if (K > 1) {
        • line 42: shift_seg <- segmentation(testdata, lmin = 300,
        • line 43: Kmax = 10, seg.var = c("z"), subsample_by = 60,
        • line 44: scale.variable = FALSE)
        • line 45: K <- shift_seg$Kopt.lavielle
        • line 46: if (K > 1) {
        • line 47: clust_seg <- segclust(testdata, lmin = 300,
        • line 48: Kmax = 10, ncluster = (2:K), seg.var = c("z"),
        • line 49: scale.variable = FALSE, subsample_by = 60)
        • line 50: out <- list(K = K, clust = clust_seg)
        • line 58: }
        • line 59: msg <- paste0(sam, "| done")
        • line 60: message(msg)
        • line 61: out
        • line 62: })
        • line 63: return(output)
      • in update_reference_segments
        • line 17: output <- tryCatch({
        • line 18: shift_seg <- segmentation(testdata, lmin = 300,
        • line 19: Kmax = 10, seg.var = c("z", "y"), subsample_by = 60,
        • line 20: scale.variable = FALSE)
        • line 21: K <- shift_seg$Kopt.lavielle
        • line 22: msg <- paste0(sam, "| done")
        • line 23: message(msg)
        • line 24: K
        • line 25: }, error = function(err) {
        • line 26: msg <- paste0(sam, "|", err)
        • line 27: message(msg)
        • line 29: shift_seg <- segmentation(testdata, lmin = 300,
        • line 30: Kmax = 10, seg.var = c("z"), subsample_by = 60,
        • line 31: scale.variable = FALSE)
        • line 32: K <- shift_seg$Kopt.lavielle
        • line 33: msg <- paste0(sam, "| done")
        • line 34: message(msg)
        • line 35: K
        • line 36: })
        • line 37: return(output)
        • line 46: output <- tryCatch({
        • line 47: K <- segment.K_initial[sam]
        • line 48: if (K > 1) {
        • line 49: clust_seg <- segclust(testdata, lmin = 300,
        • line 50: Kmax = 10, ncluster = (2:K), seg.var = c("z",
        • line 51: "y"), scale.variable = FALSE, subsample_by = 60)
        • line 52: result <- segment(clust_seg)
        • line 59: }
        • line 60: msg <- paste0(sam, "| done")
        • line 61: message(msg)
        • line 62: new_y
        • line 63: }, error = function(err) {
        • line 64: msg <- paste0(sam, "|", err)
        • line 65: message(msg)
        • line 67: K <- segment.K_initial[sam]
        • line 68: if (K > 1) {
        • line 69: clust_seg <- segclust(testdata, lmin = 300,
        • line 70: Kmax = 10, ncluster = (2:K), seg.var = c("z"),
        • line 71: scale.variable = FALSE, subsample_by = 60)
        • line 72: result <- segment(clust_seg)
        • line 85: }
        • line 86: msg <- paste0(sam, "| done")
        • line 87: message(msg)
        • line 88: new_y
        • line 89: })
        • line 90: return(output)
    • repetition in gene_cn_heatmaps, and get_gene_rnt_ori
      • in gene_cn_heatmaps
        • line 9: txdb <- makeTxDbFromGFF(gff3_fn, format = "gff3")
        • line 10: genes <- genes(txdb) %>% sort
        • line 11: cds <- cdsBy(txdb, by = "gene")
        • line 12: if (!is.null(exclude_genes)) {
        • line 13: genes <- genes[!(genes$gene_id %in% exclude_genes)]
      • in get_gene_rnt_ori
        • line 4: txdb <- makeTxDbFromGFF(gff3_fn, format = "gff3")
        • line 5: genes <- genes(txdb) %>% sort
        • line 6: cds <- cdsBy(txdb, by = "gene")
        • line 7: if (!is.null(exclude_genes)) {
        • line 8: cds <- cds[!(names(cds) %in% exclude_genes)]
    • repetition in gene_cn_heatmaps, integrative_heatmap, and plot_pileUp_multisample
      • in gene_cn_heatmaps
        • line 3: "in"))
        • line 4:{
        • line 5: if (length(baseline) == 1) {
        • line 6: baseline <- rep(baseline, NCOL(X_raw))
        • line 7: }
        • line 8: baseline_target <- baseline
      • in integrative_heatmap
        • line 23: }
        • line 24: if (length(baseline) == 1) {
        • line 25: baseline <- rep(baseline, NCOL(X_raw))
        • line 26: }
        • line 27: baseline_target <- baseline
      • in plot_pileUp_multisample
        • line 6: {
        • line 7: if (length(baseline) == 1) {
        • line 8: baseline <- rep(baseline, NCOL(X_raw))
        • line 9: }
    • repetition in get_depth_matrix, and get_depth_matrix_gp
      • in get_depth_matrix
        • line 1: mode = "samtools_basilisk", target_virus_name, N_cores = detectCores(),
        • line 2: min_mapq = 30, min_base_quality = 0, max_depth = 1e+05, modules = NULL,
        • line 3: envs = NULL, tmpdir = tempdir(), samtools = NULL, condaenv = "env_samtools",
        • line 4: condaenv_samtools_version = "1.21")
        • line 5:{
        • line 6: os_name <- Sys.info()["sysname"]
        • line 7: if (os_name == "Windows") {
        • line 8: if (mode != "Rsamtools") {
        • line 9: warning(glue("mode={mode} is not supported for {os_name}. Changing mode to Rsamtools..."))
        • line 10: mode <- "Rsamtools"
        • line 11: }
        • line 12: }
        • line 16: }
        • line 17: if (mode == "samtools_basilisk") {
        • line 18: if (!requireNamespace("basilisk", quietly = TRUE)) {
        • line 19: stop("R Package 'basilisk' does not exist. Please install it by following instructions in 'https://www.bioconductor.org/packages/release/bioc/html/basilisk.html'")
        • line 20: }
        • line 21: if (grepl("[^0-9.]", condaenv_samtools_version)) {
        • line 22: stop("Invalid samtools version number. Please find correct version number refering to 'https://anaconda.org/bioconda/samtools'.")
        • line 23: }
        • line 24: samtools_env <- BasiliskEnvironment(envname = condaenv,
        • line 25: pkgname = "ELViS", channels = c("conda-forge", "bioconda"),
        • line 26: packages = c(glue("samtools=={condaenv_samtools_version}")))
        • line 27: env_dir <- obtainEnvironmentPath(samtools_env)
        • line 28: envs <- c(PATH = paste(env_dir, "bin", sep = "/"), LD_LIBRARY_PATH = paste(env_dir,
        • line 29: "lib", sep = "/"))
        • line 30: }
        • line 31: if (mode == "Rsamtools") {
        • line 32: if (!requireNamespace("Rsamtools", quietly = TRUE)) {
        • line 33: stop("R Package 'Rsamtools' does not exist. Please install it by executing following command.\n\nif (!requireNamespace('BiocManager', quietly = TRUE))\n utils::install.packages('BiocManager')\n\nBiocManager::install('Rsamtools')")
        • line 34: }
        • line 35: out_mtrx <- get_depth_matrix_Rsamtools(bam_files = bam_files,
        • line 36: target_virus_name = target_virus_name, N_cores = N_cores,
        • line 37: max_depth = max_depth, min_mapq = min_mapq, min_base_quality = min_base_quality)
        • line 38: }
        • line 39: else if (mode %in% c("samtools_custom", "samtools_basilisk")) {
        • line 40: if (!dir.exists(tmpdir)) {
        • line 41: dir.create(tmpdir)
        • line 42: }
        • line 43: out_mtrx <- get_depth_matrix_samtools(bam_files = bam_files,
        • line 48: }
        • line 49: colnames(out_mtrx) <- basename(bam_files)
        • line 50: if (grepl("Error", out_mtrx[1, 1], ignore.case = TRUE)) {
        • line 51: stop(out_mtrx[1, 1])
        • line 52: }
        • line 53: return(out_mtrx)
      • in get_depth_matrix_gp
        • line 1: mode = "samtools_custom", coord, N_cores = detectCores(),
        • line 2: min_mapq = 30, min_base_quality = 0, max_depth = 1e+05, modules = NULL,
        • line 3: envs = NULL, tmpdir = tempdir(), samtools = NULL, condaenv = "env_samtools",
        • line 4: condaenv_samtools_version = "1.21")
        • line 5:{
        • line 6: os_name <- Sys.info()["sysname"]
        • line 7: if (os_name == "Windows") {
        • line 8: if (mode != "Rsamtools") {
        • line 9: warning(glue("mode={mode} is not supported for {os_name}. Changing mode to Rsamtools..."))
        • line 10: mode <- "Rsamtools"
        • line 11: }
        • line 12: }
        • line 13: if (mode == "samtools_basilisk") {
        • line 14: if (!requireNamespace("basilisk", quietly = TRUE)) {
        • line 15: stop("R Package 'basilisk' does not exist. Please install it by following instructions in 'https://www.bioconductor.org/packages/release/bioc/html/basilisk.html'")
        • line 16: }
        • line 17: if (grepl("[^0-9.]", condaenv_samtools_version)) {
        • line 18: stop("Invalid samtools version number. Please find correct version number refering to 'https://anaconda.org/bioconda/samtools'.")
        • line 19: }
        • line 20: samtools_env <- BasiliskEnvironment(envname = condaenv,
        • line 21: pkgname = "ELViS", channels = c("conda-forge", "bioconda"),
        • line 22: packages = c(glue("samtools=={condaenv_samtools_version}")))
        • line 23: env_dir <- obtainEnvironmentPath(samtools_env)
        • line 24: envs <- c(PATH = paste(env_dir, "bin", sep = "/"), LD_LIBRARY_PATH = paste(env_dir,
        • line 25: "lib", sep = "/"))
        • line 26: }
        • line 27: if (mode == "Rsamtools") {
        • line 28: if (!requireNamespace("Rsamtools", quietly = TRUE)) {
        • line 29: stop("R Package 'Rsamtools' does not exist. Please install it by executing following command.\n\nif (!requireNamespace('BiocManager', quietly = TRUE))\n utils::install.packages('BiocManager')\n\nBiocManager::install('Rsamtools')")
        • line 30: }
        • line 31: out_mtrx <- get_depth_matrix_Rsamtools_gp(bam_files = bam_files,
        • line 32: coord = coord, N_cores = N_cores, max_depth = max_depth,
        • line 33: min_mapq = min_mapq, min_base_quality = min_base_quality)
        • line 34: }
        • line 35: else if (mode %in% c("samtools_custom", "samtools_basilisk")) {
        • line 36: if (!dir.exists(tmpdir)) {
        • line 37: dir.create(tmpdir)
        • line 38: }
        • line 39: out_mtrx <- get_depth_matrix_samtools_gp(bam_files = bam_files,
        • line 43: }
        • line 44: colnames(out_mtrx) <- basename(bam_files)
        • line 45: if (grepl("Error", out_mtrx[1, 1], ignore.case = TRUE)) {
        • line 46: stop(out_mtrx[1, 1])
        • line 47: }
        • line 48: return(out_mtrx)
    • repetition in get_depth_matrix_Rsamtools, and get_depth_matrix_Rsamtools_gp
      • in get_depth_matrix_Rsamtools
        • line 7: target_grng <- data.frame(chr = target_virus_name, start = 1,
        • line 8: end = virus_genome_size) %>% makeGRangesFromDataFrame()
        • line 9: paramScanBam <- Rsamtools::ScanBamParam(which = target_grng)
        • line 10: paramPileup <- Rsamtools::PileupParam(min_base_quality = min_base_quality,
        • line 11: max_depth = max_depth, min_mapq = min_mapq, min_nucleotide_depth = 0,
        • line 12: distinguish_strands = FALSE, distinguish_nucleotides = FALSE,
        • line 13: ignore_query_Ns = FALSE, include_deletions = FALSE,
        • line 14: include_insertions = FALSE, left_bins = NULL, query_bins = NULL,
        • line 15: cycle_bins = NULL)
        • line 16: depth_mtrx <- mclapply(X = bam_files, mc.cores = N_cores,
        • line 17: FUN = get_depth_Rsamtools, scanBamParam = paramScanBam,
        • line 18: pileupParam = paramPileup) %>% do.call(cbind, .)
        • line 19: return(depth_mtrx)
      • in get_depth_matrix_Rsamtools_gp
        • line 10: target_grng <- data.frame(chr = coord_lst$chr, start = coord_lst$start,
        • line 11: end = coord_lst$end) %>% makeGRangesFromDataFrame()
        • line 12: paramScanBam <- Rsamtools::ScanBamParam(which = target_grng)
        • line 13: paramPileup <- Rsamtools::PileupParam(min_base_quality = min_base_quality,
        • line 14: max_depth = max_depth, min_mapq = min_mapq, min_nucleotide_depth = 0,
        • line 15: distinguish_strands = FALSE, distinguish_nucleotides = FALSE,
        • line 16: ignore_query_Ns = FALSE, include_deletions = FALSE,
        • line 17: include_insertions = FALSE, left_bins = NULL, query_bins = NULL,
        • line 18: cycle_bins = NULL)
        • line 19: depth_mtrx <- mclapply(X = bam_files, mc.cores = N_cores,
        • line 20: FUN = get_depth_Rsamtools, scanBamParam = paramScanBam,
        • line 21: pileupParam = paramPileup) %>% do.call(cbind, .)
        • line 22: return(depth_mtrx)
    • repetition in get_depth_matrix_samtools, and get_depth_matrix_samtools_gp
      • in get_depth_matrix_samtools
        • line 1: function (bam_files, target_virus_name, N_cores = min(10,
        • line 2: detectCores()), min_mapq = 30, min_base_quality = 0,
        • line 3: modules = NULL, envs = NULL, tmpdir = tempdir(), samtools = NULL)
        • line 4: {
      • in get_depth_matrix_samtools_gp
        • line 1: function (bam_files, coord, N_cores = detectCores(), min_mapq = 30,
        • line 2: min_base_quality = 0, modules = NULL, envs = NULL, tmpdir = tempdir(),
        • line 3: samtools = NULL)
        • line 4: {
    • repetition in get_depth_samtools, and get_depth_samtools_gp
      • in get_depth_samtools
        • line 5: depth_bed_fn <- glue("{tmpdir}/{UUIDgenerate()}_{vec_i}.depth.bed")
        • line 6: run_samtools(bash_script_base = bash_script_base, command = glue("depth -a -r '{target_virus_name}' --min-MQ {min_mapq} --min-BQ {min_base_quality} -g 256 {bam_fn}"),
        • line 7: output_name = depth_bed_fn, samtools = samtools, depth_count_only = TRUE)
        • line 8: depth_bed <- unlist(read.csv(depth_bed_fn, sep = "\t", header = FALSE),
        • line 9: use.names = FALSE)
        • line 10: sanity_check(depth_bed_fn)
        • line 11: system2(command = "rm", args = depth_bed_fn)
        • line 12: return(depth_bed)
      • in get_depth_samtools_gp
        • line 11: depth_bed_fn <- glue("{tmpdir}/{UUIDgenerate()}_{vec_i}.depth.bed")
        • line 12: run_samtools(bash_script_base, command <- glue("depth -a -r '{coord_lst$chr}:{coord_lst$start}-{coord_lst$end}' --min-MQ {min_mapq} --min-BQ {min_base_quality} -g 256 {bam_fn}"),
        • line 13: output_name <- depth_bed_fn, samtools <- samtools, depth_count_only <- TRUE)
        • line 14: depth_bed <- unlist(read.csv(depth_bed_fn, sep = "\t", header = FALSE),
        • line 15: use.names = FALSE)
        • line 16: sanity_check(depth_bed_fn)
        • line 17: system2(command = "rm", args = depth_bed_fn)
        • line 18: return(depth_bed)
    • repetition in get_gene_anno_plot, and get_gene_rnt
      • in get_gene_anno_plot
        • line 3:{
        • line 4: mc <- match.call()
        • line 5: encl <- parent.env(environment())
        • line 6: called_args <- as.list(mc)[-1]
        • line 7: default_args <- encl$_default_args
        • line 8: default_args <- default_args[setdiff(names(default_args),
        • line 9: names(called_args))]
        • line 10: called_args[encl$_omit_args] <- NULL
        • line 11: args <- c(lapply(called_args, eval, parent.frame()), lapply(default_args,
        • line 12: eval, envir = environment()))
        • line 13: key <- encl$_hash(c(encl$_f_hash, args, lapply(encl$_additional,
        • line 14: function(x) eval(x[[2L]], environment(x)))))
        • line 15: res <- encl$_cache$get(key)
        • line 16: if (inherits(res, "key_missing")) {
        • line 17: mc[[1L]] <- encl$_f
        • line 18: res <- withVisible(eval(mc, parent.frame()))
        • line 19: encl$_cache$set(key, res)
        • line 20: }
        • line 21: if (res$visible) {
        • line 22: res$value
        • line 23: }
        • line 24: else {
        • line 25: invisible(res$value)
        • line 26: }
        • line 27:}, memoised = TRUE, class = c("memoised", "function"))), where = "namespace:ELViS",
      • in get_gene_rnt
        • line 3:{
        • line 4: mc <- match.call()
        • line 5: encl <- parent.env(environment())
        • line 6: called_args <- as.list(mc)[-1]
        • line 7: default_args <- encl$_default_args
        • line 8: default_args <- default_args[setdiff(names(default_args),
        • line 9: names(called_args))]
        • line 10: called_args[encl$_omit_args] <- NULL
        • line 11: args <- c(lapply(called_args, eval, parent.frame()), lapply(default_args,
        • line 12: eval, envir = environment()))
        • line 13: key <- encl$_hash(c(encl$_f_hash, args, lapply(encl$_additional,
        • line 14: function(x) eval(x[[2L]], environment(x)))))
        • line 15: res <- encl$_cache$get(key)
        • line 16: if (inherits(res, "key_missing")) {
        • line 17: mc[[1L]] <- encl$_f
        • line 18: res <- withVisible(eval(mc, parent.frame()))
        • line 19: encl$_cache$set(key, res)
        • line 20: }
        • line 21: if (res$visible) {
        • line 22: res$value
        • line 23: }
        • line 24: else {
        • line 25: invisible(res$value)
        • line 26: }
        • line 27:}, memoised = TRUE, class = c("memoised", "function"))), where = "namespace:ELViS",
    • repetition in get_gene_anno_plot_ori, and get_gene_rnt_ori
      • in get_gene_anno_plot_ori
        • line 67: is_custom_palette <- FALSE
        • line 68: if (length(col_pal) >= length(Gene_levels)) {
        • line 69: is_custom_palette <- TRUE
        • line 70: if (is.null(names(col_pal)) || length(setdiff(cds_plotdata$Gene,
        • line 71: names(col_pal))) != 0) {
        • line 72: col_pal_fin <- structure(col_pal[seq_len(length(Gene_levels))],
        • line 73: names = Gene_levels)
        • line 74: }
        • line 75: else {
        • line 76: col_pal_fin <- col_pal
        • line 77: }
        • line 78: }
      • in get_gene_rnt_ori
        • line 12: is_custom_palette <- FALSE
        • line 13: if (length(col_pal_gene) >= length(Gene_levels)) {
        • line 14: is_custom_palette <- TRUE
        • line 15: if (is.null(names(col_pal_gene)) || length(setdiff(Gene_levels,
        • line 16: names(col_pal_gene))) != 0) {
        • line 17: col_pal_fin <- structure(col_pal_gene[seq_len(length(Gene_levels))],
        • line 18: names = Gene_levels)
        • line 19: }
        • line 20: else {
        • line 21: col_pal_fin <- col_pal_gene
        • line 22: }
        • line 23: }
    • repetition in get_window_v1, and get_window2
      • in get_window_v1
        • line 5: win <- matrix(c(1, rep(ips, each = 2), dim(Y)[1]), ncol = 2,
        • line 6: byrow = TRUE)
        • line 7: win <- win[which((win[, 2] - win[, 1]) > 10), ]
        • line 8: return(win)
      • in get_window2
        • line 5: 0)
        • line 6: win <- matrix(c(1, rep(ips, each = 2), d), ncol = 2, byrow = TRUE)
        • line 7: win <- win[which((win[, 2] - win[, 1]) > min.length), ]
        • line 8: return(win)

Documentation

  • [ ] Note: Consider including a package man page.
  • [ ] Important: Consider including a README file for all extdata.
  • [ ] Note: The vignette should use the BiocStyle package for formatting.
    • rmd file vignettes/ELViSPrecisely_Toy_Example.Rmd
  • [ ] Important: Vignette should have an Introduction section. Please move the install paragraph to a new section.
    • rmd file vignettes/ELViSPrecisely_Toy_Example.Rmd
  • [ ] Important: In sample code, tmpdir should be the output of tempdir().
    • rmd file vignettes/ELViSPrecisely_Toy_Example.Rmd
  • [ ] Important: Remove the section about installation from github.
    • rmd file vignettes/ELViSPrecisely_Toy_Example.Rmd
  • [ ] Note: Vignette includes motivation for submitting to Bioconductor as part of the abstract/intro of the main vignette.
    • rmd file vignettes/ELViSPrecisely_Toy_Example.Rmd
  • [ ] Important: Ensure the NEWS file is updated to the latest version available.
  • [ ] Note: typos:

    WORD        FOUND IN
    annotaiton  plot_pileUp_multisample.Rd:41
    FInal       ELViS_toy_run_result.Rd:13
    Indecate    get_new_baseline.Rd:12
    modulefile  get_depth_matrix.Rd:38
    THe         run_ELViS.Rd:17

jianhong avatar Dec 17 '24 15:12 jianhong

Received a valid push on git.bioconductor.org; starting a build for commit id: 321f83145d09912d68f88dc4491c531ba49b2eba

bioc-issue-bot avatar Feb 11 '25 22:02 bioc-issue-bot

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: ERROR before build products produced.

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/ELViS to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot avatar Feb 11 '25 22:02 bioc-issue-bot

Received a valid push on git.bioconductor.org; starting a build for commit id: c5cdaaab17fde37a20c93517b2b87602d61a6e58

bioc-issue-bot avatar Feb 11 '25 22:02 bioc-issue-bot

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: ERROR before build products produced.

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/ELViS to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot avatar Feb 11 '25 22:02 bioc-issue-bot

Received a valid push on git.bioconductor.org; starting a build for commit id: ff1cccf6dddda456b68393b3c6428d05863eac43

bioc-issue-bot avatar Feb 11 '25 23:02 bioc-issue-bot

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 24.04.1 LTS): ELViS_0.99.6.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/ELViS to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot avatar Feb 11 '25 23:02 bioc-issue-bot

Received a valid push on git.bioconductor.org; starting a build for commit id: a907e98b72769b3d4f5e25810d872816bfd2e439

bioc-issue-bot avatar Feb 12 '25 00:02 bioc-issue-bot

Dear @jianhong,

I appreciate your in-depth review of the ELViS source code. Your valuable input has significantly improved its quality and maintainability.

I edited ELViS according to your comments.

Below, I describe item by item how I changed the source code in response to your comments.

Thank you, Jin-Young


Package 'ELViS' Review

Thank you for submitting your package to Bioconductor. The package passed check and build; however, there are several things that need to be fixed. Please try to answer the comments line by line when you are ready for a second review. Key: Note: please consider; Important: must be addressed.

The NAMESPACE file

  • [ ] Important: Use selective imports with importFrom instead of importing everything with import.

    • in line 15 import(ComplexHeatmap)

      • changed this as follows in line 23-25
        
        importFrom(ComplexHeatmap,"%v%")
        importFrom(ComplexHeatmap,Heatmap)
        importFrom(ComplexHeatmap,HeatmapAnnotation)
        importFrom(ComplexHeatmap,column_order)
        importFrom(ComplexHeatmap,rowAnnotation)
        
    • in line 18 import(RBGL, except=c(transitivity,bfs,dfs))

      • import(RBGL) was removed.
        RBGL is not in use anymore;
        we have removed all the code that used it.
        
    • in line 19 import(basilisk)

      • changed this as follows in line 29-30
        
        importFrom(basilisk,BasiliskEnvironment)
        importFrom(basilisk,obtainEnvironmentPath)
        
    • in line 20 import(circlize, except=c(degree))

      • changed this as follows in line 31
        
        importFrom(circlize,colorRamp2)
        
    • in line 25 import(glue, except=c(trim))

      • changed this as follows in line 32
        
        importFrom(glue,glue)
        
    • in line 27 import(knitr)

      • knitr is required for vignette building.
        I moved it to the DESCRIPTION file as below.
        
        VignetteBuilder: knitr
        Suggests: 
            knitr
        
    • in line 29 import(memoise)

      • changed this as follows in line 53
        
        importFrom(memoise,memoise)
        
    • in line 30 import(parallel)

      • changed this as follows in line 54-55
        
        importFrom(parallel,detectCores)
        importFrom(parallel,mclapply)
        
    • in line 31 import(patchwork)

      • changed this as follows in line 56
        
        importFrom(patchwork,plot_layout)
        
    • in line 32 import(rmarkdown)

      • removed import(rmarkdown) and changed the vignette so that it no longer uses it
        
    • in line 33 import(scales)

      • changed this as follows in line 57-61
        
        importFrom(scales,alpha)
        importFrom(scales,log_breaks)
        importFrom(scales,muted)
        importFrom(scales,trans_new)
        importFrom(scales,viridis_pal)
        
    • in line 34 import(segclust2d)

      • changed this as follows in line 62-64
        
        importFrom(segclust2d,segclust)
        importFrom(segclust2d,segment)
        importFrom(segclust2d,segmentation)
        
    • in line 35 import(stringr)

      • changed this as follows in line 74-77
        
        importFrom(stringr,str_extract_all)
        importFrom(stringr,str_replace_all)
        importFrom(stringr,str_split)
        importFrom(stringr,str_to_title)
        
    • in line 36 import(uuid)

      • changed this as follows in line 81
        
        importFrom(uuid,UUIDgenerate)
        
    • in line 37 import(zoo, except=c(index,yearmon,yearqtr,"index<-"))

      • changed this as follows in line 82
        
        importFrom(zoo,rollapply)
        
  • [ ] Important: Function names should use camelCase or snake_case and should not include '.'.

    • in line 11 export(norm.fun)
      • changed 
        norm.fun to norm_fun (line 117 in main.R)
        pd.rate.hy to pd_rate_hy (line 371 in main.R)
        yaxis.hy to yaxis_hy (line 432 in main.R)
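
        (A side note on the importFrom() changes above: the NAMESPACE is generated by roxygen2, so these directives come from tags such as the following. The lines below only illustrate the tag style and are not a complete listing.)

        #' @importFrom ComplexHeatmap Heatmap HeatmapAnnotation rowAnnotation
        #' @importFrom glue glue
        #' @importFrom parallel detectCores mclapply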
        

General package development

  • [ ] Important: Consider adding more unit tests. Current unit tests only covered 0.215%
    • Added unit tests for most functions (~88% [84/95]).
      
      I excluded unit tests for 11 functions in Plotting.R and some other functions that return extremely complex objects, like ggplot objects and ComplexHeatmap objects.
      
      These are plotting functions that return extremely complex objects (lists nested four or more levels deep), which are hard to compare with true values and are implemented in the imported packages, not in mine.
      
      
      Here are the names of the functions I added unit tests for.
      
        anno_color_var
        capture_params_glue
        check_mode_os
        clique_weight
        compScales
        compute_CN
        coord_to_grng
        coord_to_lst
        detect_BPs
        detect_bp__update_ref
        detect_bp__update_ref__sub_updateY
        detect_dollar_unusual
        detect_unquoted_pipe
        fastSplitSample
        filt_samples
        gen_dist_graph
        get_A_B_Prop_Subcount
        get_A_B_Prop_Subcount__internal
        get_BPsegment_v2
        get_baseseg
        get_bash_script_base
        get_depth_Rsamtools
        get_depth_matrix
        get_depth_matrix_Rsamtools
        get_depth_matrix_Rsamtools_core
        get_depth_matrix_samtools
        get_depth_samtools
        get_dt_zscores
        get_envs_samtools_basilisk
        get_new_baseline
        get_normalized_data
        get_outmat_v1
        get_perf
        get_segbps
        get_segments_and_clusters
        get_subcl
        get_window_core
        get_window_v1
        is_coordinate_only
        is_zero_dec
        make_baseline_vec
        make_col_pal_fin_gene
        norm_fun
        pd_rate_hy
        pooled_sd
        rhoHuber
        run_ELViS
        run_ELViS_core
        run_samtools
        sanity_check
        scale1StepM
        search_Kopt
        segmentation_2nd_phase
        smooth_segment
        stopifnot_ComplexHeatmap_col
        stopifnot_ELViS_result
        stopifnot_baseline
        stopifnot_character1
        stopifnot_characterL
        stopifnot_character_ge1
        stopifnot_classL
        stopifnot_class_ge1
        stopifnot_coordinate1
        stopifnot_integer1
        stopifnot_integer_ge1
        stopifnot_listL
        stopifnot_list_ge1
        stopifnot_logical1
        stopifnot_matrices_available
        stopifnot_mtrx_or_df
        stopifnot_normalized_data
        stopifnot_numeric1
        stopifnot_numericL
        stopifnot_numeric_ge1
        stopifnot_percent
        stopifnot_probs
        stopifnot_refupate_data
        stopifnot_rescued_data
        stopifnot_segment.table
        stopifnot_unit1
        stopifnot_win
        t_col
        update_reference_segments
        yaxis_hy
      
  • [ ] Important: Consider adding input checking. We strongly encourage this. See https://contributions.bioconductor.org/r-code.html#function-arguments
    •   Added input checking to exported functions and some other functions, to ensure that users provide the intended inputs.
      
        For example:
         ## input checking
          stopifnot(is(mtrx,"matrix")|is(mtrx,"data.frame"))
          stopifnot(is(th,"numeric")&length(th)==1)
      
          if(is.null(title_txt)){
              dot_info <- capture_params_glue(...)
              fun_name <- str_to_title(deparse(substitute(smry_fun)))
              if(length(dot_info)==0){
                  title_txt <- (glue("{fun_name} Depth Distribution"))
              }else{
                  title_txt <- (glue("{fun_name} {dot_info} Depth Distribution"))
              }
      
          }
      
          stopifnot(is(title_txt,"character")&length(title_txt)==1)
          stopifnot(is(smry_fun,"function"))
      
  • [ ] Important: Consider adding instructions for downloading or creating the extra data.
    • Instructions for downloading or creating the extra data were already
      in `inst/scripts/README_extdata.txt`, following the guidance
      in https://contributions.bioconductor.org/docs.html#doc-inst-scripts
      
      However, after reading your comment, I realized that it needs to be more accessible to general users.
      Therefore, I also added this to the vignette (lines 55-121).
      

R code

  • [ ] Important: is() or inherits() instead of class().
    • In file R/main.R:
      • at line 1132 found ' if (sum(class(variable) %in% c("double","numeric")) > 0) {'
        •  line 1128 : if ( is(variable,"double") | is(variable, "numeric") ) {
          
      • at line 1146 found ' if (sum(class(variable) %in% "factor") == 0) variable <- factor(variable)'
        •  line 1142 : if (!is(variable,"factor")) variable <- factor(variable)
          
      • at line 1155 found ' if (sum(class(variable) %in% c("ordered")) > 0) {'
        •  line 1151 : if (is(variable,"ordered") ) {
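       For context, a small example of why is()/inherits() is safer than comparing class() directly (toy object, not from the package):

          x <- structure(list(), class = c("myclass", "data.frame"))
          class(x) == "data.frame"     # FALSE TRUE -- length-2 logical, unsafe inside if()
          inherits(x, "data.frame")    # TRUE
          is(x, "data.frame")          # TRUE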
          
  • [ ] NOTE: expose overwrite parameter.
    • In file R/main.R:
      • at line 807 found ' ,overwrite=FALSE'
        •  Exposed the `overwrite` parameter together with the associated parameters (`save_intermediate_data`, `save_dir`):
          
            run_ELViS <-
                function(
                    X
                    ,N_cores=min(10L,detectCores())
                    ,reduced_output=TRUE
                    ,verbose=FALSE
                    ,save_intermediate_data = FALSE
                    ,save_dir="save_dir"
                    ,overwrite=FALSE
                ){
                    run_ELViS_core(
                        X
                        ,N_cores=N_cores
                        ,save_intermediate_data = save_intermediate_data
                        ,save_idx=NULL
                        ,save_dir=save_dir
                        ,overwrite=overwrite
                        ,reduced_output=reduced_output
                        ,verbose=verbose
                    )
                }
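           A hypothetical call with the newly exposed parameters (X stands for a base-resolution depth matrix; paths and values are illustrative only):
          
             result <- run_ELViS(
                 X
                 ,N_cores = 2L
                 ,save_intermediate_data = TRUE
                 ,save_dir = file.path(tempdir(), "ELViS_intermediate")
                 ,overwrite = TRUE
             )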
          
  • [ ] NOTE: :: is not suggested in source code unless you can make sure all the packages are imported. Some people think it is better to keep ::. However, please be aware that you will need to manually double-check the imported items if you make any changes to the DESCRIPTION file during development. My suggestion is to remove one or two repetitions to trigger the dependency check.
    •  Removed RColorBrewer:: usage in misc.R, lines 234-300.
      
       We were unable to find a way to avoid Rsamtools::.
       We removed Rsamtools from the Imports list to reduce the number of imported packages,
       following the Bioconductor manual and the build notes, because there were more than 20 necessary imports
       and Rsamtools may not be needed depending on the user's input.
      
       So Rsamtools is only conditionally required, and that is why we used Rsamtools::.
      
       After removing the Rsamtools:: prefixes, we still needed to make the Rsamtools functions available, so we used require(), as follows:
      
       require("Rsamtools",
                      include.only =
                          c("BamFile"
                            ,"scanBamHeader"
                            ,"ScanBamParam"
                            ,"PileupParam"
                            ,"pileup"
                            ,"ScanBamParam"
                            ))
      
       However, this produced new warnings that require() should not be used, and that the Rsamtools functions have no visible global definition.
       So three suggestions are currently in conflict: 1) do not exceed 20 imported non-base packages, 2) do not use ::, and 3) do not use require().
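      
       For reference, the pattern usually recommended for conditionally required (Suggests) packages is requireNamespace() plus :: at the call site; a minimal sketch, not what is currently in the package (error message wording is illustrative):
      
          if (!requireNamespace("Rsamtools", quietly = TRUE)) {
              stop("Package 'Rsamtools' is required for mode = 'Rsamtools'. ",
                   "Please install it with BiocManager::install('Rsamtools').")
          }
          # target_grng: a GRanges of the target region, defined elsewhere
          param <- Rsamtools::ScanBamParam(which = target_grng)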
      
  • [ ] NOTE: Vectorize: for loops present, try to replace them with *apply functions.
    • In file R/main.R:
      • at line 200 found ' for (sam in seq_len(n) ) {'
        •  in line 195, changed it to vapply
          
           vapply(FUN.VALUE = numeric(d)
          
      • at line 502 found ' for (sam in which(segupdated_data$segment.K==0)) {'
        •  Removed the function containing this. This function was integrated
           into `get_segments_and_clusters` and `update_reference_segments`
          
      • at line 674 found ' for (js in seq_len(nseg)) {'
        •   dt_zscores =
                        seq_len(nseg) %>%
                        lapply(\(js){
          
      • at line 754 found ' for (sam in seq_len(ncol(X))) {'
        •   segtable.list <- lapply(seq_len(ncol(X)),
          
      • at line 1284 found ' for (js in seq_len(nseg)) {'
        •   dt_zscores =
                        seq_len(nseg) %>%
                        lapply(\(js){
          
      • at line 1384 found ' for( k_tmp in i_grouped_lst[order(vapply(i_grouped_lst,min,0))]){'
        •  state_numbers =
                    seq_along(i_grouped_lst_ord)
          
      • at line 1442 found ' for (clique in ordered_cliques) {'
        •  This part of the code cannot be changed into *apply form, because each iteration
           depends on "assigned_nodes" updated in the previous iteration.
           Moreover, it is computationally light.
          
      • at line 1515 found ' for (js in state) {'
        •  state %>%
              lapply(\(js){
          
    • In file R/misc.R:
      • at line 224 found ' for(pkg in c("nord","wesanderson","RColorBrewer")){'
        •  pkgs_undetected  = 
              c("nord","wesanderson","RColorBrewer","yarrr") %>%
              vapply(FUN.VALUE = character(1),FUN = \(pkg)
          
    • In file R/Plotting.R:
      • at line 62 found ' for(ovlp in ovlp_status %>% dplyr::group_split(.data$queryHits)){'
        •  This part of the code cannot be changed into *apply form, because each iteration
           depends on "ylevels" updated in the previous iteration.
           Moreover, it is computationally light.
          
    • In file R/Process_Bam.R:
      • at line 301 found ' for (part in parts) {'
        •  This part of the code cannot be changed into *apply form, because each iteration
           depends on "is_quoted" updated in the previous iteration.
           Moreover, it is computationally light.
          
      • at line 331 found ' for(s in all_strings){ sanity_check(s) }'
        •  all_strings %>% lapply(sanity_check)
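      
       For illustration, the general for-loop-to-vapply pattern used above looks like this (toy data, not the package's actual objects):
      
          d <- 10; n <- 5
          X <- matrix(rnorm(d * n), nrow = d, ncol = n)
          # before: out <- matrix(0, d, n); for (sam in seq_len(n)) out[, sam] <- X[, sam] - mean(X[, sam])
          out <- vapply(seq_len(n), FUN.VALUE = numeric(d),
                        FUN = function(sam) X[, sam] - mean(X[, sam]))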
          
  • [ ] Important: Use file.path to replace paste
    • In file R/Process_Bam.R:
      • at line 112 found ' PATH = paste(env_dir,"bin",sep="/"),'
      • at line 113 found ' LD_LIBRARY_PATH = paste(env_dir,"lib",sep="/")'
      • at line 579 found ' PATH = paste(env_dir,"bin",sep="/"),'
      • at line 580 found ' LD_LIBRARY_PATH = paste(env_dir,"lib",sep="/")'
        • Changed the code as follows, according to your comment:
          
          PATH = file.path(env_dir,"bin"),
          LD_LIBRARY_PATH = file.path(env_dir,"lib")
          PATH = file.path(env_dir,"bin"),
          LD_LIBRARY_PATH = file.path(env_dir,"lib")
          
  • [ ] Important: Remove unused code.
    • In file R/Plotting.R:
      • at line 199 found ' theme(legend.position = "none") #+ xlim(0,NROW(mtrx_for_plotting)+1)'
        •  Removed the following, according to your comment.
           #+ xlim(0,NROW(mtrx_for_plotting)+1)
          
  • [ ] Important: Need explanation of the reason to use head or tail in the following code.
    • In file R/main.R:
      • at line 1185 found ' rollYquant <- c(rep(rollYquant0[1],half.width), rollYquant0, rep(tail(rollYquant0,1),half.width-1))'
        •  rollYquant0 is a vector, so we took its first and last elements to pad the ends.
          
           The motivation is to obtain a criterion for judging whether a read depth change between genomic positions is extreme.
           We obtained this criterion for each position by examining the read depth changes in a window centered on the target position and then calculating the 95th percentile.
           As a result, we get "rollYquant0", a numeric vector containing the percentile values.
           However, since we get a single value per window, the first half-window and the last half-window are not covered, and rollYquant0 is shorter than the input vector by one window size.
           Therefore, for those positions, we borrowed the available values at the nearest possible positions ("rollYquant0[1]" for the first half-window and "tail(rollYquant0,1)" for the last half-window).
           I believe "tail" is the most efficient way to get the last element, involving only one function call; an alternative such as x[length(x)] involves two calls: "length" and then the indexing function "[".
          
           I added comments explaining this in the source code:
          
             # obtain rolling quantile of absolute read depth changes
             rollYquant0 <- rollapply(abs(y),width=2*half.width,FUN=function(x) quantile(x,probs=prob.cutoff))
             # for the first and last half-windows, which have no value of their own, borrow the nearest available values
             rollYquant <- c(rep(rollYquant0[1],half.width), rollYquant0, rep(tail(rollYquant0,1),half.width-1))          
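          
           A self-contained toy version of this padding (assuming zoo::rollapply, as in the package code; y is a stand-in for the per-position depth changes):
          
             library(zoo)
             set.seed(1)
             y <- rnorm(200)
             half.width <- 10; prob.cutoff <- 0.95
             rollYquant0 <- rollapply(abs(y), width = 2 * half.width,
                                      FUN = function(x) quantile(x, probs = prob.cutoff))
             # pad both ends with the nearest available value so the result matches length(y)
             rollYquant <- c(rep(rollYquant0[1], half.width),
                             rollYquant0,
                             rep(tail(rollYquant0, 1), half.width - 1))
             stopifnot(length(rollYquant) == length(y))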
          
  • [ ] NOTE: Try to check the edge condition when using seq.int or seq_len. For example using seq.int(min(5, nrow(data))) to replace seq.int(5)
    • In file R/main.R:
      • at line 1393 found ' segtable2[nseg2,seq_len(3)] <- segment.table[nseg2,seq_len(3)]'
        •  segtable2 and segment.table always have the same set of columns.
           seq_len(3) here simply stood for c(1,2,3) and did not reflect the variable length of any object.
          
           To prevent confusion and improve readability, we changed seq_len(3) to c(1,2,3):
          
           segtable2[nseg2,c(1,2,3)] <- segment.table[nseg2,c(1,2,3)]
          
  • [ ] NOTE: Functional programming: code repetition.
    • repetition in coord_to_lst, get_depth_matrix_Rsamtools_gp, and get_depth_samtools_gp

      • in coord_to_lst
        • line 1:{
        • line 2: coord_lst <- coord %>% str_replace_all(",", "") %>% str_split(":|-",
        • line 3: simplify = TRUE) %>% as.list() %>% structure(names = c("chr",
        • line 4: "start", "end")) %>% within({
        • line 5: start <- as.numeric(start)
        • line 6: end <- as.numeric(end)
        • line 7: })
      • in get_depth_matrix_Rsamtools_gp
        • line 2: min_mapq = 30, min_base_quality = 0)
        • line 3: {
        • line 4: coord_lst <- coord %>% str_replace_all(",", "") %>% str_split(":|-",
        • line 5: simplify = TRUE) %>% as.list() %>% structure(names = c("chr",
        • line 6: "start", "end")) %>% within({
        • line 7: start <- as.numeric(start)
        • line 8: end <- as.numeric(end)
        • line 9: })
      • in get_depth_samtools_gp
        • line 2: bash_script_base, samtools = NULL)
        • line 3:{
        • line 4: coord_lst <- coord %>% str_replace_all(",", "") %>% str_split(":|-",
        • line 5: simplify = TRUE) %>% as.list() %>% structure(names = c("chr",
        • line 6: "start", "end")) %>% within({
        • line 7: start <- as.numeric(start)
        • line 8: end <- as.numeric(end)
        • line 9: })
        •  Changed them to the following code, using a custom function "coord_to_lst"
          
           coord_lst <- coord_to_lst(coord)
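          
           For readers, a base-R sketch of what such a coord_to_lst helper does (the packaged version uses stringr and pipes; this is only an equivalent illustration):
          
             coord_to_lst <- function(coord) {
                 # "chr16:1,234-5,678" -> list(chr = "chr16", start = 1234, end = 5678)
                 parts <- strsplit(gsub(",", "", coord), ":|-")[[1]]
                 list(chr = parts[1],
                      start = as.numeric(parts[2]),
                      end = as.numeric(parts[3]))
             }
             coord_to_lst("chr16:1,234-5,678")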
          
    • repetition in detect_bp__update_ref__sub_updateY, and get_BPsegment_v2

      • in detect_bp__update_ref__sub_updateY
        • line 12: BPs.sam <- length(BPs)
        • line 13: if (BPs.sam > 0) {
        • line 14: if (BPs.sam%%2 > 0) {
        • line 15: BPs.sam <- BPs.sam + 1
        • line 16: newBP <- c(seq_along(Ydiff_sam)[-BPs])[which.max(abs(Ydiff_sam[-BPs]))]
        • line 17: BPs <- c(BPs, newBP)
        • line 18: }
        • line 19: nseg <- length(BPs)
        • line 20: nseg2 <- nseg + 1
        • line 21: srt.BPs <- sort(BPs)
        • line 22: segment.table <- data.frame(seg = seq_len(nseg2),
        • line 23: begin = c(1, srt.BPs + 1), end = c(srt.BPs, d))
        • line 24: segbps <- lapply(seq_len(nseg), FUN = function(i) segment.table$begin[i]:segment.table$end[i])
        • line 25: segbps[[1]] <- c(segbps[[1]], segment.table$begin[nseg2]:segment.table$end[nseg2])
        • line 26: Y.tmp <- Y
        • line 27: Q1.zscore <- max.zscore <- mean.zscore.outwinreg <- rep(1000,
        • line 28: nseg)
        • line 29: med.outwinreg <- len.outwinreg <- rep(0, nseg)
        • line 30: names(mean.zscore.outwinreg) <- seq_len(nseg)
        • line 31: for (js in seq_len(nseg)) {
        • line 32: outwinreg <- segbps[[js]]
        • line 33: len.outwinreg[js] <- length(outwinreg)
        • line 34: newmed <- median(Y[outwinreg, sam])
        • line 35: med.outwinreg[js] <- newmed
        • line 36: if (newmed > 0.05) {
        • line 37: Y.tmp[, sam] <- Y[, sam]/newmed
        • line 38: Z.tmp <- t(apply(Y.tmp, 1, function(x) pd.rate.hy(x,
        • line 39: qrsc = TRUE)))
        • line 40: mean.zscore.outwinreg[js] <- mean(abs(Z.tmp[outwinreg,
        • line 41: sam]))
        • line 42: Q1.zscore[js] <- quantile(abs(Z.tmp[, sam]),
        • line 43: probs = 0.25)
        • line 44: max.zscore[js] <- max(abs(Z.tmp[, sam]))
        • line 45: }
        • line 46: }
        • line 47: ireg <- which((max.zscore < 30) & (len.outwinreg >
        • line 48: 500))
        • line 49: if (length(ireg) > 0) {
        • line 50: baseseg <- ireg[which.min(Q1.zscore[ireg])]
        • line 51: }
        • line 52: else {
        • line 53: ireg <- which(max.zscore < 30)
        • line 54: baseseg <- ireg[which.min(Q1.zscore[ireg])]
        • line 55: }
        • line 56: baseseg
      • in get_BPsegment_v2
        • line 29: BPs.sam <- length(BPs)
        • line 30: if (BPs.sam > 0) {
        • line 31: if (BPs.sam%%2 > 0) {
        • line 32: BPs.sam <- BPs.sam + 1
        • line 33: newBP <- c(c(seq_len(length(ydiff)))[-BPs])[which.max(abs(ydiff[-BPs]))]
        • line 34: BPs <- c(BPs, newBP)
        • line 35: }
        • line 37: nseg <- length(BPs)
        • line 38: nseg2 <- nseg + 1
        • line 39: srt.BPs <- sort(BPs)
        • line 40: segment.table <- data.frame(seg = seq_len(nseg2), begin = c(1,
        • line 41: srt.BPs + 1), end = c(srt.BPs, d))
        • line 42: segbps <- lapply(seq_len(nseg), FUN = function(i) segment.table$begin[i]:segment.table$end[i])
        • line 43: segbps[[1]] <- c(segbps[[1]], segment.table$begin[nseg2]:segment.table$end[nseg2])
        • line 44: Y.tmp <- Y
        • line 45: Q1.zscore <- max.zscore <- mean.zscore.outwinreg <- rep(1000,
        • line 46: nseg)
        • line 47: med.outwinreg <- len.outwinreg <- rep(0, nseg)
        • line 48: names(mean.zscore.outwinreg) <- seq_len(nseg)
        • line 49: for (js in seq_len(nseg)) {
        • line 50: outwinreg <- segbps[[js]]
        • line 51: len.outwinreg[js] <- length(outwinreg)
        • line 52: newmed <- median(Y[outwinreg, sam])
        • line 53: med.outwinreg[js] <- newmed
        • line 54: if (newmed > 0.05) {
        • line 55: Y.tmp[, sam] <- Y[, sam]/newmed
        • line 56: Z.tmp <- t(apply(Y.tmp, 1, function(x) pd.rate.hy(x,
        • line 57: qrsc = TRUE)))
        • line 58: mean.zscore.outwinreg[js] <- mean(abs(Z.tmp[outwinreg,
        • line 59: sam]))
        • line 60: Q1.zscore[js] <- quantile(abs(Z.tmp[, sam]),
        • line 61: probs = 0.25)
        • line 62: max.zscore[js] <- max(abs(Z.tmp[, sam]))
        • line 63: }
        • line 64: }
        • line 65: ireg <- which((max.zscore < 30) & (len.outwinreg > 500))
        • line 66: if (length(ireg) > 0) {
        • line 67: baseseg <- ireg[which.min(Q1.zscore[ireg])]
        • line 68: }
        • line 69: else {
        • line 70: ireg <- which(max.zscore < 30)
        • line 71: baseseg <- ireg[which.min(Q1.zscore[ireg])]
        • line 72: }
        •  It was not feasible to wrap this part in a single function, because although
           the same code is repeated, the variables used by the surrounding code differ.
           Therefore, we factored the repeated code into helper functions as much as possible.
          
          1)
           from
                       segbps <- lapply(seq_len(nseg), FUN=function(i) segment.table$begin[i]:segment.table$end[i])
                       segbps[[1]] <- c(segbps[[1]], segment.table$begin[nseg2]:segment.table$end[nseg2])
           to
                        segbps <- get_segbps(nseg,segment.table)
          
           2)
           from
                         for (js in seq_len(nseg)) {
                                 outwinreg <- segbps[[js]]
                          ...(~10 more lines)
           to
                          dt_zscores <- get_dt_zscores(nseg,segbps,Y,sam)
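          
           A sketch of the extracted get_segbps helper, based on the repeated lines quoted above (the actual package code may differ in detail):
          
             get_segbps <- function(nseg, segment.table) {
                 nseg2 <- nseg + 1
                 segbps <- lapply(seq_len(nseg),
                                  function(i) segment.table$begin[i]:segment.table$end[i])
                 # append the last segment's positions to the first, as in the original code
                 segbps[[1]] <- c(segbps[[1]],
                                  segment.table$begin[nseg2]:segment.table$end[nseg2])
                 segbps
             }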
          
    • repetition in finalize_segments_and_clusters, get_segments_and_clusters, and update_reference_segments

      • in finalize_segments_and_clusters
        • line 13: if (K > 1) {
        • line 14: clust_seg <- segclust(testdata, lmin = 300, Kmax = 10,
        • line 15: ncluster = (2:K), seg.var = c("z"), scale.variable = FALSE,
        • line 16: subsample_by = 60)
        • line 17: rescued_data$clust.list[[sam]] <- clust_seg
      • in get_segments_and_clusters
        • line 14: output <- tryCatch({
        • line 15: K <- segment.K_initial[sam]
        • line 16: if (K > 1) {
        • line 17: shift_seg <- segmentation(testdata, lmin = 300,
        • line 18: Kmax = 10, seg.var = c("z", "y"), subsample_by = 60,
        • line 19: scale.variable = FALSE)
        • line 20: K <- shift_seg$Kopt.lavielle
        • line 21: if (K > 1) {
        • line 22: clust_seg <- segclust(testdata, lmin = 300,
        • line 23: Kmax = 10, ncluster = (2:K), seg.var = c("z",
        • line 24: "y"), scale.variable = FALSE, subsample_by = 60)
        • line 25: out <- list(K = K, clust = clust_seg)
        • line 33: }
        • line 34: msg <- paste0(sam, "| done")
        • line 35: message(msg)
        • line 36: out
        • line 37: }, error = function(err) {
        • line 38: msg <- paste0(sam, "|", err)
        • line 39: message(msg)
        • line 41: if (K > 1) {
        • line 42: shift_seg <- segmentation(testdata, lmin = 300,
        • line 43: Kmax = 10, seg.var = c("z"), subsample_by = 60,
        • line 44: scale.variable = FALSE)
        • line 45: K <- shift_seg$Kopt.lavielle
        • line 46: if (K > 1) {
        • line 47: clust_seg <- segclust(testdata, lmin = 300,
        • line 48: Kmax = 10, ncluster = (2:K), seg.var = c("z"),
        • line 49: scale.variable = FALSE, subsample_by = 60)
        • line 50: out <- list(K = K, clust = clust_seg)
        • line 58: }
        • line 59: msg <- paste0(sam, "| done")
        • line 60: message(msg)
        • line 61: out
        • line 62: })
        • line 63: return(output)
      • in update_reference_segments
        • line 17: output <- tryCatch({
        • line 18: shift_seg <- segmentation(testdata, lmin = 300,
        • line 19: Kmax = 10, seg.var = c("z", "y"), subsample_by = 60,
        • line 20: scale.variable = FALSE)
        • line 21: K <- shift_seg$Kopt.lavielle
        • line 22: msg <- paste0(sam, "| done")
        • line 23: message(msg)
        • line 24: K
        • line 25: }, error = function(err) {
        • line 26: msg <- paste0(sam, "|", err)
        • line 27: message(msg)
        • line 29: shift_seg <- segmentation(testdata, lmin = 300,
        • line 30: Kmax = 10, seg.var = c("z"), subsample_by = 60,
        • line 31: scale.variable = FALSE)
        • line 32: K <- shift_seg$Kopt.lavielle
        • line 33: msg <- paste0(sam, "| done")
        • line 34: message(msg)
        • line 35: K
        • line 36: })
        • line 37: return(output)
        • line 46: output <- tryCatch({
        • line 47: K <- segment.K_initial[sam]
        • line 48: if (K > 1) {
        • line 49: clust_seg <- segclust(testdata, lmin = 300,
        • line 50: Kmax = 10, ncluster = (2:K), seg.var = c("z",
        • line 51: "y"), scale.variable = FALSE, subsample_by = 60)
        • line 52: result <- segment(clust_seg)
        • line 59: }
        • line 60: msg <- paste0(sam, "| done")
        • line 61: message(msg)
        • line 62: new_y
        • line 63: }, error = function(err) {
        • line 64: msg <- paste0(sam, "|", err)
        • line 65: message(msg)
        • line 67: K <- segment.K_initial[sam]
        • line 68: if (K > 1) {
        • line 69: clust_seg <- segclust(testdata, lmin = 300,
        • line 70: Kmax = 10, ncluster = (2:K), seg.var = c("z"),
        • line 71: scale.variable = FALSE, subsample_by = 60)
        • line 72: result <- segment(clust_seg)
        • line 85: }
        • line 86: msg <- paste0(sam, "| done")
        • line 87: message(msg)
        • line 88: new_y
        • line 89: })
        • line 90: return(output)
          •  1) Removed `finalize_segments_and_clusters`; it has been integrated into another function and is no longer in use.
             2) Factored the repeated parts into shared helpers:
                 2-1)
                 from
                          shift_seg <- segmentation(testdata, lmin=300,
                                                                     Kmax = 10, seg.var = c("y","z"),
                                                                     subsample_by = 60, scale.variable = FALSE)
                          K <- shift_seg$Kopt.lavielle
                 to
                          K <- search_Kopt(x=testdata)
                  2-2)
                  from
                           clust_seg <- segclust(  testdata, lmin=300, Kmax=10, ncluster = (2:K),
                                                    seg.var = c("z","y"), scale.variable = FALSE, subsample_by = 60)
                  to
                            clust_seg <- get_clust_seg(x=testdata,ncluster = (2:K))
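            
             A sketch of the shared helper, following the quoted segmentation() call (assuming segclust2d provides segmentation(); the real signature may differ):
            
               search_Kopt <- function(x, seg.var = c("z", "y")) {
                   # segmentation() from segclust2d, as in the quoted code
                   shift_seg <- segmentation(x, lmin = 300, Kmax = 10,
                                             seg.var = seg.var,
                                             subsample_by = 60, scale.variable = FALSE)
                   shift_seg$Kopt.lavielle
               }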
            
            
    • repetition in gene_cn_heatmaps, and get_gene_rnt_ori

      • in gene_cn_heatmaps
        • line 9: txdb <- makeTxDbFromGFF(gff3_fn, format = "gff3")
        • line 10: genes <- genes(txdb) %>% sort
        • line 11: cds <- cdsBy(txdb, by = "gene")
        • line 12: if (!is.null(exclude_genes)) {
        • line 13: genes <- genes[!(genes$gene_id %in% exclude_genes)]
      • in get_gene_rnt_ori
        • line 4: txdb <- makeTxDbFromGFF(gff3_fn, format = "gff3")
        • line 5: genes <- genes(txdb) %>% sort
        • line 6: cds <- cdsBy(txdb, by = "gene")
        • line 7: if (!is.null(exclude_genes)) {
        • line 8: cds <- cds[!(names(cds) %in% exclude_genes)]
          •  changed them to
            
             gff_parsed = parse_gff(gff3_fn,exclude_genes)
             txdb <- gff_parsed$txdb
             genes <- gff_parsed$gene
             cds <- gff_parsed$cds
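            
             A sketch of the parse_gff helper implied above (assuming GenomicFeatures; exclusion handling simplified):
            
               parse_gff <- function(gff3_fn, exclude_genes = NULL) {
                   txdb <- GenomicFeatures::makeTxDbFromGFF(gff3_fn, format = "gff3")
                   genes <- sort(GenomicFeatures::genes(txdb))
                   cds <- GenomicFeatures::cdsBy(txdb, by = "gene")
                   if (!is.null(exclude_genes)) {
                       genes <- genes[!(genes$gene_id %in% exclude_genes)]
                       cds <- cds[!(names(cds) %in% exclude_genes)]
                   }
                   list(txdb = txdb, gene = genes, cds = cds)
               }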
            
    • repetition in gene_cn_heatmaps, integrative_heatmap, and plot_pileUp_multisample

      • in gene_cn_heatmaps
        • line 3: "in"))
        • line 4:{
        • line 5: if (length(baseline) == 1) {
        • line 6: baseline <- rep(baseline, NCOL(X_raw))
        • line 7: }
        • line 8: baseline_target <- baseline
      • in integrative_heatmap
        • line 23: }
        • line 24: if (length(baseline) == 1) {
        • line 25: baseline <- rep(baseline, NCOL(X_raw))
        • line 26: }
        • line 27: baseline_target <- baseline
      • in plot_pileUp_multisample
        • line 6: {
        • line 7: if (length(baseline) == 1) {
        • line 8: baseline <- rep(baseline, NCOL(X_raw))
        • line 9: }
          •  changed them to
            
             baseline <- make_baseline_vec(baseline,L=NCOL(X_raw))
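            
             The helper essentially recycles a scalar baseline to one value per sample; a sketch (argument names assumed from the call above):
            
               make_baseline_vec <- function(baseline, L) {
                   if (length(baseline) == 1) {
                       baseline <- rep(baseline, L)
                   }
                   stopifnot(length(baseline) == L)
                   baseline
               }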
            
    • repetition in get_depth_matrix, and get_depth_matrix_gp

      • in get_depth_matrix
        • line 1: mode = "samtools_basilisk", target_virus_name, N_cores = detectCores(),
        • line 2: min_mapq = 30, min_base_quality = 0, max_depth = 1e+05, modules = NULL,
        • line 3: envs = NULL, tmpdir = tempdir(), samtools = NULL, condaenv = "env_samtools",
        • line 4: condaenv_samtools_version = "1.21")
        • line 5:{
        • line 6: os_name <- Sys.info()["sysname"]
        • line 7: if (os_name == "Windows") {
        • line 8: if (mode != "Rsamtools") {
        • line 9: warning(glue("mode={mode} is not supported for {os_name}. Changing mode to Rsamtools..."))
        • line 10: mode <- "Rsamtools"
        • line 11: }
        • line 12: }
        • line 16: }
        • line 17: if (mode == "samtools_basilisk") {
        • line 18: if (!requireNamespace("basilisk", quietly = TRUE)) {
        • line 19: stop("R Package 'basilisk' does not exist. Please install it by following instructions in 'https://www.bioconductor.org/packages/release/bioc/html/basilisk.html'")
        • line 20: }
        • line 21: if (grepl("[^0-9.]", condaenv_samtools_version)) {
        • line 22: stop("Invalid samtools version number. Please find correct version number refering to 'https://anaconda.org/bioconda/samtools'.")
        • line 23: }
        • line 24: samtools_env <- BasiliskEnvironment(envname = condaenv,
        • line 25: pkgname = "ELViS", channels = c("conda-forge", "bioconda"),
        • line 26: packages = c(glue("samtools=={condaenv_samtools_version}")))
        • line 27: env_dir <- obtainEnvironmentPath(samtools_env)
        • line 28: envs <- c(PATH = paste(env_dir, "bin", sep = "/"), LD_LIBRARY_PATH = paste(env_dir,
        • line 29: "lib", sep = "/"))
        • line 30: }
        • line 31: if (mode == "Rsamtools") {
        • line 32: if (!requireNamespace("Rsamtools", quietly = TRUE)) {
        • line 33: stop("R Package 'Rsamtools' does not exist. Please install it by executing following command.\n\nif (!requireNamespace('BiocManager', quietly = TRUE))\n utils::install.packages('BiocManager')\n\nBiocManager::install('Rsamtools')")
        • line 34: }
        • line 35: out_mtrx <- get_depth_matrix_Rsamtools(bam_files = bam_files,
        • line 36: target_virus_name = target_virus_name, N_cores = N_cores,
        • line 37: max_depth = max_depth, min_mapq = min_mapq, min_base_quality = min_base_quality)
        • line 38: }
        • line 39: else if (mode %in% c("samtools_custom", "samtools_basilisk")) {
        • line 40: if (!dir.exists(tmpdir)) {
        • line 41: dir.create(tmpdir)
        • line 42: }
        • line 43: out_mtrx <- get_depth_matrix_samtools(bam_files = bam_files,
        • line 48: }
        • line 49: colnames(out_mtrx) <- basename(bam_files)
        • line 50: if (grepl("Error", out_mtrx[1, 1], ignore.case = TRUE)) {
        • line 51: stop(out_mtrx[1, 1])
        • line 52: }
        • line 53: return(out_mtrx)
      • in get_depth_matrix_gp
        • line 1: mode = "samtools_custom", coord, N_cores = detectCores(),
        • line 2: min_mapq = 30, min_base_quality = 0, max_depth = 1e+05, modules = NULL,
        • line 3: envs = NULL, tmpdir = tempdir(), samtools = NULL, condaenv = "env_samtools",
        • line 4: condaenv_samtools_version = "1.21")
        • line 5:{
        • line 6: os_name <- Sys.info()["sysname"]
        • line 7: if (os_name == "Windows") {
        • line 8: if (mode != "Rsamtools") {
        • line 9: warning(glue("mode={mode} is not supported for {os_name}. Changing mode to Rsamtools..."))
        • line 10: mode <- "Rsamtools"
        • line 11: }
        • line 12: }
        • line 13: if (mode == "samtools_basilisk") {
        • line 14: if (!requireNamespace("basilisk", quietly = TRUE)) {
        • line 15: stop("R Package 'basilisk' does not exist. Please install it by following instructions in 'https://www.bioconductor.org/packages/release/bioc/html/basilisk.html'")
        • line 16: }
        • line 17: if (grepl("[^0-9.]", condaenv_samtools_version)) {
        • line 18: stop("Invalid samtools version number. Please find correct version number refering to 'https://anaconda.org/bioconda/samtools'.")
        • line 19: }
        • line 20: samtools_env <- BasiliskEnvironment(envname = condaenv,
        • line 21: pkgname = "ELViS", channels = c("conda-forge", "bioconda"),
        • line 22: packages = c(glue("samtools=={condaenv_samtools_version}")))
        • line 23: env_dir <- obtainEnvironmentPath(samtools_env)
        • line 24: envs <- c(PATH = paste(env_dir, "bin", sep = "/"), LD_LIBRARY_PATH = paste(env_dir,
        • line 25: "lib", sep = "/"))
        • line 26: }
        • line 27: if (mode == "Rsamtools") {
        • line 28: if (!requireNamespace("Rsamtools", quietly = TRUE)) {
        • line 29: stop("R Package 'Rsamtools' does not exist. Please install it by executing following command.\n\nif (!requireNamespace('BiocManager', quietly = TRUE))\n utils::install.packages('BiocManager')\n\nBiocManager::install('Rsamtools')")
        • line 30: }
        • line 31: out_mtrx <- get_depth_matrix_Rsamtools_gp(bam_files = bam_files,
        • line 32: coord = coord, N_cores = N_cores, max_depth = max_depth,
        • line 33: min_mapq = min_mapq, min_base_quality = min_base_quality)
        • line 34: }
        • line 35: else if (mode %in% c("samtools_custom", "samtools_basilisk")) {
        • line 36: if (!dir.exists(tmpdir)) {
        • line 37: dir.create(tmpdir)
        • line 38: }
        • line 39: out_mtrx <- get_depth_matrix_samtools_gp(bam_files = bam_files,
        • line 43: }
        • line 44: colnames(out_mtrx) <- basename(bam_files)
        • line 45: if (grepl("Error", out_mtrx[1, 1], ignore.case = TRUE)) {
        • line 46: stop(out_mtrx[1, 1])
        • line 47: }
        • line 48: return(out_mtrx)
          • 1) merged get_depth_matrix_gp into get_depth_matrix
            2)
                changed
                 from
                            os_name <- Sys.info()["sysname"]
                             if( os_name == "Windows" ){
                                 if(mode != "Rsamtools"){
                                     warning(glue("mode={mode} is not supported for {os_name}. Changing mode to Rsamtools..."))
                                     mode <- "Rsamtools"
                                 }
                             }
                             if( !(mode %in% c("samtools_basilisk","samtools_custom","Rsamtools")) ){
                                         stop(glue("mode='{mode}' is not an allowed argument. Available arguments are 'samtools_basilisk','samtools_custom', and 'Rsamtools'"))
                                     }
                 to
                             mode <- check_mode_os(mode)
            
             3)
                changed
                 from
                             # check if basilisk is installed and install if not
                                 if (!requireNamespace("basilisk", quietly = TRUE)) {
                                     stop("R Package 'basilisk' does not exist. Please install it by following instructions in 'https://www.bioconductor.org/packages/release/bioc/html/basilisk.html'")
                                 }
            
                                 # samtools version sanity check
                                 if(grepl("[^0-9.]",condaenv_samtools_version)){
                                     stop("Invalid samtools version number. Please find correct version number refering to 'https://anaconda.org/bioconda/samtools'.")
                                 }
            
            
                                 # Load samtools conda environment
                                 samtools_env <- BasiliskEnvironment(
                                     envname=condaenv
                                     ,pkgname="ELViS"
                                     ,channels = c("conda-forge","bioconda")
                                     ,packages=c(glue("samtools=={condaenv_samtools_version}"))
                                 )
            
                                 env_dir <- obtainEnvironmentPath(samtools_env)
            
                                 envs <- c(
                                     PATH = file.path(env_dir,"bin"),
                                     LD_LIBRARY_PATH = file.path(env_dir,"lib")
                                 )
                 to
                           envs <- get_envs_samtools_basilisk(condaenv_samtools_version,condaenv)
            
            
             4)
                changed from
                                         if (!requireNamespace("Rsamtools", quietly = TRUE)) {
                                                         stop("R Package 'Rsamtools' does not exist. Please install it by executing following command.
            
                                         if (!requireNamespace('BiocManager', quietly = TRUE))
                                             utils::install.packages('BiocManager')
            
                                         BiocManager::install('Rsamtools')")
                                                     }
                to
                                         check_Rsamtools_installation()
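            
             A sketch of check_mode_os, based on the block quoted in 2) above (messages shortened; assumes glue is attached, as in the package):
            
               check_mode_os <- function(mode) {
                   os_name <- Sys.info()["sysname"]
                   if (os_name == "Windows" && mode != "Rsamtools") {
                       warning(glue("mode={mode} is not supported for {os_name}. Changing mode to Rsamtools..."))
                       mode <- "Rsamtools"
                   }
                   if (!(mode %in% c("samtools_basilisk", "samtools_custom", "Rsamtools"))) {
                       stop(glue("mode='{mode}' is not an allowed argument."))
                   }
                   mode
               }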
            
    • repetition in get_depth_matrix_Rsamtools, and get_depth_matrix_Rsamtools_gp

      • in get_depth_matrix_Rsamtools
        • line 7: target_grng <- data.frame(chr = target_virus_name, start = 1,
        • line 8: end = virus_genome_size) %>% makeGRangesFromDataFrame()
        • line 9: paramScanBam <- Rsamtools::ScanBamParam(which = target_grng)
        • line 10: paramPileup <- Rsamtools::PileupParam(min_base_quality = min_base_quality,
        • line 11: max_depth = max_depth, min_mapq = min_mapq, min_nucleotide_depth = 0,
        • line 12: distinguish_strands = FALSE, distinguish_nucleotides = FALSE,
        • line 13: ignore_query_Ns = FALSE, include_deletions = FALSE,
        • line 14: include_insertions = FALSE, left_bins = NULL, query_bins = NULL,
        • line 15: cycle_bins = NULL)
        • line 16: depth_mtrx <- mclapply(X = bam_files, mc.cores = N_cores,
        • line 17: FUN = get_depth_Rsamtools, scanBamParam = paramScanBam,
        • line 18: pileupParam = paramPileup) %>% do.call(cbind, .)
        • line 19: return(depth_mtrx)
      • in get_depth_matrix_Rsamtools_gp
        • line 10: target_grng <- data.frame(chr = coord_lst$chr, start = coord_lst$start,
        • line 11: end = coord_lst$end) %>% makeGRangesFromDataFrame()
        • line 12: paramScanBam <- Rsamtools::ScanBamParam(which = target_grng)
        • line 13: paramPileup <- Rsamtools::PileupParam(min_base_quality = min_base_quality,
        • line 14: max_depth = max_depth, min_mapq = min_mapq, min_nucleotide_depth = 0,
        • line 15: distinguish_strands = FALSE, distinguish_nucleotides = FALSE,
        • line 16: ignore_query_Ns = FALSE, include_deletions = FALSE,
        • line 17: include_insertions = FALSE, left_bins = NULL, query_bins = NULL,
        • line 18: cycle_bins = NULL)
        • line 19: depth_mtrx <- mclapply(X = bam_files, mc.cores = N_cores,
        • line 20: FUN = get_depth_Rsamtools, scanBamParam = paramScanBam,
        • line 21: pileupParam = paramPileup) %>% do.call(cbind, .)
        • line 22: return(depth_mtrx)
          •  incorporated get_depth_matrix_Rsamtools_gp into get_depth_matrix_Rsamtools
            
    • repetition in get_depth_matrix_samtools, and get_depth_matrix_samtools_gp

      • in get_depth_matrix_samtools
        • line 1: function (bam_files, target_virus_name, N_cores = min(10,
        • line 2: detectCores()), min_mapq = 30, min_base_quality = 0,
        • line 3: modules = NULL, envs = NULL, tmpdir = tempdir(), samtools = NULL)
        • line 4: {
      • in get_depth_matrix_samtools_gp
        • line 1: function (bam_files, coord, N_cores = detectCores(), min_mapq = 30,
        • line 2: min_base_quality = 0, modules = NULL, envs = NULL, tmpdir = tempdir(),
        • line 3: samtools = NULL)
        • line 4: {
          •  incorporated get_depth_matrix_samtools_gp into get_depth_matrix_samtools
            
    • repetition in get_depth_samtools, and get_depth_samtools_gp

      • in get_depth_samtools
        • line 5: depth_bed_fn <- glue("{tmpdir}/{UUIDgenerate()}_{vec_i}.depth.bed")
        • line 6: run_samtools(bash_script_base = bash_script_base, command = glue("depth -a -r '{target_virus_name}' --min-MQ {min_mapq} --min-BQ {min_base_quality} -g 256 {bam_fn}"),
        • line 7: output_name = depth_bed_fn, samtools = samtools, depth_count_only = TRUE)
        • line 8: depth_bed <- unlist(read.csv(depth_bed_fn, sep = "\t", header = FALSE),
        • line 9: use.names = FALSE)
        • line 10: sanity_check(depth_bed_fn)
        • line 11: system2(command = "rm", args = depth_bed_fn)
        • line 12: return(depth_bed)
      • in get_depth_samtools_gp
        • line 11: depth_bed_fn <- glue("{tmpdir}/{UUIDgenerate()}_{vec_i}.depth.bed")
        • line 12: run_samtools(bash_script_base, command <- glue("depth -a -r '{coord_lst$chr}:{coord_lst$start}-{coord_lst$end}' --min-MQ {min_mapq} --min-BQ {min_base_quality} -g 256 {bam_fn}"),
        • line 13: output_name <- depth_bed_fn, samtools <- samtools, depth_count_only <- TRUE)
        • line 14: depth_bed <- unlist(read.csv(depth_bed_fn, sep = "\t", header = FALSE),
        • line 15: use.names = FALSE)
        • line 16: sanity_check(depth_bed_fn)
        • line 17: system2(command = "rm", args = depth_bed_fn)
        • line 18: return(depth_bed)
          •  incorporated get_depth_samtools_gp into get_depth_samtools
            
    • repetition in get_gene_anno_plot, and get_gene_rnt

      • in get_gene_anno_plot
        • line 3:{
        • line 4: mc <- match.call()
        • line 5: encl <- parent.env(environment())
        • line 6: called_args <- as.list(mc)[-1]
        • line 7: default_args <- encl$_default_args
        • line 8: default_args <- default_args[setdiff(names(default_args),
        • line 9: names(called_args))]
        • line 10: called_args[encl$_omit_args] <- NULL
        • line 11: args <- c(lapply(called_args, eval, parent.frame()), lapply(default_args,
        • line 12: eval, envir = environment()))
        • line 13: key <- encl$_hash(c(encl$_f_hash, args, lapply(encl$_additional,
        • line 14: function(x) eval(x[[2L]], environment(x)))))
        • line 15: res <- encl$_cache$get(key)
        • line 16: if (inherits(res, "key_missing")) {
        • line 17: mc[[1L]] <- encl$_f
        • line 18: res <- withVisible(eval(mc, parent.frame()))
        • line 19: encl$_cache$set(key, res)
        • line 20: }
        • line 21: if (res$visible) {
        • line 22: res$value
        • line 23: }
        • line 24: else {
        • line 25: invisible(res$value)
        • line 26: }
        • line 27:}, memoised = TRUE, class = c("memoised", "function"))), where = "namespace:ELViS",
      • in get_gene_rnt
        • line 3:{
        • line 4: mc <- match.call()
        • line 5: encl <- parent.env(environment())
        • line 6: called_args <- as.list(mc)[-1]
        • line 7: default_args <- encl$_default_args
        • line 8: default_args <- default_args[setdiff(names(default_args),
        • line 9: names(called_args))]
        • line 10: called_args[encl$_omit_args] <- NULL
        • line 11: args <- c(lapply(called_args, eval, parent.frame()), lapply(default_args,
        • line 12: eval, envir = environment()))
        • line 13: key <- encl$_hash(c(encl$_f_hash, args, lapply(encl$_additional,
        • line 14: function(x) eval(x[[2L]], environment(x)))))
        • line 15: res <- encl$_cache$get(key)
        • line 16: if (inherits(res, "key_missing")) {
        • line 17: mc[[1L]] <- encl$_f
        • line 18: res <- withVisible(eval(mc, parent.frame()))
        • line 19: encl$_cache$set(key, res)
        • line 20: }
        • line 21: if (res$visible) {
        • line 22: res$value
        • line 23: }
        • line 24: else {
        • line 25: invisible(res$value)
        • line 26: }
        • line 27:}, memoised = TRUE, class = c("memoised", "function"))), where = "namespace:ELViS",
          •  This code does not exist in the ELViS source code. It was probably generated by the following memoise call:
            
             get_gene_rnt <- memoise(get_gene_rnt_ori)
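            
             For clarity, memoise() wraps a function in a generic caching closure, which is why the two deparsed bodies above look identical even though the underlying functions differ; a toy example:
            
               library(memoise)
               slow_square <- function(x) { Sys.sleep(1); x^2 }
               fast_square <- memoise(slow_square)
               fast_square(3)   # computed once
               fast_square(3)   # returned instantly from the cache
               # deparse(fast_square) shows the memoise wrapper, not slow_square's body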
            
    • repetition in get_gene_anno_plot_ori, and get_gene_rnt_ori

      • in get_gene_anno_plot_ori
        • line 67: is_custom_palette <- FALSE
        • line 68: if (length(col_pal) >= length(Gene_levels)) {
        • line 69: is_custom_palette <- TRUE
        • line 70: if (is.null(names(col_pal)) || length(setdiff(cds_plotdata$Gene,
        • line 71: names(col_pal))) != 0) {
        • line 72: col_pal_fin <- structure(col_pal[seq_len(length(Gene_levels))],
        • line 73: names = Gene_levels)
        • line 74: }
        • line 75: else {
        • line 76: col_pal_fin <- col_pal
        • line 77: }
        • line 78: }
      • in get_gene_rnt_ori
        • line 12: is_custom_palette <- FALSE
        • line 13: if (length(col_pal_gene) >= length(Gene_levels)) {
        • line 14: is_custom_palette <- TRUE
        • line 15: if (is.null(names(col_pal_gene)) || length(setdiff(Gene_levels,
        • line 16: names(col_pal_gene))) != 0) {
        • line 17: col_pal_fin <- structure(col_pal_gene[seq_len(length(Gene_levels))],
        • line 18: names = Gene_levels)
        • line 19: }
        • line 20: else {
        • line 21: col_pal_fin <- col_pal_gene
        • line 22: }
        • line 23: }
          •  changed the repeated code to
            
                         pal_out <- make_col_pal_fin_gene(col_pal_gene=col_pal,Gene_levels=Gene_levels)
                         is_custom_palette <- pal_out$is_custom_palette
                         col_pal_fin <- pal_out$col_pal_fin
            
    • repetition in get_window_v1, and get_window2

      • in get_window_v1
        • line 5: win <- matrix(c(1, rep(ips, each = 2), dim(Y)[1]), ncol = 2,
        • line 6: byrow = TRUE)
        • line 7: win <- win[which((win[, 2] - win[, 1]) > 10), ]
        • line 8: return(win)
      • in get_window2
        • line 5: 0)
        • line 6: win <- matrix(c(1, rep(ips, each = 2), d), ncol = 2, byrow = TRUE)
        • line 7: win <- win[which((win[, 2] - win[, 1]) > min.length), ]
        • line 8: return(win)
          •  changed them to
            
                    get_window_v1 <- function(Y,sam,cutoff=1,min.length=10) {
                        win = get_window_core(y=Y[,sam],cutoff=cutoff,min.length=min.length)
                        return(win)
                    }
            
                    #get_window2
                    get_window_core <- function(y,cutoff=1,min.length=10) {
                        y_ofs = y-cutoff
                        d <- length(y)
                        ips <- which(y_ofs[seq_len(d-1)]*y_ofs[2:d]<=0)
                        win <- matrix(c(1,rep(ips,each=2),d),ncol=2,byrow=TRUE)
                        win <- win[which((win[,2] - win[,1])>min.length),]
                        return(win)
                    }
            

Documentation

  • [ ] Note: Consider to include a package man page.
    • Added a package man page, named man/ELViS.Rd
      
  • [ ] Important: Consider to include a readme file for all extdata.
    • There was already a README file for extdata in inst/scripts/README_extdata.txt,
      following the guidance in https://contributions.bioconductor.org/docs.html#doc-inst-scripts
      
      Here is its content:
      
      
      # inst/scripts/README_extdata.txt 
      
      This file contains the explanation of the files in extdata
      
      1. HPV16 Reference Files
      The following reference files are included in extdata directory of ELViS.
      
      inst/extdata/HPV16.fa
      inst/extdata/HPV16.fa.fai
      inst/extdata/HPV16REF_PaVE.gff
      HPV16.fa is a reference sequence file with the associated fasta index file HPV16.fa.fai
      HPV16REF_PaVE.gff is a viral gene annotation file in GFF3 format.
      They were downloaded from the PaVE database (https://pave.niaid.nih.gov/locus_viewer?seq_id=HPV16REF)
      
      2. BAM Files
      The following BAM files are included in extdata directory of ELViS.
      
      Control_100X_1102.bam
      Control_100X_1102.bam.bai
      Control_100X_1119.bam
      Control_100X_1119.bam.bai
      
      These are simulation data generated using w-Wessim2 (https://github.com/GeorgetteTanner/w-Wessim2).
      Briefly, sequencing reads were simulated with this tool and aligned to HPV16.fa to generate the BAM files.
      
      
  • [ ] Note: Vignette should use BiocStyle package for formatting.
    • rmd file vignettes/ELViSPrecisely_Toy_Example.Rmd
      •  Changed the vignette output to BiocStyle:
        
           output:
             BiocStyle::html_document:
               toc_float: true
             BiocStyle::pdf_document: default
  • [ ] Important: Vignette should have an Introduction section. Please move the install paragraph to a new section.
    • rmd file vignettes/ELViSPrecisely_Toy_Example.Rmd
      •  Added an Introduction section:
        
           # 1. Introduction
      
  • [ ] Important: In sample code, tmpdir should be output of tempdir.
    • rmd file vignettes/ELViSPrecisely_Toy_Example.Rmd
      •  changed it to
        
         tmpdir=tempdir()
        
  • [ ] Important: Remove the section about installation from github.
    • rmd file vignettes/ELViSPrecisely_Toy_Example.Rmd
      •  removed installation from github
        
  • [ ] Note: Vignette includes motivation for submitting to Bioconductor as part of the abstract/intro of the main vignette.
    • rmd file vignettes/ELViSPrecisely_Toy_Example.Rmd
      •  Added Motivation for submitting to Bioconductor to the vignette
        
         ## 1.1 Motivation for submitting to Bioconductor
        
  • [ ] Important: Ensure the news file is updated to the latest version available.
    •  Updated NEWS.md up to 0.99.7
      
  • [ ] Note: typos:
    
        WORD         FOUND IN
        annotaiton   plot_pileUp_multisample.Rd:41
        FInal        ELViS_toy_run_result.Rd:13
        Indecate     get_new_baseline.Rd:12
        modulefile   get_depth_matrix.Rd:38
        THe          run_ELViS.Rd:17
    •  Fixed them all except for "modulefile", which is not a typo: a modulefile is a specific
       file format, as described in https://modules.readthedocs.io/en/stable/modulefile.html

JYLeeBioinfo avatar Feb 12 '25 00:02 JYLeeBioinfo

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 24.04.1 LTS): ELViS_0.99.7.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/ELViS to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot avatar Feb 12 '25 00:02 bioc-issue-bot

Package 'ELViS' Review

It is almost there.
Code: Note: please consider; Important: must be addressed.

Documentation

  • [ ] Important: Vignette should have an Installation section.
    • rmd file vignettes/ELViSPrecisely_Toy_Example.Rmd
  • [ ] Important: Please include Bioconductor installation instructions using BiocManager.
    • rmd file vignettes/ELViSPrecisely_Toy_Example.Rmd
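      The standard BiocManager installation chunk for a Bioconductor package typically looks like this (for reference; ELViS will be installable this way once it is in Bioconductor):
    
        if (!requireNamespace("BiocManager", quietly = TRUE))
            install.packages("BiocManager")
        BiocManager::install("ELViS")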

jianhong avatar Feb 12 '25 13:02 jianhong