gatk-sv
A structural variation pipeline for short-read sequencing
This PR simplifies the Docker-related documentation sections and updates Docusaurus to its latest version. Specifically, it implements the following changes. - [x] Make Docker-related docs more focused by simplifying and...
Fixes #643 Adding the index in-place from the main file name. I have not found any other reference to SDtoBAF that needs to be updated.
## Bug Report ### Affected module(s) or script(s) `task SDtoBAF` in `wdl/BatchEvidenceMerging.wdl` ### Affected version(s) - Latest public release version [0.28.4-beta] ### Description Task fails from a GATK User Error...
The ploidy table is currently an intermediate output in JoinRawCalls. This file is required for further downstream analysis and is a necessary input for FilterGenotypes. Thus it should be a...
Adds a call to `ReshardVcf` at the end of `ResolveComplexVariants`. In addition, each of the `BOTHSIDES_PASS` and `HIGH_SR_BACKGROUND` contig-sharded variant tables is concatenated into a single genome-wide table prior to...
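The concatenation step described above can be sketched roughly as follows. This is only an illustration, not the workflow's actual task: the function name `concat_contig_tables` is hypothetical, and it assumes the per-contig tables are headerless text files that are passed in the desired contig order.

```python
from pathlib import Path
from typing import Iterable


def concat_contig_tables(shard_paths: Iterable[Path], out_path: Path) -> int:
    """Append each per-contig table, in the order given, to one output file.

    Assumes the tables have no header row (an assumption for this sketch).
    Returns the number of lines written.
    """
    lines_written = 0
    with out_path.open("w") as out:
        for shard in shard_paths:
            with shard.open() as fh:
                for line in fh:
                    # Guard against a shard whose last line lacks a newline,
                    # which would otherwise fuse two records together.
                    if not line.endswith("\n"):
                        line += "\n"
                    out.write(line)
                    lines_written += 1
    return lines_written
```

Passing the shards in contig order keeps the genome-wide table sorted the same way as its inputs; the sketch does no sorting or de-duplication of its own.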
A critical optimization to the `ParseGenotypes` task that reimplements `process_posthoc_cpx_depth_regenotyping.py` with greatly accelerated computations. The previous version of the script used many repetitive quadratic (N^2) commands that caused it to...
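As a generic illustration of why removing repeated quadratic scans helps, the sketch below contrasts a nested scan with a single-pass index. It is not taken from `process_posthoc_cpx_depth_regenotyping.py`; the function names and the `(variant_id, sample, value)` record layout are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# Hypothetical record layout for this sketch: (variant_id, sample, value)
Record = Tuple[str, str, float]


def group_by_scanning(variant_ids: Sequence[str],
                      records: Sequence[Record]) -> Dict[str, List[Record]]:
    """Quadratic pattern: rescans every record once per variant ID."""
    return {vid: [r for r in records if r[0] == vid] for vid in variant_ids}


def group_by_index(variant_ids: Sequence[str],
                   records: Sequence[Record]) -> Dict[str, List[Record]]:
    """Linear pattern: one pass builds an index, then each lookup is O(1)."""
    index: Dict[str, List[Record]] = defaultdict(list)
    for rec in records:
        index[rec[0]].append(rec)
    # .get() avoids inserting empty entries for IDs that have no records.
    return {vid: index.get(vid, []) for vid in variant_ids}
```

Both functions return the same mapping; the indexed version does work proportional to the number of records plus the number of variant IDs, rather than their product.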
**Description:** In this PR, I made significant changes to the WDL workflow responsible for analyzing batch effects in genomic data. Our primary objective was to simplify and optimize the pipeline...
[see line](https://github.com/broadinstitute/gatk-sv/blob/2c97aad67bd5e7fb0c7632fc0ed632444e6f624d/wdl/RecalibrateGq.wdl#L92) FilterGenotypes.wdl -> RecalibrateGq.wdl The RecalibrateGqTask contains a usage of `XGBoostMinGqVariantFilter`. The only reference to that tool in the GATK suite is [this un-merged PR](https://github.com/broadinstitute/gatk/pull/7705), and it is currently...
## Bug Report $ Rscript plot_sv_vcf_distribs.R -N $( cat 4388.nyuwa.after.change.lst | sort | uniq | wc -l ) -S SV_colors.txt nvwaCHBCHS.4388.genotype.vcffilter.vcf.gz.gz.stats plotQC_vcfwide_output/ Error in read.table(INFILE, comment.char = "", sep =...
Vapor output takes up a large amount of storage (~1.5 GiB per sample), 99% of which is the plots. These plots are not necessary to store for every sample, so to reduce...