SVTyper fails to genotype with hg38 alignments. --split_bam (-S) is deprecated

Open dantaki opened this issue 6 years ago • 2 comments

I am using Version: 0.1.2 of speedseq and version: v0.1.4 of svtyper

I have 3 bam files aligned with bwa mem to the hg38 reference. This is my speedseq command

speedseq sv \
-B NA19238.alt_bwamem_GRCh38DH.20150715.YRI.high_coverage_markedDupsRemoved.bam,NA19239.alt_bwamem_GRCh38DH.20150715.YRI.high_coverage_markedDupsRemoved.bam,NA19240.alt_bwamem_GRCh38DH.20150715.YRI.high_coverage_markedDupsRemoved.bam \
-D NA19238.alt_bwamem_GRCh38DH.20150715.YRI.high_coverage_markedDupsRemoved_disc.bam,NA19239.alt_bwamem_GRCh38DH.20150715.YRI.high_coverage_markedDupsRemoved_disc.bam,NA19240.alt_bwamem_GRCh38DH.20150715.YRI.high_coverage_markedDupsRemoved_disc.bam \
-S NA19238.alt_bwamem_GRCh38DH.20150715.YRI.high_coverage_markedDupsRemoved_splt.bam,NA19239.alt_bwamem_GRCh38DH.20150715.YRI.high_coverage_markedDupsRemoved_splt.bam,NA19240.alt_bwamem_GRCh38DH.20150715.YRI.high_coverage_markedDupsRemoved_splt.bam \
-R GRCh38_full_analysis_set_plus_decoy_hla.fa \
-o hg38_yri \
-x GRCh38-centromere-gaps-segdups-combined.bed \
-g \
-t 8

I generated the discordant and split read files with the following commands

samtools view -bh -@ 8 -F 1294 $BAM | samtools sort -@ 8 -o $DISC_BAM

samtools view -@ 8 -h $BAM | /home/usr/bin/speedseq/src/lumpy-sv/scripts/extractSplitReads_BwaMem -i stdin | samtools view -@ 8 -b | samtools sort -@ 8 -o $SPLIT_BAM
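For readers unfamiliar with the `-F 1294` mask in the discordant-read command above, here is a minimal sketch that decodes it using the flag bits defined in the SAM specification. The helper name `decode_flag_mask` and the short flag labels are illustrative only and are not part of samtools or speedseq.

```python
# SAM flag bits as defined in the SAM specification.
# The short labels are illustrative names chosen for this sketch.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag_mask(mask):
    """Return the labels of every SAM flag bit set in `mask`, lowest bit first."""
    return [name for bit, name in sorted(SAM_FLAGS.items()) if mask & bit]


# `samtools view -F 1294` drops any read that has at least one of these bits set.
print(decode_flag_mask(1294))
```

So the filter keeps reads that are mapped, have a mapped mate, are not in a proper pair, and are neither secondary alignments nor duplicates.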

I've run the speedseq command twice, each time with a different exclusion file. The exclusion file above combines hg38 centromeres, assembly gaps, and segmental duplications.

The second exclusion file was an hg38 lift-over of the hg19 lumpy exclusion file packaged with speedseq.

I get the same error for both jobs.

Here is the error message:

Warning: --split_bam (-S) is deprecated. Ignoring NA19238.alt_bwamem_GRCh38DH.20150715.YRI.high_coverage_markedDupsRemoved_splt.bam.
Calculating library metrics from NA19238.alt_bwamem_GRCh38DH.20150715.YRI.high_coverage_markedDupsRemoved.bam... done
slurmstepd: *** JOB 11382902 ON XXXX CANCELLED AT 2017-10-01T11:44:24 DUE TO TIME LIMIT ***

Note that the wall time limit was 48 hours.

Here's the STDOUT

LUMPY Express done
# genotype structural variants
python2.7 /home/usr/bin/speedseq/bin/svtyper -q -i hg38_yri.wAFJ3E8dZT5V/hg38_yri.sv.vcf -B NA19238.alt_bwamem_GRCh38DH.20150715.YRI.high_coverage_markedDupsRemoved.bam -S NA19238.alt_bwamem_GRCh38DH.20150715.YRI.high_coverage_markedDupsRemoved_splt.bam > hg38_yri.wAFJ3E8dZT5V/hg38_yri.sv.gt.vcf ; mv hg38_yri.wAFJ3E8dZT5V/hg38_yri.sv.gt.vcf hg38_yri.wAFJ3E8dZT5V/hg38_yri.sv.vcf

Here's an entry from SVTyper output

chr1    20712560        105     N       <DEL>   9.08    .       SVTYPE=DEL;SVLEN=-179;END=20712739;STRANDS=+-:5;CIPOS=-10,9;CIEND=-10,9;CIPOS95=0,0;CIEND95=0,0;SU=5;PE=0;SR=5  GT:SU:PE:SR:GQ:SQ:GL:DP:RO:AO:QR:QA:RS:AS:ASC:RP:AP:AB  0/1:4:0:4:9:9.08:-14,-13,-59:78:69:8:69:8:69:5:2:0:0:0.1        ./.:0:0:0:.:.:.:.:.:.:.:.:.:.:.:.:.:.   ./.:1:0:1:.:.:.:.:.:.:.:.:.:.:.:.:.:.

Note that only one sample is genotyped. This is the case for all variants in the VCF. Only the first sample is genotyped.
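A quick way to confirm this across a whole VCF is to check the GT field of each sample column. The sketch below does that for the single record shown above. The function name `genotyped_samples` is hypothetical, and the sample order (NA19238, NA19239, NA19240) is an assumption based on the order of the `-B` argument.

```python
import re

# The SVTyper record quoted above, rebuilt with tab separators.
record = "\t".join([
    "chr1", "20712560", "105", "N", "<DEL>", "9.08", ".",
    "SVTYPE=DEL;SVLEN=-179;END=20712739;STRANDS=+-:5;CIPOS=-10,9;"
    "CIEND=-10,9;CIPOS95=0,0;CIEND95=0,0;SU=5;PE=0;SR=5",
    "GT:SU:PE:SR:GQ:SQ:GL:DP:RO:AO:QR:QA:RS:AS:ASC:RP:AP:AB",
    "0/1:4:0:4:9:9.08:-14,-13,-59:78:69:8:69:8:69:5:2:0:0:0.1",
    "./.:0:0:0:.:.:.:.:.:.:.:.:.:.:.:.:.:.",
    "./.:1:0:1:.:.:.:.:.:.:.:.:.:.:.:.:.:.",
])

# Assumed sample order, taken from the -B argument of the speedseq command.
samples = ["NA19238", "NA19239", "NA19240"]


def genotyped_samples(vcf_line, sample_names):
    """Map each sample name to True if its GT has at least one called allele."""
    fields = vcf_line.rstrip("\n").split("\t")
    gt_index = fields[8].split(":").index("GT")  # column 9 is FORMAT
    status = {}
    for name, column in zip(sample_names, fields[9:]):
        alleles = re.split(r"[/|]", column.split(":")[gt_index])
        status[name] = any(allele != "." for allele in alleles)
    return status


print(genotyped_samples(record, samples))
```

Applied to every non-header line of the output VCF, the same check would show whether any record has a called genotype for the second or third sample.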

I have used these commands successfully in the past with hg19-aligned genomes. I'm not sure what is causing the messages I'm receiving.

  • Why am I receiving the "--split_bam (-S) is deprecated" warning?
  • Why is only one sample genotyped?
  • Is there an hg38 exclusion file that you recommend?

Thank you for your time

dantaki, Oct 02 '17 17:10

Hi, we are trying to use speedseq sv calling in our whole-genome analysis pipeline. We processed the HapMap sample NA12877. I get the same warning when I run speedseq sv on the BAM outputs from speedseq align: "Warning: --split_bam (-S) is deprecated. Ignoring NA12877_10X.splitters.bam." Is this a known issue?

Thank you, Sithara

Sithara85, Aug 14 '18 15:08

Hi,

I ran into the same warning. Is it a serious problem for CNV calling?

Thanks!

Elaine

Jia21, Nov 15 '19 19:11