smoove
lumpy-filter removes too many reads
Hi, when I ran the first command, lumpy-filter removed more than 90% of the reads in my BAM file. Is there something wrong with my original BAM file? Is it possible to change the filtering parameters? The process log is as follows:

$ smoove call --outdir results/ --name ZWG7 --fasta GCA_001704415.1_ARS1_genomic.fna -p 1 --genotype ZWG7.sorted.uniqe.dedup.bam
[smoove] 2022/10/10 09:20:44 starting with version 0.2.8
[smoove] 2022/10/10 09:20:50 calculating bam stats for 1 bams
[smoove] 2022/10/10 09:21:54 done calculating bam stats
[smoove]: 2022/10/10 09:26:57 finished process: lumpy-filter (set -eu; lumpy_filter -f /data/liyf/reference/GCA_001704415.1_ARS1_genomic.fna /data/liyf/data) in user-time:10m19.856767s system-time:46.940863s
[smoove] 2022/10/10 09:49:14 removed 287488 alignments out of 2188411 (13.14%) with low mapq, depth > 1000, or from excluded chroms from ZWG7.disc.bam in 1337 seconds
[smoove] 2022/10/10 09:49:14 removed 341630 alignments out of 2188411 (15.61%) that were bad interchromosomals or flanked-splitters from ZWG7.disc.bam
[smoove] 2022/10/10 09:50:02 kept 8787 putative orphans
[smoove] 2022/10/10 09:50:02 removed 499606 discordant orphans in 28 seconds
[smoove] 2022/10/10 09:50:18 removed 1469059 singletons and isolated interchromosomals of 1559293 reads (94.21%) from ZWG7.disc.bam in 64 seconds
[smoove] 2022/10/10 09:50:18 90234 reads (4.12%) of the original 2188411 remain from ZWG7.disc.bam
[smoove] 2022/10/10 09:59:40 removed 16755 alignments out of 141102 (11.87%) with low mapq, depth > 1000, or from excluded chroms from ZWG7.split.bam in 560 seconds
[smoove] 2022/10/10 09:59:41 removed 32891 alignments out of 141102 (23.31%) that were bad interchromosomals or flanked-splitters from ZWG7.split.bam
[smoove] 2022/10/10 09:59:44 kept 777 putative orphans
[smoove] 2022/10/10 09:59:44 removed 86 split orphans in 1 seconds
[smoove] 2022/10/10 09:59:52 removed 88760 singletons of 91456 reads (97.05%) from ZWG7.split.bam in 11 seconds
[smoove] 2022/10/10 09:59:52 2696 reads (1.91%) of the original 141102 remain from ZWG7.split.bam
[smoove] 2022/10/10 09:59:57 starting lumpy
[smoove] 2022/10/10 09:59:57 wrote lumpy command to results//ZWG7-lumpy-cmd.sh
[smoove] 2022/10/10 09:59:57 writing sorted, indexed file to results/ZWG7-smoove.genotyped.vcf.gz
[smoove] 2022/10/10 09:59:57 excluding variants with all unknown or homozygous reference genotypes
That's a bit high, but not unexpected. If all of those are left in, they'll result in spurious calls.
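For anyone reading the log above, it may help to see how the percentages relate to each other. The counts are consistent with the singleton filter being applied to the reads that survived the two earlier filters (2188411 - 287488 - 341630 = 1559293 for ZWG7.disc.bam), so the 94.21% is relative to that stage while the final 4.12% is relative to the original total. The sketch below just redoes that arithmetic with the numbers copied from the log; `filter_summary` is a hypothetical helper written for this illustration, not part of smoove.

```python
def filter_summary(total, low_quality, bad_interchrom, singletons):
    """Recompute the stage-by-stage counts reported in the smoove log.

    All arguments are read counts taken from the log lines; this helper
    is illustrative only and does not call smoove or lumpy.
    """
    # Reads left after the low-mapq/depth filter and the
    # bad-interchromosomal/flanked-splitter filter.
    after_early = total - low_quality - bad_interchrom
    # Reads left after singletons (and isolated interchromosomals) are dropped.
    remaining = after_early - singletons
    return {
        "after_early_filters": after_early,
        "singleton_pct_of_stage": round(100 * singletons / after_early, 2),
        "remaining": remaining,
        "remaining_pct_of_total": round(100 * remaining / total, 2),
    }


# Numbers copied from the log for ZWG7.disc.bam and ZWG7.split.bam.
disc = filter_summary(2188411, 287488, 341630, 1469059)
split = filter_summary(141102, 16755, 32891, 88760)

print("disc :", disc)
print("split:", split)
```

Both files reproduce the logged values (90234 and 2696 reads remaining), which suggests the large percentage comes mostly from the singleton step rather than from the mapq or depth thresholds.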
Thanks for your reply.
Hi, I have a small question. How many samples could be merged together with the latest version of smoove (v0.2.8)? I am going to select ~500 high-depth samples (>15X) to call SVs, is it possible to merge all samples successfully?
Yes, 500 will probably work. It's simply using bcftools merge. Sometimes it can stall, but it's simple to merge the sample columns with a script if bcftools merge fails.
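As a rough idea of what "merge the sample columns with a script" could look like, here is a minimal sketch. It assumes every per-sample VCF contains exactly the same records in the same order (for example, because all samples were genotyped at the same set of sites) and that the FORMAT column matches across files; it keeps the header and the INFO column of the first file only. `paste_sample_columns` is a hypothetical helper, not something smoove or bcftools provides, and it works on lists of already-read lines to keep the example self-contained.

```python
from typing import List

FIXED_COLS = 9  # CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT


def paste_sample_columns(vcfs: List[List[str]]) -> List[str]:
    """Paste the sample columns of several single-site-list VCFs side by side.

    Each element of `vcfs` is the list of lines (without newlines) of one VCF.
    Assumption: all files have identical sites in identical order.
    """
    if not vcfs:
        return []
    # Keep the '##' meta lines of the first file only.
    out = [line for line in vcfs[0] if line.startswith("##")]
    # The '#CHROM' header row and the records are pasted column-wise.
    bodies = [[l for l in lines if not l.startswith("##")] for lines in vcfs]
    n_rows = len(bodies[0])
    if any(len(b) != n_rows for b in bodies):
        raise ValueError("VCFs have different numbers of records")
    for rows in zip(*bodies):
        fields = [r.split("\t") for r in rows]
        base = fields[0]
        for other in fields[1:]:
            # CHROM, POS, ID, REF, ALT and FORMAT must agree to paste safely.
            if other[:5] != base[:5] or other[8] != base[8]:
                raise ValueError(f"record mismatch at {base[0]}:{base[1]}")
        samples = [col for f in fields for col in f[FIXED_COLS:]]
        out.append("\t".join(base[:FIXED_COLS] + samples))
    return out


# Tiny made-up example with two single-sample VCFs.
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\t"
record = "1\t100\tsv1\tN\t<DEL>\t.\t.\tSVTYPE=DEL\tGT\t"
vcf_a = ["##fileformat=VCFv4.2", header + "S1", record + "0/1"]
vcf_b = ["##fileformat=VCFv4.2", header + "S2", record + "1/1"]
merged = paste_sample_columns([vcf_a, vcf_b])
print("\n".join(merged))
```

The helper raises an error on any site mismatch instead of trying to reconcile records, since pasting columns is only valid when the site lists are identical.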
I have the same issue. Is it possible to set the MAPQ and depth parameters ourselves?
I didn't set any parameters; I just merged all the output files.