Pisces
Pisces copied to clipboard
Option to allow non-anchored indels?
Hi Tamsen, I've been testing validation datasets using Pisces and the results are very promising, however I have run into an issue where Pisces will complain about indels which have no anchor, stemming from this check here . I believe these reads are valid as they are supported by other reads which do have anchors and mirror the non-anchored reads. Would there be any way short of removing these reads or manually rewriting them to have an anchor for Pisces to accept them as valid? Thank you, Deric
Hi there,
Yes, I talked to the team and I think we can do this for you with our next release, but we need to test it internally first. We had a few questions: How often does this happen in your data? Is it frequent? We wanted to know if logging a warning rather than throwing would be better? Downside to a warning: If this happens a lot in your data, your log will be huge. Is it possible to send us some sample data? How did you align these unanchored indel reads in the first place?
thanks! Tamsen
Hi Tamsen, Thats great, thanks for the fast response! With regard to your questions:
- So far its only happened to one sample but I haven't tried running Pisces on that many samples so I can't really speak to the frequency.
- Logging a warning would be preferable to completely halting the program.
- The nonanchored indels with cigar 151I only occurred 6 times in the entire alignment, so I don't think spamming the log will be an issue.
- The sample data is publicly available here: ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/use_cases/mixtures/UMCUTRECHT_NA12878_NA24385_mixture_10052016/
- The samples were aligned using bwa-mem. I'm not entirely certain how it decided the alignments of unanchored reads but it might have something to do with the reads being paired end.
Thank you, Deric
Hi Deric,
I see fastq on the public data. For completeness, can you pls send me the exact bwa-mem cmd line used to make the bams you have? Testing the issue now
Thank you
Hi Tamsen, No problem, here is the command I used. I should mention that I used the sentieon implementation of bwa mem so the parameters may be slightly differently named. This was aligned to reference hs37d5.
${SENTIEON_BIN}/bwa mem \
-t 30 \
-M \
-R "@RG\\tID:NA12878_NA24385\\tPL:ILLUMINA\\tSM:NA12878_NA24385" \
genome.fa \
NA12878_NA24385_R1.fastq \
NA12878_NA24385_R2.fastq | \
\${SENTIEON_BIN}/sentieon util sort \
-o NA12878_NA24385.bam \
-t 30 \
--sam2bam \
-i -
thank you!
Update, this is checked in now to our internal dev branch. In our next release (5.2.10) we will allow unanchored indels. We will just warn and continue processing.
Thanks, I really appreciate you taking the time to look into this and add this change!
hi! this issue should now be resolved in 5.2.10. Let us know if you still have issues.