BALSAMIC
[User Story] Update Sentieon
Need
As a hospital geneticist I want:
- As many true variants as possible, with as few false-positive variants as possible
- As fast as possible
- And as cheaply as possible
The new Sentieon version, 202308, contains changes that affect all of these aspects.
Our version in production: 202010.02
Some relevant changes since version 202010.02:
- Modified TNscope® AF output format.
- Solved an issue in duplex UMI consensus where results depended on internal data order.
- Improved speed of bwa mem.
- Solved an issue in TNscope where evidence from overlapping read pairs was not adequately accounted for.
- Solved an issue in duplex UMI consensus that could output zero-length reads.
- Added support in Dedup algorithm to perform UMI barcode error correction.
- Added support in LocusCollector and Dedup algorithms to perform consensus based deduplication as well as UMI barcode aware deduplication.
- Improved machine learning model for TNscope.
- Improved consensus of INDELs in Dedup algorithm.
- Improved DNAscope pipeline speed and accuracy with a BWA model.
In summary:
- Changes to variant callers could improve accuracy and sensitivity.
- Changes to bwa mem could decrease the turnaround time (TAT).
- Including UMIs in Dedup could save ~20% of the reads wrongly discarded as duplicates, and improve read quality (https://github.com/Clinical-Genomics/BALSAMIC/issues/1361)
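To illustrate why UMI-aware deduplication can keep reads that position-only deduplication throws away, here is a toy sketch. It is not Sentieon's Dedup algorithm: the `Read` record, the `dedup` function and the example reads are all hypothetical, and real UMI handling also involves barcode error correction and consensus building.

```python
from collections import namedtuple

# Hypothetical toy read record; real data would come from a BAM file.
Read = namedtuple("Read", ["name", "chrom", "start", "strand", "umi"])


def dedup(reads, use_umi):
    """Keep the first read seen for each duplicate key.

    Without UMIs the key is the mapping position and strand only.
    With UMIs the barcode is added to the key, so reads from different
    original molecules that map to the same position are kept apart.
    """
    seen = set()
    kept = []
    for read in reads:
        key = (read.chrom, read.start, read.strand)
        if use_umi:
            key += (read.umi,)
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


reads = [
    Read("r1", "chr1", 100, "+", "ACGT"),
    Read("r2", "chr1", 100, "+", "ACGT"),  # same position and UMI as r1
    Read("r3", "chr1", 100, "+", "TTAG"),  # same position, different UMI
    Read("r4", "chr1", 250, "-", "ACGT"),
]

position_only = dedup(reads, use_umi=False)
umi_aware = dedup(reads, use_umi=True)

print([r.name for r in position_only])  # ['r1', 'r4']
print([r.name for r in umi_aware])      # ['r1', 'r3', 'r4']
```

In this toy data, read `r3` is discarded by position-only deduplication but retained once the UMI is part of the duplicate key.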
Suggested approach
In an update-Sentieon PR, temporarily hard-code the path to the newest version of Sentieon, then:
- Ensure that all workflows are behaving as normal
- Perform a mini-validation in the PR: check that variant calling works as normal and that all expected files can be produced
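One possible shape for the mini-validation is to compare the variant calls produced by the production version against those produced by the new version on the same sample. The sketch below is an assumption about how that could be done with the Python standard library only; the function names and the inline VCF records are hypothetical and not part of BALSAMIC, and in practice the handles would be opened output files from the two runs.

```python
import io


def load_variants(handle):
    """Return the set of (chrom, pos, ref, alt) keys found in VCF text."""
    variants = set()
    for line in handle:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, _id, ref, alts = line.rstrip("\n").split("\t")[:5]
        # Split multi-allelic records into one key per ALT allele.
        for alt in alts.split(","):
            variants.add((chrom, int(pos), ref, alt))
    return variants


def compare_calls(old_handle, new_handle):
    """Summarise which variants are shared or unique to each run."""
    old = load_variants(old_handle)
    new = load_variants(new_handle)
    return {
        "shared": old & new,
        "only_old": old - new,
        "only_new": new - old,
    }


# Hypothetical inline data standing in for the two runs' VCF outputs.
header = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
)
old_vcf = io.StringIO(
    header
    + "chr1\t100\t.\tA\tT\t.\tPASS\t.\n"
    + "chr2\t200\t.\tG\tC\t.\tPASS\t.\n"
)
new_vcf = io.StringIO(
    header
    + "chr1\t100\t.\tA\tT\t.\tPASS\t.\n"
    + "chr3\t300\t.\tC\tCA\t.\tPASS\t.\n"
)

result = compare_calls(old_vcf, new_vcf)
for label, variants in result.items():
    print(label, sorted(variants))
```

Variants in `only_old` or `only_new` would be the ones to review by hand before deciding whether the new version behaves as expected.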
Considered alternatives
No response
Deviation
No response
System requirements assessed
- [ ] Yes, I have reviewed the system requirements
Requirements affected by this story
No response
Risk assessment needed
- [ ] Needed
- [ ] Not needed
Risk assessment
No response
SOUPs
No response
Can be closed when
- [ ] Sentieon has been updated to the most recent version, or to the desired new version.
Blockers
No response
Anything else?
Replaces feature issue: https://github.com/Clinical-Genomics/BALSAMIC/issues/1250