[User Story] Improve CNV calling for targeted workflow
Need
As a clinical geneticist, I need an improved CNV workflow for targeted panel sequencing (UMI & non-UMI), in particular for cfDNA samples, to accurately detect genetic variations.
Suggested approach
- Incorporate new CNV tools for TGA analyses (WisecondorX, Control-FREEC, ...)
- Add new CNV tool for TGA+UMI analysis mcna
Considered alternatives
- Refine the current CNV calling workflow for cfDNA samples.
Deviation
No response
System requirements assessed
- [X] Yes, I have reviewed the system requirements
Requirements affected by this story
No response
Risk assessment needed
- [ ] Needed
- [X] Not needed
Risk assessment
No response
SOUPs
No response
Can be closed when
- [ ] The panel workflow of cfDNA samples demonstrates a reliable performance in accurately detecting CNVs.
Blockers
No response
Anything else?
No response
Any updates on this?
@zahrahaider, I could look into this. There is a new CNV calling tool called Jumble. In the meantime, it would help if you could share the specific region(s) and case(s) where you see a need for improvement, so we can look at them more closely and fix or improve the method.
Hi Khurram, the cases I am working on right now pertain to ticket #910093, where we ordered tumor-only analysis of cfDNA samples using a panel of normals (built on gDNA) for the GMS lymphoid panel 7.3. I took the .cns segment data from the BALSAMIC CNVkit output and ran it through GISTIC, where we repeatedly saw artefacts in chr19 and chr20, and amplifications in 8p24 in almost 75% of patients, which shouldn't be there. I am posting the GISTIC plots of the Amps/Dels that we see most frequently in our cohort. I would also like some help deciding on parameters for running GISTIC.
Refinement meeting comments:
- collect the regions and samples that show missing or artefactual calls
- investigate why these artefacts appear in the CNVkit output (is it PON related or a problem with the tool?)
- decide on new tools or updates to the PONs (such as cfDNA-specific PONs)
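As a starting point for collecting recurrent regions, here is a minimal sketch that reports the fraction of samples with a gained segment overlapping a given region. The `Segment` tuple, the `gain_recurrence` helper, the example coordinates and the `min_log2=0.2` gain threshold are all hypothetical and not part of BALSAMIC or CNVkit; a region that is gained in nearly every sample would be a candidate for a systematic (for example PON-related) artefact rather than a real event.

```python
from collections import namedtuple

# Hypothetical in-memory representation of one segment (e.g. one row of a .cns file).
Segment = namedtuple("Segment", "chrom start end log2")


def overlaps(seg, chrom, start, end):
    """True if the segment shares at least one base with chrom:start-end."""
    return seg.chrom == chrom and seg.start < end and seg.end > start


def gain_recurrence(samples, chrom, start, end, min_log2=0.2):
    """Fraction of samples with >= 1 gained segment overlapping the region.

    `samples` maps a sample name to a list of Segment tuples.
    `min_log2` is an assumed, illustrative gain threshold.
    """
    if not samples:
        return 0.0
    hits = sum(
        any(overlaps(s, chrom, start, end) and s.log2 >= min_log2 for s in segs)
        for segs in samples.values()
    )
    return hits / len(samples)


# Toy data with made-up coordinates, only to show the call.
toy = {
    "case1": [Segment("chr19", 0, 1_000_000, 0.45)],
    "case2": [Segment("chr19", 0, 1_000_000, 0.02)],
    "case3": [Segment("chr19", 500_000, 2_000_000, 0.30)],
    "case4": [Segment("chr20", 0, 1_000_000, 0.60)],
}
print(gain_recurrence(toy, "chr19", 100_000, 900_000))
```

Running the same check with a negated threshold on losses, or per chromosome arm, would follow the same pattern.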
To resolve the issue we looked at the CNV analysis and identified the following:
- CNV segments from an intermediate step of the CNV analysis (the *.cns file from CNVkit) were used as input for GISTIC
- The purpose was to combine the CNV calls across all the samples in the cohort and filter them further using the GISTIC method
- The final CNV calls from the CNVkit VCF were not considered
We proposed the following immediate solution:
- Final CNV calls from *..svdb.clinical.filtered.pass.vcf.gz should be considered as an initial filtered set of CNVs.
- The segments from the intermediate step differed from the final filtered CNV calls, as shown in the table below.
| Case | All segments from .cns | CNV segments from *..svdb.clinical.filtered.pass.vcf.gz |
|---|---|---|
| 1 | 70 | 27 |
| 2 | 67 | 65 |
| 3 | 75 | 27 |
| 4 | 60 | 58 |
| 5 | 55 | 54 |
| 6 | 66 | 13 |
| 7 | 63 | 21 |
| 8 | 71 | 69 |
| 9 | 53 | 53 |
| 10 | 63 | 60 |
| 11 | 59 | 15 |
| 12 | 70 | 68 |
| 13 | 55 | 55 |
| 14 | 66 | 63 |
| 15 | 60 | 59 |
| 16 | 79 | 24 |
| 17 | 59 | 57 |
| 18 | 72 | 70 |
| 19 | 59 | 58 |
| 20 | 76 | 45 |
| 21 | 66 | 65 |
| 22 | 59 | 59 |
| 23 | 62 | 9 |
| 24 | 59 | 14 |
| 25 | 61 | 60 |
| 26 | 75 | 73 |
| 27 | 106 | 74 |
| 28 | 61 | 10 |
| 29 | 61 | 61 |
| 30 | 80 | 26 |
| 31 | 59 | 58 |
| 32 | 82 | 79 |
| 33 | 79 | 75 |
| 34 | 62 | 61 |
I hope this solves the issue with the artefacts mentioned above.
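For anyone who wants to reproduce per-case counts like those in the table above, here is a minimal sketch. It assumes the .cns file is tab-separated with a single header line and that every non-header record in the filtered VCF counts as one CNV call; the helper names are hypothetical, and if the VCF also contains other SV types it would need extra filtering.

```python
import gzip
import io


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    path = str(path)
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def count_cns_segments(handle):
    """Count segment rows in a .cns file (assumes one header line)."""
    rows = [line for line in handle if line.strip()]
    return max(len(rows) - 1, 0)


def count_vcf_records(handle):
    """Count VCF records, skipping '#' header lines and blank lines."""
    return sum(1 for line in handle if line.strip() and not line.startswith("#"))


# Toy in-memory inputs with made-up values, only to show the calls.
cns_text = (
    "chromosome\tstart\tend\tgene\tlog2\n"
    "chr1\t1000\t50000\t-\t0.12\n"
    "chr8\t2000\t90000\t-\t0.65\n"
    "chr19\t500\t70000\t-\t-0.40\n"
)
vcf_text = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr8\t2000\t.\tN\t<DUP>\t.\tPASS\tEND=90000\n"
)
print(count_cns_segments(io.StringIO(cns_text)), count_vcf_records(io.StringIO(vcf_text)))
```

With real files, the handles would come from `open_text(...)` on the case's .cns file and on its filtered VCF.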