feat: add tga cnvkit to gens
Description
This PR adds post-processing steps to CNVkit results from TGA to facilitate upload to GENS, which has previously only been possible for WGS via post-processing of the GATK CollectReadCounts output.
Because the gnomAD VCF is also required to create the BAF visualisation track in GENS, the config and the GENS rule assignment have been modified so that these rules and references can be used in TGA as well.
An additional small script was added to convert the CNVkit file tumor.merged.cnr into a GENS-accepted format at different resolutions.
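To make the conversion step easier to discuss, here is a minimal sketch of the idea, not the script from this PR. It assumes the `.cnr` file has the standard CNVkit `chromosome`, `start`, `end` and `log2` columns, and that GENS accepts tab-separated rows of the form `<prefix>_<chrom>  start  end  value`, with one prefix per resolution. The function name `cnr_to_gens_rows`, the prefix letters and the window sizes are all hypothetical.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical mapping of resolution prefix -> window size in bp.
# The values actually used by the PR are not shown in this description.
RESOLUTIONS = {"o": 100_000, "a": 25_000, "b": 5_000, "c": 1_000, "d": 100}


def cnr_to_gens_rows(cnr_text, resolutions=RESOLUTIONS):
    """Collapse CNVkit .cnr bins into one averaged point per window and resolution."""
    reader = csv.DictReader(io.StringIO(cnr_text), delimiter="\t")
    bins = [
        (row["chromosome"], int(row["start"]), int(row["end"]), float(row["log2"]))
        for row in reader
    ]
    rows = []
    for prefix, window in resolutions.items():
        grouped = defaultdict(list)
        for chrom, start, end, log2 in bins:
            midpoint = (start + end) // 2
            # Bins whose midpoints fall in the same window are averaged together.
            grouped[(chrom, midpoint // window)].append((midpoint, log2))
        for (chrom, _), members in grouped.items():
            position = round(mean(m for m, _ in members))
            value = mean(v for _, v in members)
            rows.append(f"{prefix}_{chrom}\t{position}\t{position + 1}\t{value:.4f}")
    return rows
```

Averaging per window keeps the coarse tracks light while the finest resolution stays close to the original target bins. Whether mean or median is the better summary for sparse TGA targets is open for discussion.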
This PR closes this issue: https://github.com/Clinical-Genomics/BALSAMIC/issues/1385
Open question to discuss: purity adjusted log2 coverage values
In this GENS post-processing I have also chosen to take the tumor purity from PureCN as input and use it to adjust the log2 coverage values, which makes fold-changes more visible in low-purity samples. I don't know whether this is recommended, but without it CNVs in low-purity samples would be quite difficult to observe.
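For the discussion, one common way to express such an adjustment is sketched below. It is an assumption, not necessarily the formula used in this PR. It treats the observed copy ratio as a mix of tumor and diploid normal cells, `observed = purity * tumor + (1 - purity) * 1`, and solves for the tumor ratio. The function name `purity_adjust_log2` and the `min_ratio` floor are hypothetical.

```python
import math


def purity_adjust_log2(log2_ratio, purity, min_ratio=1e-3):
    """Rescale an observed log2 ratio to the value expected at 100% tumor purity.

    Assumes a diploid normal contaminant and ignores tumor ploidy.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in the interval (0, 1]")
    observed_ratio = 2.0 ** log2_ratio
    tumor_ratio = (observed_ratio - (1.0 - purity)) / purity
    # Noise can push the corrected ratio to zero or below; clamp before taking the log.
    return math.log2(max(tumor_ratio, min_ratio))
```

As an example, a single-copy loss in a 50% pure sample shows up as a ratio of 0.75 (log2 about -0.42) and is rescaled to 0.5 (log2 of -1). Note that the same rescaling also amplifies noise, which is part of what needs discussing.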
This change requires further changes in CG
We need 2 changes as far as I can tell at the moment:
- New argument for TGA analyses: --gnomad-min-af5 to add rules for creating GENS output
- We need to remove or rewrite the filter for GENS upload, which at the moment only allows WGS to be uploaded from BALSAMIC
PR in CG: https://github.com/Clinical-Genomics/cg/pull/3361
Added
- Script to post-process CNVkit output to GENS-format
- DNAscope gnomad calling to TGA for GENS
Changed
- Parsing of GENS arguments changed to account for TGA
Documentation
- [ ] N/A
- [ ] Updated Balsamic documentation to reflect the changes as needed for this PR.
- [Document Name]
Tests
Feature Tests
- [ ] N/A
- [ ] Test [Description]
- [Screenshot]
Pipeline Integrity Tests
-
Report deliver (generation of the .hk file)
- [x] N/A
- [ ] Verified
-
TGA T/O Workflow
- [x] N/A
- [ ] Verified
-
TGA T/N Workflow
- [x] N/A
- [ ] Verified
-
UMI T/O Workflow
- [x] N/A
- [ ] Verified
-
UMI T/N Workflow
- [x] N/A
- [ ] Verified
-
WGS T/O Workflow
- [x] N/A
- [ ] Verified
-
WGS T/N Workflow
- [x] N/A
- [ ] Verified
-
QC Workflow
- [x] N/A
- [ ] Verified
-
PON Workflow
- [x] N/A
- [ ] Verified
Clinical Genomics Stockholm
Documentation
-
Atlas documentation
- [x] N/A
- [ ] Updated: [Link]
-
Web portal for Clinical Genomics
- [x] N/A
- [ ] Updated: [Link]
User Changes
- [x] N/A
- [x] This PR affects the output files or results.
- [ ] User feedback is considered unnecessary because [Justification].
- [x] Affected users have been included in the development process and given a chance to provide feedback. (Asked for feedback in ticket: https://clinical-scilifelab.supportsystem.com/scp/tickets.php?id=70903)
- [ ] Feedback led to changes:
Infrastructure Changes
-
Stored files in Housekeeper
- [x] N/A
- [ ] Updated: [Link]
-
CG (CLI and delivered/uploaded files)
- [x] N/A
- [ ] Updated: [Link]
-
Servers (configuration files on Hasta)
- [x] N/A
- [ ] Updated: [Link]
-
Scout interface
- [x] N/A
- [ ] Updated: [Link]
Checklist
[!IMPORTANT]
Ensure that all checkboxes below are ticked before merging.
For Developers
-
PR Description
- [ ] Provided a comprehensive description of the PR.
- [ ] Linked relevant user stories or issues to the PR.
-
Documentation
- [ ] Verified and updated documentation if necessary.
-
Tests
- [ ] Described and tested the functionality addressed in the PR.
- [ ] Ensured integration of the new code with existing workflows.
- [ ] Confirmed that meaningful unit tests were added for the changes introduced.
- [ ] Checked that the PR has successfully passed all relevant code smells and coverage checks.
-
Review
- [ ] Addressed and resolved all the feedback provided during the code review process.
- [ ] Obtained final approval from designated reviewers.
For Reviewers
-
Code
- [ ] Code implements the intended features or fixes the reported issue.
- [ ] Code follows the project's coding standards and style guide.
-
Documentation
- [ ] Pipeline changes are well-documented in the CHANGELOG and relevant documentation.
-
Tests
- [ ] The author provided a description of their manual testing, including consideration of edge cases and boundary conditions where applicable, with satisfactory results.
-
Review
- [ ] Confirmed that the developer has addressed all the comments during the code review.
At the moment the pipeline works for the TGA workflows, but I haven't verified all workflows yet, so for now this can be treated as a code review only.
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 99.49%. Comparing base (55dd650) to head (a420abb). Report is 1 commit behind head on update_cnvkit_pons.
Additional details and impacted files
@@ Coverage Diff @@
## update_cnvkit_pons #1448 +/- ##
===================================================
Coverage 99.48% 99.49%
===================================================
Files 40 40
Lines 1960 1976 +16
===================================================
+ Hits 1950 1966 +16
Misses 10 10
| Flag | Coverage Δ | |
|---|---|---|
| unittests | 99.49% <100.00%> (+<0.01%) | :arrow_up: |
Quality Gate passed
Issues
3 New issues
0 Accepted issues
Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code