feat: add tga cnvkit to gens

Open mathiasbio opened this issue 1 year ago • 3 comments

Description

This PR adds post-processing steps to CNVkit results from TGA to facilitate upload to GENS, which has previously only been possible for WGS via post-processing of the GATK CollectReadCounts output.

As the gnomAD VCF is also required to create the BAF visualisation track in GENS, the config and the GENS rule assignment have been modified so that these rules and references can be used in TGA as well.

An additional small script was added to convert the CNVkit file tumor.merged.cnr into a GENS-accepted format at different resolutions.
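
To illustrate the kind of conversion described here, below is a minimal sketch and not the script added in this PR. It reads the standard CNVkit `.cnr` columns (`chromosome`, `start`, `end`, `log2`) and averages log2 values in fixed-size windows at several resolutions. The function name `cnr_to_windows`, the resolution labels, the window sizes, and the `label_chromosome` output layout are all assumptions made for illustration; the actual format GENS accepts should be checked against the GENS documentation.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical resolution labels mapped to window sizes in bp (illustrative values only).
RESOLUTIONS = {"o": 100_000, "a": 25_000}


def cnr_to_windows(cnr_text, resolutions=RESOLUTIONS):
    """Average CNVkit log2 values per fixed-size window at each resolution.

    Returns tab-separated lines: <label>_<chrom>, window start, window end, mean log2.
    The output layout is an assumption, not the GENS specification.
    """
    reader = csv.DictReader(io.StringIO(cnr_text), delimiter="\t")
    grouped = defaultdict(list)
    for row in reader:
        # Assign each bin to the window that contains its midpoint.
        midpoint = (int(row["start"]) + int(row["end"])) // 2
        for label, size in resolutions.items():
            window_start = (midpoint // size) * size
            key = (label, row["chromosome"], window_start, size)
            grouped[key].append(float(row["log2"]))

    lines = []
    for (label, chrom, window_start, size), values in grouped.items():
        window_end = window_start + size
        lines.append(f"{label}_{chrom}\t{window_start}\t{window_end}\t{mean(values):.4f}")
    return lines
```

Reading the columns by header name rather than by position keeps the sketch independent of the exact column order in the `.cnr` file.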

This PR closes this issue: https://github.com/Clinical-Genomics/BALSAMIC/issues/1385

Open question to discuss: purity adjusted log2 coverage values

In this GENS post-processing I have also decided to take the tumor purity from PureCN as input and use it to adjust the log2 coverage values, making the fold-changes more visible in low-purity samples. I don't know if this is recommended; however, CNVs in low-purity samples would be quite difficult to observe without it.
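
For discussion, one common way to make such an adjustment is sketched below. It may differ from what this PR implements. It assumes the observed coverage ratio is a purity-weighted mix of the tumor ratio and a normal ratio of 1, and solves for the tumor ratio. The function name `purity_adjust_log2` and the `floor` parameter are hypothetical.

```python
import math


def purity_adjust_log2(log2_ratio, purity, floor=1e-3):
    """Rescale an observed log2 ratio to the value expected at 100% tumor purity.

    Assumed model: observed = purity * tumor + (1 - purity) * 1.0
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in the interval (0, 1]")
    observed = 2.0 ** log2_ratio
    tumor = (observed - (1.0 - purity)) / purity
    # Noise can push the estimate to zero or below; clamp before taking the log.
    return math.log2(max(tumor, floor))
```

Under this model, a single-copy loss in a 50% pure sample has an observed ratio of 0.75 (log2 of about -0.415), which is rescaled to a ratio of 0.5 (log2 of -1). Note that the rescaling amplifies noise as well as signal, which is part of why the question of whether to apply it is worth discussing.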

This change requires further changes in CG

We need 2 changes as far as I can tell at the moment:

  1. New argument for TGA analyses: --gnomad-min-af5 to add rules for creating GENS output
  2. We need to remove or rewrite the filter for GENS upload, which at the moment only allows WGS to be uploaded from BALSAMIC

PR in CG: https://github.com/Clinical-Genomics/cg/pull/3361

Added

  • Script to post-process CNVkit output to GENS-format
  • DNAscope gnomad calling to TGA for GENS

Changed

  • Parsing of GENS arguments changed to account for TGA

Documentation

  • [ ] N/A
  • [ ] Updated Balsamic documentation to reflect the changes as needed for this PR.
    • [Document Name]

Tests

Feature Tests

  • [ ] N/A
  • [ ] Test [Description]
    • [Screenshot]

Pipeline Integrity Tests

  • Report deliver (generation of the .hk file)
    • [x] N/A
    • [ ] Verified
  • TGA T/O Workflow
    • [x] N/A
    • [ ] Verified
  • TGA T/N Workflow
    • [x] N/A
    • [ ] Verified
  • UMI T/O Workflow
    • [x] N/A
    • [ ] Verified
  • UMI T/N Workflow
    • [x] N/A
    • [ ] Verified
  • WGS T/O Workflow
    • [x] N/A
    • [ ] Verified
  • WGS T/N Workflow
    • [x] N/A
    • [ ] Verified
  • QC Workflow
    • [x] N/A
    • [ ] Verified
  • PON Workflow
    • [x] N/A
    • [ ] Verified

Clinical Genomics Stockholm

Documentation

  • Atlas documentation
    • [x] N/A
    • [ ] Updated: [Link]
  • Web portal for Clinical Genomics
    • [x] N/A
    • [ ] Updated: [Link]

User Changes

  • [x] N/A
  • [x] This PR affects the output files or results.
    • [ ] User feedback is considered unnecessary because [Justification].
    • [x] Affected users have been included in the development process and given a chance to provide feedback. (Asked for feedback in ticket: https://clinical-scilifelab.supportsystem.com/scp/tickets.php?id=70903)
    • [ ] Feedback led to changes:

Infrastructure Changes

  • Stored files in Housekeeper
    • [x] N/A
    • [ ] Updated: [Link]
  • CG (CLI and delivered/uploaded files)
    • [x] N/A
    • [ ] Updated: [Link]
  • Servers (configuration files on Hasta)
    • [x] N/A
    • [ ] Updated: [Link]
  • Scout interface
    • [x] N/A
    • [ ] Updated: [Link]

Checklist

[!IMPORTANT]
Ensure that all checkboxes below are ticked before merging.

For Developers

  • PR Description
    • [ ] Provided a comprehensive description of the PR.
    • [ ] Linked relevant user stories or issues to the PR.
  • Documentation
    • [ ] Verified and updated documentation if necessary.
  • Tests
    • [ ] Described and tested the functionality addressed in the PR.
    • [ ] Ensured integration of the new code with existing workflows.
    • [ ] Confirmed that meaningful unit tests were added for the changes introduced.
    • [ ] Checked that the PR has successfully passed all relevant code smells and coverage checks.
  • Review
    • [ ] Addressed and resolved all the feedback provided during the code review process.
    • [ ] Obtained final approval from designated reviewers.

For Reviewers

  • Code
    • [ ] Code implements the intended features or fixes the reported issue.
    • [ ] Code follows the project's coding standards and style guide.
  • Documentation
    • [ ] Pipeline changes are well-documented in the CHANGELOG and relevant documentation.
  • Tests
    • [ ] The author provided a description of their manual testing, including consideration of edge cases and boundary conditions where applicable, with satisfactory results.
  • Review
    • [ ] Confirmed that the developer has addressed all the comments during the code review.

mathiasbio avatar Jun 14 '24 16:06 mathiasbio

At the moment the pipeline is working for the TGA workflows, but I haven't verified all workflows yet. So for now we could treat this as a code review only.

mathiasbio avatar Jun 19 '24 13:06 mathiasbio

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 99.49%. Comparing base (55dd650) to head (a420abb). Report is 1 commits behind head on update_cnvkit_pons.

Additional details and impacted files
@@                 Coverage Diff                 @@
##           update_cnvkit_pons    #1448   +/-   ##
===================================================
  Coverage               99.48%   99.49%           
===================================================
  Files                      40       40           
  Lines                    1960     1976   +16     
===================================================
+ Hits                     1950     1966   +16     
  Misses                     10       10           
Flag Coverage Δ
unittests 99.49% <100.00%> (+<0.01%) :arrow_up:

codecov[bot] avatar Jun 19 '24 14:06 codecov[bot]