Add support for custom targeted panel

Open bounlu opened this issue 1 year ago • 1 comments

Description of feature

Currently the pipeline only supports the TSO500 panel for targeted analysis, using the pre-computed reference files given in panel_data.config. It would be nice to allow custom panels and to document how to prepare the corresponding reference files:

driver_gene_panel
sage_actionable_panel
sage_coverage_panel
pon_artefacts
target_region_bed
target_region_normalisation
target_region_ratios
target_region_msi_indels
isofox_tpm_norm
isofox_gene_ids
isofox_counts
isofox_gc_ratios

bounlu avatar Jul 16 '24 08:07 bounlu

We are trialing custom panel support in 0.5.0, and I've put together some general, high-level documentation on customising resource files, which applies to panel data as well, here.

You can also find information on generating custom panel resource files over on the hmftools GH here.

Is there any further documentation you would suggest having in the oncoanalyser GH for this? Keen to improve the docs here.

scwatts avatar Jul 16 '24 12:07 scwatts

Thanks so much for pointing me to the available docs.

I think adding a step to the pipeline for generation of custom panel resource files would be great, just like you did for staging reference data, something like:

nextflow run nf-core/oncoanalyser \
  -profile docker \
  -revision dev \
  --mode targeted \
  --genome GRCh38_hmf \
  --prepare_custom_reference_only \
  --custom_panel_regions custom_panel_regions.bed \
  --custom_msi_sites custom_msi_sites.bed \
  --custom_driver_genes custom_driver_genes.tsv \
  --input samplesheet.csv \
  --outdir prepare_reference/

bounlu avatar Sep 16 '24 06:09 bounlu

No worries! And I totally agree with your suggestion - we've previously discussed plans for implementing a feature similar to the one you've described.

scwatts avatar Sep 16 '24 23:09 scwatts

Hi @scwatts

Would you be able to provide more information on how to prepare the Driver Gene Panel file for custom panels?

For example, what do the column headers mean exactly and how are they used in the pipeline?

reportMissense
reportNonsense
reportSplice
reportDeletion
reportDisruption
reportAmplification
reportSomaticHotspot
likelihoodType
reportGermlineVariant
reportGermlineHotspot
reportGermlineDisruption
additionalReportedTranscripts
reportPGX
InPanel

How do the values differ from the original panel doc in the case of TSO500?

For example, what does the last column “InPanel” mean? It can be either “Included” or “MISSING” based on the default TSO500 panel.

And what is the source for the “likelihoodType” column? Some genes can be hard to classify as Oncogene or TSG, as they may show both properties depending on the cancer context, and different databases may even give conflicting information.

bounlu avatar Sep 19 '24 07:09 bounlu
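
Until the column semantics are documented, a quick sanity check on a custom Driver Gene Panel file may still be useful. The following is a minimal sketch, not part of oncoanalyser or hmftools: it only checks that the header row of a tab-separated file contains the column names listed in the comment above. The function name `missing_columns` and the `gene` column in the demo are hypothetical, and the real file may contain further columns that are not checked here.

```python
import csv
import io

# Column names copied from the list quoted in this thread. The real file may
# have additional columns (for example a gene identifier) not checked here.
EXPECTED_COLUMNS = [
    "reportMissense",
    "reportNonsense",
    "reportSplice",
    "reportDeletion",
    "reportDisruption",
    "reportAmplification",
    "reportSomaticHotspot",
    "likelihoodType",
    "reportGermlineVariant",
    "reportGermlineHotspot",
    "reportGermlineDisruption",
    "additionalReportedTranscripts",
    "reportPGX",
    "InPanel",
]


def missing_columns(tsv_text: str) -> list[str]:
    """Return the expected columns that are absent from the TSV header row."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader, [])  # an empty file yields an empty header
    present = set(header)
    return [name for name in EXPECTED_COLUMNS if name not in present]


if __name__ == "__main__":
    # Hypothetical header: a "gene" column plus every expected column
    # except the last one, to show what a failed check reports.
    demo_header = "\t".join(["gene"] + EXPECTED_COLUMNS[:-1]) + "\n"
    print("Missing columns:", missing_columns(demo_header))
```

To check a real file, its contents can be passed in with something like `missing_columns(open("custom_driver_genes.tsv").read())`, where the file name is the one used in the example command earlier in this thread. An empty list means all of the listed columns are present; it says nothing about whether the values in them are valid.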