Enable --output-prefix on the CLI
Issue
Given I run the same sample using two separate analyses, e.g. the genome and exome presets:
--analysis genome.yml --sample examples/pfeiffer-phenopacket.yml --vcf examples/Pfeiffer.vcf.gz
--analysis exome.yml --sample examples/pfeiffer-phenopacket.yml --vcf examples/Pfeiffer.vcf.gz
Exomiser will currently overwrite the results of the first with those of the second:
results/
├── Pfeiffer_exomiser.html
└── Pfeiffer_exomiser.json
This can be remedied by defining two job or output-option files:
# output-options-exomiser.yml
---
outputPrefix: results/pfeiffer-exomiser
and
# output-options-genomiser.yml
---
outputPrefix: results/pfeiffer-genomiser
--analysis genome.yml --sample examples/pfeiffer-phenopacket.yml --vcf examples/Pfeiffer.vcf.gz --output output-options-genomiser.yml
--analysis exome.yml --sample examples/pfeiffer-phenopacket.yml --vcf examples/Pfeiffer.vcf.gz --output output-options-exomiser.yml
These commands would then return the results:
results/
├── pfeiffer-genomiser.html
├── pfeiffer-genomiser.json
├── pfeiffer-exomiser.html
└── pfeiffer-exomiser.json
This is a better outcome, but not the most convenient one for the user, as they need to create a new file just to specify the output options.
Solution
The simplest solution would be to add a new --output-prefix option, which would replace the default output prefix:
--preset genome --sample examples/pfeiffer-phenopacket.yml --vcf examples/Pfeiffer.vcf.gz --output-prefix results/genomiser/Pfeiffer
--preset exome --sample examples/pfeiffer-phenopacket.yml --vcf examples/Pfeiffer.vcf.gz --output-prefix results/exomiser/Pfeiffer
which would produce output in two new directories:
results/genomiser/Pfeiffer.html
results/genomiser/Pfeiffer.json
results/exomiser/Pfeiffer.html
results/exomiser/Pfeiffer.json
or
--preset genome --sample examples/pfeiffer-phenopacket.yml --vcf examples/Pfeiffer.vcf.gz --output-prefix results/Pfeiffer-genomiser
--preset exome --sample examples/pfeiffer-phenopacket.yml --vcf examples/Pfeiffer.vcf.gz --output-prefix results/Pfeiffer-exomiser
which would produce output in the results directory:
results/Pfeiffer-genomiser.html
results/Pfeiffer-genomiser.json
results/Pfeiffer-exomiser.html
results/Pfeiffer-exomiser.json
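The mapping from a prefix to the files listed above can be sketched in Python. This is only an illustration of the naming scheme shown in this issue, not Exomiser's implementation; the function name `output_paths` and the extension table are assumptions, and only the HTML and JSON formats shown above are covered.

```python
# Assumed file extension per output format; limited to the two formats
# whose file names appear in this issue.
EXTENSIONS = {"HTML": ".html", "JSON": ".json"}


def output_paths(output_prefix, formats=("HTML", "JSON")):
    """Return the file paths a prefix would produce, one per format.

    The extension is appended to the prefix as-is, so any directory part
    of the prefix (e.g. 'results/genomiser/') is kept unchanged.
    """
    return [output_prefix + EXTENSIONS[fmt] for fmt in formats]


if __name__ == "__main__":
    for prefix in ("results/genomiser/Pfeiffer", "results/Pfeiffer-genomiser"):
        for path in output_paths(prefix):
            print(path)
```

Because the prefix carries both the directory and the base name, a single option is enough to separate runs either by directory or by file name.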
Both --output and --output-prefix can be specified together like so:
# project-specific-output.yml
---
outputContributingVariantsOnly: true
numGenes: 10
minExomiserGeneScore: 0.7
#outputPrefix: results/exomiser-output
#out-format options: HTML, JSON, TSV_GENE, TSV_VARIANT, VCF (default: [HTML, JSON])
outputFormats: [ HTML, JSON, TSV_GENE ]
--preset genome --sample examples/pfeiffer-phenopacket.yml --vcf examples/Pfeiffer.vcf.gz --output-prefix results/genomiser/Pfeiffer --output project-specific-output.yml
--preset exome --sample examples/pfeiffer-phenopacket.yml --vcf examples/Pfeiffer.vcf.gz --output-prefix results/exomiser/Pfeiffer --output project-specific-output.yml
Here, --output-prefix would override any outputPrefix specified in the project-specific-output.yml file.
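The intended precedence could be sketched as follows. This is a minimal Python illustration of the proposed behaviour, assuming the output options have already been parsed into a dict; `merge_output_options` is a hypothetical helper, not part of Exomiser.

```python
def merge_output_options(file_options, cli_output_prefix=None):
    """Combine options from an --output file with an optional --output-prefix.

    A prefix given on the command line wins over any outputPrefix in the
    file; every other option from the file is passed through unchanged.
    """
    merged = dict(file_options)  # copy, so the caller's dict is not modified
    if cli_output_prefix is not None:
        merged["outputPrefix"] = cli_output_prefix
    return merged


if __name__ == "__main__":
    from_file = {"numGenes": 10, "outputPrefix": "results/exomiser-output"}
    print(merge_output_options(from_file, "results/genomiser/Pfeiffer"))
```

When no prefix is passed on the command line, the file's own outputPrefix (if any) is left in place.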
@damiansm @pnrobinson Does anyone have a strong feeling about being able to change other output options besides the outputPrefix field? These could be specified on the CLI, like the output-prefix, to override any defaults. I don't think there is any great need for this, as it's probably only the outputPrefix that will need to change for each analysis.
These can all be specified beforehand in a job.yml using a Python string Template, but sometimes a CLI option is more convenient.
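For comparison, the Template approach might look something like this. It is a rough sketch: the `outputOptions` nesting in the job file is an assumption, the rest of the job (analysis, sample) is omitted, and `render_job` is a hypothetical helper. Only the outputPrefix field and its values are taken from this issue.

```python
from string import Template

# Assumed job.yml fragment; a real job file would also define the
# analysis and the sample, which are left out here.
JOB_TEMPLATE = Template("""\
---
outputOptions:
  outputPrefix: results/${sample}-${analysis}
""")


def render_job(sample, analysis):
    """Fill in the template for one sample/analysis combination."""
    return JOB_TEMPLATE.substitute(sample=sample, analysis=analysis)


if __name__ == "__main__":
    for analysis in ("genomiser", "exomiser"):
        print(render_job("pfeiffer", analysis))
```

This works, but it means generating one job file per run, which is the extra step a CLI option would avoid.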