Exomiser
Exomiser does not output values from ReMM or CADD
As the following variant TSV file shows, every CADD and REMM value is ".":
#CHROM | POS | REF | ALT | QUAL | CADD(>0.483) | POLYPHEN(>0.956|>0.446) | MUTATIONTASTER(>0.94) | SIFT(<0.06) | REMM
11 | 1.13E+08 | C | A | 198.84 | . | 0.063 | . | 1 | .
1 | 89449390 | T | C | 1061 | . | 0.01 | 1 | 1 | .
10 | 95931011 | A | G | 1125.52 | . | 0.004 | . | 1 | .
I have downloaded the CADD3.1 and ReMM datasets and placed them in the ./data/ folder.
I have the same experience. Any progress on this issue?
The data is in the JSON output file. If you need TSV, you can use something like jq to slice the JSON output into TSV. The TSV format isn't flexible, and adding new fields would likely break people's code.
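As an alternative to jq, the same JSON-to-TSV slicing can be sketched in Python. This is only a rough illustration: the record layout, the field names (`chrom`, `pos`, `scores`) and the CADD value in the sample are hypothetical placeholders, not the actual Exomiser JSON schema, so the key paths in `columns` would need adjusting to match the real output.

```python
import csv
import io
import json


def get_path(record, path, missing="."):
    """Follow a list of keys/indexes into nested JSON; return `missing` if absent."""
    current = record
    for key in path:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return missing
    return missing if current is None else current


def json_to_tsv(records, columns):
    """Render records as TSV text; `columns` maps header name -> key path."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(columns.keys())
    for record in records:
        writer.writerow([get_path(record, path) for path in columns.values()])
    return out.getvalue()


# Hypothetical records standing in for variants pulled out of the JSON output.
sample = json.loads("""
[
  {"chrom": "1", "pos": 89449390, "scores": {"CADD": 0.72}},
  {"chrom": "10", "pos": 95931011, "scores": {}}
]
""")

# Hypothetical column-to-path mapping; adjust to the real field names.
columns = {
    "CHROM": ["chrom"],
    "POS": ["pos"],
    "CADD": ["scores", "CADD"],
    "REMM": ["scores", "REMM"],
}

tsv_text = json_to_tsv(sample, columns)
print(tsv_text, end="")
```

Missing scores are written as ".", matching the convention in the TSV excerpt above, so a new field only requires one more entry in `columns`.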
Thanks, I see the CADD scores in the JSON file but no REMM score. It appears the data files are found, but no annotations are added in the output.
2019-11-26 17:23:16.695 INFO 34682 --- [ main] o.m.e.a.genome.GenomeDataSourceLoader : Opening CADD snv data from source: /path/to/exomiser/exomiser-cli-12.1.0/data/1902_hg19/whole_genome_SNVs.tsv.gz
2019-11-26 17:23:16.852 INFO 34682 --- [ main] o.m.e.a.genome.GenomeDataSourceLoader : Opening CADD InDel data from source: /path/to/exomiser/exomiser-cli-12.1.0/data/1902_hg19/InDels.tsv.gz
2019-11-26 17:23:16.969 INFO 34682 --- [ main] o.m.e.a.genome.GenomeDataSourceLoader : Opening REMM data from source: /path/to/exomiser/exomiser-cli-12.1.0/data/1902_hg19/ReMM.v0.3.1.tsv.gz
Any ideas why REMM would be missing? I downloaded the file from here: https://charite.github.io/software-remm-score.html
Does the file need to be reformatted for Exomiser?
zcat ReMM.v0.3.1.tsv.gz | head -n 5
# ReMM score version 0.3.1
# CHR POS PROBABILITY
1 10001 0.0680
1 10002 0.0680
1 10003 0.0710
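For anyone wanting to sanity-check the file layout themselves, here is a minimal Python sketch that parses lines shaped like the excerpt above. It assumes three whitespace-separated columns (chromosome, position, score) and `#` comment lines, which is all the excerpt shows. The function name `read_remm_scores` is made up for illustration, and the sample is compressed in memory so the sketch runs without the real file.

```python
import gzip
import io


def read_remm_scores(handle):
    """Yield (chrom, pos, score) from a ReMM-style text stream, skipping '#' lines."""
    for line_number, line in enumerate(handle, start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(
                f"line {line_number}: expected 3 columns, got {len(fields)}"
            )
        chrom, pos, score = fields
        yield chrom, int(pos), float(score)


# The five lines shown above, gzip-compressed in memory (tab-delimited is an assumption).
sample_text = (
    "# ReMM score version 0.3.1\n"
    "# CHR POS PROBABILITY\n"
    "1\t10001\t0.0680\n"
    "1\t10002\t0.0680\n"
    "1\t10003\t0.0710\n"
)
compressed = gzip.compress(sample_text.encode("utf-8"))

with gzip.open(io.BytesIO(compressed), mode="rt", encoding="utf-8") as handle:
    scores = list(read_remm_scores(handle))

print(scores)
```

To run it against the downloaded file, pass the path to `gzip.open` instead of the in-memory buffer. A `ValueError` would point at the first line that does not have three columns.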
Thanks
Have you added the REMM and CADD scores to the pathogenicitySources?
Note also that REMM is trained on non-coding variants, so if you're analysing exome data you won't see any scores. The REMM data file doesn't need reformatting.
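One quick way to check the first point is to look at what the analysis file actually lists. The sketch below assumes the setting is written as a single inline list such as `pathogenicitySources: [POLYPHEN, MUTATION_TASTER, SIFT]`. That layout, the source names in the sample, and the helper name `pathogenicity_sources` are assumptions for illustration, so compare against your own file.

```python
import re


def pathogenicity_sources(analysis_text):
    """Return the entries of the first uncommented inline pathogenicitySources list."""
    for line in analysis_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # ignore commented-out settings
        match = re.match(r"pathogenicitySources:\s*\[(.*?)\]", stripped)
        if match:
            return [item.strip() for item in match.group(1).split(",") if item.strip()]
    return []


# Hypothetical excerpt of an analysis file; the layout is an assumption.
analysis_text = """
analysis:
    # pathogenicitySources: [POLYPHEN]
    pathogenicitySources: [POLYPHEN, MUTATION_TASTER, SIFT, CADD]
"""

configured = pathogenicity_sources(analysis_text)
missing = [name for name in ("REMM", "CADD") if name not in configured]
print("configured:", configured)
print("missing:", missing)
```

With this sample, REMM is reported as missing. An empty `missing` list would mean both sources are configured, and any absent scores would need another explanation, such as the coding/non-coding point above.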
Sorry for the delay, I didn't see this response until now. Yes, I'm analyzing exome data, so that explains it. I do now see a couple of variants that passed due to ClinVar whitelisting, and they have REMM scores, so the file is being read properly and your explanation makes sense.