
ML to optimise the combination of various variant and phenotype scores

Open damiansm opened this issue 6 years ago • 0 comments

For missense pathogenicity prediction, we have already benchmarked several second-generation methods in our existing frameworks to establish whether they improve on the existing method of taking max(PolyPhen, SIFT, MutationTaster). REVEL, PrimateAI, M-CAP, MVP and MPC were incorporated from dbNSFP, and ClinPred was downloaded from source. Overall, REVEL, MVP and ClinPred were the most promising, and from 12.0.0 we will generally advise using REVEL and MVP, which are built into our database.
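For reference, the baseline being compared against can be sketched as below. This is an illustrative sketch only, not the Exomiser implementation: the function name, the `1 - SIFT` rescaling (SIFT reports lower values as more damaging) and the fallback of 0.0 when no predictor is available are all assumptions made for the example.

```python
from typing import Optional


def legacy_missense_score(
    polyphen: Optional[float] = None,
    sift: Optional[float] = None,
    mutation_taster: Optional[float] = None,
) -> float:
    """Hypothetical baseline: the maximum of the available legacy predictors.

    Assumes every input is already in [0, 1]. SIFT is inverted so that,
    like the other two, a higher value means more likely pathogenic.
    """
    candidates = []
    if polyphen is not None:
        candidates.append(polyphen)
    if sift is not None:
        candidates.append(1.0 - sift)  # assumption: rescale so higher = worse
    if mutation_taster is not None:
        candidates.append(mutation_taster)
    # Assumption: fall back to 0.0 when no predictor has a value.
    return max(candidates, default=0.0)
```

A variant with PolyPhen 0.2, SIFT 0.05 and MutationTaster 0.5 would be scored by its rescaled SIFT value here, because that is the largest of the three.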

However, some of the other scores may have underperformed because their distributions between 0 and 1 are very different from those of the scores we used before. As a result they do not fit well with our existing default scores for LoF and other variant types, or with the logistic regression used to combine the variant score with the phenotype score.

There are also other scores such as MTR to test.

Ideally, we would like to perform ML training on a large set of solved cases incorporating all these methods, defaults for non-missense variant types, MAF and the human, mouse, fish and PPI PhenoDigm scores.
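As a rough illustration of what such training could look like, here is a minimal sketch that fits a logistic regression over two features, a variant score and a phenotype score, using plain gradient descent. Everything in it is an assumption for the example: the feature set, the function names, the hyperparameters and above all the data, which is randomly generated and not drawn from any solved cases. A real model would add the other inputs listed above (MAF, per-species and PPI PhenoDigm scores, variant-type defaults) and would be evaluated on held-out cases.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def combined_score(features, weights, bias):
    """Probability-like score from a weighted sum of the input scores."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Batch gradient descent on the mean log-loss."""
    n, n_features = len(rows), len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            err = combined_score(x, weights, bias) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Synthetic toy data (NOT real cases): causal variants are drawn with
# higher variant and phenotype scores than non-causal ones.
rng = random.Random(0)
rows, labels = [], []
for _ in range(200):
    causal = rng.random() < 0.5
    centre = 0.8 if causal else 0.3
    variant = min(1.0, max(0.0, rng.gauss(centre, 0.15)))
    phenotype = min(1.0, max(0.0, rng.gauss(centre, 0.15)))
    rows.append([variant, phenotype])
    labels.append(1 if causal else 0)

weights, bias = fit_logistic(rows, labels)
```

With learned weights, `combined_score([variant, phenotype], weights, bias)` gives a single ranking score per candidate. A linear model like this is only a starting point; it cannot by itself correct for predictors whose score distributions differ, which is the problem described above.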

damiansm, Feb 15 '19 08:02