Incorporate genic intolerance scores

Open damiansm opened this issue 9 years ago • 2 comments

Sources of data

New: incorporate the ExAC paper scores and mouse embryonic lethality information into Exomiser. 79% of loss-of-function-intolerant genes have no known associated disease (~2500). The list of loss-of-function-intolerant genes is linked from the supplementary data: https://docs.google.com/spreadsheets/d/18isx46crTMeeDif05BwBD37veko8Hyrvbf9WuACoubo/edit#gid=0. In our analysis paper, we should add an analysis of what model organisms can tell us about these genes. I'm not sure when I'm going to have time to work on this analysis, but if there's a way I can help, I'd like to. Also, for the next version of Exomiser (or whatever the successor is), are you thinking of adding in the MAPS score they describe?

(i) The Petrovski Genic Intolerance paper, 2013, PLoS Genetics. Downloaded scores from Table S2.
(ii) The Haploinsufficiency (HI) score of Matt Hurles, 2010, PLoS Genetics. Downloaded Dataset 2.
(iii) Samocha KE, Robinson EB, Sanders SJ, Stevens C, Sabo A, McGrath LM, Kosmicki JA, Rehnström K, Mallick S, Kirby A, Wall DP, MacArthur DG, Gabriel SB, DePristo M, Purcell SM, Palotie A, Boerwinkle E, Buxbaum JD, Cook EH Jr, Gibbs RA, Schellenberg GD, Sutcliffe JS, Devlin B, Roeder K, Neale BM, Daly MJ. A framework for the interpretation of de novo mutation in human disease. Nat Genet. 2014 Sep;46(9):944-50. doi: 10.1038/ng.3050. Epub 2014 Aug 3. PubMed PMID: 25086666. Downloaded p-values from the supplementary material.
(iv) Look at the various "human KO" projects that aim to identify full KO genes in consanguineous populations displaying little evidence of genetic disease.

Initial trial by Damian

  • Downloaded scores as text files
  • perl produce_wekafiles_exomiser2_plus_genic_intol.pl Exomiser2NoModelKnown,Exomiser2NoModelKnownWithNoise,Exomiser2NoModelNovel,Exomiser2NoModelNovelWithNoise,Exomiser2WithModelKnown,Exomiser2WithModelKnownWithNoise,Exomiser2WithModelNovel,Exomiser2WithModelNovelWithNoise - produces logit training files using each of these scores => ExomiserPlusGenicIntolerance.xls
  • perl analyse_exomiser_2_main_comparison_withintolerance_scores.pl - uses logit model along with text file of scores from paper
    • Petrovski: gave a 2-4% increase in performance for no model, AD or AR. It worked equally well with raw scores or percentiles, and training with just AD data gave about the same result
    • Other scores gave <1% increase
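The raw-score versus percentile comparison above can be illustrated with a minimal sketch. It is not taken from the Perl scripts named in this issue; the function name `to_percentiles`, the gene symbols and the score values are all hypothetical, and the percentile is simply defined here as the share of genes with a strictly lower score.

```python
from bisect import bisect_left


def to_percentiles(scores):
    """Map each gene's raw score to a percentile (0-100) within the given set.

    Assumption: percentile = share of genes with a strictly lower raw score.
    """
    ordered = sorted(scores.values())
    n = len(ordered)
    return {
        gene: 100.0 * bisect_left(ordered, score) / n
        for gene, score in scores.items()
    }


# Hypothetical raw intolerance scores, for illustration only.
raw_scores = {"GENE_A": -1.0, "GENE_B": 0.0, "GENE_C": 0.5, "GENE_D": 2.0}
percentiles = to_percentiles(raw_scores)
```

Either representation could then be added as an extra feature column in the training files; the result reported above suggests the choice makes little difference.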

Consider just filtering based on these scores, as the performance reported in the Petrovski paper shows they are most applicable to dominant disorders, and to certain disorder areas more than others

Marten assigned to continue development

damiansm, Apr 15 '15 08:04

gnomAD have just released a new score as well

damiansm, Oct 30 '18 12:10

Revisit this with gnomAD data: https://gnomad.broadinstitute.org/downloads#v2-constraint

Jules has already incorporated into the ACMG classification code.

Probably just display LOEUF, pLI, etc. at the gene level in the HTML and JSON output, rather than hard filtering (e.g. removing AD LoF candidates if LOEUF > 0.3)
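A rough sketch of the display-rather-than-filter idea, in Python for brevity rather than as Exomiser code. It assumes a tab-separated constraint table with `gene`, `pLI` and `oe_lof_upper` (LOEUF) columns, which should be checked against the actual gnomAD download; the sample rows, gene symbols and function names are hypothetical.

```python
import csv
import io

# Hypothetical excerpt of a constraint table; the values are illustrative only.
SAMPLE_TSV = (
    "gene\tpLI\toe_lof_upper\n"
    "GENE_A\t0.99\t0.12\n"
    "GENE_B\t0.00\t1.45\n"
    "GENE_D\tNA\tNA\n"
)


def _to_float(value):
    """Parse a numeric field, treating empty or 'NA' entries as missing."""
    return None if value in ("", "NA") else float(value)


def load_constraint(handle):
    """Read gene-level constraint metrics into a dict keyed by gene symbol."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["gene"]: {
            "pLI": _to_float(row["pLI"]),
            "LOEUF": _to_float(row["oe_lof_upper"]),
        }
        for row in reader
    }


def annotate(gene_symbols, constraint):
    """Attach constraint metrics to every candidate gene without dropping any."""
    missing = {"pLI": None, "LOEUF": None}
    return [{"gene": g, **constraint.get(g, missing)} for g in gene_symbols]


constraint = load_constraint(io.StringIO(SAMPLE_TSV))
annotated = annotate(["GENE_A", "GENE_B", "GENE_C"], constraint)
```

Because every candidate is kept and the metrics are only attached, the LOEUF threshold becomes something the reader of the HTML or JSON output applies, not a filter that silently removes genes.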

damiansm, Oct 13 '21 14:10