False positive hits to Orphanet copy-number diseases.

Open julesjacobsen opened this issue 1 year ago • 3 comments

Exomiser <= 14.0.0 gives false positive hits for small sequence variants located in genes associated with contiguous gene deletion disorders, mainly from Orphanet, e.g. Williams syndrome (ORPHA:904 / OMIM:194050).

The confusion arises from the disease-gene associations and from Exomiser treating the disease type 'C' (copy-number) the same as 'D' (disease). When disease.type = 'C', the 'gene' needs to be treated as one large contiguous region covering all the associated genes, and it should only be matched to large deletions covering roughly 80-90% or more of this region.

DISEASE_ID	DISEASENAME	OMIM_GENE_ID	GENE_ID	SYMBOL	TYPE	INHERITANCE
ORPHA:904	Williams syndrome	OMIM:186590	6804	STX1A	C	D
ORPHA:904	Williams syndrome	OMIM:605842	26608	TBL2	C	D
ORPHA:904	Williams syndrome	OMIM:608899	84163	GTF2IRD2	C	D
ORPHA:904	Williams syndrome	OMIM:610039	155382	VPS37D	C	D
ORPHA:904	Williams syndrome	OMIM:604839	8468	FKBP6	C	D
ORPHA:904	Williams syndrome	OMIM:600404	5982	RFC2	D	D
ORPHA:904	Williams syndrome	OMIM:608512	653361	NCF1	C	D
ORPHA:904	Williams syndrome	OMIM:612547	135886	TMEM270	C	D
ORPHA:904	Williams syndrome	OMIM:605678	51085	MLXIPL	D	D
ORPHA:904	Williams syndrome	OMIM:603431	7458	EIF4H	C	D
ORPHA:904	Williams syndrome	OMIM:618202	84277	DNAJC30	D	D
ORPHA:904	Williams syndrome	OMIM:605846	9275	BCL7B	D	D
ORPHA:904	Williams syndrome	OMIM:612546	155368	METTL27	C	D
ORPHA:904	Williams syndrome	OMIM:615733	114049	BUD23	C	D
ORPHA:904	Williams syndrome	OMIM:605681	9031	BAZ1B	C	D
ORPHA:904	Williams syndrome	OMIM:603432	7461	CLIP2	C	D
ORPHA:904	Williams syndrome	OMIM:130160	2006	ELN	C	D
ORPHA:904	Williams syndrome	OMIM:601679	2969	GTF2I	C	D
ORPHA:904	Williams syndrome	OMIM:604318	9569	GTF2IRD1	C	D
ORPHA:904	Williams syndrome	OMIM:601329	3984	LIMK1	C	D
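
The proposed rule for type 'C' diseases could be sketched roughly as follows. This is only an illustration in Python, not Exomiser code: the `Interval` class, the function names, the 0.8 default threshold and the coordinates in the usage example are all assumptions made up for the sketch (the coordinates are placeholders, not real gene positions).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    contig: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def contiguous_region(gene_intervals):
    """Merge all genes of a type 'C' disease into one spanning region."""
    contigs = {g.contig for g in gene_intervals}
    if len(contigs) != 1:
        raise ValueError("expected all genes on a single contig")
    return Interval(
        contigs.pop(),
        min(g.start for g in gene_intervals),
        max(g.end for g in gene_intervals),
    )


def coverage_fraction(variant, region):
    """Fraction of the region's bases that the variant overlaps."""
    if variant.contig != region.contig:
        return 0.0
    overlap = min(variant.end, region.end) - max(variant.start, region.start) + 1
    return max(overlap, 0) / (region.end - region.start + 1)


def matches_copy_number_disease(variant, is_deletion, gene_intervals, min_coverage=0.8):
    """Only a deletion covering most of the merged region counts as a match."""
    if not is_deletion:
        return False
    region = contiguous_region(gene_intervals)
    return coverage_fraction(variant, region) >= min_coverage


# Placeholder coordinates, for illustration only.
genes = [
    Interval("7", 1000, 2000),
    Interval("7", 3000, 4000),
    Interval("7", 9000, 11000),
]
snv = Interval("7", 1500, 1500)
large_deletion = Interval("7", 900, 10500)
partial_deletion = Interval("7", 1000, 5000)

print(matches_copy_number_disease(snv, False, genes))
print(matches_copy_number_disease(large_deletion, True, genes))
print(matches_copy_number_disease(partial_deletion, True, genes))
```

With a rule like this, a small sequence variant inside a single gene of the region would no longer be matched to the copy-number disease, while a deletion spanning most of the region still would.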

julesjacobsen avatar Apr 12 '24 12:04 julesjacobsen

This would be great. Caveat: Some of the entries are set to "D" without any obvious reason, e.g. RFC2, which is not listed as a disease gene in OMIM.
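
One way to list such entries is to flag genes typed 'D' within a disease whose other associations are typed 'C'. A minimal sketch, using a subset of the ORPHA:904 rows above; the function name and the tuple layout are assumptions for illustration, not part of Exomiser:

```python
from collections import defaultdict

# (DISEASE_ID, SYMBOL, TYPE), subset of the rows listed in the issue.
rows = [
    ("ORPHA:904", "STX1A", "C"),
    ("ORPHA:904", "RFC2", "D"),
    ("ORPHA:904", "MLXIPL", "D"),
    ("ORPHA:904", "DNAJC30", "D"),
    ("ORPHA:904", "BCL7B", "D"),
    ("ORPHA:904", "ELN", "C"),
]


def genes_typed_d_in_copy_number_diseases(rows):
    """Return {disease_id: sorted symbols typed 'D'} for diseases that also have 'C' rows."""
    by_disease = defaultdict(lambda: defaultdict(list))
    for disease_id, symbol, assoc_type in rows:
        by_disease[disease_id][assoc_type].append(symbol)
    return {
        disease_id: sorted(types["D"])
        for disease_id, types in by_disease.items()
        if types.get("C") and types.get("D")
    }


print(genes_typed_d_in_copy_number_diseases(rows))
```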

pnrobinson avatar Apr 13 '24 08:04 pnrobinson

Yes - spotted that as well and we are going to investigate where that is coming from. Presumably the Orphanet XML. What do you think it means if it is genuine - that LoF of BCL7B alone would cause the syndrome or at least a good part of the symptoms?

damiansm avatar Apr 13 '24 08:04 damiansm

Related: https://github.com/monarch-initiative/phenol/issues/445

julesjacobsen avatar Jul 01 '24 09:07 julesjacobsen