Consider annotation prop to distinguish disease grouping class vs disease entity

Open kshefchek opened this issue 5 years ago • 37 comments

It would be useful if MONDO included a way to distinguish between a disease entity and a disease grouping. Currently the only way to do this is to check for subclasses/types of a disease; however, this results in missing valid diseases (see Huntington disease and Cystic Fibrosis). We regularly add a check that all diseases with an OMIM equivalent are disease entities, but a more formal way to determine this would be useful for analyses and application views.

kshefchek avatar Apr 30 '19 15:04 kshefchek

Not sure there is an agreed-upon 'level'. E.g. ORDO calls juvenile Huntington a disease entity.

But we are doing something like this anyway for the mondo-analysis

I think the CF subclass is an error. Can you make an EFO ticket, @nicolevasilevsky?

cmungall avatar May 01 '19 02:05 cmungall

Dear @nicolevasilevsky - did you hear back from EFO about this?

monicacecilia avatar Sep 13 '19 05:09 monicacecilia

@monicacecilia I don't think I ever did this! Thanks for the nudge

nicolevasilevsky avatar Sep 24 '19 15:09 nicolevasilevsky

per @paolaroncaglia's comments on https://github.com/EBISPOT/efo/issues/553#issuecomment-534927291, I revised the subclass for MONDO_0005413 cystic fibrosis associated meconium ileum

nicolevasilevsky avatar Sep 26 '19 20:09 nicolevasilevsky

@nicolevasilevsky MONDO:0009061 'cystic fibrosis' needs fixing too please, see https://github.com/EBISPOT/efo/issues/553#issuecomment-539991798 Thanks!

paolaroncaglia avatar Oct 09 '19 13:10 paolaroncaglia

Got it - thanks @paolaroncaglia. I did one PR (#882); I will work on the rest of this once @cmungall approves the PR.

  • [x] Narrow down its disease location to 'intestine'

Note to self, these are the action items from https://github.com/EBISPOT/efo/issues/553

  • [ ] Remove the mapping between EFO:0004608 'cystic fibrosis associated meconium ileum' and its counterpart MONDO:0005413, and remove MONDO:0005413 from the Mondo list of terms, until Mondo approves the pull request in monarch-initiative/mondo#685, or we'll get the wrong parent back.

  • [ ] Then add the Mondo mapping and the Mondo term back in their respective files: http://purl.obolibrary.org/obo/MONDO_0005413 http://www.ebi.ac.uk/efo/EFO_0004608 cystic fibrosis associated meconium ileum cystic fibrosis associated meconium ileum http://purl.obolibrary.org/obo/MONDO_0005413

  • [x] Add missing period at end of second-last sentence in definition

  • [x] When a term is created for 'perinatal disease' or similar (see https://github.com/EBISPOT/efo/issues/490), add it as a parent of 'cystic fibrosis associated meconium ileum'.

  • [x] Remove all current parents of EFO Orphanet:586 'Cystic fibrosis' except for 'autosomal recessive disease'

  • [x] Add database cross reference Wikipedia:Cystic_fibrosis

  • [ ] Consider capturing the other superclasses of EFO Orphanet:586 'Cystic fibrosis' as features/symptoms/phenotypes. Currently they are:

pancreas disease; Rare genetic respiratory disease; rare male fertility disorder with obstructive azoospermia; Rare genetic disorder with obstructive azoospermia; rare pulmonary disease; Genetic biliary tract disease; Genetic pancreatic disease

  • [ ] Remove the mapping between EFO:0004608 'cystic fibrosis' and its counterpart MONDO:0009061, and remove MONDO:0009061 from the Mondo list of terms, until Mondo fixes the same issue (see monarch-initiative/mondo#685 (comment)), or we'll get the wrong parents back.

nicolevasilevsky avatar Oct 10 '19 20:10 nicolevasilevsky

Is the original request in scope or feasible to be included in mondo, or should we look for a workaround?

kshefchek avatar Nov 07 '19 20:11 kshefchek

Going back to MONDO:0005413 'cystic fibrosis associated meconium ileum', I'm afraid there's a typo in the label that you inherited from EFO: it should be "ileus", not "ileum". (See e.g. https://www.cysticfibrosisjournal.com/article/S1569-1993(17)30809-3/fulltext https://www.chop.edu/conditions-diseases/meconium-ileus https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3085752/)

And based on the above, I'd also suggest making 'cystic fibrosis associated meconium ileus' a subclass of MONDO:0054868 'meconium ileus'.

Thanks, Paola

paolaroncaglia avatar Nov 11 '19 13:11 paolaroncaglia

Hi @nicolevasilevsky , I'm afraid that part of this ticket got a bit side-tracked due to technical issues ;-) It would be great if you could please, before your next release, complete the edits agreed upon for 'cystic fibrosis', so EFO can resume mapping to Mondo for this term. Summing up:

  • [ ] Edits (as applicable to Mondo) listed in https://github.com/monarch-initiative/mondo/issues/685#issuecomment-540770263
  • [ ] Edits suggested in https://github.com/monarch-initiative/mondo/issues/685#issuecomment-552448298

Thank you very much. Paola

paolaroncaglia avatar Nov 13 '19 13:11 paolaroncaglia

Would use of github milestones help with prioritizing tickets for releases?

Kent, can you say more about the negative consequence of classifying CF as non-leaf? I thought the distinction purely drove the kind of table displayed, to minimize repetition with leaf terms.

cmungall avatar Nov 13 '19 13:11 cmungall

@kshefchek - see comment above

@cmungall yes- let's use GitHub milestones, great idea

nicolevasilevsky avatar Nov 13 '19 15:11 nicolevasilevsky

@paolaroncaglia, I assigned this ticket to the December release milestone (which I just created).

Are you able to assign milestones too? Feel free to assign and help prioritize. (If not, I think I can adjust your permissions so you can do so)

nicolevasilevsky avatar Nov 13 '19 15:11 nicolevasilevsky

@paolaroncaglia Is this an action item for Mondo, or just EFO? "Remove the mapping between EFO:0004608 'cystic fibrosis' and its counterpart MONDO:0009061, and remove MONDO:0009061 from the Mondo list of terms, until Mondo fixes the same issue (see #685 (comment)), or we'll get the wrong parents back."

nicolevasilevsky avatar Nov 13 '19 17:11 nicolevasilevsky

I think this would be useful for analysis and application views. Previously I have had to add extra checks (e.g. treating all terms with OMIM identifiers as entities; otherwise Angelman, for example, would be filtered out). This works well for rare disease, but I assume there are cases in common disease as well. It's not complicated to get around this from my side of things, so if it's complicated to encode in the ontology, it's not a big deal.
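A minimal sketch of the kind of extra check described here, assuming each term is available as a record with its direct children and database cross-references. The identifiers, labels, and the `is_entity` helper are hypothetical placeholders, not real Mondo content or any Monarch API.

```python
# Toy records: the IDs, labels, and xrefs are placeholders, not real Mondo content.
TERMS = {
    "EX:1": {"label": "grouping A", "children": ["EX:2", "EX:3"], "xrefs": []},
    "EX:2": {"label": "disease B", "children": ["EX:4"], "xrefs": ["OMIM:000001"]},
    "EX:3": {"label": "disease C", "children": [], "xrefs": []},
    "EX:4": {"label": "subtype of B", "children": [], "xrefs": []},
}


def is_entity(term):
    """Treat a term as a disease entity if it is a leaf or has an OMIM xref."""
    is_leaf = not term["children"]
    # The prefix check matches "OMIM:" only, so "OMIMPS:" xrefs do not count.
    has_omim = any(x.startswith("OMIM:") for x in term["xrefs"])
    return is_leaf or has_omim


entities = sorted(tid for tid, term in TERMS.items() if is_entity(term))
print(entities)  # ['EX:2', 'EX:3', 'EX:4']
```

Here "disease B" survives despite having a subtype because of its OMIM cross-reference, while "grouping A" is still dropped.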

kshefchek avatar Nov 13 '19 17:11 kshefchek

@nicolevasilevsky In reply to your comment:

"I assigned this ticket to the December release milestone (which I just created). Are you able to assign milestones too? Feel free to assign and help prioritize. (If not, I think I can adjust your permissions so you can do so)"

I just tested unassigning the December milestone and assigning it again, so yes, I can do that, and I will try to remember to do it in the future :-) Thanks! Pinging @zoependlington so she's aware of this option for Mondo tickets she and I create.

paolaroncaglia avatar Nov 14 '19 16:11 paolaroncaglia

@nicolevasilevsky

Is this an action item for Mondo, or just EFO? Remove the mapping between EFO:0004608 'cystic fibrosis' and its counterpart MONDO:0009061, and remove MONDO:0009061 from the Mondo list of terms, until Mondo fixes the same issue (see #685 (comment)), or we'll get the wrong parents back.

It's just for EFO, thanks.

paolaroncaglia avatar Nov 14 '19 16:11 paolaroncaglia

It's just for EFO, thanks.

got it, thanks!

Have I addressed all the issues on this ticket?

nicolevasilevsky avatar Nov 15 '19 01:11 nicolevasilevsky

@nicolevasilevsky

Have I addressed all the issues on this ticket?

I'm not sure if these ones are done, please: https://github.com/monarch-initiative/mondo/issues/685#issuecomment-552448298 Thanks and have a great weekend!

paolaroncaglia avatar Nov 15 '19 15:11 paolaroncaglia

The label for MONDO_0005413 was fixed.

I think that is everything. Please reopen if there are outstanding action items.

nicolevasilevsky avatar Nov 15 '19 15:11 nicolevasilevsky

What is the final word on distinguishing disease entity vs. grouping class?

kshefchek avatar Nov 15 '19 17:11 kshefchek

@kshefchek not sure, I reopened the ticket. @cmungall can you comment?

nicolevasilevsky avatar Nov 15 '19 17:11 nicolevasilevsky

I need a definition of disease entity, or at least a full description of how the distinction would manifest computationally on the monarch site/apis

cmungall avatar Nov 15 '19 18:11 cmungall

For example, say I want to get all rare diseases in Mondo. I run a query that filters out any disease with subclasses; this gives me juvenile Huntington disease but filters out Huntington disease. I assume that this is not expected, but I also don't really know. Perhaps it would be better if someone clinical weighed in, @pnrobinson?
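The effect can be reproduced with a small sketch. The three-term hierarchy below is a toy fragment assumed for illustration (it is not an export of Mondo), and `leaf_terms` is a hypothetical helper name.

```python
# child -> direct parents; a toy fragment assumed for illustration only.
PARENTS = {
    "juvenile Huntington disease": ["Huntington disease"],
    "Huntington disease": ["neurodegenerative disease"],
    "neurodegenerative disease": [],
}


def leaf_terms(parents):
    """Return the terms that never appear as a parent of another term."""
    has_child = {p for direct_parents in parents.values() for p in direct_parents}
    return sorted(term for term in parents if term not in has_child)


leaves = leaf_terms(PARENTS)
print(leaves)  # ['juvenile Huntington disease']
```

A leaf-only filter removes the grouping, but it also removes every disease that happens to have a subtype.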

kshefchek avatar Nov 15 '19 18:11 kshefchek

But why do you need to filter at all? What are the consequences of just saying: "this is an ontology, here are all subclasses"?

I can see some users may find it unsatisfying to have "neurodegenerative disease" in the list, as this is not what would be considered a distinct disease entity.

But what is a distinct disease entity? I am not sure you will get consistent answers. @pnrobinson suggests distinct treatments and we may be able to give answers that stratify this way as maxo matures, but we are not there yet.

Perhaps for now a strategy where we annotate what are unambiguously disease groupings, and this can be a negative filter. You will still have classes that subsume one another. Is that a problem?

If there is a high priority use case for having a single layer with no subsumption then we can explore that, using the ordo groupings as a basis, but this will take some engineering and curation for it to be complete. And it's not clear to me how things like cancers should be treated here.

For individual analyses there are other things you can do such as filter out by subset, e.g. etiological_subtype, and disease_grouping
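As a rough sketch of such a negative filter, assume each term carries a set of subset tags. The tag assignments below are invented for illustration, and `exclude_subsets` is a hypothetical helper, not part of any Mondo tooling.

```python
# Toy terms; which term carries which subset tag is an assumption for this sketch.
TERMS = [
    {"label": "neurodegenerative disease", "subsets": {"disease_grouping"}},
    {"label": "Huntington disease", "subsets": set()},
    {"label": "juvenile Huntington disease", "subsets": set()},
]


def exclude_subsets(terms, excluded):
    """Keep the labels of terms that carry none of the excluded subset tags."""
    return [t["label"] for t in terms if not (t["subsets"] & excluded)]


kept = exclude_subsets(TERMS, {"disease_grouping"})
print(kept)  # ['Huntington disease', 'juvenile Huntington disease']
```

The result still contains a term and its subclass, which is the residual subsumption mentioned above. Passing additional tags in `excluded` narrows the list further.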

cmungall avatar Nov 15 '19 19:11 cmungall

btw I want to caution that HD is not the best example to illustrate your general point, as it's a bit odd: we have a single is-a child, which is a bad smell

cmungall avatar Nov 15 '19 19:11 cmungall

Is disease_grouping already a subset? This is essentially what I am asking for

kshefchek avatar Nov 15 '19 19:11 kshefchek

Yes, but it will not be complete.

cmungall avatar Nov 15 '19 19:11 cmungall

Perhaps for now a strategy where we annotate what are unambiguously disease groupings, and this can be a negative filter. You will still have classes that subsume one another. Is that a problem?

I think this would be perfect

kshefchek avatar Nov 15 '19 19:11 kshefchek

OK, plan

  • make a new subset disease_grouping
  • seed it with ordo_group_of_disorders (retain provenance)
  • auto-populate for any OMIMPS
  • auto-populate for a subset of DPs. E.g. disease-of-X
  • decide how to apply to infectious disease. above genus level for ncbitaxon?
  • decide how to apply to neoplasm/cancer/benign neoplasm DPs.
  • manually curate remaining gaps using some kind of heuristic guidance. E.g. has N levels below yet not classified as group
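Two of the steps above, seeding from ordo_group_of_disorders with provenance and flagging deep but unclassified terms for manual review, might be sketched as follows. The term IDs, the data layout, the threshold, and all function names are assumptions made for illustration; nothing here reflects how the Mondo build actually implements the plan.

```python
# parent -> direct children; placeholder IDs, assumed to form an acyclic hierarchy.
CHILDREN = {
    "EX:root": ["EX:a", "EX:b"],
    "EX:a": ["EX:a1"],
    "EX:a1": ["EX:a1x"],
    "EX:a1x": [],
    "EX:b": [],
}
# Subset tags already present on terms (assumed input).
SOURCE_SUBSETS = {"EX:root": {"ordo_group_of_disorders"}}


def seed_groupings(source_subsets, seed="ordo_group_of_disorders"):
    """Map each seeded term to the source subset it came from (its provenance)."""
    return {tid: seed for tid, subsets in source_subsets.items() if seed in subsets}


def depth_below(term, children):
    """Number of levels beneath a term (0 for a leaf); assumes no cycles."""
    kids = children.get(term, [])
    return 0 if not kids else 1 + max(depth_below(k, children) for k in kids)


def curation_candidates(children, groupings, min_levels):
    """Terms with at least min_levels below them that are not yet groupings."""
    return sorted(
        t for t in children
        if t not in groupings and depth_below(t, children) >= min_levels
    )


groupings = seed_groupings(SOURCE_SUBSETS)
candidates = curation_candidates(CHILDREN, groupings, min_levels=2)
print(groupings)   # {'EX:root': 'ordo_group_of_disorders'}
print(candidates)  # ['EX:a']
```

The candidate list is only a prompt for a curator; a term with several levels below it can still be a disease in its own right.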

cmungall avatar Nov 15 '19 19:11 cmungall

I don't think all OMIMPS qualify as disease_grouping if disease_grouping is restricted to terms that are not truly diseases. In fact, most OMIMPS correspond to non-gene-specific terms for a disease.

maglott avatar Nov 15 '19 22:11 maglott