mondo
WIP: First round QC for cleaning source metadata
This probably needs some discussion.
This adds checks for cases where we have
- multiple conflicting source annotations
- no source annotations
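As a rough illustration of the two checks, here is a minimal Python sketch. It is not the check added in this PR: the `Xref` record, the `check_xrefs` function, and the set of mutually exclusive source values in `PRECISION_SOURCES` are all assumptions made for the example, and "conflicting" is interpreted here as "more than one of those values on the same xref".

```python
from typing import FrozenSet, List, NamedTuple, Tuple


class Xref(NamedTuple):
    """Hypothetical record for one xref axiom on a Mondo term."""
    mondo_id: str
    target: str
    sources: FrozenSet[str]


# ASSUMPTION: source values treated as mutually exclusive for this sketch.
PRECISION_SOURCES = frozenset({"MONDO:equivalentTo", "MONDO:relatedTo"})


def check_xrefs(xrefs: List[Xref]) -> Tuple[List[Xref], List[Xref]]:
    """Return (xrefs with no source, xrefs with conflicting sources)."""
    missing = [x for x in xrefs if not x.sources]
    conflicting = [x for x in xrefs if len(x.sources & PRECISION_SOURCES) > 1]
    return missing, conflicting


example = [
    Xref("MONDO:1", "OMIM:1", frozenset()),
    Xref("MONDO:2", "OMIM:2", frozenset({"MONDO:equivalentTo", "MONDO:relatedTo"})),
    Xref("MONDO:3", "OMIM:3", frozenset({"MONDO:equivalentTo"})),
]
missing, conflicting = check_xrefs(example)
```

With the example data, `missing` contains only the MONDO:1 xref and `conflicting` contains only the MONDO:2 xref.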
Here are the current failures: https://docs.google.com/spreadsheets/d/10wrrwp0ewtN30MwKjG6KBgV7oQMuLpkqAuL9Xs7WSvw/edit#gid=0
@nicolevasilevsky I think we need to discuss in a meeting how to best deal with this.
@matentzn With the removal of the MONDO:subClassOf and MONDO:superClassOf, I think this is done, with a few exceptions:
- MONDO:cjm - should we convert this to Chris' ORCID? https://orcid.org/0000-0002-6601-2165. There are 1300 instances of this.
- MONDO:preferredExternal - this is allowed, could you revise the QC check to allow for this?
Yeah, mention in next 1:1 - we have too many bulk edits happening at the moment, but let's change all these cjms and other attributions to ORCIDs!
it is already on our agenda :)
Blocked by https://github.com/monarch-initiative/monarch-mapping-commons/issues/10
Revisit 1st April
@matentzn is going to regenerate this table. There are currently too many for manual review.
In the meeting two separate issues were conflated: the absence of source annotations (which is what this issue is all about, as opposed to my misleading comments up top) and the presence of conflicting annotations, for which I created a new PR: #4943
This PR is indeed still blocked by the issue above.
This PR checks all of the source annotations on xrefs: we want to make sure each cross reference has some kind of source annotation.
Some don't, because we removed MONDO:superClassOf and MONDO:subClassOf.
This can only be dealt with once we have the boomer mappings going back into Mondo.
@nicolevasilevsky next time we meet, we could finish this PR.
I wrote a method that takes care of more than 4600 violations: if one Mondo term has an equivalent xref to an external term, and another Mondo term has an xref to that same external term that is not marked as equivalent, then we delete the non-equivalent xref:
MONDO:123 xref OMIM:123 {source="ORDO:123"}
MONDO:987 xref OMIM:123 {source="MONDO:equivalentTo"}
In this case, we delete the former (the xref on MONDO:123).
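The rule can be sketched in Python as follows. This is only an illustration of the logic described in this comment, not the method used in the PR: the tuple layout `(mondo_id, target, sources)` and the function name `find_redundant_xrefs` are hypothetical.

```python
from collections import defaultdict

EQUIVALENT = "MONDO:equivalentTo"


def find_redundant_xrefs(xrefs):
    """Return xrefs to delete.

    xrefs: iterable of (mondo_id, target, sources) tuples, where sources is a
    set of source annotation values (hypothetical layout for this sketch).
    """
    xrefs = list(xrefs)
    # Map each external term to the Mondo terms that claim equivalence to it.
    equivalent_owners = defaultdict(set)
    for mondo_id, target, sources in xrefs:
        if EQUIVALENT in sources:
            equivalent_owners[target].add(mondo_id)

    redundant = []
    for mondo_id, target, sources in xrefs:
        if EQUIVALENT in sources:
            continue  # never delete the equivalent xref itself
        # Delete only if a *different* Mondo term holds the equivalent xref.
        if equivalent_owners[target] - {mondo_id}:
            redundant.append((mondo_id, target, sources))
    return redundant


example = [
    ("MONDO:123", "OMIM:123", {"ORDO:123"}),
    ("MONDO:987", "OMIM:123", {"MONDO:equivalentTo"}),
    ("MONDO:555", "OMIM:999", {"ORDO:555"}),
]
to_delete = find_redundant_xrefs(example)
```

For the example data, `to_delete` holds only the MONDO:123 xref to OMIM:123. The xref on MONDO:555 is kept because no Mondo term has an equivalent xref to OMIM:999.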
To finish:
- Spot-check 10 random removals to confirm they are correct (search for the equivalent xrefs)
- Deal with the remaining 62 cases manually
We don't want to remove the MONDO:includedEntryInOMIM annotations.
MONDO:includedEntryInOMIM
What does this mean?
Re: MONDO:includedEntryInOMIM
see: https://github.com/monarch-initiative/mondo/issues/5507
@matentzn can we merge this?
Yes! Great job!