upheno

Add draft docs for MP HP mappings

Open matentzn opened this issue 1 year ago • 3 comments

This PR adds a documentation page that explains where to find MP HP mappings.

This is very complicated: there are so many files. Please help bring order to the chaos.

@sbello @cmungall

matentzn avatar Oct 31 '24 18:10 matentzn

I think this is a good description of the current state of things. But this is WAY too complicated for 99% of the intended consumers. Whether HPO terms are subsumed by MPO terms is very much an ontologist's question. Most users just want mappings.

Can we not just integrate the files into a single source of truth (SOT)?

Think GAFs and evidence codes. If you want the GAF for a species you just download it. Most people just look at the key columns. Behind the scenes the GAF merges many different streams. Users who care to see it can see the provenance. Users who care to filter by evidence code can do that too.

Can we not just merge the SSSOM files? Or maybe merge them into two files: high confidence and high recall.

cmungall avatar Oct 31 '24 18:10 cmungall

@cmungall If you do want to merge the files, I would create at least two: one for mappings with explicit match types and one for mappings with confidence scores. Anyone wanting to import this is going to need to handle and parse these differently. At MGI we brought in the IMPC files but skipped the Pistoia ones, as I was uncertain what the match type really was or what a good cutoff for inclusion/exclusion would be. For ease of upkeep I would still want a separate MGI file; that way I only need to worry about messing with my own data and not anyone else's. There is also the problem that each group has a distinct set of columns, but that is the easiest thing to deal with. Maybe. I added a confidence score to the MGI manual file, but I doubt that my confidence score is really the same as the confidence score in the Pistoia files, or that a consumer of the file would want to treat those the same.

sbello avatar Oct 31 '24 21:10 sbello

We will definitely not merge the HP MP source files, so that each can still be curated correctly, but we should offer a well-documented merged product to the community with as many confidence values and as much metadata as possible!

matentzn avatar Nov 01 '24 11:11 matentzn
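
To make the proposal in this thread concrete, here is a minimal sketch of how a merged product could be built while the source files stay separate. It is not the uPheno pipeline. It assumes SSSOM-style TSV inputs with `subject_id`, `predicate_id`, `object_id` and an optional `confidence` column. The `source_label` column, the per-source cutoff table, the function names, and all identifiers in the sample data are hypothetical and exist only for illustration. Because confidence scores from different groups may not be comparable, the cutoff is set per source rather than globally.

```python
import csv

def read_sssom_tsv(text, source_label):
    """Parse SSSOM-style TSV text, skip '#' metadata lines, tag rows with their source."""
    lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    rows = []
    for row in csv.DictReader(lines, delimiter="\t"):
        row["source_label"] = source_label  # hypothetical provenance column
        rows.append(row)
    return rows

def merge_and_split(rows_by_source, cutoffs):
    """Return (high_confidence, high_recall) lists of mapping rows.

    cutoffs maps a source label to a minimum confidence, or to None for a
    curated source whose explicit match types are accepted as-is. Sources
    missing from cutoffs only appear in the high-recall output.
    """
    high_recall = [row for rows in rows_by_source for row in rows]
    high_confidence = []
    for row in high_recall:
        source = row["source_label"]
        if source not in cutoffs:
            continue
        cutoff = cutoffs[source]
        raw = row.get("confidence") or ""
        if cutoff is None or (raw and float(raw) >= cutoff):
            high_confidence.append(row)
    return high_confidence, high_recall

# Made-up sample data; the identifiers are placeholders, not real mappings.
curated_text = (
    "# curated file with explicit match types\n"
    "subject_id\tpredicate_id\tobject_id\n"
    "HP:9999991\tskos:exactMatch\tMP:9999991\n"
)
scored_text = (
    "subject_id\tpredicate_id\tobject_id\tconfidence\n"
    "HP:9999992\tskos:closeMatch\tMP:9999992\t0.95\n"
    "HP:9999993\tskos:closeMatch\tMP:9999993\t0.40\n"
)

sources = [
    read_sssom_tsv(curated_text, "curated"),
    read_sssom_tsv(scored_text, "scored"),
]
high_confidence, high_recall = merge_and_split(
    sources, cutoffs={"curated": None, "scored": 0.8}
)
```

With this sample input, the high-recall output keeps every row from both sources, and the high-confidence output keeps the curated row plus the one scored row above its cutoff. Each row carries its `source_label`, so consumers who care about provenance can still filter on it.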