
Defining a basic, non-normative model for confidence in SSSOM ontologies

Open matentzn opened this issue 2 years ago • 3 comments

Confidence in mappings is a tricky issue. While SSSOM has a nice confidence field, it is not very clear from the specification alone what it pertains to. There are at least two possible interpretations:

  1. confidence of the mapping: the likelihood that the mapping is correct. This is most likely the prevalent interpretation, but not the one we intended.
  2. confidence of the mapping justification: the degree of trust the justification provides in the truthfulness of the mapping. This is what we originally intended, but never communicated very well.

In practice, both are quite similar (especially in the frequent case of only having a single justification), but the reality is that a mapping can have multiple justifications, each providing a different level of confidence in the truthfulness of the mapping. We can have a low confidence value provided by a lexical match justification and a high confidence value provided by a human-curated match, and neither, all by itself, says anything about the "likelihood that the mapping is correct".

The fact of the matter is, they mean different things. And to make things worse, we have the following to consider:

  • the phrase "likelihood that the mapping is correct" is basically meaningless, as mappings cannot really be true in the philosophical sense of the idea of truth. Mappings can serve a purpose.
  • There are at least two more stakeholders that the SSSOM standard considers, but has not yet really documented well:
    • registry confidence: The confidence of a mapping registry into the quality of a specific mapping set, which is basically a measure of trust of the registry into the mapping provider
    • user ratings (semapv:MappingReview): Basically thumbs up/down votings or confirmations that a particular mapping is correct (this is similar to semapv:ManualMappingCuration but not quite the same, as it does not include the search for alternative, possibly better, mappings)

Now, given all this complexity, it makes sense to think about a recommended way for tools to determine the overall confidence in a mapping. For example, consider an instance of OxO loading a mapping set with

  • a low registry_confidence (not too trustworthy, e.g. ad-hoc lexical matching)
  • multiple justifications per mapping (all with different confidence levels)

We also want to support a user-rating feature in the app (thumbs up/down).

The two concrete things we need to determine are these:

  1. How should the tool compute overall mapping confidence? ("Give me all the high confidence, >90%, mappings")
  2. How should the tool capture that confidence value? By creating an additional semapv:CompositeMatching justification with mapping_tool=OxO and a confidence value compounded from all the others? By adding a non-standard mapping_confidence value to the internal data model and using that to drive search?

I don't think anything should be done here in a normative way, but I think it is valuable to discuss this or at least have a ticket to capture some of our thoughts on the matter.

For me personally, right now, I tend to think something like this is a good start for computing the mapping confidence:

mapping confidence = (m*AVG(confidence)) * (n*RegistryConfidence) * (o * (thumbs-up/ratings))

with m, n, o initially set to 1, but independently adjustable by the mapping browser developer.

and recommending throwing a new semapv:CompositeMatching justification into the mapping database to capture this.
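As a rough sketch of the weighted-product formula above (not part of the SSSOM spec; all function and parameter names here are illustrative), this could look like:

```python
# Hypothetical sketch of: (m*AVG(confidence)) * (n*RegistryConfidence) * (o*(thumbs-up/ratings))
# None of these names are SSSOM-standard slots.

def overall_confidence(justification_confidences, registry_confidence,
                       thumbs_up, total_ratings, m=1.0, n=1.0, o=1.0):
    """Compute overall mapping confidence as a weighted product.

    m, n, o default to 1 but are independently adjustable by the
    mapping-browser developer, as suggested above."""
    avg_confidence = sum(justification_confidences) / len(justification_confidences)
    # With no ratings yet, treat the rating factor as neutral (1.0).
    rating_score = thumbs_up / total_ratings if total_ratings else 1.0
    return (m * avg_confidence) * (n * registry_confidence) * (o * rating_score)

# e.g. two justifications, a fairly trusted registry, 3 of 4 positive reviews:
print(overall_confidence([0.6, 0.9], 0.8, 3, 4))
```

One caveat with a plain product: every factor below 1 pulls the result down, so a mapping with many weak-but-agreeing justifications scores lower than intuition might suggest (the binomial model discussed further down behaves differently).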

matentzn avatar Nov 21 '23 13:11 matentzn

The need to build a more comprehensive model for confidence is rising with @cthoyt's request #438

Mapping confidence has, the way I currently think of it, the following layers (this is a summary of the above):

Mapping confidence by owner:

  • individual mapping level (confidence): "the mapping author is X * 100 % certain that the mapping subject, predicate, object is correct"
  • mapping set level (#438): "the mapping publisher is X * 100% certain of the general quality of the mapping set" (this can be, as @cthoyt suggests, something like an average confidence when using lexical matches, but often this is simply a hand-wavy judgement).

Mapping confidence by registry

  • mapping set level (#264, mapping_registry_confidence): independent of what the mapping set publisher thinks, the registry indexing the mapping set might have a different view on how trustworthy a mapping actually is. For example, I might think that Uberon mappings are 95% correct, while I believe that OMIM mappings are 80% correct (just an example).

Mapping confidence by user rating

  • mapping level (property is TBD): users of the mappings often spot errors in mappings. A simple model would be the number of positive reviews (thumbs up) divided by the total number of reviews.
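The three layers above could be captured in a small data model along these lines (a sketch only; the class and field names are illustrative, not SSSOM-standard slots):

```python
# Hypothetical data model for the confidence layers described above.
from dataclasses import dataclass

@dataclass
class MappingConfidence:
    confidence: float                   # per-mapping, from the mapping author
    mapping_set_confidence: float       # per-set, from the publisher (#438)
    mapping_registry_confidence: float  # per-set, from the registry (#264)
    thumbs_up: int = 0                  # user ratings (property TBD)
    total_ratings: int = 0

    @property
    def rating_score(self) -> float:
        """Positive reviews over total reviews; neutral (1.0) with no reviews."""
        return self.thumbs_up / self.total_ratings if self.total_ratings else 1.0
```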

For considerations on how to calculate this, see the comment above. The mapping set confidence will make the equation somewhat more complex, but probably not too much:

mapping confidence = `AVG(confidence)` * `mapping_set_confidence` * `thumbs-up/ratings` * `mapping_registry_confidence`

This will of course never be normative, just spitballing here some thoughts for a future documentation page.

matentzn avatar May 05 '25 07:05 matentzn

Totally agree with this! We had a similar discussion that made it into the SeMRA Design Document, and many of these ideas are already implemented in SeMRA / described in its preprint.

One small difference is that SeMRA uses a binomial model for aggregating confidences, so the confidence is

mapping confidence = 1 - (1 - `AVG(confidence)`) * (1 - `mapping_set_confidence`) * (1 - `thumbs-up/ratings`) * (1 - `mapping_registry_confidence`)

SeMRA doesn't touch mapping reviews yet, other than incorporating negative mappings. It's actually fairly clear-cut that if anyone goes as far as saying a mapping is incorrect, then it goes to 0% confidence immediately. I didn't build out the confidence model to assess the likelihood of negative assertions being true, as they are pretty final. We could use Biomappings as a source of (nontrivial) negative mappings for playing around with the math here.
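To make the difference between the two aggregation models concrete, here is a small sketch contrasting a plain product with the binomial-style (noisy-OR) formula above; function names are illustrative, and this is not taken from the SeMRA codebase:

```python
# Contrast of the two aggregation models discussed in this thread.

def product_confidence(factors):
    """Multiplicative model: any factor below 1 drags the result down."""
    result = 1.0
    for f in factors:
        result *= f
    return result

def binomial_confidence(factors):
    """Binomial / noisy-OR style: 1 minus the chance that every
    independent source of evidence is wrong, so evidence accumulates."""
    result = 1.0
    for f in factors:
        result *= (1.0 - f)
    return 1.0 - result

def apply_negative(confidence, has_negative_assertion):
    """Per the comment above: a negative assertion zeroes confidence."""
    return 0.0 if has_negative_assertion else confidence

factors = [0.75, 0.8, 0.9]           # e.g. avg confidence, set, registry
print(product_confidence(factors))   # shrinks with each factor
print(binomial_confidence(factors))  # grows toward 1 as evidence stacks up
```

With the same inputs the binomial model yields a much higher score than the product model, which matches the intuition that several independent moderately-confident sources should reinforce each other rather than discount each other.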

cthoyt avatar May 05 '25 08:05 cthoyt

> negative mappings - it's actually a bit more clear cut that if anyone goes as far as to say a mapping is incorrect, then it goes to 0% confidence immediately

Err, I don’t see how that is clear cut, unless you are not talking about what I think you are.

When you say “negative mappings”, do you refer to mappings with a sssom:predicate_modifier set to Not? And do you consider that an “incorrect mapping”?

Because I am absolutely confident that UBERON:0000981 is not an exact match to FBbt:00004644. And I could very well want to state exactly that in a “negative” mapping record with a Not predicate modifier and a 100% confidence value.

gouttegd avatar May 05 '25 10:05 gouttegd