Associate columns more clearly with each other in the data model
One of our helpful reviewers for the SSSOM paper noted that there is no explicit association between some of the columns that apply to each other. For example, nothing links predicate_modifier to predicate_id, or object_mapping_field to object_id.
We removed these links for technical reasons a while back, but we should re-introduce them. As of now, they are only implicit in the documentation, not explicit in the model. This is particularly important for match_type and for tool-related fields such as semantic_similarity_score.
A typical approach taken in databases, and already largely adopted by many SSSOM properties (https://mapping-commons.github.io/sssom/Mapping/), is to prefix the property name with the "entity" to which the property applies. Indeed:
- object_id
- object_label
- object_category
- object_type
- object_source
- object_source_version
all relate to the object.
So things like semantic_similarity_score or semantic_similarity_measure could perhaps be prefixed with "mapping_tool".
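To make the prefix idea concrete, here is a minimal Python sketch that groups column names by the entity prefix they carry. The prefix list and the `group_by_entity` helper are assumptions made for illustration; in particular, "mapping_tool" is only the prefix suggested in this comment, not an existing SSSOM convention.

```python
from collections import defaultdict

# Hypothetical entity prefixes; "mapping_tool" is the suggestion from this
# thread, not something defined by SSSOM today.
ENTITY_PREFIXES = ["subject", "predicate", "object", "mapping_tool"]

# Check longer prefixes first so a multi-word prefix is never shadowed
# by a shorter one.
_ORDERED_PREFIXES = sorted(ENTITY_PREFIXES, key=len, reverse=True)


def group_by_entity(columns):
    """Group column names by the entity prefix they start with."""
    groups = defaultdict(list)
    for column in columns:
        for prefix in _ORDERED_PREFIXES:
            if column.startswith(prefix + "_"):
                groups[prefix].append(column)
                break
        else:
            # No known prefix: the association exists only in the docs.
            groups["(unassociated)"].append(column)
    return dict(groups)


if __name__ == "__main__":
    example = [
        "object_id",
        "object_label",
        "predicate_id",
        "predicate_modifier",
        "semantic_similarity_score",
        "mapping_tool_semantic_similarity_score",
    ]
    for entity, names in group_by_entity(example).items():
        print(entity, names)
```

With this convention, an unprefixed column such as semantic_similarity_score lands in the "(unassociated)" group, which is exactly the gap the reviewer pointed out.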
To represent more complex things about the predicate, such as transitivity / symmetry (#136), boolean-valued columns could be used, for example:
- predicate_transitivity
- predicate_symmetry
If these values conflict with the original definition (as given by the vocabulary that defines the property/predicate), it is up to the re-user to decide what to do.
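As a rough illustration of how a re-user might spot such a conflict, the sketch below compares the two proposed boolean columns against a small lookup table. The column names `predicate_transitivity` and `predicate_symmetry`, the `VOCAB_PROPERTIES` table and the `find_conflicts` helper are all assumptions for this example; the table is hand-written here rather than read from the vocabularies themselves.

```python
# Illustrative, hand-written lookup of what the defining vocabulary says.
# A real check would read this from the vocabulary itself.
VOCAB_PROPERTIES = {
    "skos:exactMatch": {"transitive": True, "symmetric": True},
    "skos:broadMatch": {"transitive": False, "symmetric": False},
}


def find_conflicts(row):
    """Return the characteristics where the row disagrees with the vocabulary.

    `row` is a dict for one mapping; the two boolean columns are the ones
    proposed in this thread and are optional.
    """
    original = VOCAB_PROPERTIES.get(row["predicate_id"])
    if original is None:
        # Unknown predicate: nothing to compare against.
        return []
    declared = {
        "transitive": row.get("predicate_transitivity"),
        "symmetric": row.get("predicate_symmetry"),
    }
    return [
        name
        for name, value in declared.items()
        if value is not None and value != original[name]
    ]


if __name__ == "__main__":
    row = {
        "predicate_id": "skos:broadMatch",
        "predicate_transitivity": True,
        "predicate_symmetry": False,
    }
    print(find_conflicts(row))
```

The function only reports the disagreement; deciding which side wins is deliberately left to the caller, in line with the suggestion above.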
This is a good idea. I am still playing with something more concrete at the schema level, though (basically validation rules in the schema that establish the connection). This could avoid overly lengthy column/slot names.
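One way such a rule could look, sketched in plain Python rather than in the schema language itself: each dependent column names the column it applies to, and a row is flagged when the dependent value is present without its anchor. The `REQUIRES` table and the `validate_row` function are hypothetical; the two pairs are the ones mentioned at the top of this issue.

```python
# Hypothetical rule table: dependent column -> column it applies to.
# The pairs come from the examples in this issue, not from the SSSOM schema.
REQUIRES = {
    "predicate_modifier": "predicate_id",
    "object_mapping_field": "object_id",
}


def validate_row(row):
    """Return one error message per dependent column that lacks its anchor."""
    errors = []
    for dependent, anchor in REQUIRES.items():
        # Empty strings and None both count as "not set".
        if row.get(dependent) and not row.get(anchor):
            errors.append(f"{dependent} is set but {anchor} is missing")
    return errors


if __name__ == "__main__":
    print(validate_row({"predicate_modifier": "Not"}))
    print(validate_row({"predicate_modifier": "Not", "predicate_id": "skos:exactMatch"}))
```

Keeping the association in a rule table like this leaves the column names short, since the link lives in the validation layer instead of in the name.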