Enhance enums
We have the option of providing linkml:meaning to the permissible values (PVs) in enums
Note: this does affect RDF serialization (a URI is used rather than a string literal), so if we do this it should happen before the 1.0.0 release
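To make the serialization change concrete, here is a minimal sketch that builds the two N-Triples lines by hand. It is not how LinkML's RDF dumper works internally. The `sssom:` prefix expansion, the `mapping_cardinality` predicate IRI and the `example.org` subject are assumptions for illustration; `sssom:OneToOne` is the candidate meaning proposed later in this thread.

```python
SSSOM = "https://w3id.org/sssom/"  # assumed expansion of the sssom: prefix


def triple(subject, predicate, value, meaning=None):
    """Return one N-Triples line for an enum-valued slot.

    Without a meaning the permissible value is written as a string literal;
    with a meaning the object becomes the IRI of that meaning.
    """
    if meaning is None:
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        obj = '"' + escaped + '"'
    else:
        obj = "<" + meaning + ">"
    return "<{}> <{}> {} .".format(subject, predicate, obj)


subject = "https://example.org/mapping/1"  # hypothetical mapping IRI
predicate = SSSOM + "mapping_cardinality"  # assumed slot IRI

without_meaning = triple(subject, predicate, "1:1")
with_meaning = triple(subject, predicate, "1:1", meaning=SSSOM + "OneToOne")

print(without_meaning)
print(with_meaning)
```

Any consumer that currently matches on the literal `"1:1"` would have to switch to matching the IRI, which is why the change is worth making before a stable release.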
predicate_modifier_enum:
  permissible_values:
    Not: Negating the mapping predicate. The meaning of the triple becomes subject_id is not a predicate_id match to object_id.
some vocabs like SNOMED include a code for NO but this is overkill I think
We could use owl:complementOf or NegativePropertyAssertion but this doesn't seem quite right
mapping_cardinality_enum:
  permissible_values:
    "1:1": One-to-one mapping
    "1:n": One-to-many mapping
    "n:1": Many-to-one mapping
    "1:0": One-to-none mapping
    "0:1": None-to-one mapping
    "n:n": Many-to-many mapping
not aware of any vocabulary
match_type_enum:
  permissible_values:
    Lexical: Lexical match
    Logical: Logical match
    HumanCurated: Match based on human expert opinion
    Complex: Match based on a variety of different strategies
    Unspecified: Unknown match type
    SemanticSimilarity: Match based on close semantic similarity
not aware of a single vocab but could potentially be added to EDAM?
match_term_type_enum:
  permissible_values:
    TermMatch: A match between two terms
    ConceptMatch: A match between two SKOS concepts
    ClassMatch: A match between two OWL/RDFS classes
    ObjectPropertyMatch: A match between two OWL object properties
    IndividualMatch: A match between two OWL Individuals
    DataPropertyMatch: A match between two OWL data properties
not aware... but I think this enum needs more docs.
A match between a Class and OP is valid - e.g. many classes in PATO are famously reifications of RO properties
and where is AnnotationProperty?
What about a match between two distinct property types? Perfectly valid. And what if we match something like a SKOS property, which we are punning?
I think we would just use TermMatch as a generic catch-all? This is fine, but we should say so explicitly
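One way to state the catch-all rule explicitly is sketched below. This is only an illustration of the proposed behaviour, not something the SSSOM schema defines: the `match_term_type` function, the `SAME_KIND_MATCH` table and the choice of entity-kind labels are all hypothetical.

```python
# Hypothetical lookup: entity kind -> the specific match_term_type value
# that applies when BOTH sides of the mapping are of that kind.
SAME_KIND_MATCH = {
    "skos:Concept": "ConceptMatch",
    "owl:Class": "ClassMatch",
    "owl:ObjectProperty": "ObjectPropertyMatch",
    "owl:NamedIndividual": "IndividualMatch",
    "owl:DatatypeProperty": "DataPropertyMatch",
}


def match_term_type(subject_kind, object_kind):
    """Pick a match_term_type value, falling back to the generic TermMatch.

    TermMatch is returned when the two sides differ in kind (e.g. a class
    matched to an object property) or when the shared kind has no dedicated
    value (e.g. annotation properties).
    """
    if subject_kind == object_kind:
        return SAME_KIND_MATCH.get(subject_kind, "TermMatch")
    return "TermMatch"


print(match_term_type("owl:Class", "owl:Class"))
print(match_term_type("owl:Class", "owl:ObjectProperty"))
print(match_term_type("owl:AnnotationProperty", "owl:AnnotationProperty"))
```

Under this reading, the PATO class to RO property case and the AnnotationProperty case both resolve to TermMatch rather than needing new enum values.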
preprocessing_method_enum:
  permissible_values:
    Stemming:
    TaxonRestrictionRemoval:
not aware of a vocabulary
Options:
1. try harder to find terms, maybe engage EDAM, SWO
2. leave with no meaning, but adorn with mappings later (will still be serialized as literals)
3. make our own IDs, in the SSSOM namespace, with the option of a standalone ontology later
If we want to do 3, we should do it before the 1.0 release
Ah shoot, I just did the 1.0 release (and deleted it again). I didn't see your issue before. I will think about this more tomorrow.
This is a bit more intense than I was hoping for, one second before I wanted to release 1.0.
mapping_cardinality_enum:
  permissible_values:
    "1:1":
      description: One-to-one mapping
      meaning: sssom:OneToOne
    "1:n": One-to-many mapping
    "n:1": Many-to-one mapping
    "1:0": One-to-none mapping
    "0:1": None-to-one mapping
    "n:n": Many-to-many mapping
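If the "make our own IDs in the SSSOM namespace" option were chosen, the remaining values could be filled in mechanically. The sketch below derives a CURIE from each description; only `sssom:OneToOne` appears in this thread, so the naming scheme and the `mint_curie` helper are assumptions, and the other generated names are hypothetical.

```python
DESCRIPTIONS = {
    "1:1": "One-to-one mapping",
    "1:n": "One-to-many mapping",
    "n:1": "Many-to-one mapping",
    "1:0": "One-to-none mapping",
    "0:1": "None-to-one mapping",
    "n:n": "Many-to-many mapping",
}


def mint_curie(description, prefix="sssom"):
    """Turn 'One-to-one mapping' into 'sssom:OneToOne' (assumed naming scheme)."""
    suffix = " mapping"
    head = description[: -len(suffix)] if description.endswith(suffix) else description
    local_name = "".join(part.capitalize() for part in head.split("-"))
    return prefix + ":" + local_name


# Expanded form of every permissible value, with a minted meaning attached.
permissible_values = {
    pv: {"description": desc, "meaning": mint_curie(desc)}
    for pv, desc in DESCRIPTIONS.items()
}

for pv, entry in permissible_values.items():
    print(pv, entry["meaning"])
```

Minted names like these would still need review before release, since once published they become part of the RDF serialization.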
I think there needs to be a way to encode the enum mapping curation rules, for each set of enums. The combination of mapping rules in the file header and in the columns together should provide the meaning and provenance of each mapping. We may need a separate file or schema for this, as the current SSSOM files really have focused on the term-to-term mappings and not encoding list-to-list rules/provenance.
I think there needs to be a way to encode the enum mapping curation rules, for each set of enums.
@mellybelly can you give a concrete example? I have a hard time imagining what such a curation rule would look like (also because it is late here :))
An example would help me too. I'm afraid I'm a little lost about how list-to-list rules/provenance differs from term-to-term mappings. I may be at the wrong level of abstraction, as I'm thinking of list entries as just another type of term ('everything is an IRI' in my simple world). Or, feel free to say RTFM if that's the issue.
This issue is about the enums used in the metamodel, so it's largely separate from any issues about how users may map their own enums, value sets, whatever.
Specifically:
- is there an existing ontology whose URIs we can reuse for concepts such as "many to many mapping" or "lexical match"?
I think the curation rules thing is important - I'm also not clear on this but I think it deserves its own issue