`*_mappings` vs `slot_uri` and `class_uri`

Open ialarmedalien opened this issue 2 years ago • 5 comments

What is the difference between slot_uri (or class_uri) and the *_mappings fields, including exact_mappings? e.g. in the Person example, there are a couple of close and exact mappings, as well as examples of class and slot URIs:

  HasAliases:
    attributes:
      aliases:
        multivalued: true
        exact_mappings:
          - schema:alternateName
  Event:
    close_mappings:
      - schema:Event
  Organization:
    is_a: NamedThing
    class_uri: schema:Organization
slots:
  name:
    slot_uri: schema:name
  description:
    slot_uri: schema:description

If we pretend that the Event example was an exact mapping, would it be annotated using class_uri or exact_mappings? What happens if I have several terms (e.g. from different vocabularies) that could potentially be used as a class_uri or slot_uri?

Another example from the docs:

  gene:
    slots:
      ...
    exact_mappings:
      - SO:0000704
      - SIO:010035
      - WIKIDATA:Q7187

Why are these exact_mappings and not class_uris?

ialarmedalien avatar Aug 04 '22 18:08 ialarmedalien

Hi @ialarmedalien - thanks for this question, I am going to tag it as a good "FAQ" entry for us to add to the documentation.

class_uri points the user to a robust class definition published somewhere on the web. When LinkML generates a JSON-LD context file from your schema, the class_uri is the URI used for that class in your model.

exact_mappings are slightly weaker than class_uri: they let you retain a mapping without committing to completely reusing a linked data concept. It's also nice to be able to provide more than one exact mapping, especially in a domain where several sources have definitions of a concept that, for the purposes of your schema, are exactly equivalent to your class definition.
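A minimal sketch of the two options, applied to the Person example from the question. The schema `id`, the `ex` prefix and its URL are placeholders, and the `foaf:Organization` mapping is only an illustrative assumption; the comments restate the behaviour described above and are not verified generator output.

```yaml
id: https://example.org/sketch        # placeholder schema id
name: sketch
prefixes:
  ex: https://example.org/sketch/     # placeholder namespace for local terms
  schema: http://schema.org/
  foaf: http://xmlns.com/foaf/0.1/
default_prefix: ex

classes:
  Organization:
    # Commits to reusing the schema.org concept: this URI stands for the
    # class in linked-data output such as the JSON-LD context.
    class_uri: schema:Organization
    # Terms from other vocabularies can still be recorded alongside it.
    exact_mappings:
      - foaf:Organization
  Event:
    # No class_uri: the class keeps a URI of its own (expected to fall
    # under default_prefix), and the schema.org term is only a mapping.
    exact_mappings:
      - schema:Event
```

Read this way, an exact Event match could be expressed with either field; the choice is whether the schema adopts the external URI as its own or keeps a local one and documents the equivalence.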

Some additional information that might help:

sierra-moxon avatar Aug 04 '22 19:08 sierra-moxon

What does one do in the situation where there are several potential terms that could be the class_uri, particularly if you have already used a couple of different vocabularies for setting class_uri and slot_uri values? Just pick one at random?

The documentation and examples don't make it clear how class_uri / slot_uri differ from exact_mappings, which is why I raised this question. It's also unclear why class_uri and slot_uri are limited to cardinality one, apart from just for generating the jsonld version.

ialarmedalien avatar Aug 04 '22 21:08 ialarmedalien

It would be super helpful to see a specific example of your schema and/or the data domain that you are modeling to get a clearer sense of the issue. But in general, if a class can have more than one definition, and it's difficult to pick just one, I would make use of the exact_mappings (or narrow, broad, related mappings) metamodel slots to hold them. And while it's very good practice to first check whether a single existing definition would work, it could be that the definition in your schema is the unifying definition, in which case it makes sense for the URI of your class to be defined by you.

Even if those mappings are only ever used to document the work you've done to establish that two or more definitions can be mapped together, that's a win.

@cmungall could definitely help sort out class_uri from mappings. @matentzn might also have some ideas here about mappings more specifically (he is an expert on the SSSOM standard, which in part aims to document and give provenance to mappings). @hrshdhgd

sierra-moxon avatar Aug 04 '22 22:08 sierra-moxon

I'm capturing dataset citation information from different sources, including NMDC, JGI, DataCite, CrossRef, OSTI, and others (I have posted about this on the Monarch slack in case this sounds familiar!). There are a number of fields that are shared by all the sources, such as dataset title, creator/contributor/authors, version, and ID. From what you have said, it will be more useful for me to maintain *_mappings to capture the equivalence between local terms and those used by other sources, as my interest is in transforming / translating data and it's not as important to have definitive class_uri or slot_uri values.
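A rough sketch of what that could look like for the shared fields. Everything here is hypothetical: the slot names, the placeholder `ex` namespace, and the choice of Dublin Core and schema.org terms as mapping targets. The mappings are illustrative rather than vetted, and source-specific field names (DataCite, CrossRef, and so on) would need prefixes of their own, which are not shown.

```yaml
id: https://example.org/dataset-citation   # placeholder schema id
name: dataset_citation
prefixes:
  ex: https://example.org/dataset-citation/  # placeholder local namespace
  schema: http://schema.org/
  dcterms: http://purl.org/dc/terms/
default_prefix: ex

slots:
  title:
    description: Title of the dataset.
    # No slot_uri: the local term stays authoritative and the external
    # terms are recorded purely as translation targets.
    exact_mappings:
      - dcterms:title
    close_mappings:
      - schema:name
  creators:
    description: People or organizations credited with the dataset.
    multivalued: true
    exact_mappings:
      - dcterms:creator
      - schema:creator
  version:
    description: Version label of the dataset.
    exact_mappings:
      - schema:version
  identifier:
    description: Persistent identifier of the dataset.
    exact_mappings:
      - dcterms:identifier
      - schema:identifier
```

Because each slot can carry several mappings, a single local term can point at its counterpart in every source without any one of them being promoted to slot_uri.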

ialarmedalien avatar Aug 05 '22 14:08 ialarmedalien

My sense is, and I am far from the authority on this, that class_uri and slot_uri are relevant only to the RDF conversion/context use cases. They do not have any further implications. Indeed, a cleaner way to model this would have been:

exact_mappings:
  - id: SO:0000704
  - id: SIO:010035
  - id: WIKIDATA:Q7187
    preferred_for_export: TRUE

or some such, but in any case, I think this is what the class_uri means in effect.
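For comparison, a sketch of how the same intent might be written with the fields that exist today, on the assumption that `preferred_for_export` above is hypothetical syntax and not part of LinkML. The prefix expansions are assumptions based on commonly used namespaces, and the choice of the Wikidata term as the preferred one simply mirrors the snippet above.

```yaml
prefixes:
  SO: http://purl.obolibrary.org/obo/SO_          # assumed expansion
  SIO: http://semanticscience.org/resource/SIO_   # assumed expansion
  WIKIDATA: http://www.wikidata.org/entity/       # assumed expansion

classes:
  gene:
    # The single term preferred for RDF / JSON-LD export.
    class_uri: WIKIDATA:Q7187
    # The remaining equivalent terms stay available as mappings.
    exact_mappings:
      - SO:0000704
      - SIO:010035
```

On this reading, the cardinality of one on class_uri plays the role of the "preferred" flag: many terms can be equivalent, but only one can be emitted as the class's URI.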

matentzn avatar Aug 05 '22 17:08 matentzn

it seems like we have an answer for this ticket; I am going to close it, but of course feel free to reopen if I got that wrong @ialarmedalien :)

sierra-moxon avatar Dec 22 '22 19:12 sierra-moxon