define "slot_name" also known as "structured_comment_name"
The slot name is the LinkML attribute for the GSC MIxS attribute called "Structured comment name". The structured comment name is the name of a checklist item as it will appear in GenBank structured comments.
All LinkML slots have a name, even if it isn't explicitly asserted. For example, in this minimal schema:
id: http://example.com/minimal # range URI
name: minimal
default_prefix: minimal
prefixes:
  minimal: http://example.com/minimal/
slots:
  age:
    required: true
    description: the amount of time since something was created, born, etc.
the age slot is inferred to mean this:
slots:
  age:
    name: age
    description: the amount of time since something was created, born, etc.
    from_schema: http://example.com/minimal
    required: true
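The inference above can be mimicked in a few lines of Python. This is only a sketch of the idea, not how LinkML itself is implemented: `fill_in_slot_names` is a hypothetical helper, and the schema is written as a plain dict standing in for the parsed YAML.

```python
def fill_in_slot_names(schema: dict) -> dict:
    """Return a copy of the schema in which every slot carries an explicit name.

    Hypothetical helper: the key under `slots` is used as the name whenever
    the slot body does not assert one.
    """
    slots = {}
    for key, body in schema.get("slots", {}).items():
        body = dict(body or {})        # a slot may have an empty body
        body.setdefault("name", key)   # keep an explicit name if one is present
        body.setdefault("from_schema", schema["id"])
        slots[key] = body
    return {**schema, "slots": slots}


minimal = {
    "id": "http://example.com/minimal",
    "name": "minimal",
    "slots": {
        "age": {
            "required": True,
            "description": "the amount of time since something was created, born, etc.",
        }
    },
}

expanded = fill_in_slot_names(minimal)
print(expanded["slots"]["age"]["name"])  # age
```

The original dict is left untouched, so the asserted and inferred forms can be compared side by side.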
I never found a good way to describe that without getting into YAML jargon
Our task is to determine which LinkML slot naming practices we are going to follow, and whether we are going to claim any more rigorous constraints as a matter of GSC/MIxS policy, or in order to better interoperate with our partner systems.
The LinkML documentation (https://linkml.io/linkml-model/latest/docs/name/) says that the range of the name metaslot is string, so theoretically any number of characters of any type could go in there. The YAML specification (https://yaml.org/spec) requires that many non-alphanumeric characters be quoted if they are going to be used in key names.
One consideration for naming is that LinkML supports conversion of the schema and data to many different formats and serializations, and a poor choice of names can block us from using one or more of those formats, or create a serialization in which the name is represented differently from the YAML source of truth.
If a LinkML YAML file has a slot named "age of thing" and an attempt is made to convert it to OWL
linkml generate owl minimal.yaml
then the slot name is silently repaired in the IRI, while the label keeps the original spelling:
minimal:age_of_thing a owl:ObjectProperty ;
    rdfs:label "age of thing" ;
    skos:definition "the amount of time since something was created, born, etc." ;
    skos:inScheme <http://example.com/minimal> .
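To make the disconnect concrete, here is a small Python sketch. The function `iri_local_name` is a hypothetical stand-in for whatever the generator does internally; it only assumes that whitespace is replaced with underscores, which is consistent with the output above but not confirmed against the LinkML source.

```python
def iri_local_name(slot_name: str) -> str:
    """Hypothetical stand-in for the repair: join whitespace-separated tokens with '_'."""
    return "_".join(slot_name.split())


name = "age of thing"
local = iri_local_name(name)
print(local)          # age_of_thing
print(local != name)  # True: the IRI no longer matches the name in the YAML
```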
The LinkML linter is good at finding violations of safe naming practices, and its documentation (https://linkml.io/linkml/schemas/linter.html#standard-naming) says that the standard rule for slot names is snake_case.
It doesn't explicitly say that digits and punctuation shouldn't be used as the initial character, but those are definitely examples of things that would cause disconnects between the YAML source of truth and derived artifacts... possibly even in the documentation pages!
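If we did want to state a stricter rule as policy, it could be written down as a regular expression. The pattern below is an assumption for discussion, not the linter's actual rule: lowercase snake_case, starting with a letter, with no leading, trailing, or doubled underscores. The candidate names are made up for illustration.

```python
import re

# Assumed policy pattern (not taken from the LinkML linter):
# a lowercase letter first, then lowercase letters or digits,
# with single underscores allowed between tokens.
SAFE_SLOT_NAME = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def is_safe_slot_name(name: str) -> bool:
    """True if the whole name matches the assumed snake_case pattern."""
    return SAFE_SLOT_NAME.fullmatch(name) is not None


for candidate in ["age", "age_of_thing", "age of thing", "2nd_age", "_age", "Age"]:
    print(candidate, is_safe_slot_name(candidate))
```

Only the first two candidates pass; the others fail on a space, a leading digit, a leading underscore, and an uppercase letter respectively.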
I have heard it said that INSDC attribute names must be 20 characters or shorter, but the following documentation
https://www.ncbi.nlm.nih.gov/biosample/docs/attributes/
is full of much longer attribute names, like "biospecimen_repository_sample_id" at 32 characters.
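A length rule like that is trivial to check once we decide whether it applies. In this sketch the 20-character limit is just the number I have heard quoted, not something confirmed by the NCBI page, and `too_long` is a hypothetical helper.

```python
MAX_LEN = 20  # the limit as I have heard it stated; unconfirmed


def too_long(names, limit=MAX_LEN):
    """Return (name, length) pairs for names longer than the limit."""
    return [(n, len(n)) for n in names if len(n) > limit]


print(too_long(["biospecimen_repository_sample_id", "antibiotic_regm"]))
# [('biospecimen_repository_sample_id', 32)]
```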
We should also talk about policies around standardizing the underscore-separated tokens that make up a MIxS term name.
@mslarae13 has pointed out that there seems to be a lot of overlap between the "regm" and "treat" slots:
- chem_administration
- agrochem_addition
- chem_treatment
- pesticide_treatment
- antibiotic_treatment
- food_treat_proc
- antibiotic_regm
- fungicide_regm
- radiation_regm
- rainfall_regm
- herbicide_regm
- pesticide_regm
- fertilizer_regm
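One way to start the token discussion is to split the names on underscores and see which tokens recur. The sketch below uses only the slot names listed above; `group_by_token` is a hypothetical helper, and splitting on "_" is the only assumption about how MIxS names are built.

```python
from collections import defaultdict

slot_names = [
    "chem_administration", "agrochem_addition", "chem_treatment",
    "pesticide_treatment", "antibiotic_treatment", "food_treat_proc",
    "antibiotic_regm", "fungicide_regm", "radiation_regm", "rainfall_regm",
    "herbicide_regm", "pesticide_regm", "fertilizer_regm",
]


def group_by_token(names):
    """Map each underscore-separated token to the slot names that contain it."""
    groups = defaultdict(list)
    for name in names:
        for token in name.split("_"):
            groups[token].append(name)
    return groups


groups = group_by_token(slot_names)
# Tokens shared by more than one slot are candidates for standardization.
shared = {tok: names for tok, names in groups.items() if len(names) > 1}
for tok, names in sorted(shared.items()):
    print(tok, names)
```

On this list, "antibiotic" and "pesticide" each show up with both a "treatment" and a "regm" suffix, which is the overlap in question. Note that "treat" and "treatment" count as different tokens here, which is itself an argument for standardizing them.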