science-on-schema.org
variableMeasured: should SOSO recommend specific property ontologies?
The usefulness of schema:PropertyValue for interoperability depends on consistently specifying what the property is about, which is the role of schema:propertyID. Does SOSO want to recommend a specific vocabulary for better interoperability?
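For concreteness, here is a minimal sketch of what a variableMeasured entry with a propertyID looks like when built as JSON-LD from Python. The helper name `property_value` and the `https://example.org/vocab/...` URI are placeholders of mine, not terms from any of the vocabularies discussed here.

```python
import json


def property_value(name, property_id, unit_text=None):
    """Build a schema.org PropertyValue node as a plain dict (JSON-LD)."""
    node = {
        "@type": "PropertyValue",
        "name": name,
        # propertyID says what the variable is about, ideally as a URI
        "propertyID": property_id,
    }
    if unit_text is not None:
        node["unitText"] = unit_text
    return node


dataset = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example dataset",
    "variableMeasured": [
        property_value(
            "water temperature",
            "https://example.org/vocab/water_temperature",  # placeholder URI
            unit_text="degree Celsius",
        )
    ],
}

print(json.dumps(dataset, indent=2))
```

Two datasets only line up on this variable if both use the same propertyID value, which is the motivation for asking about a recommended vocabulary.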
Some Property vocabularies
Property URIs that might be used in the schema:propertyID field:
- CF names (CF standard name table): names have an id, but no namespace appears to be defined, so there don't seem to be dereferenceable URIs.
- SWEET has a set of property labels with some hierarchical structure providing weak semantics, but no explicit definitions, in the http://sweetontology.net/prop/Property namespace. Properties are at the conceptual level, e.g. 'total alkalinity', 'pH', 'precision', 'temperature range'.
- LTER Measurements: provides a word net, similar to SWEET, but without definitions or dedicated term URIs. All term URIs are query fragments on the 'vocab/index.php' resource.
- Structured Variable ontology (SVO): provides a vocabulary of measured variables based on its model. It is apparently only accessible on a web page, with URIs that are HTML fragment identifiers.
- Wikidata
- QUDT quantityKinds
- Minimum Information about any (x) Sequence (MIxS), the GSC family of minimum information standards
- USGS NWIS parameters.
- Scientific Variables Ontology
- US EPA substance registry
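Several of the notes above turn on what kind of URI a vocabulary hands out (plain path, query string, or fragment on a page). A rough sketch for sorting candidate propertyID values along those lines, using only the standard library; the function name `identifier_style` and the two example.org URIs are made up for illustration, and the category labels are my own:

```python
from urllib.parse import urlsplit


def identifier_style(uri):
    """Classify a candidate propertyID by the shape of its URI."""
    parts = urlsplit(uri)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return "not-http"  # bare name or id, nothing to dereference
    if parts.fragment:
        return "fragment"  # term is an anchor on a larger document
    if parts.query:
        return "query"     # term is a query against a script or service
    return "path"          # term has its own path


for uri in [
    "sea_water_temperature",                     # bare name, no namespace
    "http://sweetontology.net/prop/Property",    # namespace quoted above
    "https://example.org/vocab/index.php?id=1",  # hypothetical query-style URI
    "https://example.org/terms.html#pH",         # hypothetical fragment-style URI
]:
    print(identifier_style(uri), uri)
```

This only inspects the string. It says nothing about whether the URI actually resolves or what comes back, which would need an HTTP request per vocabulary.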
We also use measurement type classes for variable annotations from:
- ENVO
- ECSO
- Core observation concepts from OBOE and SSN/SOSA
- And many OBO Foundry vocabularies
Links to measurement types in ENVO and ECSO:
- ENVO measurement types, subclasses of quality?
- The Ecosystem Ontology (ECSO) Measurement Type
I think any legitimate, publicly accepted and accessible semantic resource should be fair game. That would include W3C-type ontologies, OBO Foundry ontologies, SWEET, and the other resources mentioned above. Since there are different communities involved, they need to be able to specify their own terms!
Suggest closing with a decision of NO
The challenge is that if there is not a recommended vocabulary to use, interoperability becomes problematic.
This is true... but interoperability is already more than problematic. I don't have a problem with particular communities who want to play nicely together jointly agreeing on some term set. The issue becomes harder when dealing with interoperability across communities where only a few fundamental concepts may be held in common. A classic example I'm dealing with at the moment: the biology community insists that species name is a relevant discovery parameter, which doesn't sit well with the astrophysicists in the community, who insist on other things like stellar classification (O, B, A, F, G, K, M). Both could probably agree on an object type of some sort (astronomical entity, biological organism) but not much below that.
Good point. I guess it's a question of the scope of users you want to accommodate. For a cross-domain situation (e.g. trying to work with biologists and astronomers ...) you'd probably need to think of that property as EntityType (i.e. what is this data record about). Its data type would be categorical, and the biologist could specify the range to be a vocabulary of taxa, while the astronomer would specify a vocabulary of star types. One suggestion that has been made is that the property ID could have multiple values, to account for different levels of granularity in the DDI variable cascade (e.g. EntityType, stellar class, stellarClassCode from a particular codebook). On the other hand, if the metadata is only scoped to a single community, then this isn't necessary.
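To make the multi-valued suggestion concrete, a sketch of two variables that each carry a cascade of propertyID values, ordered from general to specific. All URIs are example.org placeholders and `entity_type_variable` is a hypothetical helper; this also assumes consumers accept an array for propertyID.

```python
def entity_type_variable(name, property_ids):
    """PropertyValue whose propertyID lists identifiers from general to specific."""
    return {
        "@type": "PropertyValue",
        "name": name,
        "propertyID": list(property_ids),
    }


star = entity_type_variable("stellar class", [
    "https://example.org/core/EntityType",                  # cross-domain level
    "https://example.org/astro/StellarClass",               # community level
    "https://example.org/astro/codebook/StellarClassCode",  # codebook level
])

organism = entity_type_variable("species", [
    "https://example.org/core/EntityType",  # same cross-domain level
    "https://example.org/bio/Taxon",        # community level
])

# The two communities meet only at the most general identifier.
print(star["propertyID"][0] == organism["propertyID"][0])
```

A cross-domain catalog could index on the first entry, while community tools read further down the list.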
I generally agree that it is not possible, or necessarily desirable, to limit the set of permitted observed-property/variable-measured values to one or even a small number of vocabularies.
While it is useful to model good practice, the market will decide. While there is some activity to get some community agreement on these things (e.g. I-ADOPT in RDA), there will always be legitimate technical variations, particularly between disciplines, and also innovations.
Maybe the operational question here is 'what is the cardinality of propertyID?'. Allowing multiple values would at least partly accommodate 'legitimate technical variations, particularly between disciplines, and also innovations'.
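If propertyID may hold one value or several, consumers have to cope with both shapes. A small sketch of how that could be handled, assuming JSON-LD parsed into Python dicts; `property_ids` and `share_property` are hypothetical helper names and the URIs are placeholders.

```python
def property_ids(variable):
    """Return the propertyID values of a PropertyValue dict as a list."""
    value = variable.get("propertyID")
    if value is None:
        return []
    if isinstance(value, str):
        return [value]  # single value
    return list(value)  # already multi-valued


def share_property(a, b):
    """True if two variables have at least one propertyID in common."""
    return bool(set(property_ids(a)) & set(property_ids(b)))


single = {"@type": "PropertyValue",
          "propertyID": "https://example.org/core/EntityType"}
multi = {"@type": "PropertyValue",
         "propertyID": ["https://example.org/core/EntityType",
                        "https://example.org/astro/StellarClass"]}
other = {"@type": "PropertyValue",
         "propertyID": "https://example.org/bio/Taxon"}

print(share_property(single, multi))  # True
print(share_property(multi, other))   # False
```

With a cardinality of exactly one, `share_property` reduces to a string comparison, so the cost of allowing many values is mostly this normalization step.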