yaml-ld
meld JSON Schema and JSON-LD context/frame
As an information architect, I want a harmonized way of specifying validation (JSON Schema) and semantic binding (JSON-LD context & frames), so that I can reap both benefits for my JSON and YAML data.
These are very complementary:
- JSON Schema specifies the shape of JSON data for validation
- JSON-LD specifies the binding of JSON data to semantics, and how to convert RDF<->JSON
What's the relation to YAML:
- JSON is trivially convertible to YAML
- JSON Schema can be used to validate YAML, see https://www.npmjs.com/package/pajv, https://github.com/json-schema-everywhere/pajv, https://json-schema-everywhere.github.io/
- Many people write their JSON Schemas in YAML, e.g. in OAS 3
This is a sub-UCR of #19, which itself:
- considers a wider context
- doesn't have a specific goal yet, i.e. is just informational
- considers simple data modeling languages based on YAML, wherein JSON Schema is derived but is not the source
"JSON Schema plus JSON-LD" is an especially relevant case for our community, thus this UCR
- @ioggstream "JSON-LD and JSON Schema... I travel these boundaries quite often":
- Wouldn't it be nice to "construct a smooth path" so you don't need to cross any boundaries, and can think more about your data model rather than the various modeling mechanisms?
Prior art
(from https://github.com/json-ld/yaml-ld/issues/2):
1: @OR13: I often use OAS (OpenAPI Specification) / YAML with JSON-LD and JSON Schema. I like the idea of controlling both semantics and data shape at the same time, using only one file. OAS supports JSON Schema represented in YAML.
We tweaked the JSON Schema to support JSON-LD terms ($linkedData), so now we can present RDF types and JSON Schema types in a single YAML file. This helps us keep semantics and security in sync (more discussion in https://github.com/json-ld/yaml-ld/issues/2#issuecomment-1137629452).
For example:
$linkedData:
  term: AgActivity
  '@id': https://w3id.org/traceability#AgActivity
title: Agricultural Activity
2: @ioggstream added new keywords (x-jsonld-context, x-jsonld-type) to be compatible with OAS 3.0.
- Modified Swagger editor that also does semantic mapping: https://ioggstream.github.io/swagger-editor/
- spec REST API Linked Data Keywords (23 June 2022), source
- whitepaper Add semantic context to APIs / Schemas
- presentation Self-explaining APIs: A machine-readable, semantic approach to schema design
- (to be) used by the extensive Italian network of ontologies and controlled vocabularies for the Public Administrations (OntoPiA): https://github.com/italia/daf-ontologie-vocabolari-controllati
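To illustrate how a `$linkedData` annotation like the one in the first example could feed a JSON-LD context, here is a minimal Python sketch. It is not the tooling either author uses. The function name `context_from_schema`, the nested `name` property and its schema.org IRI are assumptions made for illustration, and the schema is given as a plain dict (as it would look after loading the YAML file with a YAML parser).

```python
def context_from_schema(schema):
    """Collect {term: IRI} pairs from every "$linkedData" annotation in a schema.

    Hypothetical helper: assumes each annotation carries "term" and "@id".
    """
    context = {}

    def visit(node):
        if isinstance(node, dict):
            linked = node.get("$linkedData")
            if isinstance(linked, dict) and "term" in linked and "@id" in linked:
                context[linked["term"]] = linked["@id"]
            for value in node.values():
                visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(schema)
    return {"@context": context}


# The top-level annotation comes from the example above; the "name"
# property is an invented addition to show nested annotations.
schema = {
    "$linkedData": {
        "term": "AgActivity",
        "@id": "https://w3id.org/traceability#AgActivity",
    },
    "title": "Agricultural Activity",
    "type": "object",
    "properties": {
        "name": {
            "$linkedData": {"term": "name", "@id": "https://schema.org/name"},
            "type": "string",
        }
    },
}

derived = context_from_schema(schema)
```

Here `derived["@context"]` maps `AgActivity` and `name` to their IRIs, so the context and the schema come from the same file and cannot drift apart.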
Considerations
Modularization
TODO
Potential Conflicts
(from https://github.com/json-ld/yaml-ld/issues/51):
JSON Schema includes the following $ keywords:
$schema, $vocabulary, $defs, $ref, $id, $anchor, $comment, $dynamicRef, $dynamicAnchor
If we decide to use the same sigil for both kinds of keywords, we should look out for conflicts:
- @id is a conflict with $id
- @vocab is a near-conflict with $vocabulary (i.e. could be confusing)
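The overlap can be checked mechanically. The following is a small Python sketch, assuming the JSON Schema keywords listed above and the keyword list from JSON-LD 1.1. The function names and the "one name is a prefix of the other" rule for near-conflicts are illustrative choices, not part of either specification.

```python
# JSON Schema keywords listed in this issue, without the "$" sigil.
JSON_SCHEMA_KEYWORDS = {
    "schema", "vocabulary", "defs", "ref", "id",
    "anchor", "comment", "dynamicRef", "dynamicAnchor",
}

# JSON-LD 1.1 keywords, without the "@" sigil.
JSON_LD_KEYWORDS = {
    "base", "container", "context", "direction", "graph", "id", "import",
    "included", "index", "json", "language", "list", "nest", "none",
    "prefix", "propagate", "protected", "reverse", "set", "type",
    "value", "version", "vocab",
}


def exact_conflicts(ld=JSON_LD_KEYWORDS, schema=JSON_SCHEMA_KEYWORDS):
    """Names that would collide if both sides used the same sigil."""
    return sorted(ld & schema)


def near_conflicts(ld=JSON_LD_KEYWORDS, schema=JSON_SCHEMA_KEYWORDS):
    """Pairs where one name is a proper prefix of the other (a heuristic)."""
    pairs = []
    for a in sorted(ld):
        for b in sorted(schema):
            if a != b and (a.startswith(b) or b.startswith(a)):
                pairs.append((a, b))
    return pairs


print(exact_conflicts())  # ['id']
print(near_conflicts())   # [('vocab', 'vocabulary')]
```

With these two lists, `id` is the only exact collision and `vocab` / `vocabulary` is the only near-collision, which matches the two items above.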
But maybe there is no problem if these keywords are localized to the Context vs Schema parts?
- After all, @id is already "overloaded" in JSON-LD:
"@container": "@id" # Node Identifier Indexing
"@id": "bart" # Node identifier
"@id": {"@id": "bart", "age": 42} # triple, for which RDF-star annotations will follow
If we still choose the $ sigil for the convenience context, we might add a note about the potential overlap with JSON Schema. For use cases where JSON Schema is in play, users may want to choose something entirely different for their keywords.
💖 for @id and 🥑 for @type, for instance :) This is Unicode, users aren't even limited by ASCII.
Wouldn't it be nice to "construct a smooth path" so you don't need to cross any boundaries, and can think more about your data model rather than the various modeling mechanisms?
I tried extensively to achieve a "Theory of everything", but the point is that RDF is not designed to describe syntaxes, and JSON Schema is not designed to define semantics. In the API world, you define strict validation syntaxes for security reasons: this is not always the same thing you have with the generic rdfs:Class / rdfs:range.
Moreover, when creating e.g. cross-border services between different countries, you may use the same rdfs:Class with different datatypes / syntaxes. Since this is the actual reality of deployed services, the only interoperable way I found is to address syntax and semantics in isolation but in co-operation.
In general, I think that this topic goes beyond the goal of yaml-ld and should probably be addressed in JSON Schema and JSON-LD as a separate project (e.g. json-ld/restapi-ld-keywords).
A single file that can define both security constraints and semantics is useful.
However, it may very well be outside the capabilities or interest of this group...
This is the reason I am engaged here.
I'm interested in OAS in YAML with LD annotations.
We have a solution, but it could be better.
I'm not sure YAML-LD is really trying to solve the same problem; it seems more focused on RDF and less on API security.
@ioggstream so you sound quite pessimistic?
The problem with saying "this is out of scope" is that there's significant overlap between
- Schema to define the shape of data
- Context to define the mapping to URLs and datatypes
- Ontology to define the meaning of data (classes with their definitions, props with their definitions and domain/range)
If we don't address this UCR, how do you ensure that e.g. a prop URL is not misspelled between Schema, Context and Ontology?
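To make the misspelling risk concrete, here is a minimal Python sketch of the kind of cross-check that has to be done by hand when the three artifacts are maintained separately. Everything in it is an assumption for illustration: the function names, the `https://example.org/ns#` namespace, and the simplification that context terms map directly to full IRIs (no prefixes or `@vocab`).

```python
def term_iri(definition):
    """Return the IRI a context term maps to (plain string or expanded definition)."""
    if isinstance(definition, dict):
        return definition.get("@id")
    return definition


def check_alignment(schema, context, ontology_iris):
    """List mismatches between a JSON Schema, a JSON-LD context and an ontology.

    Hypothetical helper: `ontology_iris` is the set of property IRIs that
    the ontology actually defines.
    """
    problems = []
    terms = {k: term_iri(v) for k, v in context.items() if not k.startswith("@")}
    for prop in schema.get("properties", {}):
        if prop.startswith("@"):
            continue  # JSON-LD keywords such as "@id" need no term definition
        if prop not in terms:
            problems.append(f"schema property '{prop}' is not defined in the context")
    for term, iri in terms.items():
        if iri not in ontology_iris:
            problems.append(f"context term '{term}' maps to unknown IRI '{iri}'")
    return problems


schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
}
context = {
    "name": "https://example.org/ns#name",
    # Deliberate typo: "agee" instead of "age".
    "age": {
        "@id": "https://example.org/ns#agee",
        "@type": "http://www.w3.org/2001/XMLSchema#integer",
    },
}
ontology_iris = {"https://example.org/ns#name", "https://example.org/ns#age"}

problems = check_alignment(schema, context, ontology_iris)
```

Here `problems` contains a single entry reporting that the term `age` maps to an IRI the ontology does not define. A harmonized format would make this class of check unnecessary, because the property IRI would be written only once.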
Moreover, when creating e.g. cross-border services between different countries, you may use the same rdfs:Class with different datatypes / syntaxes. Since this is the actual reality of deployed services, the only interoperable way I found is to address syntax and semantics in isolation but in co-operation.
This seems like a counterargument. Overcoming regional differences is why you would include concise semantics in your syntax. Different jurisdictions will always have different requirements, use cases and APIs (not to mention languages). But they can align on the underlying semantics.
@VladimirAlexiev This presentation, which I hope to present at the next APISpec, explains my view on JSON-LD vs JSON Schema: https://docs.google.com/presentation/d/175ZFBXkhaawtvD97lU7II9-G2R6_a6XAMo82_BaYr4c/edit#slide=id.g14b35179850_0_8
It's not a trivial explanation, though.
I think a solution should work for both JSON-LD and YAML-LD so I won't address it here.
cc: @OR13