Support rdfs:domain and rdfs:range in generated schema by `import-rdfs`
When generating a LinkML schema with schemauto import-rdfs, the resulting schema does not incorporate the rdfs:domain and rdfs:range definitions.
E.g. the following excerpt from FOAF:
### http://xmlns.com/foaf/0.1/knows
foaf:knows rdf:type owl:ObjectProperty ;
rdfs:domain foaf:Person ;
rdfs:range foaf:Person ;
rdfs:comment "A person known by this person (indicating some level of reciprocated interaction between the parties)." ;
rdfs:isDefinedBy <http://xmlns.com/foaf/0.1/> ;
rdfs:label "knows" .
### http://xmlns.com/foaf/0.1/Person
foaf:Person rdf:type owl:Class ;
rdfs:subClassOf <http://www.w3.org/2003/01/geo/wgs84_pos#SpatialThing> ,
foaf:Agent ;
owl:disjointWith foaf:Project ;
rdfs:comment "A person." ;
rdfs:isDefinedBy <http://xmlns.com/foaf/0.1/> ;
rdfs:label "Person" .
results in the following LinkML schema:
slots:
  knows:
    comments:
    - A person known by this person (indicating some level of reciprocated interaction
      between the parties).
    slot_uri: foaf:knows
classes:
  Person:
    comments:
    - A person.
    is_a: Agent
    class_uri: foaf:Person
I would expect the rdfs:domain and rdfs:range of the foaf:knows property to be incorporated like this:
slots:
  knows:
    comments:
    - A person known by this person (indicating some level of reciprocated interaction
      between the parties).
    slot_uri: foaf:knows
    range: Person
classes:
  Person:
    comments:
    - A person.
    is_a: Agent
    class_uri: foaf:Person
    slots:
    - knows
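The expected mapping can be sketched in a few lines of Python. This is only an illustration of the desired behaviour, not the importer's actual code: the triples are hard-coded tuples rather than parsed RDF, and `local_name` and `build_schema` are hypothetical helpers.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Hard-coded (subject, predicate, object) triples standing in for the FOAF excerpt.
TRIPLES = [
    (FOAF + "knows", RDFS + "domain", FOAF + "Person"),
    (FOAF + "knows", RDFS + "range", FOAF + "Person"),
]


def local_name(uri):
    """Return the part of a URI after the last '#' or '/'."""
    return uri.replace("#", "/").rstrip("/").rsplit("/", 1)[-1]


def build_schema(triples):
    """Hypothetical helper: map rdfs:domain / rdfs:range triples to a LinkML-like dict."""
    slots = {}
    classes = {}
    for subject, predicate, obj in triples:
        slot = local_name(subject)
        slots.setdefault(slot, {})
        if predicate == RDFS + "range":
            # rdfs:range becomes the range of the slot.
            slots[slot]["range"] = local_name(obj)
        elif predicate == RDFS + "domain":
            # rdfs:domain lists the slot under the domain class.
            classes.setdefault(local_name(obj), {"slots": []})["slots"].append(slot)
    return {"slots": slots, "classes": classes}


schema = build_schema(TRIPLES)
print(schema)
```

The point of the sketch is only the direction of the mapping: the range ends up on the slot, while the domain ends up as a slot entry on the class.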
I believe I've fixed this in #152. Can you please test that branch to let me know if it resolves your issue?
@multimeric
I tested it with the latest commit (cfe7e15563ee1cf681b3eb266eb9ab8754f0c39c) of https://github.com/linkml/schema-automator/pull/152 and with that I do not get the expected results. However, looking at the diff it shows that you reverted some changes in schema_automator/importers/rdfs_import_engine.py when merging from https://github.com/linkml/schema-automator/pull/151, e.g.:
https://github.com/linkml/schema-automator/pull/152/commits/cfe7e15563ee1cf681b3eb266eb9ab8754f0c39c#diff-b6464c40227100611b000caf1086c351935db40bcf0f78ca606ca68d94e5ca3fL37-L43:
<<<<<<< HEAD
"domain_of": [HTTP_SDO.domainIncludes, SDO.domainIncludes, RDFS.domain],
"range": [HTTP_SDO.rangeIncludes, SDO.rangeIncludes, RDFS.range],
=======
"domain_of": [HTTP_SDO.domainIncludes, SDO.domainIncludes],
"rangeIncludes": [HTTP_SDO.rangeIncludes, SDO.rangeIncludes],
>>>>>>> cleanup-deps
When I test with commit 65869ba4d7739302e352ac02a8354a5746e89f34, from before the merge, rdfs:domain and rdfs:range are incorporated as expected!
Was the revert of those changes unintended?
Just two caveats:
- The generated schema has default_prefix: example, but it is not defined in prefixes:
    prefixes:
      linkml: https://w3id.org/linkml/
      dc: http://purl.org/dc/elements/1.1/
      vs: http://www.w3.org/2003/06/sw-vocab-status/ns#
      owl: http://www.w3.org/2002/07/owl#
      wot: http://xmlns.com/wot/0.1/
      foaf: http://xmlns.com/foaf/0.1/
      rdfs: http://www.w3.org/2000/01/rdf-schema#
    default_prefix: example
- All datatype properties with rdfs:range rdfs:Literal are generated with range: Literal, e.g.
    slots:
      jabberID:
        comments:
        - A jabber ID for something.
        slot_uri: foaf:jabberID
        range: Literal
  However, Literal is unrecognized, and I get this error when trying to run gen-python with this schema:
    gen-python foaf_schema.yaml
    ValueError: File "foaf_schema.yaml", line 21, col 12 slot: jabberID - unrecognized range (Literal)
Thanks for the report. It probably was just a faulty merge. I'll likely fix it early next week.
Okay, I've rebased and hopefully fixed the underlying issue.
@multimeric Thanks, I tried it with the latest commit and it now includes rdfs:domain and rdfs:range.
Only these two issues still persist:
- The generated schema has default_prefix: example, but it is not defined in prefixes:
    prefixes:
      linkml: https://w3id.org/linkml/
      dc: http://purl.org/dc/elements/1.1/
      vs: http://www.w3.org/2003/06/sw-vocab-status/ns#
      owl: http://www.w3.org/2002/07/owl#
      wot: http://xmlns.com/wot/0.1/
      foaf: http://xmlns.com/foaf/0.1/
      rdfs: http://www.w3.org/2000/01/rdf-schema#
    default_prefix: example
- All datatype properties with rdfs:range rdfs:Literal are generated with range: Literal, e.g.
    slots:
      jabberID:
        comments:
        - A jabber ID for something.
        slot_uri: foaf:jabberID
        range: Literal
  However, Literal is unrecognized, and I get this error when trying to run gen-python with this schema:
    gen-python foaf_schema.yaml
    ValueError: File "foaf_schema.yaml", line 21, col 12 slot: jabberID - unrecognized range (Literal)
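Until the importer handles rdfs:Literal itself, one possible stopgap is to post-process the generated schema before running gen-python. The sketch below is not part of schema-automator: `replace_literal_ranges` is a hypothetical helper, the schema is shown as a plain dict rather than loaded from YAML, and it assumes that `string` from linkml:types is an acceptable substitute for Literal.

```python
import copy


def replace_literal_ranges(schema, replacement="string"):
    """Hypothetical helper: return a copy of the schema with `range: Literal` replaced."""
    fixed = copy.deepcopy(schema)
    for slot in fixed.get("slots", {}).values():
        if slot.get("range") == "Literal":
            # Assumption: a plain string is close enough to rdfs:Literal here.
            slot["range"] = replacement
    return fixed


# The jabberID slot from the generated FOAF schema, as a plain dict.
schema = {
    "slots": {
        "jabberID": {
            "comments": ["A jabber ID for something."],
            "slot_uri": "foaf:jabberID",
            "range": "Literal",
        }
    }
}

fixed = replace_literal_ranges(schema)
print(fixed["slots"]["jabberID"]["range"])
```

The helper copies the schema instead of mutating it, so the original import result stays available for comparison.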
Hmm, I can't replicate this Literal issue. If I run schemauto import-rdfs on the following TTL:
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
foaf:knows rdf:type owl:ObjectProperty ;
rdfs:domain foaf:Person ;
rdfs:range foaf:Person ;
rdfs:comment "A person known by this person (indicating some level of reciprocated interaction between the parties)." ;
rdfs:isDefinedBy <http://xmlns.com/foaf/0.1/> ;
rdfs:label "knows" .
foaf:Person rdf:type owl:Class ;
rdfs:subClassOf <http://www.w3.org/2003/01/geo/wgs84_pos#SpatialThing> ,
foaf:Agent ;
owl:disjointWith foaf:Project ;
rdfs:comment "A person." ;
rdfs:isDefinedBy <http://xmlns.com/foaf/0.1/> ;
rdfs:label "Person" .
I get:
name: example
id: http://example.org/example
imports:
- linkml:types
prefixes:
  linkml: https://w3id.org/linkml/
  foaf: http://xmlns.com/foaf/0.1/
default_prefix: example
default_range: string
slots:
  knows:
    comments:
    - A person known by this person (indicating some level of reciprocated interaction
      between the parties).
    slot_uri: foaf:knows
    range: Person
classes:
  Agent:
    class_uri: foaf:Agent
  SpatialThing:
    class_uri: ns1:SpatialThing
  Person:
    comments:
    - A person.
    is_a: Agent
    slots:
    - knows
    class_uri: foaf:Person
You're right that the default prefix is messed up, and I think I need some input from the maintainers on what to do about that, but to be honest you should always pass in a name and model_uri. The schema won't make much sense otherwise. So for foaf you would do something like:
poetry run schemauto import-rdfs --format xml http://xmlns.com/foaf/spec/index.rdf --schema-name foaf --model-uri http://xmlns.com/foaf/0.1/
@multimeric @jo-fra - agree the default 'example' is confusing and we may just want to get rid of that in the automated step. But agree with @multimeric that passing in an actual value here is a great standard practice. We tend to think of schema-automator as a bootstrapping tool, that users will interact with to get them most of the way towards a working schema, but that they will have to edit to add finishing touches. Schema-automator is getting so much better with these fixes; thank you!
Okay I've just pushed a new change. Firstly, it removes the custom example default in the RDFS importer in favour of letting the SchemaBuilder handle it. Secondly, it tries to infer the schema metadata from RDF. Basically if the name is not provided explicitly, the most common prefix it finds becomes the name. If the id is not explicitly provided, then the corresponding URI becomes the ID. So for FOAF it would determine that foaf is used a ton in the document and therefore schema.name = "foaf" and schema.id = http://xmlns.com/foaf/0.1/.
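The inference described above can be sketched roughly as follows. This is an illustration of the idea, not the code that was pushed: `infer_schema_metadata` is a hypothetical helper, the URIs are a hand-picked sample rather than a parsed document, and counting by namespace prefix match is an assumption about how "most common prefix" is determined.

```python
from collections import Counter


def infer_schema_metadata(uris, prefixes, name=None, schema_id=None):
    """Hypothetical helper: fall back to the most common prefix for name and id."""
    counts = Counter()
    for uri in uris:
        for prefix, namespace in prefixes.items():
            if uri.startswith(namespace):
                counts[prefix] += 1
                break
    if counts:
        top_prefix = counts.most_common(1)[0][0]
        if name is None:
            # No explicit name: use the most common prefix.
            name = top_prefix
        if schema_id is None:
            # No explicit id: use the namespace URI of that prefix.
            schema_id = prefixes[top_prefix]
    return name, schema_id


prefixes = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}
uris = [
    "http://xmlns.com/foaf/0.1/knows",
    "http://xmlns.com/foaf/0.1/Person",
    "http://xmlns.com/foaf/0.1/Agent",
    "http://www.w3.org/2000/01/rdf-schema#label",
    "http://www.w3.org/2002/07/owl#Class",
]

name, schema_id = infer_schema_metadata(uris, prefixes)
print(name, schema_id)
```

Explicitly passed values always win in this sketch, which matches the advice earlier in the thread to supply a name and model URI whenever possible.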