robot
Catalog not used for --input-iri
The catalog file does not seem to be used for --input-iri (tested with the query command). From the documentation I cannot tell whether this is a bug or intended behaviour. For my use case, though, I would like the provided catalog to apply to every IRI, i.e. not only in owl:imports statements but also in --input-iri parameters.
Example:
test.owl:
@prefix owl: <http://www.w3.org/2002/07/owl#> .
<http://test.org/test.owl> a owl:Ontology ;
owl:imports <http://test.org/test-imported.owl> .
test-imported.owl:
@prefix owl: <http://www.w3.org/2002/07/owl#> .
<http://test.org/test-imported.owl> a owl:Ontology .
test.rq:
ASK {}
catalog-custom-name.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<catalog prefer="public" xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
<uri name="http://test.org/test-imported.owl" uri="test-imported.owl"/>
</catalog>
The following commands succeed:
robot query --catalog catalog-custom-name.xml --input test.owl --query test.rq output.csv
robot query --input test-imported.owl --query test.rq output.csv
while the following fails
robot query --catalog catalog-custom-name.xml --input-iri http://test.org/test-imported.owl --query test.rq output.csv
with org.semanticweb.owlapi.io.OWLOntologyCreationIOException: test.org
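To make the expectation concrete, here is a minimal Python sketch of what the catalog above declares: a mapping from a logical IRI to a local file. It is not ROBOT or OWL API code. The function name `read_catalog` and the base directory `/work` are hypothetical, chosen only for illustration, and relative `uri` values are assumed to resolve against the directory that holds the catalog.

```python
import posixpath
import xml.etree.ElementTree as ET

# Namespace used by OASIS XML catalogs, as in catalog-custom-name.xml above.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

CATALOG_XML = """<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<catalog prefer="public" xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <uri name="http://test.org/test-imported.owl" uri="test-imported.owl"/>
</catalog>
"""


def read_catalog(xml_text, base_dir):
    """Return {logical IRI: target location} for every <uri> entry.

    Assumption: relative targets are resolved against base_dir, the
    directory containing the catalog file. Absolute URLs are kept as-is.
    """
    root = ET.fromstring(xml_text.encode("utf-8"))
    mappings = {}
    for entry in root.iter("{%s}uri" % CATALOG_NS):
        name = entry.get("name")
        target = entry.get("uri")
        if "://" not in target:
            target = posixpath.join(base_dir, target)
        mappings[name] = target
    return mappings


if __name__ == "__main__":
    # "/work" is a hypothetical directory holding the catalog.
    for iri, location in read_catalog(CATALOG_XML, "/work").items():
        print(iri, "->", location)
```

Under these assumptions, the IRI passed to --input-iri in the failing command is one the catalog already maps to a local file.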
I think this is by design: you are assumed to know the paths or URLs of your inputs. I can see the use case for using logical names instead, but technically, introducing this could be a breaking change?
(Sent by email in reply to Petr Křemen's report of Tue, Jul 19, 2022, 11:59 PM.)
Yes, this is intended behaviour. Checking for a local catalogue file is cheap and fast, and matches Protege's behaviour. So ROBOT makes a guess and checks for local catalogs without asking. Making a network request based on a guess could take seconds and cause other unexpected behaviour.
@psiotwo are you proposing that when using --input-iri http://example.com/foo.owl we should interpret --catalog cat.xml as relative, so http://example.com/cat.xml? That seems slightly too "magical" to me. I doubt that anybody is using a remote input with a local catalogue, but changing this behaviour would break that.
Maybe we could add an explicit --catalog-iri http://example.com/cat.xml option.
@jamesaoverton I think you're responding to something different. To me the request is totally sensible. An IRI input specified via --input-iri is currently directly requested, but IRIs in its imports are first mapped to URLs using the provided catalog, then requested. The IRI given to --input-iri should first be mapped to a URL via the catalog before it's requested. (I didn't know this was the case, just believing the report so far).
I think it's just a bug. The catalog is added to the manager before loading, but then an IRIDocumentSource is created, which I guess forces it to treat the provided IRI as a physical location:
https://github.com/ontodev/robot/blob/c7071a76178967639af30d2f56d5ca05644230b7/robot-core/src/main/java/org/obolibrary/robot/IOHelper.java#L513
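The difference between the two behaviours discussed here can be modelled in a few lines of Python. This is a simplified sketch, not the ROBOT or OWL API implementation. The function `resolve_document_location` and its `use_catalog` switch are hypothetical names, and the catalog is reduced to a plain dictionary.

```python
def resolve_document_location(input_iri, catalog, use_catalog=True):
    """Return the location that would be requested for an input IRI.

    use_catalog=False models the reported behaviour: the IRI is treated
    as a physical location and requested directly.
    use_catalog=True models the requested behaviour: the IRI is first
    mapped through the catalog, and is used directly only if no entry
    matches.
    """
    if use_catalog and input_iri in catalog:
        return catalog[input_iri]
    return input_iri


# Mapping taken from catalog-custom-name.xml in the report.
catalog = {"http://test.org/test-imported.owl": "test-imported.owl"}
iri = "http://test.org/test-imported.owl"

direct = resolve_document_location(iri, catalog, use_catalog=False)
mapped = resolve_document_location(iri, catalog)

print(direct)  # the remote IRI itself, which is not reachable in the report
print(mapped)  # the local file named in the catalog
```

In this model an IRI with no catalog entry resolves the same way in both modes, so the change only affects inputs that the catalog explicitly maps.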
> I think this is by design, you are assumed to know the paths or urls of your inputs. I can see the use case for using logical names instead but I think technically introducing this could be breaking behavior if introduced?
@cmungall If the current behaviour is not a bug :-) then probably yes: with my proposal, a dereferenceable --input-iri could be redirected to a different URL by the mapping in the local catalog. (Yet using the same IRI for different resources is bad practice anyway.)
In case the current behaviour is correct, at least the parameter name --input-iri is misleading (should be --input-url).
> Yes, this is intended behaviour.
@jamesaoverton my point was different (not remote catalogs, but using local catalogs for resolving physical URLs of --input-iri parameters), as @balhoff points out.
Checked. Works in 1.9.1.