Catalog not used for --input-iri

Open psiotwo opened this issue 3 years ago • 5 comments

Catalog file does not seem to be used for --input-iri (tested on query command). Based on the documentation I am not completely sure if it is a bug, or intended behaviour. Yet, for my use-case I would like to use the provided catalog for any IRI around (i.e. not only in owl:imports statements, but also in --input-iri parameters).


Example:

test.owl:

@prefix owl: <http://www.w3.org/2002/07/owl#> .
<http://test.org/test.owl> a owl:Ontology ;
    owl:imports <http://test.org/test-imported.owl> .

test-imported.owl:

@prefix owl: <http://www.w3.org/2002/07/owl#> .
<http://test.org/test-imported.owl> a owl:Ontology .

test.rq:

ASK {}

catalog-custom-name.xml:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<catalog prefer="public" xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
    <uri name="http://test.org/test-imported.owl" uri="test-imported.owl"/>
</catalog>

The following commands succeed:

robot query --catalog catalog-custom-name.xml --input test.owl --query test.rq output.csv
robot query --input test-imported.owl --query test.rq output.csv

while the following fails:

robot query --catalog catalog-custom-name.xml --input-iri http://test.org/test-imported.owl --query test.rq output.csv

with org.semanticweb.owlapi.io.OWLOntologyCreationIOException: test.org

psiotwo avatar Jul 20 '22 06:07 psiotwo

I think this is by design: you are assumed to know the paths or URLs of your inputs. I can see the use case for using logical names instead, but I think introducing this could technically be a breaking change?

cmungall avatar Jul 20 '22 14:07 cmungall

Yes, this is intended behaviour. Checking for a local catalogue file is cheap and fast, and matches Protege's behaviour. So ROBOT makes a guess and checks for local catalogs without asking. Making a network request based on a guess could take seconds and cause other unexpected behaviour.

@psiotwo are you proposing that when using --input-iri http://example.com/foo.owl we should interpret --catalog cat.xml as relative, so http://example.com/cat.xml? That seems slightly too "magical" to me. I doubt that anybody is using a remote input with a local catalogue, but changing this behaviour would break that.

Maybe we could add an explicit --catalog-iri http://example.com/cat.xml option.

jamesaoverton avatar Jul 20 '22 15:07 jamesaoverton

@jamesaoverton I think you're responding to something different. To me the request is totally sensible. An IRI input specified via --input-iri is currently directly requested, but IRIs in its imports are first mapped to URLs using the provided catalog, then requested. The IRI given to --input-iri should first be mapped to a URL via the catalog before it's requested. (I didn't know this was the case, just believing the report so far).

I think it's just a bug. The catalog is added to the manager before loading, but then an IRIDocumentSource is created, which I guess forces it to treat the provided IRI as a physical location:

https://github.com/ontodev/robot/blob/c7071a76178967639af30d2f56d5ca05644230b7/robot-core/src/main/java/org/obolibrary/robot/IOHelper.java#L513
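To make the requested behaviour concrete, here is a minimal Python sketch of "map the IRI through the catalog first, then request it". It is not ROBOT's or the OWL API's code. The function names `read_uri_mappings` and `resolve_input_iri` are hypothetical, and the sketch assumes the catalog contains only plain `<uri>` entries like the one in the report. It does not resolve relative paths against the catalog's directory and does not fetch anything.

```python
import xml.etree.ElementTree as ET

# Namespace used by the OASIS XML catalog in the report.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Catalog content copied from the example in this issue.
CATALOG = """<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<catalog prefer="public" xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
    <uri name="http://test.org/test-imported.owl" uri="test-imported.owl"/>
</catalog>
"""


def read_uri_mappings(catalog_xml):
    """Return {logical IRI: mapped location} for every <uri> entry.

    Hypothetical helper: other catalog entry types are ignored.
    """
    root = ET.fromstring(catalog_xml)
    return {
        entry.attrib["name"]: entry.attrib["uri"]
        for entry in root.iter("{%s}uri" % CATALOG_NS)
    }


def resolve_input_iri(iri, mappings):
    """Look the IRI up in the catalog; fall back to the IRI itself.

    Hypothetical helper: the fallback is the "treat the IRI as a
    physical location" behaviour that the report describes.
    """
    return mappings.get(iri, iri)


mappings = read_uri_mappings(CATALOG)

# An IRI listed in the catalog is redirected to the local file.
print(resolve_input_iri("http://test.org/test-imported.owl", mappings))

# An IRI that is not listed is left unchanged and would be requested directly.
print(resolve_input_iri("http://example.com/foo.owl", mappings))
```

With this lookup in front of the request, the failing command in the report would read the local test-imported.owl instead of trying to contact test.org. Inputs with no catalog entry would behave as they do today.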

balhoff avatar Jul 20 '22 17:07 balhoff

I think this is by design: you are assumed to know the paths or URLs of your inputs. I can see the use case for using logical names instead, but I think introducing this could technically be a breaking change?

@cmungall If the current behaviour is not a bug :-) then probably yes: with my proposal, a dereferenceable --input-iri could be redirected to another URL by the mapping in the local catalog. (Yet using the same IRI for different resources is bad practice anyway.)

In case the current behaviour is correct, at least the parameter name --input-iri is misleading (should be --input-url).

psiotwo avatar Jul 21 '22 08:07 psiotwo

Yes, this is intended behaviour.

@jamesaoverton my point was different (not remote catalogs, but using local catalogs for resolving physical URLs of --input-iri parameters), as @balhoff points out.

psiotwo avatar Jul 21 '22 08:07 psiotwo

Checked. Works in 1.9.1.

psiotwo avatar Oct 31 '22 17:10 psiotwo