lightrdf.Error: error while parsing IRI '': No scheme found in an absolute IRI

Open vadyushkins opened this issue 3 years ago • 2 comments

Hi @ozekik!

Thank you for the awesome library! :clap:

Unfortunately, while using your library, I got the error :bug: mentioned in the title. :disappointed: However, I did not get a similar error when using rdflib. :thinking:

Environment

  • OS: Ubuntu 20.04
  • Python: 3.8.5
  • LightRDF: 0.2.1

Steps to reproduce

  1. Download the pathways archive.
wget -q https://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/pathways.rdf.xz
  2. Decompress it with unxz from the xz-utils package.
sudo apt install xz-utils
unxz pathways.rdf.xz
  3. Run count_triples_lightrdf_parser.py.
python3 count_triples_lightrdf_parser.py pathways.rdf
  4. Error log.
Traceback (most recent call last):
  File "count_triples_lightrdf_parser.py", line 8, in <module>
    for triple in parser.parse(sys.argv[1]):
lightrdf.Error: error while parsing IRI '': No scheme found in an absolute IRI

Please tell me where I am wrong. Thank you :pray:

vadyushkins avatar May 31 '21 18:05 vadyushkins

I'm sorry for the late reply. Thank you for the very clear report!

A quick solution is to pass an absolute IRI as the base_iri argument of parse() (for example, https://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/pathways.rdf.xz):

import lightrdf
import sys

parser = lightrdf.Parser()

cnt = 0

for triple in parser.parse(sys.argv[1], base_iri="https://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/pathways.rdf.xz"):
    cnt += 1

print(cnt)

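More specifically, the problem is <owl:Ontology rdf:about=""> in pathways.rdf. An empty rdf:about is a relative IRI reference that resolves to the base IRI, so it means "the URI of the document containing the ontology" (as stated in the OWL 1/2 specs and in general). A downloaded local file, however, has no definitive URI/IRI, so without an explicit base there is nothing to resolve against.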
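
The resolution step can be illustrated with the standard library alone. The sketch below uses urllib.parse.urljoin as a stand-in for a parser's IRI resolution. This is an assumption for illustration only: lightrdf's own resolver is not shown in this thread, and it is stricter, raising the error from the title instead of returning an empty string. The resolve helper is a hypothetical name.

```python
from urllib.parse import urljoin

# Assumption: urljoin only approximates how an RDF/XML parser resolves
# rdf:about against the base IRI; it is not lightrdf's resolver.
BASE = "https://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/pathways.rdf.xz"


def resolve(reference, base):
    """Resolve a (possibly relative) IRI reference against a base IRI."""
    return urljoin(base, reference)


with_base = resolve("", BASE)   # empty reference -> the base IRI itself
without_base = resolve("", "")  # no base -> nothing to resolve against

print(repr(with_base))
print(repr(without_base))
```

With a base, the empty reference becomes the base IRI. Without one, the result stays empty and has no scheme, which is the condition the error message describes.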

RDFLib avoids this problem by using the local IRI of the file (file:///.../pathways.rdf) as the base IRI by default. We may make lightrdf do the same thing, but before that I'd like to investigate whether it is reasonable.
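
In the meantime, a caller can emulate that default on their own side. The following is a minimal sketch, assuming the file:// IRI of the local path is an acceptable base for the document. file_base_iri and count_triples are hypothetical helper names; only Parser() and parse(..., base_iri=...) are taken from the example above.

```python
from pathlib import Path


def file_base_iri(path):
    """Return the absolute file:// IRI of a local path (hypothetical helper)."""
    return Path(path).resolve().as_uri()


def count_triples(path):
    """Count triples, using the file's own IRI as the base (hypothetical helper)."""
    import lightrdf  # imported lazily so file_base_iri works without it

    parser = lightrdf.Parser()
    return sum(1 for _ in parser.parse(path, base_iri=file_base_iri(path)))


# Show the base IRI that would be passed for a file in the current directory.
print(file_base_iri("pathways.rdf"))
```

Calling count_triples("pathways.rdf") would then behave like the earlier script, except that the base IRI depends on where the file is stored rather than on the download URL.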

ozekik avatar Aug 10 '21 17:08 ozekik

Thanks for your reply @ozekik!

I think the best solution would be to put your example in the README, as the base IRI default setting might be more confusing in other situations.

vadyushkins avatar Aug 16 '21 09:08 vadyushkins