Wikidata-Toolkit
Illegal characters in URLs in NTriples export
The NTriples export contains URLs that use the character "|", which is not allowed in NTriples. For example, the export contains the following triple:
<http://www.wikidata.org/entity/R4a809378d29bcd83ef35faaed4415cfa> <http://www.wikidata.org/entity/P854r> <http://viaf.org/processed/BNC|a10176597> .
This triple is based on the reference given for the "Gran Enciclopèdia Catalana ID" of Virgil.
We need to investigate why OpenRDF fails to encode this properly and what can be done to avoid it. There might also be further issues: a user suggested that "{" occurs somewhere in a URL as well. If OpenRDF does not work as expected here, it might be easier to implement the NTriples grammar directly instead of using OpenRDF in this case. Another possibility is that we are using OpenRDF incorrectly.
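As a possible workaround, here is a minimal sketch of escaping the offending characters ourselves before the URL is written. It does not use OpenRDF at all. The class and method names (`NTriplesIriEscaper`, `escapeIri`) are hypothetical and not part of Wikidata-Toolkit. The sketch assumes that percent-encoding is acceptable for these URLs, which is an open question, since the encoded URL is strictly speaking a different string from the one stored in Wikidata. The set of forbidden characters is taken from the IRIREF production of the N-Triples grammar (everything up to U+0020, plus `<`, `>`, `"`, `{`, `}`, `|`, `^`, `` ` `` and `\`).

```java
public class NTriplesIriEscaper {

    // Characters that the N-Triples IRIREF production excludes,
    // in addition to the range U+0000..U+0020 handled below.
    private static final String FORBIDDEN = "<>\"{}|^`\\";

    /**
     * Hypothetical helper: percent-encodes every character that may not
     * appear between '<' and '>' in an N-Triples IRI. All other characters,
     * including non-ASCII ones, are copied unchanged.
     */
    public static String escapeIri(String iri) {
        StringBuilder sb = new StringBuilder(iri.length() + 8);
        for (int i = 0; i < iri.length(); ) {
            int cp = iri.codePointAt(i);
            i += Character.charCount(cp);
            if (cp <= 0x20 || FORBIDDEN.indexOf(cp) >= 0) {
                // Every forbidden code point is ASCII, so one %XX triplet suffices.
                sb.append('%').append(String.format("%02X", cp));
            } else {
                sb.appendCodePoint(cp);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The URL from the triple reported above.
        String url = "http://viaf.org/processed/BNC|a10176597";
        System.out.println("<" + escapeIri(url) + ">");
    }
}
```

For the URL above this turns "|" into "%7C". Whether such a step belongs before the call into OpenRDF or in a custom NTriples writer depends on the outcome of the investigation.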
@w013ccw Is this the issue with character encoding you were referring to? Do you know about additional encoding issues?