Literal with ^@ results in error

Open pheyvaer opened this issue 4 years ago • 4 comments

Hi,

When I try to convert the following Turtle file I get an error:

@prefix : <https://example.org/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

:1 a :Project;
  rdfs:label "Escala^@"@es;
  :startYear "2015"^^xsd:gYear .

The error that I receive is

Input format not given. Guessing from file extension...
Detected RDF input format: ttl
Catch exception load: ERROR: Could not convert triple to IDS! 
https://example.org/1 http://www.w3.org/2000/01/rdf-schema#label "Escala
ERROR: ERROR: Could not convert triple to IDS! 
https://example.org/1 http://www.w3.org/2000/01/rdf-schema#label "Escala

I used the latest version of the develop branch and executed

./rdf2hdt /tmp/input.ttl /tmp/test.hdt -v

Interestingly, the error does not appear when I remove the year from the data.

pheyvaer avatar Dec 20 '19 09:12 pheyvaer

Can you try with Serd? What's your Serd version?

mielvds avatar Dec 20 '19 09:12 mielvds

My Serd version is 0.30.2.

When executing the following

serdi input.ttl

I get

<https://example.org/1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://example.org/Project> .
<https://example.org/1> <http://www.w3.org/2000/01/rdf-schema#label> "Escala\u0000"@es .
<https://example.org/1> <https://example.org/startYear> "2015"^^<http://www.w3.org/2001/XMLSchema#gYear> .

So Serd converts the ^@ to \u0000. It looks like something goes wrong here, no?

pheyvaer avatar Dec 20 '19 09:12 pheyvaer

It seems the default parser in HDT is somehow activated, and it does not take this case into account. However, I don't understand why Serd is not used here even though it could be; or maybe this parsing happens afterwards.

mielvds avatar Dec 20 '19 10:12 mielvds

At least, things go wrong here: https://github.com/rdfhdt/hdt-cpp/blob/develop/libhdt/src/dictionary/PlainDictionary.cpp#L108 and here https://github.com/rdfhdt/hdt-cpp/blob/develop/libhdt/include/SingleTriple.hpp#L276

The object is not inserted successfully in the dictionary, but the insertion fails silently. Afterwards, when the incomplete triple is encountered, the error is thrown.

mielvds avatar Dec 20 '19 10:12 mielvds
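
The error output above stops at `"Escala`, which is where the U+0000 character sits in the literal. One plausible mechanism (an assumption, not confirmed in this thread) is that the term passes through a NUL-terminated C string somewhere, so the dictionary stores a truncated key and a later lookup with the full term finds no ID. The sketch below reproduces that effect in isolation. `ToyDictionary`, `insertCStr`, `lookup`, `makeLiteral` and `demonstrateTruncation` are hypothetical names for illustration and are not part of the hdt-cpp API.

```cpp
#include <cassert>
#include <cstring>
#include <map>
#include <string>

// Hypothetical stand-in for a term dictionary; NOT the hdt-cpp implementation.
struct ToyDictionary {
    std::map<std::string, unsigned> ids;

    // Inserts via a C string, so everything after an embedded NUL is dropped.
    void insertCStr(const char* term) {
        ids.emplace(std::string(term), static_cast<unsigned>(ids.size() + 1));
    }

    // Returns 0 when the term is unknown, mimicking a "no ID" result.
    unsigned lookup(const std::string& term) const {
        auto it = ids.find(term);
        return it == ids.end() ? 0 : it->second;
    }
};

// Builds the literal from the issue: "Escala, U+0000, closing quote, @es.
inline std::string makeLiteral() {
    using namespace std::string_literals;
    return "\"Escala\0\"@es"s;  // the s suffix keeps the embedded NUL
}

// Small self-check; call it from main() to run the demonstration.
inline void demonstrateTruncation() {
    const std::string literal = makeLiteral();
    assert(literal.size() == 12);                  // full term, NUL included
    assert(std::strlen(literal.c_str()) == 7);     // C view ends after "Escala

    ToyDictionary dict;
    dict.insertCStr(literal.c_str());              // stores only the prefix
    assert(dict.lookup(literal) == 0);             // full term has no ID
    assert(dict.lookup("\"Escala") == 1);          // truncated prefix does
}
```

If this is what happens, carrying the term as a `std::string` together with its length (rather than a `const char*`) would keep the key intact. Whether the NUL is lost in the parser callback or in the dictionary would still need to be checked in the two locations linked above.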