eventkg
Dataset .nq files are not valid n-quads
Hello! According to this specification of n-quads, the character @ can be encountered only in literals, i.e. directives (lines starting with @) are not allowed in .nq files. This makes it impossible to import EventKG into triplestores with strict parsers, such as Apache Jena and neo4j (with plugin). The only triplestore I managed to import it into is OpenLink Virtuoso (the same one you use to serve the endpoint, I suppose), but it lacks the features I need, and vendor lock-in to one implementation is generally not good. Is there any way to solve this issue? I haven't run the pipeline manually, but if you can tell me whether this is solvable and how I can fix it, I'd be happy to submit a pull request.
Also, I believe that whenever typed literals are used, the xsd prefix must be defined too. It is defined in schema.ttl, but not in void.ttl, for example.
Hi! Thanks a lot for your advice. I did my own test, loaded the files with Apache Jena, and can confirm both of your problems. You were right in your assumption that I am using Virtuoso as the triplestore, which seems to be much more tolerant.
I will update the code to create valid NQ files. Are you planning to run the code yourself or are you just interested in the valid data?
Hi, thanks a lot for the reply. I am currently interested in the valid data only, but I might run the code in my future work.
Alright. I am currently preparing a new version of EventKG (2.1), with current data and some minor corrections and extensions. I will also provide the valid .nq files then. However, this will take some time (one week, maybe?). I'll let you know when it's done.
If you need a fast solution, you could instead just transform the existing .nq files with simple string operations (replacement of the prefix namespaces with the actual URLs).
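A minimal sketch of such a string transformation in Python might look like the following. It assumes the prefixes are declared on their own lines in Turtle style (`@prefix name: <iri> .`) and that prefixed names appear only outside IRIs and string literals; the function name `expand_prefixes` and the `ex:` prefix in the usage note are made up for illustration, and the sketch has not been run against the real EventKG dumps.

```python
import re

# A Turtle-style prefix declaration on a line of its own (assumed layout).
PREFIX_LINE = re.compile(
    r'^\s*@prefix\s+(?P<prefix>[A-Za-z][\w\-]*)?:\s*<(?P<iri>[^>]*)>\s*\.\s*$'
)

# Full IRIs and string literals are matched first so they are left untouched;
# only what remains is considered as a candidate prefixed name.
TOKEN = re.compile(
    r'(?P<iri><[^>]*>)'
    r'|(?P<lit>"(?:[^"\\]|\\.)*")'
    r'|(?<!\w)(?P<pname>(?P<prefix>[A-Za-z][\w\-]*)?:'
    r'(?P<local>(?:[^\s<>"]*[^\s<>".])?))'
)


def _expand(match, prefixes):
    """Replace one prefixed name with a full IRI if its prefix is known."""
    if match.group('pname') is None:
        return match.group(0)  # full IRI or literal: keep as-is
    base = prefixes.get(match.group('prefix') or '')
    if base is None:
        return match.group(0)  # unknown prefix (or blank node): keep as-is
    return '<' + base + match.group('local') + '>'


def expand_prefixes(lines):
    """Yield statement lines with prefixed names expanded.

    Prefix declarations are collected as they are encountered and dropped
    from the output, since N-Quads does not allow directives.
    """
    prefixes = {}
    for line in lines:
        decl = PREFIX_LINE.match(line)
        if decl:
            prefixes[decl.group('prefix') or ''] = decl.group('iri')
            continue
        yield TOKEN.sub(lambda m: _expand(m, prefixes), line)
```

Because the function works line by line, it can be fed an open file and written straight to a new one, so large dumps do not have to fit in memory. For example, with `@prefix ex: <http://example.org/> .` declared, the line `ex:e1 ex:label "a: b"@en ex:g .` becomes `<http://example.org/e1> <http://example.org/label> "a: b"@en <http://example.org/g> .`. It is worth validating the result with a strict parser such as Apache Jena afterwards.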