Server.Java Blank node skolemization

HDT files that contain blank nodes produce invalid Turtle output. Can we create a skolemized Model somehow?

Dec 02 '15 09:12 mielvds

On the one hand, this seems to be a problem with the HDT library not correctly converting blank nodes to the corresponding Jena representation. On the other hand, the TPF spec clearly says that components must not be blank nodes, so we should indeed skolemize them in any case, like the JavaScript implementation does.

Dec 02 '15 09:12 RubenVerborgh

This might help, although reprocessing all nodes doesn't seem very performant.

Dec 02 '15 09:12 mielvds

We should be able to do the same as in the JavaScript code by just changing this function (probably on the base class level even).

Dec 02 '15 09:12 RubenVerborgh

Right, that would be fairly easy, but it would be datasource-specific. Another option would be to create a decorator for https://jena.apache.org/documentation/javadoc/arq/org/apache/jena/riot/WriterGraphRIOT.html

Dec 02 '15 09:12 mielvds

Well, in the JavaScript version, it's implemented on the base class, so not source-specific. I still think this is possible. The additional complexity here is that dictionary.getNode seems to have a bug so that it does not return blank nodes but IRIs that start with _:. The generic solution would be to work around that, and then everything works with a generic base method (or RIOT decorator). But then we have some performance loss, because it would first (incorrectly) convert to IRI, then to blank, then to IRI again. So it might be best to have a one-off solution here for performance.
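A minimal sketch of the generic base-method idea, assuming terms arrive as plain strings and that blank nodes show up as labels starting with `_:` (as described above for `dictionary.getNode`). The class `BlankNodeSkolemizer`, its constructor argument, and the `/.well-known/genid/` path (the form RDF 1.1 suggests for Skolem IRIs) are illustrative assumptions, not existing Server.Java or HDT code:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Server.Java or the HDT library.
final class BlankNodeSkolemizer {
    // Assumed Skolem IRI prefix, built from the server's base URI.
    private final String skolemPrefix;

    BlankNodeSkolemizer(String baseUri) {
        String base = baseUri.endsWith("/") ? baseUri : baseUri + "/";
        this.skolemPrefix = base + ".well-known/genid/";
    }

    /**
     * Returns a Skolem IRI for terms that start with "_:" and
     * returns every other term (IRI, literal, null) unchanged.
     */
    String skolemize(String term) {
        if (term == null || !term.startsWith("_:")) {
            return term;
        }
        // Strip the "_:" marker and percent-encode the label so it is safe in an IRI.
        String label = term.substring(2);
        return skolemPrefix + URLEncoder.encode(label, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        BlankNodeSkolemizer skolemizer = new BlankNodeSkolemizer("http://example.org");
        System.out.println(skolemizer.skolemize("_:b12"));
        System.out.println(skolemizer.skolemize("http://example.org/alice"));
    }
}
```

Working on the string label directly avoids the IRI-to-blank-to-IRI round trip mentioned above, but it only holds if every `_:`-prefixed term really is a blank node.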

Dec 02 '15 09:12 RubenVerborgh

So we'll have to improve the Java HDT code no matter what, which gives us the opportunity to move to Jena 3.

Dec 02 '15 09:12 mielvds

Hi all, I just stumbled over this and it still seems to be an issue. I'm working on some other things in Server.java (including support for quad formats), so if you can give me any hints on how to fix this issue, I can give it a try. On a related note: the TPF specification says that bNodes SHOULD be skolemized, not that it is mandatory. Does anyone here know whether e.g. Comunica requires bNodes to be skolemized? Also, for TPF to work with bNodes, a TPF server MUST keep bNode identifiers consistent across consecutive requests; I don't think that's explicit in the spec. Thanks, Lars
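To illustrate the consistency requirement, here is a rough sketch of a skolemization that is a pure function of the blank node label, so the same label yields the same IRI on every request, and that can be reversed when a client sends the Skolem IRI back in a triple pattern. The class `SkolemRoundTrip`, the `PREFIX` value, and the assumption that the datasource's own labels are stable between requests are all hypothetical:

```java
import java.util.Optional;

// Hypothetical helper for illustration; names and prefix are assumptions.
final class SkolemRoundTrip {
    // Assumed fixed prefix; a real server would derive it from its own base URI.
    static final String PREFIX = "http://example.org/.well-known/genid/";

    private SkolemRoundTrip() { }

    /** Deterministic: no counters or per-request state, only the label. */
    static String toSkolemIri(String blankLabel) {
        return PREFIX + blankLabel;
    }

    /** Maps a Skolem IRI from a request back to the original blank node label, if it is one. */
    static Optional<String> toBlankLabel(String iri) {
        if (iri != null && iri.startsWith(PREFIX) && iri.length() > PREFIX.length()) {
            return Optional.of(iri.substring(PREFIX.length()));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String first = toSkolemIri("b7");
        String second = toSkolemIri("b7");
        // Same label, same IRI, regardless of when it is requested.
        System.out.println(first.equals(second));
        // A follow-up request using the IRI resolves to the same blank node label.
        System.out.println(toBlankLabel(first).orElse("not a Skolem IRI"));
    }
}
```

This only gives stable identifiers if the underlying datasource hands out the same label for the same blank node each time, which is an assumption about the HDT dictionary rather than something confirmed here.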

Jan 10 '19 17:01 larsgsvensson

As per https://github.com/comunica/comunica/issues/375 the spec now says that data triples MUST NOT contain blank nodes and that the RECOMMENDED way of removing them is skolemization.

Jan 18 '19 08:01 larsgsvensson

Is there such a thing as a conformance test suite for TPF servers?

Jan 18 '19 09:01 larsgsvensson

Not yet, unfortunately, but that would indeed be very nice to have.

Jan 18 '19 09:01 RubenVerborgh

Sounds like fun. You can assign it to me; I'll try it as a Friday afternoon thing.

Jan 18 '19 10:01 mielvds