rdf4j
RDF lists created using RDFCollections.asRDF() are not properly serialized using TurtleWriter
Current Behavior
When an RDF list created using RDFCollections.asRDF() is serialized with TurtleWriter, the first segment of the list is not inlined, because it is not recognized as a well-formed list.
The result looks like this:
@prefix ex: <http://example.com/ns#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
ex:Cities ex:list [ a rdf:List;
rdf:first ex:NewYork;
rdf:rest (ex:Rio ex:Tokyo)
] .
Please notice the first part, with explicit rdf:first and rdf:rest statements instead of the inlined list syntax that is used for the last two segments of the list.
This is caused by org.eclipse.rdf4j.rio.turtle.TurtleWriter.isWellFormedCollection() not recognizing the rdf:type rdf:List statement on the first list node, which is generated in RDFCollections.asRDF() (or, more specifically, in org.eclipse.rdf4j.model.util.RDFCollections.consumeCollection(Iterable<?>, Resource, Consumer<Statement>, ValueFactory, Resource...)).
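To make the cause concrete, here is a minimal, library-free sketch of the statements that end up in the model. It mirrors the statement order of consumeCollection as quoted further down in this thread, but it is not RDF4J code: the Triple class, the asRdfList method and the generated blank node labels (_:b1, _:b2) are hypothetical stand-ins for illustration only.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class ListStatementsSketch {

    /** Hypothetical stand-in for an RDF statement; not an RDF4J type. */
    static final class Triple {
        final String s, p, o;

        Triple(String s, String p, String o) {
            this.s = s;
            this.p = p;
            this.o = o;
        }

        @Override
        public String toString() {
            return s + " " + p + " " + o;
        }
    }

    /** Emits list statements in the same order as the consumeCollection code quoted in this thread. */
    static List<Triple> asRdfList(List<String> values, String head) {
        List<Triple> out = new ArrayList<>();
        String current = head;
        // Only the head node receives the type statement; later nodes do not.
        out.add(new Triple(current, "rdf:type", "rdf:List"));
        Iterator<String> iter = values.iterator();
        int counter = 0;
        while (iter.hasNext()) {
            out.add(new Triple(current, "rdf:first", iter.next()));
            if (iter.hasNext()) {
                String next = "_:b" + (++counter); // assumed label scheme, illustration only
                out.add(new Triple(current, "rdf:rest", next));
                current = next;
            } else {
                out.add(new Triple(current, "rdf:rest", "rdf:nil"));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        asRdfList(List.of("ex:NewYork", "ex:Rio", "ex:Tokyo"), "_:n1").forEach(System.out::println);
    }
}
```

The head node is the only one that carries a statement besides rdf:first and rdf:rest, which matches the observation that only the first segment fails to be inlined.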
A simple solution would be to adjust org.eclipse.rdf4j.rio.turtle.TurtleWriter.isWellFormedCollection() to recognize and accept that statement when checking for additional statements within the list.
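The kind of relaxed check I have in mind could look roughly like the sketch below. This is not the actual TurtleWriter code, whose internals are not shown in this issue. It is a self-contained illustration under the assumption that a list node is well-formed if it has exactly one rdf:first, exactly one rdf:rest, and no other statements except an optional rdf:type rdf:List. The Triple class and the isWellFormedList method are hypothetical names.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class WellFormedListSketch {

    /** Hypothetical stand-in for an RDF statement; not an RDF4J type. */
    static final class Triple {
        final String s, p, o;

        Triple(String s, String p, String o) {
            this.s = s;
            this.p = p;
            this.o = o;
        }
    }

    /** Walks the rdf:rest chain from the given node and checks every node on the way. */
    static boolean isWellFormedList(String node, List<Triple> data) {
        Set<String> visited = new HashSet<>();
        String current = node;
        while (!"rdf:nil".equals(current)) {
            if (!visited.add(current)) {
                return false; // cyclic list
            }
            int firstCount = 0;
            int restCount = 0;
            String rest = null;
            for (Triple t : data) {
                if (!t.s.equals(current)) {
                    continue;
                }
                if (t.p.equals("rdf:first")) {
                    firstCount++;
                } else if (t.p.equals("rdf:rest")) {
                    restCount++;
                    rest = t.o;
                } else if (t.p.equals("rdf:type") && t.o.equals("rdf:List")) {
                    // Tolerated: the type statement is redundant and is implied by the list syntax.
                } else {
                    return false; // any other statement prevents inlining
                }
            }
            if (firstCount != 1 || restCount != 1) {
                return false;
            }
            current = rest;
        }
        return true;
    }

    public static void main(String[] args) {
        List<Triple> data = List.of(
                new Triple("_:n1", "rdf:type", "rdf:List"),
                new Triple("_:n1", "rdf:first", "ex:NewYork"),
                new Triple("_:n1", "rdf:rest", "_:n2"),
                new Triple("_:n2", "rdf:first", "ex:Rio"),
                new Triple("_:n2", "rdf:rest", "rdf:nil"));
        System.out.println(isWellFormedList("_:n1", data));
    }
}
```

Any statement other than the tolerated type statement still disqualifies the node, so a list that carries extra information on one of its nodes would continue to be written in the long form and nothing would be lost on serialization.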
Expected Behavior
The proper visualization of the list would instead look like this:
@prefix ex: <http://example.com/ns#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
ex:Cities ex:list (ex:NewYork ex:Rio ex:Tokyo) .
Steps To Reproduce
This test case reproduces the behavior:
package com.metaphacts.services.ontologies;

import static org.junit.Assert.assertEquals;

import java.io.StringWriter;
import java.util.List;

import org.eclipse.rdf4j.model.BNode;
import org.eclipse.rdf4j.model.IRI;
import org.eclipse.rdf4j.model.Model;
import org.eclipse.rdf4j.model.impl.TreeModel;
import org.eclipse.rdf4j.model.util.RDFCollections;
import org.eclipse.rdf4j.model.util.Values;
import org.eclipse.rdf4j.model.vocabulary.RDF;
import org.eclipse.rdf4j.rio.RDFFormat;
import org.eclipse.rdf4j.rio.Rio;
import org.eclipse.rdf4j.rio.WriterConfig;
import org.eclipse.rdf4j.rio.helpers.BasicWriterSettings;
import org.junit.Test;

public class RDFCollectionsTest {

    @Test
    public void testBNodeValuesInList() throws Exception {
        final String ns = "http://example.com/ns#";

        IRI newyork = Values.iri(ns, "NewYork");
        IRI rio = Values.iri(ns, "Rio");
        IRI tokyo = Values.iri(ns, "Tokyo");
        IRI cities = Values.iri(ns, "Cities");
        IRI exList = Values.iri(ns, "list");
        BNode listHead = Values.bnode("n1");

        Model data = new TreeModel();
        data.setNamespace("ex", ns);
        data.setNamespace("rdf", RDF.NAMESPACE);

        RDFCollections.asRDF(List.of(newyork, rio, tokyo), listHead, data);
        data.add(cities, exList, listHead);

        String expected = "" +
                "@prefix ex: <http://example.com/ns#> .\n" +
                "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n" +
                "\n" +
                "ex:Cities ex:list (ex:NewYork ex:Rio ex:Tokyo) .\n";

        StringWriter stringWriter = new StringWriter();
        WriterConfig config = new WriterConfig();
        config.set(BasicWriterSettings.INLINE_BLANK_NODES, true);
        config.set(BasicWriterSettings.PRETTY_PRINT, true);
        Rio.write(data, stringWriter, RDFFormat.TURTLE, config);
        String actual = stringWriter.toString();

        // System.out.println("### ACTUAL ###");
        // System.out.println(actual);
        // System.out.println("#################\n");
        // System.out.println("### EXPECTED ###");
        // System.out.println(expected);
        // System.out.println("#################\n");

        assertEquals("The visual representation should be a proper list for all elements", expected, actual);
    }
}
Adding the following line before serializing the model to a string makes the test pass. However, a proper fix should recognize and accept the type statement rather than remove it.
// deleting the type statement ([ a rdf:List ]) from the list head would make it work
// data.remove(listHead, RDF.TYPE, RDF.LIST);
Version
3.7.4 (and likely earlier versions as well)
Are you interested in contributing a solution yourself?
Perhaps?
Anything else?
No response
In the ShaclSail I ended up writing my own rdf list implementation because of this issue. My fix was to not generate the type statements.
https://github.com/eclipse/rdf4j/blob/main/core/sail/shacl/src/main/java/org/eclipse/rdf4j/sail/shacl/ast/ShaclAstLists.java
This should be a simple fix in the TurtleWriter without further side effects, so I would rather go for a real solution here instead of working around it (although we do currently have a workaround in our application until this is fixed, because the resulting RDF text looks rather weird). This seems to be just an oversight in the check for whether lists are well-formed.
I guess it doesn't hurt to have simplified solutions for special cases, as you created for the ShaclSail, but in general I'd rather reuse a single, well-tested and compliant implementation.
Would you be interested in making a PR with the fix and some tests?
If we make a minor change to the consumeCollection method in the RDFCollections class, we can produce the desired output. consumeCollection is called by RDFCollections.asRDF(..).
public static void consumeCollection(Iterable<?> values, Resource head, Consumer<Statement> consumer,
        ValueFactory vf,
        Resource... contexts) {
    Objects.requireNonNull(values, "input collection may not be null");
    Objects.requireNonNull(consumer, "consumer may not be null");
    Objects.requireNonNull(vf, "injected value factory may not be null");

    Resource current = head != null ? head : vf.createBNode();

    // Statements.consume(vf, current, RDF.TYPE, RDF.LIST, consumer, contexts);
    // Commenting out the line above gives the desired output: the list items are properly inlined.

    Iterator<?> iter = values.iterator();
    while (iter.hasNext()) {
        Object o = iter.next();
        Value v = o instanceof Value ? (Value) o : Literals.createLiteralOrFail(vf, o);
        Statements.consume(vf, current, RDF.FIRST, v, consumer, contexts);
        if (iter.hasNext()) {
            Resource next = vf.createBNode();
            Statements.consume(vf, current, RDF.REST, next, consumer, contexts);
            current = next;
        } else {
            Statements.consume(vf, current, RDF.REST, RDF.NIL, consumer, contexts);
        }
    }
}

public static void consumeCollection(Iterable<?> values, Resource head, Consumer<Statement> consumer,
        Resource... contexts) {
    consumeCollection(values, head, consumer, SimpleValueFactory.getInstance(), contexts);
}

public static <C extends Collection<Statement>> C asRDF(Iterable<?> values, Resource head, C sink,
        Resource... contexts) {
    Objects.requireNonNull(sink);
    consumeCollection(values, head, sink::add, contexts);
    return sink;
}
Output:
@prefix ex: <http://example.com/ns#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
ex:Cities ex:list (ex:NewYork ex:Rio ex:Tokyo) .
It would be a great help if someone could help me solve this problem inside TurtleWriter, which would be the ideal fix for this issue. Ready to collaborate. @jetztgradnet
Thanks for picking this up @The-Nightwing ! I've left a few comments on your first PR, including a suggestion for a better fix in the TurtleWriter itself. Happy to talk further if anything is unclear.