RDF lists created using RDFCollections.asRDF() are not properly serialized using TurtleWriter

Open jetztgradnet opened this issue 2 years ago • 5 comments

Current Behavior

When an RDF list created using RDFCollections.asRDF() is serialized with TurtleWriter, the first segment of the list is not inlined, because the writer does not recognize it as part of a well-formed list.

The result looks like this:

@prefix ex: <http://example.com/ns#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

ex:Cities ex:list [ a rdf:List;
      rdf:first ex:NewYork;
      rdf:rest (ex:Rio ex:Tokyo)
    ] .

Note that the first segment is written with explicit rdf:first and rdf:rest statements instead of being inlined, whereas the last two segments of the list are inlined.

This is caused by org.eclipse.rdf4j.rio.turtle.TurtleWriter.isWellFormedCollection() not accepting the rdf:type rdf:List statement on the first segment of the list. That statement is generated by RDFCollections.asRDF() (or more specifically by org.eclipse.rdf4j.model.util.RDFCollections.consumeCollection(Iterable<?>, Resource, Consumer<Statement>, ValueFactory, Resource...)). A simple solution would be to adjust org.eclipse.rdf4j.rio.turtle.TurtleWriter.isWellFormedCollection() to recognize and accept that statement when checking for additional statements within the list.
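To make the proposed check concrete, here is a minimal, library-free sketch of a well-formedness test that tolerates the rdf:type rdf:List statement. It is not the actual TurtleWriter code: the class WellFormedListSketch, the Triple type, and the sampleList helper are hypothetical stand-ins that use plain strings instead of rdf4j values, and the rule "exactly one rdf:first, exactly one rdf:rest, optionally rdf:type rdf:List, nothing else" is an assumption about what the check should accept.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class WellFormedListSketch {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String RDF_FIRST = RDF_NS + "first";
    static final String RDF_REST = RDF_NS + "rest";
    static final String RDF_NIL = RDF_NS + "nil";
    static final String RDF_TYPE = RDF_NS + "type";
    static final String RDF_LIST = RDF_NS + "List";

    /** Hypothetical stand-in for a statement; strings replace rdf4j Resource/IRI/Value. */
    public static final class Triple {
        final String subject, predicate, object;

        public Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /**
     * Walks the list from head to rdf:nil. Each node must carry exactly one
     * rdf:first and one rdf:rest; an rdf:type rdf:List statement is tolerated,
     * and any other statement makes the list ill-formed.
     */
    public static boolean isWellFormedCollection(String head, List<Triple> data) {
        Set<String> visited = new HashSet<>();
        String current = head;
        while (!RDF_NIL.equals(current)) {
            if (!visited.add(current)) {
                return false; // cycle in the rdf:rest chain
            }
            String first = null;
            String rest = null;
            for (Triple t : data) {
                if (!t.subject.equals(current)) {
                    continue;
                }
                if (t.predicate.equals(RDF_FIRST) && first == null) {
                    first = t.object;
                } else if (t.predicate.equals(RDF_REST) && rest == null) {
                    rest = t.object;
                } else if (t.predicate.equals(RDF_TYPE) && t.object.equals(RDF_LIST)) {
                    // tolerated: the type statement added by RDFCollections.asRDF()
                } else {
                    return false; // duplicate rdf:first/rdf:rest or unrelated statement
                }
            }
            if (first == null || rest == null) {
                return false;
            }
            current = rest;
        }
        return true;
    }

    /** Builds the three-city list from this issue, optionally with the type statement on the head. */
    public static List<Triple> sampleList(boolean withTypeStatement) {
        List<Triple> data = new ArrayList<>();
        if (withTypeStatement) {
            data.add(new Triple("_:n1", RDF_TYPE, RDF_LIST));
        }
        data.add(new Triple("_:n1", RDF_FIRST, "ex:NewYork"));
        data.add(new Triple("_:n1", RDF_REST, "_:n2"));
        data.add(new Triple("_:n2", RDF_FIRST, "ex:Rio"));
        data.add(new Triple("_:n2", RDF_REST, "_:n3"));
        data.add(new Triple("_:n3", RDF_FIRST, "ex:Tokyo"));
        data.add(new Triple("_:n3", RDF_REST, RDF_NIL));
        return data;
    }

    public static void main(String[] args) {
        // The typed head no longer disqualifies the list.
        System.out.println(isWellFormedCollection("_:n1", sampleList(true))); // prints "true"
    }
}
```

With a check along these lines, the head node with its type statement would be treated like the other segments, while a list node carrying any unrelated statement would still be written out explicitly.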

Expected Behavior

The list should instead be serialized like this:

@prefix ex: <http://example.com/ns#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

ex:Cities ex:list (ex:NewYork ex:Rio ex:Tokyo) .

Steps To Reproduce

This test case reproduces the behavior:

package com.metaphacts.services.ontologies;

import static org.junit.Assert.assertEquals;

import java.io.StringWriter;
import java.util.List;

import org.eclipse.rdf4j.model.BNode;
import org.eclipse.rdf4j.model.IRI;
import org.eclipse.rdf4j.model.Model;
import org.eclipse.rdf4j.model.impl.TreeModel;
import org.eclipse.rdf4j.model.util.RDFCollections;
import org.eclipse.rdf4j.model.util.Values;
import org.eclipse.rdf4j.model.vocabulary.RDF;
import org.eclipse.rdf4j.rio.RDFFormat;
import org.eclipse.rdf4j.rio.Rio;
import org.eclipse.rdf4j.rio.WriterConfig;
import org.eclipse.rdf4j.rio.helpers.BasicWriterSettings;
import org.junit.Test;

public class RDFCollectionsTest {
    @Test
    public void testBNodeValuesInList() throws Exception {
        final String ns = "http://example.com/ns#";
        IRI newyork = Values.iri(ns, "NewYork");
        IRI rio = Values.iri(ns, "Rio");
        IRI tokyo = Values.iri(ns, "Tokyo");
        IRI cities = Values.iri(ns, "Cities");
        IRI exList = Values.iri(ns, "list");
        BNode listHead = Values.bnode("n1");
        
        Model data = new TreeModel();
        data.setNamespace("ex", ns);
        data.setNamespace("rdf", RDF.NAMESPACE);
        RDFCollections.asRDF(List.of(newyork, rio, tokyo), listHead, data);
        data.add(cities, exList, listHead);

        String expected = "" +
                "@prefix ex: <http://example.com/ns#> .\n" +
                "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n" +
                "\n" +
                "ex:Cities ex:list (ex:NewYork ex:Rio ex:Tokyo) .\n";

        StringWriter stringWriter = new StringWriter();
        WriterConfig config = new WriterConfig();
        config.set(BasicWriterSettings.INLINE_BLANK_NODES, true);
        config.set(BasicWriterSettings.PRETTY_PRINT, true);
        Rio.write(data, stringWriter, RDFFormat.TURTLE, config);
        String actual = stringWriter.toString();
        
//        System.out.println("### ACTUAL ###");
//        System.out.println(actual);
//        System.out.println("#################\n");
        
//        System.out.println("### EXPECTED ###");
//        System.out.println(expected);
//        System.out.println("#################\n");

        assertEquals("The visual representation should be a proper list for all elements", expected, actual);
    }
}

Adding this line before serializing the model to a string makes the test pass. However, the fix should not remove that statement; the writer should recognize and accept it.

        // deleting the type statement in the list would make it work
        // [ a rdf:List ]
        // data.remove(listHead, RDF.TYPE, RDF.LIST);

Version

3.7.4 (and likely earlier versions as well)

Are you interested in contributing a solution yourself?

Perhaps?

Anything else?

No response

Feb 15 '22 10:02 jetztgradnet

In the ShaclSail I ended up writing my own rdf list implementation because of this issue. My fix was to not generate the type statements.

https://github.com/eclipse/rdf4j/blob/main/core/sail/shacl/src/main/java/org/eclipse/rdf4j/sail/shacl/ast/ShaclAstLists.java

Feb 15 '22 11:02 hmottestad

This should be a simple fix in the TurtleWriter without further side effects, so I would rather go for a real solution here instead of working around it (although we do currently have a workaround in our application until this is fixed, because the resulting RDF text looks rather weird). This seems to be just an oversight in the check for whether lists are well-formed.

I guess it doesn't hurt to have simplified solutions for special cases as you created for the ShaclSail, but in general I'd rather reuse a single, well-tested and compliant implementation.

Feb 15 '22 15:02 jetztgradnet

Would you be interested in making a PR with the fix and some tests?

Feb 15 '22 16:02 hmottestad

With a minor change to the consumeCollection method in the RDFCollections class, we can produce the desired output. consumeCollection is called by RDFCollections.asRDF(..).

public static void consumeCollection(Iterable<?> values, Resource head, Consumer<Statement> consumer,
			ValueFactory vf,
			Resource... contexts) {
		Objects.requireNonNull(values, "input collection may not be null");
		Objects.requireNonNull(consumer, "consumer may not be null");
		Objects.requireNonNull(vf, "injected value factory may not be null");

		Resource current = head != null ? head : vf.createBNode();

		// Commenting out the following line yields the desired output, with all list items properly inlined:
		// Statements.consume(vf, current, RDF.TYPE, RDF.LIST, consumer, contexts);

		Iterator<?> iter = values.iterator();
		while (iter.hasNext()) {
			Object o = iter.next();
			Value v = o instanceof Value ? (Value) o : Literals.createLiteralOrFail(vf, o);
			Statements.consume(vf, current, RDF.FIRST, v, consumer, contexts);
			if (iter.hasNext()) {
				Resource next = vf.createBNode();
				Statements.consume(vf, current, RDF.REST, next, consumer, contexts);
				current = next;
			} else {
				Statements.consume(vf, current, RDF.REST, RDF.NIL, consumer, contexts);
			}
		}
	}
	public static void consumeCollection(Iterable<?> values, Resource head, Consumer<Statement> consumer,
			Resource... contexts) {
		consumeCollection(values, head, consumer, SimpleValueFactory.getInstance(), contexts);
	}

	public static <C extends Collection<Statement>> C asRDF(Iterable<?> values, Resource head, C sink,
			Resource... contexts) {
		Objects.requireNonNull(sink);
		consumeCollection(values, head, sink::add, contexts);
		return sink;
	}

Output:

@prefix ex: <http://example.com/ns#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

ex:Cities ex:list (ex:NewYork ex:Rio ex:Tokyo) .


It would be a great help if someone could help me solve this problem from inside TurtleWriter, which would be the ideal fix for this issue. Ready to collaborate. @jetztgradnet

May 01 '22 16:05 The-Nightwing

Thanks for picking this up @The-Nightwing ! I've left a few comments on your first PR, including a suggestion for a better fix in the TurtleWriter itself. Happy to talk further if anything is unclear.

May 13 '22 23:05 abrokenjester