Prettier output from Erfurt RDF/Turtle export

Open frodeseverin opened this issue 13 years ago • 4 comments

The Turtle source as exported from http://data.bbib.no/source/edit/r/Nordahl_Grieg is not very readable in the part containing frbr:creatorOf. Never mind that frbr:creatorOf appears twice: that was my own mistake in OntoWiki, where I added a separate instance of the property widget for frbr:creatorOf instead of using the + button to expand the existing widget. That is an OntoWiki issue, I suppose.

The readability of the line

                frbr:creatorOf <http://data.bbib.no/data/17._mai_1914>, <http://data.bbib.no/data/Ung%20m%C3%A5%20verden%20ennu%20v%C3%A6re>, <http://data.bbib.no/data/Ung%20m%C3%A5%20verden%20ennu%20v%C3%A6re%20%3A%20roman>, <http://data.bbib.no/data/V%C3%A5r%20%C3%A6re%20og%20v%C3%A5r%20makt> ;

is what concerns this issue.

Exporting this as

                frbr:creatorOf <http://data.bbib.no/data/17._mai_1914>, 
                                <http://data.bbib.no/data/Ung%20m%C3%A5%20verden%20ennu%20v%C3%A6re>, 
                                <http://data.bbib.no/data/Ung%20m%C3%A5%20verden%20ennu%20v%C3%A6re%20%3A%20roman>, 
                                <http://data.bbib.no/data/V%C3%A5r%20%C3%A6re%20og%20v%C3%A5r%20makt> ;

would be more readable, and still valid Turtle syntax, I suppose.

;)Frode

frodeseverin avatar Oct 12 '12 10:10 frodeseverin

Ping.

frodeseverin avatar Jan 03 '13 12:01 frodeseverin

But I think that, in general, comma-separated lists are more readable when they are on one line. The exporter would need to check whether an output line exceeds some given length and wrap the line according to some rules.
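A minimal sketch of such a length check, in Python purely for illustration. It is not taken from Erfurt's exporter; the function name `serialize_objects`, the `indent` parameter, and the 80-column default for `max_width` are all assumptions made for this example.

```python
def serialize_objects(predicate, objects, indent=4, max_width=80):
    """Keep the object list on one line if it fits, otherwise wrap it.

    Hypothetical helper: name, parameters and the 80-column default are
    assumptions for illustration, not part of any real exporter.
    """
    if not objects:
        raise ValueError("an object list needs at least one object")
    lead = " " * indent + predicate + " "
    one_line = lead + ", ".join(objects)
    if len(one_line) <= max_width:
        return one_line
    # Too long: put each object on its own line, aligned under the first.
    return lead + (",\n" + " " * len(lead)).join(objects)


print(serialize_objects(":d", [":e", ":f"], indent=3))
print(serialize_objects(":d", [":e", ":f"], indent=3, max_width=8))
```

The first call stays on one line; the second exceeds the (artificially small) width and is wrapped after each comma.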

white-gecko avatar Jan 03 '13 18:01 white-gecko

I see your point. However, the exporter only needs to make sure a comma is followed by a line break and appropriate indentation, rather than a space.
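As a rough sketch of that simpler rule (Python, for illustration only; `format_object_list` and its `indent` parameter are hypothetical names, not anything from Erfurt's code), every comma is followed by a line break and enough spaces to align the next object under the first:

```python
def format_object_list(predicate, objects, indent=4):
    """Serialize one predicate with one object per line.

    Hypothetical helper for illustration; not an existing exporter API.
    """
    if not objects:
        raise ValueError("an object list needs at least one object")
    lead = " " * indent + predicate + " "
    # Continuation lines are padded to the column of the first object.
    continuation = ",\n" + " " * len(lead)
    return lead + continuation.join(objects)


works = [
    "<http://data.bbib.no/data/17._mai_1914>",
    "<http://data.bbib.no/data/V%C3%A5r%20%C3%A6re%20og%20v%C3%A5r%20makt>",
]
print(format_object_list("frbr:creatorOf", works, indent=16) + " ;")
```

No length check is needed here, which is the point: the rule is unconditional.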

The object of Turtle notation is, as far as I know, to provide an RDF serialization that is easier for humans to read. True enough, every reader will have their own preferences, and catering for this variety is a complex task.

Now, generally, other Turtle files I have seen tend to present combined triples like the above on multiple lines. The idea seems to be that line breaks are more easily interpreted as a logical separator than are spaces.

Turning to the source, the W3C Team Submission states that "This document defines a textual syntax for RDF called Turtle that allows RDF graphs to be completely written in a compact and natural text form, with abbreviations for common usage patterns and datatypes." What counts as compact and natural text form is, of course, a matter of interpretation.

Section 2.3 illustrates the syntax further, and indicates that groups of triples should be written like this:

# this is not a complete turtle document
:a :b :c ,
      :d .
# the last triple is :a :b :d .

and that groups of predicates should be written like this:

# this is not a complete turtle document
:a :b :c ;
   :d :e .
# the last triple is :a :d :e .

Combining these, in my personal interpretation, gives the following:

# this is not a complete turtle document
:a :b :c ;
   :d :e ,
      :f .
# the last triple is :a :d :f .

Personally, I find this more readable than

# this is not a complete turtle document
:a :b :c ;
   :d :e , :f .
# the last triple is :a :d :f .

however valid and readable this syntax is for a machine.

I'll have to ask the Team Submission authors for clarification.

frodeseverin avatar Jan 04 '13 10:01 frodeseverin

Reading a bit more on the issue reveals that things are more complicated still. The W3C Turtle Working Draft is not conclusive on this matter.

From http://www.w3.org/TR/2012/WD-turtle-20120710/#object-lists the preferred layout seems to be the one currently used by Erfurt.

From http://www.w3.org/TR/2012/WD-turtle-20120710/#sec-parsing-example the preferred layout seems to be the one I suggest above:

# Example I
# this is not a complete turtle document
:a :b :c ;
   :d :e ,
      :f .
# the last triple is :a :d :f .

I'll look into the matter.

;)Frode

frodeseverin avatar Jan 07 '13 07:01 frodeseverin