rdflib.js
Turtle serializer shouldn't write blank nodes as <...>
I'm converting JSON-LD to Turtle using rdflib.js.
Example input:
{
"@context": {
"ex": "http://example.com#"
},
"@id": "ex:myid",
"ex:prop1": {
"ex:prop2": {
"ex:prop3": "value"
}
}
}
Current output from rdflib.js:
@prefix ex: <http://example.com#>.
<_:b0> ex:prop2 <_:b1>.
<_:b1> ex:prop3 "value".
ex:myid ex:prop1 <_:b0>.
The Turtle spec states the following:
RDF blank nodes in Turtle are expressed as _: followed by a blank node label which is a series of name characters.
So, I think blank nodes should be expressed without <...>, because this makes them absolute or relative IRIs and not blank nodes.
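To illustrate the distinction the spec draws, here is a minimal sketch of how a serializer could pick the syntax per term type. The `Term` interface and `termToTurtle` function are hypothetical names made up for this example; this is not rdflib.js's actual serializer code, and the literal handling only covers simple strings.

```typescript
// Hypothetical minimal term shape; real rdflib.js terms carry more fields.
interface Term {
  termType: "NamedNode" | "BlankNode" | "Literal";
  value: string;
}

// Render a single term in Turtle syntax.
function termToTurtle(term: Term): string {
  if (term.termType === "BlankNode") {
    // Blank nodes are written as _:label and never wrapped in <...>.
    // Accept values with or without a leading "_:" prefix.
    const label = term.value.slice(0, 2) === "_:" ? term.value.slice(2) : term.value;
    return `_:${label}`;
  }
  if (term.termType === "NamedNode") {
    // Only IRIs go inside angle brackets.
    return `<${term.value}>`;
  }
  // Assumption: plain string literal without language tag or datatype;
  // JSON string escaping is used as an approximation for simple values.
  return JSON.stringify(term.value);
}
```

With this rule, a term only ends up as `<_:b0>` if it was already a NamedNode before serialization, which is what the output above suggests.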
As an additional feature, it would be nice to be able to control the blank node output to have them nested or not nested.
Questions:
- Is this a known issue? I saw some non-conformances in #329, but couldn't find this exact case there.
- Could this be caused by the arguments I pass, in case I'm calling the functions wrong? My code snippet is included below.
/**
* Convert JSON-LD to Turtle
* @param input JSON string
* @param base Base IRI for the content
* @param namespaces The namespace map for use in ttl
* @returns TTL string
*/
async function convertJsonLdToTtl(
input: string,
base: string,
namespaces: Record<string, string> = {},
): Promise<string> {
return new Promise<string>((res, rej) => {
const store = rdflib.graph()
rdflib.parse(input, store, base, "application/ld+json", (err, kb) => {
if (err) {
rej(err)
} else {
if (!kb) {
rej("KB empty: " + kb)
} else {
console.log("KB # statements: " + kb.statements.length)
rdflib.serialize(
null,
kb,
undefined,
"text/turtle",
(err, output) => {
if (err) {
rej(err)
} else {
if (!output) {
rej("Empty output: " + output)
} else {
res(output)
}
}
},
{
namespaces,
},
)
}
}
})
})
}
Many thanks.
I can confirm that <_:b0> is a NamedNode, not a BlankNode in Turtle. So this looks like a bug.
Agreed. The issue may be in JSON-LD parser and not in turtle serializer.
Thanks for the hint. I tried first converting the JSON-LD to N-Quads (with another library, jsonld) and then converting that to Turtle, which helped: the blank nodes are now embedded. The blank node labels may still be wrong, as I couldn't test this, but my problem is solved for now.
This is rather problematic for any system that uses rdflib.js to parse JSON-LD. Any chance this can get prioritized?
I can confirm that e.g. the following JSON-LD is not parsed correctly:
{
"@context": {
"@vocab": "https://example.com/"
},
"hasExampleProperty": "some literal value"
}
Results in the following statement (I'm using an example IRI for the graph here):
{
"subject": {
"termType": "NamedNode",
"classOrder": 5,
"value": "_:b0"
},
"predicate": {
"termType": "NamedNode",
"classOrder": 5,
"value": "https://example.com/hasExampleProperty"
},
"object": {
"termType": "Literal",
"classOrder": 1,
"value": "some literal value",
"datatype": {
"termType": "NamedNode",
"classOrder": 5,
"value": "http://www.w3.org/2001/XMLSchema#string"
},
"isVar": 0,
"language": ""
},
"graph": {
"termType": "NamedNode",
"classOrder": 5,
"value": "https://example.com/test/"
}
}
But clearly _:b0 should be a BlankNode.
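To check whether a parsed store is affected, one could scan its statements for terms typed as NamedNode whose value looks like a blank node label. The sketch below works on plain objects shaped like the JSON dump above; `PlainTerm`, `PlainStatement` and `findMislabelledTerms` are hypothetical names for this example, not part of rdflib.js.

```typescript
// Hypothetical plain shapes mirroring the JSON dump of a statement.
interface PlainTerm {
  termType: string;
  value: string;
}

interface PlainStatement {
  subject: PlainTerm;
  predicate: PlainTerm;
  object: PlainTerm;
  graph: PlainTerm;
}

// A NamedNode whose value starts with "_:" is almost certainly a
// blank node that was given the wrong term type.
function isMislabelledBlankNode(term: PlainTerm): boolean {
  return term.termType === "NamedNode" && term.value.slice(0, 2) === "_:";
}

// Collect every suspicious term across all positions of all statements.
function findMislabelledTerms(statements: PlainStatement[]): PlainTerm[] {
  const found: PlainTerm[] = [];
  for (const st of statements) {
    for (const term of [st.subject, st.predicate, st.object, st.graph]) {
      if (isMislabelledBlankNode(term)) {
        found.push(term);
      }
    }
  }
  return found;
}
```

Run against the statement above, this would flag only the subject.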
The corresponding Turtle, by contrast, is parsed correctly:
@prefix ex: <https://example.com/> .
[] ex:hasExampleProperty "some literal value" .
Becomes:
{
"subject": {
"termType": "BlankNode",
"classOrder": 6,
"value": "_g_L2C39",
"isBlank": 1,
"isVar": 1
},
"predicate": {
"termType": "NamedNode",
"classOrder": 5,
"value": "https://example.com/hasExampleProperty"
},
"object": {
"termType": "Literal",
"classOrder": 1,
"value": "some literal value",
"datatype": {
"termType": "NamedNode",
"classOrder": 5,
"value": "http://www.w3.org/2001/XMLSchema#string"
},
"isVar": 0,
"language": ""
},
"graph": {
"termType": "NamedNode",
"classOrder": 5,
"value": "https://example.com/test/"
}
}
(Interestingly, the blank node gets a completely different internal identifier in this case).
When the JSON-LD contains a list, the blank nodes corresponding to that collection are generated correctly:
{
"@context": {
"@vocab": "https://example.com/",
"hasExampleProperty": {
"@container": "@list"
}
},
"hasExampleProperty": ["some literal value", "some other literal value"]
}
As N-Quads:
_:n4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> "some other literal value".
_:n4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> <http://www.w3.org/1999/02/22-rdf-syntax-ns#nill>.
_:n5 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> "some literal value".
_:n5 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> _:n4.
<_:b0> <https://example.com/hasExampleProperty> _:n5 <https://example.com/test/> .
The function jsonldObjectToTerm does not appear to ever return a BlankNode:
https://github.com/linkeddata/rdflib.js/blob/c14dfd57d5159ad5ac1ee2523cc7924968e24f53/src/jsonldparser.js#L11
Diagnosis
It looks like the flatten function from jsonld.js is the culprit.
The JSON-LD parser takes the flattened output, and checks for @id attributes to determine whether the JSON object represents a blank node or not.
https://github.com/linkeddata/rdflib.js/blob/c14dfd57d5159ad5ac1ee2523cc7924968e24f53/src/jsonldparser.js#L68-L83
and:
https://github.com/linkeddata/rdflib.js/blob/c14dfd57d5159ad5ac1ee2523cc7924968e24f53/src/jsonldparser.js#L24-L26
However, the jsonld.js flattened output inserts @id attributes, e.g. the above JSON-LD (without the list) results in:
[
{
"@id": "_:b0",
"https://example.com/hasExampleProperty": [
{
"@value": "some literal value"
}
]
}
]
This turns the node into a NamedNode because it has an @id attribute.
The @id attribute is a non-normative part of the JSON-LD specification at https://www.w3.org/TR/json-ld11/#identifying-blank-nodes.
The flattened output (also non-normative) uses this in its examples: https://www.w3.org/TR/json-ld11/#flattened-document-form (and it needs to as it cannot use nesting to group the properties of the node together).
Proposed Solution
- Do not rely on the presence of an @id attribute, as it will always be there for named and blank nodes.
- Use the standard syntax for blank nodes in JSON-LD to identify whether a JSON object is a blank node: any value of @id that starts with _: is a blank node.
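A minimal sketch of the proposed check, assuming the parser receives the @id value of a flattened node (or undefined if there is none). The `NodeTerm` type, the `nodeFromId` function and the `freshLabel` callback are hypothetical and only illustrate the decision; the real parser would presumably build its terms through rdflib.js's own factory functions.

```typescript
// Hypothetical result type: just enough to show which kind of term is chosen.
type NodeTerm =
  | { termType: "BlankNode"; value: string }
  | { termType: "NamedNode"; value: string };

// Decide the term type from the @id value rather than from its mere presence.
// freshLabel is an assumed callback that supplies a new blank node label.
function nodeFromId(id: string | undefined, freshLabel: () => string): NodeTerm {
  if (id === undefined) {
    // No @id at all: still a blank node, with a generated label.
    return { termType: "BlankNode", value: freshLabel() };
  }
  if (id.slice(0, 2) === "_:") {
    // JSON-LD blank node identifier: keep the label, drop the "_:" prefix.
    return { termType: "BlankNode", value: id.slice(2) };
  }
  // Anything else is treated as an IRI.
  return { termType: "NamedNode", value: id };
}
```

Whether the stored label should keep or drop the `_:` prefix is a design choice left open here; the sketch drops it so that a serializer can add the prefix back exactly once.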