rdfa-streaming-parser.js
Do not preserve markup of elements with the `property` attribute and without a markup-preserving `datatype`
For elements with the property attribute, the parser should preserve the markup in the element's content (and add all namespaces and prefixes) only when datatype="rdf:HTML" or datatype="rdf:XMLLiteral" is given. When the datatype attribute is absent, or when datatype="", the literal should contain only the text content.
For example, given input:
<dd about="#foo" property="skos:definition">foo <em>bar</em> baz</dd>
the parser currently produces:
<#foo> skos:definition "foo <em xmlns=\"http://www.w3.org/1999/xhtml\".......\">bar</em> baz" .
but the expected RDF is:
<#foo> skos:definition "foo bar baz" .
It's been a while since I looked into the processing model of RDFa.
But do you mean that the markup should not be preserved without the datatype attribute or datatype="", but it must be preserved when using datatype="rdf:HTML" or datatype="rdf:XMLLiteral"?
That's right AFAICT.
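The rule agreed on above can be sketched as a small predicate. This is only an illustration: `preservesMarkup` is a hypothetical helper, not part of the rdfa-streaming-parser API, and it assumes the datatype value has already been expanded from a CURIE such as rdf:HTML to a full IRI.

```javascript
// Hypothetical helper for illustration; not the parser's actual implementation.
const RDF = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#';

// The two datatypes for which the element's markup is kept in the literal.
const MARKUP_DATATYPES = new Set([`${RDF}HTML`, `${RDF}XMLLiteral`]);

// datatypeIri: the expanded datatype IRI, '' for datatype="",
// or null/undefined when the attribute is absent (assumed convention).
function preservesMarkup(datatypeIri) {
  if (datatypeIri == null || datatypeIri === '') {
    return false; // plain literal: text content only
  }
  return MARKUP_DATATYPES.has(datatypeIri);
}

console.log(preservesMarkup(undefined));          // false
console.log(preservesMarkup(''));                 // false
console.log(preservesMarkup(`${RDF}XMLLiteral`)); // true
```

Any other datatype (for example xsd:date) also yields `false` here, since only the two markup datatypes are listed.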
Double-checking: some cases appear to give the expected output, but others do not, i.e., the markup is still preserved. For example, see the skos:definition values in https://solidproject.org/ED/protocol.
Nudge. It'd be great to fix this. We're trying to commit to using this parser now in dokieli but this issue is kind of a blocker.
Without this fix, I think the consuming applications would probably have to convert the literal to a DOM object and get its .textContent or something to get the actual literal, and that's quite cumbersome.
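For reference, the workaround described above might look roughly like the following. `literalToText` is a hypothetical helper name. The sketch assumes a browser environment where `DOMParser` is available (as in dokieli); outside a browser it falls back to crude tag stripping that does not decode entities and can be confused by a `>` inside an attribute value.

```javascript
// Hypothetical workaround sketch: reduce a markup-bearing literal to plain text.
function literalToText(value) {
  if (typeof DOMParser !== 'undefined') {
    // Browser path: parse the literal as HTML and read its text content.
    const doc = new DOMParser().parseFromString(value, 'text/html');
    return doc.body.textContent ?? '';
  }
  // Non-browser fallback (assumption): strip anything that looks like a tag.
  return value.replace(/<[^>]*>/g, '');
}

const serialized = 'foo <em xmlns="http://www.w3.org/1999/xhtml">bar</em> baz';
console.log(literalToText(serialized)); // foo bar baz
```

Every consumer would need to apply something like this to each affected literal, which is why fixing it in the parser is preferable.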
FWIW, compare:
https://rdf-play.rubensworks.net/#url=https%3A%2F%2Fsolidproject.org%2FTR%2Fprotocol
with
http://rdf.greggkellogg.net/distiller?command=serialize&url=https:%2F%2Fsolidproject.org%2FTR%2Fprotocol&raw
I've also tried to use the profile option, e.g., "html" in the parser configuration (in addition to contentType), and that didn't seem to help either.
Should be fixed in 3.0.1.
It might take a while for it to propagate to rdf-play though.
Great! Thank you. Did a preliminary test in dokieli and seems to be working as intended. Will keep you posted if I spot any issues.