rdfa-streaming-parser.js

Do not preserve markup of elements with the `property` attribute and without markup preserving `datatype`

Open csarven opened this issue 3 years ago • 2 comments

For elements that have the property attribute but no markup-preserving datatype (datatype="rdf:HTML" or datatype="rdf:XMLLiteral"), i.e., when the datatype attribute is absent or is datatype="", the parser should not preserve the markup in the content of the element, nor add all namespaces and prefixes to it.

For example, given input:

<dd about="#foo" property="skos:definition">foo <em>bar</em> baz</dd>

the parser currently produces:

<#foo> skos:definition "foo <em xmlns=\"http://www.w3.org/1999/xhtml\".......\">bar</em> baz" .

but the expected RDF is:

<#foo> skos:definition "foo bar baz" .
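
The rule being asked for can be sketched roughly as follows. This is only an illustration of the expected behaviour, not the parser's actual code: the mini element model (strings for text nodes, `{ tag, children }` objects for elements) and the names `MARKUP_DATATYPES`, `textContent`, `innerMarkup` and `literalValue` are all assumptions made up for this example. It also compares prefixed names directly and skips namespace handling and escaping, which a real implementation would not do.

```javascript
// Hypothetical mini-model of an element's content: text nodes are strings,
// child elements are objects of the form { tag, children }.
// None of these names come from rdfa-streaming-parser.
const MARKUP_DATATYPES = new Set(['rdf:HTML', 'rdf:XMLLiteral']);

// Concatenate the text of all descendants, dropping the tags
// (the same idea as DOM textContent).
function textContent(children) {
  return children
    .map((child) => (typeof child === 'string' ? child : textContent(child.children)))
    .join('');
}

// Serialize the children back to markup (no namespaces or escaping, for brevity).
function innerMarkup(children) {
  return children
    .map((child) =>
      typeof child === 'string'
        ? child
        : `<${child.tag}>${innerMarkup(child.children)}</${child.tag}>`)
    .join('');
}

// Pick the literal value for an element carrying a `property` attribute.
// `datatype` is undefined when the attribute is absent.
function literalValue(datatype, children) {
  return MARKUP_DATATYPES.has(datatype) ? innerMarkup(children) : textContent(children);
}

// Content of the <dd> element from the example above.
const dd = ['foo ', { tag: 'em', children: ['bar'] }, ' baz'];

console.log(literalValue(undefined, dd));  // foo bar baz
console.log(literalValue('', dd));         // foo bar baz
console.log(literalValue('rdf:HTML', dd)); // foo <em>bar</em> baz
```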

csarven avatar Nov 06 '22 12:11 csarven

It's been a while since I looked into the processing model of RDFa. But do you mean that the markup should not be preserved without the datatype attribute or datatype="", but it must be preserved when using datatype="rdf:HTML" or datatype="rdf:XMLLiteral"?

rubensworks avatar Nov 07 '22 08:11 rubensworks

That's right AFAICT.

Double-checking: some cases appear to give the expected output, but others do not, i.e., the markup is preserved. For example, see the skos:definition values in https://solidproject.org/ED/protocol.

csarven avatar Nov 07 '22 08:11 csarven

Nudge. It'd be great to fix this. We're trying to commit to using this parser now in dokieli but this issue is kind of a blocker.

Without this fix, I think the consuming applications would probably have to convert the literal to a DOM object and get its .textContent or something to get the actual literal, and that's quite cumbersome.
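
To illustrate the kind of workaround meant here, a minimal sketch, assuming a browser-like environment where DOMParser is available. The helper name `literalToText` and its second parameter are hypothetical and exist only for this example; they are not part of the parser or of dokieli.

```javascript
// Workaround sketch: recover the plain-text value from a literal that still
// contains markup. `literalToText` is a hypothetical helper name.
// `DOMParserImpl` defaults to the browser's DOMParser; outside a browser an
// equivalent implementation has to be passed in explicitly.
function literalToText(value, DOMParserImpl = globalThis.DOMParser) {
  if (typeof DOMParserImpl !== 'function') {
    throw new Error('No DOMParser available in this environment');
  }
  // Parse the literal as an HTML fragment and read back only its text.
  const doc = new DOMParserImpl().parseFromString(value, 'text/html');
  return doc.body.textContent;
}

// In a browser:
// literalToText('foo <em xmlns="http://www.w3.org/1999/xhtml">bar</em> baz')
// returns 'foo bar baz'
```

Every consumer would have to run something like this on each affected literal, which is why handling it in the parser is preferable.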

FWIW, compare:

https://rdf-play.rubensworks.net/#url=https%3A%2F%2Fsolidproject.org%2FTR%2Fprotocol

with

http://rdf.greggkellogg.net/distiller?command=serialize&url=https:%2F%2Fsolidproject.org%2FTR%2Fprotocol&raw


I've also tried to use the profile option, e.g., "html" in the parser configuration (in addition to contentType), and that didn't seem to help either.

csarven avatar Jan 10 '25 15:01 csarven

Should be fixed in 3.0.1.

It might take a while for it to propagate to rdf-play though.

rubensworks avatar Jan 13 '25 12:01 rubensworks

Great! Thank you. Did a preliminary test in dokieli and seems to be working as intended. Will keep you posted if I spot any issues.

csarven avatar Jan 13 '25 12:01 csarven