XML qname parses correctly but is not serialized correctly.

Open jamsden opened this issue 7 years ago • 4 comments

Error: Invalid character "." cannot be in XML qname for URI: http://jazz.net/xmlns/prod/jazz/rtc/ext/1.0/com.ibm.team.apt.attribute.acceptance
    at Serializer.qnameMethod (/Users/jamsden/GitHub/rdflib.js/lib/serializer.js:913:17)
    at Serializer.subjectXMLTreeMethod (/Users/jamsden/GitHub/rdflib.js/lib/serializer.js:819:13)
    at Serializer.statementListToXMLTreeMethod (/Users/jamsden/GitHub/rdflib.js/lib/serializer.js:736:22)
    at Serializer.statementsToXML (/Users/jamsden/GitHub/rdflib.js/lib/serializer.js:931:16)
    at Object.serialize (/Users/jamsden/GitHub/rdflib.js/lib/serialize.js:27:29)

According to https://www.w3.org/TR/REC-xml/#NT-Name, NCName can contain ".". So the serializer should handle these QNames.

jamsden, Apr 17 '18 17:04

For the syntax of a QName, https://www.w3.org/TR/REC-xml/ says that a local name for a qname is a Name less colon. A Name is defined here: https://www.w3.org/TR/REC-xml/#NT-Name. There's almost nothing illegal in that set of characters except colon, control characters, and unassigned code points (and as far as I can see the latter are merely discouraged, not illegal).
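As a rough illustration of the point above, here is a minimal sketch of an NCName check built from the `NameStartChar` and `NameChar` productions in the XML specification, with the colon removed. The function name `isNCName` is hypothetical and is not part of rdflib.js; it is only meant to show that "." is accepted anywhere except the first position.

```javascript
// Character ranges from the XML 1.0 NameStartChar production, minus ':'.
const NC_START =
  'A-Z_a-z\\u{C0}-\\u{D6}\\u{D8}-\\u{F6}\\u{F8}-\\u{2FF}' +
  '\\u{370}-\\u{37D}\\u{37F}-\\u{1FFF}\\u{200C}-\\u{200D}' +
  '\\u{2070}-\\u{218F}\\u{2C00}-\\u{2FEF}\\u{3001}-\\u{D7FF}' +
  '\\u{F900}-\\u{FDCF}\\u{FDF0}-\\u{FFFD}\\u{10000}-\\u{EFFFF}';

// NameChar adds '-', '.', digits, U+00B7, and two further ranges.
const NC_REST = NC_START + '\\-.0-9\\u{B7}\\u{300}-\\u{36F}\\u{203F}-\\u{2040}';

const NCNAME_RE = new RegExp(`^[${NC_START}][${NC_REST}]*$`, 'u');

// Hypothetical helper: true if `name` matches the NCName production.
function isNCName(name) {
  return NCNAME_RE.test(name);
}

console.log(isNCName('com.ibm.team.apt.attribute.acceptance')); // true
console.log(isNCName('a:b'));                                   // false
```

Note that a name may contain "." but may not start with one, so `.acceptance` would be rejected by this check.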

jamsden, May 21 '18 15:05

I see this is still open. https://www.w3.org/TR/xml-names11/#NT-NCName explicitly states that "." is included in NCName which is part of a Prefix.

The fix is simple: In rdflib/src/serializer.js:196: Change:

__Serializer.prototype._notQNameChars = '\t\r\n !"#$%&\'()*.,+/;<=>?@[\\]^`{|}~';

to:

__Serializer.prototype._notQNameChars = '\t\r\n !"#$%&\'()*,+/;<=>?@[\\]^`{|}~';

(i.e., remove . from the _notQNameChars)

Any chance this could get addressed?

jamsden, Apr 12 '25 18:04

Thanks for reopening. However, the change has other consequences: it makes some tests fail when serializing text/turtle; see https://github.com/linkeddata/rdflib.js/pull/686

There are also related consequences, such as https://github.com/linkeddata/rdflib.js/issues/647

bourgeoa, Apr 13 '25 10:04

@bourgeoa: Understood. But it seems like anything that relies on considering valid QNames as invalid would be incorrect and should also be addressed.

Otherwise, is this something that could be addressed in a future release or do I need to fork?

IBM Engineering Workflow Manager (formerly Rational Team Concert) makes heavy use of prefixes that contain dots.

jamsden, Apr 23 '25 18:04

Any progress on this? I can't publish an update to some of my OSLC packages without this fix.

jamsden, Aug 31 '25 13:08

@jamsden: I think your requested change is clearer here than in your earlier comment (https://github.com/linkeddata/rdflib.js/issues/228#issuecomment-2798947604). Note that I have simply inserted full-line codefences (```) above and below each line of code. I have also put inline codefences (`) before and after the line identifier and the code snippets in the "i.e.".

The fix is simple: In `rdflib/src/serializer.js:196`: Change:

__Serializer.prototype._notQNameChars = '\t\r\n !"#$%&\'()*.,+/;<=>?@[\\]^`{|}~'; 

to:

__Serializer.prototype._notQNameChars = '\t\r\n !"#$%&\'()*,+/;<=>?@[\\]^`{|}~';

(i.e., remove `.` from the `_notQNameChars`)

The new line should have only one fewer character than the old line, and I think I've put all the backslashes in correctly.
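To illustrate the effect of that one-character change, here is a simplified model of the check. It is not the actual rdflib.js code: the helper `localNameOf` is hypothetical and only assumes, based on the error message in this issue, that the serializer walks back from the end of the URI to the last `/` or `#` and rejects any character found in `_notQNameChars` along the way.

```javascript
// The two candidate character sets from the proposed change.
const OLD_NOT_QNAME_CHARS = '\t\r\n !"#$%&\'()*.,+/;<=>?@[\\]^`{|}~';
const NEW_NOT_QNAME_CHARS = '\t\r\n !"#$%&\'()*,+/;<=>?@[\\]^`{|}~';

// Hypothetical helper (an assumption, not the library's implementation):
// returns the part of `uri` after the last '/' or '#', or throws if a
// disallowed character is found before reaching that separator.
function localNameOf(uri, notQNameChars) {
  for (let i = uri.length - 1; i >= 0; i--) {
    const ch = uri[i];
    if (ch === '/' || ch === '#') return uri.slice(i + 1);
    if (notQNameChars.includes(ch)) {
      throw new Error(`Invalid character "${ch}" cannot be in XML qname for URI: ${uri}`);
    }
  }
  throw new Error(`No namespace separator found in URI: ${uri}`);
}

const uri =
  'http://jazz.net/xmlns/prod/jazz/rtc/ext/1.0/com.ibm.team.apt.attribute.acceptance';

try {
  localNameOf(uri, OLD_NOT_QNAME_CHARS);
} catch (e) {
  console.log(e.message); // the old set rejects the "." in the local name
}

console.log(localNameOf(uri, NEW_NOT_QNAME_CHARS)); // com.ibm.team.apt.attribute.acceptance
```

This sketch covers only the RDF/XML case reported here; it says nothing about the Turtle serialization tests mentioned earlier in the thread.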

TallTed, Sep 02 '25 17:09