stardog-language-servers
Turtle lang server does not allow uchar in iriref
@BonaBeavis I believe the text you provided is not valid Turtle. IRIREF entities cannot contain \u0020 (the space character), per the IRIREF grammar rule. Note that, for an IRIREF, the parser is supposed to receive the string with its escape sequences unescaped rather than still escaped, so here it sees a space character (the unescaped value of \u0020), which is not allowed. You can also double-check this with other validators to confirm that the text you provided is not valid Turtle.
The Turtle parser appears to be working correctly here and with other Unicode escape sequences. For example, if you change your provided text to the following:
<s> <o> <http://www.example.org/\u0021bar> .
the parser parses the text correctly.
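To make the reading described above concrete, here is a minimal Python sketch (not the language server's actual code) that unescapes the numeric escapes in an IRIREF body first and only then rejects the characters the IRIREF rule excludes. The function names and the example IRIs are hypothetical, and the sketch assumes that unescape-then-check is the intended order.

```python
import re

# Matches the numeric escapes \uXXXX and \UXXXXXXXX (the UCHAR production).
UCHAR = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")

# Characters the IRIREF production excludes when they appear literally:
# U+0000 through U+0020, and < > " { } | ^ ` \
FORBIDDEN = re.compile(r'[\x00-\x20<>"{}|^`\\]')


def unescape_uchar(body: str) -> str:
    """Replace each UCHAR escape with the code point it denotes."""
    return UCHAR.sub(lambda m: chr(int(m.group(1) or m.group(2), 16)), body)


def iriref_ok_after_unescaping(body: str) -> bool:
    """Hypothetical check: unescape first, then reject forbidden characters."""
    return FORBIDDEN.search(unescape_uchar(body)) is None


# \u0021 is "!", which is allowed; \u0020 is a space, which is not.
print(iriref_ok_after_unescaping(r"http://www.example.org/\u0021bar"))  # True
print(iriref_ok_after_unescaping(r"http://www.example.org/\u0020bar"))  # False
```

Under this assumption, an escape is judged by the character it stands for, so \u0020 fails for the same reason a literal space would.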
I still do not get it, because the specs define a Turtle document as:
"A conforming Turtle document is a Unicode string that conforms to the grammar and additional constraints defined in section 6. Turtle Grammar, starting with the turtleDoc production. A Turtle document serializes an RDF Graph."
And the document does conform to the Turtle grammar: yacker: turtleEsc validation results.
But as you said, the parser should unescape the IRIREF, and that would produce an invalid IRI.
I can think of three possible resolutions:
- The rule for uchar is simplified and omits the forbidden Unicode escapes.
- "Unescaped" is just a typo and "escaped" was the intention.
- I overlooked some "additional constraints" defined in the spec.
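For comparison, the grammar-only reading can be sketched in Python by transcribing the IRIREF production from the Turtle grammar into a regular expression and matching it against the still-escaped text. This is only an illustration: the function name and example IRIs are hypothetical, and it does not settle which of the resolutions above is the intended one.

```python
import re

# Transcription of the Turtle production
#   IRIREF ::= '<' ([^#x00-#x20<>"{}|^`\] | UCHAR)* '>'
# with UCHAR written out as \uXXXX or \UXXXXXXXX.
IRIREF = re.compile(
    r'<(?:[^\x00-\x20<>"{}|^`\\]|\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8})*>'
)


def matches_iriref_production(token: str) -> bool:
    """Hypothetical check: match the raw, still-escaped token only."""
    return IRIREF.fullmatch(token) is not None


# The escaped space matches through the UCHAR alternative ...
print(matches_iriref_production(r"<http://www.example.org/\u0020bar>"))  # True
# ... while a literal space is excluded by the character class.
print(matches_iriref_production("<http://www.example.org/ bar>"))  # False
```

Read this way, the production alone accepts \u0020, so any rejection has to come from a constraint applied after unescaping.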