webidl
Replace the word "quoted" in the IDL grammar section
https://heycam.github.io/webidl/#idl-grammar
Thus, the input text “long” is tokenized as the quoted terminal symbol long rather than an identifier called "long", and “.” is tokenized as the quoted terminal symbol . rather than an other.
This wording dates from when terminal symbols were actually quoted with ". That is no longer true, as they are now just set in monospace.
Because the terminals defined in terms of regular expressions are referred to as “named terminals,” I’ve been calling the others “unnamed” or “anonymous” personally. Do either of these seem like good alternatives?
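To make the rule under discussion concrete, here is a minimal sketch of "longest match, then prefer the non-regex terminal". The regular expressions are written from memory of the spec's grammar and cover only some of the regex-defined terminals, and `LITERAL_TERMINALS` is a tiny illustrative subset; both are assumptions, as are the names `NAMED_TERMINALS` and `tokenize`. It does not handle multi-character punctuation such as `...`.

```python
import re

# Assumed, simplified patterns modeled on the regex-defined ("named") terminals.
NAMED_TERMINALS = [
    ("integer", re.compile(r"-?([1-9][0-9]*|0[Xx][0-9A-Fa-f]+|0[0-7]*)")),
    ("identifier", re.compile(r"[_-]?[A-Za-z][0-9A-Z_a-z-]*")),
    ("whitespace", re.compile(r"[\t\n\r ]+")),
    ("other", re.compile(r"[^\t\n\r 0-9A-Za-z]")),
]

# Hypothetical subset of the terminals that appear directly in the grammar.
LITERAL_TERMINALS = {"long", "optional", ".", "-", "["}


def tokenize(text):
    """Return (kind, lexeme) pairs; whitespace is dropped."""
    tokens = []
    pos = 0
    while pos < len(text):
        # Step 1: find the longest match among the named terminals.
        best_name, best_len = None, 0
        for name, pattern in NAMED_TERMINALS:
            match = pattern.match(text, pos)
            if match and len(match.group()) > best_len:
                best_name, best_len = name, len(match.group())
        if best_name is None:
            raise ValueError(f"no terminal matches at offset {pos}")
        lexeme = text[pos:pos + best_len]
        # Step 2: if that lexeme is also a literal terminal, the literal wins.
        if lexeme in LITERAL_TERMINALS:
            tokens.append(("literal", lexeme))
        elif best_name != "whitespace":
            tokens.append((best_name, lexeme))
        pos += best_len
    return tokens
```

Under these assumptions, `tokenize("long")` produces a literal token, while `tokenize("longer")` and `tokenize("a1")` each produce a single identifier, because the longest match is taken before the literal check is applied.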
In the same paragraph:
“.” is tokenized as the quoted terminal symbol . rather than an other.
The "." and "-" terminals are only “anonymous” terminals because they appear as explicit alternatives of Other (the non-terminal). That is redundant, since they would otherwise be matched by other (the terminal), which is itself an alternative of Other. This doesn’t technically cause any problems, but it is probably accidental that it still exists, and it makes “.” seem like a poor choice for an example. The paragraph also says that “The tokenizer operates on a sequence of Unicode characters [UNICODE],” which seems like it should probably say Unicode scalar values (since elsewhere it is taken for granted that the string value of a string terminal contains only USVs).
Those things are unrelated but the density of curious / slightly off things in one paragraph suggests that maybe the whole paragraph should get an overhaul?
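Both observations can be checked with a short sketch. The pattern for other is written from memory of the grammar and should be treated as an assumption, and the helper names `matches_other` and `is_scalar_value_string` are hypothetical.

```python
import re

# Assumed pattern for the regex-defined terminal `other`.
OTHER = re.compile(r"[^\t\n\r 0-9A-Za-z]")


def matches_other(text: str) -> bool:
    """True if the whole input is a single match of the `other` pattern."""
    return OTHER.fullmatch(text) is not None


def is_scalar_value_string(text: str) -> bool:
    """True if every code point is a Unicode scalar value (no surrogates)."""
    return not any(0xD800 <= ord(ch) <= 0xDFFF for ch in text)
```

With this pattern, `matches_other(".")` and `matches_other("-")` both hold, which is why listing them again as alternatives of Other adds nothing. A string containing a lone surrogate such as `"\ud800"` is a sequence of code points but fails `is_scalar_value_string`, which is the distinction the "Unicode characters" wording glosses over.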
If the longest possible match could match one of the above named terminal symbols
This "above named" reads more like https://www.merriam-webster.com/dictionary/abovenamed to me, and I'm not sure the others are really "anonymous", since I can certainly refer to optional as the "optional" terminal symbol.
Personally I'm calling them regex terminal symbols and inline terminal symbols (since the latter are defined "inline", without a definition table). I'm open to other suggestions.
Those seem good to me.
FWIW: The anonymity of optional, from my POV, was that it’s a terminal symbol that’s not assigned a name, whereas the terminal symbol “decimal”, say, is expressly given a referenceable name. That the code point sequence described by the literal optional reads as a word in this case (but not in the case of e.g. [) didn’t seem to change this to me. I’m just explaining my rationale, though, not arguing for it. I think that “regex” and “inline” are probably better choices and require much less explanation :)
I think "literal terminal symbol" would be even clearer than "inline terminal".
"literal terminal symbol"
I agree that this is a good description. My only concern is that it may make it even easier to miss that these are “genuine” terminal symbols on the same footing as those defined using regular expressions, not “refinements” that constrain the latter to specific source character sequences as is common in other language specs. Maybe that’s a separate issue, but it’s caused repeated problems, so I worry about confusing it further.