Text parser incorrectly requires a numeric stop character after text keywords
The Ion Spec does not require a non-numeric stop character after the keywords true, false, or any of the variants of null. However, ion-rs raises an error when these keywords are followed by something other than a numeric stop character. Here are some examples of valid inputs for which an error is raised:
| Input | Expected |
|---|---|
| (true!=false) | ( true '!=' false ) |
| (null+1) | ( null '+' 1 ) |
| null.float+inf | null.float +inf |
| true+inf | true +inf |
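
To make the failure mode concrete, here is a minimal Rust sketch. It is not ion-rs's actual code. It assumes the numeric stop characters are the fifteen I understand the Ion text spec to list (`{}[](),"'` plus space, tab, newline, carriage return, vertical tab, and form feed), and `strict_rule_accepts` is a hypothetical helper that models a parser demanding one of them after a keyword.

```rust
/// The numeric stop characters as this sketch assumes them to be;
/// the Ion spec is the authoritative source for this list.
const NUMERIC_STOP_CHARS: &[char] = &[
    '{', '}', '[', ']', '(', ')', ',', '"', '\'',
    ' ', '\t', '\n', '\r', '\u{0B}', '\u{0C}',
];

/// Hypothetical model of the over-strict rule: `text` must begin with
/// `keyword`, and the keyword must be followed by end of input or by a
/// numeric stop character.
fn strict_rule_accepts(text: &str, keyword: &str) -> bool {
    match text.strip_prefix(keyword) {
        // The text does not start with the keyword at all.
        None => false,
        Some(rest) => match rest.chars().next() {
            // The keyword is the last thing in the input.
            None => true,
            // Otherwise the next character decides.
            Some(c) => NUMERIC_STOP_CHARS.contains(&c),
        },
    }
}
```

Under this model, `strict_rule_accepts("true!=false)", "true")` and `strict_rule_accepts("null.float+inf", "null.float")` both return `false`, because `!` and `+` are not numeric stop characters, while `strict_rule_accepts("true +inf", "true")` returns `true`.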
> The Ion Spec does not require a non-numeric stop character after the keywords true, false, or any of the variants of null.
What's interesting is that there's definitely some class of character that ends these keywords; the spec just doesn't spell out what it is. For example, we consider falseTeeth to be an identifier, not the keyword false followed by the identifier Teeth. Whitespace is definitely in that character class (false Teeth), along with container delimiters ([false,Teeth]).
Inside an s-expression, I can see how null+1 is acceptable, since + is an operator in that context. However, given that, I would expect the s-expression (null+inf) to be parsed as ( null + inf ), not ( null +inf ).
I don't think I expect true+inf to be valid; I certainly wouldn't consider true5e0 to be valid.
> I would expect the s-expression (null+inf) to be parsed as ( null + inf ), not ( null +inf )
It seems pretty clear that (null+1) should be ( null '+' 1 ), but there are cases where the Ion specification is not clear about the correct interpretation. Your example of (null+inf) is one of them. Another might be (null-1). I created https://github.com/amzn/ion-docs/issues/176 to ask how these cases should be handled.
> I don't think I expect true+inf to be valid; I certainly wouldn't consider true5e0 to be valid.
I brought up this example because ion-java does allow true+inf.
It could be a bug in ion-java, but ion-java and ion-rust both allow things like []-1 and print("Hello world!"), and I can't find anything in the Ion specification requiring any particular delimitation between top-level values, only that numeric values (and timestamps) are terminated by a numeric stop character.
I don't think anyone would expect true5e0 to be true 5e0 because true5e0 is a valid identifier symbol, whereas true+inf is not because it contains +.
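
One way to picture that distinction is a longest-match rule: consume the longest run of identifier characters, and only then decide whether the run is a keyword. The Rust sketch below illustrates that reading; it is an assumption of mine, not something the spec states and not how ion-rs or ion-java are implemented. The names `Leading`, `is_identifier_char`, and `leading_token` are hypothetical, and typed nulls such as null.float are deliberately left out.

```rust
/// How the leading token of some Ion text is classified in this sketch.
#[derive(Debug, PartialEq)]
enum Leading<'a> {
    /// `true`, `false`, or `null`, ended by a non-identifier character
    /// or by the end of the input.
    Keyword(&'a str),
    /// A longer identifier that merely starts with a keyword.
    Identifier(&'a str),
}

/// Characters allowed inside an unquoted identifier symbol:
/// ASCII letters, digits, `_`, and `$`.
fn is_identifier_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_' || c == '$'
}

/// Take the longest run of identifier characters from the front of
/// `input`, classify it, and return it with the unconsumed remainder.
/// Returns `None` if the input does not start with an identifier
/// (identifiers may not begin with a digit).
fn leading_token(input: &str) -> Option<(Leading<'_>, &str)> {
    let first = input.chars().next()?;
    if !is_identifier_char(first) || first.is_ascii_digit() {
        return None;
    }
    // Byte index of the first character that cannot continue the run.
    let end = input
        .find(|c: char| !is_identifier_char(c))
        .unwrap_or(input.len());
    let (run, rest) = input.split_at(end);
    let token = match run {
        "true" | "false" | "null" => Leading::Keyword(run),
        _ => Leading::Identifier(run),
    };
    Some((token, rest))
}
```

With this rule, `leading_token("true5e0")` yields `Identifier("true5e0")` with nothing left over, while `leading_token("true+inf")` yields `Keyword("true")` with `"+inf"` remaining, because `+` cannot be part of an identifier. What a parser should do with that remainder is exactly the open question here.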
> I can't find anything in the Ion specification requiring any particular delimitation between top-level values, only that numeric values (and timestamps) are terminated by a numeric stop character.
This is the problem I intended to highlight; the other examples were just illustrations.