BFO-2020
Use of the '@' symbol in Common Logic files causes Hets to throw a parsing error
The cl:comment in the first few lines of each of these files contains an email address, which of course uses an '@' symbol. Whenever I try to load this file in the online Hets toolkit (rest.hets.eu), I get the following error message: "unexpected '@' / expecting ' ' '. [i.e., expecting a single quote]" (comment in brackets is my own).
I'm still not absolutely certain why this is an issue, but looking at the CLIF specification I noticed that the '@' symbol is not listed in Section A.2.2.4 under the characters that can be used to form lexical tokens (see attached). Because (1) this email address is part of a quoted string, (2) quoted strings are considered lexical tokens in CLIF, and (3) lexical tokens can only contain members of the sets of characters, delimiters, or whitespace that are defined in the specification, I believe this is why Hets is throwing this error.
Solution: Replacing the '@' symbol with '(at)' or something along those lines fixes this. Please note that, to my knowledge, you unfortunately cannot escape it with a backslash, because in CLIF the backslash is reserved for escaping single or double quotes within quoted strings.
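For anyone who wants to apply this workaround in bulk, here is a minimal sketch in Python. The function name `replace_at_sign` and the sample comment string are hypothetical and made up for illustration; nothing here comes from the BFO-2020 files or from Hets itself. It does a blanket replacement across the whole text, which assumes '@' only shows up inside comments.

```python
def replace_at_sign(clif_text: str, replacement: str = "(at)") -> str:
    """Replace every '@' in the given CLIF text with a plain-text stand-in.

    Assumption: '@' only occurs inside quoted comment strings (e.g. an
    email address), so a blanket replacement does not change any axioms.
    """
    return clif_text.replace("@", replacement)


if __name__ == "__main__":
    # Hypothetical sample line, not copied from the actual BFO-2020 files.
    sample = "(cl:comment 'Contact: someone@example.org')"
    print(replace_at_sign(sample))
```

It would be worth checking the resulting diff by hand before committing, in case '@' turns up somewhere other than the comment header.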
[Attached image: "Screen Shot 2023-03-27 at 6 42 47 PM"]
This looks like a spec bug. It says: "This includes all the alphanumeric characters", but then that disagrees with the production. Who wins? It can't be an intentional omission.
Not that it's a better option, but you can use any Unicode character by escaping it with \u or \U. Has HETS been updated for the 2018 Common Logic spec? If not, there might be other problems. cl:outdiscourse is defined in 2018 but not 2007. Looks like cl:ttl is also new.
I changed my source to use (at) in the future. If you want to submit a PR fixing the current files, that's welcome. Otherwise I'll get to it at some point.
It was pointed out to me that @ isn't an alphanumeric character. But the sentence starts "char is all the remaining ASCII non-control characters", so that includes @.
@alanruttenberg, yeah, I find it very bizarre as well and thought it might have been omitted by mistake. I might reach out to the Hets folks to see if this is a feature or a bug on either their end or the spec's. I can also make the pull request tomorrow.
Oh, it looks like I misread the spec and \u, like you said, can be used to escape any Unicode.
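If the \u route is preferred over '(at)', a rough sketch of the substitution could look like the following. It assumes the four-hex-digit form of the escape, so '@' (U+0040) becomes `\u0040`; the function name `escape_at_sign` is hypothetical, and whether Hets actually accepts the escaped form has not been tested here.

```python
def escape_at_sign(clif_text: str) -> str:
    """Replace every '@' with the escape sequence \\u0040.

    Assumption: the escape is a backslash, a lowercase 'u', and exactly
    four hexadecimal digits; '@' is code point U+0040.
    """
    return clif_text.replace("@", "\\u0040")


if __name__ == "__main__":
    # Hypothetical sample line, not copied from the actual BFO-2020 files.
    sample = "(cl:comment 'Contact: someone@example.org')"
    print(escape_at_sign(sample))
```

The '(at)' replacement is easier to read in the raw file, while the escape keeps the original character recoverable by any parser that honors it.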