gocc
How can I specify a Unicode category in a lexer?
I'm experimenting with the Haskell syntax, which allows any character from the uppercase (Lu) Unicode category in certain places and not in others. Is there a way to do that in gocc? Specifically, the Haskell lexical syntax uses the Unicode categories for whitespace, uppercase/titlecase and lowercase letters, symbols/punctuation, and decimal digits.
I think you could probably build your own Unicode categories in the lexer, but gocc does not provide them out of the box.
But it is a cool use case.
Yeah, I was hoping I wouldn't have to define the productions myself. I wish it was one easy char range, but no. ;)
@awalterschulze Thanks!