How can I specify a Unicode category in a lexer?

Open willfaught opened this issue 8 years ago • 4 comments

I'm experimenting with Haskell's syntax, which allows any character in the Unicode uppercase category (Lu) in certain places and not in others. Is there a way to do that in gocc? Specifically, Haskell uses the Unicode categories for whitespace, uppercase/titlecase and lowercase letters, symbols/punctuation, and decimal digits.

willfaught avatar Apr 29 '16 05:04 willfaught

I think you could probably build your own Unicode categories in the lexer, but gocc does not provide them out of the box.

awalterschulze avatar Apr 29 '16 07:04 awalterschulze

But it is a cool use case.

awalterschulze avatar Apr 29 '16 07:04 awalterschulze

Yeah, I was hoping I wouldn't have to define the productions myself. I wish it was one easy char range, but no. ;)

willfaught avatar Apr 29 '16 18:04 willfaught

@awalterschulze Thanks!

willfaught avatar Apr 29 '16 18:04 willfaught