moo
Unicode support for keywords
Since the /u flag is supported now, is there a convenient way to define a rule using an array of keywords with Unicode enabled? Something like:
const keywords = ['foo', 'bar'];
moo.compile({
  KEY: {
    match: keywords,
    type: moo.keywords({KEY: keywords}),
    unicode: true,
  },
});
As I understand it, moo.keywords in the Unicode scenario only works if the "match" is a pattern with the /u flag.
moo.keywords only works properly when you use it on a matcher that matches anything that could be a word, not just keywords. For example, this lexer doesn't work the way you seem to expect it to:
const moo = require('moo')
const KW = ['ban', 'this']
const lexer = moo.compile({
  kw: {match: KW, type: moo.keywords({kw: KW})},
  w: /[A-Za-z_][\w]*/,
  ws: / +/,
})
lexer.reset('banana ban')
lexer.next() // {type: 'kw', value: 'ban'}
lexer.next() // {type: 'w', value: 'ana'}
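To make the failure mode concrete, here is a rough plain-JavaScript model of a lexer that tries the keyword literals before the general word rule. It is only a sketch of the rule ordering, not moo's actual implementation; the names `rules`, `combined`, and `tokenize` are made up for illustration.

```javascript
// Simplified model: each rule becomes one capture group in a single sticky regex.
// The keyword literals come first, so they win whenever they match a prefix.
const rules = [
  ['kw', /ban|this/],
  ['w', /[A-Za-z_]\w*/],
  ['ws', / +/],
];

const combined = new RegExp(
  rules.map(([, re]) => `(${re.source})`).join('|'),
  'y' // sticky: match exactly at lastIndex
);

function tokenize(input) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    combined.lastIndex = pos;
    const m = combined.exec(input);
    if (m === null) throw new Error(`no rule matches at index ${pos}`);
    // The first defined capture group tells us which rule matched.
    const group = m.findIndex((text, i) => i > 0 && text !== undefined);
    tokens.push({type: rules[group - 1][0], value: m[0]});
    pos = combined.lastIndex;
  }
  return tokens;
}

console.log(tokenize('banana ban'));
```

The literal `ban` matches the first three characters of `banana`, so the word rule only ever sees the leftover `ana`.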
The normal use case for moo.keywords looks like this:
const moo = require('moo')
const KW = ['ban', 'this']
const lexer = moo.compile({
  w: {match: /[A-Za-z_][\w]*/, type: moo.keywords({kw: KW})},
  ws: / +/,
})
lexer.reset('banana ban')
lexer.next() // {type: 'w', value: 'banana'}
lexer.next() // {type: 'ws', value: ' '}
lexer.next() // {type: 'kw', value: 'ban'}
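The idea behind this pattern can be sketched without moo: match the whole word with the broad pattern first, then look the matched text up in a keyword table. The helpers `keywordClassifier` and `readWord` below are hypothetical and only approximate what moo.keywords does, under the assumption that it is a value-to-type lookup.

```javascript
// Hypothetical helper: build a lookup from keyword text to token type.
function keywordClassifier(groups) {
  const byValue = new Map();
  for (const [type, words] of Object.entries(groups)) {
    for (const word of words) byValue.set(word, type);
  }
  // Return the keyword type if the value is a keyword, else the fallback type.
  return (value, fallbackType) =>
    byValue.has(value) ? byValue.get(value) : fallbackType;
}

const classify = keywordClassifier({kw: ['ban', 'this']});
const wordPattern = /[A-Za-z_]\w*/y; // sticky: match exactly at lastIndex

// Match the longest possible word at `pos`, then decide whether it is a keyword.
function readWord(input, pos) {
  wordPattern.lastIndex = pos;
  const m = wordPattern.exec(input);
  if (m === null) return null;
  return {type: classify(m[0], 'w'), value: m[0]};
}

console.log(readWord('banana ban', 0)); // the whole word 'banana'
console.log(readWord('banana ban', 7)); // the keyword 'ban'
```

Because the word is consumed in full before the lookup, `banana` is never split at the `ban` prefix.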
It actually works fine with Unicode as-is:
const moo = require('moo')
const KW = ['η', 'ο', 'το', 'οι', 'τα']
const lexer = moo.compile({
  w: {match: /\p{XIDS}\p{XIDC}*/u, type: moo.keywords({kw: KW})},
  // White_Space is the long property name; short aliases vary in engine support
  ws: {match: /\p{White_Space}+/u, lineBreaks: true},
})
lexer.reset('η ηθική')
lexer.next() // {type: 'kw', value: 'η'}
lexer.next() // {type: 'ws', value: ' '}
lexer.next() // {type: 'w', value: 'ηθική'}
We also already allow string literal and array matches to be combined with /u regular expressions, so I'm not sure what you're asking for here.
(Some of these changes haven't been published to npm yet [@tjvr]; maybe that's where the confusion is coming from?)
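As an illustration of how literal strings and a /u pattern can coexist in one matcher, here is a minimal sketch in plain JavaScript. It does not use moo and is not how moo is implemented internally; `escapeLiteral`, `combinedUnicode`, and `tokenizeUnicode` are made-up names, and the `op`/`w` token types are assumptions for the example.

```javascript
// Escape regex syntax characters so a literal string is matched verbatim.
// Only syntax characters are escaped, which remains valid under the /u flag.
function escapeLiteral(text) {
  return text.replace(/[\\^$.*+?()[\]{}|]/g, '\\$&');
}

const literals = ['(', ')', '+']; // array-style match
const word = /\p{L}+/u;           // Unicode-aware pattern

// Group 1: any of the literals. Group 2: the /u word pattern.
const combinedUnicode = new RegExp(
  `(${literals.map(escapeLiteral).join('|')})|(${word.source})`,
  'yu' // sticky + unicode
);

function tokenizeUnicode(input) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    combinedUnicode.lastIndex = pos;
    const m = combinedUnicode.exec(input);
    if (m === null) throw new Error(`no rule matches at index ${pos}`);
    tokens.push({type: m[1] !== undefined ? 'op' : 'w', value: m[0]});
    pos = combinedUnicode.lastIndex;
  }
  return tokens;
}

console.log(tokenizeUnicode('α+(β)'));
```

The point of the sketch is that the literals only need escaping; the /u flag applies to the combined expression as a whole.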
Thanks, Nathan. After seeing the first two examples it became much clearer.
Regarding the array match combined with /u: I haven't found that in the docs or in the tests.
We should probably have a test for that. The /u tests are a bit sparse at the moment.
When’s the next npm publish planned?
I've published 0.5.1. :+1: