Skipping whitespace tokens

Open jarble opened this issue 3 years ago • 3 comments

Is it possible to skip tokens when defining a lexer? I want to split a string into a list of tokens without whitespace, but I don't know if Moo can do this:

Input string:

  • "while ( a < 3 ) { a += 1; }"

List of tokens:

  • ["while","(","a","<","3",")","{","a","+=","1",";","}"]

jarble avatar May 18 '21 15:05 jarble

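const moo = require('moo')
const lex = moo.compile({
  // Runs of whitespace; lineBreaks lets moo keep line numbers correct
  ws: {match: /\p{White_Space}+/u, lineBreaks: true},
  // Identifiers and keywords
  word: /\p{XID_Start}\p{XID_Continue}*/u,
  // Everything the rules above do not match
  op: moo.fallback,
})
// Spread the lexer into an array, drop whitespace tokens, keep only the text.
// Note: a fallback token spans all text up to the next match, so `1;`
// comes out as one 'op' token rather than as '1' and ';'.
;[...lex.reset('while ( a < 3 ) { a += 1; }')]
.filter(t => t.type !== 'ws')
.map(t => t.value)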

nathan avatar May 18 '21 17:05 nathan

@nathan The documentation doesn't describe this feature: does it need to be updated?

jarble avatar May 18 '21 18:05 jarble

The documentation needs to be updated to document moo.fallback (see #112).

As for the rest, I think Nathan's just demonstrating that since a moo lexer object is iterable, you can spread it into an array and then use filter() and map(), which are built into JavaScript.
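A minimal sketch of that idea, without depending on moo: `toyLexer` below is a hypothetical stand-in (its name and its rules are assumptions made for illustration, not moo's API or implementation). It only shows that anything iterable which yields `{type, value}` objects can be spread into an array and then filtered and mapped.

```javascript
// Hypothetical stand-in for a lexer: NOT moo, just a generator that yields
// {type, value} objects shaped like the tokens discussed in this thread.
function* toyLexer(input) {
  // Sticky (y) regex with one named group per token type.
  // These rules are assumptions chosen to fit the example input.
  const rule =
    /(?<ws>\s+)|(?<word>[A-Za-z_]\w*)|(?<number>\d+)|(?<op>[-+*\/<>=!]=?|[(){};])/y;
  while (rule.lastIndex < input.length) {
    const position = rule.lastIndex;
    const match = rule.exec(input);
    if (match === null) {
      throw new Error(`Unexpected character at offset ${position}`);
    }
    // The token type is the name of the group that actually matched.
    const [type] = Object.entries(match.groups).find(
      ([, text]) => text !== undefined
    );
    yield { type, value: match[0] };
  }
}

// Spread the iterable into an array, then use ordinary Array methods.
const tokens = [...toyLexer('while ( a < 3 ) { a += 1; }')]
  .filter(t => t.type !== 'ws') // skip whitespace tokens
  .map(t => t.value);           // keep only the matched text

console.log(tokens);
```

Because this sketch has separate `number` and `op` rules, `1` and `;` come out as separate tokens; with a single catch-all rule they would not necessarily be split.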

tjvr avatar May 18 '21 19:05 tjvr