Precompilation?

Open bgub opened this issue 6 years ago • 12 comments

I'd love to use Moo for a library I'm building, but it needs to be extremely lightweight (even smaller than 4KB minzipped).

Is there a possibility of adding an option to precompile, a step which would basically moo.compile a grammar and then output a generated, more lightweight JS file that could be tweaked and customized?

bgub avatar Aug 08 '19 02:08 bgub

It's not something we've thought about.

I haven't checked in detail, but I would guess about half of Moo's source code deals with compilation. The other half is used at runtime. Would 1–2K be small enough? (Of course, this is in addition to the tokenizer RegExp itself.)

tjvr avatar Aug 08 '19 07:08 tjvr

Yes, I think it would. I was also thinking that once I generated a new JS file, I could tweak some things by hand, like removing features I don't need.

bgub avatar Aug 08 '19 12:08 bgub

Would Tree-Shaking assist here?

  • https://github.com/rollup/rollup#tree-shaking

bd82 avatar Aug 08 '19 14:08 bd82

@bd82 I don't think so, because I want to find a way to have some things (like building the RegExp) happen before runtime.

It would be great if there was some way to just save the lexer in a separate JS file after generation.

bgub avatar Aug 08 '19 15:08 bgub

Just for fun, here's a Gist which provides a silly (albeit working) approach to compiling a Moo lexer.

It's silly because it extracts the Lexer and LexerIterator class definitions from inside moo.js in a rather gross way. I doubt we'd ever consider merging this code 🙃 If we wanted to support this properly, we'd probably want to split up moo.js into two or three parts: in particular, you'd want the runtime structures (i.e. the Lexer class) to be separate from everything that builds the tokenizer, so that you can import just the runtime in your code.
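
To make the split concrete, here is a minimal hand-rolled sketch of the idea. It does not use Moo's code or API: `precompile`, `emitModule` and `createLexer` are hypothetical names, and only string literals and RegExp rules without capture groups are handled. The build step reduces the rules to plain data that can be written to a file, and the runtime needs nothing but that data.

```javascript
// Build step (hypothetical): escape a string literal for use inside a RegExp.
function escapeLiteral(s) {
  return s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

// Build step: combine the rules into one alternation, one capture group per rule.
// Assumes the RegExp rules themselves contain no capture groups.
function precompile(rules) {
  const names = Object.keys(rules);
  const parts = names.map((name) => {
    const rule = rules[name];
    const source = typeof rule === 'string' ? escapeLiteral(rule) : rule.source;
    return '(' + source + ')';
  });
  return { names, source: parts.join('|') };
}

// Build step: the text of a standalone module that could be saved to disk.
function emitModule(compiled) {
  return 'module.exports = ' + JSON.stringify(compiled) + ';\n';
}

// Runtime: everything needed to tokenize, given only the precompiled data.
function createLexer(compiled) {
  const re = new RegExp(compiled.source, 'y'); // sticky: match exactly at lastIndex
  return function* tokenize(input) {
    let index = 0;
    while (index < input.length) {
      re.lastIndex = index;
      const match = re.exec(input);
      if (match === null || match[0].length === 0) {
        throw new Error('invalid syntax at offset ' + index);
      }
      // The first defined capture group tells us which rule matched.
      const group = match.findIndex((g, i) => i > 0 && g !== undefined);
      yield { type: compiled.names[group - 1], value: match[0], offset: index };
      index = re.lastIndex;
    }
  };
}

// Usage: round-trip through JSON to mimic loading the saved file.
const compiled = precompile({ ws: /[ \t]+/, number: /[0-9]+/, plus: '+' });
const moduleText = emitModule(compiled);
const tokenize = createLexer(JSON.parse(JSON.stringify(compiled)));
const tokens = [...tokenize('12 + 3')];
```

Only `createLexer` and the emitted data would need to ship; `escapeLiteral`, `precompile` and `emitModule` stay in the build tooling.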

Some stats:

  • moo.js is 17682 bytes, 4949 gzipped.
  • A tiny example tokenizer is 5981 bytes, 1817 gzipped.

tjvr avatar Aug 09 '19 20:08 tjvr

Thanks @tjvr, that's pretty nifty! I agree the code would probably not be clean enough to merge, but I really like the idea of separate runtime structures.

bgub avatar Aug 12 '19 12:08 bgub

Hi @tjvr! I'm considering using Moo to build a template engine, and wondered whether you'd considered moving the runtime structures out of the main moo.js?

bgub avatar Sep 26 '19 01:09 bgub

Also, I've recently been digging into the source code. Could you explain what fast is?

bgub avatar Sep 26 '19 01:09 bgub

> Could you explain what fast is?

It comes from https://github.com/no-context/moo/pull/40 / https://github.com/no-context/moo/pull/103. It makes single-character tokens significantly faster.
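
Roughly, the idea can be sketched like this. This is a hand-written illustration, not Moo's actual code: `buildFastTable` and `nextToken` are hypothetical names, and the sketch assumes no other rule can start with one of the single-character literals.

```javascript
// Map the character code of each single-character literal to its token type.
function buildFastTable(literals) {
  const fast = Object.create(null);
  for (const [type, ch] of Object.entries(literals)) {
    if (ch.length !== 1) {
      throw new Error('only single-character literals are supported');
    }
    fast[ch.charCodeAt(0)] = type;
  }
  return fast;
}

// Try the table lookup first; fall back to the (slower) sticky RegExp.
function nextToken(input, index, fast, slowRegExp, slowType) {
  const type = fast[input.charCodeAt(index)];
  if (type !== undefined) {
    return { type, value: input[index], next: index + 1 };
  }
  slowRegExp.lastIndex = index;
  const match = slowRegExp.exec(input);
  if (match === null) return null;
  return { type: slowType, value: match[0], next: slowRegExp.lastIndex };
}

// Usage: '(' takes the fast path, 'abc' falls through to the RegExp.
const fastTable = buildFastTable({ lparen: '(', rparen: ')' });
const word = /[a-z]+/y;
const first = nextToken('(abc)', 0, fastTable, word, 'word');
const second = nextToken('(abc)', first.next, fastTable, word, 'word');
```

The table lookup skips running the RegExp entirely for those characters, which is where the saving comes from.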

nathan avatar Sep 26 '19 01:09 nathan

@nathan thanks!

I seem to remember once running a benchmark that showed that str[0] actually ended up being faster than str.charCodeAt(0). Is there a reason why charCodeAt was chosen?
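
Something like the following could be used to re-check this on a current engine. It is only a rough sketch: the function names are made up for illustration, it assumes a global `performance.now()` (Node 16+ or a browser), and micro-benchmarks like this are noisy and vary between engines and versions.

```javascript
// Count occurrences of `ch` by comparing one-character strings.
function countWithIndex(str, ch) {
  let n = 0;
  for (let i = 0; i < str.length; i++) {
    if (str[i] === ch) n++;
  }
  return n;
}

// Count occurrences of `ch` by comparing character codes.
function countWithCharCode(str, ch) {
  const code = ch.charCodeAt(0);
  let n = 0;
  for (let i = 0; i < str.length; i++) {
    if (str.charCodeAt(i) === code) n++;
  }
  return n;
}

// Run `fn` several times and report the elapsed wall-clock time.
function time(fn, str, ch, rounds) {
  const start = performance.now();
  let total = 0;
  for (let r = 0; r < rounds; r++) {
    total += fn(str, ch); // keep the result so the loop is not optimized away
  }
  return { total, ms: performance.now() - start };
}

const sample = 'a(b)c'.repeat(20000);
const byIndex = time(countWithIndex, sample, '(', 50);
const byCode = time(countWithCharCode, sample, '(', 50);
console.log('str[i]:        ' + byIndex.ms.toFixed(1) + ' ms');
console.log('charCodeAt(i): ' + byCode.ms.toFixed(1) + ' ms');
```

Both loops count the same thing, so any difference in time comes from how each character is read and compared.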

bgub avatar Sep 26 '19 01:09 bgub

@chocolateboy did your thumbs down mean you didn't approve of str.charCodeAt, or you didn't like my comment?

bgub avatar Sep 27 '19 15:09 bgub

> Is there a reason why charCodeAt was chosen?

Benchmarks at the time showed that it was slightly faster. It's certainly possible that's changed.

tjvr avatar Sep 29 '19 18:09 tjvr