antlr4
Why can't we compress the ATN?
Depending on how simply it's implemented, it could be incredibly beneficial. Personally, since I'm already using zstd in my compiler project, I wouldn't mind zstd, but a super simple compression implementation could also work.
It should not be a 3rd party lib, to avoid forcing every target to have it available in its specific environment. A simple RLE might make more sense, but it's unclear whether compressing the serialized ATN has any significant impact (on code size or runtime speed).
Maybe a new serialization format would be the better choice? However, I don't think that will ever be considered in ANTLR4. Follow the ANTLRng project instead, where this might become a reality.
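For reference, the kind of "simple RLE" mentioned above could look roughly like the sketch below. This is not ANTLR code: it assumes the serialized ATN can be treated as a flat list of integers, and the names `rle_encode` and `rle_decode` are made up for illustration.

```python
from typing import List


def rle_encode(values: List[int]) -> List[int]:
    """Encode a list as flat (count, value) pairs: [count, value, count, value, ...]."""
    out: List[int] = []
    i = 0
    while i < len(values):
        j = i
        # Advance j to the end of the current run of equal values.
        while j < len(values) and values[j] == values[i]:
            j += 1
        out.extend((j - i, values[i]))
        i = j
    return out


def rle_decode(encoded: List[int]) -> List[int]:
    """Invert rle_encode by expanding each (count, value) pair."""
    out: List[int] = []
    for k in range(0, len(encoded), 2):
        count, value = encoded[k], encoded[k + 1]
        out.extend([value] * count)
    return out


# Hypothetical data, not a real serialized ATN.
data = [4, 0, 0, 0, 0, 7, 7, 1]
encoded = rle_encode(data)
assert rle_decode(encoded) == data
```

Note that this naive form only pays off when the data contains long runs: every isolated value costs two entries instead of one, so the sample above encodes to the same length as the input. Whether a real serialized ATN has enough runs to benefit would have to be measured.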
I'm not sure whether every ATN actually needs to be fully unpacked, or whether unpacking should even be done in the parser constructor. For our largest grammar, sql/plsql, unpacking takes 0.2s in C#, but most of the grammar isn't even used for the parse. Across the entire test suite of 379 files, only 66% of the rules are used. You would think that for small tests the percentage of rules used is even lower.
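To make the "not in the constructor" idea concrete, here is a minimal sketch of deferring the unpack until the ATN is first needed. Everything in it is hypothetical: `Lazy`, `deserialize_atn`, and `SketchParser` are illustrative names and not part of any ANTLR runtime, and the "ATN" is just a dict.

```python
from typing import Callable, Generic, List, Optional, TypeVar

T = TypeVar("T")


class Lazy(Generic[T]):
    """Holds a factory and runs it at most once, on first access (not thread-safe)."""

    def __init__(self, factory: Callable[[], T]) -> None:
        self._factory = factory
        self._value: Optional[T] = None
        self._loaded = False

    def get(self) -> T:
        if not self._loaded:
            self._value = self._factory()
            self._loaded = True
        return self._value  # type: ignore[return-value]


calls: List[int] = []  # records how often the (expensive) unpack runs


def deserialize_atn(serialized: List[int]) -> dict:
    """Stand-in for the real unpacking step (hypothetical)."""
    calls.append(1)
    return {"states": list(serialized)}


class SketchParser:
    _SERIALIZED = [4, 0, 0, 0, 7]  # made-up data

    def __init__(self) -> None:
        # Constructing the parser does not unpack anything yet.
        self._atn = Lazy(lambda: deserialize_atn(self._SERIALIZED))

    def parse(self) -> int:
        # The first call pays the unpacking cost; later calls reuse the result.
        return len(self._atn.get()["states"])
```

This only moves the cost to the first parse; it does not avoid unpacking the unused rules. Unpacking per rule on demand would presumably need a serialization format that can be read piecewise.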
But is there a problem? Is this too much time or space required?
It could be optional, but zstd is brilliant and ubiquitous.