antlr4
antlr4 copied to clipboard
[Performance] Constants for Python runtime should be folded
The problem
Consider the following grammar:
grammar Test;
e: T1 | T2 | T3;
T1: 't1';
T2: 't2';
T3: 't3';
For Python runtime the following code is generated for rule e
:
def e(self):
localctx = TestParser.EContext(self, self._ctx, self.state)
self.enterRule(localctx, 0, self.RULE_e)
self._la = 0 # Token type
try:
self.enterOuterAlt(localctx, 1)
self.state = 0
_la = self._input.LA(1)
if not((((_la) & ~0x3f) == 0 and ((1 << _la) & ((1 << TestParser.T1) | (1 << TestParser.T2) | (1 << TestParser.T3))) != 0)):
self._errHandler.recoverInline(self)
else:
self._errHandler.reportMatch(self)
self.consume()
except RecognitionException as re:
localctx.exception = re
self._errHandler.reportError(self, re)
self._errHandler.recover(self, re)
finally:
self.exitRule()
return localctx
The following constant expression is not folded during interpretation:
(1 << TestParser.T1) | (1 << TestParser.T2) | (1 << TestParser.T3)
In Java bytecode it's just 14
. In this case these constant should be folded during generation.
How to check it?
I've created a benchmark: https://gist.github.com/KvanTTT/e3b355f7e321fe7f52e11ea1aa0ecbce#file-python_constant_folding_benchmark-py
It returns the following timings (python 3.8.5):
no_folding_const_refs_test: 1769 ns
no_folding_const_literals_test: 81 ns
folding: 84 ns
for the following methods:
def no_folding_const_refs_test():
return (1 << C1) | (1 << C2) | (1 << C3) | (1 << C4) | (1 << C5) | (1 << C6) | \
(1 << C7) | (1 << C8) | (1 << C9) | (1 << C10) | (1 << C11) | (1 << C12) | \
(1 << C13) | (1 << C14) | (1 << C15) | (1 << C16)
def no_folding_const_literals_test():
return (1 << 1) | (1 << 2) | (1 << 3) | (1 << 4) | (1 << 5) | (1 << 6) | \
(1 << 7) | (1 << 8) | (1 << 9) | (1 << 10) | (1 << 11) | (1 << 12) | \
(1 << 13) | (1 << 14) | (1 << 15) | (1 << 16)
def folding_test():
return 0xFFFF
no_folding_const_refs_test
works ~20 times slower than optimized versions. I vote for folding_test
because it's the shortest code and it provides more shorter generated code.
Probably some other runtimes also don't perform constant folding (I know about Java and C#, they perform constant folding).
Created a patch script that implements this optimization: https://gist.github.com/amykyta3/8285559e95074c4431c2836d78b36530
Thanks, but it should be (if it will) implemented via ANTLR tool like https://github.com/antlr/antlr4/pull/3701