lucene icon indicating copy to clipboard operation
lucene copied to clipboard

Upgrade ANTLR to version 4.11.1

Open reta opened this issue 3 years ago • 5 comments

Description

The Apache Lucene is using quite old version of ANTLR 4.5.1-1. By itself, it is not a showstopper, but more profound issue is that some ANTLR 3.x bits are used [1]. Since ANTLR 4.10.x (or even earlier), the compatibility layer with 3.x release line has been dropped in 4.x (see please [2]), which makes Apache Lucene impossile to be used with recent ANTLR 4.10.x+ releases [3]. The sample exception is below.

   >         java.lang.UnsupportedOperationException: java.io.InvalidClassException: org.antlr.v4.runtime.atn.ATN; Could not deserialize ATN with version 3 (expected 4).
   >             at [email protected]/org.antlr.v4.runtime.atn.ATNDeserializer.deserialize(ATNDeserializer.java:56)
   >             at [email protected]/org.antlr.v4.runtime.atn.ATNDeserializer.deserialize(ATNDeserializer.java:48)
   >             at [email protected]/org.apache.lucene.expressions.js.JavascriptLexer.<clinit>(JavascriptLexer.java:279)

[1] https://github.com/apache/lucene/blob/main/lucene/expressions/src/java/org/apache/lucene/expressions/js/JavascriptLexer.java#L189 [2] https://github.com/antlr/antlr4/commit/c68e127a7cf14470565d6e6ae1eff06db3e56ea7 [3] https://github.com/opensearch-project/OpenSearch/pull/4546

@uschindler @jpountz any objections in migrating to ANTLR 4.11.1? I would be happy to offer my help here, thank you.

reta avatar Sep 19 '22 18:09 reta

looks like an antlr problem, if they broke backwards compat, they prolly should have named it 5.x?

let's be careful about upgrading to new versions. newer antlr versions have historically been trappy, e.g. happily doing extremely slow things instead of simply failing at "compile time" if there are problems in the grammar.

rmuir avatar Sep 22 '22 07:09 rmuir

Thanks Robert. I would have said the same. In the worst case we should (like most projects do for ASM, e.g. forbidden apis) shade the antrlr runtime to lucenes package name and include into the expressions jar.

uschindler avatar Sep 22 '22 07:09 uschindler

@rmuir @uschindler thanks guys

looks like an antlr problem, if they broke backwards compat, they prolly should have named it 5.x?

Sadly I don't know the story, I believe it was merged / reverted / and than brought up again.

let's be careful about upgrading to new versions. newer antlr versions have historically been trappy, e.g. happily doing extremely slow things instead of simply failing at "compile time" if there are problems in the grammar.

I see the risks now, may be we could explore the route to convert problematic serialized blobs from v3 to v4?

reta avatar Sep 22 '22 12:09 reta

i'd prefer not changing anything without addressing the testing. I need to reiterate just how insanely trappy antlr v4 is. for painless to work with v4 and prevent insanely slow performance we used some tricks to fail tests instead of doing slow things: https://github.com/opensearch-project/OpenSearch/blob/main/modules/lang-painless/src/main/java/org/opensearch/painless/antlr/Walker.java#L224-L245

It is still not as good as "compile-time" checking of the grammar, because you need 100% test coverage to ensure things never go slow.

rmuir avatar Sep 22 '22 13:09 rmuir

:+1: thanks @rmuir, I will start with tests first (with respect to the changes needed) and we could make the decision having the evidence / numbers at hand.

reta avatar Sep 23 '22 13:09 reta