[ZEPPELIN-6355] Merge zeppelin-zengine to zeppelin-server
What is this PR for?
This PR merges the zeppelin-zengine module into zeppelin-server to simplify the project architecture. The zengine module contained core engine functionality for notebook management, paragraph execution, and interpreter management, which is tightly coupled with the server. This merge reduces build complexity while preserving all git history using git mv for file movements.
What type of PR is it?
Refactoring
Todos
- [x] Move all source files from zeppelin-zengine to zeppelin-server using git mv (183 files)
- [x] Merge zengine dependencies into zeppelin-server pom.xml
- [x] Update zeppelin-plugins to depend on zeppelin-server instead of zeppelin-zengine
- [x] Update zeppelin-interpreter-integration dependencies
- [x] Update zeppelin-integration dependencies
- [x] Remove zeppelin-zengine module from root pom.xml
- [x] Update GitHub Actions workflows
- [x] Remove zeppelin-zengine directory
- [X] Verify full build succeeds:
./mvnw clean package
What is the Jira issue?
- ZEPPELIN-6355
How should this be tested?
- CI
Screenshots (if appropriate)
Questions:
- Does the license files need to update? No
- Is there breaking changes for older versions? No
- Does this needs documentation? No
@Reamer could you please check it? We might resolve this issue easier than we think.
I looked into the failing Selenium test to find the cause.
My working hypothesis is that maven-shade-plugin is rewriting a string-literal, the default value of ZEPPELIN_CONFIG_STORAGE_CLASS inside zeppelin-interpreter's ZeppelinConfiguration. I didn't directly inspect the effective classpath, but the behavior suggests that when the shaded class is resolved first, the rewritten default appears, and when the original class is resolved first the unmodified default remains. This inference comes from the two checks below.
~1. Shade raw-string check : Adding <rawString>true</rawString> in the relocation option makes the ZEPPELIN_CONFIG_STORAGE_CLASS default value to remain as the original literal value.~
2. Classpath order check : Reordering dependencies so that zeppelin-server appears before spark-interpreter also gives the same result as 1. Before the zengine and server merge, the zeppelin-zengine dependency also appeared before spark-interpreter.
Sharing these observations in case they help fix the failing Selenium test.
https://github.com/apache/zeppelin/blob/190ecd6a2febd58616e0fdbb2686fbd2b2ea7e7a/zeppelin-interpreter/src/main/java/org/apache/zeppelin/conf/ZeppelinConfiguration.java#L1037
There are still some references to zeppelin-zengine left in the shell scripts under /bin.
(e.g., https://github.com/jongyoul/zeppelin/blob/5cc2196bf7afc172cf6f9f7d7cc5f64f9f70b31b/bin/interpreter.sh#L115)
How about switching to the unshaded ZeppelinConfiguration as in this commit? (2. from my previous comment.) Maybe rawString fixes some cases, but by skipping required shaded classes it creates new issues.
It seems to fix all the failing jobs, though I'd prefere a more explicit fix if possible.
(Failed E2E test was fixed in other PR)
@tbonelee
Have you found any good documentation regarding rawString?
On the official site, the only reference I could find is the Javadoc link for the type used by the <relocations> property (https://maven.apache.org/components/plugins/maven-shade-plugin/apidocs/org/apache/maven/plugins/shade/relocation/SimpleRelocator.html), which doesn’t provide much detail.
I skimmed the code and initially assumed rawString was an option to exclude string literals in .java files from being shaded (https://github.com/apache/maven-shade-plugin/blob/073e7ca970c15a8cae64e2de9517fe8d471b5373/src/main/java/org/apache/maven/plugins/shade/relocation/SimpleRelocator.java#L217). I only tried the rawString option to quickly check whether our test failures were caused by shading string literals in java code, so I didn’t investigate it more rigorously.
However, looking at the issue where this option was introduced (https://issues.apache.org/jira/projects/MSHADE/issues/MSHADE-104?filter=allissues) and re-read the whole code of SimpleRelocator, the primary intent seems to be treating patterns as regex. If that’s the case, a pattern like com.google would be interpreted as a regex, which could lead to unexpected relocations.
I’ve struck through my earlier explanation about rawString since it was inaccurate. Sorry about that.
Quick question with limited context. Why is ZeppelinConfiguration in the zeppelin-interpreter module? It looks like a general configuration class, so I wondered why it lives under interpreter. Because it sits in zeppelin-interpreter, it gets shaded and tends to be used first, which might relate to the failing tests. If it’s reasonable, would moving ZeppelinConfiguration to a more general module be a better fit and also resolve the tests?
Why is
ZeppelinConfigurationin thezeppelin-interpretermodule?
In my opinion, we should split this up. ZeppelinConfiguration belongs to the Zeppelin server, and the Zeppelin interpreter should really only work on a HashMap with ConfigKey and ConfigValue.
The entire loading of the configuration and possibly also a reload should all belong to the server.
Currently, the Zeppelin server transmits the configuration from the server to the remote interpreter when it starts up. See https://github.com/apache/zeppelin/blob/a03dcbcefc4e6d36993fd537d41626e27eb8e05a/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterServer.java#L202-L207