zeppelin icon indicating copy to clipboard operation
zeppelin copied to clipboard

[ZEPPELIN-6355] Merge zeppelin-zengine to zeppelin-server

Open jongyoul opened this issue 2 months ago • 8 comments

What is this PR for?

This PR merges the zeppelin-zengine module into zeppelin-server to simplify the project architecture. The zengine module contained core engine functionality for notebook management, paragraph execution, and interpreter management, which is tightly coupled with the server. This merge reduces build complexity while preserving all git history using git mv for file movements.

What type of PR is it?

Refactoring

Todos

  • [x] Move all source files from zeppelin-zengine to zeppelin-server using git mv (183 files)
  • [x] Merge zengine dependencies into zeppelin-server pom.xml
  • [x] Update zeppelin-plugins to depend on zeppelin-server instead of zeppelin-zengine
  • [x] Update zeppelin-interpreter-integration dependencies
  • [x] Update zeppelin-integration dependencies
  • [x] Remove zeppelin-zengine module from root pom.xml
  • [x] Update GitHub Actions workflows
  • [x] Remove zeppelin-zengine directory
  • [X] Verify full build succeeds: ./mvnw clean package

What is the Jira issue?

  • ZEPPELIN-6355

How should this be tested?

  • CI

Screenshots (if appropriate)

Questions:

  • Does the license files need to update? No
  • Is there breaking changes for older versions? No
  • Does this needs documentation? No

jongyoul avatar Oct 04 '25 04:10 jongyoul

@Reamer could you please check it? We might resolve this issue easier than we think.

jongyoul avatar Oct 04 '25 08:10 jongyoul

I looked into the failing Selenium test to find the cause.

My working hypothesis is that maven-shade-plugin is rewriting a string-literal, the default value of ZEPPELIN_CONFIG_STORAGE_CLASS inside zeppelin-interpreter's ZeppelinConfiguration. I didn't directly inspect the effective classpath, but the behavior suggests that when the shaded class is resolved first, the rewritten default appears, and when the original class is resolved first the unmodified default remains. This inference comes from the two checks below.

~1. Shade raw-string check : Adding <rawString>true</rawString> in the relocation option makes the ZEPPELIN_CONFIG_STORAGE_CLASS default value to remain as the original literal value.~ 2. Classpath order check : Reordering dependencies so that zeppelin-server appears before spark-interpreter also gives the same result as 1. Before the zengine and server merge, the zeppelin-zengine dependency also appeared before spark-interpreter.

Sharing these observations in case they help fix the failing Selenium test.

https://github.com/apache/zeppelin/blob/190ecd6a2febd58616e0fdbb2686fbd2b2ea7e7a/zeppelin-interpreter/src/main/java/org/apache/zeppelin/conf/ZeppelinConfiguration.java#L1037

tbonelee avatar Oct 05 '25 04:10 tbonelee

There are still some references to zeppelin-zengine left in the shell scripts under /bin. (e.g., https://github.com/jongyoul/zeppelin/blob/5cc2196bf7afc172cf6f9f7d7cc5f64f9f70b31b/bin/interpreter.sh#L115)

tbonelee avatar Oct 06 '25 15:10 tbonelee

How about switching to the unshaded ZeppelinConfiguration as in this commit? (2. from my previous comment.) Maybe rawString fixes some cases, but by skipping required shaded classes it creates new issues. It seems to fix all the failing jobs, though I'd prefere a more explicit fix if possible. (Failed E2E test was fixed in other PR) image image

tbonelee avatar Oct 15 '25 02:10 tbonelee

@tbonelee Have you found any good documentation regarding rawString?

Reamer avatar Oct 15 '25 12:10 Reamer

On the official site, the only reference I could find is the Javadoc link for the type used by the <relocations> property (https://maven.apache.org/components/plugins/maven-shade-plugin/apidocs/org/apache/maven/plugins/shade/relocation/SimpleRelocator.html), which doesn’t provide much detail.

I skimmed the code and initially assumed rawString was an option to exclude string literals in .java files from being shaded (https://github.com/apache/maven-shade-plugin/blob/073e7ca970c15a8cae64e2de9517fe8d471b5373/src/main/java/org/apache/maven/plugins/shade/relocation/SimpleRelocator.java#L217). I only tried the rawString option to quickly check whether our test failures were caused by shading string literals in java code, so I didn’t investigate it more rigorously.

However, looking at the issue where this option was introduced (https://issues.apache.org/jira/projects/MSHADE/issues/MSHADE-104?filter=allissues) and re-read the whole code of SimpleRelocator, the primary intent seems to be treating patterns as regex. If that’s the case, a pattern like com.google would be interpreted as a regex, which could lead to unexpected relocations.

I’ve struck through my earlier explanation about rawString since it was inaccurate. Sorry about that.

tbonelee avatar Oct 15 '25 14:10 tbonelee

Quick question with limited context. Why is ZeppelinConfiguration in the zeppelin-interpreter module? It looks like a general configuration class, so I wondered why it lives under interpreter. Because it sits in zeppelin-interpreter, it gets shaded and tends to be used first, which might relate to the failing tests. If it’s reasonable, would moving ZeppelinConfiguration to a more general module be a better fit and also resolve the tests?

tbonelee avatar Oct 15 '25 15:10 tbonelee

Why is ZeppelinConfiguration in the zeppelin-interpreter module?

In my opinion, we should split this up. ZeppelinConfiguration belongs to the Zeppelin server, and the Zeppelin interpreter should really only work on a HashMap with ConfigKey and ConfigValue.

The entire loading of the configuration and possibly also a reload should all belong to the server.

Currently, the Zeppelin server transmits the configuration from the server to the remote interpreter when it starts up. See https://github.com/apache/zeppelin/blob/a03dcbcefc4e6d36993fd537d41626e27eb8e05a/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterServer.java#L202-L207

Reamer avatar Oct 16 '25 07:10 Reamer