DO-NOT-MERGE Move to Hadoop3

Open pan3793 opened this issue 2 years ago • 3 comments
What is this PR for?

This PR is a PoC of moving all modules of Zeppelin to Hadoop3.

It is based on https://github.com/apache/zeppelin/pull/4674 and fixes the Flink tests.

What type of PR is it?

Improvement

Todos

  • [x] - Fix Flink Hadoop3 tests
  • [ ] - Split it into several small PRs

What is the Jira issue?

  • Open an issue on Jira https://issues.apache.org/jira/browse/ZEPPELIN/
  • Put link here, and add [ZEPPELIN-Jira number] in PR title, eg. [ZEPPELIN-533]

How should this be tested?

  • Strongly recommended: add automated unit tests for any new or changed behavior
  • Outline any manual steps to test the PR here.

Screenshots (if appropriate)

Questions:

  • Do the license files need to be updated? No
  • Are there breaking changes for older versions? Yes
  • Does this need documentation? Yes

pan3793 avatar Nov 13 '23 05:11 pan3793

I have fixed essentially all compile and test issues. The next step is to split this into several small PRs to speed up the review process.

I think we should start with the interpreter modules one by one, and then zengine, server and other modules, eventually dropping the hadoop2 profile and updating docs.

@Reamer could you give some advice?

pan3793 avatar Nov 27 '23 02:11 pan3793

I would prefer a larger PR in which the individual tasks are contained in separate commits. It was clear that dropping Hadoop2 would be a very large change. Thank you for your work so far.

I think it's great that you have deleted all the excludes in the parent pom.xml, that makes the file much more readable.

Btw. I do not insist on co-authorship.

Reamer avatar Nov 27 '23 13:11 Reamer

Unfortunately, I found that the IT does not run properly now, see https://github.com/apache/zeppelin/pull/4699. We may need to postpone this PR until the IT is recovered.

pan3793 avatar Dec 01 '23 13:12 pan3793

@Reamer it's ready for review, please take a look when you have time.

pan3793 avatar Mar 23 '24 14:03 pan3793

The Python 3.8 test failure should be addressed in #4748

pan3793 avatar Apr 01 '24 12:04 pan3793

@Reamer all failed tests are known flaky tests, so this patch should be good to go :)

pan3793 avatar Apr 02 '24 10:04 pan3793

I will merge the pull request on Wednesday as long as no further comments are received.

Reamer avatar Apr 02 '24 11:04 Reamer

Could this change break the build? I tried to build and got an error:

WARN [2024-05-23 17:04:06,762] ({main} WebAppContext.java[doStart]:533) - Failed startup of context o.e.j.w.WebAppContext@4816278d{/,jar:file:///opt/zeppelin/zeppelin-web-0.12.0-SNAPSHOT.war!/,STOPPED}{/opt/zeppelin/zeppelin-web-0.12.0-SNAPSHOT.war}
java.io.FileNotFoundException: JAR entry WEB-INF/lib/hadoop-client-api-3.3.6.jar!/ not found in /opt/zeppelin/zeppelin-web-0.12.0-SNAPSHOT.war

Armadik avatar May 23 '24 14:05 Armadik

@Armadik would you mind providing steps to reproduce? E.g. the build command, start command, OS platform, JDK version, etc.
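
The environment details asked for here can be gathered in one go. A minimal sketch, assuming Python 3 is available and that `java` and `mvn` are the relevant tools on the PATH (assumptions, not something stated in this thread); `tool_version` and `environment_report` are hypothetical helper names:

```python
import platform
import shutil
import subprocess


def tool_version(command):
    """Return the first line a tool prints for its version, or a note if unavailable."""
    if shutil.which(command[0]) is None:
        return "not found on PATH"
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, check=False, timeout=60
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"failed to run: {exc}"
    # `java -version` writes to stderr, so fall back to stderr when stdout is empty.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else "no output"


def environment_report():
    """Collect the OS platform plus JDK and Maven versions for a bug report."""
    return {
        "os": platform.platform(),
        "java": tool_version(["java", "-version"]),
        "maven": tool_version(["mvn", "-v"]),
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

The build and start commands still have to be copied by hand; the sketch only covers the parts that can be queried from the machine.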

pan3793 avatar May 24 '24 02:05 pan3793

I see an error when running the zeppelin.sh script

Ubuntu 22.04.4 LTS

`apt update
apt install -y curl git maven openjdk-11-jdk npm libfontconfig r-base-dev r-cran-evaluate
wget https://repo.maven.apache.org/maven2/org/apache/maven/apache-maven/3.6.3/apache-maven-3.6.3-bin.tar.gz
sudo tar -zxf apache-maven-3.6.3-bin.tar.gz -C /usr/local/
sudo ln -s /usr/local/apache-maven-3.6.3/bin/mvn /usr/local/bin/mvn
cd Documents/
git clone https://github.com/apache/zeppelin.git
cd zeppelin/
export MAVEN_OPTS="-Xms1024M -Xmx4096M -XX:MaxMetaspaceSize=1024m -XX:-UseGCOverheadLimit -Dorg.slf4j.simpleLogger.log.org.apache.maven.cli.transfer.Slf4jMavenTransferListener=war"
./mvnw -B package -DskipTests -Pbuild-distr -Pspark-3.3 -Pinclude-hadoop -Phadoop3 -Pspark-scala-2.12 -Pweb-angular -Pweb-dist -pl '!groovy,!submarine,!flink,!cassandra,!jdbc,!bigquery,!alluxio,!mongodb,!neo4j' -am --no-transfer-progress`

Armadik avatar May 24 '24 06:05 Armadik

[INFO] Copying webapp resources [/home/micha/Documents/zeppelin/zeppelin-web/dist]
[INFO] deleting outdated resource WEB-INF/lib/hadoop-client-api-3.3.6.jar
[INFO] deleting outdated resource WEB-INF/lib/hadoop-client-runtime-3.3.6.jar
[INFO] Building war: /zeppelin/zeppelin-web/target/zeppelin-web-0.12.0-SNAPSHOT.war

This step doesn't seem to work.

The following command helped me:
zip -d /opt/zeppelin/zeppelin-web-0.12.0-SNAPSHOT.war WEB-INF/lib/*
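
Since `zip -d` deletes entries in place, it may help to first check which entries the archive actually holds under `WEB-INF/lib/`. A WAR is an ordinary zip archive, so the standard `zipfile` module can read it. A minimal sketch, assuming Python 3 is available; `list_bundled_libs` is a hypothetical helper name, not part of Zeppelin:

```python
import sys
import zipfile


def list_bundled_libs(war_path, prefix="WEB-INF/lib/"):
    """Return the sorted names of archive entries stored under `prefix` in a WAR file."""
    with zipfile.ZipFile(war_path) as war:
        return sorted(
            name
            for name in war.namelist()
            # Skip the directory entry itself, keep only what is inside it.
            if name.startswith(prefix) and name != prefix
        )


if __name__ == "__main__":
    # Usage: python list_libs.py /opt/zeppelin/zeppelin-web-0.12.0-SNAPSHOT.war
    for entry in list_bundled_libs(sys.argv[1]):
        print(entry)
```

An empty result means nothing is bundled under `WEB-INF/lib/`. The function only reads the archive and does not modify it.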

Armadik avatar May 24 '24 13:05 Armadik

@Armadik sorry, I cannot reproduce this; both the classic and the new UI work on my side.

pan3793 avatar Jul 03 '24 14:07 pan3793

I tried a clean build. It seems the problem was in my environment :(

Armadik avatar Aug 01 '24 13:08 Armadik