incubator-livy
[LIVY-590] Add dependency to jersey-core
What changes were proposed in this pull request?
After I upgraded Livy to 0.6.0-incubating, I get the following error message when starting livy-server. Also, the livy-server process cannot get the job's appId and status. (See the JIRA for the details.)
19/04/24 15:05:41 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
java.lang.NoClassDefFoundError: javax/ws/rs/ext/MessageBodyReader
at java.lang.ClassLoader.defineClass1(Native Method)
..
at org.apache.hadoop.yarn.util.timeline.TimelineUtils.<clinit>(TimelineUtils.java:50)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceInit(YarnClientImpl.java:179)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.livy.utils.SparkYarnApp$.yarnClient$lzycompute(SparkYarnApp.scala:51)
at org.apache.livy.utils.SparkYarnApp$.yarnClient(SparkYarnApp.scala:49)
at org.apache.livy.server.LivyServer$$anonfun$start$6.apply(LivyServer.scala:145)
at org.apache.livy.server.LivyServer$$anonfun$start$6.apply(LivyServer.scala:145)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.ClassNotFoundException: javax.ws.rs.ext.MessageBodyReader
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 50 more
Comparing the livy 0.5.0 and 0.6.0 packages, I noticed that livy-0.6.0 does not include the jersey-core jar file. The missing jar is what causes the error above.
The jar was excluded as part of the changes in LIVY-502. This pull request reverts that exclusion.
It looks like this issue does not happen in the CI. Unfortunately, I could not work out why. (I'd like to double-check that the CI env does not have the jersey-core jar.) However, I think we should not exclude the jersey-core dependency, because livy-server depends on hadoop and hadoop depends on jersey-core (MessageBodyReader).
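The package comparison described above could be scripted along these lines. This is only a sketch, not part of the patch: `artifact_names` and `missing_jars` are hypothetical helper names, and the paths in the usage line are placeholders for wherever the two packages are unpacked. Version suffixes are stripped so that a plain version bump is not reported as a missing jar.

```shell
# Hypothetical helper: print the artifact name (version stripped) of every
# *.jar found under directory $1, one per line.
artifact_names() {
  find "$1" -name '*.jar' -exec basename {} .jar \; \
    | sed 's/-[0-9][0-9A-Za-z.]*\(-[A-Za-z0-9.]*\)*$//' \
    | sort -u
}

# Hypothetical helper: list artifacts present under $1 (old package)
# but absent under $2 (new package).
missing_jars() {
  new_list=$(artifact_names "$2")
  artifact_names "$1" | while IFS= read -r name; do
    # Print the name only when no line of the new listing matches it exactly.
    printf '%s\n' "$new_list" | grep -qxF -- "$name" || printf '%s\n' "$name"
  done
}
```

Usage, with placeholder paths: `missing_jars /path/to/livy-0.5.0/jars /path/to/livy-0.6.0/jars`. The version-stripping pattern is a heuristic and may misjudge artifact names that themselves contain `-<digit>`.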
How was this patch tested?
Tested manually and confirmed that this change can fix the issue we are seeing.
thanks for submitting the PR @akitanaka.
I think we should really first understand why you are seeing this issue. I remember I did several tests also on real clusters after that patch and never saw that issue. Moreover, the reason why it was excluded was an incompatibility with the classes needed by the thriftserver for the http mode. IIRC, the thriftserver needed a newer version of the libraries which are in the dependencies of the hadoop module.
I think the solution might be to keep the exclusion from the hadoop dependencies and add a dependency on the needed jars in Livy, so that we do not rely on what is part of the hadoop distribution but have control over it ourselves.
Codecov Report
Merging #170 into master will decrease coverage by 0.08%. The diff coverage is n/a.
@@ Coverage Diff @@
## master #170 +/- ##
============================================
- Coverage 68.67% 68.58% -0.09%
+ Complexity 907 904 -3
============================================
Files 100 100
Lines 5666 5666
Branches 850 850
============================================
- Hits 3891 3886 -5
- Misses 1223 1226 +3
- Partials 552 554 +2
Impacted Files | Coverage Δ | Complexity Δ | |
---|---|---|---|
...c/main/scala/org/apache/livy/repl/ReplDriver.scala | 30.76% <0%> (-2.57%) | 7% <0%> (ø) | |
...ain/java/org/apache/livy/rsc/driver/RSCDriver.java | 77.96% <0%> (-2.12%) | 41% <0%> (-1%) | |
...main/scala/org/apache/livy/server/LivyServer.scala | 35.46% <0%> (-0.5%) | 11% <0%> (ø) | |
...cala/org/apache/livy/scalaapi/ScalaJobHandle.scala | 55.88% <0%> (+5.88%) | 7% <0%> (ø) | :arrow_down: |
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 5abc043...f708c30.
Thank you very much for your answer, @mgaido91.
In our env, we don't build the thrift server; we just build and run livy-server. I think this is why we're seeing the issue in our env. On the other hand, I still think livy-server should ship the jersey-core jar file.
I think the solution might be to keep the exclusion from the hadoop dependencies and add a dependency on the needed jars in Livy, so that we do not rely on what is part of the hadoop distribution but have control over it ourselves.
I agree with the approach. I updated my PR to add a dependency on the jersey-core jar.
Some checks failed.
I think the check failed because we updated jersey-core from 1.9 to 1.19. When I build Livy without adding this PR, the jersey-core version in thriftserver was 1.9 and not 1.19. So, you did not see the failure when you pushed LIVY-502 because the jersey-core version was 1.9 in the CI test.
$ mvn clean package -P thriftserver -DskipTests=true
..
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 04:00 min
[INFO] Finished at: 2019-04-28T04:38:55Z
[INFO] ------------------------------------------------------------------------
$ find|grep jersey-core
./thriftserver/client/target/jars/jersey-core-1.9.jar
mmmh, well the thriftserver/client module is useful only for having a working beeline (which you can find in the dev folder). That jar/path is never used on the server side (it should not be used, at least). Hence, in the ITs which are failing, that path is not considered at all.
I see, instead, that there is a test dependency on hadoop-common in the livy-integration-test module, which brings in jersey-core-1.9. So this may be the root cause of the problem: two incompatible versions of the same library on the classpath. You might want to try excluding it from there too.
Hello. As far as I tested locally, the ITs fail when livy-server has jersey-core 1.19. If the package does not have jersey-core, or it has jersey-core 1.9, the tests succeed.
Since the thrift/client jar/path is never used on the server side, I think livy-server and thrift/client can have different jersey-core versions. Also, livy-server should have the same version of jersey-core that hadoop-common has. So, I think we should specify the jersey-core version in thrift/client and not in livy-server.
I updated my pull request. Now livy-server has jersey-core 1.9 (the version is defined by hadoop-common) and thrift/client has jersey-core 1.19.
- default
[ec2-user@ip-10-0-2-216 incubator-livy]$ find|grep jersey-core
./thriftserver/client/target/jars/jersey-core-1.9.jar
# In my environment, jersey-core version is 1.9 and not 1.19.
- remove exclusion for jersey-core in server/pom.xml
[ec2-user@ip-10-0-2-216 incubator-livy]$ find|grep jersey-core
./server/target/jars/jersey-core-1.9.jar
./thriftserver/client/target/jars/jersey-core-1.9.jar
- add jersey-core version to thrift/client
[ec2-user@ip-10-0-2-216 incubator-livy]$ find|grep jersey-core
./server/target/jars/jersey-core-1.9.jar
./thriftserver/client/target/jars/jersey-core-1.19.jar
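The three listings above can also be checked mechanically for version clashes. The following is a rough sketch under stated assumptions: `jar_version_conflicts` is a hypothetical helper name, it expects paths containing a `/` (as printed by `find`), and it skips jars whose version carries a qualifier such as `-incubating`.

```shell
# Hypothetical helper: read jar paths on stdin (for example the output of
# `find . -name '*.jar'`) and report artifacts present in more than one version.
jar_version_conflicts() {
  # Split ".../<artifact>-<version>.jar" into "<artifact> <version>";
  # lines that do not fit this shape are dropped.
  sed -n 's|.*/\([^/]*\)-\([0-9][0-9A-Za-z.]*\)\.jar$|\1 \2|p' \
    | LC_ALL=C sort -u \
    | awk '{ count[$1]++; versions[$1] = versions[$1] " " $2 }
           END { for (a in count) if (count[a] > 1) print a ":" versions[a] }'
}
```

Usage: `find . -name '*.jar' | jar_version_conflicts`. Fed the last listing above, it reports jersey-core with both 1.19 and 1.9; fed the second listing, it prints nothing because both copies are 1.9.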
@akitanaka the problem is not with the thriftserver client. The problem appears when you enable the thriftserver module, so that the thrift server runs inside the Livy server. On the server side, I remember I had issues because, in http mode, the Hive 3.0 protocol, which is the base for the Livy thriftserver, needed a newer version than 1.9.
To give you more reference, you can see here my commit for avoiding issues with http mode for the thriftserver (https://github.com/apache/incubator-livy/pull/117/commits/545a5c3017e6daca022a61e8c51dbaefc98f8433). I am not sure why we are not seeing issues in the CI. As you can see in the commit description, I had to do that in order to avoid problems with http mode for the server side of the thriftserver.
But since I don't see UT failures, I can't prove that. I'll try to run this patch on a local env. Meanwhile, let me cc @vanzin so he can check and maybe run more tests with this patch, to ensure it doesn't introduce problems.
I have not been able to reproduce any issue with this new PR, but I remember I did have problems with it and it was environment dependent because it depended on the class loading order.
Honestly, I don't think the current approach is fine. Just reverting that change isn't the right fix IMHO. I saw that those files are in javax.ws.rs:javax.ws.rs-api:jar:2.0.1, which is indeed included through the glassfish dependency in the thriftserver module. Could you try adding this dependency to the Livy server and check whether this works for you?
@mgaido91 I haven't been able to reproduce the issue you experienced in LIVY-502, so I'm still not sure what the issue is. (As shown in my test results, when I checked out the latest Livy code and built Livy with the thrift server module, a jersey-core-1.9.jar was created only in the thriftserver/client directory, whereas you mentioned that the thrift server needs jersey-core-1.19.)
What I want to say is that I feel the approach in LIVY-502 (https://github.com/apache/incubator-livy/pull/117) was not correct. Since hadoop-client consumes jersey-core (and livy-server consumes hadoop-client), we should not exclude the dependency from livy-server.
If you can give me a test to reproduce the issue you saw when working on #117, I'll test it in my environment.
Also, I'm not sure about the glassfish dependency you mentioned in the previous comment... (At least in my PR, I have not mentioned anything about the glassfish dependency.) Could you please explain what this is and what you want me to test?
Also, I'm not sure about the glassfish dependency you mentioned in the previous comment...
If you check https://github.com/apache/incubator-livy/commit/545a5c3017e6daca022a61e8c51dbaefc98f8433, you'll see that I had to introduce a glassfish dependency, which was incompatible with version 1.9 of jersey. The reason I had to introduce that dependency was to make the thriftserver work also in http mode.
To test the thriftserver in http mode, you can build livy with -Pthriftserver and then configure livy to use the thriftserver, i.e. add to your livy.conf the properties:
livy.server.thrift.enabled=true
livy.server.thrift.transport.mode=http
In this configuration, without excluding jersey-core-1.9.jar, I remember I faced some exceptions due to incompatible versions of that library.
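For repeated test runs, the two properties above could be added to livy.conf idempotently with something like the sketch below. `enable_thrift_http` is a hypothetical helper name, not part of Livy, and the conf path in the usage line is a placeholder.

```shell
# Hypothetical helper: append the two thriftserver properties quoted above
# to the livy.conf given as $1, skipping any key that is already set.
enable_thrift_http() {
  conf="$1"
  for prop in 'livy.server.thrift.enabled=true' 'livy.server.thrift.transport.mode=http'; do
    key=${prop%%=*}
    # Escape the dots so they match literally in the grep pattern.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    # Leave the file alone if the key is already set (to any value).
    if ! grep -q "^[[:space:]]*${pattern}[[:space:]]*=" "$conf" 2>/dev/null; then
      printf '%s\n' "$prop" >> "$conf"
    fi
  done
}
```

Usage, with a placeholder path: `enable_thrift_http /path/to/livy/conf/livy.conf`. Note that a key already set to a different value is left untouched, so it would still need to be edited by hand.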
This problem persists in the current master.
at org.apache.livy.utils.SparkYarnApp$.yarnClient(SparkYarnApp.scala:52)
at org.apache.livy.utils.SparkYarnApp$$anon$1.run(SparkYarnApp.scala:78)
Caused by: java.lang.ClassNotFoundException: javax.ws.rs.ext.MessageBodyReader
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 54 more
java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.yarn.util.timeline.TimelineUtils
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceInit(YarnClientImpl.java:200)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at org.apache.livy.utils.SparkYarnApp$.yarnClient$lzycompute(SparkYarnApp.scala:54)
And then I see the exact same issue that @akitanaka describes, i.e. the Livy session remains in the starting state.
Even if jersey-core-1.19.jar is added, the problem won't be solved, since that jar no longer includes the missing class. That is:
jar tvf jersey-core-1.19.jar | grep javax.ws.rs.ext.MessageBodyReader
1763 Thu Nov 21 07:17:18 UTC 2013 META-INF/services/javax.ws.rs.ext.MessageBodyReader
However, if we look inside jersey-core-1.9.jar, we see that the missing class is there:
jar tvf jersey-core-1.9.jar | grep javax.ws.rs.ext.MessageBodyReader
1763 Fri Sep 02 11:16:04 UTC 2011 META-INF/services/javax.ws.rs.ext.MessageBodyReader
950 Fri Sep 02 11:16:40 UTC 2011 javax/ws/rs/ext/MessageBodyReader.class
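Note that the grep pattern in the two commands above also matches the `META-INF/services/...` entry, so a hit alone does not prove the class is in the jar. A stricter check could look like the following sketch, where `has_class` is a hypothetical helper that reads a `jar tvf` listing on stdin and succeeds only if the compiled `.class` entry itself is listed.

```shell
# Hypothetical helper: succeed only if the `jar tvf` listing on stdin contains
# the compiled class named by $1 (dotted form, e.g. javax.ws.rs.ext.Foo).
has_class() {
  # Convert the dotted class name into the path of its .class entry.
  entry="$(printf '%s' "$1" | tr '.' '/').class"
  # Compare the last field of each listing line exactly, so that
  # META-INF/services/<dotted name> does not count as a match.
  awk -v e="$entry" '$NF == e { found = 1 } END { exit !found }'
}
```

Usage: `jar tvf jersey-core-1.19.jar | has_class javax.ws.rs.ext.MessageBodyReader && echo present`. Based on the listings above, this would be expected to succeed for jersey-core-1.9.jar and fail for jersey-core-1.19.jar.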
In order to keep the jersey-core-1.19.jar that's required by the thriftserver and get the Livy server working, we need to add the right version of the jsr311-api jar. For example, Hadoop 3.3.0 now includes jsr311-api-1.1.1.jar. This is the jar that now contains the required class:
jar tvf jsr311-api-1.1.1.jar | grep javax.ws.rs.ext.MessageBodyReader
950 Mon Nov 09 13:45:50 UTC 2009 javax/ws/rs/ext/MessageBodyReader.class
If I manually add this jar to the class path of the Livy server, then it works as expected.
@akitanaka, can you please add jsr311-api-1.1.1.jar and see if that works for you as well? You shouldn't need to add jersey-core-1.9.jar if that's done, as long as jersey-core-1.19.jar is on the classpath.
@akitanaka thank you.
On my side, I copied the two jar files into Livy, and then it works fine.
cp /opt/hadoop-3.3.0/share/hadoop/common/lib/jersey-core-1.19.jar /opt/livy2/jars/
cp /opt/hadoop-3.3.0/share/hadoop/common/lib/jsr311-api-1.1.1.jar /opt/livy2/jars/
Thanks