docker-zeppelin
Increase Java heap space
@dylanmei Hello, we have an issue with Java heap space. How can we increase it? Thanks.
Customize with environment variables, listed here: http://zeppelin.apache.org/docs/0.6.2/install/install.html#apache-zeppelin-configuration
I've increased `ZEPPELIN_MEM` and `ZEPPELIN_INTP_MEM`:

```yaml
environment:
  - ZEPPELIN_MEM=-Xmx4g
  - ZEPPELIN_INTP_MEM=-Xms1024m -Xmx4g -XX:MaxPermSize=4g
```
I still have a Java heap space error. When I connect to the container and list the running processes, I find the Spark interpreter process, which Zeppelin launches like this:
```
/usr/jdk1.8.0_92/bin/java -cp /usr/zeppelin/local-repo/2C8SVCYAB/*:/usr/zeppelin/interpreter/spark/*:/usr/zeppelin/lib/zeppelin-interpreter-0.7.0-SNAPSHOT.jar:/usr/zeppelin/interpreter/spark/zeppelin-spark_2.11-0.7.0-SNAPSHOT.jar:/usr/spark-2.0.1/conf/:/usr/spark-2.0.1/jars/*:/usr/hadoop-2.7.2/etc/hadoop/:/usr/hadoop-2.7.2/etc/hadoop/*:/usr/hadoop-2.7.2/share/hadoop/common/lib/*:/usr/hadoop-2.7.2/share/hadoop/common/*:/usr/hadoop-2.7.2/share/hadoop/hdfs/*:/usr/hadoop-2.7.2/share/hadoop/hdfs/lib/*:/usr/hadoop-2.7.2/share/hadoop/hdfs/*:/usr/hadoop-2.7.2/share/hadoop/yarn/lib/*:/usr/hadoop-2.7.2/share/hadoop/yarn/*:/usr/hadoop-2.7.2/share/hadoop/mapreduce/lib/*:/usr/hadoop-2.7.2/share/hadoop/mapreduce/*:/usr/hadoop-2.7.2/share/hadoop/tools/lib/* -Xmx1g -Dfile.encoding=UTF-8 -Dlog4j.configuration=file:///usr/zeppelin/conf/log4j.properties -Dzeppelin.log.file=/usr/zeppelin/logs/zeppelin-interpreter-spark--30d79a89c5be.log org.apache.spark.deploy.SparkSubmit --conf spark.driver.extraClassPath=::/usr/zeppelin/local-repo/2C8SVCYAB/*:/usr/zeppelin/interpreter/spark/*::/usr/zeppelin/lib/zeppelin-interpreter-0.7.0-SNAPSHOT.jar:/usr/zeppelin/interpreter/spark/zeppelin-spark_2.11-0.7.0-SNAPSHOT.jar --conf spark.driver.extraJavaOptions= -Dfile.encoding=UTF-8 -Dlog4j.configuration=file:///usr/zeppelin/conf/log4j.properties -Dzeppelin.log.file=/usr/zeppelin/logs/zeppelin-interpreter-spark--30d79a89c5be.log --class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer /usr/zeppelin/interpreter/spark/zeppelin-spark_2.11-0.7.0-SNAPSHOT.jar 33398
```
As you can see, a `-Xmx1g` option is specified, and I think the problem comes from that. Do you have an idea of how I could change it?
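As a quick sanity check after changing any of these settings, the effective heap flag can be read straight off the interpreter's command line; a small sketch (the example command line below is abbreviated from the full one above):

```shell
# Sketch: extract the max-heap flag from a JVM command line.
# Inside the container, the real command line comes from something like:
#   ps -ef | grep RemoteInterpreterServer
cmdline="/usr/jdk1.8.0_92/bin/java -cp ... -Xmx1g -Dfile.encoding=UTF-8 org.apache.spark.deploy.SparkSubmit"
echo "$cmdline" | grep -o '\-Xmx[0-9]*[gmk]'
# prints: -Xmx1g
```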
@dylanmei Has somebody solved this? I have the same issue with docker-compose:
```yaml
zeppelin:
  image: dylanmei/docker-zeppelin
  environment:
    ZEPPELIN_PORT: 8080
    ZEPPELIN_MEM: Xmx4g
    ZEPPELIN_INTP_MEM: >-
      -Xms1024m -Xmx4g -XX:MaxPermSize=4g
    ZEPPELIN_JAVA_OPTS: >-
      -Dspark.driver.memory=2g -Dspark.executor.memory=4g -Dspark.cores.max=2
    MASTER: local[*]
  ports:
    - 8080:8080
    - 4040:4040
  volumes:
    - ./data:/usr/zeppelin/data
    - ./notebooks:/usr/zeppelin/notebook
```
@MielHostens
I found a "homemade solution" which is not optimal, but it works.
Once inside the Docker container, I modified the /usr/spark-2.0.1/bin/spark-class file. After line 82, I added the following line (to get 12 GB of memory in my case):

```sh
CMD[3]="-Xmx12g"
```
You should end up with something like this:

```sh
COUNT=${#CMD[@]}
LAST=$((COUNT - 1))
LAUNCHER_EXIT_CODE=${CMD[$LAST]}
CMD[3]="-Xmx12g"
```
That's a good hint, @hugopetiot. There are no corresponding issues in Zeppelin's Jira about this, and grepping for "Xmx1g" in the Zeppelin project turns up nothing. I feel like I need to understand how everyone else's Zeppelin isn't broken, or how we're misconfigured on our side.
@dylanmei @hugopetiot Thanks for the update, I will see how far I can get with this.
It would be good to have the option to set the Spark driver memory higher than the current 366 MB from the Docker build file. Any help on that would be great!
Guys, I just checked, and it indeed works. How can we make this persistent via the Docker build or docker-compose?
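One way to make the spark-class edit persistent is to apply it at image build time; a minimal Dockerfile sketch, assuming the Spark path and line number quoted in this thread (verify both against your image, and treat 12g as illustrative):

```dockerfile
FROM dylanmei/docker-zeppelin

# Insert the heap override after line 82 of spark-class, as described above.
RUN sed -i '82a CMD[3]="-Xmx12g"' /usr/spark-2.0.1/bin/spark-class
```

This keeps the workaround out of running containers, so it survives container recreation.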
Would it make sense to create a `spark-defaults.conf` and use `ADD` as part of the Dockerfile (or dynamically generate it as part of docker-compose), instead of having to modify `spark-class`? It seems like an easier way to modify `spark.driver.memory`, `spark.driver.maxResultSize`, `spark.executor.memory`, etc.
The `ZEPPELIN_MEM` variable is for the Zeppelin main process (ZeppelinServer). `ZEPPELIN_INTP_MEM` is for every Zeppelin interpreter (except spark-submit). `spark.driver.memory` sizes the Spark driver when there is a spark-submit; its default value is 1 GB. I think that is the property you want to increase.
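Based on that, a hedged docker-compose sketch that raises only the driver memory via `ZEPPELIN_JAVA_OPTS` (the 4g value is illustrative):

```yaml
zeppelin:
  image: dylanmei/docker-zeppelin
  environment:
    # spark.driver.memory sizes the Spark driver JVM that spark-submit starts;
    # its 1g default is what appears as -Xmx1g on the interpreter command line.
    ZEPPELIN_JAVA_OPTS: -Dspark.driver.memory=4g
```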
@jhonderson I have a question about `ZEPPELIN_INTP_MEM`: does this mean a Markdown interpreter would also take the memory I set in `ZEPPELIN_INTP_MEM`?