
Increase java heap space

Open hugopetiot opened this issue 8 years ago • 10 comments

@dylanmei Hello, we have an issue with the java heap space. How can we increase it? Thanks

hugopetiot avatar Jan 30 '17 14:01 hugopetiot

Customize with environment variables, listed here: http://zeppelin.apache.org/docs/0.6.2/install/install.html#apache-zeppelin-configuration

dylanmei avatar Jan 30 '17 14:01 dylanmei
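For reference, a minimal compose-style sketch of the two heap-related variables from that page (the values here are illustrative, not recommendations):

```yaml
environment:
  ZEPPELIN_MEM: "-Xmx2g"       # heap options for the ZeppelinServer JVM
  ZEPPELIN_INTP_MEM: "-Xmx2g"  # heap options passed to interpreter JVMs
```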

I've increased ZEPPELIN_MEM and ZEPPELIN_INTP_MEM:

```yaml
environment:
  - ZEPPELIN_MEM=-Xmx4g
  - ZEPPELIN_INTP_MEM=-Xms1024m -Xmx4g -XX:MaxPermSize=4g
```

I still have a java heap space error. When I connect to the container and list the running processes, I find the Spark interpreter process, which Zeppelin launches like this:

```
/usr/jdk1.8.0_92/bin/java -cp /usr/zeppelin/local-repo/2C8SVCYAB/*:/usr/zeppelin/interpreter/spark/*:/usr/zeppelin/lib/zeppelin-interpreter-0.7.0-SNAPSHOT.jar:/usr/zeppelin/interpreter/spark/zeppelin-spark_2.11-0.7.0-SNAPSHOT.jar:/usr/spark-2.0.1/conf/:/usr/spark-2.0.1/jars/*:/usr/hadoop-2.7.2/etc/hadoop/:/usr/hadoop-2.7.2/etc/hadoop/*:/usr/hadoop-2.7.2/share/hadoop/common/lib/*:/usr/hadoop-2.7.2/share/hadoop/common/*:/usr/hadoop-2.7.2/share/hadoop/hdfs/*:/usr/hadoop-2.7.2/share/hadoop/hdfs/lib/*:/usr/hadoop-2.7.2/share/hadoop/hdfs/*:/usr/hadoop-2.7.2/share/hadoop/yarn/lib/*:/usr/hadoop-2.7.2/share/hadoop/yarn/*:/usr/hadoop-2.7.2/share/hadoop/mapreduce/lib/*:/usr/hadoop-2.7.2/share/hadoop/mapreduce/*:/usr/hadoop-2.7.2/share/hadoop/tools/lib/* -Xmx1g -Dfile.encoding=UTF-8 -Dlog4j.configuration=file:///usr/zeppelin/conf/log4j.properties -Dzeppelin.log.file=/usr/zeppelin/logs/zeppelin-interpreter-spark--30d79a89c5be.log org.apache.spark.deploy.SparkSubmit --conf spark.driver.extraClassPath=::/usr/zeppelin/local-repo/2C8SVCYAB/*:/usr/zeppelin/interpreter/spark/*::/usr/zeppelin/lib/zeppelin-interpreter-0.7.0-SNAPSHOT.jar:/usr/zeppelin/interpreter/spark/zeppelin-spark_2.11-0.7.0-SNAPSHOT.jar --conf spark.driver.extraJavaOptions= -Dfile.encoding=UTF-8 -Dlog4j.configuration=file:///usr/zeppelin/conf/log4j.properties -Dzeppelin.log.file=/usr/zeppelin/logs/zeppelin-interpreter-spark--30d79a89c5be.log --class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer /usr/zeppelin/interpreter/spark/zeppelin-spark_2.11-0.7.0-SNAPSHOT.jar 33398
```

As you can see, there is an -Xmx1g option specified, and I think the problem comes from that. Do you have any idea how I could change it?

hugopetiot avatar Jan 31 '17 13:01 hugopetiot

@dylanmei Has somebody solved this? I have the same issue with docker-compose:

```yaml
zeppelin:
  image: dylanmei/docker-zeppelin
  environment:
    ZEPPELIN_PORT: 8080
    ZEPPELIN_MEM: Xmx4g
    ZEPPELIN_INTP_MEM: >-
      -Xms1024m
      -Xmx4g
      -XX:MaxPermSize=4g
    ZEPPELIN_JAVA_OPTS: >-
      -Dspark.driver.memory=2g
      -Dspark.executor.memory=4g
      -Dspark.cores.max=2
    MASTER: local[*]
  ports:
    - 8080:8080
    - 4040:4040
  volumes:
    - ./data:/usr/zeppelin/data
    - ./notebooks:/usr/zeppelin/notebook
```

MielHostens avatar Feb 14 '17 09:02 MielHostens

@MielHostens I found a "homemade" solution which is not optimal, but it works. Inside the docker container, I modified the /usr/spark-2.0.1/bin/spark-class file, adding the following line after line 82 (for 12 GB of memory, in my case): CMD[3]="-Xmx12g"

You should have something like this:

```bash
COUNT=${#CMD[@]}
LAST=$((COUNT - 1))
LAUNCHER_EXIT_CODE=${CMD[$LAST]}
CMD[3]="-Xmx12g"
```

hugopetiot avatar Feb 18 '17 21:02 hugopetiot
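The manual edit above can also be scripted, which helps when the container is rebuilt. A sketch, demonstrated here on an inline copy of the relevant spark-class lines (the real path, per this thread, would be /usr/spark-2.0.1/bin/spark-class, and the 12g value is illustrative):

```shell
# Recreate the relevant spark-class lines in a scratch file for the demo.
cat > spark-class.snippet <<'EOF'
COUNT=${#CMD[@]}
LAST=$((COUNT - 1))
LAUNCHER_EXIT_CODE=${CMD[$LAST]}
EOF

# Append the heap override right after the LAUNCHER_EXIT_CODE line,
# mirroring the manual CMD[3]="-Xmx12g" edit described above (GNU sed).
sed -i '/^LAUNCHER_EXIT_CODE=/a CMD[3]="-Xmx12g"' spark-class.snippet

cat spark-class.snippet
```

To make it persistent, the same sed line could run in a Dockerfile layer against the real file, e.g. `RUN sed -i '/^LAUNCHER_EXIT_CODE=/a CMD[3]="-Xmx12g"' /usr/spark-2.0.1/bin/spark-class` (path assumed from this thread).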

That's a good hint, @hugopetiot. There are no corresponding issues about this in Zeppelin's Jira, and grepping for "Xmx1g" in the Zeppelin project turns up nothing. I feel like I need to understand how everyone else's Zeppelin isn't broken, or how we're misconfigured on our side.

dylanmei avatar Feb 19 '17 17:02 dylanmei

@dylanmei @hugopetiot Thanks for the update, I will see how far I can get with this.

It would be good to have the option to set the Spark driver memory higher than the current 366 MB from the docker build file. Any help on that would be great!

MielHostens avatar Mar 07 '17 09:03 MielHostens

Guys, I just checked: it indeed works. How can we make this persistent via the docker build or docker-compose?

MielHostens avatar Mar 07 '17 20:03 MielHostens

Would it make sense to create a spark-defaults.conf and ADD it as part of the Dockerfile (or generate it dynamically as part of docker-compose), instead of having to modify spark-class?

Seems like an easier way to modify spark.driver.memory, spark.driver.maxResultSize, spark.executor.memory, etc.

ozskywalker avatar Jul 07 '17 20:07 ozskywalker
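A sketch of that approach: generate the conf file, then ADD it in the Dockerfile. The property names are standard Spark configuration keys; the sizes and the Spark install path (taken from this thread) are illustrative assumptions:

```shell
# Write a spark-defaults.conf with the properties mentioned above
# (values are examples, not recommendations).
cat > spark-defaults.conf <<'EOF'
spark.driver.memory        4g
spark.driver.maxResultSize 2g
spark.executor.memory      4g
EOF

# In the Dockerfile, copy it into Spark's conf dir, e.g.:
#   ADD spark-defaults.conf /usr/spark-2.0.1/conf/spark-defaults.conf
cat spark-defaults.conf
```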

The ZEPPELIN_MEM variable is for the Zeppelin main process (ZeppelinServer). ZEPPELIN_INTP_MEM is for every Zeppelin interpreter (except those launched through spark-submit). spark.driver.memory sets the memory of the Spark driver when there is a spark-submit; its default value is 1 GB. I think that is the property you want to increase.

jhonderson avatar Mar 07 '18 00:03 jhonderson
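Under that reading, the driver memory can be raised through spark-submit options rather than ZEPPELIN_INTP_MEM. A compose-style sketch, assuming the image honors SPARK_SUBMIT_OPTIONS from zeppelin-env.sh (the sizes are illustrative):

```yaml
environment:
  SPARK_SUBMIT_OPTIONS: "--driver-memory 4g --executor-memory 4g"
```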

@jhonderson I have a question about ZEPPELIN_INTP_MEM: does this mean a Markdown interpreter would also take the memory I set in ZEPPELIN_INTP_MEM?

maziyarpanahi avatar May 24 '18 15:05 maziyarpanahi