
init_orca_context on k8s throws an error if BigDL is not installed via pip

Open qiuxin2012 opened this issue 4 years ago • 1 comments

The command is:

$SPARK_HOME/bin/spark-submit \
    --master $RUNTIME_SPARK_MASTER \
    --deploy-mode client \
    --name analytics-zoo-ncf \
    --conf spark.executor.instances=$RUNTIME_EXECUTOR_INSTANCES \
    --conf spark.driver.host=$RUNTIME_DRIVER_HOST \
    --conf spark.driver.port=$RUNTIME_DRIVER_PORT \
    --conf spark.kubernetes.container.image=$RUNTIME_K8S_SPARK_IMAGE \
    --conf spark.kubernetes.executor.podTemplateFile=/ppml/trusted-big-data-ml/spark-executor-template.yaml \
    --conf spark.kubernetes.executor.deleteOnTermination=false \
    --conf spark.driver.memory=8g \
    --executor-cores 5 \
    --total-executor-cores 5 \
    --executor-memory 128G \
    --conf spark.network.timeout=10000000 \
    --conf spark.executor.heartbeatInterval=10000000 \
    --conf spark.executor.extraClassPath=/ppml/trusted-big-data-ml/work/analytics-zoo-0.12.0-SNAPSHOT/lib/analytics-zoo-bigdl_0.13.0-spark_3.1.2-0.12.0-SNAPSHOT-jar-with-dependencies.jar,/ppml/trusted-big-data-ml/work/bigdl-jar-with-dependencies.jar \
    --conf spark.driver.extraClassPath=/ppml/trusted-big-data-ml/work/analytics-zoo-0.12.0-SNAPSHOT/lib/analytics-zoo-bigdl_0.13.0-spark_3.1.2-0.12.0-SNAPSHOT-jar-with-dependencies.jar,/ppml/trusted-big-data-ml/work/bigdl-jar-with-dependencies.jar \
    --properties-file /ppml/trusted-big-data-ml/work/analytics-zoo-0.12.0-SNAPSHOT/conf/spark-analytics-zoo.conf \
    --jars /ppml/trusted-big-data-ml/work/analytics-zoo-0.12.0-SNAPSHOT/lib/analytics-zoo-bigdl_0.13.0-spark_3.1.2-0.12.0-SNAPSHOT-jar-with-dependencies.jar,/ppml/trusted-big-data-ml/work/bigdl-jar-with-dependencies.jar \
    --py-files /ppml/trusted-big-data-ml/work/analytics-zoo-0.12.0-SNAPSHOT/lib/analytics-zoo-bigdl_0.13.0-spark_3.1.2-0.12.0-SNAPSHOT-python-api.zip,/ppml/trusted-big-data-ml/work/bigd-python-api.zip \
    --verbose \
    --files /ppml/trusted-big-data-ml/work/data/ml-1m/ratings_new.dat.2 \
    /ppml/trusted-big-data-ml/work/data/ncf/ncf-dataframe.py

The error is:

Traceback (most recent call last):
  File "/ppml/trusted-big-data-ml/work/data/ncf/ncf-dataframe.py", line 24, in <module>
    cores=4) # run in local mode
  File "/ppml/trusted-big-data-ml/work/analytics-zoo-0.12.0-SNAPSHOT/lib/analytics-zoo-bigdl_0.13.0-spark_3.1.2-0.12.0-SNAPSHOT-python-api.zip/zoo/orca/common.py", line 244, in init_orca_context
  File "/ppml/trusted-big-data-ml/work/analytics-zoo-0.12.0-SNAPSHOT/lib/analytics-zoo-bigdl_0.13.0-spark_3.1.2-0.12.0-SNAPSHOT-python-api.zip/zoo/common/nncontext.py", line 257, in init_spark_on_k8s
  File "/ppml/trusted-big-data-ml/work/analytics-zoo-0.12.0-SNAPSHOT/lib/analytics-zoo-bigdl_0.13.0-spark_3.1.2-0.12.0-SNAPSHOT-python-api.zip/zoo/util/spark.py", line 268, in init_spark_on_k8s
  File "/ppml/trusted-big-data-ml/work/analytics-zoo-0.12.0-SNAPSHOT/lib/analytics-zoo-bigdl_0.13.0-spark_3.1.2-0.12.0-SNAPSHOT-python-api.zip/zoo/util/utils.py", line 145, in get_zoo_bigdl_classpath_on_driver
AssertionError: Cannot find BigDL classpath, please check your installation

qiuxin2012, Sep 22 '21 06:09

To work around this, export BIGDL_CLASSPATH (pointing at the BigDL jar) before running the command.
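
The same workaround can probably be applied from inside the driver script instead of the shell. This is a minimal sketch under two assumptions that this thread does not confirm: that the classpath lookup reads `BIGDL_CLASSPATH` when `init_orca_context` is called, and that `bigdl-jar-with-dependencies.jar` from the `--jars` option above is the jar it needs. `ensure_bigdl_classpath` is a hypothetical helper, not part of BigDL or Analytics Zoo.

```python
import os


def ensure_bigdl_classpath(jar_path, environ=os.environ):
    """Set BIGDL_CLASSPATH to jar_path unless it is already set.

    Hypothetical helper (not a BigDL / Analytics Zoo API). Returns the
    value of BIGDL_CLASSPATH that is in effect after the call.
    """
    current = environ.get("BIGDL_CLASSPATH")
    if current:
        # Respect a value exported in the shell before spark-submit.
        return current
    if not os.path.isfile(jar_path):
        # Fail early with a clearer message than the AssertionError above.
        raise FileNotFoundError("BigDL jar not found: %s" % jar_path)
    environ["BIGDL_CLASSPATH"] = jar_path
    return jar_path


# Intended use near the top of ncf-dataframe.py, before init_orca_context
# (the jar path is taken from the --jars option in the command above and is
# assumed to be the BigDL jar):
#
#   ensure_bigdl_classpath(
#       "/ppml/trusted-big-data-ml/work/bigdl-jar-with-dependencies.jar")
```

Because the command uses `--deploy-mode client`, the script runs in the driver process, so a variable set this way should be visible to the lookup in that process. Exporting the variable in the shell remains the workaround actually reported here.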

qiuxin2012, Sep 23 '21 02:09