[bitnami/spark] Issue on spark write to container

Open • ashishmgofficial opened this issue 3 years ago • 0 comments

Name and Version

bitnami/spark:3.3.0

What steps will reproduce the bug?

Docker Compose:

services:
  spark:
    image: bitnami/spark:3.3.0
    user: root # Run the container as root, see https://docs.bitnami.com/tutorials/work-with-non-root-containers/
    hostname: spark
    environment:
      - SPARK_MODE=master
      - SPARK_RPC_AUTHENTICATION_ENABLED=no
      - SPARK_RPC_ENCRYPTION_ENABLED=no
      - SPARK_LOCAL_STORAGE_ENCRYPTION_ENABLED=no
      - SPARK_SSL_ENABLED=no
    volumes:
      - ./data:/opt/application/data
    ports:
      - "8181:8080"
      - "7077:7077"

  spark-worker-1:
    image: bitnami/spark:3.3.0
    user: root
    environment:
      - SPARK_MODE=worker
      - SPARK_MASTER_URL=spark://spark:7077
      - SPARK_WORKER_MEMORY=1G
      - SPARK_WORKER_CORES=1
      - SPARK_RPC_AUTHENTICATION_ENABLED=no
      - SPARK_RPC_ENCRYPTION_ENABLED=no
      - SPARK_LOCAL_STORAGE_ENCRYPTION_ENABLED=no
      - SPARK_SSL_ENABLED=no
    volumes:
      - ./data:/opt/application/data
    depends_on:
      - spark

When the Spark write command runs, it fails with:

java.io.FileNotFoundException: File ***** _temporary/0 does not exist
[2022-07-31, 16:42:26 UTC] {spark_submit.py:485} INFO - at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:597)

I am also getting "Mkdirs failed" errors for the same _temporary/0 directory:

ERROR FileOutputCommitter: Mkdirs failed to create file:/opt/application/data/processed/_temporary/0

My goal is to write to a location in the container that is mounted as a volume from my local system. On checking, it seems Bitnami Spark creates folders and writes files as the spark user. Maybe that's the issue?
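
One way to test the permissions theory is to check, from inside each container, which user the process runs as and whether that user can create directories under the mounted path. The following is only a minimal diagnostic sketch, assuming a Python interpreter is available in the container. The helper names `describe_write_access` and `probe_mkdir` are made up for illustration and are not part of Spark or the Bitnami image.

```python
import os
import tempfile


def describe_write_access(path):
    """Return ownership and permission details for an existing directory."""
    st = os.stat(path)
    return {
        "path": path,
        "owner_uid": st.st_uid,
        "owner_gid": st.st_gid,
        "mode": oct(st.st_mode & 0o777),
        "process_uid": os.getuid(),
        # Creating entries in a directory needs both write and execute bits.
        "can_write": os.access(path, os.W_OK | os.X_OK),
    }


def probe_mkdir(path):
    """Try to create and then remove a scratch subdirectory under `path`.

    Returns True if it worked and False on any OS-level error (missing
    parent, permission denied, read-only mount, ...).
    """
    try:
        scratch = tempfile.mkdtemp(prefix="probe_", dir=path)
    except OSError:
        return False
    os.rmdir(scratch)
    return True
```

Calling `describe_write_access("/opt/application/data")` and `probe_mkdir("/opt/application/data")` in both the master and the worker container would show whether the two see the mount with the same owner and whether either is refused when creating a subdirectory. If `probe_mkdir` returns False in one of them, that would be consistent with the "Mkdirs failed" error, though it would not prove it is the only cause.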

What is the expected behavior?

Successful write to a container location.

ashishmgofficial, Jul 31 '22 18:07