[ZEPPELIN-6067] Add docker-compose file for running with Spark
What is this PR for?
Provide a YAML file that creates Apache Zeppelin and Apache Spark containers together using the docker compose command.
What type of PR is it?
Improvement
Todos
- [x] - Update .gitignore
- [x] - Add docker-compose-with-spark.yml
- [x] - Update README.md
What is the Jira issue?
https://issues.apache.org/jira/projects/ZEPPELIN/issues/ZEPPELIN-6067
How should this be tested?
- Install Spark binary
cd scripts/docker/zeppelin-quick-start
wget https://archive.apache.org/dist/spark/spark-3.5.2/spark-3.5.2-bin-hadoop3.tgz
tar -xvf spark-3.5.2-bin-hadoop3.tgz
- docker compose -f docker-compose-with-spark.yml up
- Run paragraph
%spark.conf
SPARK_HOME /opt/spark
spark.master spark://spark-master:7077
- Run paragraph
%spark
val sdf = spark.createDataFrame(Seq((0, "park", 13, 70, "Korea"), (1, "xing", 14, 80, "China"), (2, "john", 15, 90, "USA"))).toDF("id", "name", "age", "score", "country")
sdf.printSchema
sdf.show()
- Result
root
|-- id: integer (nullable = false)
|-- name: string (nullable = true)
|-- age: integer (nullable = false)
|-- score: integer (nullable = false)
|-- country: string (nullable = true)
+---+----+---+-----+-------+
| id|name|age|score|country|
+---+----+---+-----+-------+
|  0|park| 13|   70|  Korea|
|  1|xing| 14|   80|  China|
|  2|john| 15|   90|    USA|
+---+----+---+-----+-------+
Screenshots (if appropriate)
- Zeppelin UI (http://localhost:8080)
- Spark Master UI (http://localhost:18080)
- Spark Worker UI (http://localhost:18081)
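As a complement to the screenshots, the three UIs can also be checked from a terminal once the containers are up. The following is a minimal sketch, not part of this PR: it assumes `curl` is installed on the host, and the helper names `check_ui` and `check_all` are hypothetical. The ports are the ones listed above.

```shell
#!/bin/sh
# Hypothetical helpers for a quick reachability check; not part of the PR.

# check_ui NAME URL: print OK/FAIL for one UI and return non-zero on failure.
check_ui() {
  name=$1
  url=$2
  # --fail makes curl return non-zero on HTTP errors (4xx/5xx);
  # --max-time bounds how long a single request may take.
  if curl --silent --fail --max-time 5 --output /dev/null "$url"; then
    echo "OK   $name ($url)"
  else
    echo "FAIL $name ($url)"
    return 1
  fi
}

# check_all: check the three UIs listed in this PR description.
check_all() {
  rc=0
  check_ui "Zeppelin UI" "http://localhost:8080" || rc=1
  check_ui "Spark Master UI" "http://localhost:18080" || rc=1
  check_ui "Spark Worker UI" "http://localhost:18081" || rc=1
  return $rc
}
```

After `docker compose -f docker-compose-with-spark.yml up` has finished starting the containers, sourcing this file and running `check_all` should print one line per UI. The containers may need some time before the UIs respond, so a first FAIL right after startup is not necessarily a problem.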
Questions:
- Do the license files need to be updated? No.
- Are there breaking changes for older versions? No.
- Does this need documentation? No.