docker-spark-cluster
Refactoring infrastructure
- Changed the base image to Oracle Enterprise Linux 8.
- Improved the use of variables.
- Reworked the installation of Python packages via pip.
- Reworked the installation of JVM components via SDKMAN (see the Dockerfile sketch below).
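A minimal sketch of how these changes could fit together in a single Dockerfile. The versions, paths, and `requirements.txt` file are placeholders, not the repository's actual values:

```dockerfile
# Sketch only: versions, paths, and the requirements file are assumptions.
FROM oraclelinux:8

# Centralize versions and paths in build arguments and environment variables.
ARG SPARK_VERSION=3.3.1
ARG JAVA_VERSION=11.0.20-amzn
ENV SDKMAN_DIR=/usr/local/sdkman \
    SPARK_HOME=/usr/local/sdkman/candidates/spark/current
ENV PATH="${SPARK_HOME}/bin:${SPARK_HOME}/sbin:${PATH}"

# OS dependencies needed by pip and by the SDKMAN installer.
RUN dnf install -y python3-pip curl zip unzip && dnf clean all

# Python packages installed via pip.
COPY requirements.txt /tmp/requirements.txt
RUN pip3 install --no-cache-dir -r /tmp/requirements.txt

# JVM components (Java, Spark) installed via SDKMAN.
RUN curl -s "https://get.sdkman.io" | bash \
 && sed -i 's/sdkman_auto_answer=false/sdkman_auto_answer=true/' "${SDKMAN_DIR}/etc/config" \
 && bash -c "source ${SDKMAN_DIR}/bin/sdkman-init.sh \
             && sdk install java ${JAVA_VERSION} \
             && sdk install spark ${SPARK_VERSION}"

WORKDIR ${SPARK_HOME}
```

Keeping versions and paths in `ARG`/`ENV` means bumping a Spark or Java version is a one-line change, and every later instruction picks it up automatically.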
I like how `$SPARK_HOME` is used here. It makes things clearer for me.
The main application in these containers is Spark, so it's important to know where it is installed. If you need to enter a container, the `SPARK_HOME` directory is used as the working directory on entry, thanks to `WORKDIR ${SPARK_HOME}`. This makes it easy to see the installed components with a simple `ls`, or to run commands from the relative path `./`. The main process startup scripts also use this variable instead of hardcoded absolute paths.
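As a hypothetical fragment (the foreground master command is an assumption, not necessarily this repository's actual entrypoint), this is how `WORKDIR` and a relative startup path interact:

```dockerfile
# Hypothetical fragment; ${SPARK_HOME} is set earlier in the image
# and the startup command is an assumption.
WORKDIR ${SPARK_HOME}

# Relative paths resolve against ${SPARK_HOME}: a shell opened with
# `docker exec -it <container> bash` starts here, and the master is
# launched without any hardcoded absolute path.
CMD ["./bin/spark-class", "org.apache.spark.deploy.master.Master"]
```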