gaffer-docker
Enable providing own Hadoop build for the PySpark notebook image
In the HDFS and Accumulo Dockerfiles, users can provide their own builds of Accumulo, ZooKeeper and Hadoop to be used instead of building them inside the image: https://github.com/gchq/gaffer-docker/blob/e26dbe7e0575d1bcc078a38a624032dfabe68f5d/docker/accumulo/Dockerfile#L50-L54 This can save a lot of time on repeated builds. However, this is not possible for the Hadoop build inside the PySpark notebook Dockerfile: https://github.com/gchq/gaffer-docker/blob/e26dbe7e0575d1bcc078a38a624032dfabe68f5d/docker/gaffer-pyspark-notebook/Dockerfile#L34-L39
It would be great if the same option were added to that Dockerfile as well.
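As a rough illustration of what is being asked for, here is a minimal sketch of a "use the provided tarball if present, otherwise download" step. It is not copied from the existing Dockerfiles: the stage name, base image, `HADOOP_VERSION` default, `HADOOP_DOWNLOAD_URL` argument, the `./files/` directory and the `/opt/hadoop` install path are all assumptions made for the example, and the real change would need to fit the notebook Dockerfile's existing build stages.

```dockerfile
# Hypothetical sketch only; names, paths and versions below are assumptions.
FROM debian:bookworm-slim AS hadoop-fetch

# Assumed build arguments; the actual Dockerfile may use different names or defaults.
ARG HADOOP_VERSION=3.3.3
ARG HADOOP_DOWNLOAD_URL=https://archive.apache.org/dist/hadoop/common/hadoop-${HADOOP_VERSION}/hadoop-${HADOOP_VERSION}.tar.gz

RUN apt-get update && \
    apt-get install -y --no-install-recommends ca-certificates wget && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /tmp/hadoop

# Assumed convention: users may drop hadoop-<version>.tar.gz into ./files/
# in the build context. The directory must exist, even if it holds no tarball.
COPY ./files/ ./

# Download Hadoop only when no user-provided tarball was copied in,
# then unpack whichever tarball is present.
RUN if [ ! -f "hadoop-${HADOOP_VERSION}.tar.gz" ]; then \
      wget -nv "${HADOOP_DOWNLOAD_URL}"; \
    fi && \
    mkdir -p /opt/hadoop && \
    tar -xzf "hadoop-${HADOOP_VERSION}.tar.gz" -C /opt/hadoop --strip-components=1 && \
    rm -f "hadoop-${HADOOP_VERSION}.tar.gz"
```

With a layout like this, a repeated build that finds a matching tarball in `./files/` would skip the download step entirely, while a build without one would behave as it does today.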