Initial runtime docker image files
This PR contains a build script, Dockerfile(s), README.md and supporting files to create a docker image that can be used to run .NET for Apache Spark in a container.
An initial description for the runtime image, along with the folder structure can be found here: https://github.com/indy-3rdman/spark/tree/runtime-docker-image/docker/images/runtime
There will be two other images, dev and interactive. All of them will reside under the /docker/images directory as shown below.
At the moment some files still include references to my original images on the docker hub and to some of my blog posts.
- As @rapoth asked me to get the PR process started, I'll update the image/container references as soon as I get them from him.
- Some of the READMEs link to further articles at https://3rdman.de. Not sure if it is OK to leave them in, though.
If you have any further questions, please let me know.
When I ran build.sh, the images built successfully. However, I encountered this:
Is this normal?
I've updated the Dockerfile to fix this. It should not have had any impact on the functionality of the image, however.
The initial commands in the readme did not work for me after building the container - latest wasn't tagged, but when I switched to the actual tag from docker images it worked fine. Maybe add a note on how to grab the tag?
docker images | grep 'mcr.microsoft.com/dotnet-spark' | awk '{print $2}'
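To make the suggested note concrete, here is a minimal sketch of resolving the tag with that filter. The sample output and image ID below are hypothetical stand-ins for what docker images would print against a real daemon:

```shell
# Hypothetical sample of local `docker images` output; with a running daemon
# you would pipe the real command instead of this here-string.
sample='REPOSITORY                       TAG                    IMAGE ID       CREATED       SIZE
mcr.microsoft.com/dotnet-spark   2.4.6-0.12.1-runtime   0123456789ab   2 hours ago   1.2GB
ubuntu                           20.04                  ba6acccedd29   3 weeks ago   72.8MB'

# Same filter as in the comment above: keep the dotnet-spark row and print the TAG column.
tag=$(printf '%s\n' "$sample" | grep 'mcr.microsoft.com/dotnet-spark' | awk '{print $2}')
echo "$tag"
```

The resolved tag can then be used wherever the readme currently assumes latest, e.g. docker run mcr.microsoft.com/dotnet-spark:"$tag".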
Thanks a lot @JeremyLikness for testing!
@rapoth is currently trying to figure out the official image name(s) that we are going to use and I will update the scripts once that has been clarified.
I was thinking about tagging latest as part of creating/pushing the manifest to the registry via the docker image build pipeline, in case we'd need to support different os/architecture combinations in the future. Something like:
docker manifest create mcr.microsoft.com/dotnet-spark:runtime-latest mcr.microsoft.com/dotnet-spark:2.4.6-0.12.1-runtime
docker manifest annotate mcr.microsoft.com/dotnet-spark:runtime-latest mcr.microsoft.com/dotnet-spark:2.4.6-0.12.1-runtime --os linux --arch amd64
docker manifest push mcr.microsoft.com/dotnet-spark:runtime-latest --purge
I'll update the readme to not use latest, as the build script supports specifying different versions of Apache Spark and .NET for Apache Spark. However, only an image with the latest versions of both should be tagged as latest, in my opinion.
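As a sketch of that policy, the pipeline could gate the latest tag on the version combination. The function name and the LATEST_* constants are hypothetical (not part of build.sh), and the version numbers are just illustrative, taken from the example tag above:

```shell
# Hypothetical sketch: only the newest Spark / .NET for Apache Spark combination
# gets the runtime-latest tag. LATEST_SPARK and LATEST_DOTNET_SPARK are assumed
# constants the pipeline would maintain.
LATEST_SPARK="2.4.6"
LATEST_DOTNET_SPARK="0.12.1"

maybe_tag_latest() {
    spark="$1"
    dotnet_spark="$2"
    tag="mcr.microsoft.com/dotnet-spark:${spark}-${dotnet_spark}-runtime"
    if [ "$spark" = "$LATEST_SPARK" ] && [ "$dotnet_spark" = "$LATEST_DOTNET_SPARK" ]; then
        # docker tag "$tag" mcr.microsoft.com/dotnet-spark:runtime-latest
        echo "tag ${tag} as runtime-latest"
    else
        echo "skip latest for ${tag}"
    fi
}

maybe_tag_latest "2.4.6" "0.12.1"
maybe_tag_latest "2.4.5" "0.12.1"
```

An older version combination would simply be skipped, so latest always points at one image per target.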