Publish Docker images to Apache repo in DockerHub

Open andygrove opened this issue 3 years ago • 2 comments

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
I would like users to be able to download Ballista Docker images instead of having to build them themselves.

Describe the solution you'd like
Publish Docker images to the Apache repo in DockerHub. Base this on how Spark does it?

Describe alternatives you've considered
None

Additional context
None

andygrove avatar Sep 17 '22 16:09 andygrove

I 1000% support this. Every other major software project has a pre-packaged Docker image you can just pull and run. Pushing to Docker Hub is easy; the harder part is that we need an Apache account with credentials, and then need to put that secret in CI. Even with the secret in CI, I don't know how we could prevent it from leaking via PRs, but I could look into it.

avantgardnerio avatar Sep 18 '22 15:09 avantgardnerio

I filed a ticket with INFRA: https://issues.apache.org/jira/browse/INFRA-23709

andygrove avatar Sep 21 '22 14:09 andygrove

It looks like the easiest path is to use GitHub Container Registry. This doesn't rule out publishing to Docker Hub in the future, but I recommend that we try this first.

We can use GITHUB_TOKEN within GitHub Actions for authentication.

https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry

andygrove avatar Oct 06 '22 13:10 andygrove

My understanding is that the Docker image URLs would be

ghcr.io/apache/arrow-ballista-scheduler
ghcr.io/apache/arrow-ballista-executor

We may also want to publish some client images to make it easier for users who don't already have appropriate Rust or Python setups.

ghcr.io/apache/arrow-ballista-cli
ghcr.io/apache/arrow-ballista-python
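
As a rough sketch of how these names would compose into pullable references, assuming each image is tagged with the release version (the tag scheme is an assumption here, and `image_ref` and `COMPONENTS` are hypothetical names used only for illustration):

```python
# Hypothetical helper: builds image references from the names proposed above.
# The ":<version>" tag suffix is an assumption, not something confirmed here.
REGISTRY = "ghcr.io"
NAMESPACE = "apache"
COMPONENTS = ("scheduler", "executor", "cli", "python")


def image_ref(component: str, version: str) -> str:
    """Return a reference such as ghcr.io/apache/arrow-ballista-scheduler:0.10.0."""
    if component not in COMPONENTS:
        raise ValueError(f"unknown component: {component!r}")
    return f"{REGISTRY}/{NAMESPACE}/arrow-ballista-{component}:{version}"


if __name__ == "__main__":
    # Print one reference per proposed image.
    for component in COMPONENTS:
        print(image_ref(component, "0.10.0"))
```

A user would then pass such a reference to `docker pull`.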

andygrove avatar Oct 06 '22 13:10 andygrove

Just a heads up: gcr.io is Google's container registry; ghcr.io is GitHub's.

tfeda avatar Oct 15 '22 14:10 tfeda

Here is the updated proposal for how this will work:

We can use a GitHub action to push to ghcr.io when we push release tags to the repo.

When we tag the repo with a release candidate tag e.g. 0.10.0-rc1, we start a vote that covers both the source release (tarball) and also the published Docker image release candidate.

Once the vote passes, we push the 0.10.0 tag, and this will cause the image to be published with the 0.10.0 version, and we can then announce that as an official release.
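
To make the tag flow concrete, here is a minimal sketch of how a publishing step might tell a release candidate tag from a final release tag. It assumes tags look exactly like `0.10.0-rc1` and `0.10.0`; the function name `classify_tag` and the returned fields are hypothetical, and the real workflow may decide this differently.

```python
import re

# Assumed tag shapes: "X.Y.Z" for a release, "X.Y.Z-rcN" for a release candidate.
TAG_PATTERN = re.compile(r"(\d+\.\d+\.\d+)(?:-rc(\d+))?")


def classify_tag(tag: str):
    """Return a dict describing the tag, or None if it is not a release-style tag."""
    match = TAG_PATTERN.fullmatch(tag)
    if match is None:
        return None  # e.g. a branch name or unrelated tag: publish nothing
    version, rc = match.groups()
    if rc is not None:
        # Release candidate: the image is published under the full rc tag and voted on.
        return {"kind": "release-candidate", "version": version, "rc": int(rc), "image_tag": tag}
    # Final release: pushed after the vote passes, published under the plain version.
    return {"kind": "release", "version": version, "rc": None, "image_tag": version}


if __name__ == "__main__":
    for tag in ("0.10.0-rc1", "0.10.0", "main"):
        print(tag, "->", classify_tag(tag))
```

Using a strict full match means an unexpected tag publishes nothing rather than producing an image under an unintended version.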

andygrove avatar Nov 09 '22 23:11 andygrove

@avantgardnerio fyi

andygrove avatar Nov 09 '22 23:11 andygrove

S&t will 100% volunteer for this 🤠

avantgardnerio avatar Nov 10 '22 03:11 avantgardnerio

Thanks @avantgardnerio. I assume that this is too much to try to get into 0.10.0 (which I am planning to release in the next few days), so I have tagged this for 0.11.0. Let me know if you want to try to do this for 0.10.0 (we can always delay the release to accommodate this).

andygrove avatar Nov 10 '22 15:11 andygrove

Adding for reference: currently blocked by https://issues.apache.org/jira/browse/INFRA-23897

avantgardnerio avatar Nov 15 '22 18:11 avantgardnerio