Add per-release docker images on DockerHub
What is the idea?
Thanks for ara, an amazing product for consolidating ansible run results from everywhere! I have a suggestion about the DockerHub image management. It would be great to publish images not only with the latest tag but also with tags for previous releases, so that local usage is more deterministic and there are no surprises when relying on the latest tag.
Hi @lordofire and thanks for the feedback.
I agree that it would be nice but the images are in fact somewhat limited by design in order to reduce the scope of support and maintenance since this isn't a funded or commercial project.
Users are encouraged to build and tweak their own optimized images starting from the scripts that are used to publish the existing images: https://github.com/ansible-community/ara/tree/master/contrib/container-images
I won't close the issue because that doesn't mean we won't do it one day, but in the meantime, for fewer surprises, you can use the "*-pypi-latest" tags (such as fedora35-pypi-latest), which are built from the latest version published to PyPI rather than the latest commit from source.
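For illustration, pulling that tag could look like this; the repository name below is an assumption on my part, so check DockerHub for the actual one:

```bash
# Pull the image built from the latest version published to PyPI
# (repository name assumed to be recordsansible/ara-api; verify on DockerHub)
docker pull recordsansible/ara-api:fedora35-pypi-latest

# Compare with the image built from the latest commit on master
docker pull recordsansible/ara-api:latest
```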
@dmsimard Hey! I kind of agree with @lordofire on this. I think it would be a great feature, especially for closed systems where we can't upgrade the ARA API to the latest version because of dependencies, etc.
But that does not mean old tags should keep being supported. I think that once a tag is released it is considered finished, and if a user wants additional functionality and/or a bug fix, they should update to a newer version.
I would also like that, because the current docker images are not usable in my company and we have to build an image ourselves.
> I agree that it would be nice but the images are in fact somewhat limited by design in order to reduce the scope of support and maintenance since this isn't a funded or commercial project.
@dmsimard: With the current approach we have to maintain 5 different docker images, but with this approach it could be reduced to one. Thus I would say that this will not increase the effort.
My idea would be:
- only build one Docker image, sourced from an official Python image, with a possibility to change the gunicorn settings
- when releasing a new version of ara, create a tag equal to the version and point latest at it as an alias
- optional: on each change in the master branch, push a new image with a "dev" tag.
I would guess that this would really help a lot of people, because it covers various use cases:
- you can always go with the latest release
- you can pin the version of your server, e.g. to stay compatible with the clients
- you can use docker images to directly test the newest development in the master branch
This approach would also reduce the complexity, because we can focus on building one container image.
Automatically creating tagged images with GitHub Actions is super easy. I can work on a PR if needed.
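To make the idea concrete, here is a rough sketch of the build/tag/push steps such a workflow could run. The repository name and version are placeholders, and this is not the project's current publishing tooling:

```bash
#!/bin/bash
# Hypothetical per-release publishing sketch -- not the project's actual tooling.
set -euo pipefail

ARA_VERSION="1.6.1"                       # placeholder: the version being released
IMAGE="docker.io/recordsansible/ara-api"  # placeholder repository name

# Build a single image from an official python base; the Dockerfile would need
# to accept ARA_VERSION and install that release from PyPI.
docker build --build-arg "ARA_VERSION=${ARA_VERSION}" -t "${IMAGE}:${ARA_VERSION}" .

# Point "latest" at the release tag so it stays an alias of the newest release.
docker tag "${IMAGE}:${ARA_VERSION}" "${IMAGE}:latest"

docker push "${IMAGE}:${ARA_VERSION}"
docker push "${IMAGE}:latest"
```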
@gaby we already build and publish new images on every commit, that part works fine (though it did break on a number of occasions) and it is designed to support any number of images.
I can tell you more about it if you're curious but tl;dr: it's an ansible playbook that wraps around simple scripts that build images with buildah in lieu of Dockerfiles. The script could build from a Dockerfile instead, it doesn't really matter ¯\\\_(ツ)\_/¯
The part that publishes the images is another playbook and we could add a version tag task in there.
I've tried to keep this short but felt I needed to elaborate, sorry.
> I would also like that, because the current docker images are not usable in my company and we have to build an image ourselves.
Hopefully the project has made it simple enough for you and others to do it. Something like teaching people to fish instead of giving them a fish? No offense!
What I mean is that I think it is a good practice to build your own images instead of pulling from a public registry in the context of supply chain security. I mean to encourage that, especially for a tool that can contain sensitive information like ara.
Images can be compromised in a number of spectacular, interesting or boring ways, and then every image that builds off of that image is automatically compromised. It's kind of scary. There are many documented cases of it happening; a quick search gave me this one, but there are different flavors to it (remember log4j, or left-pad disappearing?).
Tangentially related, there was recently an interesting case that compromised pytorch via GitHub pull requests: https://marcyoung.us/post/zuckerpunch/
Anyway, I guess what I am trying to say is that people should build their own images. We should show them how and make it as easy and simple as possible for them but these things are in contrib and published as a convenience because I am not interested in being on the hook for them in the current circumstances.
Hope that makes sense.
> @dmsimard: With the current approach we have to maintain 5 different docker images, but with this approach it could be reduced to one. Thus I would say that this will not increase the effort.
The number of images isn't as problematic as the expectation (and burden) of support and maintenance.
> My idea would be: * only build one Docker image, sourced from an official Python image, with a possibility to change the gunicorn settings
I am open to sharing this burden if you would like to help, so I have no problem with a PR that adds a script to build an image based off the python image. Feel free to take inspiration from the existing scripts, but it could build off of a Dockerfile if you really want to.
Tunable gunicorn settings would be useful as well, it could probably be an environment variable.
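As a rough idea of what that could look like (the variable names below are made up, and the ara.server.wsgi module path should be verified against the ara documentation), the image entrypoint could read environment variables with sane defaults:

```bash
#!/bin/bash
# Sketch: tunable gunicorn settings via environment variables.
# Variable names are hypothetical; verify the WSGI module path against the docs.
set -euo pipefail

exec gunicorn \
  --workers "${GUNICORN_WORKERS:-4}" \
  --timeout "${GUNICORN_TIMEOUT:-30}" \
  --bind "0.0.0.0:${ARA_API_PORT:-8000}" \
  ara.server.wsgi
```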
> * when releasing a new version of ara, create a tag equal to the version and point latest at it as an alias * optional: on each change in the master branch, push a new image with a "dev" tag.
I wrote a bit about how it works right now in a previous comment; the challenge is not technical because we already do something very similar to this.
> I would guess that this would really help a lot of people, because it covers various use cases: * you can always go with the latest release * you can pin the version of your server, e.g. to stay compatible with the clients
I feel this is already possible by pinning to a specific hash, like specifying :latest@sha256:758a16f46276567202a2b58239980cd9f3834b6da662fa2ad9cc1c5ee25fe3fd instead of :latest.
I also think it is a good practice to pin to a specific hash like that, to prevent your image from unexpectedly changing under your feet and leaving you to deal with a broken or vulnerable production.
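For example, using the digest quoted above (the repository name here is just an illustration, not necessarily the actual one):

```bash
# Pin to an immutable digest rather than a mutable tag
docker pull recordsansible/ara-api:latest@sha256:758a16f46276567202a2b58239980cd9f3834b6da662fa2ad9cc1c5ee25fe3fd

# Look up the digest of an image you have already pulled
docker inspect --format '{{index .RepoDigests 0}}' recordsansible/ara-api:latest
```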
> * you can use docker images to directly test the newest development in the master branch
There are currently images that build off of PyPI (so currently 1.5.8) and there are images that build off of source. They are tagged accordingly -- are we missing something?
> This approach would also reduce the complexity, because we can focus on building one container image.
That implies dropping the existing images and I don't think that's a great outcome.
I can review PRs and we can discuss the specifics there.
> we already build and publish new images on every commit, that part works fine (though it did break on a number of occasions) and it is designed to support any number of images.
There would be no need to build the pypi images on every commit. It would be enough to build them after a new ara version is released. That would avoid them changing so often.
> Anyway, I guess what I am trying to say is that people should build their own images. We should show them how and make it as easy and simple as possible for them but these things are in contrib and published as a convenience because I am not interested in being on the hook for them in the current circumstances.
Understood, but then the question is why we are pushing images to DockerHub at all? Also, the README on DockerHub says nothing about those images being just examples and that it would be better to build your own images. (In the documentation we have a section about that.)
> I feel this is already possible by pinning to a specific hash, like specifying :latest@sha256:758a16f46276567202a2b58239980cd9f3834b6da662fa2ad9cc1c5ee25fe3fd instead of :latest.
Yes, possible, but not really convenient or user friendly. As soon as new images are pushed, the digest is also no longer visible on DockerHub, because we always overwrite the tags.
> There are currently images that build off of PyPI (so currently 1.5.8) and there are images that build off of source. They are tagged accordingly -- are we missing something?
Yes, we are missing tagging the images with a specific ara version. Maybe someone wants to stay on 1.5.8 or wants to run an earlier version; currently they have no possibility to do so. As soon as 1.6.0 is released, fedora36-pypi-latest will come with that, without any transparency.
On the other hand we are tagging the Fedora Version (35, 36, ...), but who cares about that? Same for CentOS vs Fedora.
Don't get me wrong, I don't have a problem with building an image on my own, and for production I would probably do that anyway. But if we are already doing the work of providing docker images for users, we could do it in a more convenient way, so that the benefit for the user is much higher.
Coming back to this for 1.6.1: we started using image hashes after the 1.6.0 update, which broke everything ara-related due to it not being backward compatible with 1.5. I don't trust ara not to break between versions, and hashes are a pain to update. I have to go to docker hub, look for the specific tag I want, and search for the hash, instead of just s/1.6.0/1.6.1/.
Using versioned image tags is widely regarded as a good practice for reproducible and declarative deployments. This is especially helpful for Helm chart deployments, where the image tag and Helm chart version are typically updated together. Currently, with the default values of the helm chart (https://github.com/lib42/charts/blob/main/charts/ara/values.yaml#L3C13-L3C25), the deployed version of ara depends not only on when we do the chart installation, but also on when the ara pod happens to start or restart, and even on which node the ara pod happens to run! Worse, this means that forced (undesired) updates can happen at any time, and we have no way to roll back if a breaking change occurs because the previous version is not available. (Doesn't that increase the burden of support?)
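For instance, assuming the chart exposes the usual image.tag value (as the linked values.yaml suggests; "ara/ara" below is a placeholder chart reference), the deployed version could be pinned and bumped deliberately:

```bash
# Deploy a specific, known ara version instead of whatever "latest" happens to be
helm upgrade --install ara ara/ara --set image.tag=1.6.1

# Roll back by redeploying the previous version if the new release breaks something
helm upgrade --install ara ara/ara --set image.tag=1.6.0
```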
It is true that supply chain and other security considerations around image building are increasingly relevant, but can't this be like other aspects of open source development, where everyone benefits from more eyes looking and more people working together on a common solution, instead of everyone building their own separately?
Also, there could be a big warning that "images are provided as is, for convenience only"; people can make an informed choice and may choose to trust you anyway, but I hope that does not feel like a burden.
> This approach would also reduce the complexity, because we can focus on building one container image. That implies dropping the existing images and I don't think that's a great outcome.
I think the intent of the original comment might have been to encourage working together on one community image instead of everyone building their own.
Rather, the current situation where there are only 'latest' tags is itself a bit like dropping existing images, because the latest tag gets overwritten with a newer image, removing the pre-existing images.
I would add the possibility of building docker images for other architectures (e.g. I'm using ARM64 for development).
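A rough sketch of what a multi-arch build could look like with docker buildx (image name and tag are placeholders):

```bash
# Sketch: build and push a multi-architecture image with docker buildx
docker buildx create --use --name ara-builder
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -t recordsansible/ara-api:1.6.1 \
  --push .
```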
Additionally, I think it would be good to add an entrypoint script that provides some 'intelligent wrapping' when starting ARA:
For example, I personally think that support for providing sensitive information like ARA_DATABASE_PASSWORD or ARA_SECRET_KEY via docker secrets should be a must on any docker image, and it doesn't require dropping support for the existing config/env settings.
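As an illustration (a sketch, not ara's actual entrypoint), the entrypoint could support the common *_FILE convention so that docker secrets mounted under /run/secrets can populate the existing environment variables without breaking current env/config usage:

```bash
#!/bin/bash
# Sketch of an entrypoint that loads sensitive settings from docker secrets.
# If ARA_DATABASE_PASSWORD_FILE (or ARA_SECRET_KEY_FILE) points at a mounted
# secret file, export the corresponding variable from the file's contents;
# plain environment variables keep working as before.
set -euo pipefail

load_secret() {
    local var="$1" file_var="${1}_FILE"
    if [ -n "${!file_var:-}" ] && [ -f "${!file_var}" ]; then
        export "${var}"="$(cat "${!file_var}")"
    fi
}

load_secret ARA_DATABASE_PASSWORD
load_secret ARA_SECRET_KEY

# Hand off to the server process (command and module path are assumptions)
exec gunicorn --bind 0.0.0.0:8000 ara.server.wsgi
```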