Automate Katib Releases
Currently, to make Katib releases we have to follow this manual process: https://github.com/kubeflow/katib/tree/master/docs/release
We run the make release command, build and publish the release Docker images locally, and publish the Katib SDK version.
Since we build docker images locally, our release images don't support multi OS arch: https://hub.docker.com/layers/kubeflowkatib/katib-controller/v0.14.0/images/sha256-51ca80d6005010ff08853a5f7231158cb695ea899b623200076cbc01509fc0b5?context=repo.
The release process should be automated. For example, we can utilise GitHub Actions to make Katib releases.
cc @tenzen-y @johnugeorge
Love this feature? Give it a 👍 We prioritize the features with the most 👍
We can use a workflow_dispatch or release trigger in GHA
@andreyvelich Thanks for proposing this.
Since we build docker images locally, our release images don't support multi OS arch
That's right. For now, we can not release multi-platform images by that documentation's steps.
The release process should be automated. For example, we can utilise GitHub Actions to make Katib releases.
I agree with you.
We can use a workflow_dispatch or release trigger in GHA
I prefer to use the release trigger.
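With a release trigger, a workflow step could read the version from the tag ref instead of taking it as manual input. A minimal sketch, assuming the run is started by a release event (where GitHub sets GITHUB_REF to refs/tags/<tag_name>) and that Katib tags keep the v-prefixed form seen in v0.14.0. The function name release_version is hypothetical.

```shell
# Hypothetical helper: derive the release version from the ref that
# triggered the workflow. Assumes tags look like v0.14.0.
release_version() {
  ref="$1"
  case "$ref" in
    refs/tags/v*)
      # Strip the "refs/tags/" prefix, leaving e.g. "v0.14.0".
      printf '%s\n' "${ref#refs/tags/}"
      ;;
    *)
      # Anything else (branch refs, non-version tags) is rejected.
      echo "not a release tag: $ref" >&2
      return 1
      ;;
  esac
}

# Example use inside a workflow step:
#   VERSION="$(release_version "$GITHUB_REF")"
```

Rejecting non-tag refs is a design choice: it would keep a workflow_dispatch run on a branch from publishing images under a wrong tag.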
That's right. For now, we can not release multi-platform images by that documentation's steps. The release process should be automated. For example, we can utilise GitHub Actions to make Katib releases.
If we could prepare a self-hosted arm-machine runner (or use a GitHub Actions arm runner at extra charge), we could automate the release. How could we prepare the arm machine?
@anencore94
I mean we need to modify the make release command, since we cannot build multi-platform images using that command.
Or does that mean we should prepare arm-machine runners to run tests for the arm environment?
@tenzen-y I mean that if we prepare arm-machine runners, we could build arm-platform images in GitHub Actions workflows much more easily and then publish them under manifests that include both the amd and arm images. WDYT?
I'm not sure we need to enable make release to build multi-platform images locally, but I think it would be better to publish multi-platform images at release.
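To illustrate the manifest idea: if each runner pushes its own per-architecture image, a final step could combine them under one tag with docker buildx imagetools create. This is only a sketch. The -amd64 and -arm64 tag suffixes and the publish_manifest and run helpers are hypothetical, and the script only prints the command unless DRY_RUN=0 is set.

```shell
# Dry-run helper: print the command by default, execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical step: publish one manifest list that points at the
# per-architecture images pushed earlier by the amd64 and arm64 jobs.
# The "-amd64" / "-arm64" tag suffixes are an assumed naming scheme.
publish_manifest() {
  image="$1"
  tag="$2"
  run docker buildx imagetools create \
    --tag "${image}:${tag}" \
    "${image}:${tag}-amd64" \
    "${image}:${tag}-arm64"
}

# Example (dry run):
#   publish_manifest kubeflowkatib/katib-controller v0.14.0
```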
@anencore94 I see. We can build multiplatform images using the default amd64 runner. Actually, we publish multi-platform images for every commit like this.
Probably, we don't need arm64 runners for the multi-platform build.
Does that sound good to you?
We can build multiplatform images using the default amd64 runner. Actually, we publish multi-platform images for every commit like this. Probably, we don't need arm64 runners for the multi-platform build.
Sure, but building an arm image on an amd64 runner would be much slower, since it relies on an emulator such as QEMU to build the arm image. So if we could prepare an arm64 runner, that would be better. However, if that is not affordable, then yes, I agree to build it on the amd64 runner. @tenzen-y
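For reference, the emulated build on a single amd64 runner could look roughly like the sketch below. It assumes QEMU emulators are already registered on the runner and that a buildx builder able to produce multi-platform images is in use (see the comments). The build_multi_platform and run helpers and the build-context argument are hypothetical, and the command is only printed unless DRY_RUN=0 is set.

```shell
# Dry-run helper: print the command by default, execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Assumed one-time runner setup (not executed here):
#   docker run --privileged --rm tonistiigi/binfmt --install arm64
#   docker buildx create --use

# Hypothetical step: build and push one image for both platforms.
# On an amd64 runner the linux/arm64 part is built under QEMU emulation,
# which is the slow path discussed above.
build_multi_platform() {
  image="$1"
  tag="$2"
  context="$3"
  run docker buildx build \
    --platform linux/amd64,linux/arm64 \
    --tag "${image}:${tag}" \
    --push \
    "$context"
}

# Example (dry run):
#   build_multi_platform kubeflowkatib/katib-controller v0.14.0 .
```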
@anencore94 I see. That's a great idea, and I agree with it. It would speed up build time if we could prepare arm64 runners.
Maybe the docker buildx create --append command and a remote build instance could help us.
Using multiple native nodes provide better support for more complicated cases that are not handled by QEMU and generally have better performance. You can add additional nodes to the builder instance using the --append flag.
Assuming contexts node-amd64 and node-arm64 exist in docker context ls;
docker buildx create --use --name mybuild node-amd64
docker buildx create --append --name mybuild node-arm64
docker buildx build --platform linux/amd64,linux/arm64 .
https://docs.docker.com/build/building/multi-platform/#building-multi-platform-images
The Buildx remote driver allows for more complex custom build workloads, allowing you to connect to externally managed BuildKit instances. This is useful for scenarios that require manual management of the BuildKit daemon, or where a BuildKit daemon is exposed from another source.
docker buildx create \
  --name remote-unix \
  --driver remote \
  unix://$HOME/buildkitd.sock
https://docs.docker.com/build/drivers/remote/
@anencore94 @tenzen-y Currently, we are not using self-hosted runners. At some point we need to review whether we can use self-hosted runners in AWS.
I'm willing to help create the flows for the release. Do let me know if you guys need any help once we have some agreement on runners.
Hi @midhun1998, that would be great! Currently, we follow this manual process for our releases: https://github.com/kubeflow/katib/tree/master/docs/release. We can discuss how to automate it (e.g. using GitHub Actions) on the upcoming AutoML + Training WG Meeting.
I'd like to contribute to this automation too. See you at the next meeting :)
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
/remove-lifecycle stale
/lifecycle frozen /help
@andreyvelich: This request has been marked as needing help from a contributor.
Please ensure the request meets the requirements listed here.
If this request no longer meets these requirements, the label can be removed by commenting with the /remove-help command.
/good-first-issue
This is a good issue to work on if you are familiar with GitHub Actions and can help us automate releases for Katib/Training Operator. Feel free to propose your ideas/suggestions.
@andreyvelich: This request has been marked as suitable for new contributors.
Please ensure the request meets the requirements listed here.
If this request no longer meets these requirements, the label can be removed by commenting with the /remove-good-first-issue command.
/assign