manifests
manifests copied to clipboard
Refactor test workflows
Description of your changes:
Upgrade actions/checkout
to version v4
.
This is going to be required as v3
does not run on node20
which will be required in a while.
[edit] As discussed, this PR also contains some refactoring on the gh workflows, e.g., renaming and dependencies
Checklist:
- [x] Unit tests pass:
Make sure you have installed kustomize == 5.2.1+
-
make generate-changed-only
-
make test
-
Here the link to the Github statement https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/
- consistent nameing e.g. vwa_kind_test.yaml -> volumes-web-application.yaml
- nb_controller_m2m_kind_test.yaml -> notebook_controller.yaml
- consistent and complete paths for triggering the workflow
- merging similar tests for the same component. E.g. the m2m tests are usually newer and better.
For a second PR:
I also want to get rid of /go.mod /go.sum /.flake8 and /prow_config if they are not needed somewhere. /docs also looks crazy outdated. KustomizeBestPractices.md, ObsoleteKfctl.md, TestFramework.md and dex-auth might also be obsolete.
Hi @juliusvonkohout, I think I addressed most of the things we discussed in today meeting:
- Renamed workflows with a more meaningful convention, I preferred to keep
_test
in the name to differentiate any other gha that is not test releated (e.g.,triage_issue
or any other in future) - Fixed file dependencies that will trigger those workflows, I hope I did not miss anything but in the worst case they can be added as soon as they are noticed
- Upgrade actions/checkout action to
v4
The only thing I did not do in this PR is the workflow merge as I am not completely sure if two workflows can be really merged together, e.g., kserve
and kserve_m2m
.
Build & Apply BentoML Yatai Stack manifests in KinD / build (pull_request)
Looks like it is failing but I don't think it is related to this change.
@lampajr can you fix the bentoml tests?
@lampajr can you fix the bentoml tests?
I can try to take a look but looking at the previous runs of the test [1], it looks like it never worked :thinking:
Alternatively, we could create a different issue to fix bentoml test, as I think it is not directly related to this PR, wdyt?
[1] https://github.com/kubeflow/manifests/actions/workflows/bentoml_kind_test.yaml
Hey @juliusvonkohout,
I did some additional investigation and it looks like the contrib/bentoml
directory is pretty outdatated, thus in the logs I see:
2024-05-02T20:04:08.5386818Z 1.714680241638799e+09 ERROR controller-runtime.source if kind is a CRD, it should be installed before calling Start {"kind": "HorizontalPodAutoscaler.autoscaling", "error": "no matches for kind \"HorizontalPodAutoscaler\" in version \"autoscaling/v2beta2\""}
2024-05-02T20:04:08.5389859Z sigs.k8s.io/controller-runtime/pkg/source.(*Kind).Start.func1.1
2024-05-02T20:04:08.5391173Z /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/source/source.go:139
2024-05-02T20:04:08.5392765Z k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext
2024-05-02T20:04:08.5394133Z /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:235
2024-05-02T20:04:08.5395093Z k8s.io/apimachinery/pkg/util/wait.WaitForWithContext
2024-05-02T20:04:08.5396257Z /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:662
2024-05-02T20:04:08.5397237Z k8s.io/apimachinery/pkg/util/wait.poll
2024-05-02T20:04:08.5398196Z /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:596
2024-05-02T20:04:08.5399440Z k8s.io/apimachinery/pkg/util/wait.PollImmediateUntilWithContext
2024-05-02T20:04:08.5400688Z /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:547
2024-05-02T20:04:08.5401926Z sigs.k8s.io/controller-runtime/pkg/source.(*Kind).Start.func1
2024-05-02T20:04:08.5403159Z /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/source/source.go:132
2024-05-02T20:04:13.5355408Z + kubectl -n kubeflow wait --for=condition=available --timeout=600s deploy/fraud-detection
2024-05-02T20:04:13.5830234Z Error from server (NotFound): deployments.apps "fraud-detection" not found
2024-05-02T20:04:13.5852648Z make: *** [Makefile:22: test] Error 1
It looks like it is complaining about missing HorizontalPodAutoscaler
CRD in autoscaling/v2beta2
api version which should be present in older versions of Kubernetes (?) - not sure if that is the real root cause, but it could be.
Then, just as a test, I changed the version to (not the latest but at least newer versions):
BENTOML_YATAI_IMAGE_BUILDER_VERSION ?= 1.1.25
BENTOML_YATAI_DEPLOYMENT_VERSION ?= 1.1.21
And I re-generated the contrib/bentoml
resources:
modified: bentoml-yatai-stack/bases/yatai-deployment/resources.yaml
modified: bentoml-yatai-stack/bases/yatai-image-builder/resources.yaml
And with those new resources, the test seems working (at least the kubectl wait conditions are met).
In conclusion it seems like the issue is not in the test itself, but it looks like the bentoml
manifests are not really up to date, is this expected/wanted?
Anyway I think this could be a topic that should be discussed separately from this PR, unless we already know the answer (do you?).
Yes, please update the bentoml stuff, because we have to pass all tests for now. If it is a small change even in this PR here.
Yes, please update the bentoml stuff, because we have to pass all tests for now. If it is a small change even in this PR here.
I created a separate PR as it looks like the is an error while running the test.sh
script inside contrib/bentoml
. See https://github.com/kubeflow/manifests/pull/2704
Rebased after https://github.com/kubeflow/manifests/pull/2704 merge
/lgtm /approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: juliusvonkohout, lampajr
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~OWNERS~~ [juliusvonkohout]
Approvers can indicate their approval by writing /approve
in a comment
Approvers can cancel approval by writing /approve cancel
in a comment