kuberhealthy
Manually trigger a check with an API call
Describe the feature you would like and why you want it
Given there is a KuberhealthyCheck called http, I want to trigger a check and get the result using only an API call (or two).
I want to do so even if I don't have access to the cluster itself and kuberhealthy is exposed using a loadbalancer.
An example of an async API could be:
- /trigger?name=http&namespace=app which would return a result ID
- /results/ID which would have the result
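The trigger-then-poll pattern proposed above can be sketched as a small in-memory model. This is only an illustration of the request flow, not an existing Kuberhealthy API: the class AsyncCheckAPI, its methods, and the result fields are all hypothetical, and a real service would need the authentication discussed later in this thread.

```python
import uuid
from typing import Callable, Dict, Tuple


class AsyncCheckAPI:
    """Hypothetical in-memory model of the two proposed endpoints."""

    def __init__(self, checks: Dict[Tuple[str, str], Callable[[], bool]]):
        # Maps (namespace, name) to a callable that runs the check.
        self._checks = checks
        self._pending: Dict[str, Tuple[str, str]] = {}
        self._results: Dict[str, dict] = {}

    def trigger(self, name: str, namespace: str) -> str:
        """Models /trigger?name=...&namespace=...; returns a result ID."""
        key = (namespace, name)
        if key not in self._checks:
            raise KeyError(f"no check {name!r} in namespace {namespace!r}")
        result_id = uuid.uuid4().hex
        self._pending[result_id] = key
        return result_id

    def run_pending(self) -> None:
        """Stands in for the background worker that actually runs checks."""
        for result_id, key in list(self._pending.items()):
            namespace, name = key
            ok = self._checks[key]()
            self._results[result_id] = {
                "name": name,
                "namespace": namespace,
                "ok": ok,
            }
            del self._pending[result_id]

    def result(self, result_id: str) -> dict:
        """Models /results/ID; reports pending until the check has run."""
        if result_id in self._pending:
            return {"status": "pending"}
        if result_id in self._results:
            return {"status": "done", **self._results[result_id]}
        raise KeyError(f"unknown result ID {result_id!r}")
```

A caller would trigger the check once, keep the returned ID, and poll result() until the status changes from pending to done.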
Additional context: #528 added support for the job CRD; this request adds two further features:
- By triggering an existing check, the user doesn't need to duplicate the podSpec
- By using an API the user doesn't need to have access to the cluster
Hey @OmerKahani - we've seen this feature request come through before. What keeps this from being an obvious add is the addition of authentication requirements. Any endpoint of Kuberhealthy that results in resources within the cluster being created needs to be authenticated to prevent abuse. We could have a service account credential (from k8s) passed to this new Kuberhealthy endpoint, but if the user has that, they probably can just talk to the kube-apiserver endpoint...
I feel like the best course of action here is still to just talk directly to kube-apiserver. In circumstances where that is unreachable, some other web service could be created to take requests from users and create KuberhealthyJob resources within the cluster. If that add-on web service existed, I would probably like that under the github.com/kuberhealthy organization, but I am not sure it would be appropriate to include in the main binary.
Hi @integrii, thanks for the response. I think this is similar to ArgoCD / Argo Workflow. Both have a UI that enables creation and deletion of objects in the cluster. They are also a good example of how a UI helps grow a community, as it offers a better UX than the command line.
I like the idea of having it in a different repo, and the requirement to have good user management upfront. This should also allow this feature to progress.
hey @integrii we at external-secrets had a similar feature request and solved it by allowing a user to simply put an annotation on a Custom Resource and this triggers a full reconciliation.
In the context of kuberhealthy this could also work.
kubectl annotate khcheck my-check force-sync=$(date +%s) --overwrite
- if no job runs then start a job
- when a job already runs re-queue (so it'll start after the current one)
This is just a thought. That allows us to leverage the kube-apiserver instead of having to implement tls, authn, authz etc.
Sidenote: cert-manager has a kubectl plugin. IIRC it uses .Status.Conditions to trigger a refresh. That's the same mechanics, just a different field.
original issue: https://github.com/external-secrets/external-secrets/issues/129#issuecomment-850146658
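The annotation-triggered behaviour described above (start a job if none is running, otherwise re-queue) can be sketched as a small state machine. This is a minimal sketch of the decision logic only, assuming the controller remembers the last force-sync value it acted on; CheckState, reconcile, and on_job_finished are hypothetical names, not part of Kuberhealthy or external-secrets.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckState:
    """Hypothetical controller-side view of one khcheck."""
    force_sync: Optional[str] = None    # current annotation value, if any
    last_handled: Optional[str] = None  # last annotation value acted upon
    running: bool = False               # a job is currently running
    queued: bool = False                # another run is waiting its turn


def reconcile(state: CheckState) -> str:
    """Decide what to do when the resource is observed."""
    # Only act when the annotation is present and has changed.
    if state.force_sync is None or state.force_sync == state.last_handled:
        return "noop"
    state.last_handled = state.force_sync
    if state.running:
        # A job already runs: re-queue so it starts after the current one.
        state.queued = True
        return "requeue"
    # No job runs: start one now.
    state.running = True
    return "start"


def on_job_finished(state: CheckState) -> str:
    """Start the queued run, if any, once the current job ends."""
    state.running = False
    if state.queued:
        state.queued = False
        state.running = True
        return "start"
    return "idle"
```

Because the kubectl command sets the annotation to the current timestamp, each invocation produces a new value, which is what lets the controller tell a fresh request from one it has already handled.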
Thanks for info. External-secrets looks pretty useful. I will keep that in my toolkit!
Revisiting my thoughts here, I agree with @OmerKahani that a UI would be awesome for Kuberhealthy. I also agree that it should be in another repository. We actually have some items listed in our milestones for the creation of a UI sometime in the future.
It seems like there is a bit of a chicken and an egg in the original proposal and this annotation workaround... This assumes that the khjob (or khcheck) exists already, which often times it will not...
Basically we would be going from this workflow:
- create a khjob
- let the job run
- check the khstate of the job
- if you want to run the job again, delete it and make it again
to this workflow:
- create a khjob
- let the job run
- check the khstate of the job
- if you want to run the job again, use kubectl annotate khjob my-check force-sync=$(date +%s) --overwrite
If you feel this is worth it, we could add this without much trouble (I think), but I feel like the improvement in this approach isn't huge...
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment on the issue or this will be closed in 15 days.
This issue was closed because it has been stalled for 15 days with no activity. Please reopen and comment on the issue if you believe it should stay open.