rudr
Health scope should provide APIs that return descriptive errors when deployments fail
- The health scope should have APIs that I can query after a deployment fails (assuming the YAML was valid) and that describe what went wrong with the deployment.
- For example, if I deploy an ApplicationConfiguration and there is an error, the health scope should be queryable and return information on which components failed to come up, plus some deeper information as to why.
- For instance, I run kubectl describe healthscope and it returns a description of the scope and the problems within that scope.
I have opened a PR (#473) that addresses how we report errors back to the user. My answer is k8s events.
In that PR I also give an example: when I deploy an appconfig with something wrong in it, I can use kubectl describe to see what went wrong.
$ kubectl describe cfg first-app
Name: first-app
Kind: ApplicationConfiguration
...
Spec:
Components:
Component Name: helloworld-python-v1
Instance Name: first-app-helloworld-python-v1
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning ApiError NotFound ("componentschematics.core.oam.dev \"helloworld-python-v1\" not found") 3s creating AppConfig first-app error
Yeah, I know that providing an API from the health scope would be better, but I don't think this is a high-priority issue.
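For anyone who wants the same information outside of kubectl describe, here is a minimal Python sketch that filters the JSON produced by kubectl get events -o json down to the Warning events for one object. It is not part of rudr or of PR #473. The function name warning_events and the SAMPLE data (including the reason and message strings) are made up for illustration; only the field names type, reason, message and involvedObject come from the standard Kubernetes Event object.

```python
import json


def warning_events(event_list, kind, name):
    """Return (reason, message) pairs for Warning events on one object.

    event_list is the parsed output of `kubectl get events -o json`,
    i.e. a dict with an "items" list of Event objects.
    """
    found = []
    for ev in event_list.get("items", []):
        obj = ev.get("involvedObject", {})
        if (ev.get("type") == "Warning"
                and obj.get("kind") == kind
                and obj.get("name") == name):
            found.append((ev.get("reason", ""), ev.get("message", "")))
    return found


# Hand-written sample shaped like an event list; the values are placeholders,
# not real rudr output.
SAMPLE = json.loads("""
{
  "items": [
    {"type": "Warning", "reason": "ExampleReason",
     "message": "example failure message",
     "involvedObject": {"kind": "ApplicationConfiguration", "name": "first-app"}},
    {"type": "Normal", "reason": "ExampleReason",
     "message": "example ok message",
     "involvedObject": {"kind": "ApplicationConfiguration", "name": "first-app"}},
    {"type": "Warning", "reason": "ExampleReason",
     "message": "event on another object",
     "involvedObject": {"kind": "Pod", "name": "some-pod"}}
  ]
}
""")

if __name__ == "__main__":
    for reason, message in warning_events(SAMPLE, "ApplicationConfiguration", "first-app"):
        print(f"{reason}: {message}")
```

To try it against a real cluster, save the output of kubectl get events -o json to a file and pass the result of json.load to warning_events in place of SAMPLE.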
I am troubleshooting a similar issue, but I see no events at this point with the alpha 1 version. Is there any way I can get the warning events you got from k8s today? Mine are empty. There are no deployments for this instance yet.
kubectl describe cfg bikesharing-app
Name: bikesharing-app
Namespace: default
Labels:
I think events need to be deduped, otherwise they expire?
Also, it seems that what was added here https://github.com/oam-dev/rudr/pull/473/files wasn't a summary of what really goes wrong during the deployment. It's more like a cfg log event.
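On the dedup question above, a rough Python sketch of what deduping could look like. Kubernetes Event objects carry count, firstTimestamp and lastTimestamp fields, and the API server only keeps events for a limited time, so a repeated failure is usually recorded by bumping the count and last-seen time on one event instead of creating a new one each time. Whether rudr does this is not established in this thread. StoredEvent, record_event and the in-memory store are hypothetical names used only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class StoredEvent:
    reason: str
    message: str
    count: int = 1
    first_seen: float = 0.0
    last_seen: float = 0.0


def record_event(store, obj_key, reason, message, now):
    """Record an event, merging repeats of the same (object, reason, message).

    store is a plain dict acting as a stand-in for the event backend.
    A repeat increments count and refreshes last_seen instead of adding
    a second entry.
    """
    key = (obj_key, reason, message)
    ev = store.get(key)
    if ev is None:
        ev = StoredEvent(reason, message, 1, now, now)
        store[key] = ev
    else:
        ev.count += 1
        ev.last_seen = now
    return ev


if __name__ == "__main__":
    store = {}
    record_event(store, "cfg/first-app", "ExampleReason", "example message", 1.0)
    ev = record_event(store, "cfg/first-app", "ExampleReason", "example message", 5.0)
    print(len(store), ev.count, ev.first_seen, ev.last_seen)
```

With this shape, a reconcile loop that keeps hitting the same error refreshes one entry rather than leaving a single old event to age out.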