katib
katib copied to clipboard
Support postgres as a katib db
What this PR does / why we need it:
- Implement postgres for katib backend and support it as a built-in feature of katib.
- Add kustomization configurations for using postgres.
- Support e2e test on both mysql and postgres.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):
Fixes #915
Checklist:
- [ ] Docs included if any changes are user facing
Coverage increased (+0.6%) to 73.433% when pulling 13cf73be7d5a2ab64430aca5c78ccb9c5727c798 on anencore94:feature/postgres-for-katib-backend-db into 42bc6a9d1151af0d666da2dc8d4cc7884b18b80b on kubeflow:master.
/assign @johnugeorge @andreyvelich
Should we keep add postgres support as a separate Kustomization file similar to katib-standalone?
https://github.com/kubeflow/katib/tree/master/manifests/v1beta1/installs
It is good to add this in GitHub actions. In that case, We have to add an extra option in the setup script to take db manager backend(mysql, postgres) https://github.com/kubeflow/katib/blob/master/.github/workflows/template-e2e-test/action.yaml#L27
What do you think?
Should we keep add postgres support as a separate Kustomization file similar to katib-standalone? It is good to add this in GitHub actions. In that case, We have to add an extra option in the setup script to take db manager backend(mysql, postgres)
Yeap, I also think that looks more safe to use both Database :) So you mean that
- add another folder with kustomization configurations with postgres.
- add option to setup katib with postgres so that we could run all e2e test for both database. right ? @johnugeorge
Should we keep add postgres support as a separate Kustomization file similar to katib-standalone? It is good to add this in GitHub actions. In that case, We have to add an extra option in the setup script to take db manager backend(mysql, postgres)
Yeap, I also think that looks more safe to use both Database :) So you mean that
- add another folder with kustomization configurations with postgres.
- add option to setup katib with postgres so that we could run all e2e test for both database. right ? @johnugeorge
Yes. Correct.
@johnugeorge Ok, I'll add that feature with this PR. Thanks for your comments :)
I'll remove WIP
label when it's ready for review about e2e test.
It's hard to figure out why validating webhook doesn't work in e2e (https://github.com/kubeflow/katib/actions/runs/2825815578) with postgresql.
As you can see, other e2e tests succeess when I skip test to create invalid experiment : (https://github.com/kubeflow/katib/pull/1921/commits/68ba5957fa303488eca0f1270bb3f279d550a8d6). (https://github.com/kubeflow/katib/actions/runs/2831710091)
Is there any clue why this happens?
It's hard to figure out why validating webhook doesn't work in e2e (https://github.com/kubeflow/katib/actions/runs/2825815578) with postgresql.
As you can see, other e2e tests succeess when I skip test to create invalid experiment : (68ba595).
Is there any clue why this happens?
Did you face the same error on your local machine?
Did you face the same error on your local machine?
No, In my local minikube environments, validatingwebhook works well : When I create invalid-expereiment.yaml, both enviroments(with mysql or with postgresql) reports expected error message as follows:
Error from server: error when creating "invalid-experiment.yaml": admission webhook "validator.experiment.katib.kubeflow.org" denied the request: unable to get Suggestion config data for algorithm invalid-algorithm: failed to find suggestion config for algorithm: invalid-algorithm in ConfigMap: katib-config
@tenzen-y
Did you face the same error on your local machine?
No, In my local minikube environments, validatingwebhook works well : When I create invalid-expereiment.yaml, both enviroments(with mysql or with postgresql) reports expected error message as follows:
Error from server: error when creating "invalid-experiment.yaml": admission webhook "validator.experiment.katib.kubeflow.org" denied the request: unable to get Suggestion config data for algorithm invalid-algorithm: failed to find suggestion config for algorithm: invalid-algorithm in ConfigMap: katib-config
@tenzen-y
@anencore94 I see.
Can you insert kubectl logs -l "katib.kubeflow.org/component in (controller,db-manager,cert-generator)" -n kubeflow --all-containers=true
to this section?
@kubeflow/wg-automl-leads Can you re-kick fail GH-Action Jobs?
@tenzen-y Thanks for your detailed reviews :) I applied your comments.
Great work. One question: Is it better to add readiness probe than static 30 sec sleep?
Great work. One question: Is it better to add readiness probe than static 30 sec sleep?
Absolutely agree. I'll add such feature to katib-controller as a follow-up PR. @johnugeorge
Thanks /lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: anencore94, johnugeorge
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~OWNERS~~ [johnugeorge]
Approvers can indicate their approval by writing /approve
in a comment
Approvers can cancel approval by writing /approve cancel
in a comment
@anencore94 It might be better to add documentation about env vars for PostgreSQL to this section.
@tenzen-y Sure, Thanks for checking :)