katib icon indicating copy to clipboard operation
katib copied to clipboard

Support postgres as a katib db

Open anencore94 opened this issue 2 years ago • 14 comments

What this PR does / why we need it:

  • Implement postgres for katib backend and support it as a built-in feature of katib.
  • Add kustomization configurations for using postgres.
  • Support e2e test on both mysql and postgres.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged): Fixes #915

Checklist:

  • [ ] Docs included if any changes are user facing

anencore94 avatar Jul 30 '22 12:07 anencore94

Coverage Status

Coverage increased (+0.6%) to 73.433% when pulling 13cf73be7d5a2ab64430aca5c78ccb9c5727c798 on anencore94:feature/postgres-for-katib-backend-db into 42bc6a9d1151af0d666da2dc8d4cc7884b18b80b on kubeflow:master.

coveralls avatar Jul 30 '22 12:07 coveralls

/assign @johnugeorge @andreyvelich

gaocegege avatar Aug 02 '22 00:08 gaocegege

Should we keep add postgres support as a separate Kustomization file similar to katib-standalone?

https://github.com/kubeflow/katib/tree/master/manifests/v1beta1/installs

It is good to add this in GitHub actions. In that case, We have to add an extra option in the setup script to take db manager backend(mysql, postgres) https://github.com/kubeflow/katib/blob/master/.github/workflows/template-e2e-test/action.yaml#L27

What do you think?

johnugeorge avatar Aug 02 '22 10:08 johnugeorge

Should we keep add postgres support as a separate Kustomization file similar to katib-standalone? It is good to add this in GitHub actions. In that case, We have to add an extra option in the setup script to take db manager backend(mysql, postgres)

Yeap, I also think that looks more safe to use both Database :) So you mean that

  1. add another folder with kustomization configurations with postgres.
  2. add option to setup katib with postgres so that we could run all e2e test for both database. right ? @johnugeorge

anencore94 avatar Aug 05 '22 14:08 anencore94

Should we keep add postgres support as a separate Kustomization file similar to katib-standalone? It is good to add this in GitHub actions. In that case, We have to add an extra option in the setup script to take db manager backend(mysql, postgres)

Yeap, I also think that looks more safe to use both Database :) So you mean that

  1. add another folder with kustomization configurations with postgres.
  2. add option to setup katib with postgres so that we could run all e2e test for both database. right ? @johnugeorge

Yes. Correct.

johnugeorge avatar Aug 05 '22 16:08 johnugeorge

@johnugeorge Ok, I'll add that feature with this PR. Thanks for your comments :)

anencore94 avatar Aug 06 '22 04:08 anencore94

I'll remove WIP label when it's ready for review about e2e test.

anencore94 avatar Aug 08 '22 13:08 anencore94

It's hard to figure out why validating webhook doesn't work in e2e (https://github.com/kubeflow/katib/actions/runs/2825815578) with postgresql.

As you can see, other e2e tests succeess when I skip test to create invalid experiment : (https://github.com/kubeflow/katib/pull/1921/commits/68ba5957fa303488eca0f1270bb3f279d550a8d6). (https://github.com/kubeflow/katib/actions/runs/2831710091)

Is there any clue why this happens?

anencore94 avatar Aug 12 '22 04:08 anencore94

It's hard to figure out why validating webhook doesn't work in e2e (https://github.com/kubeflow/katib/actions/runs/2825815578) with postgresql.

As you can see, other e2e tests succeess when I skip test to create invalid experiment : (68ba595).

Is there any clue why this happens?

Did you face the same error on your local machine?

tenzen-y avatar Aug 12 '22 08:08 tenzen-y

Did you face the same error on your local machine?

No, In my local minikube environments, validatingwebhook works well : When I create invalid-expereiment.yaml, both enviroments(with mysql or with postgresql) reports expected error message as follows:

Error from server: error when creating "invalid-experiment.yaml": admission webhook "validator.experiment.katib.kubeflow.org" denied the request: unable to get Suggestion config data for algorithm invalid-algorithm: failed to find suggestion config for algorithm: invalid-algorithm in ConfigMap: katib-config

@tenzen-y

anencore94 avatar Aug 12 '22 12:08 anencore94

Did you face the same error on your local machine?

No, In my local minikube environments, validatingwebhook works well : When I create invalid-expereiment.yaml, both enviroments(with mysql or with postgresql) reports expected error message as follows:

Error from server: error when creating "invalid-experiment.yaml": admission webhook "validator.experiment.katib.kubeflow.org" denied the request: unable to get Suggestion config data for algorithm invalid-algorithm: failed to find suggestion config for algorithm: invalid-algorithm in ConfigMap: katib-config

@tenzen-y

@anencore94 I see. Can you insert kubectl logs -l "katib.kubeflow.org/component in (controller,db-manager,cert-generator)" -n kubeflow --all-containers=true to this section?

tenzen-y avatar Aug 12 '22 14:08 tenzen-y

@kubeflow/wg-automl-leads Can you re-kick fail GH-Action Jobs?

tenzen-y avatar Aug 15 '22 02:08 tenzen-y

@tenzen-y Thanks for your detailed reviews :) I applied your comments.

anencore94 avatar Aug 16 '22 11:08 anencore94

Great work. One question: Is it better to add readiness probe than static 30 sec sleep?

johnugeorge avatar Aug 17 '22 03:08 johnugeorge

Great work. One question: Is it better to add readiness probe than static 30 sec sleep?

Absolutely agree. I'll add such feature to katib-controller as a follow-up PR. @johnugeorge

anencore94 avatar Aug 17 '22 04:08 anencore94

Thanks /lgtm

johnugeorge avatar Aug 17 '22 04:08 johnugeorge

/approve

johnugeorge avatar Aug 19 '22 14:08 johnugeorge

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: anencore94, johnugeorge

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

google-oss-prow[bot] avatar Aug 19 '22 14:08 google-oss-prow[bot]

@anencore94 It might be better to add documentation about env vars for PostgreSQL to this section.

tenzen-y avatar Aug 21 '22 11:08 tenzen-y

@tenzen-y Sure, Thanks for checking :)

anencore94 avatar Aug 21 '22 11:08 anencore94