Helm Chart Templates For Katib
What this PR does / why we need it: Helm Templates For Katib
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #
Checklist:
- [ ] Docs included if any changes are user facing
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign johnugeorge for approval. For more information see the Kubernetes Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
cc: @juliusvonkohout
@kunal-511: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.
In response to this:
/retest
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/ok-to-test
/retest
@juliusvonkohout should this PR be linked to https://github.com/kubeflow/community/pull/832?
cc @kunal-511
@kunal-511 let us know once this PR is ready for review. It seems there is a hold on it and two checks are failing.
Additionally, can you add a description and some comments to the PR? We have 58 files to review, and more comments would help us with the approval.
Also, I think it would be great to involve the Katib working group in this review. Thoughts, @juliusvonkohout? cc @andreyvelich
/ok-to-test
Below is the tree structure of the Kubeflow Katib Helm chart directory, with an explanation for each file and folder:
katib(helm charts)/
├── Chart.yaml # Helm chart metadata and version information
├── README.md # Chart documentation and usage instructions
├── values.yaml # Default configuration values for the chart
├── ci/ # CI/CD configuration files for testing
│ ├── values-cert-manager.yaml # Test values for cert-manager integration
│ ├── values-enterprise.yaml # Test values for enterprise deployment
│ ├── values-external-db.yaml # Test values for external database configuration
│ ├── values-kubeflow.yaml # Test values for Kubeflow integration
│ ├── values-leader-election.yaml # Test values for leader election setup
│ ├── values-openshift.yaml # Test values for OpenShift platform
│ ├── values-postgres.yaml # Test values for PostgreSQL database
│ ├── values-production.yaml # Test values for production deployment
│ └── values-standalone.yaml # Test values for standalone deployment
├── crds/ # Custom Resource Definitions for Katib
│ ├── experiment.yaml # CRD for hyperparameter tuning experiments
│ ├── suggestion.yaml # CRD for algorithm suggestions
│ └── trial.yaml # CRD for individual experiment trials
├── templates/ # Helm template files for Kubernetes resources
│ ├── _helpers.tpl # Template helper functions and common definitions
│ ├── autoscaling/ # Horizontal Pod Autoscaler configurations
│ │ └── hpa.yaml # HPA template for scaling components
│ ├── config/ # Configuration files and ConfigMaps
│ │ └── configmap.yaml # Main configuration settings for Katib
│ ├── controller/ # Katib controller deployment resources
│ │ ├── deployment.yaml # Controller deployment specification
│ │ ├── leader-election-rbac.yaml # RBAC for leader election functionality
│ │ ├── rbac.yaml # Role-based access control for controller
│ │ ├── service.yaml # Service for controller communication
│ │ ├── serviceaccount.yaml # ServiceAccount for controller pod
│ │ └── trial-templates-configmap.yaml # Trial template configurations
│ ├── database/ # Database deployment resources
│ │ ├── mysql-deployment.yaml # MySQL database deployment
│ │ ├── mysql-pvc.yaml # MySQL persistent volume claim
│ │ ├── mysql-service.yaml # MySQL service for database access
│ │ ├── postgres-deployment.yaml # PostgreSQL database deployment
│ │ ├── postgres-pvc.yaml # PostgreSQL persistent volume claim
│ │ ├── postgres-service.yaml # PostgreSQL service for database access
│ │ └── secret.yaml # Database credentials and secrets
│ ├── db-manager/ # Database manager component
│ │ ├── deployment.yaml # DB manager deployment specification
│ │ └── service.yaml # Service for DB manager communication
│ ├── db/ # Additional database configurations
│ ├── istio/ # Istio service mesh integration
│ │ ├── authorization-policy.yaml # Istio authorization policies
│ │ └── virtual-service.yaml # Istio virtual service configuration
│ ├── monitoring/ # Monitoring and observability resources
│ │ └── servicemonitor.yaml # Prometheus ServiceMonitor for metrics
│ ├── namespace/ # Namespace creation resources
│ │ └── namespace.yaml # Namespace definition for Katib
│ ├── openshift/ # OpenShift-specific configurations
│ ├── rbac/ # Additional RBAC configurations
│ │ └── kubeflow-roles.yaml # Kubeflow-specific role definitions
│ ├── security/ # Security-related configurations
│ │ ├── network-policy.yaml # Network policies for pod communication
│ │ └── pod-disruption-budget.yaml # Pod disruption budget for availability
│ ├── ui/ # Katib UI component resources
│ │ ├── deployment.yaml # UI deployment specification
│ │ ├── rbac.yaml # RBAC for UI component
│ │ ├── service.yaml # Service for UI access
│ │ └── serviceaccount.yaml # ServiceAccount for UI pod
│ └── webhook/ # Admission webhook configurations
│   ├── certificate.yaml # TLS certificates for webhook
│   ├── mutating-webhook.yaml # Mutating admission webhook
│   └── validating-webhook.yaml # Validating admission webhook
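To make the nine files under `ci/` easier to review, here is a minimal sketch of how each one could be rendered locally. It only builds `helm template` command lines, one per `ci/values-*.yaml` file, and does not run them. The function name `render_commands` and the release name `katib` are illustrative assumptions, not something defined in this PR, and the chart path has to be supplied by the caller.

```python
from pathlib import Path
from typing import List


def render_commands(chart_dir: str, release: str = "katib") -> List[List[str]]:
    """Build one `helm template` command per values file under <chart_dir>/ci.

    `release` is a hypothetical release name used only for rendering.
    """
    chart = Path(chart_dir)
    commands = []
    # Sort so the output order is stable across platforms.
    for values_file in sorted((chart / "ci").glob("values-*.yaml")):
        commands.append(
            ["helm", "template", release, str(chart), "--values", str(values_file)]
        )
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` on a machine with Helm installed, so a reviewer can diff the rendered manifests for, say, `values-standalone.yaml` against `values-postgres.yaml`.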
@juliusvonkohout @varodrig
@anencore94
@Electronic-Waste
Can you help here?
/ok-to-test
/ok-to-test
Great job @kunal-511, thank you so much for working on this. A lot of work and thought went into understanding the current Katib kustomize files, the installation, and Helm.
I appreciate the amazing work you have done.
We need @Electronic-Waste to do a final review. @Electronic-Waste, I left a few comments for you. Please review the ci files in detail. Good practices are implemented for security constraints, PDB, network policies, and service monitoring, so please make sure you are OK with this implementation.
@Electronic-Waste @andreyvelich can you review please? thank you
cc @juliusvonkohout
First the tests must be fixed and we need a rebase to master.
The rebase to master is done. Fixing the tests now.
/retest
@andreyvelich for approval