Yatai
Failed to run Yatai server in on-premise K8S
Hello, bentoML team.
I've recently been trying to use BentoML and Yatai on our on-premise K8S cluster, but it fails because we don't have a load balancer (LB) service on our cluster. Is there any guide or workaround for deploying Yatai on non-cloud K8S?
Thank you.
The following are a few error messages.
The error appears when I try to push a bento to Yatai (yatai login succeeds).
I found that bentoml push queries the pods named deployment-yatai-deployment-comp-operator under the yatai-operator namespace. Their logs show the following error, which says there is no externalIP in yatai-ingress-controller-ingress-nginx-controller:
2022-06-07T06:36:31.318Z INFO controller-runtime.manager.controller.deployment getting Deployment ... {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.318Z INFO controller-runtime.manager.controller.deployment Deployment getting successfully {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.318Z INFO controller-runtime.manager.controller.deployment creating namespace yatai-components ... {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.318Z INFO controller-runtime.manager.controller.deployment namespace yatai-components creation successfully {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.322Z INFO controller-runtime.manager.controller.deployment Installing CertManagerComponent ... {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.322Z INFO controller-runtime.manager.controller.deployment crd certificates.cert-manager.io already exists, so skipping install cert-manager {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.322Z INFO controller-runtime.manager.controller.deployment Installed CertManagerComponent successfully {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.325Z INFO controller-runtime.manager.controller.deployment Installing YataiDeploymentOperatorComponent ... {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.326Z INFO controller-runtime.manager.controller.deployment installing crd from file helm-charts/yatai-deployment-operator/crds/deployments.yaml ... {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.361Z INFO controller-runtime.manager.controller.deployment crd bentodeployments.serving.yatai.ai updated successfully {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.361Z INFO controller-runtime.manager.controller.deployment getting helm release yatai ... {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.368Z INFO controller-runtime.manager.controller.deployment found helm release yatai, status: deployed {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.369Z INFO controller-runtime.manager.controller.deployment Installed YataiDeploymentOperatorComponent successfully {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.373Z INFO controller-runtime.manager.controller.deployment Installing CSIDriverImagePopulatorComponent ... {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.373Z INFO controller-runtime.manager.controller.deployment getting helm release yatai-csi-driver-image-populator ... {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.376Z INFO controller-runtime.manager.controller.deployment found helm release yatai-csi-driver-image-populator, status: deployed {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.377Z INFO controller-runtime.manager.controller.deployment Installed CSIDriverImagePopulatorComponent successfully {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.380Z INFO controller-runtime.manager.controller.deployment Installing IngressControllerComponent ... {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.382Z INFO controller-runtime.manager.controller.deployment getting helm release yatai-ingress-controller ... {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.390Z INFO controller-runtime.manager.controller.deployment found helm release yatai-ingress-controller, status: failed {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.393Z INFO controller-runtime.manager.controller.deployment Installed IngressControllerComponent successfully {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.396Z INFO controller-runtime.manager.controller.deployment Installing MinioComponent ... {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.396Z INFO controller-runtime.manager.controller.deployment installing crd from file helm-charts/minio-operator/crds/minio.min.io_tenants.yaml ... {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.627Z INFO controller-runtime.manager.controller.deployment crd tenants.minio.min.io updated successfully {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.627Z INFO controller-runtime.manager.controller.deployment getting helm release yatai-minio ... {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.639Z INFO controller-runtime.manager.controller.deployment found helm release yatai-minio, status: failed {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.640Z INFO controller-runtime.manager.controller.deployment getting ingress-controller service external ip... {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.640Z ERROR controller-runtime.manager.controller.deployment getting ingress-controller service external ip failed {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": "", "error": "the external ip of service yatai-ingress-controller-ingress-nginx-controller on namespace yatai-components is empty!", "errorVerbose": "the external ip of service yatai-ingress-controller-ingress-nginx-controller on namespace yatai-components is empty!\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*IngressControllerComponent).getIngressControllerServiceIps\n\t/workspace/controllers/deployment_controller.go:294\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*MinioComponent).Install\n\t/workspace/controllers/deployment_controller.go:510\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*DeploymentReconciler).doReconcile\n\t/workspace/controllers/deployment_controller.go:211\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*DeploymentReconciler).Reconcile\n\t/workspace/controllers/deployment_controller.go:126\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371"}
github.com/bentoml/yatai-deployment-comp-operator/controllers.(*DeploymentReconciler).doReconcile
/workspace/controllers/deployment_controller.go:211
github.com/bentoml/yatai-deployment-comp-operator/controllers.(*DeploymentReconciler).Reconcile
/workspace/controllers/deployment_controller.go:126
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:298
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214
2022-06-07T06:36:31.641Z ERROR controller-runtime.manager.controller.deployment Failed to install MinioComponent {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": "", "error": "the external ip of service yatai-ingress-controller-ingress-nginx-controller on namespace yatai-components is empty!", "errorVerbose": "the external ip of service yatai-ingress-controller-ingress-nginx-controller on namespace yatai-components is empty!\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*IngressControllerComponent).getIngressControllerServiceIps\n\t/workspace/controllers/deployment_controller.go:294\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*MinioComponent).Install\n\t/workspace/controllers/deployment_controller.go:510\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*DeploymentReconciler).doReconcile\n\t/workspace/controllers/deployment_controller.go:211\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*DeploymentReconciler).Reconcile\n\t/workspace/controllers/deployment_controller.go:126\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371"}
github.com/bentoml/yatai-deployment-comp-operator/controllers.(*DeploymentReconciler).Reconcile
/workspace/controllers/deployment_controller.go:126
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:298
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214
2022-06-07T06:36:31.649Z ERROR controller-runtime.manager.controller.deployment Reconciler error {"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": "", "error": "the external ip of service yatai-ingress-controller-ingress-nginx-controller on namespace yatai-components is empty!", "errorVerbose": "the external ip of service yatai-ingress-controller-ingress-nginx-controller on namespace yatai-components is empty!\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*IngressControllerComponent).getIngressControllerServiceIps\n\t/workspace/controllers/deployment_controller.go:294\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*MinioComponent).Install\n\t/workspace/controllers/deployment_controller.go:510\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*DeploymentReconciler).doReconcile\n\t/workspace/controllers/deployment_controller.go:211\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*DeploymentReconciler).Reconcile\n\t/workspace/controllers/deployment_controller.go:126\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214
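For anyone triaging the same error: the message says the ingress controller Service has no external address. Below is a minimal Python sketch of that check. It assumes the relevant fields are the ones kubectl uses for its EXTERNAL-IP column (status.loadBalancer.ingress and spec.externalIPs); the operator's actual lookup is not shown in this thread. The function name external_addresses and the sample address 10.0.0.5 are made up for illustration. The input is the parsed output of kubectl get svc <name> -n <namespace> -o json.

```python
from typing import List


def external_addresses(service: dict) -> List[str]:
    """Return the external IPs/hostnames reported on a Service object (as a dict)."""
    # Addresses assigned by a load balancer provider live under status.loadBalancer.ingress.
    ingress = service.get("status", {}).get("loadBalancer", {}).get("ingress") or []
    addresses = [entry.get("ip") or entry.get("hostname") for entry in ingress]
    # Manually assigned addresses live under spec.externalIPs.
    addresses += service.get("spec", {}).get("externalIPs") or []
    return [a for a in addresses if a]


# Shape of a LoadBalancer Service that never received an address.
pending = {"spec": {"type": "LoadBalancer"}, "status": {"loadBalancer": {}}}

# Shape of the same Service once an address was assigned (hypothetical IP).
assigned = {
    "spec": {"type": "LoadBalancer"},
    "status": {"loadBalancer": {"ingress": [{"ip": "10.0.0.5"}]}},
}

print(external_addresses(pending))   # []
print(external_addresses(assigned))  # ['10.0.0.5']
```

On a cluster without a load balancer provider, a LoadBalancer Service typically stays in the first shape, which matches the empty external IP reported in the log.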
Hi @thechaos16! We don't see that case very often, but we've got a particular config that might help:
ingress:
  enabled: false
This disables creation of the ingress, which is meant for people who don't want to expose Yatai with an external IP. I'm not sure about the rest of your environment, but could you try that as a Helm option?
cc @yetone
@thechaos16 Thanks for your report! The Yatai deployment operators always need a load balancer. Another solution is to skip the built-in Minio and specify the S3 configuration manually:
https://github.com/bentoml/Yatai/blob/main/docs/admin-guide.md#aws-s3
Thank you for the quick reply.
@timliubentoml, I tried disabling ingress at https://github.com/bentoml/yatai-chart/blob/main/values.yaml#L91, but it still shows the same error. I guess updating the yatai-chart Helm chart cannot control the operators' setup.
@yetone, I passed the external S3 info by filling in the block at https://github.com/bentoml/yatai-chart/blob/main/values.yaml#L50-L58, but it still fails. Could you let me know if there is another way to avoid the built-in Minio? In my K8S dashboard, there are two pods (minio-operator and yatai-minio-console) running.
@thechaos16 There is an error in the docs: they show ENDPOINT set to https://s3.amazonaws.com, but you actually need to set it to s3.amazonaws.com, without the https:// prefix.
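To illustrate the endpoint fix, here is a small Python sketch that strips the scheme from an endpoint string. The helper name split_s3_endpoint is hypothetical, and treating the scheme as a hint for a separate TLS setting is an assumption, not something the Yatai docs state.

```python
from urllib.parse import urlsplit


def split_s3_endpoint(raw: str):
    """Split 'https://host[:port]' or 'host[:port]' into (endpoint, secure).

    secure is True/False when a scheme was given, and None when it was not
    (the caller then has to decide about TLS separately).
    """
    raw = raw.strip()
    if "://" in raw:
        parts = urlsplit(raw)
        # netloc keeps host and port, and drops the scheme and any path.
        return parts.netloc, parts.scheme == "https"
    return raw.rstrip("/"), None


print(split_s3_endpoint("https://s3.amazonaws.com"))  # ('s3.amazonaws.com', True)
print(split_s3_endpoint("s3.amazonaws.com"))          # ('s3.amazonaws.com', None)
```

The first element is the value to use for the endpoint setting.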
For me this was resolved after I deleted the default postgres PVC. The log reported an authentication failure for the user postgres:
$ kubectl logs pod/yatai-7f97bc87fb-qkc25 -n yatai-system
Error: migrate up db: cannot create migrate: pq: password authentication failed for user "postgres"
I deleted the whole Yatai installation, including the Yatai PostgreSQL.
$ kubectl create secret generic yatai-postgresql --from-literal=passwordExistingSecret=cqUIVv6S4q -n yatai-system
I copied the initial secret and created a new PostgreSQL secret. When I set existingSecret to the new secret, it logs in like a charm.
values.yaml
postgresql:
  enabled: true
  nameOverride: ""
  postgresqlUsername: postgres
  postgresqlDatabase: yatai
  ## In case of postgresql.enabled = true, allow the usage of existing secrets for postgresql
  ##
  existingSecret: yatai-postgresql #""
I managed to run it with this values.yaml. It didn't work when I only changed values.yaml and updated it through Argo CD.
$ kubectl create secret generic yatai-ceph-secret --from-literal=accesskey=access-key --from-literal=secretkey=secret-key -n yatai-system
values.yaml
externalS3:
  enabled: true #false
  endpoint: '192.168.*.*9:300*1' #my ceph object storage endpoint(or minio)
  region: ''
  bucketName: 'hgkim'
  secure: false #true
  existingSecret: 'yatai-ceph-secret'
  existingSecretAccessKeyKey: 'accesskey' #'access_key'
  existingSecretSecretKeyKey: 'secretkey' #'secret_key'
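The key names in the secret have to match existingSecretAccessKeyKey and existingSecretSecretKeyKey. As a rough sketch, this Python snippet builds the Secret object that the kubectl create secret generic command above would produce, so the keys can be compared side by side. The helper name generic_secret is made up for illustration, and the values are the placeholders from the command.

```python
import base64
import json


def generic_secret(name: str, namespace: str, literals: dict) -> dict:
    """Build an Opaque Secret manifest from plain-text key/value pairs."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        # Secret data values are stored base64-encoded.
        "data": {
            key: base64.b64encode(value.encode("utf-8")).decode("ascii")
            for key, value in literals.items()
        },
    }


secret = generic_secret(
    "yatai-ceph-secret",
    "yatai-system",
    {"accesskey": "access-key", "secretkey": "secret-key"},  # placeholder values
)
print(json.dumps(secret, indent=2))
```

The keys under data (accesskey, secretkey) are the names that the two existingSecret...Key settings refer to.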
After I run bentoml push, it shows up in the UI and in object storage under bentoml/default:
bentoml push iris_classifier:latest
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Successfully pushed model "iris_clf:h7hjmrr276ld23vw" │
│ Successfully pushed Bento "iris_classifier:khydmnr276cwg3vw" │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Pushing Bento "iris_classifier:khydmnr276cwg3vw" ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 5.8/5.8 kB • ? • 0:00:00
Uploading model "iris_clf:h7hjmrr276ld23vw" ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 2.0/2.0 kB • ? • 0:00:00