bundle-kubeflow

Stuck on Waiting for Units Settled Down - Istio Pilot

Open sys-ops opened this issue 2 years ago • 4 comments

The istio-pilot app in Kubeflow gets stuck with the message "waiting for units settled down".

I installed Kubeflow with Juju, following the steps described here: https://charmed-kubeflow.io/docs/install (Kubernetes 1.23.10 on Ubuntu 20.04).

$ juju add-k8s myk8s                                  # 26/09/22 14:05:30
$ juju bootstrap myk8s my-controller                  # 26/09/22 14:07:53
$ juju add-model kubeflow --debug --show-log          # 26/09/22 14:15:47
$ juju status                                         # 26/09/22 14:16:14
$ juju deploy kubeflow-lite --trust --debug --show-log  # 26/09/22 14:21:11
$ juju status                                         # 26/09/22 14:24:24

The current juju status shows that only the istio-pilot app is still waiting.

Model     Controller     Cloud/Region  Version  SLA          Timestamp
kubeflow  my-controller  myk8s         2.9.32   unsupported  14:44:41+02:00

App                        Version                    Status   Scale  Charm                    Channel         Rev  Address        Exposed  Message
admission-webhook          res:oci-image@84a4d7d      active       1  admission-webhook        1.6/stable       50  10.235.5.249   no
argo-controller            res:oci-image@669ebd5      active       1  argo-controller          3.3/stable       99                 no
dex-auth                                              active       1  dex-auth                 2.31/stable     129  10.235.29.122  no
istio-ingressgateway                                  active       1  istio-gateway            1.11/stable     114  10.235.18.33   no
istio-pilot                                           waiting      1  istio-pilot              1.11/stable     131  10.235.58.26   no       waiting for units settled down
jupyter-controller         res:oci-image@8f4ec33      active       1  jupyter-controller       1.6/stable      138                 no
jupyter-ui                 res:oci-image@cde6632      active       1  jupyter-ui               1.6/stable       99  10.235.22.193  no
kfp-api                    res:oci-image@1b44753      active       1  kfp-api                  2.0/stable       81  10.235.51.2    no
kfp-db                     mariadb/server:10.3        active       1  charmed-osm-mariadb-k8s  stable           35  10.235.54.104  no       ready
kfp-persistence            res:oci-image@31f08ad      active       1  kfp-persistence          2.0/stable       76                 no
kfp-profile-controller     res:oci-image@d86ecff      active       1  kfp-profile-controller   2.0/stable       61  10.235.48.169  no
kfp-schedwf                res:oci-image@51ffc60      active       1  kfp-schedwf              2.0/stable       80                 no
kfp-ui                     res:oci-image@55148fd      active       1  kfp-ui                   2.0/stable       80  10.235.43.43   no
kfp-viewer                 res:oci-image@7190aa3      active       1  kfp-viewer               2.0/stable       79                 no
kfp-viz                    res:oci-image@67e8b09      active       1  kfp-viz                  2.0/stable       74  10.235.21.155  no
kubeflow-dashboard         res:oci-image@6fe6eec      active       1  kubeflow-dashboard       1.6/stable      154  10.235.61.244  no
kubeflow-profiles          res:profile-image@0a46ffc  active       1  kubeflow-profiles        1.6/stable       82  10.235.39.226  no
kubeflow-roles                                        active       1  kubeflow-roles           1.6/stable       31  10.235.18.114  no
kubeflow-volumes           res:oci-image@cc5177a      active       1  kubeflow-volumes         1.6/stable       64  10.235.51.226  no
metacontroller-operator                               active       1  metacontroller-operator  2.0/stable       48  10.235.28.135  no
minio                      res:oci-image@1755999      active       1  minio                    ckf-1.6/stable   99  10.235.28.158  no
oidc-gatekeeper            res:oci-image@32de216      active       1  oidc-gatekeeper          ckf-1.6/stable   76  10.235.43.71   no
seldon-controller-manager  res:oci-image@eb811b6      active       1  seldon-core              1.14/stable      92  10.235.7.86    no
training-operator                                     active       1  training-operator        1.5/stable       65  10.235.9.251   no

Unit                          Workload  Agent  Address         Ports              Message
admission-webhook/0*          active    idle   10.235.79.98    4443/TCP
argo-controller/0*            active    idle   10.235.115.47
dex-auth/0*                   active    idle   10.235.72.72
istio-ingressgateway/0*       active    idle   10.235.115.40
istio-pilot/0*                waiting   idle   10.235.72.15                       Waiting for ip address
jupyter-controller/0*         active    idle   10.235.115.41
jupyter-ui/0*                 active    idle   10.235.118.178  5000/TCP
kfp-api/0*                    active    idle   10.235.115.46   8888/TCP,8887/TCP
kfp-db/0*                     active    idle   10.235.118.180  3306/TCP           ready
kfp-persistence/0*            active    idle   10.235.72.78
kfp-profile-controller/0*     active    idle   10.235.109.79   80/TCP
kfp-schedwf/0*                active    idle   10.235.79.101
kfp-ui/0*                     active    idle   10.235.118.183  3000/TCP
kfp-viewer/0*                 active    idle   10.235.72.11
kfp-viz/0*                    active    idle   10.235.109.76   8888/TCP
kubeflow-dashboard/0*         active    idle   10.235.72.13    8082/TCP
kubeflow-profiles/0*          active    idle   10.235.79.103   8080/TCP,8081/TCP
kubeflow-roles/0*             active    idle   10.235.79.99
kubeflow-volumes/0*           active    idle   10.235.109.78   5000/TCP
metacontroller-operator/0*    active    idle   10.235.72.73
minio/0*                      active    idle   10.235.72.77    9000/TCP,9001/TCP
oidc-gatekeeper/0*            active    idle   10.235.118.182  8080/TCP
seldon-controller-manager/0*  active    idle   10.235.109.77   8080/TCP,4443/TCP
training-operator/0*          active    idle   10.235.115.42
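With this many units, the one that is not active is easy to miss. A minimal sketch of a filter for the tabular `juju status` output follows; the function name `non_active_units` is hypothetical, and it assumes the Unit table starts with a `Unit` header line and ends at the first blank line, as in the output above.

```shell
#!/bin/sh
# Print "<unit> <workload-status>" for every unit whose workload is not "active".
# Reads the tabular output of `juju status` on stdin.
non_active_units() {
  awk '
    $1 == "Unit" { in_units = 1; next }    # header line of the Unit table
    in_units && NF == 0 { in_units = 0 }   # a blank line ends the table
    in_units && $2 != "active" {
      sub(/\*$/, "", $1)                   # drop the leader marker "*"
      print $1, $2
    }
  '
}
```

Usage would be `juju status | non_active_units`. For the status shown above this should print only `istio-pilot/0 waiting`.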

All pods in the kubeflow namespace are up and running, including istio-pilot-0.

$ kubectl get pod -n kubeflow -o wide
NAME                                             READY   STATUS    RESTARTS      AGE     IP               NODE              NOMINATED NODE   READINESS GATES
admission-webhook-5b67c68685-h6q67               1/1     Running   0             2d      10.235.79.98     devgto-worker-0   <none>           <none>
admission-webhook-operator-0                     1/1     Running   0             2d      10.235.79.184    devgto-worker-2   <none>           <none>
argo-controller-5bf749d697-6jdpf                 1/1     Running   0             2d      10.235.115.47    devgto-worker-7   <none>           <none>
argo-controller-operator-0                       1/1     Running   0             2d      10.235.118.176   devgto-worker-3   <none>           <none>
dex-auth-0                                       2/2     Running   0             2d      10.235.72.72     devgto-worker-4   <none>           <none>
istio-ingressgateway-0                           1/1     Running   0             2d      10.235.115.40    devgto-worker-7   <none>           <none>
istio-ingressgateway-workload-7cbf5464b7-b4rt5   1/1     Running   0             2d      10.235.115.45    devgto-worker-7   <none>           <none>
istio-pilot-0                                    1/1     Running   0             5h27m   10.235.72.15     devgto-worker-1   <none>           <none>
istiod-8f9d76cdb-wj8rg                           1/1     Running   0             2d      10.235.79.187    devgto-worker-2   <none>           <none>
jupyter-controller-c486f467b-v9sr6               1/1     Running   0             2d      10.235.115.41    devgto-worker-7   <none>           <none>
jupyter-controller-operator-0                    1/1     Running   0             2d      10.235.79.185    devgto-worker-2   <none>           <none>
jupyter-ui-5674b7b859-wcskr                      1/1     Running   0             2d      10.235.118.178   devgto-worker-3   <none>           <none>
jupyter-ui-operator-0                            1/1     Running   0             2d      10.235.118.177   devgto-worker-3   <none>           <none>
kfp-api-6898dc6956-pl5bd                         1/1     Running   0             2d      10.235.115.46    devgto-worker-7   <none>           <none>
kfp-api-operator-0                               1/1     Running   0             2d      10.235.79.186    devgto-worker-2   <none>           <none>
kfp-db-0                                         1/1     Running   0             2d      10.235.118.180   devgto-worker-3   <none>           <none>
kfp-db-operator-0                                1/1     Running   0             2d      10.235.72.75     devgto-worker-4   <none>           <none>
kfp-persistence-855d7b9667-xsv9s                 1/1     Running   0             2d      10.235.72.78     devgto-worker-4   <none>           <none>
kfp-persistence-operator-0                       1/1     Running   0             2d      10.235.109.74    devgto-worker-5   <none>           <none>
kfp-profile-controller-657c457c47-vvq2k          1/1     Running   0             2d      10.235.109.79    devgto-worker-5   <none>           <none>
kfp-profile-controller-operator-0                1/1     Running   0             2d      10.235.79.100    devgto-worker-0   <none>           <none>
kfp-schedwf-744d845449-hdp42                     1/1     Running   0             2d      10.235.79.101    devgto-worker-0   <none>           <none>
kfp-schedwf-operator-0                           1/1     Running   0             2d      10.235.118.179   devgto-worker-3   <none>           <none>
kfp-ui-76d669cdb6-86xd7                          1/1     Running   0             2d      10.235.118.183   devgto-worker-3   <none>           <none>
kfp-ui-operator-0                                1/1     Running   0             2d      10.235.72.10     devgto-worker-1   <none>           <none>
kfp-viewer-69b8f759cf-62mdp                      1/1     Running   0             2d      10.235.72.11     devgto-worker-1   <none>           <none>
kfp-viewer-operator-0                            1/1     Running   0             2d      10.235.109.75    devgto-worker-5   <none>           <none>
kfp-viz-67d8cd48f7-fwl4f                         1/1     Running   0             2d      10.235.109.76    devgto-worker-5   <none>           <none>
kfp-viz-operator-0                               1/1     Running   0             2d      10.235.72.74     devgto-worker-4   <none>           <none>
kubeflow-dashboard-6c6b4f744-8xmn5               1/1     Running   0             2d      10.235.72.13     devgto-worker-1   <none>           <none>
kubeflow-dashboard-operator-0                    1/1     Running   0             2d      10.235.72.76     devgto-worker-4   <none>           <none>
kubeflow-profiles-5668f4f8cf-xfnvf               2/2     Running   0             2d      10.235.79.103    devgto-worker-0   <none>           <none>
kubeflow-profiles-operator-0                     1/1     Running   0             2d      10.235.72.12     devgto-worker-1   <none>           <none>
kubeflow-roles-0                                 1/1     Running   0             2d      10.235.79.99     devgto-worker-0   <none>           <none>
kubeflow-volumes-75b44964c9-ssgxh                1/1     Running   0             2d      10.235.109.78    devgto-worker-5   <none>           <none>
kubeflow-volumes-operator-0                      1/1     Running   0             2d      10.235.79.102    devgto-worker-0   <none>           <none>
metacontroller-operator-0                        1/1     Running   0             2d      10.235.72.73     devgto-worker-4   <none>           <none>
metacontroller-operator-charm-0                  1/1     Running   0             2d      10.235.72.9      devgto-worker-1   <none>           <none>
minio-0                                          1/1     Running   0             2d      10.235.72.77     devgto-worker-4   <none>           <none>
minio-operator-0                                 1/1     Running   0             2d      10.235.118.181   devgto-worker-3   <none>           <none>
modeloperator-94b59f649-fxsck                    1/1     Running   0             2d      10.235.79.97     devgto-worker-0   <none>           <none>
oidc-gatekeeper-78969987b5-9d2lt                 1/1     Running   0             2d      10.235.118.182   devgto-worker-3   <none>           <none>
oidc-gatekeeper-operator-0                       1/1     Running   0             2d      10.235.115.44    devgto-worker-7   <none>           <none>
seldon-controller-manager-546d45c8b7-swwhd       1/1     Running   1 (13h ago)   2d      10.235.109.77    devgto-worker-5   <none>           <none>
seldon-controller-manager-operator-0             1/1     Running   0             2d      10.235.115.43    devgto-worker-7   <none>           <none>
training-operator-0                              2/2     Running   0             2d      10.235.115.42    devgto-worker-7   <none>           <none>

I have deleted the istio-pilot-0 pod, but that did not solve the "waiting" problem.

Here are some logs from after the pod was restarted:

$ k logs -f -n kubeflow istio-pilot-0
Defaulted container "charm" out of: charm, charm-init (init)
2022-09-28 07:19:43 INFO juju.cmd supercommand.go:56 running containerAgent [2.9.32 c360f3d92b40458cf15512a7fe5eddb0e7ae57b2 gc go1.18.3]
starting containeragent unit command
containeragent unit "unit-istio-pilot-0" start (2.9.32 [gc])
2022-09-28 07:19:43 INFO juju.cmd.containeragent.unit runner.go:556 start "unit"
2022-09-28 07:19:43 INFO juju.worker.upgradesteps worker.go:60 upgrade steps for 2.9.32 have already been run.
2022-09-28 07:19:43 INFO juju.worker.probehttpserver server.go:157 starting http server on [::]:3856
2022-09-28 07:19:43 INFO juju.api apiclient.go:688 connection established to "wss://controller-service.controller-my-controller.svc.cluster.local:17070/model/4e703fa3-7139-4657-8e51-5e2525440518/api"
2022-09-28 07:19:43 INFO juju.worker.apicaller connect.go:163 [4e703f] "unit-istio-pilot-0" successfully connected to "controller-service.controller-my-controller.svc.cluster.local:17070"
2022-09-28 07:19:43 INFO juju.api apiclient.go:1055 cannot resolve "controller-service.controller-my-controller.svc.cluster.local": lookup controller-service.controller-my-controller.svc.cluster.local: operation was canceled
2022-09-28 07:19:43 INFO juju.api apiclient.go:688 connection established to "wss://10.235.48.109:17070/model/4e703fa3-7139-4657-8e51-5e2525440518/api"
2022-09-28 07:19:43 INFO juju.worker.apicaller connect.go:163 [4e703f] "unit-istio-pilot-0" successfully connected to "10.235.48.109:17070"
2022-09-28 07:19:43 INFO juju.worker.caasupgrader upgrader.go:113 abort check blocked until version event received
2022-09-28 07:19:43 INFO juju.worker.caasupgrader upgrader.go:119 unblocking abort check
2022-09-28 07:19:43 INFO juju.worker.migrationminion worker.go:142 migration phase is now: NONE
2022-09-28 07:19:43 INFO juju.worker.logger logger.go:120 logger worker started
2022-09-28 07:19:43 WARNING juju.worker.proxyupdater proxyupdater.go:282 unable to set snap core settings [proxy.http= proxy.https= proxy.store=]: exec: "snap": executable file not found in $PATH, output: ""
2022-09-28 07:19:43 INFO juju.worker.leadership tracker.go:194 istio-pilot/0 promoted to leadership of istio-pilot
2022-09-28 07:19:43 INFO juju.agent.tools symlinks.go:20 ensure jujuc symlinks in /var/lib/juju/tools/unit-istio-pilot-0
2022-09-28 07:19:43 INFO juju.worker.uniter uniter.go:329 unit "istio-pilot/0" started
2022-09-28 07:19:43 INFO juju.worker.uniter uniter.go:347 hooks are retried true
2022-09-28 07:19:44 INFO juju.worker.uniter.charm bundles.go:78 downloading ch:amd64/focal/istio-pilot-131 from API server
2022-09-28 07:19:44 INFO juju.downloader download.go:110 downloading from ch:amd64/focal/istio-pilot-131
2022-09-28 07:19:48 INFO juju.downloader download.go:93 download complete ("ch:amd64/focal/istio-pilot-131")
2022-09-28 07:19:49 INFO juju.downloader download.go:173 download verified ("ch:amd64/focal/istio-pilot-131")
2022-09-28 07:20:08 INFO juju.worker.uniter resolver.go:148 found queued "upgrade-charm" hook
2022-09-28 07:20:10 INFO juju-log Running legacy hooks/upgrade-charm.
2022-09-28 07:20:14 INFO juju.worker.uniter.operation runhook.go:146 ran "upgrade-charm" hook (via hook dispatching script: dispatch)
2022-09-28 07:20:14 INFO juju.worker.uniter resolver.go:148 found queued "config-changed" hook
2022-09-28 07:20:17 INFO juju.worker.uniter.operation runhook.go:146 ran "config-changed" hook (via hook dispatching script: dispatch)
2022-09-28 07:20:17 INFO juju.worker.uniter resolver.go:76 reboot detected; triggering implicit start hook to notify charm
2022-09-28 07:20:18 INFO juju-log Running legacy hooks/start.
2022-09-28 07:20:22 INFO juju.worker.uniter.operation runhook.go:146 ran "start" hook (via hook dispatching script: dispatch)
2022-09-28 07:25:03 INFO juju-log No gateway-info relation found
2022-09-28 07:25:03 INFO juju.worker.uniter.operation runhook.go:146 ran "update-status" hook (via hook dispatching script: dispatch)
2022-09-28 07:30:12 INFO juju-log No gateway-info relation found
2022-09-28 07:30:12 INFO juju.worker.uniter.operation runhook.go:146 ran "update-status" hook (via hook dispatching script: dispatch)
2022-09-28 07:34:15 INFO juju-log No gateway-info relation found
2022-09-28 07:34:15 INFO juju.worker.uniter.operation runhook.go:146 ran "update-status" hook (via hook dispatching script: dispatch)
2022-09-28 07:40:00 INFO juju-log No gateway-info relation found
2022-09-28 07:40:01 INFO juju.worker.uniter.operation runhook.go:146 ran "update-status" hook (via hook dispatching script: dispatch)

I have found one similar issue: https://github.com/canonical/bundle-kubeflow/issues/469 "Stuck on Waiting for Istio Pilot information"

However, that solution does not apply to my case, because I installed Kubeflow on a "clean" Kubernetes cluster. By "clean" I mean that there was no Istio installation in the cluster at all, and no Istio objects existed before the Kubeflow installation.

Anyway, I took a look at some istio Roles.

$ kubectl get role -n kubeflow istio-ingressgateway -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  annotations:
    controller.juju.is/id: 22411adb-dcc5-4bfc-8a3c-b905f121db68
    juju.is/version: 2.9.32
    model.juju.is/id: 4e703fa3-7139-4657-8e51-5e2525440518
  creationTimestamp: "2022-09-26T12:22:02Z"
  labels:
    app.kubernetes.io/managed-by: juju
    app.kubernetes.io/name: istio-ingressgateway
  name: istio-ingressgateway
  namespace: kubeflow
  resourceVersion: "1393849"
  uid: 8b628c75-18ea-49c9-abc0-1ff39cd4deca
rules:
- apiGroups:
  - '*'
  resources:
  - '*'
  verbs:
  - '*'
 
$ kubectl get role -n kubeflow istio-ingressgateway-workload-sds -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  creationTimestamp: "2022-09-26T12:26:01Z"
  labels:
    app.juju.is/created-by: istio-ingressgateway
    install.operator.istio.io/owning-resource: unknown
    istio.io/rev: default
    operator.istio.io/component: IngressGateways
    release: istio
  name: istio-ingressgateway-workload-sds
  namespace: kubeflow
  resourceVersion: "1396520"
  uid: 7aafa544-4a33-47eb-a3e5-20613a9eacf7
rules:
- apiGroups:
  - ""
  resources:
  - secrets
  verbs:
  - get
  - watch
  - list
  
  
$ kubectl patch role -n kubeflow istio-ingressgateway-workload-sds -p '{"apiVersion":"rbac.authorization.k8s.io/v1","kind":"Role","metadata":{"name":"istio-ingressgateway-workload-sds"},"rules":[{"apiGroups":["*"],"resources":["*"],"verbs":["*"]}]}'

role.rbac.authorization.k8s.io/istio-ingressgateway-workload-sds patched

$ kubectl get role -n kubeflow istio-ingressgateway-workload-sds -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  creationTimestamp: "2022-09-26T12:26:01Z"
  labels:
    app.juju.is/created-by: istio-ingressgateway
    install.operator.istio.io/owning-resource: unknown
    istio.io/rev: default
    operator.istio.io/component: IngressGateways
    release: istio
  name: istio-ingressgateway-workload-sds
  namespace: kubeflow
  resourceVersion: "2301670"
  uid: 7aafa544-4a33-47eb-a3e5-20613a9eacf7
rules:
- apiGroups:
  - '*'
  resources:
  - '*'
  verbs:
  - '*'

Patching istio-ingressgateway-workload-sds Role did not solve the "waiting" problem.

What else could I do?

sys-ops, Sep 29 '22 07:09

Hello @sys-ops, Istio requires a load-balancing service to enable the gateway. Is it possible that your k8s cluster has no load balancer?

If you cannot provide load balancing, you can patch the istio-ingressgateway service to another type, such as NodePort.
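A minimal sketch of such a patch, assuming a merge patch on `spec.type` is enough to switch the Service type. The function name `set_service_type` is hypothetical, and the exact Service to patch depends on the deployment (later in this thread it is `istio-ingressgateway-workload`).

```shell
#!/bin/sh
# Switch a Service to another type (for example NodePort) with a merge patch.
# Usage: set_service_type <namespace> <service> <type>
set_service_type() {
  ns=$1; svc=$2; type=$3
  kubectl patch service "$svc" -n "$ns" --type merge \
    -p "{\"spec\":{\"type\":\"$type\"}}"
}
```

For example: `set_service_type kubeflow istio-ingressgateway-workload NodePort`. Changing the type alone may not be sufficient, and the charm that owns the Service might require further configuration.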

DomFleischmann, Sep 29 '22 12:09

Hi @DomFleischmann. We do have a load balancer, but it is outside of the cluster... I have changed the istio-ingressgateway-workload service from LoadBalancer to NodePort, but nothing has changed.

The istio-pilot app still has the "waiting" status with the "waiting for units settled down" message.

sys-ops, Oct 07 '22 10:10

Hi @sys-ops, thanks for the additional info. Can you check whether the gateway is present?

kubectl get gateway -A

If not, you can run:

juju run --unit istio-pilot/0 -- "export JUJU_DISPATCH_PATH=hooks/config-changed; ./dispatch"

Give it a minute and check the status.

If you still have problems, you can send us the output of juju debug-log --replay.

misohu, Oct 13 '22 10:10

Hi @misohu, it looks like the kubeflow-gateway Gateway gets re-created every ~5 minutes. I ran the command you provided anyway, but nothing has changed. Where should I send the log file?

$ kubectl get gateway -n kubeflow kubeflow-gateway -o yaml
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  creationTimestamp: "2022-10-18T10:11:13Z"  # <=============================
  generation: 1
  labels:
    app.istio-pilot.io/is-workload-entity: "true"
    app.juju.is/created-by: istio-pilot
  name: kubeflow-gateway
  namespace: kubeflow
  resourceVersion: "11251847"
  uid: f47a024a-7aae-4931-8e5c-0c81e7c1adb9
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    - '*'
    port:
      name: http
      number: 80
      protocol: HTTP
      

$ kubectl logs -f -n kubeflow istio-ingressgateway-0 --tail=2
Defaulted container "charm" out of: charm, charm-init (init)
2022-10-18 10:11:32 INFO juju.worker.uniter.operation runhook.go:146 ran "update-status" hook (via hook dispatching script: dispatch)
2022-10-18 10:16:02 INFO juju.worker.uniter.operation runhook.go:146 ran "update-status" hook (via hook dispatching script: dispatch)


$ kubectl get gateway -n kubeflow kubeflow-gateway -o yaml
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  creationTimestamp: "2022-10-18T10:15:27Z"  # <=============================
  generation: 1
  labels:
    app.istio-pilot.io/is-workload-entity: "true"
    app.juju.is/created-by: istio-pilot
  name: kubeflow-gateway
  namespace: kubeflow
  resourceVersion: "11253163"
  uid: 4d3e83a2-3e5d-4aba-ba37-cd8d6da8face
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    - '*'
    port:
      name: http
      number: 80
      protocol: HTTP
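The two dumps above differ in `creationTimestamp` and `uid`, which suggests the Gateway is deleted and created again rather than updated in place. A small sketch for confirming this without comparing YAML by eye follows; the function names `gateway_uid` and `gateway_recreated` are hypothetical, and the sketch assumes `gateway` resolves to the Istio Gateway resource, as it does in the commands in this thread.

```shell
#!/bin/sh
# Print the UID of a Gateway object. A new UID means the object was deleted
# and created again rather than updated in place.
gateway_uid() {
  kubectl get gateway "$2" -n "$1" -o jsonpath='{.metadata.uid}'
}

# Sample the UID twice, <interval> seconds apart, and report whether it changed.
# Usage: gateway_recreated <namespace> <name> <interval-seconds>
gateway_recreated() {
  before=$(gateway_uid "$1" "$2")
  sleep "$3"
  after=$(gateway_uid "$1" "$2")
  if [ "$before" != "$after" ]; then
    echo "recreated: $before -> $after"
  else
    echo "unchanged: $before"
  fi
}
```

For example, `gateway_recreated kubeflow kubeflow-gateway 330` samples the UID a little more than five minutes apart. In the output in this thread, the creation timestamps (10:11:13 and 10:15:27) fall within a second of the istio-pilot `update-status` log entries, so the re-creation appears to coincide with that hook.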
      
      
NAMESPACE                       NAME                                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                 AGE    SELECTOR
kubeflow                        istio-ingressgateway                 ClusterIP   10.235.18.33    <none>        65535/TCP                               21d    app.kubernetes.io/name=istio-ingressgateway
kubeflow                        istio-ingressgateway-endpoints       ClusterIP   None            <none>        <none>                                  21d    app.kubernetes.io/name=istio-ingressgateway
kubeflow                        istio-ingressgateway-workload        NodePort    10.235.57.16    <none>        80:31954/TCP,443:32077/TCP              21d    istio=ingressgateway


NAMESPACE                       NAME                                                    READY   STATUS             RESTARTS       AGE    IP               NODE              NOMINATED NODE   READINESS GATES
kubeflow                        istio-ingressgateway-0                                  1/1     Running            0              21d    10.235.115.40    devgto-worker-7   <none>           <none>
kubeflow                        istio-ingressgateway-workload-7cbf5464b7-qt58j          1/1     Running            0              18d    10.235.72.83     devgto-worker-4   <none>           <none>


$ kubectl get svc -n kubeflow istio-ingressgateway-workload -o jsonpath='{.metadata.labels}'
{"app":"istio-ingressgateway","app.juju.is/created-by":"istio-ingressgateway","install.operator.istio.io/owning-resource":"unknown","istio":"ingressgateway","istio.io/rev":"default","operator.istio.io/component":"IngressGateways","release":"istio"}


$ kubectl get svc -n kubeflow istio-ingressgateway -o jsonpath='{.metadata.labels}'
{"app.kubernetes.io/managed-by":"juju","app.kubernetes.io/name":"istio-ingressgateway"}


$ kubectl get pod -n kubeflow istio-ingressgateway-0 -o jsonpath='{.metadata.labels}'
{"app.kubernetes.io/name":"istio-ingressgateway","controller-revision-hash":"istio-ingressgateway-74f7659644","statefulset.kubernetes.io/pod-name":"istio-ingressgateway-0"}


$ kubectl get pod -n kubeflow istio-ingressgateway-workload-7cbf5464b7-qt58j -o jsonpath='{.metadata.labels}'
{"app":"istio-ingressgateway","chart":"gateways","heritage":"Tiller","install.operator.istio.io/owning-resource":"unknown","istio":"ingressgateway","istio.io/rev":"default","operator.istio.io/component":"IngressGateways","pod-template-hash":"7cbf5464b7","release":"istio","service.istio.io/canonical-name":"istio-ingressgateway-workload","service.istio.io/canonical-revision":"latest","sidecar.istio.io/inject":"false"}


$ juju run --unit istio-pilot/0 -- "export JUJU_DISPATCH_PATH=hooks/config-changed; ./dispatch"


$ kubectl logs -f -n kubeflow  istio-pilot-0 --tail=12
Defaulted container "charm" out of: charm, charm-init (init)
2022-10-18 10:11:14 INFO juju-log No gateway-info relation found
2022-10-18 10:11:14 INFO juju.worker.uniter.operation runhook.go:146 ran "update-status" hook (via hook dispatching script: dispatch)
2022-10-18 10:15:27 INFO juju-log No gateway-info relation found
2022-10-18 10:15:28 INFO juju.worker.uniter.operation runhook.go:146 ran "update-status" hook (via hook dispatching script: dispatch)
2022-10-18 10:20:12 INFO juju-log No gateway-info relation found
2022-10-18 10:20:12 INFO juju.worker.uniter.operation runhook.go:146 ran "update-status" hook (via hook dispatching script: dispatch)
2022-10-18 10:25:18 INFO juju-log No gateway-info relation found
2022-10-18 10:25:19 INFO juju.worker.uniter.operation runhook.go:146 ran "update-status" hook (via hook dispatching script: dispatch)
2022-10-18 10:30:38 INFO juju-log No gateway-info relation found
2022-10-18 10:30:38 INFO juju.worker.uniter.operation runhook.go:146 ran "update-status" hook (via hook dispatching script: dispatch)
2022-10-18 10:35:53 INFO juju-log No gateway-info relation found
2022-10-18 10:35:53 INFO juju.worker.uniter.operation runhook.go:146 ran "update-status" hook (via hook dispatching script: dispatch)
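The charm's own messages are interleaved with the agent's hook bookkeeping in this log. A minimal sketch for summarizing only the charm-emitted lines follows; the function name `juju_log_summary` is hypothetical, and it assumes those lines carry the `juju-log` logger name as in the excerpt above.

```shell
#!/bin/sh
# Count how often each charm-emitted "juju-log" message appears.
# Reads unit log lines on stdin (for example from `kubectl logs`).
juju_log_summary() {
  sed -n 's/^.* juju-log //p' | sort | uniq -c | sort -rn
}
```

Usage would be `kubectl logs -n kubeflow istio-pilot-0 | juju_log_summary`. Since the only repeated message here is "No gateway-info relation found", it may also be worth listing the relations that actually exist with `juju status --relations`.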

sys-ops, Oct 18 '22 11:10

Hi @sys-ops, thanks for reporting the issue.

You can send the output of juju debug-log --replay --include istio-pilot/0 --include istio-ingressgateway/0 here if you have an Ubuntu One account; otherwise, just attach the logs to this GitHub issue.
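A minimal sketch for saving that output to a file so it can be attached. The function name `collect_debug_log` is hypothetical, and `--no-tail` is an addition not mentioned in this thread: it makes the command exit after printing the existing messages instead of continuing to stream.

```shell
#!/bin/sh
# Save the replayed debug log for the given units to a file.
# Usage: collect_debug_log <output-file> <unit>...
collect_debug_log() {
  out=$1; shift
  args=""
  for unit in "$@"; do
    args="$args --include $unit"
  done
  # $args is intentionally unquoted so that it splits into separate flags.
  juju debug-log --replay --no-tail $args > "$out"
}
```

For example: `collect_debug_log istio-debug.log istio-pilot/0 istio-ingressgateway/0`.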

I can see you sent a list of your istio-ingressgateway services. Could you also list the istio-pilot and istiod services? Thank you!

natalian98, Oct 25 '22 16:10

Hi @sys-ops, it looks like this issue is related to istio-pilot waiting for an IP address, which it obtains from a LoadBalancer (we are adding support for ClusterIP in future versions). Changing from LoadBalancer to NodePort requires extra configuration for istio-pilot (and therefore istio-ingressgateway) and other components to get things up and running.

We can provide a detailed guide for configuring the different kinds of services for istio-pilot, which you can try if you still have issues with the deployment.
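One way to see whether a load balancer has assigned an address is to read the Service's `status.loadBalancer.ingress` field. The sketch below assumes the relevant Service is `istio-ingressgateway-workload`, the LoadBalancer Service mentioned earlier in this thread; the function name `lb_address` is hypothetical.

```shell
#!/bin/sh
# Print the external address a load balancer assigned to a Service,
# or "<pending>" if none has been assigned yet.
# Usage: lb_address <namespace> <service>
lb_address() {
  addr=$(kubectl get service "$2" -n "$1" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}{.status.loadBalancer.ingress[0].hostname}')
  echo "${addr:-<pending>}"
}
```

For example: `lb_address kubeflow istio-ingressgateway-workload`. If this keeps printing `<pending>`, nothing in the cluster is fulfilling LoadBalancer Services, which would be consistent with the "Waiting for ip address" unit message.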

DnPlas, Jan 31 '23 22:01

This issue has been around for a while. Since it is not a bug but a configuration issue, I am closing it. @sys-ops, feel free to re-open it if you encounter the same or other issues.

DnPlas, Feb 09 '23 13:02

@DnPlas I am getting a similar issue. Attaching screenshot.

image

wizardrshah, Mar 16 '23 16:03