
The Gloo pod tries to list the kubernetes endpoints at the cluster scope

Open bcollard opened this issue 2 years ago • 4 comments

Gloo Edge Version

1.8.x

Kubernetes Version

1.22.x

Describe the bug

Why does Gloo need to list all the endpoints at the cluster level and not just in the namespace where our apps are installed? We only watch this single namespace. Settings:

 watchNamespaces:
  - some-namespace

Error:

E0118 06:53:19.785253 1 reflector.go:127] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:156: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: endpoints is forbidden: User "system:serviceaccount:clx-gloo-tst:gloo" cannot list resource "endpoints" in API group "" at the cluster scope

Is there a reason why Gloo wants to list endpoints at the cluster scope?

Steps to reproduce the bug

more info to come.

Expected Behavior

no error in the Gloo pod

Additional Context

Using OCP 4.9

bcollard avatar Feb 04 '22 16:02 bcollard

We tested whether the gloo service account can list the endpoints with these commands:

APP_NAMESPACE=app-namespace
GLOO_NAMESPACE=gloo-namespace
oc auth can-i get endpoints -n $APP_NAMESPACE --as=system:serviceaccount:$GLOO_NAMESPACE:gloo
oc auth can-i --list -n $APP_NAMESPACE --as=system:serviceaccount:$GLOO_NAMESPACE:gloo

This results in yes and the following list of verbs (only endpoints shown):

Resources                                                            Non-Resource URLs                     Resource Names                                    Verbs
endpoints                                                            []                                    []                                                [get list watch get list watch]
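Note that a check with `-n` only covers the namespaced scope, while the error in the Gloo log is about listing at the cluster scope. A minimal sketch of how both scopes could be compared for the same service account follows; the helper name `check_endpoints_access` and the namespace arguments are placeholders for illustration, not something shipped with Gloo or OpenShift:

```shell
# Hypothetical helper: compare namespaced vs. cluster-scoped "list" access
# to endpoints for the gloo service account.
check_endpoints_access() {
  app_ns="$1"   # namespace listed under watchNamespaces (placeholder)
  gloo_ns="$2"  # namespace where Gloo is installed (placeholder)
  sa="system:serviceaccount:${gloo_ns}:gloo"

  # Namespaced check: list endpoints in the application namespace only.
  echo "namespaced: $(oc auth can-i list endpoints -n "$app_ns" --as="$sa")"

  # Cluster-scoped check: the scope named in the reflector error.
  echo "cluster: $(oc auth can-i list endpoints --all-namespaces --as="$sa")"
}
```

Calling `check_endpoints_access app-namespace gloo-namespace` prints one line per scope. With only a namespaced Role and RoleBinding in place, the second line would be expected to report `no`, which would match the error seen in the Gloo pod.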

Below are the relevant service account, roles and rolebindings:

  • app.zip the role and rolebinding in the application namespace
  • gloo.zip the service account, role and rolebinding in the gloo namespace

Here is the kubectl version (OpenShift):

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2-0-g52c56ce", GitCommit:"6082e941e6d62f3a0c6ca8ba52927100948b1d0d", GitTreeState:"clean", BuildDate:"2020-10-22T08:26:10Z", GoVersion:"go1.13.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.6+b4b4813", GitCommit:"cefce093e4e5bc9a1916eb5a489ed37c7d467f6f", GitTreeState:"clean", BuildDate:"2021-12-15T00:02:57Z", GoVersion:"go1.16.6", Compiler:"gc", Platform:"linux/amd64"}

Here are the versions of the Docker images:

solo-io/extauth-ee:1.8.7
solo-io/gateway:1.8.8
gloo-ee-envoy-wrapper:1.8.7
gloo-ee:1.8.7
rate-limit-ee:1.8.7

anessi avatar Feb 04 '22 16:02 anessi

I can reproduce this in 1.10.8. It seems to happen when there is no existing upstream in the watched namespaces. If there is an upstream, Gloo does not try to list endpoints at the cluster scope and starts correctly.

jack0 avatar Mar 09 '22 14:03 jack0

This also happens on 1.11.38. We don't want to add cluster roles to the gloo service account.

The main problem is that the Gloo pod does not come up because of the error mentioned above. So if you have a CI/CD pipeline with dependencies between Gloo and its configuration (virtual services, routes, upstreams, ...), Gloo never starts. The only workaround is to reverse the installation order (e.g. install virtual services, routes, upstreams, ... before actually installing Gloo), which is definitely not a nice approach.

anessi avatar Sep 15 '22 15:09 anessi