
OpenLens 6.3.0 on macOS does not show pods and config/network/storage sections are empty, all work fine in 6.2.6.

Open dmarjanovic opened this issue 2 years ago • 25 comments

Describe the bug OpenLens 6.3.0 on macOS does not show pods and config/network/storage sections are empty, all work fine in 6.2.6.

To Reproduce Steps to reproduce the behavior:

  1. git checkout v6.3.0 # or master d6531f2
  2. make clean && make dev
  3. When the window appears, connect to a namespace (a single namespace is already configured under Namespace/Accessible Namespaces)
  4. Expand 'Workloads'
  5. Do not see Pods, but can see Deployments, DaemonSets, StatefulSets, ReplicaSets, Jobs, CronJobs
  6. Scroll down to 'Config' and expand
  7. Do not see ConfigMaps, Secrets, Resource Quotas, Limit Ranges, HPA, Pod Disruption Budgets
  8. Scroll down to 'Network' and expand
  9. Do not see Services, Endpoints, Ingresses, Network Policies but can only see Port Forwarding
  10. Scroll down to 'Storage'
  11. Can't expand it as it's empty (can't see Persistent Volume Claims)

Expected behavior

  5. I see Pods in the 'Workloads' section
  7. I see ConfigMaps, Secrets, Resource Quotas, Limit Ranges, HPA, Pod Disruption Budgets in the 'Config' section
  9. I see Services, Endpoints, Ingresses, Network Policies in the 'Network' section
  11. I see Persistent Volume Claims in the 'Storage' section

Screenshots Wrong one from OpenLens 6.3.0: Screenshot 2023-01-02 at 16 21 36

Correct one from OpenLens 6.2.6: Screenshot 2023-01-02 at 16 14 32

Environment (please complete the following information):

  • Lens Version: 6.3.0
  • OS: macOS
  • Installation method (e.g. snap or AppImage in Linux): make clean && make dev # from github tag

Kubeconfig:

apiVersion: v1
clusters:
- cluster:
    server: https://api-kube.***
  name: cluster1
...
contexts:
- context:
    cluster: cluster1
    namespace: ns1
    user: some-user
  name: cluster1-ns1
...
current-context: cluster1-ns1
kind: Config
preferences: {}
users:
- name: some-user
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      args:
      - oidc-login
      - get-token
      - --oidc-issuer-url=https://***
      - --oidc-client-id=api-kube
      - --oidc-client-secret=***
      - --oidc-extra-scope=groups
      command: kubectl
      env: null
      interactiveMode: IfAvailable
      provideClusterInfo: false

dmarjanovic avatar Jan 02 '23 15:01 dmarjanovic

What sort of kube distro is this cluster for?

Nokel81 avatar Jan 03 '23 14:01 Nokel81

Same (or even worse) issue for me. In my case I cannot see any resources besides CRDs, which work fine.

We're running on VMware Tanzu, Kubernetes version 1.22.9.

I have cluster-admin permissions inside the cluster and can get/describe all resources via the CLI.

I had Lens 6.2.6 installed and it was upgraded to 6.3.0; in the old version everything worked as expected.

Fantaztig avatar Jan 05 '23 09:01 Fantaztig

Fixed by https://github.com/lensapp/lens/pull/6880

Nokel81 avatar Jan 05 '23 13:01 Nokel81

@Nokel81

What sort of kube distro is this cluster for?

We're with aws eks, if that helps.

Fixed by #6880

The #6880 contribution feels much more stable (faster) than 6.3.0, but the problem (and screenshots) described in this issue is still not solved by it. I've verified with the latest commit d34a13fad2 on the farodin91:set-correct-group-in-rbac branch.

Any other ideas? Tnx

dmarjanovic avatar Jan 05 '23 16:01 dmarjanovic

It certainly feels like an RBAC related issue. What is the response for the following command?

kubectl create -f - -o yaml << EOF
apiVersion: authorization.k8s.io/v1
kind: SelfSubjectRulesReview
spec:
  namespace: default
EOF

Nokel81 avatar Jan 05 '23 17:01 Nokel81

@Nokel81 I can't reveal config details, sorry. Is there something specific you're looking for? Or let me know if there's another way to help diagnose this issue. Thank you

dmarjanovic avatar Jan 09 '23 10:01 dmarjanovic

@dmarjanovic Okay fair enough. How about this...

if you run kubectl get --raw /api you should get back a JSON object that has a field called versions. What is its value? On GKE and minikube it is ["v1"]
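For reference, the check above can be scripted; a minimal sketch (assuming only the JSON shape described here) that extracts the versions field from the /api response body:

```python
import json

# Sample response body from `kubectl get --raw /api` (shape as described above).
raw = '{"kind": "APIVersions", "versions": ["v1"]}'

api_versions = json.loads(raw)
versions = api_versions.get("versions", [])
print(versions)  # ['v1']
```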

Nokel81 avatar Jan 09 '23 15:01 Nokel81

I have the same problem, 6.2.5 or k9s works though

gfarcas avatar Jan 09 '23 15:01 gfarcas

@gfarcas Can you run the command that I asked @dmarjanovic to run?

Nokel81 avatar Jan 09 '23 15:01 Nokel81

Same issue on Windows, with an AKS cluster. I ran the command asked of @dmarjanovic and got the same result as him.

HuguesJ avatar Jan 09 '23 16:01 HuguesJ

Can you please also run kubectl get --raw /api/v1 then? Do you get a list that includes pods?

Nokel81 avatar Jan 09 '23 16:01 Nokel81

@Nokel81 Not sure what you mean by pods. kubectl get --raw /api/v1 returns a JSON object with some elements with pods in their name, like pods, pods/attach, pods/binding; it doesn't return the list of all pods.

HuguesJ avatar Jan 09 '23 16:01 HuguesJ

No, that is what I meant and expected you to get. Hmmm.... 🤔

Nokel81 avatar Jan 09 '23 16:01 Nokel81

@Nokel81 Interestingly enough, this problem happens only in one cluster (AKS) out of 4 (AKS and GKE).

HuguesJ avatar Jan 09 '23 16:01 HuguesJ

@HuguesJ Thanks for the info, will investigate against an AKS cluster

Nokel81 avatar Jan 09 '23 16:01 Nokel81

Here is some more debugging that would be helpful

  1. Does kubectl get namespaces succeed? (I assume it does)
  2. Running the following
kubectl create -f - -o yaml << EOF
apiVersion: authorization.k8s.io/v1
kind: SelfSubjectRulesReview
spec:
  namespace: default
EOF

both succeeds and returns something like:

apiVersion: authorization.k8s.io/v1
kind: SelfSubjectRulesReview
metadata:
  creationTimestamp: null
spec: {}
status:
  incomplete: false
  nonResourceRules:
  - nonResourceURLs:
    - /healthz
    - /livez
    - /readyz
    - /version
    - /version/
    verbs:
    - get
  - nonResourceURLs:
    - '*'
    verbs:
    - '*'
  - nonResourceURLs:
    - /api
    - /api/*
    - /apis
    - /apis/*
    - /healthz
    - /livez
    - /openapi
    - /openapi/*
    - /readyz
    - /version
    - /version/
    verbs:
    - get
  resourceRules:
  - apiGroups:
    - authorization.k8s.io
    resources:
    - selfsubjectaccessreviews
    - selfsubjectrulesreviews
    verbs:
    - create
  - apiGroups:
    - '*'
    resources:
    - '*'
    verbs:
    - '*'

Note: what I am interested in is the .status.resourceRules part. Is there an entry similar to:

- apiGroups:
  - '*'
  resources:
  - '*'
  verbs:
  - '*'

If not, are the fields apiGroups and resources empty, or maybe are they missing entirely?
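To illustrate what a client has to decide from that output, here is a hedged Python sketch (not Lens's actual code) of checking whether a set of resourceRules permits listing pods in the core ("") API group, with '*' treated as a wildcard:

```python
def can_list(rules, group, resource):
    """Return True if any rule grants the 'list' verb on (group, resource).

    A '*' entry in apiGroups, resources, or verbs acts as a wildcard,
    mirroring the SelfSubjectRulesReview semantics described above.
    """
    for rule in rules:
        groups = rule.get("apiGroups", [])
        resources = rule.get("resources", [])
        verbs = rule.get("verbs", [])
        if ("*" in groups or group in groups) and \
           ("*" in resources or resource in resources) and \
           ("*" in verbs or "list" in verbs):
            return True
    return False

# A cluster-admin style rule set (wildcards everywhere):
admin_rules = [{"apiGroups": ["*"], "resources": ["*"], "verbs": ["*"]}]

# A restricted rule set like the EKS output discussed below (no pods entry):
restricted_rules = [
    {"apiGroups": [""], "resources": ["nodes", "namespaces"],
     "verbs": ["get", "list", "watch"]},
]

print(can_list(admin_rules, "", "pods"))       # True
print(can_list(restricted_rules, "", "pods"))  # False
```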

Nokel81 avatar Jan 09 '23 21:01 Nokel81

@Nokel81 sorry for the late answer, I'm unable to iterate quickly.

Re.

if you run kubectl get --raw /api you should get back a JSON object that has a field called versions. What is its value? On GKE and minikube it is ["v1"]

$ kubectl get --raw /api
{"kind":"APIVersions","versions":["v1"]...

Re.

  1. Does kubectl get namespaces succeed? (I assume it does)

Yes, there are multiple namespaces and the same regression is present in all.

  1. Running the following kubectl create -f - -o yaml << EOF...

In .status.resourceRules in the output, the part below that you expect to be shown is not present:

- apiGroups:
  - '*'
  resources:
  - '*'
  verbs:
  - '*'

Instead, I can see this part repeated 20 times (among a few other pieces of info that I had to remove; btw, not sure why it's redundant):

- apiGroups:
  - ""
  resources:
  - nodes
  - namespaces
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - metrics.k8s.io
  resources:
  - nodes
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - apiextensions.k8s.io
  resources:
  - customresourcedefinitions
  verbs:
  - get
  - list
  - watch

In addition, I see:

- apiGroups:
  - policy
  resourceNames:
  - eks.privileged
  resources:
  - podsecuritypolicies
  verbs:
  - use

which is probably the idiomatic way to configure this in AWS EKS, but I am not aware of the technical details, except that this part is also documented in https://docs.aws.amazon.com/eks/latest/userguide/pod-security-policy.html

Anyway, I would expect that some OpenLens code relevant to k8s resource discovery changed in 6.3.0 compared to 6.2.6, though I'm not aware of the diff. Any other hints on how to diagnose this further? Thank you for looking into this.

dmarjanovic avatar Jan 10 '23 10:01 dmarjanovic

Yes there was a bug fix that did change the related code.

In the list is there something like:

  - apiGroups:
    - ""
    resources:
    - pods
    verbs:
    - get
    - list
    - watch

?

Nokel81 avatar Jan 10 '23 12:01 Nokel81

@Nokel81 no, there's no such part. I assume EKS does it differently, as mentioned with the policy from my previous comment, but I'm not sure.

dmarjanovic avatar Jan 10 '23 13:01 dmarjanovic

@Nokel81 btw, there's one detail connected to resource discovery that may or may not be related: even with older (working) OpenLens versions, we noticed similar "buggy" behaviour of not seeing the resources (including pods) mentioned in this ticket when connecting to some context for the first time. Only after the namespace name was explicitly set (see screenshot) would a disconnect+connect show all the resources in that namespace. This "workaround" worked fine until 6.2.6 but no longer does in 6.3.0, though I'm not sure it's connected. We have our permission scheme configured per namespace, not per cluster.

Screenshot 2023-01-10 at 14 09 04

dmarjanovic avatar Jan 10 '23 13:01 dmarjanovic

So that list is for when the user does not have permission to list namespaces. We do still read that list, but the change present in 6.3.0 attempts to fix the bug where only the first 10 namespaces were checked for these permissions.

Nokel81 avatar Jan 10 '23 13:01 Nokel81

@Nokel81 is it possible to pinpoint the specific PR causing this behaviour, maybe #6657?

dmarjanovic avatar Jan 10 '23 13:01 dmarjanovic

That PR fixes issues with https://github.com/lensapp/lens/pull/6614 to support "incomplete" responses like those which GKE returns.

Nokel81 avatar Jan 10 '23 13:01 Nokel81

Yes there was a bug fix that did change the related code.

In the list is there something like:

  - apiGroups:
    - ""
    resources:
    - pods
    verbs:
    - get
    - list
    - watch

?

@Nokel81 I'm working with AWS EKS (using OpenLens 6.3.0) and I have the same issue. When I do a SelfSubjectRulesReview I get (among other things):

- apiGroups:
  - ""
  resources:
  [...]
  - pods
  [...]
  - services
  [...]
  verbs:
  - get
  - list
  - watch

Hope this helps.

kienhoefr avatar Jan 10 '23 15:01 kienhoefr

Yes that helps a lot, thanks

Nokel81 avatar Jan 10 '23 15:01 Nokel81

@Nokel81 thank you for the fix.

Btw, #6900 only partially fixes this issue for us: after the fix is applied, the issue is no longer reproducible for some namespaces but still persists for others. To be more precise: which namespaces are "working" or "non-working" differs after an OpenLens app restart, meaning on the 1st start namespace A is "working" and B is "non-working", but on the 2nd run A may become "non-working" and B "working", etc. Weird behaviour. Also, I didn't see anything weird (or different) in the logs compared to v6.2.6.

dmarjanovic avatar Jan 13 '23 11:01 dmarjanovic

By "working" what do you mean exactly? We attempt to list all the namespaces and then compute which resources are allowed to be listed in at least one of those namespaces. Which namespace you select in the UI shouldn't matter (except for the "Accessible Namespaces" setting, which just overrides the list-namespaces step above).
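The computation described here can be sketched roughly as follows; this is a hypothetical simplification (not the actual Lens implementation), where `can_list_in` stands in for a per-namespace SelfSubjectRulesReview:

```python
def allowed_resources(namespaces, can_list_in):
    """Union of resources listable in at least one of the given namespaces.

    `can_list_in(namespace)` is a stand-in for a per-namespace permission
    check; it returns the set of resource names listable in that namespace.
    """
    allowed = set()
    for ns in namespaces:
        allowed |= can_list_in(ns)
    return allowed

# Hypothetical per-namespace grants:
grants = {
    "ns1": {"pods", "services"},
    "ns2": {"configmaps"},
}

result = allowed_resources(["ns1", "ns2"], lambda ns: grants.get(ns, set()))
print(sorted(result))  # ['configmaps', 'pods', 'services']
```

Under this model, a resource shows up in the sidebar as soon as any single namespace grants it, which is why the selected namespace shouldn't matter.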

Nokel81 avatar Jan 13 '23 12:01 Nokel81

@Nokel81 sorry for using confusing terminology. I built and ran the app from master c361852dd2 (6.4.0-alpha.3). All the screens below were from a single run, no restarts performed.

We use a single ~/.kube/config file with a single EKS cluster, many namespaces, and 1 context per namespace. It appears in the Lens Catalog/Clusters as 11 "clusters", which basically match the 11 contexts. Btw, it says Distro "eks", but not for all contexts; not sure why. 02_openlens_6 4 0-alpha 3_11_clusters_connected

Steps I did: I clicked from the top, starting with the 1st pinned "cluster" (context), then clicked on the 2nd, and so on until the last, 11th pinned "cluster". Some are "working", meaning pods are shown, and the others are "failed", meaning no pods are shown. See the first 11 screenshots: 03_1 cluster_fails 04_2 cluster_ok 05_3 cluster_fails 06_4 cluster_ok 07_5 cluster_fails 07_6 cluster_ok 08_7 cluster_fails 09_8 cluster_fails 10_9 cluster_fails 11_10 cluster_ok 12_11 cluster_ok

  • Then I picked and disconnected only one "failed" "cluster" - I was thinking that reconnecting might solve the issue

  • The OpenLens screen went all empty, no buttons or anything (like in the screenshot below) Screenshot 2023-01-13 at 22 53 10

  • Pressed Cmd+R to reload

  • The UI appeared, but it seemed like all the "clusters" were disconnected and had to be connected again, though green dots were shown in the pinned icons (looks like a bug)

  • I first clicked on those 5 previously "failed" "clusters" and they were now "working": 13_9 cluster_ok_after_disconnect_ctrl-r_connect 14_1 cluster_ok 15_3 cluster_ok 16_8 cluster_ok 18_5 cluster_ok

I then started clicking on the other pinned "clusters" that were "working" before, but now they were changing from "working" to "failed", or from "working" to "failed" to "working"; even those that were "failed" then "working" started going to "failed", and then some of them to "working" again, without any meaningful pattern. 17_4 cluster_fails 19_8 cluster_fails 20_10 cluster_fails 21_9 cluster_fails 22_6 cluster_fails 23_5 cluster_fails 24_1 cluster_fails 25_8 cluster_ok 26_9 cluster_ok 27_10 cluster_ok

At this point I clicked on all 11 pinned "clusters", but their state was not changing any more.

Then I restarted and repeated clicking on the 11 pinned "clusters", and now the "working" and "failed" ones were different than in the previous run. Totally random behaviour compared to the previous run; no pattern found. It looked to me like there was some race condition going on. While taking screenshots I took breaks to do other work on screen; not sure if that influenced the behaviour too.

Note that all these namespaces (the 11 contexts) defined in ~/.kube/config do have pods, and they are shown correctly by all the other tools, including k9s, kubectl, and OpenLens 6.2.6.

dmarjanovic avatar Jan 13 '23 22:01 dmarjanovic

Okay thanks.

One question. Do you have list namespace permissions for this Kube cluster?

Nokel81 avatar Jan 13 '23 22:01 Nokel81

@Nokel81 yes, I can list namespaces everywhere.

dmarjanovic avatar Jan 13 '23 23:01 dmarjanovic