zero-to-jupyterhub-k8s icon indicating copy to clipboard operation
zero-to-jupyterhub-k8s copied to clipboard

resourceLock = endpoints, endpointsleases or leases ?

Open yilas opened this issue 2 years ago • 6 comments
trafficstars

Bug description

That issue is close to the issue https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/3128.

The value of resourceLock used is not correct. Indeed, I have that value in the ConfigMap user-scheduler :

apiVersion: v1
data:
  config.yaml: |
    apiVersion: kubescheduler.config.k8s.io/v1beta3
    kind: KubeSchedulerConfiguration
    leaderElection:
      resourceLock: endpointsleases
      resourceName: user-scheduler-lock
      resourceNamespace: "jupyterhub"
(...)

but the log from the pod user-scheduler returns :

(...)
I0719 09:20:29.519934       1 serving.go:348] Generated self-signed cert in-memory
I0719 09:20:29.521209       1 configfile.go:59] "KubeSchedulerConfiguration v1beta3 is deprecated in v1.26, will be removed in v1.29"
E0719 09:20:29.521499       1 run.go:74] "command failed" err="leaderElection.resourceLock: Invalid value: \"endpointsleases\": resourceLock value must be \"leases\""

Expected behaviour

That value shoud be leases as returned by the log.

The value is hardcoded. Maybe a comparison with the k8s version should be done like this ⬇️ but I do not know from which version of k8s the value leases should be used.

    {{- if semverCompare ">=1.24.0-0" .Capabilities.KubeVersion.Version }}
    apiVersion: kubescheduler.config.k8s.io/v1
    {{- else }}

Actual behaviour

Change the value of resourceLock to leases.

Your personal set up

Here is the version of k8s (eks) I'm using :

clientVersion:
  buildDate: "2023-06-14T09:53:42Z"
  compiler: gc
  gitCommit: 25b4e43193bcda6c7328a6d147b1fb73a33f1598
  gitTreeState: clean
  gitVersion: v1.27.3
  goVersion: go1.20.5
  major: "1"
  minor: "27"
  platform: linux/amd64
kustomizeVersion: v5.0.1
serverVersion:
  buildDate: "2023-06-16T17:32:40Z"
  compiler: gc
  gitCommit: 78c8293d1c65e8a153bf3c03802ab9358c0e1a14
  gitTreeState: clean
  gitVersion: v1.27.3-eks-a5565ad
  goVersion: go1.20.5
  major: "1"
  minor: 27+
  platform: linux/amd64

yilas avatar Jul 19 '23 09:07 yilas

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! :hugs:
If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other other community members to contribute more effectively. welcome You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! :wave:
Welcome to the Jupyter community! :tada:

welcome[bot] avatar Jul 19 '23 09:07 welcome[bot]

Argh this is related to the semverCompare and eks reported version i figure.

Could you render a dummy resource that renders what the helm value {{ .Capabilities.KubeVersion.Version }} renders to and play around with what the if / else clause renders to if you adjust that from ita dynamic value to a hardcoded string you experiment with to understand what is wrong with the semverCompare statement.

To do this, do "helm create dummy-chart", update a configmap to include an if/else statement and let it render a comment or another comment, and use "helm template ." to render it against your k8s cluster with the version 1.27 on eks.

/ From a mobile, on vacation, i'll review any PRs if you open one about this, note that we need to ensure all semverCompare is updated in the chart

consideRatio avatar Jul 20 '23 07:07 consideRatio

Hello 👋🏻

Here is an example output with an EKS version 1.27.

The .yaml file :

apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ .Release.Name }}-configmap
data:
  Capabilities.KubeVersion.Version: {{ .Capabilities.KubeVersion.Version }}
  Capabilities.KubeVersion.Major: {{ .Capabilities.KubeVersion.Major }}
  Capabilities.KubeVersion.Minor: {{ .Capabilities.KubeVersion.Minor }}
  {{- if semverCompare ">=1.28.0-0" .Capabilities.KubeVersion.Version }}
  Capabilities.KubeVersion.Version: if__{{ .Capabilities.KubeVersion.Version }}
  {{- else }}
  Capabilities.KubeVersion.Version: else__{{ .Capabilities.KubeVersion.Version }}
  {{- end }}

The output : (with the command helm install testchart . --dry-run) :

---
# Source: dummy-chart/templates/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: testchart-configmap
data:
  Capabilities.KubeVersion.Version: v1.27.3-eks-a5565ad
  Capabilities.KubeVersion.Major: 1
  Capabilities.KubeVersion.Minor: 27+
  Capabilities.KubeVersion.Version: else__v1.27.3-eks-a5565ad

With that condition, the file jupyterhub/templates/scheduling/user-scheduler/configmap.yaml could be changed as :

diff --git a/jupyterhub/templates/scheduling/user-scheduler/configmap.yaml b/jupyterhub/templates/scheduling/user-scheduler/configmap.yaml
index 0f142b01..d4d959d2 100644
--- a/jupyterhub/templates/scheduling/user-scheduler/configmap.yaml
+++ b/jupyterhub/templates/scheduling/user-scheduler/configmap.yaml
@@ -29,7 +29,11 @@ data:
     {{- end }}
     kind: KubeSchedulerConfiguration
     leaderElection:
-      resourceLock: endpointsleases
+      {{- if semverCompare ">=1.27.0-0" .Capabilities.KubeVersion.Version }}
+      resourceLock: leases
+      {{- else }}
+      resourceLock: endpoints
+      {{- end }}
       resourceName: {{ include "jupyterhub.user-scheduler-lock.fullname" . }}
       resourceNamespace: "{{ .Release.Namespace }}"
     profiles:

yilas avatar Jul 20 '23 14:07 yilas

I think the issue is that you ended up with v1beta3 in the first place, using k8s 1.27, semverCompare should have rendered you to get v1 of that kube-scheduler config file. You should also have got "leases" in your PR instead of "endpoints". So, why doesnt semverCompare work with the EKS provided version?

Why does semverCompare used by you result in the wrong version?

What version of helm are you using?

consideRatio avatar Jul 20 '23 17:07 consideRatio

Also, are you using helm directly, or using it indirectly via argocd or similar? If templates arent rendered against a k8s api-server, what value does the .Version field take?

consideRatio avatar Jul 21 '23 07:07 consideRatio

I don't think there is a bug because this is tested to work I think, but I may be wrong. I'm happy to look into this further given information about:

  • [x] K8s cluster version: 1.27 on EKS
  • [ ] What Helm chart version was used?
  • [ ] Was the scheduling.userScheduler.image.tag default value overridden?
  • [ ] Was tooling used to render helm chart templates before installing them, like ArgoCD? This can cause you to have incorrect evaluation of chart template logic depending on the k8s version as the helm template rendering is done in isolation from the k8s api-server.

consideRatio avatar Aug 17 '23 09:08 consideRatio