actions-runner-controller icon indicating copy to clipboard operation
actions-runner-controller copied to clipboard

Github webhook server implementation does not match documentation

Open kmlevy opened this issue 3 years ago • 11 comments

Describe the bug I followed current instructions to implement the Github webhook server:

helm upgrade --install --namespace actions-runner-system --create-namespace
--wait actions-runner-controller actions-runner-controller/actions-runner-controller
--set "githubWebhookServer.enabled=true,githubWebhookServer.ports[0].nodePort=33080"

However, when I look at the service details, the webhook server is defined as a ClusterIP service with no reference to the nodePort defined in the helm chart installation

Checks

  • [x] My actions-runner-controller version (v0.22.3) does support the feature
  • [ ] I'm using an unreleased version of the controller I built from HEAD of the default branch

To Reproduce Steps to reproduce the behavior:

  1. Run webhook server installation per documented instructions: helm upgrade --install --namespace actions-runner-system --create-namespace
    --wait actions-runner-controller actions-runner-controller/actions-runner-controller
    --set "githubWebhookServer.enabled=true,githubWebhookServer.ports[0].nodePort=33080"
  2. Describe service: kubectl describe service actions-runner-controller-github-webhook-server

Expected behavior Webhook server configuration should match that as described in documentation. I would expect it to be defined as a NodePort service. Please update documentation to match that of the underlying implementation.

Screenshots $ kubectl describe service actions-runner-controller-github-webhook-server Name: actions-runner-controller-github-webhook-server Namespace: Labels: app.kubernetes.io/instance=actions-runner-controller app.kubernetes.io/managed-by=Helm app.kubernetes.io/name=actions-runner-controller app.kubernetes.io/version=0.22.3 helm.sh/chart=actions-runner-controller-0.17.3 Annotations: meta.helm.sh/release-name: actions-runner-controller meta.helm.sh/release-namespace: Selector: app.kubernetes.io/instance=actions-runner-controller-github-webhook-server,app.kubernetes.io/name=actions-runner-controller Type: ClusterIP IP Families: IP: x.x.x.x IPs: Port: http 80/TCP TargetPort: http/TCP Endpoints: y.y.y.y:8000 Session Affinity: None Events:

kmlevy avatar Apr 26 '22 19:04 kmlevy

This is the latest config in the helm chart: service: type: ClusterIP annotations: {} ports: - port: 80 targetPort: http protocol: TCP name: http #nodePort: someFixedPortForUseWithTerraformCdkCfnEtc

kmlevy avatar Apr 26 '22 20:04 kmlevy

@kmlevy Thanks! I'd appreciate it very much if you could submit a pull request to fix the doc if you find any mistakes. We're a small team and can't handle everything in a timely manner, and therefore we are focusing more on implementing new features and fixes in the controller.

mumoshu avatar Apr 26 '22 23:04 mumoshu

Part of my question stems from the fact that I do not know the internals of Kubernetes well enough, so I wouldn't know what to change the documentation to in order to make it match the implementation. Sorry.

kmlevy avatar Apr 27 '22 01:04 kmlevy

@kmlevy Thanks for confirming. At glance, probably it would work if you add --set githubWebhookServer.service.type=NodePort` to helm values. Would you mind giving it a shot?

mumoshu avatar Apr 27 '22 01:04 mumoshu

@mumoshu I applied the change to my K8s instance and it did create service as a NodePort service. However, it did not use port 33080 as specified. It chose a random port to assign to the nodePort. Seeing as your helm template provisions the GH webhook server as a Cluster IP service, would it make sense to document it as such? In either case, it looks as if significant changes need to happen in doc.

kmlevy avatar Apr 27 '22 19:04 kmlevy

Looks like it's handled here: #1354

kmlevy avatar Apr 27 '22 20:04 kmlevy

@kmlevy Apparently we missed the intermediate key service in-between.

https://github.com/actions-runner-controller/actions-runner-controller/blob/4053ab3e113e2cbc06d98319cee8a9ea797e2d76/charts/actions-runner-controller/values.yaml#L206-L214

It's not that significant I think but worth a fix

mumoshu avatar Apr 27 '22 20:04 mumoshu

I had this same problem and the PR https://github.com/actions-runner-controller/actions-runner-controller/pull/1354 is the fix I implemented for it. I am hoping we can merge it soon, but @kmlevy if you want you should be able to follow the instructions on that PR.

VinGarcia avatar Apr 28 '22 02:04 VinGarcia

@VinGarcia Hey! Thanks for your contribution. It took a quick glance on your change, and saw you completely omitted ,githubWebhookServer.ports[0].nodePort=33080. I think, you'd better just fix that plus adding the service type as noted in https://github.com/actions-runner-controller/actions-runner-controller/issues/1393#issuecomment-1111471637, so that becomes a more meaning full example, because without nodePort the webhookserver is quite useless without another ingress controller.

mumoshu avatar Apr 28 '22 03:04 mumoshu

Hey, @mumoshu thanks for reviewing it.

This configuration is actually not working as expected because there is no githubWebhookServer.ports config in the values.yaml file of the Helm Chart.

We are discussing how to solve this here:

https://github.com/actions-runner-controller/actions-runner-controller/issues/1352#issuecomment-1101587630

And on the PR itself, in the PR I am adding an example Ingress as a possible solution. @toast-gear is considering whether this is a good idea and @rushikesh suggested a fix for the helm chart using attributes from values.yaml that are still present, i.e.:

helm upgrade --install --namespace actions-runner-system --create-namespace --wait \
  actions-runner-controller actions-runner-controller/actions-runner-controller \
  --set "githubWebhookServer.enabled=true,githubWebhookServer.service.type=LoadBalancer" 

The only thing that is getting me confused about this NodePort is that I believe Github expects a TSL-protected route, and since we are already using cert-manager I would expect the user to set up an Ingress instead of a NodePort as in the PR.

VinGarcia avatar Apr 29 '22 12:04 VinGarcia

@kmlevy the PR I created was now merged maybe it fixes the issue you experienced here:

https://github.com/actions-runner-controller/actions-runner-controller/pull/1354

If it does we might proceed to close this ticket maybe

VinGarcia avatar Jun 08 '22 16:06 VinGarcia

I came across this issue after several hours of debugging 🤣

In my latest experiments, there is still a mismatch between the actual and documented behavior. Specifically, when I run this helm command:

$ helm upgrade --install --namespace actions-runner-system --create-namespace \
             --wait actions-runner-controller actions-runner-controller/actions-runner-controller \
             --set "githubWebhookServer.enabled=true,service.type=NodePort,githubWebhookServer.ports[0].nodePort=33080"

I end up observing NodePort 32192 was exposed instead of 33080:

$ kubectl describe -n actions-runner-system svc actions-runner-controller-webhook
Name:                     actions-runner-controller-webhook
Namespace:                actions-runner-system
Labels:                   app.kubernetes.io/instance=actions-runner-controller
                          app.kubernetes.io/managed-by=Helm
                          app.kubernetes.io/name=actions-runner-controller
                          app.kubernetes.io/version=0.27.3
                          helm.sh/chart=actions-runner-controller-0.23.2
Annotations:              meta.helm.sh/release-name: actions-runner-controller
                          meta.helm.sh/release-namespace: actions-runner-system
Selector:                 app.kubernetes.io/instance=actions-runner-controller,app.kubernetes.io/name=actions-runner-controller
Type:                     NodePort
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.96.36.176
IPs:                      10.96.36.176
Port:                     https  443/TCP
TargetPort:               9443/TCP
NodePort:                 https  32192/TCP
Endpoints:                10.244.0.8:9443
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>

Why is this happening?

yhtang avatar May 02 '23 21:05 yhtang