EventListener using wrong URL for ClusterInterceptor
Expected Behavior
The EventListener should use the correct address for interceptors. I have configured the ClusterInterceptor to use https://tekton-triggers-core-interceptors.tekton-pipelines.svc:8443/cel:
apiVersion: triggers.tekton.dev/v1alpha1
kind: ClusterInterceptor
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"triggers.tekton.dev/v1alpha1","kind":"ClusterInterceptor","metadata":{"annotations":{},"name":"cel"},"spec":{"clientConfig":{"service":{"name":"tekton-triggers-core-interceptors","namespace":"tekton-pipelines","path":"cel"}}}}
  creationTimestamp: "2022-06-07T19:52:29Z"
  generation: 4
  name: cel
  resourceVersion: "498224521"
  uid: 37c35519-10c6-4784-9eda-7d09b40d890a
spec:
  clientConfig:
    caBundle: <redacted>
    url: https://tekton-triggers-core-interceptors.tekton-pipelines.svc:8443/cel
status:
  address:
    url: https://tekton-triggers-core-interceptors.tekton-pipelines.svc:8443/cel
Actual Behavior
The listener may be using the wrong URL, or there may be some issue with throttling:
el-listener-interceptor-79d9648974-hwshf event-listener I0607 22:39:07.736328 1 request.go:665] Waited for 1.194655471s due to client-side throttling, not priority and fairness, request: GET:https://10.100.0.1:443/apis/triggers.tekton.dev/v1alpha1/clusterinterceptors/cel
The service URL for tekton-triggers-core-interceptors service is below:
tekton-triggers-core-interceptors ClusterIP 10.100.14.118 <none> 8443/TCP 168m
Additional Info
- Kubernetes version (output of kubectl version):
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.5", GitCommit:"c285e781331a3785a7f436042c65c5641ce8a9e9", GitTreeState:"clean", BuildDate:"2022-03-16T15:51:05Z", GoVersion:"go1.17.8", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.8-gke.201", GitCommit:"2dca91e5224568a093c27d3589aa0a96fd3ddc9a", GitTreeState:"clean", BuildDate:"2022-05-11T18:39:02Z", GoVersion:"go1.16.14b7", Compiler:"gc", Platform:"linux/amd64"}
- Tekton Pipeline version (output of tkn version or kubectl get pods -n tekton-pipelines -l app=tekton-pipelines-controller -o=jsonpath='{.items[0].metadata.labels.version}'):
Client version: 0.23.1
Pipeline version: v0.36.0
Triggers version: v0.20.0
Dashboard version: v0.26.0
I'm getting a similar issue with the same version:
EL log
{"level":"error","ts":"2022-06-17T09:58:36.185Z","logger":"eventlistener","caller":"sink/sink.go:381","msg":"Post \"https://tekton-triggers-core-interceptors.tekton-pipelines.svc:80/github\": dial tcp 10.4.2.204:80: i/o timeout","eventlistener":"default","namespace":"platform","/triggers-eventid":"76aa89a7-f421-4efb-8601-a3b12850dd09","eventlistenerUID":"7c4a390c-ee58-42bb-825f-5ce8c16147e6","/triggers-eventid":"76aa89a7-f421-4efb-8601-a3b12850dd09","/trigger":"infrastructure-utils-publish","stacktrace":"github.com/tektoncd/triggers/pkg/sink.Sink.processTrigger\n\tgithub.com/tektoncd/triggers/pkg/sink/sink.go:381\ngithub.com/tektoncd/triggers/pkg/sink.Sink.HandleEvent.func1\n\tgithub.com/tektoncd/triggers/pkg/sink/sink.go:196"}
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
tekton-triggers-core-interceptors ClusterIP 10.4.2.204 <none> 8443/TCP 48d
It's trying to hit https://tekton-triggers-core-interceptors.tekton-pipelines.svc:80 when the service is configured for 8443. I tried manually changing the svc port (8443 -> 80) and got:
2022/06/17 10:06:43 http: TLS handshake error from 10.0.2.139:58702: remote error: tls: bad certificate
Actually, I fixed it: I set spec.clientConfig.service.port to 443 on the ClusterInterceptors (both cel and github), changed the svc/tekton-triggers-core-interceptors port to 443, and restarted the EventListener to see whether the certs would get created. That resolved my issue.
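The port mismatch described above can be pictured with a small sketch of how a service reference in clientConfig might be turned into a URL. This is not the Triggers implementation: the function name resolve_interceptor_url is hypothetical, and the fallback port of 80 is an assumption inferred only from the EventListener log above, where the request went to :80 while the Service listened on 8443.

```python
from typing import Optional

# Assumption: port used when clientConfig.service.port is omitted.
# Inferred from the EventListener log in this thread, not from the Triggers source.
ASSUMED_DEFAULT_PORT = 80


def resolve_interceptor_url(
    name: str, namespace: str, path: str = "", port: Optional[int] = None
) -> str:
    """Build the in-cluster HTTPS URL for a service-based clientConfig (hypothetical helper)."""
    effective_port = port if port is not None else ASSUMED_DEFAULT_PORT
    return f"https://{name}.{namespace}.svc:{effective_port}/{path.lstrip('/')}"
```

Under that assumption, omitting the port yields a URL on :80, while passing port=8443 yields the address that the Service in this thread actually exposes.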
Maybe related to https://github.com/tektoncd/triggers/issues/1368
Hey @quant-daddy, sorry for the late response. Are you still facing this issue?
If yes, could you tell me the steps you executed?
Note: if you set the url field in clientConfig, like this:
clientConfig:
  caBundle: <redacted>
  url: https://tekton-triggers-core-interceptors.tekton-pipelines.svc:8443/cel
that means you have written your own HTTPS interceptor, and the caBundle is what is used to verify the connection to that ClusterInterceptor.
Even when writing the Kubernetes Service for a ClusterInterceptor, you need to take care of the ports. Here is the reference PR for that: https://github.com/tektoncd/triggers/pull/1379
I believe this issue is more related to https://github.com/tektoncd/triggers/issues/1284. @quant-daddy, can you confirm that the IP in your URL is your API server? I'm also seeing this error in the 0.20.1 release of Triggers. The issue got better after 0.19.0, but this message still shows up and the timeouts still happen.
@joaosilva15 Yes, the URL was for the API server. I think this could be related to the API request throttling in newer versions of Kubernetes. With the introduction of the ClusterInterceptor CRD, I think the EventListener has to query the API server repeatedly for the data in the CRD for each event received. If we receive a lot of events (most of them not useful), it triggers the rate limit / throttling by the API server. I'm speaking from the little research I did when facing the issue a few weeks back. To work around the issue, I disabled event emission for internal Tekton events for task/pipeline runs, but this is of course a temporary fix.
@savitaashture I can confirm that the right port and URI were being used and the connection was successful.
Hmm, yeah we might be making direct calls to the API server instead of going via the lister cache
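A minimal sketch of the difference between querying the API server on every event and reading through a local cache. Everything here is hypothetical (CountingAPI, CachedLookup, and the handler functions are illustration-only names), and it is not how Triggers or client-go implement listers: real lister caches are kept up to date by watches, whereas this sketch never invalidates its entries.

```python
from typing import Dict


class CountingAPI:
    """Hypothetical stand-in for the Kubernetes API server that counts GET requests."""

    def __init__(self) -> None:
        self.calls = 0

    def get_cluster_interceptor(self, name: str) -> dict:
        self.calls += 1
        return {"kind": "ClusterInterceptor", "metadata": {"name": name}}


class CachedLookup:
    """Read-through cache: only the first lookup of a name reaches the API."""

    def __init__(self, api: CountingAPI) -> None:
        self._api = api
        self._cache: Dict[str, dict] = {}

    def get(self, name: str) -> dict:
        if name not in self._cache:
            self._cache[name] = self._api.get_cluster_interceptor(name)
        return self._cache[name]


def handle_events_direct(api: CountingAPI, name: str, n_events: int) -> None:
    # One API request per incoming event.
    for _ in range(n_events):
        api.get_cluster_interceptor(name)


def handle_events_cached(api: CountingAPI, name: str, n_events: int) -> None:
    # All events share one cache, so repeated lookups stay local.
    lookup = CachedLookup(api)
    for _ in range(n_events):
        lookup.get(name)
```

With many events arriving in a burst, the direct variant issues one request per event, which is the kind of pattern that can run into client-side rate limiting like the "Waited for ... due to client-side throttling" message in the original report.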
Hope it gets fixed soon! @dibyom Thanks
Hi @quant-daddy,
Could you try the latest Triggers release, v0.20.2?
If there is still an issue after using the v0.20.2 release, please provide the steps to reproduce it.
Thank you
I am currently facing this issue. A fresh https://github.com/tektoncd/triggers/releases/tag/v0.20.2 triggers release is installed. I have a ClusterInterceptor installed like:
apiVersion: triggers.tekton.dev/v1alpha1
kind: ClusterInterceptor
metadata:
  creationTimestamp: "2022-08-17T02:26:29Z"
  generation: 2
  labels:
    server/type: https
  name: gitlab
  resourceVersion: "64651695"
  uid: 68d1afbd-54f6-4e13-8088-5f4462d99e69
spec:
  clientConfig:
    caBundle: <removed>
    service:
      name: tekton-triggers-core-interceptors
      namespace: tekton-pipelines
      path: gitlab
      port: 8443
A service like:
apiVersion: v1
kind: Service
metadata:
  annotations:
  creationTimestamp: "2022-08-17T19:26:39Z"
  labels:
    app: tekton-triggers-core-interceptors
    app.kubernetes.io/component: interceptors
    app.kubernetes.io/instance: default
    app.kubernetes.io/name: tekton-triggers-core-interceptors
    app.kubernetes.io/part-of: tekton-triggers
    app.kubernetes.io/version: v0.20.2
    triggers.tekton.dev/release: v0.20.2
    version: v0.20.2
  name: tekton-triggers-core-interceptors
  namespace: tekton-pipelines
  resourceVersion: "65222209"
  uid: 2f0f7659-3d52-495a-aed6-9adca0cef587
spec:
  clusterIP: 10.98.17.155
  clusterIPs:
    - 10.98.17.155
  internalTrafficPolicy: Cluster
  ipFamilies:
    - IPv4
  ipFamilyPolicy: SingleStack
  ports:
    - name: https
      port: 8443
      protocol: TCP
      targetPort: 8443
  selector:
    app.kubernetes.io/component: interceptors
    app.kubernetes.io/instance: default
    app.kubernetes.io/name: core-interceptors
    app.kubernetes.io/part-of: tekton-triggers
  sessionAffinity: None
  type: ClusterIP
An eventlistener like:
---
apiVersion: triggers.tekton.dev/v1beta1
kind: EventListener
metadata:
  name: gitlab-fedora-kickstart
  namespace: sway-sig
spec:
  serviceAccountName: tekton-triggers
  triggers:
    - name: gitlab-pipeline-events-trigger
      interceptors:
        - name: "verify-gitlab-payload"
          ref:
            name: "gitlab"
            kind: ClusterInterceptor
          params:
            - name: secretRef
              value:
                secretName: "gitlab-webhook"
                secretKey: "secretToken"
            - name: eventTypes
              value:
                - "Pipeline Hook"
        - name: "CEL filter: only when pipelines are sucessful on sway branch"
          ref:
            name: "cel"
          params:
            - name: "success"
              value: "body.object_attributes.status == 'success'"
            - name: "no-mr"
              value: "body.object_attributes.source != 'merge_request_event'"
      template:
        spec:
          resourcetemplates:
            - apiVersion: tekton.dev/v1beta1
              kind: PipelineRun
              metadata:
                generateName: kickstart-iso-to-prod
                namespace: sway-sig
              spec:
                pipelineRef:
                  name: kickstart-to-prod
                workspaces:
                  - name: prod
                    persistentVolumeClaim:
                      claimName: sway-nginx
When the EventListener is triggered, this results in:
"2022/08/22 17:06:39 http: TLS handshake error from 10.244.9.106:59974: remote error: tls: bad certificate"
"{\"level\":\"error\",\"ts\":\"2022-08-22T17:06:39.345Z\",\"logger\":\"eventlistener\",\"caller\":\"sink/sink.go:381\",\"msg\":\"Post \\\"https://tekton-triggers-core-interceptors.tekton-pipelines.svc:8443/gitlab\\\": x509: certificate signed by unknown authority\",\"eventlistener\":\"gitlab-fedora-kickstart\",\"namespace\":\"sway-sig\",\"/triggers-eventid\":\"70a73222-0462-463e-8556-cd5e141cf5c2\",\"eventlistenerUID\":\"5d5233b3-119c-4827-9789-dbdd7dbe55b3\",\"/triggers-eventid\":\"70a73222-0462-463e-8556-cd5e141cf5c2\",\"/trigger\":\"gitlab-pipeline-events-trigger\",\"stacktrace\":\"github.com/tektoncd/triggers/pkg/sink.Sink.processTrigger\\n\\tgithub.com/tektoncd/triggers/pkg/sink/sink.go:381\\ngithub.com/tektoncd/triggers/pkg/sink.Sink.HandleEvent.func1\\n\\tgithub.com/tektoncd/triggers/pkg/sink/sink.go:196\"}"
Hi @anthr76, I tried the steps you mentioned, but I don't see that error.
That is because when we remove caBundle from the ClusterInterceptor
apiVersion: triggers.tekton.dev/v1alpha1
kind: ClusterInterceptor
metadata:
  creationTimestamp: "2022-08-17T02:26:29Z"
  generation: 2
  labels:
    server/type: https
  name: gitlab
  resourceVersion: "64651695"
  uid: 68d1afbd-54f6-4e13-8088-5f4462d99e69
spec:
  clientConfig:
    caBundle: <removed>
    service:
      name: tekton-triggers-core-interceptors
      namespace: tekton-pipelines
      path: gitlab
      port: 8443
Triggers watches the core ClusterInterceptors every minute and, if there is no caBundle, adds it. Because of that, we don't see the "tls: bad certificate" error.
Could you provide step-by-step instructions for what you did to observe the issue above?
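The behavior described in this comment (fill in caBundle when it is missing) can be sketched as a small function. This is only an illustration of that description, not the Triggers controller code: ensure_ca_bundle is a hypothetical name, and the manifest is modeled as a plain dict.

```python
import copy


def ensure_ca_bundle(interceptor: dict, ca_bundle: str) -> dict:
    """Return a copy of the manifest with caBundle set if it was missing or empty.

    Hypothetical helper: it models only the "add it if absent" step described above.
    """
    updated = copy.deepcopy(interceptor)
    client_config = updated.setdefault("spec", {}).setdefault("clientConfig", {})
    if not client_config.get("caBundle"):
        client_config["caBundle"] = ca_bundle
    return updated
```

In this sketch, a value that is already present is left untouched. Whether Triggers also leaves an existing value alone is not something this thread confirms.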
Sure, I will provide live manifests to see if that helps.
I deploy Tekton like: https://github.com/anthr76/infra/blob/tekton-sway-sig/k8s/base/tekton-pipelines/deploy/kustomization.yaml
Set up an event listener like: https://github.com/anthr76/infra/blob/tekton-sway-sig/k8s/base/sway-sig/eventlisteners/gitlab-listener.yaml
Put an ingress on the eventlistener: https://github.com/anthr76/infra/blob/tekton-sway-sig/k8s/base/sway-sig/eventlisteners/ingress.yaml
Have GitLab send a POST to the ingress, which results in "Hook executed successfully: HTTP 202"
Observe the el-gitlab-fedora-kickstart pod throw the error:
{"level":"error","ts":"2022-08-23T15:20:16.862Z","logger":"eventlistener","caller":"sink/sink.go:381","msg":"Post \"https://tekton-triggers-core-interceptors.tekton-pipelines.svc:8443/gitlab\": x509: certificate signed by unknown authority","eventlistener":"gitlab-fedora-kickstart","namespace":"sway-sig","/triggers-eventid":"5dd5ff68-ccd0-4059-8e45-2c1167dbde23","eventlistenerUID":"85bf0ae7-a6d5-4338-8954-1c0b75f5d667","/triggers-eventid":"5dd5ff68-ccd0-4059-8e45-2c1167dbde23","/trigger":"gitlab-pipeline-events-trigger","stacktrace":"github.com/tektoncd/triggers/pkg/sink.Sink.processTrigger\n\tgithub.com/tektoncd/triggers/pkg/sink/sink.go:381\ngithub.com/tektoncd/triggers/pkg/sink.Sink.HandleEvent.func1\n\tgithub.com/tektoncd/triggers/pkg/sink/sink.go:196"}
Observe the tekton-triggers-core-interceptors pod throw the error:
2022/08/23 15:20:16 http: TLS handshake error from 10.244.7.132:36832: remote error: tls: bad certificate
Try basic debugging in a netshoot pod:
tmp-shell-1 ~ curl -k https://tekton-triggers-core-interceptors.tekton-pipelines.svc:8443/gitlab
failed to parse body as InterceptorRequest: unexpected end of JSON input
tmp-shell-1 ~ curl https://tekton-triggers-core-interceptors.tekton-pipelines.svc:8443/gitlab
curl: (60) SSL certificate problem: self signed certificate
More details here: https://curl.se/docs/sslcerts.html
curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
To me it seems like the EventListener is not aware of the TLS configuration it needs to communicate with the ClusterInterceptor.
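The two curl results above differ only in whether the server certificate is verified. A minimal sketch of how a client could build a trust store from a caBundle value follows. It assumes the caBundle is base64-encoded PEM, as byte fields usually are in Kubernetes manifests; context_from_ca_bundle is a hypothetical helper and not part of Triggers.

```python
import base64
import ssl


def context_from_ca_bundle(ca_bundle_b64: str) -> ssl.SSLContext:
    """Create a TLS client context that trusts only the CA(s) in the given bundle.

    Assumption: ca_bundle_b64 is base64-encoded PEM text.
    """
    pem = base64.b64decode(ca_bundle_b64).decode("ascii")
    # Passing cadata means the system default CAs are not loaded;
    # certificate and hostname verification remain enabled.
    return ssl.create_default_context(cadata=pem)
```

If the bundle does not contain the CA that signed the interceptor's serving certificate, a handshake using this context fails with a verification error, which is the same class of failure as the "x509: certificate signed by unknown authority" message in the EventListener log. With curl, the analogous check is to pass the CA file via --cacert instead of disabling verification with -k.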
The eventlistener is configured with this RBAC https://github.com/anthr76/infra/blob/tekton-sway-sig/k8s/base/sway-sig/eventlisteners/rbac.yaml
After looking closer at this issue, and given that nobody else was able to reproduce it, I ended up removing all Tekton-related CRDs and the namespace itself (tekton-pipelines). After doing so, the error has gone away. I'm not sure exactly which lingering resource hurt me here, but if I find out I will update this post.
@anthr76 Thanks a lot
Given your comment, I will close this for now. Please reopen it if you face the issue again.