kubernetes-ingress-controller

Errors during KIC startup

Open oleksandrs-adorama opened this issue 11 months ago • 6 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Current Behavior

After we updated kubernetes-ingress-controller from version 2.8 to version 3.0, we started to observe strange errors during KIC startup.

2024-03-12T14:05:49Z	error	Failed to fetch service	{"service_name": "", "service_namespace": "default", "error": "Service default/**** not found"}
2024-03-12T14:05:49Z	error	Failed to fetch service	{"service_name": "", "service_namespace": "default", "error": "Service default/**** not found"}
2024-03-12T14:05:49Z	error	Failed to fetch service	{"service_name": "", "service_namespace": "****", "error": "Service ****/**** not found"}
2024-03-12T14:05:49Z	error	Failed to fetch service	{"service_name": "", "service_namespace": "default", "error": "Service default/**** not found"}
2024-03-12T14:05:57Z	error	credential "****.api-key" failure: Failed to fetch secret: Secret ****/****.api-key not found	{"name": "****", "namespace": "****", "GVK": "configuration.konghq.com/v1, Kind=KongConsumer", "error": "resource processing failed"}
2024-03-12T14:05:57Z	error	credential "****-service.public" failure: Failed to fetch secret: Secret ****/****-service.public not found	{"name": "****.public-user", "namespace": "****", "GVK": "configuration.konghq.com/v1, Kind=KongConsumer", "error": "resource processing failed"}
2024-03-12T14:05:57Z	error	credential "****.api-key" failure: Failed to fetch secret: Secret default/****.api-key not found	{"name": "****.public-user", "namespace": "default", "GVK": "configuration.konghq.com/v1, Kind=KongConsumer", "error": "resource processing failed"}
2024-03-12T14:05:57Z	error	credential "*****.public" failure: Failed to fetch secret: Secret default/*****.public not found	{"name": "****public-user", "namespace": "default", "GVK": "configuration.konghq.com/v1, Kind=KongConsumer", "error": "resource processing failed"}
2024-03-12T14:05:57Z	error	credential "****.api-key" failure: Failed to fetch secret: Secret default/c****.api-key not found	{"name": "****.public-user", "namespace": "default", "GVK": "configuration.konghq.com/v1, Kind=KongConsumer", "error": "resource processing failed"}
2024-03-12T14:05:57Z	error	credential "****.public" failure: Failed to fetch secret: Secret default/****.public not found	{"name": "****.public-user", "namespace": "default", "GVK": "configuration.konghq.com/v1, Kind=KongConsumer", "error": "resource processing failed"}
2024-03-12T14:05:57Z	error	credential "****.api-key" failure: Failed to fetch secret: Secret default/****.api-key not found	{"name": "****.public-user", "namespace": "default", "GVK": "configuration.konghq.com/v1, Kind=KongConsumer", "error": "resource processing failed"}
2024-03-12T14:05:58Z	info	Successfully synced configuration to Kong	{"url": "https://localhost:8444", "update_strategy": "InMemory", "v": 0}
2024-03-13T16:47:06Z	error	Failed to fetch KongPlugin resource	{"kongplugin_name": "****.response-transformer", "kongplugin_namespace": "default", "error": "no KongPlugin or KongClusterPlugin was found for default/***.response-transformer"}
2024-03-13T16:47:06Z	error	Failed to fetch KongPlugin resource	{"kongplugin_name": "***.acl", "kongplugin_namespace": "default", "error": "no KongPlugin or KongClusterPlugin was found for default/****.acl"}

After those errors, KIC works as expected. From time to time we see errors in the log:

2024/03/14 14:32:29 http: TLS handshake error from 10.50.57.190:35458: EOF
2024/03/14 14:32:29 http: TLS handshake error from 10.50.56.159:55946: EOF
2024-03-14T17:43:45Z	info	Successfully synced configuration to Kong	{"url": "https://localhost:8444", "update_strategy": "InMemory", "v": 0}
2024-03-14T17:46:30Z	info	Successfully synced configuration to Kong	{"url": "https://localhost:8444", "update_strategy": "InMemory", "v": 0}
2024/03/14 22:09:32 http: TLS handshake error from 10.50.56.159:46498: EOF

or

time="2024-03-15T08:08:08Z" level=error msg="checking config status failed" error="making HTTP request: Get \"https://localhost:8444/status\": read tcp 127.0.0.1:46354->127.0.0.1:8444: read: connection reset by peer"
time="2024-03-15T09:51:12Z" level=error msg="failed to fetch KongIngress resource for Services default/***" error="KongIngress ****.gateway-ingress not found"
time="2024-03-15T09:51:13Z" level=error msg="failed to fetch KongIngress resource for Services default/***" error="KongIngress ****.gateway-ingress not found"
time="2024-03-15T09:51:13Z" level=error msg="failed to fetch KongIngress resource for Services default/****" error="KongIngress ****.gateway-ingress not found"
time="2024-03-15T09:51:13Z" level=error msg="failed to fetch KongIngress resource for Services default/****" error="KongIngress ****.gateway-ingress not found"
time="2024-03-15T09:51:15Z" level=info msg="successfully synced configuration to kong."

Expected Behavior

No response

Steps To Reproduce

Start a KIC pod running version 3.0.

Kong Ingress Controller version

kong/kubernetes-ingress-controller:3.0

Kubernetes version

1.27.8-gke.1067004

Anything else?

No response

oleksandrs-adorama avatar Mar 16 '24 14:03 oleksandrs-adorama

@oleksandrs-adorama It looks like the connections inside your k8s cluster (between the KIC pod and the k8s apiserver, and between KIC and the Kong gateway admin API) are not very stable, so the cache inside KIC's controller runtime may not be in sync with the k8s apiserver. Do you know which pods own the IPs 10.50.56.159 and 10.50.57.190? That would help us locate the problem.
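One way to answer the question of which pods own a given IP is to filter the output of `kubectl get pods --all-namespaces -o json` on `status.podIP`. The Python sketch below is only an illustration: the helper names `load_pods` and `pods_by_ip` are hypothetical, and the sample pod list (including the pod names and the unrelated third pod) is made up so the block runs without a cluster.

```python
import json
import subprocess


def load_pods():
    """Return the parsed output of `kubectl get pods --all-namespaces -o json`.

    Not called below, because it needs kubectl and access to a cluster.
    """
    out = subprocess.run(
        ["kubectl", "get", "pods", "--all-namespaces", "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def pods_by_ip(pod_list, ips):
    """Map each wanted IP to the namespace/name of every pod reporting it.

    A list is used per IP because pods on the host network share the node IP.
    """
    wanted = set(ips)
    owners = {}
    for pod in pod_list.get("items", []):
        ip = pod.get("status", {}).get("podIP")
        if ip in wanted:
            meta = pod["metadata"]
            owners.setdefault(ip, []).append(f'{meta["namespace"]}/{meta["name"]}')
    return owners


# Made-up sample in the shape of a PodList (names are illustrative only).
sample = {
    "items": [
        {"metadata": {"namespace": "kube-system", "name": "konnectivity-agent-a"},
         "status": {"podIP": "10.50.56.159"}},
        {"metadata": {"namespace": "kube-system", "name": "konnectivity-agent-b"},
         "status": {"podIP": "10.50.57.190"}},
        {"metadata": {"namespace": "default", "name": "unrelated"},
         "status": {"podIP": "10.50.1.1"}},
    ]
}

owners = pods_by_ip(sample, ["10.50.56.159", "10.50.57.190"])
print(owners)
```

Against a real cluster, `pods_by_ip(load_pods(), [...])` would replace the sample data.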

randmonkey avatar Mar 19 '24 08:03 randmonkey

10.50.56.159 and 10.50.57.190 belong to konnectivity-agent.

oleksandrs-adorama avatar Mar 20 '24 12:03 oleksandrs-adorama

I can also see these errors:

2024-03-20T12:05:21Z	error	controllers.KongConsumer	Reconciler error	{"reconcileID": "ce309f6f-d757-48ac-a553-f6722ecbd207", "error": "Operation cannot be fulfilled on kongconsumers.configuration.konghq.com \"****.public-user\": the object has been modified; please apply your changes to the latest version and try again"}
2024-03-20T12:05:22Z	error	controllers.KongConsumer	Reconciler error	{"reconcileID": "bd3cf33d-d5b4-4c97-9c4a-b749b9950d06", "error": "Operation cannot be fulfilled on kongconsumers.configuration.konghq.com \"****.public-user\": the object has been modified; please apply your changes to the latest version and try again"}

oleksandrs-adorama avatar Mar 20 '24 12:03 oleksandrs-adorama

I can also see these errors:

2024-03-20T12:05:21Z	error	controllers.KongConsumer	Reconciler error	{"reconcileID": "ce309f6f-d757-48ac-a553-f6722ecbd207", "error": "Operation cannot be fulfilled on kongconsumers.configuration.konghq.com \"****.public-user\": the object has been modified; please apply your changes to the latest version and try again"}
2024-03-20T12:05:22Z	error	controllers.KongConsumer	Reconciler error	{"reconcileID": "bd3cf33d-d5b4-4c97-9c4a-b749b9950d06", "error": "Operation cannot be fulfilled on kongconsumers.configuration.konghq.com \"****.public-user\": the object has been modified; please apply your changes to the latest version and try again"}

This looks like the KongConsumer in the cache being outdated. Because of k8s's eventual consistency mechanism, it will eventually be translated and applied to the Kong gateway. It may take longer for the controller cache to sync with the k8s apiserver if your cluster is heavily loaded or the network is not stable.
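The "object has been modified" message is the Kubernetes API server's optimistic concurrency check: an update that carries a stale resourceVersion is rejected with a conflict, and the client is expected to re-read the object and try again. The Python sketch below illustrates that mechanism with a toy in-memory store. It is not KIC or Kubernetes client code, and the names `Store`, `Conflict`, and `update_with_retry` are hypothetical.

```python
class Conflict(Exception):
    """Raised when an update carries a stale resource version."""


class Store:
    """Toy stand-in for an API server holding a single object."""

    def __init__(self, spec):
        self.spec = dict(spec)
        self.resource_version = 1

    def get(self):
        # Return a copy of the object together with its current version.
        return dict(self.spec), self.resource_version

    def update(self, spec, resource_version):
        # Reject writes that were computed from an outdated read.
        if resource_version != self.resource_version:
            raise Conflict("the object has been modified")
        self.spec = dict(spec)
        self.resource_version += 1


def update_with_retry(store, mutate, cached=None, attempts=3):
    """Apply mutate() to the object, re-reading it after each conflict."""
    spec, version = cached if cached is not None else store.get()
    for attempt in range(1, attempts + 1):
        new_spec = dict(spec)
        mutate(new_spec)
        try:
            store.update(new_spec, version)
            return attempt
        except Conflict:
            spec, version = store.get()  # refresh the stale copy
    raise Conflict(f"gave up after {attempts} attempts")


store = Store({"username": "public-user"})
stale = store.get()  # a cache holds version 1
store.update({"username": "public-user", "tag": "a"}, 1)  # another writer wins
attempts_used = update_with_retry(
    store, lambda s: s.update(status="programmed"), cached=stale)
print(attempts_used, store.resource_version)  # prints "2 3"
```

The first attempt fails because it is based on the cached version; the second succeeds after a fresh read and keeps the other writer's change. In a controller the retry usually takes the form of the reconcile request being queued again, which is consistent with these errors being transient.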

randmonkey avatar Mar 21 '24 03:03 randmonkey

We have three separate environments: test, dev, and prod. We see the same behavior in all three.

oleksandrs-adorama avatar Mar 24 '24 15:03 oleksandrs-adorama

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Apr 22 '24 05:04 stale[bot]