Single bad ingress prevents other ingresses from working
Describe the bug
The Kubernetes controller fails to handle one of the ingresses, and as a result the YARP configuration is empty.
To Reproduce
Likely related to a service not having an endpoint because the corresponding pod is in a crash loop.
warn: Yarp.Kubernetes.Controller.Services.Reconciler[0]
Uncaught exception occured while reconciling
System.NullReferenceException: Object reference not set to an instance of an object.
at Yarp.Kubernetes.Controller.Converters.YarpParser.HandleIngressRulePath(YarpIngressContext ingressContext, V1ServicePort servicePort, List`1 endpoints, IList`1 defaultSubsets, V1IngressRule rule, V1HTTPIngressPath path, YarpConfigContext configContext) in src\Kubernetes.Controller\Converters\YarpParser.cs:line 79
at Yarp.Kubernetes.Controller.Converters.YarpParser.HandleIngressRule(YarpIngressContext ingressContext, List`1 endpoints, IList`1 defaultSubsets, V1IngressRule rule, YarpConfigContext configContext) in src\Kubernetes.Controller\Converters\YarpParser.cs:line 44
at Yarp.Kubernetes.Controller.Converters.YarpParser.ConvertFromKubernetesIngress(YarpIngressContext ingressContext, YarpConfigContext configContext) in src\Kubernetes.Controller\Converters\YarpParser.cs:line 31
at Yarp.Kubernetes.Controller.Services.Reconciler.ProcessAsync(CancellationToken cancellationToken) in src\Kubernetes.Controller\Services\Reconciler.cs:line 47
Further technical details
Two issues here:
- The exception handling in Reconciler.cs makes me think that a single bad ingress prevents the others from working. Should it be moved inside the loop?
- A NullReferenceException, which I'm trying to debug.
I noticed another issue here, with WorkQueue: after the exception, the reconciler stops doing anything despite changes to Kubernetes resources.
The NullReferenceException is caused by a port mismatch between the Service and Ingress specs, which leaves the servicePort variable null.
E.g. the Service spec has:
ports:
- name: https
  port: 10000
  protocol: TCP
  targetPort: 89
and the Ingress spec has:
port:
  number: 443
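To make the mismatch concrete, here is a minimal Python model of the port lookup. It is not YARP's C# code: the helper `find_service_port` and the dict shapes are hypothetical, and the matching rule (the Ingress backend port number is compared with the Service `port`, or the backend port name with the Service port `name`) is an assumption based on how the Kubernetes Ingress API describes backend ports.

```python
from typing import Dict, List, Optional


def find_service_port(service_ports: List[Dict], backend_port: Dict) -> Optional[Dict]:
    """Return the Service port an Ingress backend refers to, or None if nothing matches.

    Hypothetical helper: matches by number against `port`, or by `name`.
    """
    for sp in service_ports:
        if "number" in backend_port and sp.get("port") == backend_port["number"]:
            return sp
        if "name" in backend_port and sp.get("name") == backend_port["name"]:
            return sp
    return None


# The Service spec from this issue.
service_ports = [{"name": "https", "port": 10000, "protocol": "TCP", "targetPort": 89}]

mismatched = find_service_port(service_ports, {"number": 443})   # Ingress spec from this issue
matched = find_service_port(service_ports, {"number": 10000})    # what would have matched

# The caller has to handle the "no match" case instead of dereferencing it.
if mismatched is None:
    print("no Service port matches Ingress backend port 443; skipping this path")
```

With the specs above, the lookup for 443 finds nothing, which corresponds to servicePort being null in the stack trace. Either changing the Ingress to port number 10000 or referring to the port by the name https would give a match in this model.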
The exception handling in Reconciler.cs makes me think that a single bad ingress prevents the others from working. Should it be moved inside the loop? (https://github.com/microsoft/reverse-proxy/blob/7dbcd58ce86241eec5e9998607f3c805c95c437c/src/Kubernetes.Controller/Services/Reconciler.cs#L57)
This part still stands out.
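A rough sketch of what "moving the exception handling inside the loop" could look like, written in Python as a language-neutral illustration rather than as a patch to Reconciler.cs. The names `convert_ingress` and `reconcile` and the dict shapes are hypothetical stand-ins for the real conversion step.

```python
import logging

logger = logging.getLogger("reconciler")


def convert_ingress(ingress):
    """Hypothetical stand-in for the per-ingress conversion step."""
    if ingress.get("service_port") is None:
        # Mirrors the failure mode in this issue: a missing port makes conversion fail.
        raise ValueError("ingress %r has no matching service port" % ingress["name"])
    return {"route": ingress["name"], "port": ingress["service_port"]}


def reconcile(ingresses):
    """Convert every ingress, isolating a failure to the ingress that caused it."""
    routes, failed = [], []
    for ingress in ingresses:
        try:
            routes.append(convert_ingress(ingress))
        except Exception:
            # Log and continue so that one bad ingress cannot empty the whole config.
            logger.warning("failed to convert ingress %s", ingress["name"], exc_info=True)
            failed.append(ingress["name"])
    return routes, failed


routes, failed = reconcile([
    {"name": "good-a", "service_port": 10000},
    {"name": "bad", "service_port": None},
    {"name": "good-b", "service_port": 8080},
])
```

In this model the two good ingresses still produce routes and only the bad one is reported as failed, whereas a single try/except around the whole loop would have produced no routes at all.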
@MihaZupan could you reopen this?
I've published a fix for the issue.
What's missing from the fix is the following case:
- An Ingress is added and successfully applied to YARP configuration.
- The Ingress is updated, and now the Ingress Controller cannot process it. The Ingress spec is valid, but the Ingress Controller has a bug.
Ideally, we want to keep the previous good configuration of the Ingress in place in this case, but the current implementation wipes it out.
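One possible shape for the "keep the previous good configuration" behaviour, sketched in Python under stated assumptions. The `ConfigStore` class, its methods, and the `convert` callable are all hypothetical and do not correspond to anything in the YARP code base; the sketch only models the two-step case listed above.

```python
class ConfigStore:
    """Keeps the last successfully converted config per ingress (hypothetical)."""

    def __init__(self, convert):
        self._convert = convert   # callable: ingress dict -> config dict, may raise
        self._last_good = {}      # ingress name -> last good config
        self.errors = {}          # ingress name -> most recent conversion error

    def apply(self, ingress):
        name = ingress["name"]
        try:
            self._last_good[name] = self._convert(ingress)
            self.errors.pop(name, None)
        except Exception as exc:
            # Conversion failed: record the error and leave any previous good entry untouched.
            self.errors[name] = exc
        return self._last_good.get(name)

    def remove(self, name):
        # An ingress that was actually deleted should still be dropped.
        self._last_good.pop(name, None)
        self.errors.pop(name, None)


def convert(ingress):
    if ingress["port"] is None:
        raise ValueError("controller cannot process this ingress")
    return {"route": ingress["name"], "port": ingress["port"]}


store = ConfigStore(convert)
first = store.apply({"name": "web", "port": 10000})   # added and applied successfully
second = store.apply({"name": "web", "port": None})   # update fails, old config is kept
```

The trade-off in this design is that a stale configuration keeps serving traffic after a failed update, so the recorded error would need to be surfaced somewhere (logs, events, or status) for the failure to be noticed.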