Single bad ingress prevents other ingresses from working

Open specialforest opened this issue 3 years ago • 5 comments

Describe the bug

The Kubernetes controller fails to handle one of the ingresses, and as a result the YARP configuration is empty.

To Reproduce

Likely related to a service not having an endpoint because the corresponding pod is in a crash loop.

warn: Yarp.Kubernetes.Controller.Services.Reconciler[0]
      Uncaught exception occured while reconciling
      System.NullReferenceException: Object reference not set to an instance of an object.
         at Yarp.Kubernetes.Controller.Converters.YarpParser.HandleIngressRulePath(YarpIngressContext ingressContext, V1ServicePort servicePort, List`1 endpoints, IList`1 defaultSubsets, V1IngressRule rule, V1HTTPIngressPath path, YarpConfigContext configContext) in src\Kubernetes.Controller\Converters\YarpParser.cs:line 79
         at Yarp.Kubernetes.Controller.Converters.YarpParser.HandleIngressRule(YarpIngressContext ingressContext, List`1 endpoints, IList`1 defaultSubsets, V1IngressRule rule, YarpConfigContext configContext) in src\Kubernetes.Controller\Converters\YarpParser.cs:line 44
         at Yarp.Kubernetes.Controller.Converters.YarpParser.ConvertFromKubernetesIngress(YarpIngressContext ingressContext, YarpConfigContext configContext) in src\Kubernetes.Controller\Converters\YarpParser.cs:line 31
         at Yarp.Kubernetes.Controller.Services.Reconciler.ProcessAsync(CancellationToken cancellationToken) in src\Kubernetes.Controller\Services\Reconciler.cs:line 47

Further technical details

Two issues here:

  1. The exception handling in Reconciler.cs makes me think that a single bad ingress prevents the others from working. Should it be moved inside the loop?
  2. NullReferenceException that I'm trying to debug.
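To illustrate the first point, here is a minimal sketch of per-ingress exception handling. It is not the actual Reconciler code: `ReconcileSketch`, `ConvertAll`, and the `convert` callback are hypothetical names standing in for the conversion step, and ingresses are represented by plain strings.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the reconcile loop; not the YARP API.
public static class ReconcileSketch
{
    // Converts each ingress independently and returns the names of those that failed.
    public static List<string> ConvertAll(
        IEnumerable<string> ingressNames,
        Action<string> convert)
    {
        var failed = new List<string>();
        foreach (var name in ingressNames)
        {
            try
            {
                convert(name);
            }
            catch (Exception ex)
            {
                // The failure is recorded and the loop moves on to the next ingress.
                Console.Error.WriteLine($"Skipping ingress '{name}': {ex.Message}");
                failed.Add(name);
            }
        }
        return failed;
    }
}
```

With the try/catch inside the loop, an exception thrown for one ingress leaves the results for the remaining ingresses intact, at the cost of having to decide what to do with the failed ones.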

specialforest avatar May 24 '22 21:05 specialforest

I noticed another issue here, with WorkQueue: after the exception, the reconciler stops doing anything despite changes to Kubernetes resources.

specialforest avatar May 25 '22 17:05 specialforest

The NullReferenceException is caused by a port mismatch between the Service and Ingress specs, which leaves the servicePort variable null.

For example, the Service spec has:

ports:
- name: https
  port: 10000
  protocol: TCP
  targetPort: 89

and the Ingress spec has:

port:
  number: 443
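
A minimal sketch of how such a mismatch could be detected before the port is dereferenced. The `ServicePort` and `BackendPort` records and the `FindServicePort` helper are hypothetical simplifications, not the Kubernetes client types or the controller's actual matching logic; the sketch assumes a backend port matches a service port by number or by name.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical simplified models of a Service port and an Ingress backend port.
public sealed record ServicePort(string Name, int Port);
public sealed record BackendPort(string? Name, int? Number);

public static class PortMatchSketch
{
    // Returns the service port the ingress backend refers to, or null if none matches.
    public static ServicePort? FindServicePort(
        IEnumerable<ServicePort> servicePorts,
        BackendPort backend)
    {
        return servicePorts.FirstOrDefault(p =>
            (backend.Number.HasValue && p.Port == backend.Number.Value) ||
            (!string.IsNullOrEmpty(backend.Name) && p.Name == backend.Name));
    }
}
```

For the specs above, the Service exposes port 10000 named https while the Ingress asks for port number 443, so the lookup returns null. A caller that checks for null can log the mismatch and skip the path instead of throwing.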

specialforest avatar May 25 '22 20:05 specialforest

The exception handling in Reconciler.cs makes me think that a single bad ingress prevents the others from working. Should it be moved inside the loop? (https://github.com/microsoft/reverse-proxy/blob/7dbcd58ce86241eec5e9998607f3c805c95c437c/src/Kubernetes.Controller/Services/Reconciler.cs#L57)

This part still stands out.

specialforest avatar May 27 '22 17:05 specialforest

@MihaZupan could you reopen this?

specialforest avatar May 31 '22 17:05 specialforest

I've published a fix for the issue.

What's missing from the fix is the following case:

  1. An Ingress is added and successfully applied to YARP configuration.
  2. The Ingress is updated, and now the Ingress Controller cannot process it. The Ingress spec is valid, but the Ingress Controller has a bug.

Ideally, we want to keep the previous good configuration of the Ingress in place in this case, but the current implementation will wipe it out.
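
One possible shape for this, as a rough sketch only: cache the last successfully converted configuration per ingress and fall back to it when a later conversion throws. `LastGoodConfigCache`, `TryGetConfig`, and the `convert` callback are hypothetical names, not part of the published fix or of YARP.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical cache of the last successfully converted configuration per ingress.
public sealed class LastGoodConfigCache<TConfig>
{
    private readonly Dictionary<string, TConfig> _lastGood = new();

    // Returns true if a configuration is available: either the fresh result,
    // or the previous good one when the conversion throws.
    public bool TryGetConfig(string ingressKey, Func<TConfig> convert, out TConfig config)
    {
        try
        {
            config = convert();
            _lastGood[ingressKey] = config;
            return true;
        }
        catch (Exception)
        {
            // Fall back to the previous good configuration, if there is one.
            return _lastGood.TryGetValue(ingressKey, out config);
        }
    }

    // Intended to be called when the ingress is deleted, so stale entries do not linger.
    public bool Remove(string ingressKey) => _lastGood.Remove(ingressKey);
}
```

The trade-off is that a stale configuration keeps serving traffic until the ingress is fixed or deleted, so the failure would need to be surfaced (for example in logs) rather than silently masked.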

specialforest avatar May 31 '22 21:05 specialforest