microbean-kubernetes-controller icon indicating copy to clipboard operation
microbean-kubernetes-controller copied to clipboard

Reflector#start() does not retry infinitely

Open ljnelson opened this issue 7 years ago • 3 comments
trafficstars

I am not entirely sure but I believe that the relevant Go code ends up retrying forever:

https://github.com/kubernetes/client-go/blob/dcf16a0f3b52098c3d4c1467b6c80c3e88ff65fb/tools/cache/reflector.go#L128-L137

But Reflector#start() will bomb out if it can't list things:

https://github.com/microbean/microbean-kubernetes-controller/blob/ee05d9ece9f591bc7052a2569a6422e804be42e8/src/main/java/org/microbean/kubernetes/controller/Reflector.java#L625

That line will throw a KubernetesClientException, and that's it. I think this whole method should (internally) be trying this list-and-watch loop forever.

For more details, see also the ListAndWatch function:

https://github.com/kubernetes/client-go/blob/dcf16a0f3b52098c3d4c1467b6c80c3e88ff65fb/tools/cache/reflector.go#L165-L275

Another way to put this is that currently Reflector#start() really just models ListAndWatch, but it should be modeling Run in reflector.go as well.

Complicating matters is the fact that the fabric8 client will automatically try to reconnect watches if they fail.

ljnelson avatar Oct 09 '18 23:10 ljnelson

I believe this code:

https://github.com/kubernetes/client-go/blob/dcf16a0f3b52098c3d4c1467b6c80c3e88ff65fb/tools/cache/reflector.go#L226-L274

…is actually implemented by the fabric8 client, more or less:

https://github.com/fabric8io/kubernetes-client/blob/cee8550053e671bb0b3489e37196924f4c949a3f/kubernetes-client/src/main/java/io/fabric8/kubernetes/client/dsl/internal/WatchConnectionManager.java#L293-L333

…so the problem may simply reduce to the body of the start() method needing to run infinitely, i.e. to ensure that the listing behavior is also retried ad nauseam.

ljnelson avatar Oct 10 '18 00:10 ljnelson

Ah, wait; this is already handled:

https://github.com/microbean/microbean-kubernetes-controller/blob/ee05d9ece9f591bc7052a2569a6422e804be42e8/src/main/java/org/microbean/kubernetes/controller/Reflector.java#L1031-L1057

There could be better logging here. I'll add logging and then close this issue.

ljnelson avatar Oct 10 '18 19:10 ljnelson

Need to also review the Go code to see if the "full replace" operation occurs in the event of a failure-and-retry occurrence.

ljnelson avatar Oct 11 '18 18:10 ljnelson