kubernetes_asyncio icon indicating copy to clipboard operation
kubernetes_asyncio copied to clipboard

Reconnect watcher if server ends empty response

Open JacobHenner opened this issue 4 years ago • 9 comments

If the server sends an empty response (e.g. a server-side timeout was exceeded), the watcher should reconnect unless the user has specified a timeout. This is similar to the behavior in the standard Python Kubernetes library.

Fixes #95

JacobHenner avatar Feb 29 '20 22:02 JacobHenner

It looks like this behavior was introduced in https://github.com/tomplus/kubernetes_asyncio/commit/61852ea42d4ec562d61cbe86d391e5de8a88e75c, as part of https://github.com/tomplus/kubernetes_asyncio/pull/22.

@olitheolix, could you clarify why the asyncio watcher should stop if the server has responded with an empty response? From my perspective, it seems like it'd be better if the watcher remained connected (reconnect) unless the user had specified a timeout. I just want to be sure I'm not missing something...

I notice that three test cases are failing - I'll address these as soon as I understand the above.

JacobHenner avatar Mar 01 '20 16:03 JacobHenner

@JacobHenner From what I can remember, it was about the separation of concerns. The function should only deal with the current event stream and return once K8s closed that stream. The caller can then implement the desired retry/reconnect/whatever logic itself. That, at least, is how I did it In my personal version of the Watcher :)

In other words, no, I do not think you are missing anything here.

olitheolix avatar Mar 02 '20 07:03 olitheolix

I'd like to keep this client similar to the official client as possible. Are we sure that empty line means that server is disconnected? I don't see such checks in the official client.

tomplus avatar Mar 02 '20 13:03 tomplus

The function should only deal with the current event stream and return once K8s closed that stream. The caller can then implement the desired retry/reconnect/whatever logic itself.

Ah, I see. From my perspective, the library should be responsible for maintaining the connection unless a user-specified timeout has elapsed or the server has responded with some sort of error condition. I've seen other comments that lead me to believe this is the expected behavior from a client - https://github.com/kubernetes/kubernetes/issues/6513#issuecomment-90721269, for example.

Are we sure that empty line means that server is disconnected?

It looks like it could mean a few things - https://github.com/kubernetes/kubernetes/blob/79e1ad2f4bbd05b1e56b7b57b63b2c1d67b90156/staging/src/k8s.io/apiserver/pkg/endpoints/handlers/watch.go#L212-L261

  • Timeout expired
  • Marked "done"
  • Error encountered

I don't see such checks in the official client.

In the official Python client:

https://github.com/kubernetes-client/python-base/blob/d30f1e6fd4e2725aae04fa2f4982a4cfec7c682b/watch/watch.py#L141-L157

When iter_resp_lines(resp) returns a value which evaluates to False (e.g. the k8s api returns an empty response), iteration stops. Unless _stop has been set to True, the client will attempt to reconnect.

JacobHenner avatar Mar 03 '20 02:03 JacobHenner

I got it, the official client "ignore" empty lines. In my opinion it's better if a library does mandatory retries to simplify end-client's code.

tomplus avatar Mar 03 '20 08:03 tomplus

I got it, the official client "ignore" empty lines. In my opinion it's better if a library does mandatory retries to simplify end-client's code.

Should I adjust the tests accordingly?

JacobHenner avatar Mar 03 '20 14:03 JacobHenner

Yes, It'd be great.

tomplus avatar Mar 03 '20 21:03 tomplus

Codecov Report

:exclamation: No coverage uploaded for pull request base (master@0bdd0f7). Click here to learn what that means. The diff coverage is 88.88%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master      #96   +/-   ##
=========================================
  Coverage          ?   93.31%           
=========================================
  Files             ?       23           
  Lines             ?     1585           
  Branches          ?        0           
=========================================
  Hits              ?     1479           
  Misses            ?      106           
  Partials          ?        0           
Impacted Files Coverage Δ
kubernetes_asyncio/watch/watch.py 94.23% <88.88%> (ø)
kubernetes_asyncio/watch/__init__.py 100.00% <0.00%> (ø)
kubernetes_asyncio/config/kube_config_test.py 94.25% <0.00%> (ø)
kubernetes_asyncio/config/incluster_config.py 85.10% <0.00%> (ø)
kubernetes_asyncio/config/exec_provider_test.py 100.00% <0.00%> (ø)
kubernetes_asyncio/config/google_auth.py 100.00% <0.00%> (ø)
kubernetes_asyncio/config/config_exception.py 100.00% <0.00%> (ø)
kubernetes_asyncio/config/kube_config.py 93.78% <0.00%> (ø)
kubernetes_asyncio/utils/__init__.py 100.00% <0.00%> (ø)
kubernetes_asyncio/config/__init__.py 100.00% <0.00%> (ø)
... and 14 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 0bdd0f7...ff5894a. Read the comment docs.

codecov-io avatar Apr 05 '20 01:04 codecov-io

Is there any progress on this PR?

junnplus avatar Aug 04 '20 04:08 junnplus