Retry token on 503 responses
As described in https://github.com/custom-components/zaptec/issues/357#issuecomment-3508489144, Zaptec has been observed to return 503 Service Unavailable. Currently _refresh_token() fails on all non-success return codes. We should consider retrying the request instead (with exponential back-off).
https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Status/503
@ivastokic @thecoldwine @bonfaceOchieng
What does it mean when the Zaptec cloud responds 503 on https://api.zaptec.com/oauth/token? We have a user that reports getting this. What should be our approach? Keep retrying with exponential back-off as per usual or is this an error condition that we shouldn't retry?
@sveinse 503 means there is an upstream error in getting the token. The best way forward for this user is to contact support, so we can check what the exact error is.
We also will review the way the integration requests the token, since we had a few changes with the auth last week.
@thecoldwine Thank you for answering. So beside this user with this specific problem, we should still generate an error when receiving 503 from Zaptec? Or should we improve the error handling by e.g. retrying?
@sveinse I looked through the code. Is it possible to cache the token for 23 hours on the device, for example? We're in the process of migrating to a new auth system (and eventually deprecating the password grant), and I see there might be excessive traffic from Home Assistant to the auth endpoints.
I will look through the access logs to confirm, of course. For now I will just relax the rate limit a bit, but this needs to be addressed on the client side as well.
@thecoldwine We never expire the token while the system is running. We make requests to the API endpoints, and if one returns 401 UNAUTHORIZED, we do a token refresh.
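To illustrate the flow described above, here is a minimal sketch of "request, and refresh only on 401". This is not the integration's actual code: `request_with_reauth`, `do_request` and `refresh_token` are hypothetical names, and the HTTP calls are injected as callables so the sketch stays independent of any particular client.

```python
from typing import Awaitable, Callable


async def request_with_reauth(
    do_request: Callable[[str], Awaitable[tuple[int, dict]]],
    refresh_token: Callable[[], Awaitable[str]],
    token: str,
) -> tuple[str, int, dict]:
    """Send a request; on 401, refresh the token once and retry.

    Hypothetical helper: do_request(token) is assumed to return
    (status, payload), and refresh_token() to return a new token.
    """
    status, payload = await do_request(token)
    if status == 401:
        # The token was rejected, so fetch a new one and retry exactly once
        token = await refresh_token()
        status, payload = await do_request(token)
    # Return the (possibly new) token so the caller can keep using it
    return token, status, payload
```

With this pattern the token endpoint is only hit when an API call is actually rejected, so a long-running instance should generate very little auth traffic.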
@sveinse okay. And I assume the user agent for those requests is not a generic python/requests? The thing is, we see a lot of traffic from a couple of overly generic user agents and we have rate limited those heavily.
If you get 503 on /oauth/token, for now you should retry it with normal exponential backoff. We will keep integrators posted on upcoming auth changes, where we will release a new grant_type for a simpler experience.
@thecoldwine The Zaptec integration is leveraging the HTTP client provided by Home Assistant, so it uses the user agent value provided by them. See https://github.com/home-assistant/core/blob/c0f61f6c2b40570ece4b79ca351b0e5ae9a4154a/homeassistant/helpers/aiohttp_client.py#L45-L48
This seems to currently be "HomeAssistant/x.y.x aiohttp/x.y Python/3.x"
@sveinse good, then we disregard HomeAssistant as a potential source of the suspicious traffic :) Thanks!
Great. I'm leaving this issue open to implement the backoff and retries when receiving 503.
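A rough sketch of what the retry could look like, as a starting point only. It assumes that only 503 is treated as transient, that the token request is wrapped in a callable returning `(status, payload)`, and that the attempt count and delays below are reasonable defaults; `refresh_with_backoff`, `TokenRefreshError` and all parameter names are hypothetical, not existing integration code.

```python
import asyncio
import random
from typing import Awaitable, Callable

RETRY_STATUSES = {503}  # assumption: only 503 is treated as transient


class TokenRefreshError(Exception):
    """Raised on a hard error or when all retry attempts are used up."""


async def refresh_with_backoff(
    request_token: Callable[[], Awaitable[tuple[int, dict]]],
    max_attempts: int = 5,      # assumed default, tune as needed
    base_delay: float = 1.0,    # seconds before the first retry
    max_delay: float = 60.0,    # upper bound for a single wait
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> dict:
    """Call request_token() until it returns 200, backing off on 503."""
    for attempt in range(max_attempts):
        status, payload = await request_token()
        if status == 200:
            return payload
        if status not in RETRY_STATUSES:
            # Anything else (e.g. 400 or 401) is not retried
            raise TokenRefreshError(f"Token request failed with HTTP {status}")
        if attempt < max_attempts - 1:
            # Double the delay on each attempt, cap it, add up to 10% jitter
            delay = min(max_delay, base_delay * 2**attempt)
            await sleep(delay + random.uniform(0, delay / 10))
    raise TokenRefreshError(f"Token request still failing after {max_attempts} attempts")
```

The jitter is there so that many installations recovering from the same outage do not all retry at the same instant. The `sleep` parameter exists only to make the helper easy to test without real waiting.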
Hey @thecoldwine thanks for your support here! I think the evcc community is currently also getting dropped by the new rate limits :(
I have two Zaptec boxes but cannot use them at the same time.
All the guys who help with evcc there are bouncing off the Zaptec support about api questions :(
Really sorry @sveinse for crashing this discussion.
@chris-e-codes could you please be more specific on the issue?
We are trying to help as much as we can, and the best way to communicate about integrations is with @bonfaceOchieng.
thanks for picking this up! I'm not one of the coders. I just try to connect the right people. Your response in that thread will help very much!