planet-client-python
Order waiter could use exponential backoff to avoid exceeding max number of tries
Ideally, order.wait should "just work" and users shouldn't have to tune the delay or the retry limit. One tried and true way to do this is exponential backoff. For example, we want to check often early on, in case the order finishes quickly. But once we've discovered that it's still running at 5 minutes, it's less useful to check back a second later; it might easily run to 10 minutes.
Specifically, we should incrementally increase the sleep time here: https://github.com/planetlabs/planet-client-python/blob/main/planet/clients/orders.py#L454.
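For illustration, a minimal sketch of what an exponential backoff loop could look like (generic code, not the SDK's actual wait implementation; the terminal state names are assumptions about the Orders API):

```python
import asyncio

async def wait_with_backoff(check_state, initial_delay=1.0, max_delay=300.0, factor=2.0):
    """Poll check_state() until a terminal state, doubling the sleep
    between polls up to max_delay."""
    delay = initial_delay
    while True:
        state = await check_state()
        # Assumed terminal Orders API states; the real set may differ.
        if state in ('success', 'partial', 'failed', 'cancelled'):
            return state
        await asyncio.sleep(delay)
        delay = min(delay * factor, max_delay)
```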
tl;dr Agreed, it should just work and not require user tuning. My recommendation is that we set the (uniform) delay parameter internally and remove it from the function definition, and that we remove max_retries and let the user handle timeouts.
Context: the max_retries/delay combo seems to perform the following functions:
- avoid unnecessarily hammering the API with rapid requests,
- allow the user to set a timeout (through their own calculation of max_retries * delay), and
- (the real reason I implemented it) allow tests to set delay to zero so a test can run quickly.
The issue with applying an exponential delay to orders is that orders regularly take at least 5 minutes, and then complete fairly close together after that (depending loosely on the number of images in the order). To get the order to the user as quickly as possible after that characteristic delay, one would want either a uniform delay or a logarithmic one (asymptotically approaching a maximum). A logarithmic delay would hammer the API hardest right after the order is submitted, which is when the order is least likely to be ready. A uniform delay is pretty simple: we just set the maximum delay we are willing to impose on the user and go from there. I think the delay used to be something like 30 seconds, but that long gap between polls left users unsure whether the program had stalled, so we lowered it to 10 seconds (I think).
I don't see how the current limit, a maximum number of retries, really translates to something useful for the user, so something like cancel_after would make more sense. But really, I wonder whether an async function is the right place to handle timeouts at all; ultimately, using asyncio timeouts might make more sense.
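For example, a caller-side timeout could look roughly like this (a sketch assuming v2-style Session/OrdersClient usage; the exact imports and wait signature may differ):

```python
import asyncio
from planet import Session, OrdersClient

async def wait_or_give_up(order_id: str, timeout: float = 3600) -> str:
    async with Session() as session:
        client = OrdersClient(session)
        # asyncio.wait_for cancels the polling coroutine once the timeout
        # elapses, so the SDK itself would not need a max_retries knob.
        return await asyncio.wait_for(client.wait(order_id), timeout=timeout)
```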
For tests, we can just monkeypatch the delay parameter and set it to zero.
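A rough sketch of what that could look like in a test, assuming the delay lives as a module-level default (the constant name, fixtures, and async plugin here are hypothetical):

```python
import pytest

@pytest.mark.asyncio  # requires pytest-asyncio or an equivalent plugin
async def test_wait_returns_promptly(monkeypatch, orders_client, test_order_id):
    # Hypothetical module-level default; the real attribute name may differ.
    monkeypatch.setattr('planet.clients.orders.WAIT_DELAY', 0)
    state = await orders_client.wait(test_order_id)
    assert state == 'success'
```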
Hey guys, I have two questions:
- Is there a throttling mechanism implemented on the Planet backend?
- If there is, then uniform polling can be an issue: since an order takes at least 5 minutes, the user ends up spending API calls against their quota that they could otherwise have avoided (by setting the delay parameters accordingly or adjusting their script logic).
- Do the Planet backend services implement a backoff mechanism?
- If there is a backoff, and assuming the API throttling isn't extreme enough to cause an issue (again, if throttling is significant, that would be a much bigger problem), then we might be able to skip this logic altogether.
And finally, if this feature is to be added, how about we let users retain control to specify these settings when needed, while still implementing the uniform scheme described by @jreiberkyle as the default when no custom scheme is defined? A rough sketch of that interface follows.
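Something along these lines, perhaps (a sketch only; the names and terminal states are assumptions, not the SDK's API):

```python
import asyncio
from typing import Iterator, Optional

def uniform_delays(delay: float = 10.0) -> Iterator[float]:
    """Default scheme: a constant pause between polls."""
    while True:
        yield delay

async def wait_for_order(check_state, delays: Optional[Iterator[float]] = None) -> str:
    """Poll until a terminal state, drawing each sleep from `delays` so
    callers can supply their own scheme (exponential, capped, ...)."""
    delays = uniform_delays() if delays is None else delays
    while True:
        state = await check_state()
        # Assumed terminal Orders API states; the real set may differ.
        if state in ('success', 'partial', 'failed', 'cancelled'):
            return state
        await asyncio.sleep(next(delays))
```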