`flaps`: retry on certain failure modes, such as 504s
Change Summary
What and Why: All flaps operations will retry if the error is known to be transient. This should improve reliability - especially in situations like fly deploy where one hiccup can (currently) stop an entire deploy.
How: Adds a wrapper in `flapsutil` over `*flaps.Client` that implements retrying on certain failure modes, like 504. `NewClientWithOptions` now returns the generic `FlapsClient` so that we can return the wrapper type instead of a raw `*flaps.Client`.
Documentation
- [x] Fresh Produce
- [ ] In superfly/docs, or asked for help from docs team
- [ ] n/a
Retrying POST/PATCH may yield unexpected results, because subsequent requests might go to a different flaps and flyd, among other issues. It's basically not idempotent. https://flyio.discourse.team/t/flaps-what-status-codes-can-we-retry/5060/2
ah. that's a huge bummer. I'm going to try to see if I can get this working for GET requests then, and maybe afterwards come up with a different strategy for fly deploy machine creation.
I wonder if some operations, like SetMetadata, could be retried anyway? it's setting a named value, so even if it were called multiple times there shouldn't be any side effects...