lakeFS [lakeFSFS] Add retries and configurable timeouts to lakeFS API calls

Open arielshaqed opened this issue 2 years ago • 2 comments

As we did in the Spark metadata client for GC. If an API call times out and the exception leaks, it can be really expensive on Spark! First the entire job is retried, which can cause partitions to be recomputed. And if it times out enough times, the entire job is aborted and pretty much all work is lost. Note that when lakeFS is under load, this gets worse with more partitions rather than better :-/

Apr 13 '23 07:04 arielshaqed

This issue is now marked as stale after 90 days of inactivity, and will be closed soon. To keep it, mark it with the "no stale" label.

Nov 01 '23 15:11 github-actions[bot]

This is blocked by #5110 (!)

Nov 05 '23 13:11 arielshaqed
