[lakeFSFS] Add retries and configurable timeouts to lakeFS API calls

Open arielshaqed opened this issue 2 years ago • 2 comments

As we did in the Spark metadata client for GC. If an API call times out and the exception leaks, it can be really expensive on Spark! First the entire job is retried, which can cause partitions to be recomputed. And if it times out enough times, the entire job is aborted and pretty much all work is lost. Note that when lakeFS is under load, this gets worse with more partitions rather than better :-/
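
For illustration only, here is a minimal sketch of the kind of wrapper this asks for: a configurable per-call timeout plus bounded retries with exponential backoff and jitter. It is not the lakeFSFS or metadata client code. The name `call_with_retries`, its parameters and defaults, and the use of `TimeoutError` as the timeout signal are all assumptions made for the example.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    call: Callable[[float], T],
    *,
    timeout_s: float = 10.0,     # hypothetical default per-call timeout
    max_attempts: int = 5,       # hypothetical default attempt limit
    base_delay_s: float = 0.5,
    max_delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(timeout_s), retrying on TimeoutError with backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            # The callee is expected to enforce the timeout it is given.
            return call(timeout_s)
        except TimeoutError:
            if attempt == max_attempts:
                # Out of attempts: let the exception reach the caller.
                raise
            # Exponential backoff, capped at max_delay_s.
            cap = min(max_delay_s, base_delay_s * 2 ** (attempt - 1))
            # Jitter spreads out retries from many concurrent partitions,
            # so they do not all hit a loaded server at the same moment.
            sleep(random.uniform(0, cap))
    raise AssertionError("unreachable")
```

With a wrapper like this, a transient timeout is absorbed inside the task, and the exception only surfaces to Spark after the configured number of attempts.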

arielshaqed, Apr 13 '23 07:04

This issue is now marked as stale after 90 days of inactivity, and will be closed soon. To keep it, mark it with the "no stale" label.

github-actions[bot], Nov 01 '23 15:11

This is blocked by #5110 (!)

arielshaqed, Nov 05 '23 13:11