Add support to retry download after a read timeout
Description of the feature request:
There are multiple options for retrying downloads, such as --http_connector_attempts and --experimental_repository_downloader_retries, but running into a read timeout during fetching does not trigger a retry.
Would it be possible to treat download timeouts as another trigger for --experimental_repository_downloader_retries?
I have to admit I don't have any experience with the Bazel source code, but I can see here that only ContentLengthMismatchException and SocketException are considered for retries. Would it be possible to treat SocketTimeoutException as another retriable exception?
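To illustrate the idea, here is a rough sketch of such a check. This is not the actual Bazel code: the class name RetrySketch, the method isRetriable, and the stand-in ContentLengthMismatchException are made up for illustration, and walking the cause chain is my own assumption about how the wrapped IOException would need to be inspected. Note that java.net.SocketTimeoutException extends InterruptedIOException rather than SocketException, so a check for SocketException alone does not cover it.

```java
import java.io.IOException;
import java.net.SocketException;
import java.net.SocketTimeoutException;

public class RetrySketch {
  /** Hypothetical stand-in for Bazel's ContentLengthMismatchException. */
  static class ContentLengthMismatchException extends IOException {
    ContentLengthMismatchException(String message) {
      super(message);
    }
  }

  /**
   * Returns true if the throwable, or any exception in its cause chain,
   * is one we would want to retry the download for.
   */
  static boolean isRetriable(Throwable e) {
    for (Throwable t = e; t != null; t = t.getCause()) {
      if (t instanceof ContentLengthMismatchException
          || t instanceof SocketException
          || t instanceof SocketTimeoutException) { // proposed addition
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Mirrors the shape of the error in the CI log: an IOException wrapping a read timeout.
    IOException wrapped =
        new IOException("Error downloading", new SocketTimeoutException("Read timed out"));
    System.out.println(isRetriable(wrapped)); // true
    System.out.println(isRetriable(new IOException("some other failure"))); // false
  }
}
```

With a check like this, a read timeout would count against the same retry budget as the other two exception types.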
I would be happy to contribute to the implementation of this request if you can confirm that this is possible and not disallowed by design.
Which category does this issue belong to?
Core
What underlying problem are you trying to solve with this feature?
The underlying problem we're facing is that during peak request times the artifact storage can start to time out on downloads.
These errors happened with --http_connector_attempts=11 and --experimental_repository_downloader_retries=10 set. In this case it would be very beneficial to also use the retries from --experimental_repository_downloader_retries to re-run downloads when there is a "Read timed out" error.
Which operating system are you running Bazel on?
Linux
What is the output of bazel info release?
"release 7.6.0" (Also should be the same behaviour for 8+)
If bazel info release returns development version or (@non-git), tell us how you built Bazel.
No response
What's the output of git remote get-url origin; git rev-parse HEAD ?
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
Here is an example from CI:
clang-tidy Run
bazel build --config=clang-tidy --build_tag_filters=-no-clang-tidy,-fast-clang-tidy --target_pattern_file=tools/target_patterns/clang_tidy.params
2025-05-27T06:01:48.2971478Z ERROR: some/path/BUILD:465:8: //some/path/some_test depends on @@_main~_repo_rules~some_test_data//:srcs in repository @@_main~_repo_rules~some_test_data which failed to fetch.
no such package '@@_main~_repo_rules~some_test_data//': java.io.IOException: Error downloading [https://artifacts_registry_url/repository/path/file.zip] to /mnt/data/bazel-user-root/45ef7d2bd11527ab6fca94135f0ad0a0/external/_main~_repo_rules~some_test_data/file.zip: java.net.SocketTimeoutException: Read timed out
Sure, feel free to send a PR!
Instead of retrying on a timeout, would a longer timeout work for you? I don't know whether it's configurable yet, but allowing it to be increased seems more natural to me than retrying it.
@fmeum Thanks, that's a good point, and we do have --experimental_scale_timeouts, which might be a better fit to work around the issue? @Bechir-Braham
OK, that flag controls the repository_ctx.execute timeout; the one that controls the HTTP download timeout is --http_timeout_scaling.