Retry Build and Functional Test Steps when timeout occurs
Attempts to address: Connect to repository.apache.org:443 [repository.apache.org/65.109.119.155] failed: Connect timed out
max_attempts: 3
retry_wait_seconds: 600
nick-fields/retry is already used in publish
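For reference, a rough sketch of what a wrapped step might look like with the values above. The step name, `timeout_minutes` value, and Gradle command are placeholders I made up for illustration, not the actual workflow contents; the action does require one of `timeout_minutes` or `timeout_seconds`.

```yaml
# Sketch only: step name, timeout, and command are hypothetical placeholders.
- name: Build (with retry)
  uses: nick-fields/retry@v3
  with:
    timeout_minutes: 60        # assumed value; timeout_minutes or timeout_seconds is required
    max_attempts: 3            # from this PR
    retry_wait_seconds: 600    # from this PR: wait 10 minutes between attempts
    command: ./gradlew build   # hypothetical command
```

With no `retry_on` set, the default is `any`, so every kind of failure triggers a retry.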
So, if I understand this correctly, this will retry the step for any failure, not just timeouts? Like, if there is a genuine test failure, it will still wait 10 minutes and then retry?
My understanding is that RAO will be back to normal within the week (https://the-asf.slack.com/archives/CBX4TSBQ8/p1749838588937849). These issues are temporary, though painful (https://the-asf.slack.com/archives/CBX4TSBQ8/p1750077904096649). I think we should hold off on this and continue to retry manually. Auto-retrying will likely lead to more issues.
nick-fields/retry does support retry_on, although I'm not confident these failures are reported to it in a way that will match.
retry_on
Optional Event to retry on. Currently supports [any (default), timeout, error].
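To make that option concrete, here is a sketch with `retry_on` set. As far as I can tell, `timeout` refers to the command exceeding the action's own `timeout_minutes`, not to a "Connect timed out" message inside the build output. If that reading is right, a failed connection to repository.apache.org would surface as a non-zero exit (`error`), the same as a genuine test failure, so this would likely not retry the failures we are seeing. The step name, timeout value, and command are again hypothetical.

```yaml
# Sketch only: illustrates retry_on; names and values other than
# max_attempts and retry_wait_seconds are placeholders.
- name: Build (retry only when the step itself times out)
  uses: nick-fields/retry@v3
  with:
    timeout_minutes: 60        # assumed value
    max_attempts: 3
    retry_wait_seconds: 600
    retry_on: timeout          # one of: any (default), timeout, error
    command: ./gradlew build   # hypothetical command
```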
Every CI run failing is very painful at present.
Interestingly, on https://github.com/apache/grails-core/actions/runs/15731765021/job/44334333078?pr=14821 there were 0 timeouts on the first attempt, which is the first time I have seen that in a while.