homu
Wait for a short while before picking the next PR in the queue if the current PR fails
For Servo, there is a good chance that a failure is caused by an intermittent and would disappear after a retry.
When the queue is empty, the retry can be very efficient, because homu reuses the results from the previous attempt and runs only the failed tasks.
However, if the queue has at least one other PR in it, the next PR is picked immediately after the failure. The failed PR then has to wait for an extended time, and it will likely have to run all the tasks again. That is a big waste of both machine time and human time, just for an intermittent.
I suggest that we let homu wait for a short while, e.g. 3-5 minutes, after a PR fails before picking the next one. This would give people time to check the failure and, if it looks intermittent, retry immediately.
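To make the proposal concrete, here is a minimal sketch of the decision the queue would make after a failure. It is not homu's actual code: `Action`, `FailedBuild`, `decide`, and the 4-minute `GRACE_SECONDS` default are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    RETRY_FAILED = "retry_failed"  # re-run the failed PR while its results are still reusable
    WAIT = "wait"                  # hold the queue; the grace period is still running
    PICK_NEXT = "pick_next"        # move on to the next PR in the queue


@dataclass
class FailedBuild:
    pr_number: int
    failed_at: float               # timestamp in seconds, e.g. from time.monotonic()
    retry_requested: bool = False  # set when someone asks for a retry


# Assumed default, inside the 3-5 minute range proposed above.
GRACE_SECONDS = 4 * 60


def decide(failed: Optional[FailedBuild], now: float,
           grace: float = GRACE_SECONDS) -> Action:
    """Decide what the queue should do given the most recent failure, if any."""
    if failed is None:
        return Action.PICK_NEXT
    if failed.retry_requested:
        return Action.RETRY_FAILED
    if now - failed.failed_at < grace:
        return Action.WAIT
    return Action.PICK_NEXT
```

With this shape, a retry requested at any point inside the window wins over the next PR, and the queue only advances once the window has expired without a retry.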
@glennw suggested on IRC that it may be reasonable to scale the delay with priority, so that a higher-priority PR gets a longer waiting time. @mbrubeck suggested that an explicit "r-" should abort the delay and start the next PR immediately.
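The two suggestions could be combined roughly as follows. This is only a sketch under stated assumptions: priority is taken to be a non-negative integer, the delay grows linearly with it up to a cap, and the function names and all the numbers (`base`, `per_level`, `cap`) are hypothetical rather than anything homu defines.

```python
def grace_seconds(priority: int, base: float = 180.0,
                  per_level: float = 60.0, cap: float = 600.0) -> float:
    """Grace period for a failed PR: longer for higher priority, capped.

    All defaults are illustrative assumptions, not homu settings.
    """
    return min(base + per_level * max(priority, 0), cap)


def should_hold_queue(elapsed: float, priority: int, rejected: bool) -> bool:
    """Return True while the queue should keep waiting on the failed PR.

    `rejected` stands for an explicit "r-" on the failed PR, which
    aborts the delay so the next PR can start immediately.
    """
    if rejected:
        return False
    return elapsed < grace_seconds(priority)
```

The cap keeps a very high priority from stalling the queue indefinitely; whether the scaling should be linear at all is an open design question.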