Detect/mitigate stuck Docker builds
We received a report of a Docker build getting stuck, reporting no status, and never completing. It first happened after a Live Update crash fallback, but the same behavior happened after a fresh `tilt up`. The user was able to get things to proceed after running `docker system prune`.
Given these conditions, it seems less likely to be a pure Tilt bug, but the experience is frustrating, particularly because there's no practical way to address it within Tilt.
I think there might be several possible improvements here:
- Have a (configurable?) timeout for max duration w/o any activity from BuildKit
- Possibility to trigger a from-scratch build, e.g. `docker build --no-cache ...`, to eliminate bad intermediary containers
- Detect critically low / out of disk space [probably tricky w/ Docker for Mac/Windows and their hidden VMs] and suggest running prune (or triggering an immediate/more aggressive `tilt docker prune`)
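
To make the inactivity-timeout idea concrete, here is a rough Go sketch. It is only an illustration under assumptions: the `events` channel stands in for whatever status stream the build actually exposes, and `WatchActivity`, `ErrBuildStalled`, and `simulateBuild` are hypothetical names, not existing Tilt or BuildKit APIs.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// ErrBuildStalled is a hypothetical sentinel error (not part of Tilt or BuildKit).
var ErrBuildStalled = errors.New("no build activity within the idle timeout")

// WatchActivity consumes build status events until the channel is closed
// (build finished), ctx is cancelled, or no event arrives for idleTimeout.
func WatchActivity(ctx context.Context, events <-chan string, idleTimeout time.Duration) error {
	for {
		select {
		case _, ok := <-events:
			if !ok {
				return nil // channel closed: the build completed
			}
			// Any event counts as activity; looping again starts a fresh idle timer.
		case <-time.After(idleTimeout):
			return ErrBuildStalled
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

// simulateBuild fakes a build that emits two status events and then either
// finishes (closes the channel) or goes silent forever (wedged).
func simulateBuild(wedged bool) error {
	events := make(chan string)
	go func() {
		events <- "step 1/3"
		events <- "step 2/3"
		if !wedged {
			close(events)
		}
		// When wedged, no more events arrive and the channel is never closed.
	}()
	return WatchActivity(context.Background(), events, 100*time.Millisecond)
}

func main() {
	if err := simulateBuild(true); errors.Is(err, ErrBuildStalled) {
		fmt.Println("build looks stuck: consider `docker system prune` or a --no-cache rebuild")
	}
}
```

The timeout would presumably need to be configurable, since a long-running `RUN` step that prints nothing is indistinguishable from a wedged build in this scheme.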
Beyond potential BuildKit issues/bugs, I saw another case recently where the actual image transfer was basically causing the build to hang indefinitely.
There's not much Tilt can do to actually solve bandwidth/slow registry issues, BUT it's another case that would be helpful to detect and at least provide guidance/messaging, particularly if the transfer seems to entirely halt -- a surprise VPN disconnection can seem to wedge BuildKit where it doesn't time out/give up but also doesn't report anything meaningful.
Hi! Any progress on this one? Still experiencing similar issues in 2025.