`talosctl upgrade --image some:image` does not re-pull the image
Bug Report
Description
- Run `talosctl upgrade --image some:image` with an invalid installer image.
- Fix the image and push it with the same tag: `docker push some:image`
- Run `talosctl upgrade --image some:image` again. It will not re-pull the image and keeps failing.
We could introduce a flag on the upgrade command, such as `--force-pull`, to force the image to be pulled again.
Logs
172.20.0.2: [talos] upgrade request received: preserve true, staged false, force false
172.20.0.2: [talos] validating "ghcr.io/utkuozdemir/talos-installer:test-break"
172.20.0.2: machined Unknown [/machine.MachineService/Upgrade] 2.473929476s unary error validating installer image "ghcr.io/utkuozdemir/talos-installer:test-break": failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "/bin/installer": stat /bin/installer: no such file or directory: unknown (:authority=localhost;content-type=application/grpc;proxyfrom=172.20.0.2,172.20.0.3,172.20.0.4;talos-role=os:admin;user-agent=grpc-go/1.47.0)
....
....
....
172.20.0.2: [talos] upgrade request received: preserve true, staged false, force false
172.20.0.2: [talos] validating "ghcr.io/utkuozdemir/talos-installer:test-break"
172.20.0.2: machined Unknown [/machine.MachineService/Upgrade] 63.348966ms unary error validating installer image "ghcr.io/utkuozdemir/talos-installer:test-break": failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "/bin/installer": stat /bin/installer: no such file or directory: unknown (:authority=localhost;content-type=application/grpc;proxyfrom=172.20.0.2,172.20.0.3,172.20.0.4;talos-role=os:admin;user-agent=grpc-go/1.47.0)
The root cause is that the image is pulled once and then cached by the system containerd in memory (in tmpfs), so later upgrade requests for the same tag reuse the cached copy.
Rebooting the node clears that cache, so a reboot is enough as a workaround.
The proper fix is to always pull the image while processing the upgrade API request, but to use the cached image when running the actual upgrade.
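To make the difference between the current and the proposed behavior concrete, here is a minimal Go sketch. It is not Talos code: the `registry` and `imageCache` types and the `pullIfMissing`, `pullAlways`, and `cached` functions are hypothetical stand-ins, and the digests are made-up placeholders. It only models the policy described above, assuming the cache is keyed by image tag.

```go
package main

import "fmt"

// registry is a hypothetical stand-in for a remote registry: tag -> digest.
type registry map[string]string

// imageCache is a hypothetical stand-in for the node's in-memory image store.
type imageCache map[string]string

// pullIfMissing models the reported behavior: a tag that is already cached
// is never refreshed, even if the registry now serves different content.
func pullIfMissing(c imageCache, r registry, ref string) (string, error) {
	if digest, ok := c[ref]; ok {
		return digest, nil
	}
	return pullAlways(c, r, ref)
}

// pullAlways models the proposed behavior for the upgrade API request:
// resolve the tag against the registry every time and overwrite the cache.
func pullAlways(c imageCache, r registry, ref string) (string, error) {
	digest, ok := r[ref]
	if !ok {
		return "", fmt.Errorf("image %q not found in registry", ref)
	}
	c[ref] = digest
	return digest, nil
}

// cached models the actual upgrade step, which reuses what was pulled
// during request validation instead of pulling again.
func cached(c imageCache, ref string) (string, error) {
	digest, ok := c[ref]
	if !ok {
		return "", fmt.Errorf("image %q is not cached", ref)
	}
	return digest, nil
}

func main() {
	reg := registry{"some:image": "sha256:broken"}
	cache := imageCache{}

	first, _ := pullIfMissing(cache, reg, "some:image")
	reg["some:image"] = "sha256:fixed" // the fixed image is pushed under the same tag
	stale, _ := pullIfMissing(cache, reg, "some:image")
	fresh, _ := pullAlways(cache, reg, "some:image")
	used, _ := cached(cache, "some:image")

	fmt.Println(first, stale, fresh, used)
}
```

In this sketch the second `pullIfMissing` call still returns the broken digest, which mirrors the repeated validation failure in the logs, while `pullAlways` picks up the fixed image and `cached` then hands that same image to the upgrade step.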