If can't build Docker image because disconnected, handle correctly

Open • davidpanderson opened this issue 5 months ago • 1 comment

Describe the bug

If a Docker job starts while the host is disconnected from the network, image creation will fail. Detect this particular type of failure and do a temporary exit.

Ideally there should be a variant of temporary exit that says: retry when we have a network connection.
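
A minimal sketch of the detection step, in Python for brevity (it is not the wrapper's actual code). The substrings in `NETWORK_ERROR_HINTS` are assumptions based on messages commonly seen when Docker cannot reach a registry; the real list would need to be checked against actual Docker and Podman output. The names `BuildFailure` and `classify_build_failure` are hypothetical.

```python
from enum import Enum


class BuildFailure(Enum):
    NETWORK = "network"  # host appears to be disconnected: retry later
    OTHER = "other"      # e.g. image doesn't exist: a real failure


# Assumed substrings (lowercase) that suggest a missing network connection.
# This list is illustrative, not exhaustive or verified.
NETWORK_ERROR_HINTS = (
    "no such host",
    "network is unreachable",
    "temporary failure in name resolution",
    "i/o timeout",
    "tls handshake timeout",
)


def classify_build_failure(output: str) -> BuildFailure:
    """Classify the captured output of a failed image-creation command."""
    text = output.lower()
    if any(hint in text for hint in NETWORK_ERROR_HINTS):
        return BuildFailure.NETWORK
    return BuildFailure.OTHER
```

With this, a `NETWORK` result would lead to a temporary exit, while `OTHER` would be reported as an ordinary job failure.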

Steps to reproduce

No response

Expected behavior

No response

Screenshots

No response

System information

No response

Additional context

No response

davidpanderson, Jun 23 '25 08:06

Notes: Docker uses HTTPS to fetch images.

Plan:

  • docker_wrapper parses the output of the create image command, checking for an error that indicates a lack of physical network connection (as opposed to, e.g., the image not existing).
  • If it finds one, it calls temporary_exit(), indicating that it's waiting for a physical connection (this mechanism needs to be added).
  • The client flags the job as waiting for a physical connection.
  • If any network operation succeeds, the client clears this flag and the job can be restarted.
  • While any job is in this state, the client pings the reference site roughly every 10 to 30 minutes.
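
The client-side part of the plan could look roughly like this. It is a sketch in Python for brevity, not BOINC client code: `Job`, `Client`, every method name, and the 20-minute `PING_INTERVAL_SECS` are hypothetical (the interval is just a value picked from the 10 to 30 minute range above).

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed value; the plan only says roughly every 10 to 30 minutes.
PING_INTERVAL_SECS = 20 * 60


@dataclass
class Job:
    name: str
    waiting_for_network: bool = False


class Client:
    def __init__(self, jobs: List[Job]) -> None:
        self.jobs = list(jobs)
        self.last_ping_time: Optional[float] = None

    def on_temp_exit_waiting_for_network(self, job: Job) -> None:
        # The wrapper exited temporarily because image creation
        # failed for lack of a network connection.
        job.waiting_for_network = True

    def on_network_op_succeeded(self) -> None:
        # Any successful network operation clears the flag on all jobs,
        # making them eligible to restart.
        for job in self.jobs:
            job.waiting_for_network = False

    def should_ping(self, now: float) -> bool:
        # Ping the reference site only while some job is waiting.
        if not any(j.waiting_for_network for j in self.jobs):
            return False
        if self.last_ping_time is None:
            return True
        return now - self.last_ping_time >= PING_INTERVAL_SECS

    def record_ping(self, now: float, succeeded: bool) -> None:
        self.last_ping_time = now
        if succeeded:
            self.on_network_op_succeeded()
```

Keeping the flag per job while clearing it from a single `on_network_op_succeeded()` hook means scheduler RPCs, file transfers, and reference-site pings can all unblock waiting jobs the same way.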

davidpanderson, Jun 24 '25 19:06