docker command detected as unknown

Open lfield opened this issue 9 months ago • 23 comments

Describe the bug

I am using the latest code from master.

The client detects that I have podman. Here is the entry from the event log.

Fri 07 Mar 2025 13:23:15 CET | | Docker: version 5.0.3 (podman)

However, when the docker_wrapper runs it uses the command 'unknown'

https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3387274

I can work around this by setting a symlink.

ln -s /usr/bin/podman /usr/bin/unknown

lfield avatar Mar 07 '25 12:03 lfield

I need to debug the Linux version of docker_wrapper; am working on this. The Win version (I think) should be OK.

davidpanderson avatar Mar 07 '25 22:03 davidpanderson

This issue still seems to be there.

https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3387798

lfield avatar Mar 11 '25 16:03 lfield

The fix is in the scheduler (cgi). Did you update that?

davidpanderson avatar Mar 12 '25 00:03 davidpanderson

I updated the scheduler and the matchmaking is working. The issue seems to be in the wrapper: when it tries to run the podman command, it runs the command 'unknown' instead.

lfield avatar Mar 12 '25 11:03 lfield

I tested this on Ubuntu 24.04, using the latest version of everything (client, scheduler, docker_wrapper). The wrapper runs podman, not unknown. I suspect you need to create a new BUDA app version with current docker_wrapper code.

davidpanderson avatar Mar 13 '25 01:03 davidpanderson

However, I did encounter a problem. On this host, the filesystem is mounted using NFS. When podman does a build command, it gets an error:

running docker command: build . -t boinc__boinc.berkeley.edu_test__batch_540__job_job1 -f Dockerfile_worker_2
...
Error: creating build container: copying system image from manifest list: writing blob: adding layer with blob "sha256:155ad54a8b2812a0ec559ff82c0c6f0f0dddb337a226b11879f09e15f67b69fc": processing tar file(lsetxattr /boot: operation not supported): exit status 1

I googled this, leading to e.g. https://github.com/containers/podman/issues/14655

The same build command works fine with Docker.

The bottom line: using podman on Linux with a remote FS seems to be a can of worms. So I suggest that we not support it for now.

I think this is OK because on Linux, installing Docker is as easy as installing podman.

davidpanderson avatar Mar 13 '25 02:03 davidpanderson

I have had another problem. The BOINC installer doesn't add user boinc to group docker, causing "permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Head "http://%2Fvar%2Frun%2Fdocker.sock/_ping": dial unix /var/run/docker.sock: connect: permission denied"

kotenok2000 avatar Mar 17 '25 21:03 kotenok2000

Vitalii, is it possible to have the installer add boinc to the docker group?

davidpanderson avatar Mar 17 '25 22:03 davidpanderson

Yep. Will do that.

AenBleidd avatar Mar 17 '25 22:03 AenBleidd

@kotenok2000, docker group issue should be fixed in #6168

AenBleidd avatar Mar 18 '25 12:03 AenBleidd
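On an installation that predates that fix, the group membership can be checked by hand. The following is only a sketch: the helper name `has_group` is made up for illustration, and the account name `boinc` is taken from the report above and may differ on other systems.

```shell
# Hypothetical helper: read a space-separated group list on stdin
# (as printed by `id -nG USER`) and succeed if it contains group $1.
has_group() {
    tr ' ' '\n' | grep -qx -- "$1"
}

# Assumption: the client runs as the account "boinc".
if id -nG boinc 2>/dev/null | has_group docker; then
    echo "boinc is in the docker group"
else
    echo "boinc is not in the docker group (or the account does not exist)"
    echo "to add it: sudo usermod -aG docker boinc"
fi
```

A running client normally has to be restarted before a new group membership takes effect.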

@lfield, @davidpanderson, is there anything left here, or can we just close this issue?

AenBleidd avatar Mar 20 '25 16:03 AenBleidd

I am still having this issue on a machine. Will do some more testing later today.

lfield avatar Mar 20 '25 16:03 lfield

There are still issues here. I have podman installed:

podman --version
podman version 4.9.3

There is nothing in the event log to indicate it detected podman or docker.

When I run the docker_wrapper standalone it gives:

running docker command: ps --all --filter "name=boinc"
sh: 1: docker: not found

Installing the compatibility wrapper makes that work:

apt install podman-docker

But when a task is run it gives:

running docker command: ps --all --filter "name=boinc__lhcathomedev.cern.ch_lhcathome-dev__theory_2843-4270120-641_2"
sh: 1: unknown: not found

docker --version
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
podman version 4.9.3

After creating /etc/containers/nodocker and cleaning the client state, it works, and I see the following in the event log:

Thu 20 Mar 2025 22:31:02 CET | | Docker: version 4.9.3 (Docker)

lfield avatar Mar 20 '25 21:03 lfield

Please do the following:

  • run the client with --exit_before_start
  • when it exits (with a task) find the init_data.xml file in the slot directory
  • send it to me

davidpanderson avatar Mar 20 '25 23:03 davidpanderson
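The steps above could be scripted roughly as follows. This is a sketch under assumptions: the helper name `find_init_data` is invented here, and the data directory `/var/lib/boinc` is only a guess at a common Linux location, so set `BOINC_DATA_DIR` to the real one.

```shell
# Hypothetical helper: print every init_data.xml under the slots/
# directory of the BOINC data directory given as $1.
find_init_data() {
    find "$1/slots" -type f -name init_data.xml 2>/dev/null
}

# Assumption: data directory location; override via the environment.
BOINC_DATA_DIR="${BOINC_DATA_DIR:-/var/lib/boinc}"

# Run the client with --exit_before_start first (flag as named above),
# then look for the file once the client has exited.
if [ -d "$BOINC_DATA_DIR/slots" ]; then
    find_init_data "$BOINC_DATA_DIR"
else
    echo "no slots directory under $BOINC_DATA_DIR" >&2
fi
```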

Notes:

  • Because podman doesn't work with remote filesystems, we don't check for it on Linux.
  • When you run docker_wrapper standalone, it assumes you have Docker installed

Both of these may be changed at some point, but that's how it currently is.

davidpanderson avatar Mar 20 '25 23:03 davidpanderson

I don't agree with the last comment. Not all applications need remote filesystems, so why can't podman be an option on Linux? Also, the same logic should apply for testing. In any case, removing this commit got it working for me.

https://github.com/BOINC/boinc/commit/350cfbee64362bf345f10f9fce5bb0d3e3c19b19

lfield avatar Mar 21 '25 13:03 lfield

Podman doesn't work if the BOINC data directory is remote (e.g. NFS) because of permissions problems.

Is there a way to find out if a volume is remote?

davidpanderson avatar Mar 21 '25 22:03 davidpanderson
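On the question of detecting a remote volume: on Linux with GNU coreutils, one possible approach is to ask `stat` for the filesystem type of the directory and compare it against a list of network filesystems. This is a minimal sketch, not what the client does; the helper name `is_remote_fstype` and the list of type names are illustrative and not exhaustive.

```shell
# Hypothetical helper: succeed if the filesystem type name $1
# looks like a network filesystem. The list is illustrative only.
is_remote_fstype() {
    case "$1" in
        nfs*|cifs|smb*|afs|ceph|lustre) return 0 ;;
        *) return 1 ;;
    esac
}

# GNU stat: -f queries the filesystem, -c %T prints its type name.
# Assumption: BOINC_DATA_DIR points at the BOINC data directory.
dir="${BOINC_DATA_DIR:-.}"
fstype=$(stat -f -c %T "$dir" 2>/dev/null) || fstype=unknown

if is_remote_fstype "$fstype"; then
    echo "$dir is on a remote filesystem ($fstype)"
else
    echo "$dir does not look remote ($fstype)"
fi
```

The `stat -f -c` form is specific to GNU `stat`; BSD and macOS `stat` use different options.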

I still think that both should be supported. If an application requires docker specifically, this could be specified in the plan class. The 'unknown' issue is due to the multiline output when podman-docker is installed. This can be worked around by creating /etc/containers/nodocker. It is up to you how you would like to handle this. If only docker is supported by the code, podman can be used with the podman-docker package and touching /etc/containers/nodocker. However, making the parsing more robust to handle the multiline output might be good. Feel free to fix or close.

lfield avatar Mar 22 '25 12:03 lfield
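The more robust parsing suggested above could look roughly like this. It is a sketch of the idea only, not the client's actual code: the function name `classify_version_output` is invented, and the sample text is the `docker --version` output quoted earlier in this thread. Every line is scanned, so a leading notice from podman-docker no longer hides the version line.

```shell
# Hypothetical helper: read `docker --version` style output on stdin,
# scan every line, and print "podman VER", "docker VER" or "unknown".
classify_version_output() {
    awk '
        /podman version [0-9]/ { print "podman " $NF; found = 1; exit }
        /Docker version [0-9]/ { v = $3; sub(/,$/, "", v)
                                 print "docker " v; found = 1; exit }
        END { if (!found) print "unknown" }
    '
}

# Sample input: the two-line output reported with podman-docker installed.
classify_version_output <<'EOF'
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
podman version 4.9.3
EOF
# prints: podman 4.9.3
```

Against a real installation the input would come from something like `docker --version 2>&1 | classify_version_output`.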

This issue is present on Mac. The instructions were followed to install Docker (podman), and the client detects it as unknown and tries to run unknown.

Refs:

  • https://github.com/BOINC/boinc/wiki/Installing-Docker
  • https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3430598

lfield avatar Jun 11 '25 13:06 lfield

@lfield, have you tested this on 8.2.4 release?

AenBleidd avatar Jun 11 '25 13:06 AenBleidd

Yes, we just downloaded it today.

lfield avatar Jun 11 '25 14:06 lfield

The problem is that the Mac version of docker_wrapper on the BOINC server predates Charlie's commit 48624bf1f310cb7e2d63d26600a452fe10732669, which I think fixes this bug. I'll build this and put it on the server.

davidpanderson avatar Jun 12 '25 07:06 davidpanderson

I put the latest versions of docker_wrapper for all platforms here: https://github.com/BOINC/boinc/wiki/Docker-wrapper-release-notes

davidpanderson avatar Jun 12 '25 08:06 davidpanderson

@lfield, is there any update on this? I think the issue should be fixed with the new docker_wrapper, but I'd like to be sure. Thank you in advance.

AenBleidd avatar Jul 07 '25 07:07 AenBleidd

I understand that this issue is still there for Mac https://lhcathomedev.cern.ch/lhcathome-dev/forum_thread.php?id=685 We are using the latest docker_wrapper_6_x86_64-apple-darwin from 2025-06-12

lfield avatar Jul 07 '25 08:07 lfield

Thank you for the information. OK, it looks like we need to test this again after the release of BOINC 8.2.5 for macOS.

AenBleidd avatar Jul 07 '25 08:07 AenBleidd

Laurence: Would you mind updating your BOINC web code on the test project? That will show (on the host page) what version of Podman/Docker it has. -- D

davidpanderson avatar Jul 09 '25 08:07 davidpanderson

I have updated to the latest master

https://lhcathomedev.cern.ch/lhcathome-dev/show_host_detail.php?hostid=4293

lfield avatar Jul 09 '25 12:07 lfield

According to the host detail page, that host doesn't have Docker installed. Yet it was sent jobs that require Docker. Some things to check:

  • Make sure that the project's sched/plan_class_spec.xml is a copy of the current boinc/sched/plan_class_spec.xml.sample. The last entry should be
    <plan_class>
        <name>                  docker      </name>
        <docker/>
    </plan_class>
  • Make sure the project's scheduler is built from the current code.

davidpanderson avatar Jul 09 '25 21:07 davidpanderson
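The first check can be done from a shell. This is a sketch with assumed paths: `PROJECT_SPEC` and `SAMPLE_SPEC` are placeholder variables that must point at the project's file and at the sample file in a current BOINC source checkout, and the helper name `has_docker_plan_class` is made up for illustration.

```shell
# Hypothetical helper: succeed if file $1 contains a <docker/> element.
has_docker_plan_class() {
    grep -qF '<docker/>' "$1"
}

# Assumptions: both paths are placeholders; set them for your setup.
PROJECT_SPEC="${PROJECT_SPEC:-sched/plan_class_spec.xml}"
SAMPLE_SPEC="${SAMPLE_SPEC:-boinc/sched/plan_class_spec.xml.sample}"

if [ -f "$PROJECT_SPEC" ] && [ -f "$SAMPLE_SPEC" ]; then
    if cmp -s "$PROJECT_SPEC" "$SAMPLE_SPEC"; then
        echo "project spec is identical to the sample"
    else
        echo "project spec differs from the sample:"
        diff -u "$SAMPLE_SPEC" "$PROJECT_SPEC" || true
    fi
    if has_docker_plan_class "$PROJECT_SPEC"; then
        echo "docker plan class entry found"
    else
        echo "docker plan class entry missing"
    fi
else
    echo "set PROJECT_SPEC and SAMPLE_SPEC to the two files" >&2
fi
```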

Wiki should mention podman-docker for Linux. I had to install it to make BOINC stop complaining about missing docker.

To install Podman on Debian/Ubuntu:
sudo apt install podman podman-docker
...
On Red Hat:
sudo yum install podman podman-docker

UMLAUTaxl avatar Oct 19 '25 08:10 UMLAUTaxl