Docker Socket API broken after Python update
Describe the bug
Since the latest update to some Python packages, it is no longer possible to access the Docker API via unix://var/run/docker.sock. All connection attempts fail with "Not supported URL scheme http+docker" on all Photon OS servers. This entirely breaks monitoring with all Python-based monitoring solutions (like Check_MK) and likely much more.
The bug is in one of those packages:
python3-jinja2-3.1.2-5.ph5.noarch
python3-requests-2.28.1-6.ph5.noarch
python3-xml-3.11.9-7.ph5.x86_64
python3-libs-3.11.9-7.ph5.x86_64
python3-3.11.9-7.ph5.x86_64
expat-2.6.0-6.ph5.x86_64
expat-libs-2.6.0-6.ph5.x86_64
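To compare these RPM versions with what pip reports, the package strings above can be split into name, version, release and architecture. The following is only a sketch: `parse_nvra` and `pip_name_guess` are hypothetical helper names, and the `python3-foo` to `foo` mapping is an assumption that does not hold for every package (`python3-libs`, for example, has no PyPI counterpart).

```python
from typing import NamedTuple, Optional


class RpmPackage(NamedTuple):
    name: str
    version: str
    release: str
    arch: str


def parse_nvra(package: str) -> RpmPackage:
    """Split 'name-version-release.arch' into its four parts."""
    # The architecture is everything after the last dot.
    rest, arch = package.rsplit(".", 1)
    # Version and release are the last two hyphen-separated fields;
    # the name itself may contain hyphens (e.g. 'expat-libs').
    name, version, release = rest.rsplit("-", 2)
    return RpmPackage(name, version, release, arch)


def pip_name_guess(rpm_name: str) -> Optional[str]:
    """Assumed mapping 'python3-foo' -> 'foo'; None for non-Python RPMs."""
    prefix = "python3-"
    if rpm_name.startswith(prefix):
        return rpm_name[len(prefix):]
    return None
```

For example, `parse_nvra("python3-requests-2.28.1-6.ph5.noarch").version` gives the version that tdnf installed, which can then be held against the version pip reports for `requests`.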
Reproduction steps
- run Photon OS with all updates installed
- enable and start Docker
- install package python3-pip
- run pip install docker==7.0.0 (with latest version 7.1.0 the library is broken as well)
- run this script: https://gist.github.com/felixlabrot/c96a6d2018b7ee7dcb004a032f5c323a
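When reproducing this, it may help to record which versions the Python interpreter actually sees after these steps. A minimal sketch using only the standard library follows; `installed_versions` is a hypothetical helper name, not part of the linked script.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The interpreter cannot find this distribution at all.
            versions[name] = None
    return versions
```

Calling `installed_versions(["docker", "requests"])` before and after an update shows whether the versions visible to Python changed.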
Expected behavior
It should be possible to connect to the Docker API like it was before.
Additional context
No response
Your script suggests a custom Photon OS build, since check_mk relies on specific libraries. On a Workstation 17.x VM deployed from a recently compiled Photon OS 5.0 ISO, an API test seems to succeed. Here is the sample from that machine (with a custom Docker Plex server).
curl --unix-socket /var/run/docker.sock http://api/containers/json | jq
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 2157 0 2157 0 0 262k 0 --:--:-- --:--:-- --:--:-- 263k
[
{
"Id": "bec3e975976aeea0abb3e2b72420d78d4ccbba9b79b8661510ac7a0c812ca4dd",
"Names": [
"/plex"
],
"Image": "lscr.io/linuxserver/plex:latest",
"ImageID": "sha256:7c39c1281ab063d0e483b1a74f3e6be89e94e8f03b7162962933c474f734bb9f",
"Command": "/init",
"Created": 1735066503,
"Ports": [],
"Labels": {
"build_version": "Linuxserver.io version:- 1.41.3.9314-a0bfb8370-ls250 Build-date:- 2024-12-23T09:23:54+00:00",
"maintainer": "thelamer",
"org.opencontainers.image.authors": "linuxserver.io",
...
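The same check can be sketched in Python with only the standard library, which takes docker-py and requests out of the picture when testing whether the socket itself answers. This is an illustration under assumptions: `UnixSocketHTTPConnection` and `get_json` are hypothetical names, and the host name is a placeholder just like "api" in the curl call.

```python
import http.client
import json
import socket


class UnixSocketHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that talks to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path: str, timeout: float = 10.0):
        # "localhost" only fills the Host header; no TCP connection is made.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self) -> None:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def get_json(socket_path: str, path: str):
    """GET `path` over the Unix socket and decode the JSON body."""
    conn = UnixSocketHTTPConnection(socket_path)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        body = response.read()
        if response.status != 200:
            raise RuntimeError(f"HTTP {response.status}: {body[:200]!r}")
        return json.loads(body)
    finally:
        conn.close()
```

`get_json("/var/run/docker.sock", "/containers/json")` corresponds to the curl call above. If this works while the docker library fails, the problem lies in the Python package stack rather than in the Docker daemon.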
Docker version
root@photon-f2966979acd9 [ ~ ]# docker version
Client: Docker Engine - Community
Version: 27.3.1
API version: 1.47
Go version: go1.21.13
Git commit: 3ab4256
Built: Tue Dec 24 04:06:55 2024
OS/Arch: linux/amd64
Context: default
Server: Docker Engine - Community
Engine:
Version: 27.3.1
API version: 1.47 (minimum version 1.24)
Go version: go1.21.13
Git commit: 3ab5c7d
Built: Tue Dec 24 04:08:04 2024
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.6.21
GitCommit: 3dce8eb055cbb6872793272b4f20ed16117344f8
runc:
Version: 1.1.14
GitCommit:
docker-init:
Version: 0.19.0
GitCommit: de40ad0
root@photon-f2966979acd9 [ ~ ]#
Judging from your script, did you use the Red Hat package check-mk-raw-2.3.0p23-el8-38.x86_64.rpm of version 2.3.0p23? I'm asking because it needs quite a few dependencies during installation.
rpm -i ./check-mk-raw-2.3.0p23-el8-38.x86_64.rpm
warning: ./check-mk-raw-2.3.0p23-el8-38.x86_64.rpm: Header V4 RSA/SHA512 Signature, key ID c4503261: NOKEY
error: Failed dependencies:
bind-utils is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
binutils is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
cronie is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
dialog is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
freeradius-utils is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
glib2 is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
graphviz is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
graphviz-gd is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
httpd is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
libevent is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
libgsf is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
libpq is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
libtool-ltdl is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
logrotate is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
pango is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
perl-IO-Zlib is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
perl-Locale-Maketext-Simple is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
perl-Net-Ping is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
php is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
php-cli is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
php-gd is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
php-json is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
php-mbstring is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
php-pdo is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
php-xml is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
poppler-utils is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
rpcbind is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
rpm-build is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
rsync is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
time is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
traceroute is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
uuid is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
xinetd is needed by check-mk-raw-2.3.0p23-el8-38.x86_64
Or did you use the docker version [1] as described in the [2]?
According to the latest commits in the official Photon OS 5.0 repository, see [3], afaik most changes were not related to Python. There was a CVE-related fix for python virtualenv [4]. Can you describe what "broken" means exactly?
edited:
The "Not supported URL scheme http+docker" issue seems to be related to docker-py only, e.g. [5]. A few users downgraded/pinned the requests module with pip install requests==2.31, but that was before docker-py 7.1.0.
[1] docker container run -dit -p 8080:5000 -p 8000:8000 --tmpfs /opt/omd/sites/cmk/tmp:uid=1000,gid=1000 -v monitoring:/omd/sites --name monitoring -v /etc/localtime:/etc/localtime:ro --restart always checkmk/check-mk-raw:2.3.0p23
[2] https://checkmk.com/download?method=docker&edition=cre&version=2.3.0p23
[3] https://github.com/vmware/photon/commits/5.0/
[4] https://github.com/vmware/photon/commit/473035122eb7e5c8413070735249ff989e4a277d
[5] https://github.com/docker/docker-py/issues/3256
You are entirely wrong about those packages. You were thinking of the Check_MK server, which is not compatible with Photon OS. I am talking about the monitoring agent, which is not even needed to replicate the issue. The monitoring agent just calls the mentioned Python script and processes the text output it creates. To replicate the problem, there isn't even a need to install the agent. Just follow my replication steps exactly and do nothing more than that. That means your [1] and [2] are not needed at all and are about something completely different.
The PIP package "docker-py" must not be used as per Check_MK documentation. But your hint about the "requests" package put me onto something. The general problem with PIP is that there isn't any command to update all packages, but that is the key to fixing the problem. There is apparently something entirely broken when python3-requests is updated with tdnf but not updated with pip as well.
This command is the magic fix to the issue:
pip list --outdated | cut -d" " -f1 | tail -n +3 | xargs -L1 pip install --upgrade
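To see what this pipeline would touch before anything is upgraded, the name extraction can be sketched in Python as a dry run. This assumes the default column output of `pip list --outdated` with two header lines, which is what `tail -n +3` relies on; `outdated_package_names` is a hypothetical helper name.

```python
from typing import List


def outdated_package_names(pip_output: str) -> List[str]:
    """Extract package names from the column output of `pip list --outdated`."""
    lines = pip_output.splitlines()
    # Like `tail -n +3`: drop the two header lines (column titles and dashes).
    rows = lines[2:]
    # Like `cut -d" " -f1`: keep everything before the first space.
    return [row.split(" ", 1)[0] for row in rows if row.strip()]
```

The input can come from `subprocess.run([sys.executable, "-m", "pip", "list", "--outdated"], capture_output=True, text=True).stdout`. Printing the resulting names first makes it possible to review the list instead of upgrading everything blindly.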
So your enhancement suggestion is that eligible 'python3-*'-packages are updating their 'pip'-version as well? Good catch.
@prashant1221 Could you discuss this enhancement suggestion internally?
Yes, because my issue has proven that a version mismatch can lead to broken packages. When both things are automatically kept in sync that can prevent such issues in the future. In my case this has caused an outage for all Docker monitoring in our datacenter. A very unpleasant thing to have.
Please re-open the issue, so the Photon OS team can have a look at it. Be patient. Thank you for your help. Thankful.
pip list --outdated | cut -d" " -f1 | tail -n +3 | xargs -L1 pip install --upgrade
This can easily lead to an unusable Python subsystem on the machine. It will probably mess up the dependency chain of the Python packages.
We recommend using docker-py3 which is supplied by Photon.
tdnf install --refresh -y docker-py3 should make things come back to normal.
Hi Shreenidhi,
A tdnf package for docker-py 7.1.0 would help with this issue.
Currently, tdnf list docker-py3 shows the 6.0.0-5 bits, but pip list docker | grep docker shows docker 7.1.0.
"It will probably mess up the dependency chain of the python packages."
Yes, this is indeed always the case with non-event-driven autonomous version updates. That is today's claim: Is there an update e.g. in a Docker component? Then after a few hours the update should also be in an Automatic Version Bump branch. And many enthusiasts will then publish it with fancy texts {Boom. This is huge. Now we are talking. Gamechanger. It's so over. look at me - we are the media now, breaking news, it's crazy}... same boat :-/
For the sake of completeness, here are the reproduction steps as described by Felix. See attached logfile.
- run Photon OS with all updates installed: line 18: done
- enable and start Docker: line 20: done
- install package python3-pip: line 22: done
- run pip install docker==7.0.0 (with latest version 7.1.0 the library is broken as well): line 36: done
- run this script: https://gist.github.com/felixlabrot/c96a6d2018b7ee7dcb004a032f5c323a: line 59: not okay
Then, your suggestion:
line 64: tdnf install --refresh -y docker-py3: done
line 97: rerun the script: not okay
Then, Felix' suggestion:
line 102: pip list --outdated | cut -d" " -f1 | tail -n +3 | xargs -L1 pip install --upgrade
line 515: rerun the script: okay
The issue was initially caused by a version mismatch between the tdnf package and the pip package. My suggestion would be that every tdnf update of a Python package should automatically trigger a pip update of the corresponding package, as pointed out by @dcasota, to prevent mismatches that can lead to broken functionality.
@felixlabrot if you let us know the mismatched pips, we will try to fix it.
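One way to collect the mismatched packages could be to list every distribution that is visible more than once with differing versions. The sketch below uses only the standard library and assumes that the tdnf copy and the pip copy live in different directories on sys.path; `installed_entries` and `find_mismatches` are hypothetical helper names.

```python
from collections import defaultdict
from importlib import metadata
from typing import Dict, Iterable, List, Tuple

Entry = Tuple[str, str, str]  # (name, version, location)


def installed_entries() -> List[Entry]:
    """Collect every distribution visible on sys.path."""
    entries: List[Entry] = []
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip directories with broken metadata
            entries.append((name, version, str(dist.locate_file(""))))
    return entries


def find_mismatches(entries: Iterable[Entry]) -> Dict[str, List[Tuple[str, str]]]:
    """Return names that are installed more than once with differing versions."""
    seen = defaultdict(set)
    for name, version, location in entries:
        # Treat 'Foo_Bar', 'foo.bar' and 'foo-bar' as the same project.
        key = name.lower().replace("_", "-").replace(".", "-")
        seen[key].add((version, location))
    return {
        key: sorted(copies)
        for key, copies in seen.items()
        if len({version for version, _ in copies}) > 1
    }
```

Running `find_mismatches(installed_entries())` on an affected machine would list each duplicated package with its versions and locations. Python imports whichever copy comes first on sys.path, so an older copy can shadow a newer one or the other way round.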
pip list --outdated | cut -d" " -f1 | tail -n +3 | xargs -L1 pip install --upgrade is not really helpful.
:-/ Sounds more like unleashing hell on earth... PyPI doesn't do any vulnerability scanning or audits of published packages.
The sources of the built-in, scanned Photon OS subcomponents originate from VMware internal repos, Kernel.org, Github.com, Sourceforge.net, Fedoraproject.org, Rubygems.org, Freedesktop.org, Cpan.org, ftp-based sources e.g. gnu, mozilla, Gitlab-based, and more. "Built-in" sometimes still is handmade, but as I've understood, the team is working heavily on devsecops automation. In addition, the SRP initiative is really promising.
For Python packages, there are promising projects, e.g. https://github.com/pyupio/safety, but as things stand, I trust tdnf-curated packages more than Python packages with all those input validation failures, exposed debug information, web application vulnerabilities, outdated dependencies, compromised temporary files, malicious packages, etc.
"The issue was initially caused by a version mismatch between the tdnf package and the pip package."
Yes to faster secure update cycles!