Unexpected tar error while untarring jdk17 binary in ppc64le and arm32 ubuntu 2404 docker image
ref https://github.com/adoptium/infrastructure/issues/3501#issuecomment-2091101160
Hitting a tar error while building arm32 and ppc64le ubuntu 24.04 docker static containers
> [ 7/25] RUN mkdir -p /usr/lib/jvm/jdk17 && tar -xpzf /tmp/jdk17.tar.gz -C /usr/lib/jvm/jdk17 --strip-components=1:
0.295 tar: conf/security/policy/unlimited: Cannot change mode to rwxr-xr-x: Operation not permitted
0.295 tar: conf/security/policy/limited: Cannot change mode to rwxr-xr-x: Operation not permitted
0.295 tar: conf/security/policy: Cannot change mode to rwxr-xr-x: Operation not permitted
0.295 tar: conf/security: Cannot change mode to rwxr-xr-x: Operation not permitted
0.295 tar: conf/sdp: Cannot change mode to rwxr-xr-x: Operation not permitted
0.296 tar: conf/management: Cannot change mode to rwxr-xr-x: Operation not permitted
0.296 tar: conf: Cannot change mode to rwxr-xr-x: Operation not permitted
0.305 tar: legal/java.base: Cannot change mode to rwxr-xr-x: Operation not permitted
1.052 tar: jmods: Cannot change mode to rwxr-xr-x: Operation not permitted
The binaries untar without error on my local machine
Interesting ... yeah I can replicate that on one of my arm32 systems.
That's really odd ...
It's not specific to our tar file, but it seems to affect directories extracted by tar. Sounds like a bug in the new Ubuntu unless it's related to the kernel on the host. Like you, I couldn't replicate it with an aarch64 container on either an Ubuntu 20.04 or 22.04 host machine. tar is at the latest version.
As an interim measure I would propose running `chmod -R a+rX /usr/lib/jvm/jdk-17-*` afterwards, which seems to work without problems, but I'm nervous about whether this means we'll see issues elsewhere in our testing ...
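For clarity, here is a minimal sketch of that interim workaround applied to the failing RUN step quoted above (paths and tarball name are reused from that output; the `|| true` is my own addition so the build can reach the chmod even when tar exits non-zero, not necessarily how we would ship it):

```bash
# Interim workaround: extract as before, tolerate tar's mode errors,
# then normalise permissions so the whole JDK tree is world-readable
# and directories/executables stay traversable/executable.
mkdir -p /usr/lib/jvm/jdk17
tar -xpzf /tmp/jdk17.tar.gz -C /usr/lib/jvm/jdk17 --strip-components=1 || true
chmod -R a+rX /usr/lib/jvm/jdk17
```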
Thought I'd already added this comment (Edit: yes I did, but at https://github.com/adoptium/infrastructure/issues/3501#issuecomment-2093759682) but running an emulated ppc64le container on another 24.04 host system did not show the problem, which would suggest there isn't a fundamental problem with the base container and it is potentially related to the kernel being used.
This is a situation where having QPC updated with latest images would help.
JDK11 Special.openjdk, Extended.system, and Special.functional all appear to have a similar-looking issue unpacking the build with a tar command:
22:43:12 Uncompressing file: OpenJDK11U-jdk_ppc64le_linux_hotspot_11.0.24_7-ea.tar.gz ...
22:43:14 tar: jdk-11.0.24+7/man/ja: Cannot change mode to rwxrwxr-x: Operation not permitted
22:43:20 tar: jdk-11.0.24+7/legal/jdk.jartool/LICENSE: Cannot change mode to rwxrwxr-x: Operation not permitted
22:43:20 tar: jdk-11.0.24+7/legal/jdk.jartool/ADDITIONAL_LICENSE_INFO: Cannot change mode to rwxrwxr-x: Operation not permitted
22:43:20 tar: jdk-11.0.24+7/legal/jdk.jartool/ASSEMBLY_EXCEPTION: Cannot change mode to rwxrwxr-x: Operation not permitted
22:43:20 tar: jdk-11.0.24+7/legal/jdk.internal.jvmstat/LICENSE: Cannot change mode to rwxrwxr-x: Operation not permitted
22:43:20 tar: jdk-11.0.24+7/legal/jdk.internal.jvmstat/ADDITIONAL_LICENSE_INFO: Cannot change mode to rwxrwxr-x: Operation not permitted
22:43:20 tar: jdk-11.0.24+7/legal/jdk.internal.jvmstat/ASSEMBLY_EXCEPTION: Cannot change mode to rwxrwxr-x: Operation not permitted
etc etc
All three happen on a 24.04 docker host.
At the moment the 2 problem machines are test-docker-ubuntu2404-armv7-2 and test-docker-ubuntu2404-ppc64le-1. Another ubuntu2404 arm32 node, test-docker-ubuntu2404-armv7-1, does not show the problem. All of them have the same tar version:
jenkins@299a170b9f8f:~$ tar --version
tar (GNU tar) 1.35
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by John Gilmore and Jay Fenlason.
This happened again on test-docker-ubuntu2404-armv7-2 on 2024/11/20 https://ci.adoptium.net/job/build-scripts/job/jobs/job/jdk17u/job/jdk17u-linux-arm-temurin_SmokeTests/259/console
Interesting - that's an Ubuntu 24 container on an Ubuntu 22 host.
I can replicate this. All of the offending files appear to be symbolic links to files in legal/java.base. It only causes a problem with tar; running chmod on the files afterwards is ok.
jenkins@dockerhost-equinix-ubuntu2204-armv8-1:~$ docker run -it aqa_u2404_arm32 bash
root@33aee01c5a84:/# wget https://github.com/adoptium/temurin17-binaries/releases/download/jdk-17.0.13%2B11/OpenJDK17U-jdk_arm_linux_hotspot_17.0.13_11.tar.gz
root@33aee01c5a84:/# tar xfz OpenJDK17U-jdk_arm_linux_hotspot_17.0.13_11.tar.gz
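To double-check the symlink theory, tar can list just the link entries in the archive (the leading `l` in the verbose listing marks a symlink; this is only a quick inspection aid, not part of the test setup):

```bash
# Show only the symbolic-link entries in the JDK tarball; in `tar -tv`
# output the mode string of a symlink starts with 'l' and the link
# target is shown after '->'.
tar -tvzf OpenJDK17U-jdk_arm_linux_hotspot_17.0.13_11.tar.gz | grep '^l'
```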
| agent | host | result |
|---|---|---|
| test-docker-ubuntu2004-armv7l-3 | dockerhost-equinix-ubuntu2404-armv8-1 | ✅ link |
| test-docker-ubuntu2404-armv7-2 | dockerhost-equinix-ubuntu2204-armv8-1 | ❌ link |
| test-docker-ubuntu2004-armv7l-6 | dockerhost-equinix-ubuntu2204-armv8-1 | ✅ link |
https://ci.adoptium.net/job/build-scripts/job/jobs/job/jdk17u/job/jdk17u-linux-arm-temurin_SmokeTests/265/
Based on the above this is almost certainly due to running a later version of Ubuntu (whose glibc/kernel interdependencies are too new) on an older kernel. The other possibility (which I shall aim to check tomorrow) is whether restarting docker resolves it (potentially if it has been upgraded), and also whether there are any pending docker package updates on the host that we might apply which could affect this.
Restarting docker made no difference. Restarting the machine made no difference. An aarch64 Ubuntu 24.04 container on the host works ok, so this is specific to arm32 Ubuntu 24.04 containers on an Ubuntu 22.04 host.
The solution here is to deactivate test-docker-ubuntu2404-armv7-2, which I've done.
FYI @Haroon-Khel we probably want to just decommission this particular machine now.
@Haroon-Khel If you're happy with the analysis above can you add this to your list for this iteration please?
Did a bit more digging on this. Looks like it's a bug in ubuntu2404 or docker which prevents the container from running the fchmodat2 system call:
https://github.com/docker/docker-ce-packaging/pull/1007#issuecomment-2064332262
If I run the ubuntu 2404 docker container with `--security-opt seccomp=unconfined`, I can untar files without error. I believe this option gives the container unrestricted access to system calls, so I'm not sure it is a good idea security-wise.
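For comparison, a minimal way to show the two behaviours side by side, reusing the image and tarball from the reproduction above (a sketch only, not how the real test jobs run):

```bash
URL=https://github.com/adoptium/temurin17-binaries/releases/download/jdk-17.0.13%2B11/OpenJDK17U-jdk_arm_linux_hotspot_17.0.13_11.tar.gz

# Default seccomp profile: on the affected hosts this fails with
# "Cannot change mode ...: Operation not permitted".
docker run --rm aqa_u2404_arm32 bash -c \
  "cd /tmp && wget -q $URL && tar xfz OpenJDK17U-jdk_arm_linux_hotspot_17.0.13_11.tar.gz"

# With seccomp filtering disabled the same extraction completes cleanly,
# but the container is left able to make any system call.
docker run --rm --security-opt seccomp=unconfined aqa_u2404_arm32 bash -c \
  "cd /tmp && wget -q $URL && tar xfz OpenJDK17U-jdk_arm_linux_hotspot_17.0.13_11.tar.gz"
```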
https://github.com/ocaml/infrastructure/issues/121#issuecomment-2128856617 suggests Docker >= 25.0.3 and libseccomp2 >= 2.5.5 solves this, so it's a matter of upgrading those packages on the problem dockerhosts
root@dockerhost-osuosl-ubuntu2404-ppc64le-1:~# docker version
Client:
Version: 26.1.3
API version: 1.43 (downgraded from 1.45)
Go version: go1.22.2
Git commit: 26.1.3-0ubuntu1~24.04.1
Built: Mon Oct 14 14:29:26 2024
OS/Arch: linux/ppc64le
Context: default
Server:
Engine:
Version: 24.0.7
API version: 1.43 (minimum version 1.12)
Go version: go1.22.2
Git commit: 24.0.7-0ubuntu4.1
Built: Fri Aug 9 02:33:20 2024
OS/Arch: linux/ppc64le
Experimental: false
containerd:
Version: 1.7.24
GitCommit:
runc:
Version: 1.1.12-0ubuntu3.1
GitCommit:
docker-init:
Version: 0.19.0
GitCommit:
To do:
- Upgrade docker on both ppc64le dockerhosts to >= 25.0.3
- Upgrade libseccomp2 on the ubuntu 2204 arm64 dockerhost and ubuntu 2004 ppc64le dockerhost to >= 2.5.5
- I suspect ubuntu 2204 and 2004 won't allow this so may have to upgrade the OS to ubuntu 2404
- Upgrade libseccomp2 on the ubuntu 2204 arm64 dockerhost to >= 2.5.5
- I suspect ubuntu 2204 won't allow this so may have to upgrade the OS to ubuntu 2404
I would test with the latest available if it doesn't have 2.5.5 in the repositories. Ubuntu (and other LTS distribution providers) will often backport important patches so even if they're showing something earlier than 2.5.5 it may be ok.
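A quick way to check what is actually installed on a given dockerhost before deciding (the `scmp_sys_resolver` helper ships in the `seccomp` package and may not be present; whether a backported libseccomp knows the syscall is exactly what the last command probes):

```bash
# Installed and candidate libseccomp2 versions; Ubuntu may backport the
# fchmodat2 support into an older-looking version string.
apt-cache policy libseccomp2

# Docker Engine version on the host (the linked issue suggests >= 25.0.3).
docker version --format '{{.Server.Version}}'

# If the 'seccomp' tools are installed, resolve fchmodat2 to a syscall
# number; an error or a negative value suggests this libseccomp build
# does not know about fchmodat2.
scmp_sys_resolver fchmodat2
```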
Upgraded docker on dockerhost-osuosl-ubuntu2404-ppc64le-1 to v27
Server: Docker Engine - Community
Engine:
Version: v27.4.1
API version: 1.47 (minimum version 1.24)
Go version: go1.22.10
Git commit: c710b88
Built: Mon Dec 23 11:56:44 2024
OS/Arch: linux/ppc64le
Experimental: false
I was able to untar a JDK binary in an ubuntu 2404 container on it without the permissions error. Looks good.
On the arm64 dockerhosts, it looks like the tar error on arm32 ubuntu 2404 containers has cleared itself up? Docker or libseccomp2 may have been upgraded during an automated patch. I can't seem to recreate the tar error in an arm32 ubuntu2404 container on any of the arm64 docker nodes.
dockerhost-skytap-ubuntu2004-ppc64le-1 is the remaining problem machine. As per https://github.com/adoptium/infrastructure/issues/3588 I am going to upgrade it to Ubuntu 2404, so I will upgrade docker on the machine after the OS upgrade (if I still need to).
I am starting the dockerhost-skytap-ubuntu2004-ppc64le-1 OS upgrade right now
The upgrade terminated midway due to a lack of disk space, presumably on /
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/ubu1604p8--vg-root 38G 28G 8.2G 77% /
Reiterating https://github.com/adoptium/infrastructure/issues/3588#issuecomment-2665831738
I've upgraded dockerhost-skytap-ubuntu2004-ppc64le-1 to ubuntu 22.04. Disk space issues on /boot are preventing an upgrade to ubuntu 24.04, but I am comfortable keeping it on 22 since it's still supported. Its ubuntu2404 container, which has long been offline due to the untarring issue, is now able to run grinders because Docker on the host system has been upgraded to v26: https://ci.adoptium.net/job/Grinder/12705/console