Static Docker containers need to be restarted with `--cpuset-cpus="0-3"`
ref https://github.com/adoptium/infrastructure/issues/3360#issuecomment-1924438777
All of our containers need to be restarted with the proper option to assign 4 CPUs: `--cpuset-cpus="0-3"` needs to be used instead of `--cpus=4.0`. That way the test jobs read the number of CPUs available to the container (4) rather than the 160 cores on the dockerhost, and can then set an appropriate concurrency. At the moment test jobs are running with -concurrency:81 on the containers when it should be -concurrency:3.
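To illustrate the difference: `--cpus` only applies a CPU time quota, so a process inside the container still sees every host core, whereas `--cpuset-cpus` pins the container to specific cores and shrinks the visible set. A minimal Python sketch, assuming a Linux container; `parse_cpuset` and `visible_cpus` are hypothetical helper names, and how the test jobs themselves read the CPU count is not shown here:

```python
import os


def parse_cpuset(spec):
    """Expand a cpuset list such as "0-3" or "0,2,4-5" into a set of CPU ids."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def visible_cpus():
    """Number of CPUs this process may be scheduled on.

    On Linux the affinity mask reflects --cpuset-cpus; it is not reduced
    by a --cpus quota. Falls back to os.cpu_count() on other platforms.
    """
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


print(len(parse_cpuset("0-3")))  # 4 CPUs in the set "0-3"
print(visible_cpus())            # depends on where this is run
```

Run inside a container started with `--cpuset-cpus="0-3"`, `visible_cpus()` would be expected to report 4; with only `--cpus=4.0` it would report the host's core count.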
The following nodes have been restarted with --cpuset-cpus="0-3":
dockerhost-equinix-ubuntu2204-armv8-1
- https://ci.adoptium.net/computer/test-docker-debain12-armv8l-1/
- All of https://ci.adoptium.net/label/hw.dockerhost.arm.dockerhost-equinix-ubuntu2204-armv8-1/
- https://ci.adoptium.net/computer/test-docker-alpine319-armv8-1/
dockerhost-equinix-ubuntu2004-armv8-1
- https://ci.adoptium.net/computer/test-docker-sles15-armv8l-1/
- https://ci.adoptium.net/computer/test-docker-fedora39-armv8l-1/
@Haroon-Khel Is test-docker-sles15-armv8l-1 based on the BCI image referenced in https://github.com/adoptium/infrastructure/issues/3135?
Note: This PR should cap test concurrency to either:
- (0.5*cores)+1 or
- (0.5*gigs-of-memory)
Whichever is smaller.
Also, we calculate "memory" as either the machine memory or the cgroup (container) memory, whichever is smaller.
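As a rough sketch of that rule in Python (not the actual PR code): the function names, the floor rounding, the minimum of 1, and the memory sizes in the examples are all assumptions made for illustration.

```python
def effective_memory_gb(machine_gb, cgroup_gb=None):
    """Memory used for the cap: the smaller of machine and cgroup memory.

    cgroup_gb is None when no container memory limit is set.
    """
    if cgroup_gb is None:
        return machine_gb
    return min(machine_gb, cgroup_gb)


def concurrency_cap(cores, machine_gb, cgroup_gb=None):
    """Cap concurrency at min((0.5*cores)+1, 0.5*gigs-of-memory).

    Assumptions: results are rounded down, and the cap never drops below 1.
    """
    by_cpu = cores // 2 + 1
    by_memory = int(effective_memory_gb(machine_gb, cgroup_gb) // 2)
    return max(1, min(by_cpu, by_memory))


# 160 visible cores (the dockerhost), memory not the limiting factor
print(concurrency_cap(160, machine_gb=512))  # 81
# 4 visible cores (container with --cpuset-cpus="0-3")
print(concurrency_cap(4, machine_gb=512))    # 3
```

This matches the numbers above: 160 cores gives -concurrency:81, while 4 cores gives -concurrency:3.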
✅ indicates the container has been rerun with --cpuset-cpus="0-3"
[
{
"name": "dockerhost-equinix-ubuntu2004-armv8-1",
"ip": "147.75.35.203",
"containers": [
"build-docker-ubuntu2004-armv7l-1", Does not exist on machine
"test-docker-alpine313-aarch64-1", replaced by test-docker-alpine319-armv8-2 ✅
"test-docker-alpine314-aarch64-1", replaced by test-docker-alpine319-armv8-4 ✅
"test-docker-fedora39-armv8l-1", ✅
"test-docker-sles15-armv8l-1", ✅
"test-docker-ubuntu1804-armv8l-4", ✅
"test-docker-ubuntu2004-armv7l-1",
"test-docker-ubuntu2004-armv7l-2",
"test-docker-ubuntu2004-armv7l-3",
"test-docker-ubuntu2004-armv8l-1", ✅
"test-docker-ubuntu2004-armv8l-2", ✅
"test-docker-ubuntu2004-armv8l-3", ✅
"test-docker-ubuntu2204-armv8l-2", ✅
"test-docker-ubuntu2310-armv8l-1" ✅
],
"containersCount": 14
},
{
"name": "dockerhost-equinix-ubuntu2004-x64-1",
"ip": "145.40.114.58",
"containers": [
"test-docker-alpine314-x64-1",
"test-docker-alpine317-x64-1",
"test-docker-centos8-x64-1",
"test-docker-debian11-x64-1",
"test-docker-fedora35-x64-1",
"test-docker-fedora37-x64-1",
"test-docker-fedora37-x64-3",
"test-docker-ubi8-x64-1",
"test-docker-ubuntu2004-x64-1",
"test-docker-ubuntu2204-x64-1",
"test-docker-ubuntu2204-x64-3"
],
"containersCount": 11
},
{
"name": "dockerhost-equinix-ubuntu2204-armv8-1",
"ip": "139.178.86.243",
"containers": [
"test-docker-alpine314-armv8-1", replaced by test-docker-alpine319-armv8-3 ✅
"test-docker-alpine314-armv8-3", duplicate of test-docker-alpine314-armv8-1
"test-docker-alpine315-armv8-2", exists in jenkins but not on dockerhost (ghost)
"test-docker-alpine319-armv8-1", ✅
"test-docker-debain12-armv8l-1", ✅
"test-docker-ubuntu2004-armv7l-4", ✅
"test-docker-ubuntu2004-armv7l-5", ✅
"test-docker-ubuntu2004-armv7l-6", ✅
"test-docker-ubuntu2204-armv8-1", ✅
"test-docker-ubuntu2204-armv8-2", ✅
"test-docker-ubuntu2204-armv8-3" ✅
],
"containersCount": 11
},
{
"name": "dockerhost-equinix-ubuntu2204-x64-1",
"ip": "145.40.113.173",
"containers": [
"test-docker-alpine314-x64-2",
"test-docker-alpine317-x64-2",
"test-docker-centos8-x64-2",
"test-docker-debian11-x64-2",
"test-docker-fedora35-x64-2",
"test-docker-fedora37-x64-2",
"test-docker-ubi8-x64-2",
"test-docker-ubuntu2004-x64-2",
"test-docker-ubuntu2204-x64-2"
],
"containersCount": 9
},
{
"name": "dockerhost-marist-ubuntu2204-s390x-1",
"ip": "148.100.74.237",
"containers": [
"test-docker-sles12-s390x-1", ✅
"test-docker-sles15-s390x-1" ✅
],
"containersCount": 2
},
{
"name": "dockerhost-osuosl-ubuntu2004-ppc64le-1",
"ip": "140.211.168.214",
"containers": [
"docker-osuosl-ubuntu2004-ppc64le-1", duplicate of dockerhost-osuosl-ubuntu2004-ppc64le-1
"test-docker-fedora33-ppc64le-1", replaced with test-docker-fedora39-ppc64le-1
"test-docker-ubuntu1804-ppc64le-1", replaced with test-docker-ubuntu2004-ppc64le-1
"test-docker-ubuntu2010-ppc64le-1" replaced with test-docker-ubuntu2204-ppc64le-3
],
"containersCount": 4
},
{
"name": "dockerhost-osuosl-ubuntu2204-aarch64-1",
"ip": "140.211.167.67",
"containers": [],
"containersCount": 0
},
{
"name": "dockerhost-rise-ubuntu2204-aarch64-1",
"ip": "34.72.108.242",
"containers": [],
"containersCount": 0
},
{
"name": "dockerhost-skytap-ubuntu2004-ppc64le-1",
"ip": "20.61.136.212",
"containers": [
"test-docker-debian11-ppc64le-1", ✅
"test-docker-debian11-ppc64le-2", ✅
"test-docker-debian11-ppc64le-3", ✅
"test-docker-debian11-ppc64le-4", ✅
"test-docker-ubuntu2204-ppc64le-1", ✅
"test-docker-ubuntu2204-ppc64le-2" ✅
],
"containersCount": 6
},
{
"name": "dockerhost-skytap-ubuntu2204-x64-1",
"ip": "20.61.136.254",
"containers": [
"test-docker-debian12-x64-1", ✅
"test-docker-fedora39-x64-1", ✅
"test-docker-ubuntu2204-x64-4", ✅
"test-docker-ubuntu2204-x64-5" ✅
],
"containersCount": 4
}
]
Annoyingly, rerunning a container with different parameters isn't as simple as `docker stop $container` followed by `docker start $container --new-options`. As far as I can tell, I need to stop the running container, remove it, and then run a new container from the same image with the new options. I can use https://github.com/adoptium/infrastructure/blob/master/ansible/playbooks/AdoptOpenJDK_Unix_Playbook/dockernode.yml to automate this, but I need to update some of the Dockerfiles first.
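For reference, the stop / remove / run sequence can be sketched as a small Python helper that only builds the commands and does not execute them. The container and image names below are placeholders, and `extra_run_args` stands in for whatever other options (ports, restart policy, and so on) the original container was started with, which are not listed here:

```python
import shlex


def recreate_commands(container, image, cpuset="0-3", extra_run_args=()):
    """Build the docker commands to recreate a container with a new cpuset.

    Returns a list of argument lists: stop, rm, then run from the same image.
    """
    run = [
        "docker", "run", "-d",
        "--name", container,
        "--cpuset-cpus={}".format(cpuset),
        *extra_run_args,
        image,
    ]
    return [
        ["docker", "stop", container],
        ["docker", "rm", container],
        run,
    ]


# Placeholder names, for illustration only
for cmd in recreate_commands("test-docker-example-1", "example/image:latest"):
    print(shlex.join(cmd))
```

Printing the commands first makes it easy to review them per container before running anything on a dockerhost.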
I won't restart the x64 Equinix nodes, as we want to start decommissioning them anyway, as per https://github.com/adoptium/infrastructure/issues/3378#issuecomment-1938595634
With the x64 Equinix dockerhost machines decommissioned, only the ppc64le nodes on dockerhost-osuosl-ubuntu2004-ppc64le-1 are left.
dockerhost-osuosl-ubuntu2004-ppc64le-1 nodes have been restarted. Issue is closed