EPIC: Upgrade EOL Ubuntu Machines To New 24.04 LTS Version
Upgrade all EOL Ubuntu machines to the most recent LTS release (24.04), and also upgrade some of the existing Ubuntu 20.04 and 22.04 machines to provide greater coverage.
Machines Identified:
- ubuntu1604-x64-1: {ip: 78.47.239.96, description: nagios.adoptopenjdk.net}
- vagrant-x64-1: {ip: 150.239.60.120, description: Bare metal machine to run vagrantPlaybookCheck and qemuPlaybookCheck}
- [x] https://github.com/adoptium/infrastructure/issues/3501
- [x] https://github.com/adoptium/infrastructure/issues/3589
- [x] https://github.com/adoptium/infrastructure/issues/3598
- [x] https://github.com/adoptium/infrastructure/issues/3693
- [x] https://github.com/adoptium/infrastructure/issues/3692
- [x] https://github.com/adoptium/infrastructure/issues/3577
- [x] https://github.com/adoptium/infrastructure/issues/3729
Other machines that should be upgraded as part of this:
| Host | Current OS | Status |
|---|---|---|
| ci.adoptium.net (Primary jenkins server) | Ubuntu 20.04 | Separate issue |
| dockerhost-azure-ubuntu2204-x64-2 | Ubuntu 22.04 [§] | |
| dockerhost-equinix-ubuntu2004-armv8-1 | Ubuntu 20.04 [§] | |
| dockerhost-osuosl-ubuntu2004-ppc64le-1 | Ubuntu 20.04 [§] | |
| test-ibmcloud-ubuntu1604-x64-1 | Ubuntu 16.04 [†] | |
| test-osuosl-ubuntu1604-ppc64le-1 | Ubuntu 16.04 [†] | |
| test-osuosl-ubuntu1604-ppc64le-2 | Ubuntu 16.04 [†] | |
| test-osuosl-ubuntu1804-ppc64le-1 | Ubuntu 18.04 [†] | |
| test-osuosl-ubuntu1804-ppc64le-2 | Ubuntu 18.04 [†] | |
| test-skytap-ubuntu2004-ppc64le-1 | Ubuntu 20.04 | |
| test-osuosl-ubuntu2004-ppc64le-1 | Ubuntu 20.04 | |

[§] - Updating the dockerhosts to 24.04 will mean that the kernel will be suitable for any newer docker containers that we wish to run.

[†] - These machines are running a version which is now out of standard support.
ref Upgrade/Rebuild IBM VPC Host To Ubuntu 24.04
I managed to get the VPC machine, 150.239.60.120, into a state where it is receiving updates, but I am hitting dependency errors. Running `apt --fix-broken install` fails with:
******************************************************************************
*
* The base-files package cannot be installed because
* /bin is a directory, but should be a symbolic link.
*
* Please install the usrmerge package to convert this system to merged-/usr.
*
* For more information please read https://wiki.debian.org/UsrMerge.
*
******************************************************************************
dpkg: error processing archive /var/cache/apt/archives/base-files_13.5_amd64.deb (--unpack):
new base-files package pre-installation script subprocess returned error exit status 1
Errors were encountered while processing:
/var/cache/apt/archives/base-files_13.5_amd64.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)
I then tried `apt install usrmerge` to resolve this, but hit the same dependency errors again, so the two problems block each other in a cycle. I recommend that the machine be rebuilt.
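For anyone checking other hosts before attempting the same upgrade, here is a minimal sketch of the condition the `base-files` pre-installation script is complaining about. It assumes that a merged-/usr system has `/bin`, `/sbin` and `/lib` as symbolic links; the helper name `unmerged_dirs` and the list of directories are my own choices for illustration, not something taken from the package.

```python
import os

# Top-level directories assumed to be symlinks into /usr on a merged-/usr system.
MERGED_DIRS = ("bin", "sbin", "lib")


def unmerged_dirs(root="/"):
    """Return the directories under root that are real directories, not symlinks."""
    leftover = []
    for name in MERGED_DIRS:
        path = os.path.join(root, name)
        # isdir() follows symlinks, so also require that the path is not a link.
        if os.path.isdir(path) and not os.path.islink(path):
            leftover.append(name)
    return leftover


if __name__ == "__main__":
    leftover = unmerged_dirs()
    if leftover:
        print("Not merged-/usr, still real directories:", ", ".join(leftover))
    else:
        print("Looks like a merged-/usr layout")
```

A non-empty result on a host suggests it would hit the same `base-files` error, so a rebuild may be quicker than an in-place upgrade.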
@AdamBrousseau are you able to have this machine reinstalled with a Ubuntu2404 image?
FYI @AdamBrousseau has reinstalled the ibmcloud vagrant host, and once he's got the keys on it (I've asked him to put mine on, then I'll add the others) we'll be able to do the setup.
Done
Noting that in addition to the list in the earlier comment there are ten test-docker machines which are Ubuntu 20.04.
I've added test-osuosl-ubuntu2004-ppc64le-1 and test-skytap-ubuntu2004-ppc64le-1 to the list above too.
The odroid ones on Ubuntu 20.04 will hopefully become irrelevant when https://github.com/adoptium/infrastructure/issues/3043 is closed.
This is blocked by https://github.com/adoptium/infrastructure/issues/3547 (for the test-docker nodes only), since the untar error is preventing us from upgrading our test-docker Ubuntu 20.04 nodes to 24.04 on ppc64le and arm32. In https://github.com/adoptium/infrastructure/issues/3547#issuecomment-2649063527 I recommended upgrading the docker engine to >= 25.0.3, so I will try that to see if it fixes the issue.
Which OSs are the problematic docker host systems running?
The default docker.io package even back to Ubuntu 20.04 seems to be Docker version 26.1.3, build 26.1.3-0ubuntu1~20.04.1, which should meet that requirement. We may have blocked regular updates to that package, though, to prevent an automatic update causing an outage on the containers (I seem to recall that has happened in the past).
dockerhost-osuosl-ubuntu2404-ppc64le-1 and dockerhost-skytap-ubuntu2004-ppc64le-1 (needs to be upgraded to ubuntu 2404 anyway) are running:
Server:
Engine:
Version: 24.0.7
dockerhost-skytap-ubuntu2004-ppc64le-1 needs its libseccomp2 upgraded to >= 2.5.5 as it is suspected that this is also causing the tar error in https://github.com/adoptium/infrastructure/issues/3547
The problematic arm64 dockerhost, dockerhost-equinix-ubuntu2204-armv8-1, is running docker >= 27, but its libseccomp2 still needs to be upgraded to >= 2.5.5. More details in https://github.com/adoptium/infrastructure/issues/3547#issuecomment-2649087432
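To make the two thresholds above (docker engine >= 25.0.3, libseccomp2 >= 2.5.5) easy to check against whatever version string a host reports, here is a rough sketch. It only compares the leading dotted numbers and ignores Debian revision suffixes and epochs, so it is a simplification rather than dpkg's real version ordering. The function names are hypothetical.

```python
import re


def version_tuple(version):
    """Extract the leading dotted integers, e.g. '26.1.3-0ubuntu1~20.04.1' -> (26, 1, 3)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if not match:
        raise ValueError(f"no leading version number in {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed, minimum):
    """True if the installed upstream version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


if __name__ == "__main__":
    # Versions quoted in this thread, checked against the suggested docker minimum.
    for installed in ("24.0.7", "26.1.3-0ubuntu1~20.04.1"):
        print(installed, "meets 25.0.3:", meets_minimum(installed, "25.0.3"))
```

With the values from this thread, the 24.0.7 engine falls short of 25.0.3 while the 26.1.3 docker.io package satisfies it.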
Yep, see my suggestion on libseccomp. It would be nice if that wasn't a blocker.
We'll need to see why the machines locked at an earlier version are stuck there. Hopefully we can just have an outage and manually update them, but we should also check whether they are being held back from a later version because we've stopped updates for some other reason.
All of the test-docker-ubuntu2004 nodes have been upgraded to ubuntu 2404
Reiterating https://github.com/adoptium/infrastructure/issues/3547#issuecomment-2663722626
Attempted the OS upgrade on dockerhost-skytap-ubuntu2004-ppc64le-1; it stopped midway due to a lack of disk space on /. I'm unable to increase the size of /. A solution could be to recreate this VM in the skytap console.
EDIT: I think the skytap console allows me to increase / while the machine is offline
I've increased the / space to 100G, but now it is specifically requesting more /boot space:
Not enough free disk space
The upgrade has aborted. The upgrade needs a total of 132 M free
space on disk '/boot'. Please free at least an additional 68.4 M of
disk space on '/boot'. You can remove old kernels using 'sudo apt
autoremove' and you could also set COMPRESS=xz in
/etc/initramfs-tools/initramfs.conf to reduce the size of your
initramfs.
I've removed some of the older kernel files in /boot with apt purge to free up space in /boot. The upgrade to ubuntu 22.04 (on the way to 24.04) is underway.
Upgraded to ubuntu 22.04, but now need more space on /boot for the 24.04 upgrade:
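The intermediate stop at 22.04 is there because, as far as I know, the release upgrader moves one LTS at a time by default. A small sketch of the hops each machine in the table would need, assuming that one-LTS-at-a-time behaviour (the `LTS_RELEASES` list and `upgrade_hops` name are just for illustration):

```python
# LTS releases relevant to this issue, oldest first.
LTS_RELEASES = ["16.04", "18.04", "20.04", "22.04", "24.04"]


def upgrade_hops(current, target="24.04"):
    """Return the (from, to) upgrade steps needed to get from current to target."""
    start = LTS_RELEASES.index(current)
    end = LTS_RELEASES.index(target)
    if end < start:
        raise ValueError(f"{target} is older than {current}")
    # Pair each release with its successor up to the target.
    return list(zip(LTS_RELEASES[start:end], LTS_RELEASES[start + 1:end + 1]))


if __name__ == "__main__":
    for release in ("16.04", "18.04", "20.04", "22.04"):
        hops = upgrade_hops(release)
        print(release, "needs", len(hops), "hop(s):", hops)
```

Each hop can hit its own disk space check, so the number of hops is a rough guide to how many times a machine might stall.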
The upgrade has aborted. The upgrade needs a total of 179 M free
space on disk '/boot'. Please free at least an additional 146 M of
disk space on '/boot'. You can remove old kernels using 'sudo apt
autoremove' and you could also set COMPRESS=xz in
/etc/initramfs-tools/initramfs.conf to reduce the size of your
initramfs.
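For reference, the two upgrader messages imply how much was free on /boot at the time: 132 M needed with 68.4 M missing means about 63.6 M free, and 179 M needed with 146 M missing means about 33 M free. The sketch below does the same arithmetic and reads the current free space with `shutil.disk_usage`. I am assuming the upgrader's "M" is close enough to MiB for a rough check, and the function names are hypothetical.

```python
import shutil

MIB = 1024 * 1024


def additional_needed_mib(required_mib, free_mib):
    """How much more space must be freed, or 0 if there is already enough."""
    return max(0.0, required_mib - free_mib)


def boot_free_mib(path="/boot"):
    """Free space on the filesystem holding path, in MiB."""
    return shutil.disk_usage(path).free / MIB


if __name__ == "__main__":
    # Free space implied by the two messages quoted in this thread.
    print("20.04 -> 22.04: free was about", round(132 - 68.4, 1), "M")
    print("22.04 -> 24.04: free was about", 179 - 146, "M")
```

On a real host, `additional_needed_mib(179, boot_free_mib())` gives an estimate of what still has to be cleared before retrying.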
At this point I am comfortable keeping dockerhost-skytap-ubuntu2004-ppc64le-1 on ubuntu 22.04, since it is still in support. Its docker version has been upgraded to 26 and the machine is no longer suffering from the issues in https://github.com/adoptium/infrastructure/issues/3547
I'm going to proceed with the upgrades of the following machines to ubuntu 2404:
- test-osuosl-ubuntu1604-ppc64le-1
- test-osuosl-ubuntu1604-ppc64le-2
- test-osuosl-ubuntu1804-ppc64le-1
- test-osuosl-ubuntu1804-ppc64le-2
- test-osuosl-ubuntu2004-ppc64le-1

Unfortunately these are all POWER8 machines, which cannot be upgraded past ubuntu 20.04. These machines will need to be recreated in the osuosl power console with POWER9 CPUs.
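A quick way to spot this before starting an upgrade is to look at the CPU model the kernel reports. This is a sketch that assumes `/proc/cpuinfo` on ppc64le contains a line of the form `cpu : POWER8E (raw), altivec supported`; the function name is hypothetical and the sample text in the usage is illustrative, not captured from these hosts.

```python
import re


def power_generation(cpuinfo_text):
    """Return the POWER generation (e.g. 8 or 9) from cpuinfo text, or None if not found."""
    match = re.search(r"^cpu\s*:\s*POWER(\d+)", cpuinfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            generation = power_generation(handle.read())
    except OSError:
        generation = None
    if generation is None:
        print("No POWER cpu line found (not a ppc64le host?)")
    elif generation < 9:
        print(f"POWER{generation}: per the comment above, cannot go past ubuntu 20.04")
    else:
        print(f"POWER{generation}: CPU should not block the upgrade")
```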
https://ci.adoptium.net/computer/test-osuosl-ubuntu2404-ppc64le-2/ replaces test-osuosl-ubuntu1604-ppc64le-1. AQA test pipeline: https://ci.adoptium.net/job/AQA_Test_Pipeline/395/console
test-osuosl-ubuntu1604-ppc64le-2 does not exist; perhaps it was deleted from the console and the inventory file was not updated. So I won't be replacing it with another machine.
https://ci.adoptium.net/computer/test-osuosl-ubuntu2404-ppc64le-3/ replaces test-osuosl-ubuntu1804-ppc64le-1 https://ci.adoptium.net/job/AQA_Test_Pipeline/398/console
https://ci.adoptium.net/computer/test-osuosl-ubuntu2404-ppc64le-4/ replaces test-osuosl-ubuntu1804-ppc64le-2 https://ci.adoptium.net/job/AQA_Test_Pipeline/399/console
https://ci.adoptium.net/computer/test-osuosl-ubuntu2404-ppc64le-5/ replaces test-osuosl-ubuntu2004-ppc64le-1 https://ci.adoptium.net/job/AQA_Test_Pipeline/400/console
That's all of the OSUOSL ppc64le machines.
https://ci.adoptium.net/computer/test-skytap-ubuntu2404-ppc64le-1 replaces test-skytap-ubuntu2004-ppc64le-1
https://ci.adoptium.net/job/AQA_Test_Pipeline/401/console
I think that's all of the machines in the list https://github.com/adoptium/infrastructure/issues/3588#issuecomment-2160182456
I'll close this issue once https://github.com/adoptium/infrastructure/pull/3883 is merged