Automation for updating templates

Open AkihiroSuda opened this issue 2 years ago • 21 comments

It is really hard for me to create a PR like https://github.com/lima-vm/lima/pull/1236 to update the template image digests.

We have to have a tool for updating these templates automatically. The tool must retain comment lines and indentation styles in the YAMLs.
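
As a rough illustration of the "retain comment lines and indentation" requirement, a plain text substitution leaves everything except the matched string untouched. The sketch below is only an assumption about how such a tool could start; the function name `replace_in_template` is hypothetical, and it assumes the old and new strings contain no `|` or `&` characters (true for `sha256:<hex>` digests).

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace every occurrence of OLD with NEW in FILE.
# Only the matched text changes, so comments and indentation survive.
# Assumes OLD and NEW contain no "|" or "&" (e.g. "sha256:<hex>" digests).
replace_in_template() {
  local file=$1 old=$2 new=$3 tmp
  tmp=$(mktemp) || return 1
  # Write to a temporary file first so a failed sed never truncates FILE.
  if sed "s|$old|$new|g" "$file" > "$tmp"; then
    cat "$tmp" > "$file"
  fi
  rm -f "$tmp"
}

# Example:
#   replace_in_template ubuntu.yaml "sha256:<old hex>" "sha256:<new hex>"
```

This does not validate the YAML at all; a structure-aware tool would still be needed to decide which digest belongs to which image.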


Frequently updated images:

  • [X] Ubuntu https://github.com/lima-vm/lima/pull/2702
  • [x] Debian https://github.com/lima-vm/lima/pull/2731
  • [X] ArchLinux https://github.com/lima-vm/lima/pull/2788
  • [ ] CentOS Stream

Less frequent ones:

  • [ ] AlmaLinux
  • [ ] Rocky Linux
  • [ ] Oracle Linux
  • [ ] Fedora
  • [ ] openSUSE Leap

AkihiroSuda avatar Feb 02 '23 01:02 AkihiroSuda

Maybe also some means of sharing them, with some kind of FROM system ?

  • https://github.com/lima-vm/lima/issues/824

afbjorklund avatar Feb 02 '23 07:02 afbjorklund

Maybe also some means of sharing them, with some kind of FROM system ?

Yes, but that is a separate issue

AkihiroSuda avatar Feb 02 '23 07:02 AkihiroSuda

Do you already have the link to the updated image? Or is that step needed as well? I expect you grab the checksum provided at the source, rather than generating it yourself, but please do confirm.

jlm0x017 avatar Feb 09 '23 04:02 jlm0x017

Do you already have the link to the updated image?

No, e.g., we have to detect the latest version 20230124-1270 from https://cloud.debian.org/images/cloud/bullseye/ , but I'm not sure what a robust way to do this would be.

w3m | grep might be enough, but seriously we should also consider adopting some machine learning stuff.
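
A minimal sketch of the grep approach, assuming the listing names each build directory `<YYYYMMDD>-<number>/` as in 20230124-1270. The function name `latest_debian_build` is hypothetical, and the pattern would need adjusting if upstream changes its naming.

```shell
#!/usr/bin/env bash
# Print the newest build ID (e.g. 20230124-1270) found in a directory
# listing read from stdin. Assumes directories are named <YYYYMMDD>-<number>/.
latest_debian_build() {
  grep -oE '[0-9]{8}-[0-9]+/' |   # keep only directory-style build IDs
    tr -d '/' |                   # drop the trailing slash
    sort -t- -k1,1n -k2,2n -u |   # order by date, then by build number
    tail -n 1                     # the last entry is the newest
}

# Example (needs network access):
#   curl -fsSL https://cloud.debian.org/images/cloud/bullseye/ | latest_debian_build
```

Reading from stdin keeps the parsing step separate from the download, so it can be checked against a saved copy of the listing.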

AkihiroSuda avatar Feb 10 '23 00:02 AkihiroSuda

If I remember correctly, there was some standard metadata for scraping upstream for new releases to package...

Like https://wiki.debian.org/debian/watch

Maybe something like that can be used here, to "describe" the various vendors and where they put their binaries ?

Something simple, with placeholders for date strings and checksums.

afbjorklund avatar Feb 11 '23 08:02 afbjorklund

Example output:

https://qa.debian.org/cgi-bin/watch?pkg=containerd

Unfortunately, the others are not available as packages. But maybe something similar to this, but for images:

https://repology.org/project/nerdctl/versions

afbjorklund avatar Feb 11 '23 08:02 afbjorklund

What about a naive bash/python script to replace placeholder strings in yaml? Something like:

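# Scrape the newest image name from the upstream listing ("xxx" and $listing_url are placeholders)
ubuntu_image=$(w3m -dump "$listing_url" | grep -e "xxx")
# Use "|" as the sed delimiter because image URLs contain "/"
sed -i "s|UBUNTU_IMAGE|$ubuntu_image|g" ubuntu.yaml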

The robustness relies solely on upstreams keeping their file naming consistent (counting on Hyrum's law). Sadly, I didn't find any public tool to retrieve the latest released images.

lobshunter avatar Apr 21 '23 03:04 lobshunter

I'm now planning to use yq

AkihiroSuda avatar Apr 21 '23 05:04 AkihiroSuda

How about using libosinfo (osinfo-db os) ??

I could see their db has info of different os variants https://gitlab.com/libosinfo/osinfo-db/-/tree/main/data/os

We might need to write a Python wrapper on top of this library (the osinfo-db tool does not expose the per-architecture image download URLs that are present in the XML files)

balajiv113 avatar May 29 '23 12:05 balajiv113

yq can read xml too

yq -p xml -P

afbjorklund avatar May 29 '23 12:05 afbjorklund

The library and database are licensed under the terms of the GNU LGPL version 2 or later.

https://libosinfo.org/

afbjorklund avatar May 29 '23 12:05 afbjorklund

osinfo-db

Doesn't seem to contain permalinks: https://gitlab.com/libosinfo/osinfo-db/-/blob/ea8a7974a1f7189953c80fa9b1478b1ff8a75f8e/data/os/ubuntu.com/ubuntu-23.04.xml.in

    <image arch="x86_64" format="qcow2" cloud-init="true">
      <url>https://cloud-images.ubuntu.com/lunar/current/lunar-server-cloudimg-amd64.img</url>
    </image>

AkihiroSuda avatar May 29 '23 12:05 AkihiroSuda

yq can read xml too

True, but it would be great if we could use the API. Otherwise, with yq we might need to read all the XML files under each folder that we are interested in. With the API, I think it will be more manageable.

GNU LGPL version 2 or later

I thought that since we are going to use this mostly as a build tool (mostly in a GitHub Actions workflow), this should not be a problem.

balajiv113 avatar May 29 '23 12:05 balajiv113

I was mostly referring to the "and database" part; most of the tools actually seem to be GPL v2 (and to require glib)

afbjorklund avatar May 29 '23 12:05 afbjorklund

Doesn't seem to contain permalinks

True :( Supported examples are as below

  • [x] almalinux-8.yaml
  • [x] almalinux-9.yaml
  • [ ] alpine.yaml (we can support this as it's our variant of alpine)
  • [ ] archlinux.yaml (Not present)
  • [x] centos-stream-8.yaml
  • [x] centos-stream-9.yaml
  • [ ] debian.yaml (Present, Using only latest version)
  • [x] fedora.yaml
  • [ ] opensuse.yaml (Present, But qcow2 image not present)
  • [ ] oraclelinux-8.yaml (Not present)
  • [ ] oraclelinux-9.yaml (Not present)
  • [x] rocky-8.yaml
  • [x] rocky-9.yaml
  • [ ] ubuntu.yaml (Present, Using only latest version)
  • [ ] ubuntu-lts.yaml (Present, Using only latest version)
  • [ ] experimental/opensuse-tumbleweed.yaml (Present, But qcow2 image not present)

balajiv113 avatar May 29 '23 12:05 balajiv113

I guess we can consider using GPT

AkihiroSuda avatar Jul 31 '23 16:07 AkihiroSuda

For Ubuntu, this is implemented in the ironically named "simple streams" (it's 14M):

sudo apt install simplestreams ubuntu-keyring

sstream-query --json --max=1 --keyring=/usr/share/keyrings/ubuntu-cloudimage-keyring.gpg http://cloud-images.ubuntu.com/releases/streams/v1/com.ubuntu.cloud:released:download.sjson release='noble' ftype='disk1.img' | jq -r '.[] | [.item_url,.arch,.sha256]'

https://philroche.net/2018/02/12/ubuntu-cloud-images-and-how-to-find-the-most-recent-cloud-image-part-1-of-3/

The JSON+GPG file is: http://cloud-images.ubuntu.com/releases/streams/v1/com.ubuntu.cloud:released:download.sjson


There is also a high-level command:

sudo snap install image-status

$ image-status cloud-release
focal    amd64  20240626  disk1.img
jammy    amd64  20240627  disk1.img
mantic   amd64  20240619  disk1.img
noble    amd64  20240622  disk1.img

Where "disk1.img" is the old spelling of QCOW.

afbjorklund avatar Jun 28 '24 08:06 afbjorklund

Thanks @norio-nomura :tada:

  • https://github.com/lima-vm/lima/pull/2702

AkihiroSuda avatar Oct 08 '24 16:10 AkihiroSuda

For Debian, probably we can parse this JSON: https://cloud.debian.org/images/cloud/bookworm/latest/debian-12-genericcloud-amd64.json

sha512 isn't encoded in hex though: "cloud.debian.org/digest": "sha512:2oTWCdfsVkXa4d9QPqcgN7KoMUAdG0LOLn7CqEC2mfB8qK6mMIU6PVQwg5Jowr0ze+RdiUmCZMNqm14ShyxZ7g"

Probably it should be just grepped from https://cloud.debian.org/images/cloud/bookworm/latest/SHA512SUMS instead.
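
A sketch of that lookup, assuming SHA512SUMS uses the usual `sha512sum` layout of `<hex digest>  <file name>` per line. The function name `sha512_for` and the `sha512:` output prefix are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Read a SHA512SUMS file on stdin and print "sha512:<hex>" for the file
# name given as $1. Assumes lines look like "<hex digest>  <file name>".
sha512_for() {
  awk -v want="$1" '
    {
      name = $2
      sub(/^\*/, "", name)          # strip the binary-mode marker, if any
      if (name == want) print "sha512:" $1
    }'
}

# Example (needs network access; the file name is an example):
#   curl -fsSL https://cloud.debian.org/images/cloud/bookworm/latest/SHA512SUMS |
#     sha512_for debian-12-genericcloud-amd64.qcow2
```

An exact match on the file name avoids picking up similarly named images, which a bare grep for "genericcloud-amd64" could do.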

AkihiroSuda avatar Oct 10 '24 01:10 AkihiroSuda

For ArchLinux, curl -fsSL https://gitlab.archlinux.org/api/v4/projects/archlinux%2Farch-boxes/packages | jq '.[-1].version' can be used for retrieving the latest version

AkihiroSuda avatar Oct 10 '24 01:10 AkihiroSuda

sha512 isn't encoded in hex though: "cloud.debian.org/digest": "sha512:2oTWCdfsVkXa4d9QPqcgN7KoMUAdG0LOLn7CqEC2mfB8qK6mMIU6PVQwg5Jowr0ze+RdiUmCZMNqm14ShyxZ7g"

It appears that this is a base64-encoded hash binary with the trailing "==" removed. The following steps will convert it into hex-encoded format.

$ debian_sha512="2oTWCdfsVkXa4d9QPqcgN7KoMUAdG0LOLn7CqEC2mfB8qK6mMIU6PVQwg5Jowr0ze+RdiUmCZMNqm14ShyxZ7g"
$ echo "${debian_sha512}=="|base64 -d|xxd -p -c -
da84d609d7ec5645dae1df503ea72037b2a831401d1b42ce2e7ec2a840b699f07ca8aea630853a3d5430839268c2bd337be45d89498264c36a9b5e12872c59ee
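
The same conversion can be wrapped so that the padding is restored from the string length rather than hard-coded. This is only a sketch: the function name `b64_to_hex` is hypothetical, it assumes standard (not URL-safe) base64 and a `base64` that accepts `-d`, and it uses `od` in place of `xxd`.

```shell
#!/usr/bin/env bash
# Convert an unpadded, standard-alphabet base64 string ($1) to lowercase hex.
b64_to_hex() {
  local s=$1
  # Restore the "=" padding that was stripped: the length of a base64
  # string must be a multiple of 4.
  case $(( ${#s} % 4 )) in
    2) s="${s}==" ;;
    3) s="${s}=" ;;
  esac
  # od prints space-separated hex bytes (-v disables duplicate-line
  # folding); tr joins them into a single string.
  printf '%s' "$s" | base64 -d | od -An -v -tx1 | tr -d ' \n'
  echo
}

# Example:
#   b64_to_hex "$debian_sha512"
```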

norio-nomura avatar Oct 10 '24 02:10 norio-nomura

Marking as completed, huge thanks to @norio-nomura 🎉

AkihiroSuda avatar Nov 01 '24 09:11 AkihiroSuda

👍🏻 I’m still researching the OpenSUSE releases. Once it looks feasible, I’ll create an update-template-opensuse.sh script as well.

norio-nomura avatar Nov 01 '24 09:11 norio-nomura