coreos-assembler
support `cosa init --ostree docker://quay.io/coreos-assembler/fcos:testing-devel`
Conceptually this is part of https://github.com/coreos/fedora-coreos-tracker/issues/828, which in retrospect was framed too broadly. Focus there shifted to CoreOS layering, but that's really the "user experience" half; re-engineering how we build and ship FCOS (the first half of that issue) still applies.
In particular, I think we should support a `cosa init --ostree` mode that takes a container image as input and outputs just a container. We may not even generate a `builds/` directory, and no `meta.json` should be created for this.
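A minimal sketch of what that could look like from the command line, assuming the subcommand/flag names proposed in this issue (none of this is implemented as written):

```
# Proposed: initialize a workdir from an existing ostree container instead of a builds/ dir
cosa init --ostree docker://quay.io/coreos-assembler/fcos:testing-devel

# Proposed: build a new ostree container from config git + RPMs; the output is
# just a container image, with no builds/ directory or meta.json created
cosa ostree build
```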
```mermaid
flowchart TB
  quayprevious["previous ostree container"] --> ostreebuild
  subgraph ostreebuild [cosa ostree build]
    configgit["config git"] --> container
    rpms --> container["ostree container"]
    container --> quay["quay.io"]
  end
  subgraph imagebuild [cosa image build]
    quay --> qemu["qemu image"]
    quay --> metal
    metal --> iso
    qemu --> ami
    qemu --> vsphere
    qemu --> gcp
  end
  imagebuild --> S3
```
Note how the input to "cosa image build" is just the ostree container, not config git (or rpms).
Further, I want to emphasize that "build ostree container" and "build disk images" can be (and would normally be) separate processes. (How testing is integrated here is not depicted; basically we'd still probably generate a qemu image to sanity-test our container builds, but it would be discarded, and regenerated by the image build process only once that image had passed other approvals.)
A specific thing this would really help unblock is reworking our build/CI flow to be more like:
- Check for changes in input
- Build a new container image
- Do sanity checks on that container image as a container (perhaps even systemd-in-container)
- Push that container image to registry
The remaining steps could be parallelized/configurable (see the sketch after this list):
- Kick off upgrade tests from the previous stable release
- Generate a fresh qemu image and run qemu basic tests
- Do ISO/metal tests
- Do cloud tests
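A very rough, purely illustrative sketch of those stages (the image/tag names, the local image name, and the exact commands are assumptions, not an existing pipeline definition):

```
# Serial: build the container, sanity-check it as a container, push the candidate tag
cosa ostree build                                              # proposed container-native build
podman run --rm localhost/fcos:candidate cat /etc/os-release   # cheap in-container sanity check
skopeo copy containers-storage:localhost/fcos:candidate \
  docker://quay.io/fedora/fedora-coreos:testing-devel-candidate

# Parallel/configurable: upgrade tests from the previous stable release, a fresh
# qemu image + basic tests, ISO/metal tests, cloud tests -- each stage pulling
# the pushed candidate tag rather than reading a builds/ directory.
```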
And we could now much more naturally represent stages of CI with container image tags. For example, we might push `fcos:testing-devel-candidate` or so, and then only tag to `fcos:testing-devel` once some of those tests have passed.
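Promotion then becomes a plain registry-side retag with standard tooling; for example (image names illustrative):

```
# Retag the candidate as testing-devel once CI has passed; no rebuild involved
skopeo copy \
  docker://quay.io/fedora/fedora-coreos:testing-devel-candidate \
  docker://quay.io/fedora/fedora-coreos:testing-devel
```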
> For example we might push `fcos:testing-devel-candidate` or so. And then only tag to `fcos:testing-devel` once some of those tests have passed.
This is my favorite part: it would enable consumers of these images to get feedback about the overall image state.
Just to clarify, when you say OCI standard keys, are you referring to https://github.com/opencontainers/image-spec/blob/main/annotations.md?
Yep! Specifically `org.opencontainers.image.source` and `org.opencontainers.image.revision`.
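For illustration only (the config repo URL and image tag are assumptions), those keys could be stamped onto the image as labels at build time and read back with standard tooling:

```
# Record provenance pointing at the config git repo and commit
podman build \
  --label org.opencontainers.image.source=https://github.com/coreos/fedora-coreos-config \
  --label org.opencontainers.image.revision="$(git rev-parse HEAD)" \
  -t quay.io/coreos-assembler/fcos:testing-devel .

# Read the provenance back without pulling the whole image
skopeo inspect docker://quay.io/coreos-assembler/fcos:testing-devel | jq '.Labels'
```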
https://github.com/ostreedev/ostree-rs-ext/pull/234 will help this
I'm not sure I understand the value here. Maybe we can talk about it at the next video community meeting to make it more clear for people.
https://github.com/ostreedev/ostree-rs/pull/47
Was that a response to me? If so I still don't understand how that answers the question.
> Was that a response to me?
Nope, just keeping track of related PRs.
> I'm not sure I understand the value here.
I tried to elaborate on all this in https://github.com/coreos/fedora-coreos-tracker/issues/828
The simplest way to say it is that our center of gravity shifts much closer to container image builds, and away from a custom JSON schema stored in a blob store.
Right now the container image is exported from the blob store; this would flip things around: the source of truth becomes a container image, and disk image builds are secondary/derived from it.
https://github.com/coreos/rpm-ostree/pull/3402
Hmm, also unsure about this. At the end of the day, we'll probably still always want public images sitting in object stores so it's convenient for users/higher-level tools to download and run without involving a container stack. Which means we'd still have something like the builds dir in S3. So there's a lot of force pulling us towards keeping it as canonical too.
> we'll probably still always want public images sitting in object stores so it's convenient for users/higher-level tools to download and run without involving a container stack.
In our world, "images" is an ambiguous term. You're thinking disk/boot images, right? Yes, I agree. Wrapping those in a container is currently a bit of a weird thing to do.
> Which means we'd still have something like the builds dir in S3. So there's a lot of force pulling us towards keeping it as canonical too.
I think the more interesting angle here is having disk images come after (follow, derive from) the container builds. But yes, when we go to generate a cosa build, we convert the container back into an ociarchive and store it in S3 as we do currently.
I feel like if we're pushing in this direction we should probably have a larger discussion about it. Would you like to bring it up at this week's meeting?
I think we need to embed `image.yaml` inside the ostree commit to make this work too.
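Something like the following is the idea, assuming the file lands at a well-known path inside the commit (the repo and file paths here are hypothetical):

```
# Image-build side: read the provisioning defaults straight out of the ostree
# commit, with no access to config git required
ostree --repo=tmp/repo cat "${COMMIT}" /usr/share/coreos-assembler/image.yaml
```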
OK I tried to visualize the new proposal, where we natively store the ostree containers in a registry, and disk images in S3:
(moved to https://github.com/coreos/coreos-assembler/issues/2685#issue-1123147338 )
This depends on https://github.com/coreos/coreos-assembler/pull/2806
Here's another way to look at this - we have a ton of bespoke, load-bearing custom tooling. And some (even most) of that we really do need. But I think `meta.json` and all the S3 stuff around that is a great example of something that could sort of just slowly be... de-emphasized, at least.
The default result of a build would be a container image. We have tons of tools to version, manage, test, promote, mirror, sign, and inspect those.
We still need to make disk images - but the version numbers of those are the same as the container image. We then also have more flexibility to change how we store and manage disk images.
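For example, all of the following is generic, existing container tooling rather than anything cosa-specific (image names and signing key are illustrative):

```
skopeo inspect docker://quay.io/fedora/fedora-coreos:stable        # inspect/version
skopeo copy docker://quay.io/fedora/fedora-coreos:stable \
  docker://mirror.example.com/fedora-coreos:stable                 # mirror/promote
cosign sign --key cosign.key quay.io/fedora/fedora-coreos:stable   # sign
```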
(Stream metadata remains user facing API)
A good example of something fixed by this is https://github.com/coreos/coreos-assembler/issues/668
I've reworked the proposal here even more - the `cosa init --ostree` and `cosa ostree build` process should not create any `meta.json` at all. In fact, we could even drop the `builds/` directory, and have the output of the build process go into a directory named `oci/` or so (i.e. this would be an OCI layout directory, which already natively supports multiple builds and tags).
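To illustrate "natively supports multiple builds and tags": an OCI layout directory is just the standard on-disk format that tools like skopeo already read and write, and a single directory can carry several tagged builds (paths and tags illustrative):

```
# Two builds written into the same oci/ layout directory under different tags
skopeo copy docker://quay.io/coreos-assembler/fcos:testing-devel oci:oci/:testing-devel
skopeo copy docker://quay.io/coreos-assembler/fcos:stable oci:oci/:stable

ls oci/    # blobs/  index.json  oci-layout
```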
> Here's another way to look at this - we have a ton of bespoke load bearing custom tooling. And some (even most) of that we really do need. But, I think meta.json and all the S3 stuff around that is a great example of something that could sort of just slowly be...de-emphasized at least.
Although `meta.json` and `builds.json` etc. are indeed all bespoke, we also have a lot of tooling built on top of them at this point (and it extends to many other teams now). I think indeed we'd have to keep supporting it for a good while. How do you think about the benefits of doing this vs. the cost of having to maintain both in parallel?
Edit: "many other teams" is probably too strong here. I know at least e.g. openQA knows the format, and many folks in other OpenShift teams are familiar with it as well, but I'm not sure how much meta.json/builds.json-aware tooling owned by them actually exists.
I am not actually proposing to get rid of builds.json/meta.json anytime soon (read: a year at least).
What I am proposing is adding a new opt-in/parallel flow for just the base image build (aka oscontainer) that is container-native, i.e. does not by default generate or require builds.json/meta.json.
Again even this would clearly need to be opt-in in the immediate term. But I'd at least like to try using it in OCP Prow, where this would feel much more natural than anything requiring S3.
Actually flipping the prod FCOS pipeline over to this would be contingent on us figuring out that it makes sense and works.
But for example... we could use this for a separate `quay.io/fedora/fedora-coreos:36-continuous` container-only (to start) stream.
xref https://github.com/coreos/fedora-coreos-tracker/issues/1068 and its associated coreos-assembler changes, which tie into landing nontrivial changes in coreos-assembler.
One angle I'm looking at this from is that by decoupling the container builds from the disk image builds, we are also implicitly starting to build up an "API" between the two, which helps unblock other tools doing disk image builds.
Specifically, for example, I see this as a small but notable step in trying to align with e.g. osbuild. Here, the role of osbuild would be to accept a container image as input which has effectively everything necessary to "materialize" that container as a disk image. The change to embed `image.json` in the ostree commit has merged, which was a key part of this.
For example, I'm envisioning osbuild having an interface that takes this and e.g. generates a CoreOS-style `-metal.img` or `.iso` that works the same as ours do today.
One way to imagine this is if e.g. a yum repo also had an embedded kickstart with it that defined the provisioning defaults. Today, those kinds of defaults are part of anaconda, which is lifecycled (versioned) separately from the yum repo. This allows us to e.g. in theory change our default filesystem configuration and `grub.cfg` without changing the disk image generation tool.
(Now, our `image.json` is obviously a made-up ad-hoc thing and we should figure out something better, but that's the idea.)
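The shape of that "API" might look roughly like this, where a disk-image builder only needs the container image as input (the embedded-defaults path is hypothetical):

```
# Pull the image and copy the embedded provisioning defaults out of it
podman create --name fcos-src quay.io/fedora/fedora-coreos:36-continuous
podman cp fcos-src:/usr/share/coreos-assembler/image.json ./image.json
podman rm fcos-src

# ...then the disk image tool consumes image.json plus the container content
# itself to produce e.g. a -metal.img or .iso
```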
OK so I discovered that at least for OCP, our release tooling actually depends on having the `@sha256` digest in `meta.json` for the (legacy) container, and it'll be hard to have the new-format container not have that.
https://github.com/coreos/coreos-assembler/pull/2828 is modifying things to mutate the `meta.json` in `push-container` by default, because the kubevirt one will end up there so it can end up in stream metadata... so I guess we can move forward for now assuming that the ostree-container also appears in `meta.json` and stream metadata.
I'd hoped to also use this as leverage to fix the "one big build" problem. But perhaps what we could do as a short-term hack is to decouple the pipeline into two "build streams" that both use meta.json/S3, just in separate places.
I do think longer term though the right thing is still this proposal where the ostree-container is not dependent on meta.json.
PR in https://github.com/coreos/coreos-assembler/pull/3128, which starts the ball rolling here.