rules_apko

Towards upgrading incomplete lock files to fully resolved per-arch yaml configurations

Open xnox opened this issue 8 months ago • 2 comments

In terraform-provider-apko the yaml configuration is resolved, and attached as an attestation, which is fully reproducible and possible to rebuild offline with a local cache of apks.

The partial lockfiles in json are an incomplete configuration: they still require the yaml configuration, and can generate non-reproducible results - or rather unplanned-for results (a yaml with arches x86_64 and arm64, plus a lockfile for the same arches, can happily be asked to create an empty s390x image).
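As a rough illustration of the missing guard (not apko's actual behaviour), a build could refuse any target arch that is absent from either the yaml or the lockfile. The function name and argument shapes are hypothetical; this is only a sketch:

```python
def check_target_arch(requested_arch, config_archs, locked_archs):
    """Refuse to build for an arch that the config or the lock does not cover.

    Hypothetical helper: the arch lists are assumed to have been read
    from the yaml configuration and the json lockfile beforehand.
    """
    missing = [
        source
        for source, archs in (("config", config_archs), ("lockfile", locked_archs))
        if requested_arch not in archs
    ]
    if missing:
        raise ValueError(
            f"arch {requested_arch!r} is not covered by: {', '.join(missing)}"
        )
    return requested_arch
```

With such a check, asking for s390x against an x86_64 + arm64 configuration fails early instead of producing an empty image.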

Currently I don't see a direct CLI command to translate a multi-arch yaml file into a fully resolved yaml configuration for a single target arch.
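To make the shape of that translation concrete, here is a sketch of only the arch-narrowing half, assuming the configuration has already been parsed into a dict with a top-level `archs` list. The function name and the output file naming are hypothetical. It does not pin any package versions; that part needs the real apk solver.

```python
import copy


def split_by_arch(config):
    """Return one single-arch copy of the config per listed arch.

    Keys follow the apko.$targetarch.yaml naming used in this issue.
    Package resolution is deliberately not attempted here.
    """
    per_arch = {}
    for arch in config.get("archs", []):
        single = copy.deepcopy(config)  # keep the input untouched
        single["archs"] = [arch]
        per_arch[f"apko.{arch}.yaml"] = single
    return per_arch
```

Each resulting dict could then be handed to whatever performs resolution, so the final file names exactly one arch and one set of versions.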

Also, the fully resolved yaml configuration doesn't currently record checksums, or at least reference a locked apkindex somehow. They are encoded in the SPDX json, however.

A lot of code could be removed from apko, and a lot of things could be removed from rules_apko, if instead of the json lock file the paradigm were "apko.yaml => apko.$targetarch.yaml => oci image". That would also fit the bazel ideas more cleanly, with better reproducibility (matching the apko behaviour as used elsewhere).

Whilst the bazel download cache handling is admirable, it is also a burden to maintain for no additional gain (apko is designed to be fully reproducible with its existing cache.... and yet somehow bazel doesn't quite let one achieve the same goals).

I wonder if apko could populate / fetch its cache natively, with bazel accepting that, without maintaining infrastructure to support json lock files in apko. Or whether some sort of generator, external to the apko codebase, could recreate the apko cache-dir from the locked yaml configuration.

Bazel documentation is very user centric, and it is not clear how to best integrate foreign packages. Also, a lot of foreign package repositories are oriented towards fetching source code and building it, rather than pre-built binaries (which is what .apks are). Even looking at go, instead of pushing bazel-specific integrations into go get, a separate generator, gazelle, was written that mangles things on bazel terms. Something like that would be best here too.

In rough bazel terms, I think the following "rules" will be needed:

  • something that accepts apko.yaml and tracks it
  • something that can translate apko.yaml to a resolved apko.$targetarch.yaml for as many arches as needed and track it
  • something that, given apko.$targetarch.yaml, can create a bazel-accepted http cache, which happens to be acceptable as apko's go-apk cache too, and track its state
  • allow updating the go-apk cache from scratch (for example, during key rotation, or apk version reuse - not the case for cgr.dev publications, but possibly true for local / self-service publications)
  • allow updating the resolved apko.$targetarch.yaml upon remote archive moves
  • remove usage of json lockfiles, remove --lockfile from apko
  • allow calling apko build against just apko.yaml
  • allow calling apko build against apko.$targetarch.yaml without cache
  • allow calling apko build against apko.$targetarch.yaml with bazel managed build-cache and thus --offline
  • never use json lockfiles anymore

This will significantly simplify apko & rules_apko maintenance, reduce the codebase of both, and make the builds more flexible and possibly even faster.

xnox avatar May 13 '25 11:05 xnox

I'm not a direct apko contributor - but having dug through the code some, I think it's fair to say that there's some additional burden in maintaining the lockfiles, and that the manual construction of the http cache directory takes a decent amount of code - there are ways that these could be reduced.

I'd push back against abandoning the bazel repo cache entirely - performing downloads during action execution is an antipattern. Some orgs perform their execution stage in remote environments that have a locked-down network, and rely on the host running bazel to do the actual external downloads when populating the repository cache. We also lose the hermeticity guarantees that bazel strives for. It's true that apko can pin a specific upstream release, and knows that we're trusting a particular signing key, but without checking hashes there's no guarantee that a bad actor hasn't modified an artifact in some upstream.

I think we can reduce the burden significantly by relying more on the local repository behavior that apko has recently introduced, although not all the pieces are immediately available.

High level, we're still just looking to:

  • Resolve the content of an apko.yaml to some specific versions from some upstreams
  • Grab that content
  • Make it available when we apko build.

Similar to the behavior you're suggesting now, we can render the apko.yaml to some intermediary format that can be less detailed than the current lock.json. We really just need a package name, an arch, a URL, and a SHA. That can be fed to a repository rule that downloads all that content, and produces a rule that uses melange index to bundle the content into a resolved index file of just the packages we're interested in.
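A minimal sketch of what one entry in that intermediary format might look like, together with the hash check a download step would do. The class and field names are made up for illustration, and SHA-256 is an assumption on my part; the comment above only says "a SHA".

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class LockedPackage:
    """One entry of the hypothetical intermediary format."""
    name: str
    arch: str
    url: str
    sha256: str  # hex digest of the .apk bytes (algorithm is an assumption)


def verify(pkg, data):
    """Raise if the downloaded bytes do not match the recorded digest."""
    digest = hashlib.sha256(data).hexdigest()
    if digest != pkg.sha256:
        raise ValueError(
            f"{pkg.name} ({pkg.arch}): expected {pkg.sha256}, got {digest}"
        )
    return True
```

A repository rule could run the equivalent of `verify` on every fetched file, so that a modified upstream artifact fails the fetch rather than reaching the build.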

Any builds can then depend on that index file (or I think multiple index files - in the case of cross-arch), and the contents of that resolved repository cache to build. This prevents us from needing to fuss with the http cache at all, and removes the burden of needing to know how to build with a lockfile from apko. I'm also a little biased - because this brings us closer to only maintaining an index that can perform multiple image builds with the same versions, instead of maintaining many separate resolved lockfiles that each pull in most of the same content.
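To sketch the "one index, many images" idea: collect the resolved packages of several images, group them per arch, and fail if two images disagree on a version. The input shape, the function name, and the versions in the usage example are invented for illustration; this is not how rules_apko works today.

```python
def shared_index(images):
    """Merge per-image package lists into one pinned list per arch.

    images maps an image name to a list of (name, version, arch) tuples.
    Raises ValueError if two images want different versions of the same
    package on the same arch.
    """
    chosen = {}  # (arch, name) -> version
    for image, packages in images.items():
        for name, version, arch in packages:
            previous = chosen.setdefault((arch, name), version)
            if previous != version:
                raise ValueError(
                    f"{image}: {name} on {arch} wants {version}, "
                    f"but the index already has {previous}"
                )
    by_arch = {}
    for (arch, name), version in sorted(chosen.items()):
        by_arch.setdefault(arch, []).append(f"{name}={version}")
    return by_arch
```

For example, two images that both use `musl=1.2.4-r0` on x86_64 end up with a single entry for it in the x86_64 list, which is the deduplication the shared index is meant to provide.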

How are those resolved yaml files generated now? Is there a world where we could somehow add in hash information?

alexnovak avatar May 30 '25 16:05 alexnovak

"How are those resolved yaml files generated now?" I see them in terraform-provider-apko outputs in my terraform state, but I am failing to see how it is done, and whether that is using internal APIs which are not exposed over the CLI. cc @joshrwolf @javacruft @

"Is there a world where we could somehow add in hash information?" Sure, we should.

Also I am pondering multi-mode support:

  • resolve apko.yaml to apko.yaml with fixed versions and hashes
  • without bazel cache it can be marked hermetic then
  • if desired from fixed versions and hashes generate bazel managed cache files
  • at build time use fixed versions as is

But the point is that the codepath in apko build that receives both the unresolved yaml and the json lock file, and then attempts to perform a new resolution constrained by the lock file, is removed. All of that is unnecessary if one has the full set of packages specified with versions anyway.
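A quick way to decide whether a configuration is already "fully specified" could be a check like the one below. It assumes that an exact pin is written as `name=version` in the package list and treats everything else (bare names, range constraints) as unresolved; the function name is hypothetical.

```python
def unpinned_packages(packages):
    """Return the package specs that are not exact name=version pins."""
    loose = []
    for spec in packages:
        name, sep, version = spec.partition("=")
        # A bare name has no "=", and range operators such as ">=" or "<="
        # leave the comparison character at the end of the name part.
        if not sep or not name or not version or name[-1] in "<>~":
            loose.append(spec)
    return loose
```

If this returns an empty list, the build can use the listed versions as is and skip resolution; otherwise the config still needs the resolve step first.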

xnox avatar May 30 '25 18:05 xnox