Melange SCA for shared library dependencies should be versioned

Open xnox opened this issue 1 year ago • 4 comments

$ docker run --rm -ti --entrypoint sh cgr.dev/chainguard/wolfi-base@sha256:d6b37317ae7cb5c0864189e9e5acd825386ae226a413e7c19370f5f87d150f92
/ # apk add openssl~3.0
fetch https://packages.wolfi.dev/os/x86_64/APKINDEX.tar.gz
(1/1) Installing openssl (3.0.7-r3)
WARNING: Support for packages with multiple data parts will be dropped in apk-tools 3.
OK: 16 MiB in 15 packages
/ # openssl version
OpenSSL 3.0.7 1 Nov 2022 (Library: OpenSSL 3.3.1 4 Jun 2024)
/ # apk add openssl~3.1
(1/2) Installing openssl-provider-legacy (3.4.0-r1)
(2/2) Upgrading openssl (3.0.7-r3 -> 3.1.4-r1)
OK: 16 MiB in 16 packages
/ # openssl version
OpenSSL 3.1.4 24 Oct 2023 (Library: OpenSSL 3.3.1 4 Jun 2024)
/ # apk add openssl~3.2
(1/1) Upgrading openssl (3.1.4-r1 -> 3.2.1-r0)
OK: 16 MiB in 16 packages
/ # openssl version
OpenSSL 3.2.1 30 Jan 2024 (Library: OpenSSL 3.3.1 4 Jun 2024)
/ # apk add openssl~3.3
(1/1) Upgrading openssl (3.2.1-r0 -> 3.3.2-r2)
OK: 16 MiB in 16 packages
/ # openssl version
OpenSSL 3.3.2 3 Sep 2024 (Library: OpenSSL 3.3.1 4 Jun 2024)
/ # apk add openssl~3.4
(1/1) Upgrading openssl (3.3.2-r2 -> 3.4.0-r1)
OK: 16 MiB in 16 packages
/ # openssl version
openssl: /usr/lib/libssl.so.3: version `OPENSSL_3.4.0' not found (required by openssl)
openssl: /usr/lib/libcrypto.so.3: version `OPENSSL_3.4.0' not found (required by openssl)
/ # apk add -u libcrypto3 libssl3
(1/2) Upgrading libcrypto3 (3.3.1-r4 -> 3.4.0-r1)
(2/2) Upgrading libssl3 (3.3.1-r4 -> 3.4.0-r1)
OK: 15 MiB in 16 packages
/ # openssl version
OpenSSL 3.4.0 22 Oct 2024 (Library: OpenSSL 3.4.0 22 Oct 2024)
/ # exit 0

The above shows that libcrypto3 / libssl3 v3.3 are backwards compatible with openssl binaries versioned v3.3, v3.2, v3.1, and v3.0. However, libcrypto3 / libssl3 v3.3 are not forwards compatible with the openssl binary versioned v3.4. The same is true for many other applications, such as python (https://github.com/wolfi-dev/os/issues/33218), and really any other user of openssl. The same is true for major upgrades of glibc, when applications start to use newer versioned glibc ABIs.

It would help a lot if the libcrypto3 package had provides = so:libcrypto.so.3=3.3.1-r4, and each of the openssl~3.N packages had depends = so:libcrypto.so.3>=3.N, as then openssl~3.4 would not have been a valid candidate to apk add unless libcrypto3 was upgraded from 3.3 to at least 3.4.
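
A minimal sketch of the check this would enable, assuming a simplified version format (dotted numbers plus an optional -rN suffix). Real apk-tools version comparison handles more cases (suffixes such as _rc or _p), and the function names here are hypothetical, not part of apk or melange:

```python
import re


def parse_version(version):
    """Split '3.3.1-r4' into ((3, 3, 1), 4).

    Simplified assumption: dotted numeric components and an optional -rN
    suffix only. This is not the full apk version grammar.
    """
    match = re.fullmatch(r"(\d+(?:\.\d+)*)(?:-r(\d+))?", version)
    if not match:
        raise ValueError(f"unsupported version: {version!r}")
    components = tuple(int(part) for part in match.group(1).split("."))
    return components, int(match.group(2) or 0)


def dep_satisfied(depend, provides):
    """Return True if a 'name>=version' depend is met by any 'name=version' provide."""
    name, minimum = depend.split(">=")
    for provide in provides:
        provided_name, _, provided_version = provide.partition("=")
        if provided_name != name or not provided_version:
            continue  # different soname, or an unversioned provide
        if parse_version(provided_version) >= parse_version(minimum):
            return True
    return False


installed = ["so:libcrypto.so.3=3.3.1-r4"]
print(dep_satisfied("so:libcrypto.so.3>=3.4", installed))  # openssl~3.4 is rejected
print(dep_satisfied("so:libcrypto.so.3>=3.0", installed))  # openssl~3.0 is accepted
```

With the installed libcrypto3 at 3.3.1-r4, the first call prints False and the second prints True, which is the behaviour the session above lacked.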

Other distributions resolve this either by never changing API/ABI (i.e. a given release of ubuntu/debian/rhel/fedora does not upgrade openssl, which is not an option for a rolling distribution like Wolfi) or by using symbol-level versioning.

Symbol-level versioning relies on a library package maintainer to maintain a full list of symbols and when they were introduced, so that the package build process can generate accurate versioned dependencies. OpenSSL symbols are well maintained (https://salsa.debian.org/debian/openssl/-/blob/debian/unstable/debian/libssl3t64.symbols?ref_type=heads), but other libraries can be more complicated. See for example https://sources.debian.org/src/zlib/1%3A1.3.dfsg%2Breally1.3.1-1/debian/zlib-core.symbols/ where many symbols carry a symbol version that matches the minimum package version, many have no version assigned at all, and for some there is drift between what a given symbol is declared as and what the minimum required version actually is. This is highly accurate and flexible, but requires by-hand analysis and maintenance.
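
To illustrate how a symbols list turns into a dependency, here is a rough sketch. The library, symbol names, and versions are made up for illustration, and the function is hypothetical; it is not how dpkg-shlibdeps is implemented, only the idea behind it:

```python
def min_required_version(used_symbols, symbols_table):
    """Return the lowest library version that provides every used symbol.

    symbols_table maps symbol name -> version in which it was introduced.
    Versions are assumed to be plain dotted numbers for this sketch.
    """
    if not used_symbols:
        return None  # nothing imported, so no versioned dependency is needed
    introduced = [symbols_table[symbol] for symbol in used_symbols]
    # The binary needs the newest of the "introduced in" versions.
    return max(introduced, key=lambda v: tuple(int(part) for part in v.split(".")))


# Hypothetical symbols table for an imaginary libfoo.so.1
LIBFOO_SYMBOLS = {
    "foo_init": "1.0",
    "foo_read": "1.0",
    "foo_read_v2": "1.4",
}

print(min_required_version({"foo_init", "foo_read"}, LIBFOO_SYMBOLS))
print(min_required_version({"foo_init", "foo_read_v2"}, LIBFOO_SYMBOLS))
```

A binary that only uses the old symbols keeps a loose dependency, while one that picks up foo_read_v2 gets a tighter one. That is the accuracy the by-hand maintenance buys.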

A pragmatic middle ground could be to make all the so: provides versioned, and ensure all depends on so: are in the form >=${{the provider package full version}}. This means that packages like openssl and python-3.12-base would gain depend = so:libcrypto.so.3>=3.4.0-r2, or depend = so:libcrypto.so.3>=3.3.1-r2, and so on, depending on which version of the libcrypto3 package was present in the build environment.
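
A rough sketch of that generation step, assuming the build already knows which packages are installed in the build environment and which sonames each one ships. The data layout and function name are hypothetical and do not reflect melange's actual SCA code:

```python
def versioned_so_depends(needed_sonames, build_env):
    """Turn the sonames a binary needs into so: depends.

    build_env is a list of dicts with 'name', 'version' and 'sonames' keys
    (a hypothetical layout). A soname with exactly one provider in the build
    environment gets a '>=' constraint on that provider's full version;
    anything else falls back to today's unversioned form.
    """
    providers = {}
    for package in build_env:
        for soname in package["sonames"]:
            providers.setdefault(soname, []).append(package)

    depends = []
    for soname in sorted(needed_sonames):
        candidates = providers.get(soname, [])
        if len(candidates) == 1:
            depends.append(f"so:{soname}>={candidates[0]['version']}")
        else:
            # Zero or several providers: keep the unversioned dependency.
            depends.append(f"so:{soname}")
    return depends


build_env = [
    {"name": "libcrypto3", "version": "3.4.0-r2", "sonames": ["libcrypto.so.3"]},
    {"name": "libssl3", "version": "3.4.0-r2", "sonames": ["libssl.so.3"]},
]
print(versioned_so_depends({"libssl.so.3", "libcrypto.so.3"}, build_env))
```

Falling back to an unversioned depend when the provider is ambiguous is a choice made for this sketch, not something the proposal specifies.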

This is not unique to openssl; as the same is true for glibc, libgcc, and all other libraries with a history of multi-year stable ABI, backwards compatible, but not forwards compatible.

xnox avatar Nov 05 '24 23:11 xnox

Landing https://github.com/chainguard-dev/melange/pull/1622 first may make implementing this request easier.

xnox avatar Nov 06 '24 11:11 xnox

I realized there is a bit of a problem with this solution (I think) in the event that there are 2 packages that provide so:libfoo.so.1, which we probably at least need to think through.

If you build apk-tools with zlib-ng-dev in its environment, then it would get a dep on so:libz.so.1>=2.2.2 (zlib-ng's current version), but zlib is only going to provide so:libz.so.1 at 1.3.1. That would mean zlib is insufficient simply because its software version is lower than zlib-ng's. That seems fine in this example, but if zlib decided to start versioning with '2024.1' while still providing libz.so.1, then it would flip.
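
The flip can be shown with a small sketch, assuming plain dotted numeric versions; the helper names are hypothetical and the '2024.1' version is the hypothetical scheme from the paragraph above:

```python
def version_key(version):
    """Compare dotted numeric versions component by component (simplified)."""
    return tuple(int(part) for part in version.split("."))


def satisfies_min(provided, minimum):
    return version_key(provided) >= version_key(minimum)


requirement = "2.2.2"  # so:libz.so.1>=2.2.2, picked up from zlib-ng at build time

print(satisfies_min("1.3.1", requirement))   # zlib today: False
print(satisfies_min("2024.1", requirement))  # zlib with date-style versions: True
```

The comparison only looks at version numbers, so it cannot tell that the two providers use unrelated versioning schemes.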

smoser avatar Nov 13 '24 18:11 smoser

@xnox @smoser Could you both share more details of what is missing from this feature request?

hectorj2f avatar Jan 08 '25 22:01 hectorj2f

@smoser if there is more than one provider of a shared library, even today we must specify an explicit dep on the runtime package name.

For example, today we compile against udev but use eudev at runtime, which is wrong and almost works.

This is sort of why dpkg provides shlib deps and resolves sonames to real package names at build time; rather than having virtual dependencies on so:

Generating better version deps will help for the majority cases of a single provider of the runtime library.

It is true that for multiple providers neither the status quo nor this proposal addresses resolution, and we need something else (manual deps, or deps resolved to package name).

xnox avatar Jan 08 '25 22:01 xnox

We need to prioritize this one P0 in 25B. CC @kimsterv @juliandunn @johncslack

dustinkirkland avatar Mar 16 '25 00:03 dustinkirkland

I added the eng priorities project, prioritized as a 0.2, filled out the fields I could, assigned to OS team. I don't see it on the board yet - maybe there's a delay, but it should be there now! @johncslack @juliandunn based on whatever t-shirt size gets set for this, some of the other items prioritized for 25b will need to fall below the cut line now.

kimsterv avatar Mar 16 '25 16:03 kimsterv

This is basically a P0 because it is a remediation item from the curl incident... correct?

juliandunn avatar Mar 18 '25 01:03 juliandunn

An update here.

The first part of the request (having a versioned provides:) was relatively straightforward to implement (even though it took me some time to figure out where everything is in melange, since I'm new to the codebase).

The second part of the request (having a versioned depends:) is more involved. Even though we do have precise version information for all build dependencies inside .melange.yaml, we currently don't have a way to easily match a shared library with its corresponding package name. I have an idea on how to tackle that.

sergiodj avatar Mar 20 '25 20:03 sergiodj

This is basically a P0 because it is a remediation item from the curl incident... correct?

We are racy.

We do continuous publication of .apk files, and app.apk can always be published before libapp.apk, so this is always a problem. Normally nobody notices the seconds during which the APK registry is inconsistent.

Curl was an incident because there happened to be a CDN outage at the same time, which made it impossible for libapp.apk to publish and propagate everywhere.

The more users we have, the more people will notice the continuous exposure to broken & fixed packages, unless we have more strict relationships between various packages, even when they publish out of order (or fail to publish).

xnox avatar Mar 20 '25 20:03 xnox

Surely this is a race that can be solved on the server side with locks and atomic operations, flipping symlinks, no?

dustinkirkland avatar Mar 20 '25 23:03 dustinkirkland

Surely this is a race that can be solved on the server side with locks and atomic operations, flipping symlinks, no?

Most distributions indeed do atomic pushes: wait for uploads from all arches to finish, index them all, then update the package repo metadata. This is not currently how our publisher works: it takes .apk files one at a time and, upon acceptance, pushes each to the APKINDEX straight away (without waiting for the complete set of other .apk files from all the subpackages).
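
As a rough illustration of the atomic variant, here is an in-memory sketch. The class and method names are hypothetical and this is not how the actual publisher is built; it only shows the "stage everything, then swap the index once" idea:

```python
class AtomicIndex:
    """Toy package index that only exposes complete sets of packages."""

    def __init__(self):
        self.published = {}  # what clients see: package name -> version
        self._staged = {}    # build id -> {package name: version}

    def stage(self, build_id, package, version):
        """Accept an upload without making it visible to clients."""
        self._staged.setdefault(build_id, {})[package] = version

    def commit(self, build_id, expected_packages):
        """Publish a build only when every expected package has been staged."""
        staged = self._staged.get(build_id, {})
        if set(expected_packages) - set(staged):
            return False  # something is still missing; the index stays untouched
        new_index = dict(self.published)
        new_index.update(staged)
        self.published = new_index  # one swap, so clients never see a partial set
        del self._staged[build_id]
        return True


index = AtomicIndex()
expected = ["openssl", "libcrypto3", "libssl3"]
index.stage("build-1", "openssl", "3.4.0-r1")
print(index.commit("build-1", expected), index.published)
index.stage("build-1", "libcrypto3", "3.4.0-r1")
index.stage("build-1", "libssl3", "3.4.0-r1")
print(index.commit("build-1", expected), sorted(index.published))
```

The first commit is refused and leaves the visible index empty; the second one publishes all three packages together.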

There are also client side solutions to request and consider all packages only up to a consistent ceiling based on the build timestamp. I am working on a prototype that would solve all the SCA issues; as well as all the non-atomic publication; whilst simultaneously fixing this for all previously published containers ever. But it will take some time to cook. It will be a single statically linked binary that makes reasonable choices, taking all the information it has available into account.

xnox avatar Mar 20 '25 23:03 xnox

This work is temporarily blocked on some CI issues with golang versioning. @jonjohnsonjr and I talked about it last Friday, the plan is to fix the issues this week.

sergiodj avatar Mar 31 '25 20:03 sergiodj

I've made good progress on this and https://github.com/chainguard-dev/melange/pull/1871 contains a fully working version of the approach I'd like to implement. However, there is still one possible issue that I'd like to discuss. The following comment contains a summary:

https://github.com/chainguard-dev/melange/pull/1871#issuecomment-2770948164

I would like other people's feedback on this potential issue before moving forward with the PR.

sergiodj avatar Apr 02 '25 15:04 sergiodj

This is blocked for now, waiting on more feedback from @xnox.

sergiodj avatar Apr 07 '25 15:04 sergiodj

It would help a lot if libcrypto3 package had provides = so:libcrypto.so.3=3.3.1-r4, and for each of openssl~3.N packages had depends = so:libcrypto.so.3>=3.N as then openssl~3.4 would have not been a valid candidate to apk add, unless libcrypto3 was upgraded from 3.3. to at least 3.4. ... A pragmatic middle ground could be to update all the so: provides to be versioned; and ensure all depends on so: to be in the form >=${{the provider package full version}}. meaning such that packages like openssl and python-3.12-base gain depend = so:libcrypto.so.3>=3.4.0-r2 and depend = so:libcrypto.so.3>=3.3.1-r2 and so on, depending on which version of libcrypto3 package was present in the build environment.

How should this work in cases where we have multiple packages providing the same so:? For example with libcurl-openssl4 and libcurl-rustls4, we generally keep the package versions in sync, but the epoch versions drift. You'd exclude one or the other unless the versions happen to be identical.

We could find the longest common version prefix between them? Or are we okay with always going with whatever was installed in the build environment? Do you see this as a problem?
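
For reference, the "longest common version prefix" idea could look roughly like this. The function name is hypothetical and the example versions are made up; it compares whole components rather than characters so that 8.12 and 8.13 do not share an "8.1" prefix:

```python
import re


def common_version_prefix(version_a, version_b):
    """Return the longest shared prefix of two versions, by whole components."""
    # Split on '.' and '-' but keep the separators so the prefix can be rejoined.
    tokens_a = re.split(r"([.-])", version_a)
    tokens_b = re.split(r"([.-])", version_b)
    shared = []
    for token_a, token_b in zip(tokens_a, tokens_b):
        if token_a != token_b:
            break
        shared.append(token_a)
    # Drop a dangling separator left at the end of the prefix.
    while shared and shared[-1] in (".", "-"):
        shared.pop()
    return "".join(shared)


# Made-up versions: same upstream release, different epochs.
print(common_version_prefix("8.12.1-r3", "8.12.1-r5"))
print(common_version_prefix("8.12.1-r3", "8.13.0-r0"))
```

The first call yields 8.12.1 and the second yields 8, which shows the trade-off: the constraint gets looser the more the two providers drift apart.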

jonjohnsonjr avatar Jul 07 '25 22:07 jonjohnsonjr

I am happy to ignore libcurl for this, as a non-issue for now. We don't provide a dev variant of the rustls backend, meaning one cannot compile against the rustls backend today, so it is unlikely to be in the build environment. Our provider priorities also prefer the openssl one at runtime. Separately, we previously had the rustls backend built separately due to the introduction of a build cycle, which we no longer enforce. I thus ponder whether we can and should merge the two sources into one anyway.

xnox avatar Jul 07 '25 22:07 xnox

In general, this proposal does not address how to change between two implementations of the same soname ABI that use different versioning schemes. For example, switching from zlib to zlib-ng or cloudflare-zlib, from glibc libcrypt to libxcrypt, from ffmpeg to ffmpeg nvidia, and so on.

And I don't think there is a generic way to resolve that. If we want to have multiple providers of a soname ABI, we need to manage which one is used some other way: with our own virtual provides and provider priorities, like we do for libxcrypt and libcurl today.

And even possibly use transitional packages, and package withdrawals to manage transitions.

xnox avatar Jul 07 '25 23:07 xnox