bazeldnf

Don't write duplicate entries e.g. symlinks to tar file

Open fionera opened this issue 11 months ago • 6 comments

Currently an rpm2tar result can contain multiple entries for the same target path. Because of this, it isn't possible to build e.g. an oci_layer out of it. The symlinks should be added to the collector's map of written files, and the collector should skip any write for these paths.
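One way to confirm the symptom is to list the paths that occur more than once in a tar. The following is a minimal Python sketch, not part of bazeldnf: `duplicate_members` and `build_example_tar` are hypothetical helper names, and the in-memory archive only imitates the situation described above (a symlink and a regular file both at usr/bin/ld).

```python
import collections
import io
import tarfile


def duplicate_members(fileobj):
    """Return {member name: count} for every name that appears more than once."""
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        counts = collections.Counter(member.name for member in tar)
    return {name: n for name, n in counts.items() if n > 1}


def build_example_tar():
    """Build an in-memory tar with a symlink and a regular file at the same path."""
    buf = io.BytesIO()
    payload = b"not a real binary"
    with tarfile.open(fileobj=buf, mode="w") as tar:
        # Explicit symlink, as requested through the symlinks attribute.
        link = tarfile.TarInfo("usr/bin/ld")
        link.type = tarfile.SYMTYPE
        link.linkname = "/usr/bin/ld.bfd"
        tar.addfile(link)

        # Regular file at the same path, as shipped by one of the rpms.
        regular = tarfile.TarInfo("usr/bin/ld")
        regular.size = len(payload)
        tar.addfile(regular, io.BytesIO(payload))

        # An unrelated file that must not be reported.
        other = tarfile.TarInfo("usr/bin/ld.bfd")
        other.size = len(payload)
        tar.addfile(other, io.BytesIO(payload))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(duplicate_members(build_example_tar()))
```

To check a real archive, pass an opened file instead, e.g. `duplicate_members(open("sandbox.tar", "rb"))`, where the file name is whatever your build produces.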

fionera avatar Jan 15 '25 01:01 fionera

Do you have a reproducer?

kellyma2 avatar Jan 15 '25 01:01 kellyma2

load("@bazeldnf//:deps.bzl", "rpmtree")
load("@rules_oci//oci:defs.bzl", "oci_image", "oci_load", "oci_push")

rpmtree(
    name = "sandbox",
    rpms = [
        "@binutils-0__2.41-38.fc40.x86_64//rpm",
        "@binutils-gold-0__2.41-38.fc40.x86_64//rpm",
    ],
    symlinks = {
        "/usr/bin/ld": "/usr/bin/ld.bfd",
    },
    visibility = ["//visibility:public"],
)
oci_image(
    name = "sandbox_image",
    base = "@distroless_base",
    entrypoint = [],
    tars = [
        ":sandbox",
    ],
    visibility = ["//visibility:private"],
    workdir = "/root",
)
oci_load(
    name = "load",
    image = ":sandbox_image",
    repo_tags = ["foo"],
)

    rpm(
        name = "binutils-0__2.41-38.fc40.x86_64",
        sha256 = "5dba5e8826c29a4b4d55fb506c9b6f929ded1e73259fce26630cf13f1f4d5715",
        urls = [
            "https://dl.fedoraproject.org/pub/fedora/linux/updates/40/Everything/x86_64/Packages/b/binutils-2.41-38.fc40.x86_64.rpm",
        ],
    )
    rpm(
        name = "binutils-gold-0__2.41-38.fc40.x86_64",
        sha256 = "02962db175354365a447c0cfd56c7f4902359dee3a4b302c9d123799b840f218",
        urls = [
            "https://dl.fedoraproject.org/pub/fedora/linux/updates/40/Everything/x86_64/Packages/b/binutils-gold-2.41-38.fc40.x86_64.rpm",
        ],
    )

fionera avatar Jan 15 '25 01:01 fionera

So the overlap is in usr/lib/.build-id, which I think we should ignore anyway. How does the error manifest? Which bazel calls lead to the issue? Could you make a repro repo so we can test things?

manuelnaranjo avatar Jan 15 '25 11:01 manuelnaranjo

rpm2tar writes its symlinks ( https://github.com/rmohr/bazeldnf/blob/main/cmd/rpm2tar.go#L64 ) before the actual files from all rpms. But because the collector doesn't know that the symlinks have been written (they are added directly to the tarWriter), the original file is added as well, e.g. /usr/bin/ld.

If you try to run the :load target, your Docker daemon will complain that there are multiple entries.
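The behaviour I'd expect can be sketched like this. It is an illustration in Python only, not the Go code in rpm2tar.go: `DedupingWriter`, `normalize` and `symlink_entry` are hypothetical names. The idea is that the symlinks go through the same bookkeeping as the rpm contents, so a later regular file at an already written path is skipped.

```python
import io
import posixpath
import tarfile


def normalize(path):
    """Map 'usr/bin/ld', './usr/bin/ld' and '/usr/bin/ld' to the same key."""
    return posixpath.normpath(path).lstrip("/")


def symlink_entry(name, target):
    """Create a tar header for a symlink at `name` pointing to `target`."""
    info = tarfile.TarInfo(normalize(name))
    info.type = tarfile.SYMTYPE
    info.linkname = target
    return info


class DedupingWriter:
    """Wrap a TarFile and remember every path that has been written."""

    def __init__(self, tar):
        self._tar = tar
        self._written = set()

    def add(self, info, data=None):
        """Write the entry unless its path was written before; return True if written."""
        key = normalize(info.name)
        if key in self._written:
            return False  # first writer wins, e.g. the explicit symlink
        self._written.add(key)
        info.name = key
        if data is not None:
            info.size = len(data)
            self._tar.addfile(info, io.BytesIO(data))
        else:
            self._tar.addfile(info)
        return True


if __name__ == "__main__":
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        writer = DedupingWriter(tar)
        # Explicit symlinks first, through the same writer as everything else.
        writer.add(symlink_entry("/usr/bin/ld", "/usr/bin/ld.bfd"))
        # Files from the rpms afterwards; the duplicate path is skipped.
        writer.add(tarfile.TarInfo("./usr/bin/ld"), b"rpm payload")
        writer.add(tarfile.TarInfo("./usr/bin/ld.bfd"), b"rpm payload")
```

Letting the first writer win matches the current order (symlinks before rpm contents); whether the skipped file should produce a warning is a separate question.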

fionera avatar Jan 15 '25 11:01 fionera

So ld comes from the symlinks you're passing into the rpmtree call and from one of the rpms (not from both). The only overlapping file is in the path I mentioned; I opened both rpms manually. Maybe for the symlink you should add a tar that creates the symlink in another layer, although you will still be wasting space in the image because the ld binary is still there. To work on a fix, I would say the first thing we need is either a repro repository or an e2e test like the other ones we have.

manuelnaranjo avatar Jan 15 '25 11:01 manuelnaranjo

So, I think we cover files; we fixed that as part of https://github.com/rmohr/bazeldnf/pull/49. It seems we need to pass the explicit symlinks in to the collector right away. #136 is probably enough, but we need to add a test.

rmohr avatar May 30 '25 05:05 rmohr