syft
Add ability to see the first location a package was added
Adds a `squashed-with-all-layers` resolver, which behaves like the squashed resolver but additionally returns every instance of a path found in the other layers. Combined with further changes that record the layer index directly in locations, this makes it possible to determine the first location where a package was introduced.
For example:
# Dockerfile for test:latest
FROM alpine:latest
RUN apk add wget
RUN apk add curl
When running syft...
$ syft -o json -s squashed-with-all-layers test:latest -vvv
...
[0000] DEBUG discovered 58 packages cataloger=apkdb-cataloger
[0000] DEBUG found path duplicate of /lib/ld-musl-x86_64.so.1
[0000] DEBUG found path duplicate of /usr/share/apk/keys/[email protected]
[0000] DEBUG found path duplicate of /usr/share/apk/keys/[email protected]
[0000] DEBUG found path duplicate of /usr/share/apk/keys/[email protected]
[0000] DEBUG found path duplicate of /usr/share/apk/keys/[email protected]
...
[0000] TRACE merging similar packages id=291d1267b40d636f purl=pkg:apk/alpine/[email protected]?arch=x86_64&upstream=alpine-baselayout&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=d9700f02cf26e8b8 purl=pkg:apk/alpine/[email protected]?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=623d53216342d45e purl=pkg:apk/alpine/[email protected]?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=256fc96b4a8c4da8 purl=pkg:apk/alpine/[email protected]?arch=x86_64&upstream=busybox&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=92b19c7750fb559d purl=pkg:apk/alpine/[email protected]?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=2b5e23d349b556cf purl=pkg:apk/alpine/[email protected]?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=b805d823ae624f04 purl=pkg:apk/alpine/ca-certificates-bundle@20220614-r4?arch=x86_64&upstream=ca-certificates&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=d3084c788891fb28 purl=pkg:apk/alpine/[email protected]?arch=x86_64&upstream=openssl&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=2a95f0251fba7a33 purl=pkg:apk/alpine/[email protected]?arch=x86_64&upstream=openssl&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=b15247aafcd4a647 purl=pkg:apk/alpine/[email protected]?arch=x86_64&upstream=busybox&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=94014313cfcd2b71 purl=pkg:apk/alpine/zlib@1.2.13-r0?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=e5f757b0df1f62bc purl=pkg:apk/alpine/[email protected]?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=e903138d19e85b80 purl=pkg:apk/alpine/[email protected]?arch=x86_64&upstream=pax-utils&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=f71ecf5267e6c37b purl=pkg:apk/alpine/[email protected]?arch=x86_64&upstream=musl&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=8126b232e2d3c608 purl=pkg:apk/alpine/[email protected]?arch=x86_64&upstream=libc-dev&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=291d1267b40d636f purl=pkg:apk/alpine/[email protected]?arch=x86_64&upstream=alpine-baselayout&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=d9700f02cf26e8b8 purl=pkg:apk/alpine/[email protected]?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=623d53216342d45e purl=pkg:apk/alpine/[email protected]?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=256fc96b4a8c4da8 purl=pkg:apk/alpine/[email protected]?arch=x86_64&upstream=busybox&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=92b19c7750fb559d purl=pkg:apk/alpine/[email protected]?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=2b5e23d349b556cf purl=pkg:apk/alpine/[email protected]?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=b805d823ae624f04 purl=pkg:apk/alpine/ca-certificates-bundle@20220614-r4?arch=x86_64&upstream=ca-certificates&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=d3084c788891fb28 purl=pkg:apk/alpine/[email protected]?arch=x86_64&upstream=openssl&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=2a95f0251fba7a33 purl=pkg:apk/alpine/[email protected]?arch=x86_64&upstream=openssl&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=b15247aafcd4a647 purl=pkg:apk/alpine/[email protected]?arch=x86_64&upstream=busybox&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=94014313cfcd2b71 purl=pkg:apk/alpine/zlib@1.2.13-r0?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=e5f757b0df1f62bc purl=pkg:apk/alpine/[email protected]?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=e903138d19e85b80 purl=pkg:apk/alpine/[email protected]?arch=x86_64&upstream=pax-utils&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=f71ecf5267e6c37b purl=pkg:apk/alpine/[email protected]?arch=x86_64&upstream=musl&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=8126b232e2d3c608 purl=pkg:apk/alpine/[email protected]?arch=x86_64&upstream=libc-dev&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=58d60d9b7d1565f1 purl=pkg:apk/alpine/[email protected]?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=3841a3199a1ee118 purl=pkg:apk/alpine/[email protected]?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=e40c4f862e3949e8 purl=pkg:apk/alpine/[email protected]?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=971b42d7909ea972 purl=pkg:apk/alpine/[email protected]?arch=x86_64&distro=alpine-3.17.3
# proceeds to output 25 packages, not 58
You'll see merged location elements for each package:
{
"id": "94014313cfcd2b71",
"name": "zlib",
"version": "1.2.13-r0",
"type": "apk",
"foundBy": "apkdb-cataloger",
"locations": [
{
"path": "/lib/apk/db/installed",
"layerID": "sha256:0d71e44edab1e63f802dfd59cbf8c128c4f89f2ae3c4edb79475678dcedb5bff"
},
{
"path": "/lib/apk/db/installed",
"layerID": "sha256:a2ea955c0abfa7fb734e0991ef02fb4e4f35e8090ae76cd6f14dc58d037fa23e"
},
{
"path": "/lib/apk/db/installed",
"layerID": "sha256:f1417ff83b319fbdae6dd9cd6d8c9c88002dcd75ecf6ec201c8c6894681cf2b5"
}
],
"licenses": [
"Zlib"
],
"language": "",
"cpes": [
"cpe:2.3:a:zlib:zlib:1.2.13-r0:*:*:*:*:*:*:*"
],
"purl": "pkg:apk/alpine/zlib@1.2.13-r0?arch=x86_64&distro=alpine-3.17.3",
...
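Note that the `locations` above appear in lexical order of `layerID`, not in layer order, so the first entry is not necessarily the layer that introduced the package. A minimal sketch of how a consumer of this JSON might find the introducing layer, assuming it also has the image's layer digests ordered base-first (for example from the image manifest). The `Location` type and `firstIntroduced` function are hypothetical names for illustration, not syft's API, and the digests in `main` are placeholders:

```go
package main

import "fmt"

// Location mirrors the two fields shown in the JSON output above.
// Hypothetical type for illustration; not syft's own Location type.
type Location struct {
	Path    string `json:"path"`
	LayerID string `json:"layerID"`
}

// firstIntroduced returns the location whose layer appears earliest in
// layerOrder (base layer first). ok is false when none of the locations
// belongs to a known layer.
func firstIntroduced(locs []Location, layerOrder []string) (first Location, ok bool) {
	index := make(map[string]int, len(layerOrder))
	for i, id := range layerOrder {
		index[id] = i
	}
	best := -1
	for _, l := range locs {
		i, known := index[l.LayerID]
		if !known {
			continue // location from a layer we know nothing about
		}
		if best == -1 || i < best {
			best, first, ok = i, l, true
		}
	}
	return first, ok
}

func main() {
	// Placeholder digests; the real order must come from the image itself.
	layerOrder := []string{"sha256:base", "sha256:wget", "sha256:curl"}
	locs := []Location{
		{Path: "/lib/apk/db/installed", LayerID: "sha256:curl"},
		{Path: "/lib/apk/db/installed", LayerID: "sha256:base"},
		{Path: "/lib/apk/db/installed", LayerID: "sha256:wget"},
	}
	if first, ok := firstIntroduced(locs, layerOrder); ok {
		fmt.Printf("first seen in layer %s at %s\n", first.LayerID, first.Path)
	}
}
```

If the layer index ends up being stored directly on each location, the lookup table becomes unnecessary and the minimum index can be taken directly.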
TODO:
- [ ] add tests 🧛 🩸
- [ ] add layer index to location?
- [ ] sort slice from location set not lexically, but by layer order.
- [ ] there are a lot of "found path duplicate of" log entries, which hints at an issue with relationship creation for these duplicate packages.
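The sorting item in the TODO list could be approached roughly as follows. This is only a sketch under stated assumptions: `Location` and `sortByLayerOrder` are hypothetical names rather than syft's types, and the layer order (base layer first) is assumed to be available as a slice of layer IDs.

```go
package main

import (
	"fmt"
	"sort"
)

// Location is a hypothetical stand-in for a package location.
type Location struct {
	Path    string
	LayerID string
}

// sortByLayerOrder sorts locs in place by the position of each location's
// layer in layerOrder (base layer first). Locations from unknown layers sort
// last; ties within a layer are broken by path so the result is deterministic.
func sortByLayerOrder(locs []Location, layerOrder []string) {
	index := make(map[string]int, len(layerOrder))
	for i, id := range layerOrder {
		index[id] = i
	}
	rank := func(l Location) int {
		if i, ok := index[l.LayerID]; ok {
			return i
		}
		return len(layerOrder) // unknown layers go to the end
	}
	sort.SliceStable(locs, func(a, b int) bool {
		ra, rb := rank(locs[a]), rank(locs[b])
		if ra != rb {
			return ra < rb
		}
		return locs[a].Path < locs[b].Path
	})
}

func main() {
	// Placeholder layer IDs, listed base layer first.
	layerOrder := []string{"sha256:base", "sha256:wget", "sha256:curl"}
	locs := []Location{
		{Path: "/lib/apk/db/installed", LayerID: "sha256:curl"},
		{Path: "/lib/apk/db/installed", LayerID: "sha256:base"},
		{Path: "/lib/apk/db/installed", LayerID: "sha256:wget"},
	}
	sortByLayerOrder(locs, layerOrder)
	for _, l := range locs {
		fmt.Println(l.LayerID, l.Path)
	}
}
```

With locations ordered this way, the first element of the slice is the earliest layer in which the path was seen.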
Open question:
- Should we omit packages for certain ecosystems that have been found in previous layers but are known to be the same? E.g. deb/apk/rpm packages are in a single DB, so adding any new package will make the previously installed packages look like they've been installed again, which isn't what's happening here.
Problems:
- This will report packages that were removed in a later layer and are not logically in the squashed representation (introducing false positives relative to the squashed representation).
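One conceivable way to avoid these false positives is to treat the squashed view as the source of truth for which packages exist, and use the all-layers results only to enrich locations. The sketch below illustrates that idea and is not what this PR implements. The `Package` type, the `key` function, and `keepSquashedOnly` are hypothetical, and the package data in `main` is made up for illustration.

```go
package main

import "fmt"

// Package is a hypothetical, simplified package record.
type Package struct {
	Name    string
	Version string
	Type    string
	// LayerIDs lists every layer in which the package was found.
	LayerIDs []string
}

// key identifies a package by type, name, and version (an assumed identity
// rule; a real implementation may need a stricter one).
func key(p Package) string {
	return p.Type + "/" + p.Name + "@" + p.Version
}

// keepSquashedOnly returns the packages from the all-layers scan that are
// also present in the squashed scan, dropping packages that were removed
// in a later layer.
func keepSquashedOnly(allLayers, squashed []Package) []Package {
	visible := make(map[string]struct{}, len(squashed))
	for _, p := range squashed {
		visible[key(p)] = struct{}{}
	}
	var kept []Package
	for _, p := range allLayers {
		if _, ok := visible[key(p)]; ok {
			kept = append(kept, p)
		}
	}
	return kept
}

func main() {
	// Made-up data: "removed-tool" was installed in one layer and deleted later.
	allLayers := []Package{
		{Name: "curl", Version: "1.0", Type: "apk", LayerIDs: []string{"sha256:l1", "sha256:l2"}},
		{Name: "removed-tool", Version: "1.0", Type: "apk", LayerIDs: []string{"sha256:l2"}},
	}
	squashed := []Package{
		{Name: "curl", Version: "1.0", Type: "apk"},
	}
	for _, p := range keepSquashedOnly(allLayers, squashed) {
		fmt.Println(p.Name, p.LayerIDs)
	}
}
```

The trade-off is that this needs two passes over the image (or one pass that tracks visibility per location), which is why it is only offered here as a possible direction.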
Closes #435
Benchmark Test Results
Benchmark results from the latest changes vs base branch
goos: linux
goarch: amd64
pkg: github.com/anchore/syft/test/integration
cpu: Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
                                                           │ ./.tmp/benchmark-14e8cb4.txt │
                                                           │            sec/op            │
ImagePackageCatalogers/alpmdb-cataloger-2                    11.80m ± 24%
ImagePackageCatalogers/ruby-gemspec-cataloger-2              856.1µ ± 2%
ImagePackageCatalogers/python-package-cataloger-2            3.097m ± 1%
ImagePackageCatalogers/php-composer-installed-cataloger-2    695.8µ ± 1%
ImagePackageCatalogers/javascript-package-cataloger-2        356.7µ ± 2%
ImagePackageCatalogers/dpkgdb-cataloger-2                    511.1µ ± 1%
ImagePackageCatalogers/rpm-db-cataloger-2                    491.1µ ± 3%
ImagePackageCatalogers/java-cataloger-2                      10.73m ± 1%
ImagePackageCatalogers/graalvm-native-image-cataloger-2      8.390µ ± 2%
ImagePackageCatalogers/apkdb-cataloger-2                     556.0µ ± 0%
ImagePackageCatalogers/go-module-binary-cataloger-2          18.95µ ± 2%
ImagePackageCatalogers/dotnet-deps-cataloger-2               981.6µ ± 1%
ImagePackageCatalogers/portage-cataloger-2                   344.5µ ± 1%
ImagePackageCatalogers/nix-store-cataloger-2                 222.9µ ± 2%
ImagePackageCatalogers/sbom-cataloger-2                      110.8µ ± 0%
ImagePackageCatalogers/binary-cataloger-2                    190.1µ ± 0%
geomean                                                      451.0µ

                                                           │ ./.tmp/benchmark-14e8cb4.txt │
                                                           │             B/op             │
ImagePackageCatalogers/alpmdb-cataloger-2                    5.064Mi ± 0%
ImagePackageCatalogers/ruby-gemspec-cataloger-2              123.8Ki ± 0%
ImagePackageCatalogers/python-package-cataloger-2            947.4Ki ± 0%
ImagePackageCatalogers/php-composer-installed-cataloger-2    155.8Ki ± 0%
ImagePackageCatalogers/javascript-package-cataloger-2        90.79Ki ± 0%
ImagePackageCatalogers/dpkgdb-cataloger-2                    144.6Ki ± 0%
ImagePackageCatalogers/rpm-db-cataloger-2                    170.2Ki ± 0%
ImagePackageCatalogers/java-cataloger-2                      2.720Mi ± 0%
ImagePackageCatalogers/graalvm-native-image-cataloger-2      1.555Ki ± 0%
ImagePackageCatalogers/apkdb-cataloger-2                     129.2Ki ± 0%
ImagePackageCatalogers/go-module-binary-cataloger-2          3.133Ki ± 0%
ImagePackageCatalogers/dotnet-deps-cataloger-2               314.5Ki ± 0%
ImagePackageCatalogers/portage-cataloger-2                   77.23Ki ± 0%
ImagePackageCatalogers/nix-store-cataloger-2                 36.07Ki ± 0%
ImagePackageCatalogers/sbom-cataloger-2                      13.57Ki ± 0%
ImagePackageCatalogers/binary-cataloger-2                    29.91Ki ± 0%
geomean                                                      101.7Ki

                                                           │ ./.tmp/benchmark-14e8cb4.txt │
                                                           │          allocs/op           │
ImagePackageCatalogers/alpmdb-cataloger-2                    86.71k ± 0%
ImagePackageCatalogers/ruby-gemspec-cataloger-2              2.049k ± 0%
ImagePackageCatalogers/python-package-cataloger-2            15.49k ± 0%
ImagePackageCatalogers/php-composer-installed-cataloger-2    3.457k ± 0%
ImagePackageCatalogers/javascript-package-cataloger-2        1.205k ± 0%
ImagePackageCatalogers/dpkgdb-cataloger-2                    2.646k ± 0%
ImagePackageCatalogers/rpm-db-cataloger-2                    3.759k ± 0%
ImagePackageCatalogers/java-cataloger-2                      38.26k ± 0%
ImagePackageCatalogers/graalvm-native-image-cataloger-2      40.00 ± 0%
ImagePackageCatalogers/apkdb-cataloger-2                     3.438k ± 0%
ImagePackageCatalogers/go-module-binary-cataloger-2          101.0 ± 0%
ImagePackageCatalogers/dotnet-deps-cataloger-2               5.011k ± 0%
ImagePackageCatalogers/portage-cataloger-2                   1.539k ± 0%
ImagePackageCatalogers/nix-store-cataloger-2                 671.0 ± 0%
ImagePackageCatalogers/sbom-cataloger-2                      392.0 ± 0%
ImagePackageCatalogers/binary-cataloger-2                    872.0 ± 0%
geomean                                                      2.062k
May I know why this PR has not been merged? It's extremely helpful for deduplicating components across layers.
Do you have an ETA for this addition? It would be helpful.
Hi @tomerse-sg and @Deep232, thanks for the notes, we don't have an ETA but we will take a look and see if we can move this forward. Thank you for letting us know this would be useful for you!
Can you please elaborate on the problem you described in the PR description? What will the difference be between all-layers and this mode in the case of deleted packages? @tgerla @wagoodman
I ran a test on a golang 1.14 image using all-layers and squashed-with-all-layers and didn't see any difference between the JSON outputs. Can you please elaborate on how we plan to mark packages that don't exist in the squashed representation?
Another question: this PR seems to be based on syft 0.76.0. Would it be possible to contribute a new PR aligned with the newest syft?
Another thing: I think I've found a bug. I created this Dockerfile:
# Use the alpine base image
FROM alpine:latest
# Install curl
RUN apk add --no-cache curl
# Copy the file test.txt to the container
COPY test.txt /test.txt
# Install jq, then remove it in a separate layer
RUN apk add --no-cache jq
RUN apk del jq
# Install and remove jq a second time
RUN apk add --no-cache jq
RUN apk del jq
# Set a default command for the container
CMD ["sh"]
When I scan it I do see "jq", which I expected not to see. Otherwise there is no difference between all-layers and squashed-with-all-layers.
I think this solves the problem of the deleted package: https://github.com/anchore/syft/pull/3138. I opened a new PR since a lot has changed in syft. Let me know how to proceed further; this feature is useful :)