cosign icon indicating copy to clipboard operation
cosign copied to clipboard

Sanitize cosign's dependency tree

Open imjasonh opened this issue 3 years ago • 22 comments

Problem

Cosign's module dependencies are pretty heavy.

Its go.mod currently transitively depends on k8s.io/client-go, the AWS SDK, the Azure SDK, a GCP cloud storage API client, Docker internals, fsnotify, OpenTelemetry, a Kubernetes controller framework, a GitHub API client, cuelang, and many more.

This can make it difficult to consume for clients that don't want all that functionality, and may just want to sign some bytes using a key, push the resulting image to an OCI registry, and upload data to Rekor with certs via Fulcio.

For example, sget, which does relatively little compared to cosign -- it verifies signatures on an image and fetches a blob -- has a go.mod file that's 280 lines, producing a 72 MB binary 🏋️ .

Large dependency trees lead to larger built binaries (see above), larger attack surfaces, longer and more fragile builds, and (most interesting to me personally as a maintainer!) more toil for maintainers to update dependencies and sift through security alerts for those vulnerabilities.

Possible Solutions

  • Moving more dependencies into sigstore/sigstore (https://github.com/sigstore/sigstore/issues/81), and taking that opportunity to reconsider if any dependencies can be replaced by lighter-weight alternatives
  • Taking better advantage of Go interfaces, especially for KMS services, and moving implementations of those interfaces into separate Go modules or separate repositories
  • Splitting the cosigned webhook into a separate Go module from the cosign CLI, possibly even moving it to a separate repo

The cosign CLI would likely not get any smaller as a result of this work -- it still naturally depends on all the KMSes, K8s, cloud SDKs -- but other clients that want to do cosign-like things (signing, and especially verifying) wouldn't need to.

cc @luhring @jonjohnsonjr

imjasonh avatar Feb 15 '22 18:02 imjasonh

  • Splitting the cosigned webhook into a separate Go module from the cosign CLI, possibly even moving it to a separate repo

I don't think this will really help, but I might be misunderstanding the exact issue. The k8s libraries shouldn't show up in a downstream binary or dependency tree unless they're actually used.

dlorenc avatar Feb 15 '22 18:02 dlorenc

I don't think this will really help, but I might be misunderstanding the exact issue. The k8s libraries shouldn't show up in a downstream binary or dependency tree unless they're actually used.

I mentioned this since the cosign CLI depends on k8s.io/client-go just to get/create/update secrets, when it could shell out to kubectl instead (introducing a different set of problems, understandibly). But that wouldn't help anyway, because the Go module it shares with the cosigned webhook necessarily and correctly depends on knative/pkg, k8s.io/client-go, etc.

Go might be smart enough to trim that dependency from the binary, but it still incurs maintenance cost for cosign CLI and SDK maintainers who shouldn't care about K8s' own large dependency graph, security alerts, dependabots, etc.

imjasonh avatar Feb 15 '22 18:02 imjasonh

it still incurs maintenance cost for cosign CLI and SDK maintainers who shouldn't care about K8s' own large dependency graph, security alerts, dependabots, etc.

💯 We're working on integrating some Cosign functionality into Syft, and it looks like we need k8s.io/api via Cosign:

$ go mod why -m k8s.io/api
# k8s.io/api
github.com/anchore/syft/cmd
github.com/sigstore/cosign/cmd/cosign/cli/sign
github.com/sigstore/cosign/pkg/signature
github.com/sigstore/cosign/pkg/cosign/kubernetes
k8s.io/api/core/v1

(from Syft at ec1cceb)

It could be that there's a better way to structure the code to avoid these dependencies, but it's not clear to me how. I think if the "package graph" extends into a given module, that module is needed during build.

luhring avatar Feb 15 '22 19:02 luhring

💯 We're working on integrating some Cosign functionality into Syft, and it looks like we need k8s.io/api via Cosign:

This is correct - today cosign does depend on k8s/api directly for the generate-key-pair command which can create a secret in k8s. Moving this to use a smaller library or shell out to kubectl would solve that problem.

It would not remove k8s from the go.mod file though, because the cosigned webhook still imports that stuff.

I think there are a few different things we're trying to optimize for here:

  • As a user of cosign, I want the binary to be small
  • As a library author that imports cosign code, I want fine-grained control over what packages get pulled into my dependency tree
  • As a cosign maintainer, I want a small go.mod file

I don't think moving the cosigned webhook somewhere else helps with any of these, it just shifts the problem around and means there's a separate repo and go.mod file maintainers need to keep up to date.

dlorenc avatar Feb 15 '22 19:02 dlorenc

I don't think moving the cosigned webhook somewhere else helps with any of these, it just shifts the problem around and means there's a separate repo and go.mod file maintainers need to keep up to date.

It helps me as library author and cosign maintainer, since it shifts the problem somewhere where I'm less (or not) involved. It does make dependency maintenance slightly more work for cosigned maintainers, but there are always going to be many fewer of those than there will be consumers of a cosign SDK.

I imagine the following "hierarchy of needs" for consumers of cosign logic:

  • sign some bytes, get some bytes (sigstore/sigstore)
  • sign some bytes, get a container image I can push (sigstore? cosign SDK?)
  • sign some bytes using a key in KMS, get a container image (cosign SDK? w/ pluggable KMS impls?)
  • verify some bytes using a key (sigstore/sigstore)
  • verify a container image's signatures using a key in KMS (cosign SDK?)
  • verify a container image's signatures using a key in KMS, in kubernetes (cosigned)

Today, if I want to do anything with container images, my go.mod picks up deps to serve all of the above.

imjasonh avatar Feb 15 '22 19:02 imjasonh

Today, if I want to do anything with container images, my go.mod picks up deps to serve all of the above.

I don't think this part is necessarily true, but that might be what I'm missing. A downstream client should be able to import code from cosign without introducing all the dependencies to their own go.mod.

dlorenc avatar Feb 15 '22 19:02 dlorenc

For KMSes it might be as simple as moving KMS provider registration out of pkg/signature/kms, into each impl's package, so you can selectively depend on only those KMSes you actually want, and depend on none of them if you want none of them:

https://github.com/sigstore/sigstore/blob/v1.1.0/pkg/signature/kms/kms.go#L31-L44

imjasonh avatar Feb 15 '22 19:02 imjasonh

Also related https://github.com/sigstore/cosign/issues/651

sambhav avatar Feb 15 '22 20:02 sambhav

Today, if I want to do anything with container images, my go.mod picks up deps to serve all of the above.

I don't think this part is necessarily true, but that might be what I'm missing. A downstream client should be able to import code from cosign without introducing all the dependencies to their own go.mod.

I think this is right, in that this doesn't have to be true, but I think we're seeing that it happens to be true today with the current code. (Maybe not "all" deps, but more deps than are actually needed by the downstream consumer.)

Taking better advantage of Go interfaces, especially for KMS services, and moving implementations of those interfaces into separate Go modules or separate repositories

This might be worth exploring first, since it's presumably less drastic of a change than introducing a new repo / Go module.

My understanding of Go's build approach is that it looks at the call graph to determine which packages are needed, and can then determine which modules are needed.

@imjasonh's example is really interesting. init() is tricky because it can effectively start its own call graph, which means potentially traversing more packages and modules, and potentially without the consumer having any practical use for that additional code.

Also, this might be overly pedantic, but what I'm looking to minimize (as a lib consumer) isn't go.mod entries in my project, but the number of modules in the "build list" in my project's binary (e.g. Syft).

luhring avatar Feb 16 '22 12:02 luhring

Also, this might be overly pedantic, but what I'm looking to minimize (as a lib consumer) isn't go.mod entries in my project, but the number of modules in the "build list" in my project's binary (e.g. Syft).

Bingo! This is what we can control without needing to split things up across repos.

dlorenc avatar Feb 16 '22 12:02 dlorenc

Shall we set up a dedicated meeting for this topic? I think we can start with getting the very low-level bits done (i.e., signing/verifying bytes + metadata) but I have no vision on how the code should be structured into smaller Go packages (to prune the heavier dependencies).

Cc: @mtrmac

vrothberg avatar Feb 17 '22 14:02 vrothberg

I 👍 ed this but I realize it doesn't notify anybody when you do that. 🤦‍♂️

So, 👍, we should meet some day next week (I'm out Mon+Tues) to map out specific packages/APIs that we'd want to refactor/move/etc.

imjasonh avatar Feb 18 '22 16:02 imjasonh

Just looping back on this issue as cosigned is now going to move into its own repo and we have the following issues which @lkatalin has expressed an interest in working on:

  • https://github.com/sigstore/sigstore/issues/199
  • https://github.com/sigstore/sigstore/issues/81

I can create a sigstore community calendar invite, how should we set a time? @vrothberg / @imjasonh et al?

lukehinds avatar Apr 28 '22 16:04 lukehinds

Thank you for the ping. I'd join the meeting :heavy_check_mark:

vrothberg avatar Apr 29 '22 08:04 vrothberg

How would next Wednesday work for 11am EST?

lukehinds avatar Apr 29 '22 09:04 lukehinds

That works for me!

imjasonh avatar Apr 29 '22 11:04 imjasonh

Works for me as well :heavy_check_mark:

vrothberg avatar Apr 29 '22 13:04 vrothberg

@imjasonh @vrothberg done, you will see this on the community calendar. access to calendar is here:

https://github.com/sigstore/community#community-meetings

I may not be able to make it, only just noticed I have a webinar / talk thing going on. Sometimes hangouts (even though its on the community calendar) inherits Red Hat corp google rights, but we have @vrothberg who should be able to open the meeting and admit others.

lukehinds avatar Apr 30 '22 05:04 lukehinds

Just finding this issue from twitter. Quick comment for https://github.com/defenseunicorns/zarf as we include both cosign and syft as sdks, the dependency tree is large—but I’d be more concerned with Cosign just falling back to kubectl for us as we can’t guarantee that binary is available when our tools runs.

jeff-mccoy avatar Jul 20 '22 16:07 jeff-mccoy

Just finding this issue from twitter. Quick comment for https://github.com/defenseunicorns/zarf as we include both cosign and syft as sdks, the dependency tree is large—but I’d be more concerned with Cosign just falling back to kubectl for us as we can’t guarantee that binary is available when our tools runs.

I assume that's responding to https://github.com/sigstore/cosign/issues/1462#issuecomment-1040653999:

I mentioned this since the cosign CLI depends on k8s.io/client-go just to get/create/update secrets, when it could shell out to kubectl instead (introducing a different set of problems, understandibly). But that wouldn't help anyway, because the Go module it shares with the cosigned webhook necessarily and correctly depends on knative/pkg, k8s.io/client-go, etc.

I agree I don't think shelling out to kubectl will be a good approach, for basically the reasons you describe.

It might be worth exploring replacing the k8s client used to get/create/update secrets with standard HTTP calls, but even that will probably be more headache than it's worth since we'll also want to find and parse kubeconfigs to manage auth to the cluster in question. I think we're stuck with our k8s dependency for good.

imjasonh avatar Jul 20 '22 16:07 imjasonh