cosign icon indicating copy to clipboard operation
cosign copied to clipboard

sign-blob attemps to read the whole file into memory

Open luc-lynx opened this issue 3 years ago • 4 comments

Description

At the moment sign-blob logic attempts to read the whole file into memory and then sign it. It can lead to problems with signing large blobs and leads to increased memory requirements on build agents perform signing.

I suggest refactoring that place and using more "traditional" steps like

  • digest calculation which can be done via sequential reading of small potrions of file
  • signing the digest

This will allow to avoid reading the whole blob into memory.

luc-lynx avatar Jul 23 '22 19:07 luc-lynx

I think this is because we abstract away the signer, and some key types require the full contents in memory (ed25519). We'd have to special case this.

dlorenc avatar Aug 20 '22 02:08 dlorenc

Could we get away with passing around a handle to a file instead?

znewman01 avatar Aug 30 '22 17:08 znewman01

At some point you still need to read the entire file though, unless we special case out the ed25519 style signers.

dlorenc avatar Aug 30 '22 17:08 dlorenc

See SignMessage. The message passed to SignMessage is an io.Reader; even for ed25519 we just rewind the reader. The sigstore/sigstore libraries take care not to require reading the whole message into memory at once.

I think this should be doable

znewman01 avatar Aug 30 '22 17:08 znewman01

Hi. I'm new to sigstore. For sign-blob command, I noticed that payload is loaded into the memory to sign it - https://github.com/sigstore/cosign/blob/main/cmd/cosign/cli/sign/sign_blob.go#L63 and then the payload is also used to upload transparency log - https://github.com/sigstore/cosign/blob/main/cmd/cosign/cli/sign/sign_blob.go#L93.

The issue is that io.Reader can only be used to read data once. So we have to load the entire payload into the memory for this to work. Any suggestions on what can be done here?

AnishShah avatar Nov 30 '22 07:11 AnishShah

Hi. I'm new to sigstore. For sign-blob command, I noticed that payload is loaded into the memory to sign it - https://github.com/sigstore/cosign/blob/main/cmd/cosign/cli/sign/sign_blob.go#L63

This could require the whole payload.

and then the payload is also used to upload transparency log - https://github.com/sigstore/cosign/blob/main/cmd/cosign/cli/sign/sign_blob.go#L93.

Looks like ultimately it just gets hashed: https://github.com/sigstore/cosign/blob/ce5ba1cac8669692c8d50286a2576ccd8bc01410/pkg/cosign/tlog.go#L191

The issue is that io.Reader can only be used to read data once. So we have to load the entire payload into the memory for this to work. Any suggestions on what can be done here?

Two options:

  • Go has a Seeker interface which file handles should implement; we can rewind the payload after signing it. This almost works, except now if we're trying to sign STDIN we have to fall back to reading the whole thing into memory. I guess that's possible.
  • (IMO this is better because we don't have the problem with STDIN, but it's a little more complicated to implement)
    1. change TLogUpload to take a hash, rather than a full payload.
    2. Write a struct that wraps an io.Reader and implements io.Reader and hash.Hash that does a pass through; something like this; wrap payload in it.
    3. When we sign the payload, get the hash afterwards. Use that later on

znewman01 avatar Nov 30 '22 20:11 znewman01