
OOM when rekor-cli upload'ing 5.2GB iso

Open kpcyrd opened this issue 2 years ago • 7 comments

I'm trying to record the PGP signature of the QubesOS ISO in rekor. The system seems to run out of memory, despite having 64GB of RAM and the ISO being only 5.2GB.

% rekor-cli upload --signature Qubes-R4.0.4-x86_64.iso.asc --artifact Qubes-R4.0.4-x86_64.iso --public-key qubes-release-4-signing-key.asc
[1]    2479956 killed     rekor-cli upload --signature Qubes-R4.0.4-x86_64.iso.asc --artifact   
rekor-cli upload --signature Qubes-R4.0.4-x86_64.iso.asc --artifact     129.62s user 213.11s system 74% cpu 7:41.19 total
%

Download the artifacts:

wget https://mirrors.edge.kernel.org/qubes/iso/Qubes-R4.0.4-x86_64.iso https://mirrors.edge.kernel.org/qubes/iso/Qubes-R4.0.4-x86_64.iso.asc https://keys.qubes-os.org/keys/qubes-release-4-signing-key.asc

The rekor-cli binary is 0.4.0 from Arch Linux:

% pacman -Si rekor
Repository      : community
Name            : rekor
Version         : 0.4.0-1
Description     : Signature Transparency Log -- Sigstore client and server tools
Architecture    : x86_64
URL             : https://github.com/sigstore/rekor
Licenses        : Apache
Groups          : None
Provides        : None
Depends On      : None
Optional Deps   : None
Conflicts With  : None
Replaces        : None
Download Size   : 9.06 MiB
Installed Size  : 45.38 MiB
Packager        : Christian Rebischke <[email protected]>
Build Date      : Sat 01 Jan 2022 02:57:40 AM CET
Validated By    : MD5 Sum  SHA-256 Sum  Signature

kpcyrd avatar Jan 16 '22 19:01 kpcyrd

cc @SantiagoTorres any ideas here?

dlorenc avatar Jan 16 '22 23:01 dlorenc

On the tip of main I'm getting a slightly different error when reproducing:

./rekor-cli upload --signature Qubes-R4.0.4-x86_64.iso.asc --artifact Qubes-R4.0.4-x86_64.iso --public-key qubes-release-4-signing-key.asc
[server exited unexpectedly]

My guess is that in this case I've caused the rekor-server to get OOM killed, and that perhaps we're not streaming large bodies and are trying to push it all into RAM? I'll keep digging to find the root cause.

nsmith5 avatar Jan 24 '22 19:01 nsmith5

Yeah, never mind that. The "server" in this case was my tmux session getting killed, not the rekor server. Can confirm that rekor-cli is the process getting killed.

nsmith5 avatar Jan 24 '22 19:01 nsmith5

I think there's an upload limit on the public rekor instance too, so I'm suspecting this happens before the upload. Also the memory usage is higher than the artifact itself. :/

kpcyrd avatar Jan 25 '22 11:01 kpcyrd

Because the PGP library we use requires the entire ISO to be uploaded to rekor in order to verify its signature, the 5.2GB ISO must be base64 encoded, which increases its size by about 33% and pushes the value of that field close to 7GB. We probably end up with both representations in memory before uploading it to the rekor server. There is a 128MB limit on the public instance though, as @kpcyrd notes.
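
To put rough numbers on the base64 overhead, and to show one way of encoding without building the whole encoded value in memory, here is a minimal Go sketch. It is not taken from rekor-cli: the helper names (`encodedSize`, `streamBase64`, `encodeString`) are made up for illustration, and the 5,200,000,000 byte figure is only an approximation of the "5.2GB" mentioned above.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"io"
	"strings"
)

// encodedSize reports how many bytes n raw bytes occupy after standard
// padded base64 encoding: 4 output bytes for every 3 input bytes, rounded up.
func encodedSize(n int64) int64 {
	return (n + 2) / 3 * 4
}

// streamBase64 (hypothetical helper) copies src to dst as base64.
// The encoder emits output in small chunks as data arrives, so the full
// encoded value is never assembled in memory by this function.
func streamBase64(dst io.Writer, src io.Reader) error {
	enc := base64.NewEncoder(base64.StdEncoding, dst)
	if _, err := io.Copy(enc, src); err != nil {
		return err
	}
	// Close flushes any trailing partial block and writes the padding.
	return enc.Close()
}

// encodeString is a small convenience wrapper used for demonstration only.
func encodeString(s string) (string, error) {
	var out strings.Builder
	if err := streamBase64(&out, strings.NewReader(s)); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	// Assumed artifact size: 5.2 * 10^9 bytes.
	fmt.Println("encoded bytes:", encodedSize(5_200_000_000))

	s, err := encodeString("hello")
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

With an `*os.File` as `src` and a network request body as `dst`, the same pattern would keep memory use roughly constant. Whether the rekor API can accept a streamed body for this field is a separate question that this sketch does not answer.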

bobcallaway avatar Jan 25 '22 15:01 bobcallaway

I think we could provide a way to verify the hash alone, as long as we compute the hash properly. I think I could sketch something out in the next couple of weeks...
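
As a rough illustration of the constant-memory part of that idea, the Go sketch below hashes an artifact by streaming it through SHA-256. It is an assumption-laden example, not rekor code: `digestReader` and `digestString` are hypothetical names, and SHA-256 is assumed as the digest. Note that the hash an OpenPGP signature actually covers also includes signature metadata appended after the data, so a plain file digest like this one is not by itself the value the PGP signature signs.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"strings"
)

// digestReader (hypothetical helper) streams r through SHA-256 and returns
// the hex digest. Memory use does not grow with the size of the input.
func digestReader(r io.Reader) (string, error) {
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// digestString is a convenience wrapper for small in-memory inputs.
func digestString(s string) (string, error) {
	return digestReader(strings.NewReader(s))
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: digest <file>")
		return
	}
	f, err := os.Open(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer f.Close()

	sum, err := digestReader(f)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(sum)
}
```

Run against the ISO, this only ever holds a small copy buffer and the hash state in memory, which is the property a hash-only verification path would rely on.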

SantiagoTorres avatar Jan 25 '22 16:01 SantiagoTorres

@SantiagoTorres Lol, I just realized I did this 5 years ago with PGP signatures in Go.

https://github.com/Foxboron/clave

This implements a tool to generate the PGP checksum remotely and completes the signature locally. You could probably use the same strategy for signature validation. It's super hacky but it works.

Foxboron avatar Apr 20 '22 10:04 Foxboron