
s3-binary-cache-store: getFile streaming from S3


Before this change, nix copy --from s3://... would download the entire object into memory before passing it on to the destination store. For bigger derivations this can lead to high memory usage and out-of-memory (OOM) errors, which we experienced frequently.

This change adds support for streaming objects from S3 during nix copy, which keeps memory usage at a constant, low level: we measured roughly 40 MB of resident memory regardless of package size. To verify the patch, we created a derivation with 10 GB of random data and ran a before-and-after test. Without the patch, memory usage climbed to 10 GB; with the patch, it settled at around 40 MB.
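For illustration, here is a minimal sketch of the streaming approach using the AWS C++ SDK's public GetObject API. This is not the PR's actual code: the bucket, key, and file sink below are made-up examples. By default the SDK buffers the whole response body in an in-memory stream; supplying a response stream factory makes it write each received chunk directly into a stream of our choosing, which is what keeps memory flat:

```cpp
// Sketch only: streaming an S3 object instead of buffering it in memory.
#include <aws/core/Aws.h>
#include <aws/s3/S3Client.h>
#include <aws/s3/model/GetObjectRequest.h>
#include <iostream>

int main()
{
    Aws::SDKOptions options;
    Aws::InitAPI(options);
    {
        Aws::S3::S3Client client;

        Aws::S3::Model::GetObjectRequest request;
        request.SetBucket("example-cache");   // assumption: example bucket
        request.SetKey("nar/example.nar.xz"); // assumption: example key

        // Without a factory, the SDK buffers the whole body in an in-memory
        // stream, so resident memory grows with the object size. With a
        // factory, the SDK writes each received chunk directly into the
        // stream we hand it, keeping memory usage flat.
        request.SetResponseStreamFactory([] {
            return Aws::New<Aws::FStream>(
                "streaming-get",               // allocation tag
                "/tmp/example.nar.xz",         // assumption: example sink
                std::ios_base::out | std::ios_base::binary);
        });

        auto outcome = client.GetObject(request);
        if (!outcome.IsSuccess())
            std::cerr << outcome.GetError().GetMessage() << "\n";
    }
    Aws::ShutdownAPI(options);
    return 0;
}
```

In the PR's setting the factory would presumably hand back a stream feeding nix copy's sink rather than a file, but the memory behaviour is the same.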

We are now running this code in production and are no longer seeing OOM errors.

cc @thufschmitt

mupdt avatar Jul 20 '22 12:07 mupdt

Quick update: First of all, apologies for the radio silence. We have since fixed the "corruption on retry" bug mentioned in the previous comment, and we've been using the patch in production.

Unfortunately, we have observed yet another flake (rare, but persistent) which we believe to be a bug in our patch. We are tracking it down and will try to get a fix out as soon as possible. Once the bugfix is ready, we will use it internally until we are satisfied that no bugs remain in the code. This might take a while, but we will comment on this PR when it happens and submit the final patch here.

Apologies again and thank you for bearing with us!

mupdt avatar Sep 26 '22 12:09 mupdt

Alright, this took much longer than expected, many apologies about that!

We managed to identify the cause of the second set of occasional failures. It turns out that the AWS C++ SDK relies on response streams supporting truncation (see this issue in the AWS C++ SDK). For example, if there is a retriable HTTP error response, the error body is written straight into the stream; the SDK then parses that error body, truncates the stream back to the end of the last successful body, and continues streaming the download.

So, if we use a stream that doesn't support truncation (which we did), a retriable error would occasionally stream garbage into our nar or narinfo streams. Retriable errors are themselves rare, hence the rare failures.
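To make the failure mode concrete, here is a small self-contained sketch (our reconstruction, not the PR's code) of a write-through stream that forwards bytes to a sink as they arrive. Because such a stream cannot seek, a seekp()-style truncation silently fails, and the error body ends up wedged between two good chunks:

```cpp
// Sketch of the failure mode with a non-seekable write-through stream.
#include <functional>
#include <iostream>
#include <ostream>
#include <streambuf>
#include <string>
#include <utility>

// Forwards every character straight to a callback; no buffer, no seeking.
struct WriteThroughBuf : std::streambuf
{
    std::function<void(char)> sink;

    explicit WriteThroughBuf(std::function<void(char)> s) : sink(std::move(s)) {}

    int_type overflow(int_type c) override
    {
        if (c != traits_type::eof())
            sink(traits_type::to_char_type(c));
        return traits_type::not_eof(c);
    }
    // No seekoff/seekpos overrides: std::streambuf's defaults report
    // failure, so ostream::seekp() on this stream just sets failbit.
};

int main()
{
    std::string received;
    WriteThroughBuf buf([&](char c) { received += c; });
    std::ostream body(&buf);

    body << "GOOD-CHUNK-1";            // last successful body
    auto checkpoint = body.tellp();    // returns -1: stream is not seekable

    body << "<Error>SlowDown</Error>"; // retriable error body
    body.seekp(checkpoint);            // the "truncate": silently fails
    body.clear();                      // recover from failbit
    body << "GOOD-CHUNK-2";            // retry continues

    // Prints the error body wedged between the two good chunks.
    std::cout << received << "\n";
}
```

Under this model, the fix is to hand the SDK a stream that can seek back over bytes not yet committed downstream, so a failed attempt's error body can be discarded before it reaches the nar consumer.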

We fixed this bug internally on the 15th of October, stress-tested the fix for a while, and have been running it in production since the 25th of October. We haven't detected any failures since, across roughly ~100 TiB of downloads (a very rough estimate).

mupdt avatar Nov 22 '22 09:11 mupdt

@thufschmitt: Could I ask you for another look? 🙏

P.S.: We've been using this code in production for three months now without any hiccups.

mupdt avatar Feb 23 '23 17:02 mupdt

This PR needs a rebase.

Would you be interested in writing a test with minio (https://github.com/NixOS/nix/issues/8239)? While your practical testing is very encouraging, a test would increase our confidence and help us maintain the S3-related code.

roberth avatar Apr 19 '23 13:04 roberth

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-04-28-nix-team-meeting-minutes-50/27698/1

nixos-discourse avatar Apr 28 '23 17:04 nixos-discourse

@roberth: Do we have an existing example of using an external service (like minio) in integration tests? That would help me a lot.

mupdt avatar May 02 '23 10:05 mupdt

@mupdt You might want to have a look at the Prometheus NixOS test, which includes a separate node running a minio S3 store for the other test VMs to use.

osnyx avatar Aug 07 '23 23:08 osnyx