
Memory leak in aws s3 cp when piping .zstd files through stdout to a slow consumer

Open dandrei opened this issue 1 year ago • 1 comments

Describe the bug

As the title says, when running aws s3 cp to stdout on a .zst file, piping the output to zstd -dc and then to a slow consumer, the AWS CLI eventually exhibits a memory leak, leading to a complete freeze of the OS within minutes.

Expected Behavior

The file should be streamed without incident.

Current Behavior

After a certain amount of time, the AWS CLI's memory usage grows without bound until the OS freezes.

Reproduction Steps

Below is a script that streams a .zst file to stdout (-), pipes it to zstd -dc, and then to a slow consumer: a Python script that reads from stdin and sleeps for one second every 1000 lines.

For this to work, you need access to an S3 bucket with a .zst file large enough for the bug to manifest. In my tests, it consistently happened before 1 million lines were read, but you could use a larger file to be sure. In case it matters: the production file I was reading when I stumbled upon this bug contained one JSON object per line, with an average line length of ~1000-2000 characters.
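If you don't have a suitable production file at hand, a synthetic one with the same shape (one JSON object per line, ~1000-2000 characters each) can be generated with a short script like the one below. This is just a sketch; the filename and field names are arbitrary.

```python
import json
import random

def make_line(min_len=1000, max_len=2000):
    """Serialize one JSON object whose line length falls in [min_len, max_len]."""
    target = random.randint(min_len, max_len)
    obj = {"id": random.randint(0, 10**9), "payload": ""}
    padding = target - len(json.dumps(obj))  # room left for the payload field
    obj["payload"] = "x" * padding
    return json.dumps(obj)

if __name__ == "__main__":
    # Write ~1M lines, then compress and upload, e.g.:
    #   zstd test.jsonl && aws s3 cp test.jsonl.zst s3://your-bucket/
    with open("test.jsonl", "w") as f:
        for _ in range(1_000_000):
            f.write(make_line() + "\n")
```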

The ulimit is there to prevent the memory leak from freezing your machine; instead, the process gets killed once the allowed memory (2G) is exceeded. It also demonstrates that the leak happens in the call to aws s3 cp, and not at any other step.

I have also tested this with higher limits like 8G, 16G, and 32G; no matter how much memory you have, once the bug triggers, your machine's RAM fills up within minutes, even though memory use remained largely linear up to that point.

(ulimit -v 2097152; aws s3 cp s3://your-bucket/path/to/file.zst -) | zstd -dc | python -c "
import sys, time
line_count = 0
for line in sys.stdin: 
    line_count += 1
    if line_count % 1000 == 0:
        print(str(line_count))
        time.sleep(1)
"

Possible Solution

My best guess is that it is related to the multi-threaded nature of aws s3 cp and the way zstd -dc reads data from stdin. The bug doesn't manifest with other file formats (e.g. .gz via gzip -dc), or with aws s3api get-object.
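The hypothesized mechanism is that a multi-threaded producer keeps buffering downloaded data while the consumer drains stdin slowly, so the backlog grows without bound. A minimal sketch of the fix one would expect (backpressure via a bounded buffer, so the producer blocks instead of accumulating) looks like this; the names and sizes here are illustrative, not the CLI's actual internals:

```python
import queue
import threading

def producer(q, chunks):
    # A bounded queue blocks put() once maxsize chunks are waiting,
    # capping memory at roughly maxsize * chunk_size no matter how
    # slowly the consumer drains.
    for chunk in chunks:
        q.put(chunk)   # blocks when the queue is full
    q.put(None)        # sentinel: end of stream

def consume_slowly(q):
    received = 0
    while True:
        chunk = q.get()
        if chunk is None:
            return received
        received += len(chunk)

q = queue.Queue(maxsize=4)          # at most 4 chunks in flight
chunks = [b"x" * 1024] * 100        # 100 KiB of simulated download data
t = threading.Thread(target=producer, args=(q, chunks))
t.start()
total = consume_slowly(q)
t.join()
```

Without the maxsize bound, the producer would enqueue the entire stream as fast as it downloads, which matches the unbounded growth observed above.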

Additional Information/Context

This was reproduced on multiple CLI versions, and on multiple operating systems (Ubuntu, Amazon Linux).

CLI version used

2.15.43

Environment details (OS name and version, etc.)

Ubuntu 22.04.4 LTS

dandrei avatar Sep 10 '24 09:09 dandrei

Hi, thanks for reaching out. The latest AWS CLI version is 2.17.50 per the CHANGELOG, I recommend testing on a more recent version if you haven't already. You can also try setting different S3 configurations to optimize the download: https://awscli.amazonaws.com/v2/documentation/api/latest/topic/s3-config.html
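For example, transfer settings can be tuned like this (illustrative values, not recommendations; see the linked topic for the full option list):

```shell
# Reduce how much data the CLI fetches and buffers ahead of the consumer
aws configure set default.s3.max_concurrent_requests 2
aws configure set default.s3.multipart_chunksize 8MB
```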

That's interesting that this only happens with .zst files, and aws s3api get-object works for the same file. What size .zst file can you reproduce this with? Could you share your debug logs (with any sensitive info redacted) by adding --debug to your command? That could help give more insight into what's going on here.

tim-finnigan avatar Sep 13 '24 15:09 tim-finnigan

Greetings! It looks like this issue hasn’t been active in longer than five days. We encourage you to check if this is still an issue in the latest release. In the absence of more information, we will be closing this issue soon. If you find that this is still a problem, please feel free to provide a comment or upvote with a reaction on the initial post to prevent automatic closure. If the issue is already closed, please feel free to open a new one.

github-actions[bot] avatar Sep 23 '24 16:09 github-actions[bot]

I can confirm that the bug is no longer surfacing in the current version, 2.17.56.

dandrei avatar Sep 23 '24 17:09 dandrei

This issue is now closed. Comments on closed issues are hard for our team to see. If you need more assistance, please open a new issue that references this one.

github-actions[bot] avatar Sep 23 '24 17:09 github-actions[bot]