Inconsistent downloads with a command.txt on AWS Lambda warm invocations

Open pitthecat opened this issue 1 year ago • 0 comments

Hi there, I am using s5cmd to increase the S3 download/upload speed on AWS Lambda. I am having a problem with s5cmd on AWS Lambda warm invocations, as parts of a command.txt are sometimes skipped.

Environment:

  • s5cmd version: 2.2.2.
  • s5cmd is called from Python via subprocess inside a Lambda container function: subprocess.run(["/usr/local/bin/s5cmd", "--numworkers", "16", "run", "/tmp/project/personA/download_commands.txt"], shell=False, check=True)
  • Maximum hardware specs for the AWS Lambda function
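
To make it easier to compare what s5cmd reports against the command file, the call above could capture the process output instead of letting it go straight to the Lambda log. This is only a minimal sketch: the helper names `build_s5cmd_args` and `run_s5cmd` are hypothetical, and it assumes s5cmd writes its per-object log lines to stdout.

```python
import subprocess


def build_s5cmd_args(commands_file, binary="/usr/local/bin/s5cmd", workers=16):
    """Build the same argument list as the call quoted in the issue."""
    return [binary, "--numworkers", str(workers), "run", commands_file]


def run_s5cmd(commands_file):
    """Run s5cmd and return its stdout as a list of lines.

    Assumption: s5cmd prints one line per copied object on stdout,
    so the returned lines can be checked against the command file.
    """
    result = subprocess.run(
        build_s5cmd_args(commands_file),
        shell=False,
        check=True,           # raises CalledProcessError on a non-zero exit
        capture_output=True,  # keep stdout/stderr for later inspection
        text=True,
    )
    return result.stdout.splitlines()
```

The returned lines can then be logged together with the contents of the command file, so a skipped command shows up as a prefix with no matching output.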

1. First invocation (cold start). A new Lambda instance spawns and everything is downloaded correctly. I can see log events for the copy commands.

download_commands.txt for personA:

[INFO] 2023-11-16T10:41:33.990Z <some-uui-id> cp s3://bucket/personA/video/0_folder/* /tmp/project/personA/video/0_folder/ 
cp s3://bucket/personA/video/3_folderA/* /tmp/project/personA/video/3_folderA/ 
cp s3://bucket/personA/video/3_folderB/* /tmp/project/personA/video/3_folderB/ 
cp s3://bucket/personA/video/7_folder/* /tmp/project/personA/video/7_folder/ 
cp s3://bucket/personA/video/11_folder/* /tmp/project/personA/video/11_folder/ 

I can see log events for the contents of 0_folder, for example:

cp s3://bucket/personA/video/0_folder/prefix_A/part_10.obj /tmp/project/personA/video/0_folder/prefix_A/part_10.obj 

2. Second invocation (warm start). Lambda reuses the same instance as in 1.

download_commands.txt for personB:

[INFO] 2023-11-16T10:42:37.682Z <some-uui-id> cp s3://bucket/personB/video/0_folder/* /tmp/project/personB/video/0_folder/
cp s3://bucket/personB/video/3_folderA/* /tmp/project/personB/video/3_folderA/ 
cp s3://bucket/personB/video/3_folderB/* /tmp/project/personB/video/3_folderB/
cp s3://bucket/personB/video/7_folder/* /tmp/project/personB/video/7_folder/
cp s3://bucket/personB/video/11_folder/* /tmp/project/personB/video/11_folder/

There is no error message, but 0_folder is not downloaded, and there are no log events for copying any of its contents. 3_folderA, 3_folderB, 7_folder and 11_folder are downloaded and logged.

  • The problem only occurs during Lambda warm invocations. It never happens for a Lambda cold start
  • It can happen for any person's prefix. Sometimes personA, sometimes personB
  • It does not only happen to 0_folder; it can happen to other folders
  • It shouldn't be a problem with leftover data in the Lambda container. /tmp/project/personA is cleaned up after every run, and the download paths are different for each person.
  • No problems with the AWS CLI
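
Since the skipped folders produce no error, one way to detect the problem inside the Lambda handler is to check every destination directory after the run. The following is a rough sketch, not part of s5cmd: the function names are hypothetical, and it assumes each line of the command file has the form `cp <s3 source> <local destination>` as in the files shown above.

```python
import os
import shlex


def destinations_from_commands(commands_path):
    """Return the local destination of every `cp` line in the command file."""
    destinations = []
    with open(commands_path) as handle:
        for line in handle:
            tokens = shlex.split(line)
            # Assumed line shape: cp <s3 source> <local destination>
            if len(tokens) == 3 and tokens[0] == "cp":
                destinations.append(tokens[2])
    return destinations


def empty_destinations(commands_path):
    """List destinations that are missing or contain no files after a run."""
    skipped = []
    for destination in destinations_from_commands(commands_path):
        # os.walk yields nothing for a missing directory, so it counts as empty.
        has_files = any(files for _, _, files in os.walk(destination))
        if not has_files:
            skipped.append(destination)
    return skipped
```

Calling `empty_destinations("/tmp/project/personB/download_commands.txt")` after the s5cmd run would return the folders that were silently skipped, which could then be logged or retried. A source prefix that is legitimately empty in S3 would also be reported, so this is a detection aid rather than a fix.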

Any help would be highly appreciated. It's a great tool and speeds up the S3 work significantly :)

pitthecat, Nov 16 '23 12:11