
Question: where do s5cmd's main performance benefits come from?

Open quinnj opened this issue 2 years ago • 1 comments

Hopefully pretty self-explanatory issue title. I've snooped around the code a bit, so I'll give my guess, but just wondered if others could chime in on what ends up giving s5cmd such impressive performance in the README charts.

  • Re-use of aws sessions as noted here
  • Ability to run various GET/DELETE requests in parallel as outlined in docs
  • Specification of concurrency/part_size for large file download/upload
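To make the second and third points concrete, here is a minimal sketch of a ranged, concurrent download. It is written in Python for illustration and is not s5cmd's code: `part_ranges`, `parallel_download`, and `fetch_range` are hypothetical names, and `fetch_range` stands in for whatever issues a ranged GET over a shared, re-used client. The in-memory `blob` replaces a real object so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def part_ranges(size: int, part_size: int) -> List[Tuple[int, int]]:
    """Split [0, size) into inclusive (start, end) byte ranges of at most part_size bytes."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [
        (start, min(start + part_size, size) - 1)
        for start in range(0, size, part_size)
    ]


def parallel_download(
    fetch_range: Callable[[int, int], bytes],
    size: int,
    part_size: int,
    concurrency: int,
) -> bytes:
    """Fetch every part with up to `concurrency` workers and join them in order."""
    ranges = part_ranges(size, part_size)
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # Executor.map yields results in input order, even if parts finish out of order.
        parts = pool.map(lambda r: fetch_range(r[0], r[1]), ranges)
        return b"".join(parts)


# Hypothetical stand-in for a ranged GET against object storage.
blob = bytes(range(256)) * 40  # 10240 bytes of sample data


def fake_fetch(start: int, end: int) -> bytes:
    return blob[start : end + 1]


data = parallel_download(fake_fetch, len(blob), part_size=1000, concurrency=8)
```

With a real client, `fake_fetch` would be replaced by a function that sends a `Range: bytes=start-end` request; the part size and worker count are the two knobs the third bullet refers to.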

I guess I'm curious how much impact each of these has on the overall performance gains. I'll admit I was somewhat surprised that, "under the hood", the core aws-sdk routines are used for the actual requests. That made me wonder how s5cmd's performance on top of them could be so much better, hence this issue and the theories above.

Thanks!

(for context, I'm looking to implement a performant cloud storage API in the Julia language, so I'm looking at the "best" implementations people have been referring me to 😄 )

quinnj avatar Jun 29 '22 20:06 quinnj

I am curious about this too. When I came across this project and tried it, I wondered what makes it fast and what the downside is. Reliability, maybe, as I guess s5cmd doesn't do checksums for verification.

gamefundas avatar Jul 22 '22 13:07 gamefundas