fog-aws

Extremely slow transfer rate to S3

Open avoidik opened this issue 9 months ago • 5 comments

Hello,

Could you assist me with the following concern? I am currently seeing an extremely slow transfer rate to S3 when using the fog-aws library. I have a 1.4TB archive of compressed random data (high entropy, i.e. highly random). Transferring that file with awscli completes in 22 minutes, but doing the same with the fog-aws library takes 6 hours.
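As a rough sanity check, the two timings can be turned into average throughput. This is a minimal sketch that assumes "1.4TB" means 1.4 × 10^12 bytes, since the exact file size is not given:

```ruby
# Average throughput implied by the two reported transfer times.
# Assumption: "1.4TB" is taken as 1.4 * 10**12 bytes; the exact size is not stated.
FILE_SIZE_BYTES = 1.4e12

# Returns average throughput in MB/s (1 MB = 10**6 bytes).
def throughput_mb_per_s(bytes, seconds)
  bytes / seconds / 1e6
end

awscli_rate = throughput_mb_per_s(FILE_SIZE_BYTES, 22 * 60)      # 22 minutes
fog_rate    = throughput_mb_per_s(FILE_SIZE_BYTES, 6 * 60 * 60)  # 6 hours

puts format("awscli:  %.0f MB/s", awscli_rate)
puts format("fog-aws: %.0f MB/s", fog_rate)
puts format("ratio:   %.1fx", awscli_rate / fog_rate)
```

Under that assumption, awscli averages a little over 1,000 MB/s and fog-aws about 65 MB/s, a gap of roughly 16x.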

I would appreciate any suggestions on how to troubleshoot, identify, and address the root cause.

Regards

avoidik, May 07 '25 16:05

@avoidik Thanks for the detailed report. I don't know for sure, but can make some educated guesses (especially given the other message you shared about chunk count and size). I think generally, fewer/larger chunks will be faster. I think it defaults to the largest supported size, so leaving it at that would likely help (if you aren't already doing that).

Could you share the code you are using to do the upload with fog-aws and the command line options you are using with awscli? I think that may help me better understand discrepancies and hopefully start to narrow in on what might be happening.

geemus, May 07 '25 19:05

Hmm, in my email it appears there was a comment here that must have been deleted. In any event here are some extra thoughts based on that:

Digging in, it looks like the concurrency setting is not currently used for create (only for copy), which might explain why this behaves so differently than expected. It might be possible to update create to use it as well. Is that something you would be interested in helping work on?
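To illustrate what applying concurrency to part uploads could look like, here is a minimal sketch using a fixed pool of worker threads. It is not fog-aws code: `upload_parts_concurrently` and the `upload_part` callable are hypothetical names, and the uploader is a stub standing in for a real UploadPart request.

```ruby
# Sketch of uploading parts with a fixed-size pool of worker threads.
# NOT fog-aws code: `upload_part` is a hypothetical callable standing in
# for whatever performs the real UploadPart request and returns an ETag.

# Uploads `parts` (an array of strings) using `concurrency` threads and
# returns the ETags ordered by part number.
def upload_parts_concurrently(parts, concurrency:, upload_part:)
  queue = Queue.new
  parts.each_with_index { |body, index| queue << [index + 1, body] }
  queue.close # workers get nil from #pop once the queue is drained

  etags = Array.new(parts.size)
  workers = Array.new(concurrency) do
    Thread.new do
      while (job = queue.pop)
        part_number, body = job
        # Each worker writes to its own slot of the preallocated array.
        etags[part_number - 1] = upload_part.call(part_number, body)
      end
    end
  end
  workers.each(&:join)
  etags
end

# Usage with a stub uploader that only simulates network latency.
stub_uploader = lambda do |part_number, body|
  sleep 0.01
  "etag-#{part_number}-#{body.bytesize}"
end

etags = upload_parts_concurrently(%w[aaaa bbbb cc], concurrency: 2, upload_part: stub_uploader)
puts etags.inspect
```

This sketch holds every part in memory up front for simplicity. A real implementation would likely read chunks from the source lazily, so that at most `concurrency` chunks are buffered at once.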

geemus, May 09 '25 02:05

@geemus my apologies, I deleted the comment because the test results were mislabeled and the environment in which I ran the test was not clean. I'm going to add it again later today.

avoidik, May 09 '25 06:05

No problem

geemus, May 09 '25 11:05

> fewer/larger chunks will be faster

@geemus I might be wrong in my analysis, please correct me if so

In the current implementation, each chunk of multipart_chunk_size bytes is buffered in memory. With multipart_chunk_size = 100MB and concurrency = 10, the process would consume at least 1000MB of RAM, so setting either property too high increases the risk of an OOM failure. Conversely, if I don't set multipart_chunk_size at all, won't it default to MAX_SINGLE_PUT_SIZE = 5368709120 (5GB) for files larger than 5GB, meaning that 5GB of RAM would be required?

Based on:

  • https://github.com/fog/fog-aws/blob/v3.31.0/lib/fog/aws/storage.rb#L25
  • https://github.com/fog/fog-aws/blob/v3.31.0/lib/fog/aws/storage.rb#L210-L213
  • https://github.com/fog/fog-aws/blob/v3.31.0/lib/fog/aws/models/storage/file.rb#L278-L282
  • https://github.com/fog/fog-aws/blob/v3.31.0/lib/fog/aws/models/storage/file.rb#L338-L341
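The memory reasoning above can be sketched as simple arithmetic. This assumes that every in-flight chunk is fully buffered, that S3's documented limit of 10,000 parts per multipart upload applies, and that "1.4TB" means 1.4 × 10^12 bytes. The helper names are illustrative and are not part of fog-aws.

```ruby
# Rough memory and part-count estimate for a multipart upload.
# Assumptions: every in-flight chunk is fully buffered in memory, and S3's
# documented limit of 10,000 parts per multipart upload applies.
MAX_PARTS = 10_000
MB = 1024 * 1024

# Peak buffer memory (bytes) if `concurrency` chunks are held at once.
def peak_buffer_bytes(chunk_size, concurrency)
  chunk_size * concurrency
end

# Number of parts needed to upload `file_size` bytes in `chunk_size` pieces.
def part_count(file_size, chunk_size)
  (file_size + chunk_size - 1) / chunk_size # integer ceiling division
end

# Smallest chunk size (bytes) that keeps the upload within MAX_PARTS.
def min_chunk_size(file_size)
  (file_size + MAX_PARTS - 1) / MAX_PARTS
end

file_size = 1_400_000_000_000 # "1.4TB", taken as 1.4 * 10**12 bytes

puts "buffer for 100MB x 10: #{peak_buffer_bytes(100 * MB, 10) / MB} MB"
puts "parts at 100MB chunks: #{part_count(file_size, 100 * MB)}"
puts "minimum chunk size:    #{min_chunk_size(file_size)} bytes"
```

Under these assumptions, 100MB chunks would need 13,352 parts for the 1.4TB file, which is over the part limit. The chunk size would have to be at least 140,000,000 bytes, and ten such chunks in flight would need about 1.4GB of buffer.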

> Could you share the code you are using to do the upload with fog-aws and the command line options you are using with awscli?

Here are the belated benchmark results for transferring 1GB and 10GB files to AWS S3 using fog-aws and awscli. I used hyperfine to benchmark transfer performance.

Legend:

measure-transfer-rate-(method)-(chunk_size)-(concurrency)
1GB test results:
bs=1024 count=1000000 = 1GB

measure-transfer-rate-awscli-100mb-10 ran
    1.22 ± 0.11 times faster than measure-transfer-rate-awscli-1000mb-10
    1.82 ± 0.16 times faster than measure-transfer-rate-awscli-10mb-10
    1.85 ± 0.08 times faster than measure-transfer-rate-awscli-100mb-5
    1.92 ± 0.27 times faster than measure-transfer-rate-awscli-1000mb-5
    2.29 ± 0.16 times faster than measure-transfer-rate-awscli-5mb-10
    3.27 ± 0.14 times faster than measure-transfer-rate-awscli-10mb-5
    4.49 ± 0.15 times faster than measure-transfer-rate-awscli-5mb-5
    8.02 ± 0.36 times faster than measure-transfer-rate-awscli-1000mb-1
    8.53 ± 0.30 times faster than measure-transfer-rate-awscli-100mb-1
   10.34 ± 0.42 times faster than measure-transfer-rate-fog-1000mb-10
   10.59 ± 0.54 times faster than measure-transfer-rate-fog-1000mb-1
   10.80 ± 0.62 times faster than measure-transfer-rate-fog-1000mb-5
   11.22 ± 0.40 times faster than measure-transfer-rate-fog-100mb-5
   11.25 ± 0.42 times faster than measure-transfer-rate-fog-100mb-1
   11.35 ± 0.45 times faster than measure-transfer-rate-fog-100mb-10
   14.99 ± 0.61 times faster than measure-transfer-rate-awscli-10mb-1
   21.16 ± 0.89 times faster than measure-transfer-rate-awscli-5mb-1
   21.34 ± 0.76 times faster than measure-transfer-rate-fog-10mb-1
   21.56 ± 0.76 times faster than measure-transfer-rate-fog-10mb-5
   23.50 ± 0.83 times faster than measure-transfer-rate-fog-10mb-10
   30.94 ± 1.14 times faster than measure-transfer-rate-fog-5mb-10
   31.02 ± 1.16 times faster than measure-transfer-rate-fog-5mb-5
   31.45 ± 1.05 times faster than measure-transfer-rate-fog-5mb-1

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| measure-transfer-rate-fog-5mb-1 | 37.970 ± 0.448 | 37.464 | 38.537 | 16.23 ± 0.71 |
| measure-transfer-rate-awscli-5mb-1 | 25.481 ± 0.602 | 24.897 | 26.369 | 10.89 ± 0.53 |
| measure-transfer-rate-fog-10mb-1 | 26.466 ± 0.953 | 25.074 | 27.721 | 11.31 ± 0.63 |
| measure-transfer-rate-awscli-10mb-1 | 18.568 ± 1.191 | 17.117 | 20.231 | 7.94 ± 0.61 |
| measure-transfer-rate-fog-100mb-1 | 13.471 ± 0.472 | 12.940 | 14.190 | 5.76 ± 0.32 |
| measure-transfer-rate-awscli-100mb-1 | 10.985 ± 0.227 | 10.773 | 11.308 | 4.70 ± 0.22 |
| measure-transfer-rate-fog-1000mb-1 | 9.608 ± 0.756 | 9.224 | 10.958 | 4.11 ± 0.37 |
| measure-transfer-rate-awscli-1000mb-1 | 12.587 ± 5.374 | 10.054 | 22.196 | 5.38 ± 2.31 |
| measure-transfer-rate-fog-5mb-5 | 37.665 ± 0.542 | 36.949 | 38.134 | 16.10 ± 0.72 |
| measure-transfer-rate-awscli-5mb-5 | 6.230 ± 0.080 | 6.152 | 6.356 | 2.66 ± 0.12 |
| measure-transfer-rate-fog-10mb-5 | 25.338 ± 1.145 | 24.143 | 26.906 | 10.83 ± 0.67 |
| measure-transfer-rate-awscli-10mb-5 | 5.039 ± 0.191 | 4.758 | 5.266 | 2.15 ± 0.12 |
| measure-transfer-rate-fog-100mb-5 | 13.402 ± 0.426 | 12.997 | 13.967 | 5.73 ± 0.30 |
| measure-transfer-rate-awscli-100mb-5 | 3.282 ± 0.220 | 3.146 | 3.671 | 1.40 ± 0.11 |
| measure-transfer-rate-fog-1000mb-5 | 9.287 ± 0.059 | 9.226 | 9.381 | 3.97 ± 0.17 |
| measure-transfer-rate-awscli-1000mb-5 | 10.158 ± 0.125 | 10.052 | 10.340 | 4.34 ± 0.19 |
| measure-transfer-rate-fog-5mb-10 | 37.846 ± 0.756 | 36.923 | 38.681 | 16.18 ± 0.75 |
| measure-transfer-rate-awscli-5mb-10 | 3.774 ± 0.092 | 3.662 | 3.874 | 1.61 ± 0.08 |
| measure-transfer-rate-fog-10mb-10 | 26.692 ± 2.206 | 24.798 | 30.453 | 11.41 ± 1.06 |
| measure-transfer-rate-awscli-10mb-10 | 3.405 ± 0.369 | 3.094 | 4.041 | 1.46 ± 0.17 |
| measure-transfer-rate-fog-100mb-10 | 13.260 ± 0.371 | 13.013 | 13.886 | 5.67 ± 0.29 |
| measure-transfer-rate-awscli-100mb-10 | 2.339 ± 0.099 | 2.237 | 2.486 | 1.00 |
| measure-transfer-rate-fog-1000mb-10 | 9.378 ± 0.272 | 9.228 | 9.862 | 4.01 ± 0.21 |
| measure-transfer-rate-awscli-1000mb-10 | 10.034 ± 0.039 | 9.996 | 10.088 | 4.29 ± 0.18 |

10GB test results:
bs=1024 count=10000000 = 10GB

  measure-transfer-rate-awscli-1000mb-10 ran
    1.02 ± 0.12 times faster than measure-transfer-rate-awscli-100mb-10
    1.66 ± 0.18 times faster than measure-transfer-rate-awscli-10mb-10
    1.84 ± 0.20 times faster than measure-transfer-rate-awscli-100mb-5
    1.87 ± 0.29 times faster than measure-transfer-rate-awscli-1000mb-5
    2.27 ± 0.24 times faster than measure-transfer-rate-awscli-5mb-10
    3.25 ± 0.38 times faster than measure-transfer-rate-awscli-10mb-5
    4.35 ± 0.47 times faster than measure-transfer-rate-awscli-5mb-5
    8.29 ± 0.98 times faster than measure-transfer-rate-awscli-1000mb-1
    8.68 ± 0.93 times faster than measure-transfer-rate-awscli-100mb-1
   10.36 ± 1.11 times faster than measure-transfer-rate-fog-1000mb-1
   10.39 ± 1.11 times faster than measure-transfer-rate-fog-1000mb-10
   10.51 ± 1.14 times faster than measure-transfer-rate-fog-1000mb-5
   11.33 ± 1.21 times faster than measure-transfer-rate-fog-100mb-1
   11.36 ± 1.23 times faster than measure-transfer-rate-fog-100mb-5
   11.57 ± 1.26 times faster than measure-transfer-rate-fog-100mb-10
   15.24 ± 1.68 times faster than measure-transfer-rate-awscli-10mb-1
   21.23 ± 2.38 times faster than measure-transfer-rate-fog-10mb-1
   21.28 ± 2.29 times faster than measure-transfer-rate-awscli-5mb-1
   21.33 ± 2.30 times faster than measure-transfer-rate-fog-10mb-10
   21.43 ± 2.32 times faster than measure-transfer-rate-fog-10mb-5
   31.55 ± 3.38 times faster than measure-transfer-rate-fog-5mb-10
   31.70 ± 3.41 times faster than measure-transfer-rate-fog-5mb-5
   32.01 ± 3.47 times faster than measure-transfer-rate-fog-5mb-1

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| measure-transfer-rate-fog-5mb-1 | 360.685 ± 6.454 | 355.721 | 371.606 | 32.01 ± 3.47 |
| measure-transfer-rate-awscli-5mb-1 | 239.723 ± 2.906 | 236.838 | 244.180 | 21.28 ± 2.29 |
| measure-transfer-rate-fog-10mb-1 | 239.250 ± 8.083 | 230.494 | 249.348 | 21.23 ± 2.38 |
| measure-transfer-rate-awscli-10mb-1 | 171.706 ± 4.437 | 166.396 | 177.866 | 15.24 ± 1.68 |
| measure-transfer-rate-fog-100mb-1 | 127.688 ± 0.800 | 126.569 | 128.436 | 11.33 ± 1.21 |
| measure-transfer-rate-awscli-100mb-1 | 97.750 ± 0.967 | 96.816 | 99.069 | 8.68 ± 0.93 |
| measure-transfer-rate-fog-1000mb-1 | 116.742 ± 0.626 | 115.698 | 117.355 | 10.36 ± 1.11 |
| measure-transfer-rate-awscli-1000mb-1 | 93.403 ± 4.721 | 90.101 | 100.782 | 8.29 ± 0.98 |
| measure-transfer-rate-fog-5mb-5 | 357.176 ± 4.189 | 353.278 | 363.775 | 31.70 ± 3.41 |
| measure-transfer-rate-awscli-5mb-5 | 49.024 ± 0.292 | 48.567 | 49.309 | 4.35 ± 0.47 |
| measure-transfer-rate-fog-10mb-5 | 241.413 ± 4.034 | 235.312 | 245.585 | 21.43 ± 2.32 |
| measure-transfer-rate-awscli-10mb-5 | 36.565 ± 1.619 | 34.948 | 39.278 | 3.25 ± 0.38 |
| measure-transfer-rate-fog-100mb-5 | 128.022 ± 1.975 | 126.901 | 131.531 | 11.36 ± 1.23 |
| measure-transfer-rate-awscli-100mb-5 | 20.731 ± 0.143 | 20.518 | 20.907 | 1.84 ± 0.20 |
| measure-transfer-rate-fog-1000mb-5 | 118.470 ± 2.442 | 115.726 | 122.211 | 10.51 ± 1.14 |
| measure-transfer-rate-awscli-1000mb-5 | 21.106 ± 2.337 | 19.878 | 25.248 | 1.87 ± 0.29 |
| measure-transfer-rate-fog-5mb-10 | 355.457 ± 2.890 | 351.907 | 359.180 | 31.55 ± 3.38 |
| measure-transfer-rate-awscli-5mb-10 | 25.619 ± 0.276 | 25.323 | 26.063 | 2.27 ± 0.24 |
| measure-transfer-rate-fog-10mb-10 | 240.349 ± 3.288 | 234.784 | 243.221 | 21.33 ± 2.30 |
| measure-transfer-rate-awscli-10mb-10 | 18.655 ± 0.595 | 18.092 | 19.533 | 1.66 ± 0.18 |
| measure-transfer-rate-fog-100mb-10 | 130.351 ± 2.764 | 128.019 | 134.301 | 11.57 ± 1.26 |
| measure-transfer-rate-awscli-100mb-10 | 11.490 ± 0.456 | 11.048 | 11.997 | 1.02 ± 0.12 |
| measure-transfer-rate-fog-1000mb-10 | 117.050 ± 0.997 | 116.276 | 118.770 | 10.39 ± 1.11 |
| measure-transfer-rate-awscli-1000mb-10 | 11.268 ± 1.204 | 10.599 | 13.408 | 1.00 |

It seems fog-aws is at least 10 times slower than awscli. Moreover, changing the concurrency property in fog-aws has no effect: the transfer rate is more or less the same regardless of the value set (1, 5, or 10).
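To make the comparison concrete, the sketch below computes how much slower fog-aws was than awscli at each concurrency level. It uses the mean times from the 10GB results for 100mb chunks. The constant and method names are illustrative only.

```ruby
# fog-aws vs awscli on the 10GB run, using the reported mean times (seconds)
# for 100mb chunks at concurrency 1, 5, and 10.
MEANS_10GB_100MB = {
  1  => { fog: 127.688, awscli: 97.750 },
  5  => { fog: 128.022, awscli: 20.731 },
  10 => { fog: 130.351, awscli: 11.490 }
}.freeze

# How many times slower fog-aws was than awscli at each concurrency level.
def slowdown_by_concurrency(means)
  means.transform_values { |t| (t[:fog] / t[:awscli]).round(2) }
end

slowdown_by_concurrency(MEANS_10GB_100MB).each do |concurrency, ratio|
  puts "concurrency=#{concurrency}: fog-aws is #{ratio}x slower"
end
```

With a single stream the two tools are within about 1.3x of each other. The gap grows to roughly 6x and then 11x as awscli's concurrency increases while the fog-aws times stay flat.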

> it looks like the concurrency setting is not used for create

Here's my code and a test script I was using to run these tests:

  • https://gist.github.com/avoidik/9ce10e7ead1fe137af77a509828b62df

avoidik, May 11 '25 16:05