rust-s3
Add support for parallel GET/PUT with tokio
Hi, thanks for the great work!
I found that the put_object_stream command runs on a single thread, which severely degrades performance for large file transfers. For files of 300MB or more, it is more than 2x slower than the put_object command.
In my project, this performance degradation was a big stumbling block, so I have added a new command, put_object_stream_parallel, that performs the operation in parallel. It is an asynchronous method that uses tokio to transfer the parts of a multi-part upload concurrently. The number of workers is set to 20, an empirical value chosen for a rook-ceph private storage cluster environment.
In addition, although it is unofficial, I also implemented multi-part downloads.
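As a rough illustration of what a multi-part download involves, the object can be split into byte ranges that are fetched independently. The sketch below is not taken from this PR: `byte_ranges` and `range_header` are hypothetical helper names, and it assumes the parts are fetched with standard HTTP `Range` headers.

```rust
/// Split an object of `total` bytes into inclusive (start, end) byte ranges
/// of at most `part` bytes each. Hypothetical helper, not part of rust-s3.
fn byte_ranges(total: u64, part: u64) -> Vec<(u64, u64)> {
    assert!(part > 0, "part size must be non-zero");
    let mut ranges = Vec::new();
    let mut start = 0;
    while start < total {
        // HTTP byte ranges are inclusive, so the last byte is `end`.
        let end = (start + part).min(total) - 1;
        ranges.push((start, end));
        start = end + 1;
    }
    ranges
}

/// Format one range as the value of an HTTP `Range` header.
fn range_header(range: (u64, u64)) -> String {
    format!("bytes={}-{}", range.0, range.1)
}
```

Each range can then be requested by a separate worker and the results written back at the matching offset.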
Because the Bucket must cross thread boundaries, my new implementation expects it to be wrapped in an Arc.
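To make the idea concrete, here is a minimal sketch of the pattern: a shared handle wrapped in `Arc`, a fixed pool of workers, and parts claimed from a shared counter. It is not the code from this PR. `FakeBucket`, `upload_part`, and `upload_parallel` are hypothetical names, and plain `std::thread` workers stand in for tokio tasks so the example needs only the standard library.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for a bucket; the real rust-s3 `Bucket` API differs.
struct FakeBucket {
    uploaded_bytes: AtomicUsize,
}

impl FakeBucket {
    /// Pretend to upload one part and return a fake ETag.
    fn upload_part(&self, part_number: usize, data: &[u8]) -> String {
        self.uploaded_bytes.fetch_add(data.len(), Ordering::SeqCst);
        format!("etag-{}", part_number)
    }
}

/// Upload `data` in `part_size` chunks using at most `workers` threads.
/// Returns the ETags ordered by part number (part numbers start at 1).
fn upload_parallel(
    bucket: Arc<FakeBucket>,
    data: Arc<Vec<u8>>,
    part_size: usize,
    workers: usize,
) -> Vec<String> {
    assert!(part_size > 0 && workers > 0);
    let part_count = (data.len() + part_size - 1) / part_size;
    // Shared counter: each worker claims the next unprocessed part index.
    let next = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..workers.min(part_count.max(1)))
        .map(|_| {
            // Each thread gets its own Arc clone, so nothing is borrowed
            // across the thread boundary.
            let (bucket, data, next) =
                (Arc::clone(&bucket), Arc::clone(&data), Arc::clone(&next));
            thread::spawn(move || {
                let mut done = Vec::new();
                loop {
                    let idx = next.fetch_add(1, Ordering::SeqCst);
                    if idx >= part_count {
                        break;
                    }
                    let start = idx * part_size;
                    let end = (start + part_size).min(data.len());
                    let etag = bucket.upload_part(idx + 1, &data[start..end]);
                    done.push((idx + 1, etag));
                }
                done
            })
        })
        .collect();
    // Parts finish in arbitrary order, so sort by part number before returning.
    let mut parts: Vec<(usize, String)> = handles
        .into_iter()
        .flat_map(|h| h.join().expect("worker panicked"))
        .collect();
    parts.sort_by_key(|(n, _)| *n);
    parts.into_iter().map(|(_, etag)| etag).collect()
}
```

Calling `upload_parallel(bucket, data, part_size, 20)` mirrors the worker count mentioned above. The right value depends on the storage backend and network.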
Performance Improvements
In the rook-ceph private storage cluster environment:
- GET (496MB): 3.125271375s => 2.366952567s [x1.32]
- PUT (496MB): 13.307903273s => 1.508496017s [x8.82]
@kerryeon This looks great, but I'll need a bit more time to look it over
@kerryeon I'm gonna close this. I've added parallel support for async uploads using futures::future::join_all, thanks for inspiring me to do it. My guess is that the performance is a bit worse than what you did, but the implementation is a bit simpler :)
It would be great if you could test out the new PUT implementation, I didn't touch GET yet.
Environments:
- NIC: Mellanox ConnectX-5 100Gb
- S3: Rook-Ceph Object Storage (Bare-metal)
Experiments:
- Data Size: 64 MiB
- Number of Iterations: 30
- Read Speed (NO parallel): 11.14 Gbps
- Read Speed (Parallel): 10.30 Gbps
- Write Speed (NO multipart): 337.07 Mbps
- Write Speed (Proposed Parallel): 1.01 Gbps
- Write Speed (Current): 324.66 Mbps
The current implementation gains nothing from multipart uploading: its write speed (324.66 Mbps) is no better than the non-multipart write speed (337.07 Mbps).
By the way, you said you were inspired by my implementation, but I'm confused that this is not mentioned anywhere in the commits, release notes, etc. It gives me the impression that my contribution did not actually help anywhere.