Use IO.copy_stream when possible

Open casperisfine opened this issue 6 years ago • 0 comments

Fix: https://github.com/nahi/httpclient/issues/66

Context

Ref: https://github.com/GoogleCloudPlatform/google-cloud-ruby/issues/1897

We noticed that the download performance of Google Cloud Storage's Ruby library was heavily impacted by CPU usage on the host, especially for big files. After some digging, it became clear that this is because the data has to transit through read() and write() instead of leveraging sendfile().

An experiment using a quick and dirty patch showed a reduction from 15s to 5s for a 500MB download.

The patch

To leverage sendfile() in Ruby, the best and simplest API is IO.copy_stream, as suggested in https://github.com/nahi/httpclient/issues/66.
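
For readers unfamiliar with the API, here is a minimal sketch of IO.copy_stream on its own, outside of httpclient. The helper name `copy_file_contents` and the file names are made up for this example. Whether the copy really goes through sendfile() depends on the platform and on both ends being real file descriptors; otherwise Ruby falls back to a read/write loop.

```ruby
require "tmpdir"

# Hypothetical helper, not part of httpclient or of the patch.
# Copies src_path to dst_path without building the payload as a Ruby String.
# IO.copy_stream returns the number of bytes copied.
def copy_file_contents(src_path, dst_path)
  File.open(src_path, "rb") do |src|
    File.open(dst_path, "wb") do |dst|
      IO.copy_stream(src, dst)
    end
  end
end

Dir.mktmpdir do |dir|
  src = File.join(dir, "source.bin")
  dst = File.join(dir, "copy.bin")
  File.binwrite(src, "x" * 4096)

  copied = copy_file_contents(src, dst)
  puts "copied #{copied} bytes"
end
```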

The problem is that copy_stream needs IO or IO-like objects to work with, and httpclient's API mostly deals with blocks, so I had to adapt the API somehow.
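
To illustrate the mismatch, here is a sketch of one possible bridge between a block-based API and copy_stream. It is not the code from the patch: `BlockWriter` is a hypothetical class invented for this example. It relies on the fact that IO.copy_stream accepts any destination that responds to `write`. Note that with a non-IO object like this one, Ruby uses its read/write fallback, so the adapter only keeps block-based callers working and does not get the sendfile() benefit.

```ruby
require "stringio"

# Hypothetical adapter (not part of httpclient): exposes a block as an
# object with #write, which is what IO.copy_stream needs from a destination.
class BlockWriter
  def initialize(&block)
    @block = block
  end

  # Hands the chunk to the block and reports how many bytes were consumed.
  # The chunk String may be reused by copy_stream between calls, so the
  # block should copy it (e.g. with <<) rather than keep a reference.
  def write(chunk)
    @block.call(chunk)
    chunk.bytesize
  end
end

received = +""
writer = BlockWriter.new { |chunk| received << chunk }

# StringIO stands in for the response socket in this sketch.
IO.copy_stream(StringIO.new("hello world"), writer)
```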

One important thing to note is that we can only leverage sendfile if there are no modifications to apply to the request body, e.g. no chunking and no compression.
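
The condition above could be checked along these lines. This is only a sketch under assumptions: `raw_copy_possible?` is a hypothetical helper, and it assumes the message headers are available as a plain Hash, which is not how httpclient represents them internally.

```ruby
# Hypothetical helper (not part of httpclient): returns true when the body
# can be copied byte for byte, i.e. it is neither chunked nor compressed.
# Assumes headers is a plain Hash of header name => value.
def raw_copy_possible?(headers)
  # Header names are case-insensitive, so normalize before looking them up.
  normalized = headers.each_with_object({}) do |(name, value), acc|
    acc[name.to_s.downcase] = value.to_s.downcase.strip
  end

  transfer_encoding = normalized.fetch("transfer-encoding", "")
  content_encoding  = normalized.fetch("content-encoding", "identity")

  !transfer_encoding.include?("chunked") &&
    ["", "identity"].include?(content_encoding)
end

raw_copy_possible?("Content-Length" => "524288000")   # plain body
raw_copy_possible?("Transfer-Encoding" => "chunked")  # needs de-chunking
raw_copy_possible?("Content-Encoding" => "gzip")      # needs decompression
```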

I'll add comments on specific parts of the patch in later comments.

casperisfine, Jan 24 '18 16:01