httpclient
Use IO.copy_stream when possible
Fix: https://github.com/nahi/httpclient/issues/66
Context
Ref: https://github.com/GoogleCloudPlatform/google-cloud-ruby/issues/1897
We noticed that download performance in Google Cloud Storage's Ruby library was heavily impacted by CPU usage on the host, especially for big files. After some digging, it was clear that this is because the data has to transit through read() and write() instead of leveraging sendfile().
An experiment using a quick and dirty patch showed a reduction from 15s to 5s for a 500MB download.
The patch
To leverage sendfile() in Ruby, the best and simplest API is IO.copy_stream, as suggested in https://github.com/nahi/httpclient/issues/66.
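For readers unfamiliar with IO.copy_stream, here is a minimal sketch of how it is called on two plain files. The helper name copy_file and the temporary file names are hypothetical and not part of the patch. Whether Ruby actually ends up using sendfile() (or a similar zero-copy call) depends on the platform and on the kind of IO objects involved, so treat this as an illustration of the API rather than a guarantee of the fast path.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper (not part of httpclient): copy one file to another
# by handing both File objects to IO.copy_stream.
def copy_file(src_path, dst_path)
  File.open(src_path, "rb") do |src|
    File.open(dst_path, "wb") do |dst|
      # Returns the number of bytes copied. With real file descriptors Ruby
      # may be able to avoid moving the data through Ruby strings.
      IO.copy_stream(src, dst)
    end
  end
end

dir = Dir.mktmpdir("copy_stream_demo")
begin
  src_path = File.join(dir, "source.bin")
  dst_path = File.join(dir, "copy.bin")
  File.binwrite(src_path, "x" * 1_048_576) # 1 MiB of sample data

  copied = copy_file(src_path, dst_path)
  identical = FileUtils.identical?(src_path, dst_path)
  puts "copied #{copied} bytes, identical: #{identical}"
ensure
  FileUtils.remove_entry(dir) # clean up the temporary directory
end
```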
The problem is that copy_stream needs IO or IO-like objects to work with, and httpclient's API mostly deals with blocks, so I had to adapt the API somehow.
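To illustrate the mismatch between the two styles, here is a rough sketch. BodyReader, each_chunk and copy_to are hypothetical names made up for this example and are not httpclient's API, and StringIO stands in for the socket and the destination file. With StringIO there is no file descriptor, so no sendfile() can happen here; the point is only the difference in shape between a block-based and an IO-based interface.

```ruby
require "stringio"

# Hypothetical stand-in for a response body reader; NOT httpclient's API.
class BodyReader
  CHUNK_SIZE = 16 * 1024

  def initialize(io)
    @io = io
  end

  # Block style: every chunk is materialized as a Ruby string and yielded.
  def each_chunk
    while (chunk = @io.read(CHUNK_SIZE))
      yield chunk
    end
  end

  # IO style: both ends are handed to IO.copy_stream, which only needs
  # IO or IO-like objects (something readable and something writable).
  def copy_to(dst)
    IO.copy_stream(@io, dst)
  end
end

body = "a" * 50_000

# Block-based consumption: the caller writes each chunk itself.
via_block = StringIO.new
BodyReader.new(StringIO.new(body)).each_chunk { |chunk| via_block.write(chunk) }

# IO-based consumption: the caller passes a destination object instead.
via_copy = StringIO.new
copied = BodyReader.new(StringIO.new(body)).copy_to(via_copy)

puts "block: #{via_block.string.bytesize} bytes, copy_stream: #{copied} bytes"
```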
One important thing to note is that we can only leverage sendfile if there are no modifications to apply to the request body, e.g. no chunking, no compression.
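The condition above could be expressed as a small predicate over the message headers. This is only a sketch under assumptions: raw_copy_possible? is a hypothetical name, the headers are assumed to be available as a plain Hash, and the actual patch may detect chunking and compression differently.

```ruby
# Hypothetical check (not taken from the patch): a body can be copied as-is
# only when it is neither chunked nor compressed.
def raw_copy_possible?(headers)
  # Normalize header names and values so the lookup is case-insensitive.
  normalized = headers.to_h do |name, value|
    [name.to_s.downcase, value.to_s.downcase.strip]
  end

  chunked = normalized.fetch("transfer-encoding", "").include?("chunked")
  encoding = normalized.fetch("content-encoding", "identity")
  compressed = !encoding.empty? && encoding != "identity"

  !chunked && !compressed
end

plain   = raw_copy_possible?("Content-Length" => "524288000")
chunked = raw_copy_possible?("Transfer-Encoding" => "chunked")
gzipped = raw_copy_possible?("Content-Encoding" => "gzip", "Content-Length" => "1024")

puts "plain: #{plain}, chunked: #{chunked}, gzipped: #{gzipped}"
```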
I'll add comments on specific parts of the patch in later comments.