http
http copied to clipboard
Request thread safety issue
We're getting weird connection issues in our exception tracker:
HTTP::ConnectionError: error reading from socket: stream closed in another thread
per_operation.rb 79 wait_readable(...)
[GEM_ROOT]/gems/http-4.1.1/lib/http/timeout/per_operation.rb:79:in `wait_readable'
Caused by:
IOError: stream closed in another thread
per_operation.rb 79 wait_readable(...)
[GEM_ROOT]/gems/http-4.1.1/lib/http/timeout/per_operation.rb:79:in `wait_readable'
The logs for that single job don't show anything interesting, but if you look at the logs for all the workers in that process:
[2019-07-12T15:40:15.589244 #4] INFO -- : [03e56584-c7a1-45ee-9ca4-c86334cd7e75] > POST https://onesignal.com/api/v1/notifications
[2019-07-12T15:40:15.886281 #4] INFO -- : [1bf0df80-9dde-4724-a377-5ed9271284de] > POST https://onesignal.com/api/v1/notifications
[2019-07-12T15:40:15.887557 #4] INFO -- : [03e56584-c7a1-45ee-9ca4-c86334cd7e75] HTTP::ConnectionError: error reading from socket: stream closed in another thread
[2019-07-12T15:40:15.889464 #4] ERROR -- : [03e56584-c7a1-45ee-9ca4-c86334cd7e75] Error performing WisperActiveJobBroadcaster::Job (Job ID: 03e56584-c7a1-45ee-9ca4-c86334cd7e75) from Shoryuken(tesseract-production-background) in 691.77ms: HTTP::ConnectionError (error reading from socket: stream closed in another thread):
[2019-07-12T15:40:16.268402 #4] INFO -- : [1bf0df80-9dde-4724-a377-5ed9271284de] < 200 OK
Job 03e5 started posting to the endpoint, and 300ms later another thread (job 1bf0) posted to the same endpoint with a different body. This somehow interrupted the first request in progress, raising the IOError in the first 1.5ms after the 2nd request started. The 2nd request then finished normally.
Here's how we're making that request:
# global default HTTP client
MyAppHTTP =
::HTTP
.timeout(connect: 2.seconds, read: 10.seconds, write: 10.seconds)
.use(logging: { logger: Rails.logger })
# class-specific HTTP client
def http
@http ||= MyAppHTTP.
headers(headers).
timeout(write: 30, connect: 10, read: 40).
auth("Basic \"#{api_key}\"")
end
# called like this:
http.post(url("/notifications"), json: payload.to_json(:web)).status.success?
We're using the shoryuken gem as our job runner, which uses concurrent-ruby for the worker thread pool. I'm not clear how making a request in one worker thread could interfere with another. The wiki says "Thread safety comes into play when you Make an HTTP request", but doesn't offer any details, or instruction how to mitigate it.
This happens more often than I would have expected, too:

Reduced to a simple test case:
require "thread"
require "http"
@http = HTTP.headers("X-Test" => "1")
5.times.map do
Thread.new { r = @http.post("https://httpbin.org/post"); puts r.status}
end.map(&:join)
This breaks in a similar way:
/home/rando/.rubies/ruby-2.6.3/lib/ruby/2.6.0/openssl/buffering.rb:125:in `sysread': stream closed in another thread (IOError)
8: from test.rb:11:in `block (2 levels) in <main>'
7: from /home/rando/.gem/ruby/2.6.3/gems/http-4.1.1/lib/http/chainable.rb:27:in `post'
6: from /home/rando/.gem/ruby/2.6.3/gems/http-4.1.1/lib/http/client.rb:31:in `request'
5: from /home/rando/.gem/ruby/2.6.3/gems/http-4.1.1/lib/http/client.rb:75:in `perform'
4: from /home/rando/.gem/ruby/2.6.3/gems/http-4.1.1/lib/http/connection.rb:103:in `read_headers!'
3: from /home/rando/.gem/ruby/2.6.3/gems/http-4.1.1/lib/http/connection.rb:212:in `read_more'
2: from /home/rando/.gem/ruby/2.6.3/gems/http-4.1.1/lib/http/timeout/null.rb:45:in `readpartial'
1: from /home/rando/.rubies/ruby-2.6.3/lib/ruby/2.6.0/openssl/buffering.rb:125:in `readpartial'
/home/rando/.rubies/ruby-2.6.3/lib/ruby/2.6.0/openssl/buffering.rb:125:in `sysread': error reading from socket: stream closed in another thread (HTTP::ConnectionError)
While modifying the thread block to use the HTTP constant directly works just fine.
5.times.map do
Thread.new { r = HTTP.post("https://httpbin.org/post"); puts r.status}
end.map(&:join)
I would have expected to be able to configure the HTTP client once and be able to re-use it amongst threads, the same way calling the HTTP class methods work. Do I need to set up a concurrent-ruby thread pool of these configured clients?
Possibly related to https://github.com/httprb/http/issues/371
To try to answer this:
I would have expected to be able to configure the HTTP client once and be able to re-use it amongst threads, the same way calling the HTTP class methods work. Do I need to set up a concurrent-ruby thread pool of these configured clients?
The original goal of the HTTP::Options API builder was for it to be immutable and thread safe, and have additional chained options make copies of the original state rather than mutating. Re #371, it seems like there are places this isn't happening and the state is being shared? I'm not entirely sure.
HTTP::Client isn't thread safe, and stores multiple different kinds of mutable state. Persistent connections are likewise not thread safe, and if you'd like to use them in a multithreaded context I'd suggest using something like the connection_pool gem.
@tarcieri So do you think the HTTP gem should allow the simple test case in my comment to work or not? (The one where I re-use the same @http ivar in multiple threads to make simultaneous requests, no persistent connections involved.)
I'll start using the ConnectionPool for this, I just found the behavior surprising.
@paul if I’m reading your example right, @http should be an instance of HTTP::Options, and therefore your example should work. I’m curious if you think #371 is related/the same problem
It's entirely possible. Maybe Ruby in 2015 couldn't detect this problem so just returned an empty string, and modern Ruby IO notices somehow and is able to raise the IOError.
Re-reviewing #371 again, I think I was thinking of a different issue...
@paul so the real question is...
HTTP::Options is (or was at least originally intended to be) thread-safe by virtue of being immutable.
Are you encountering thread-safety problems with it? (i.e. is it your @http)
FWIW HTTP::Client itself isn't thread-safe:
https://github.com/httprb/http/blob/6240672bc84b23339fc9a9878040fcb45db78fb5/lib/http/client.rb#L71-L76
In order to make thread-safe concurrent calls with same HTTP::Client instance, you will need to use mutex, something like this:
@mutex = Mutex.new
# ...
res = @mutex.synchronize { http.get(...).flush }
In other words you need to guarantee that there's no concurrent request between current request and Response#flush call.
Yes, so HTTP::Options was originally intended to be immutable/thread safe, and HTTP::Client is not since it owns the (mutable) connection state, however the idea was you could use HTTP::Options as a request builder and then pass the result to different threads and instantiate thread-specific HTTP::Client instances there by making requests.
If I'm understanding correctly though, @paul is trying to do the first thing, which is share HTTP::Options between threads, and then instantiate new clients in each thread... and having thread safety issues with that.
@paul can you confirm that's the case?
Mmm, but the snippet shared above is absolutely not re-use of options, but re-use of client:
@http = HTTP.headers("X-Test" => "1")
5.times.map do
Thread.new { r = @http.post("https://httpbin.org/post"); puts r.status}
end.map(&:join)
@ixti oh, now I see...
https://github.com/httprb/http/blob/master/lib/http/chainable.rb#L250
I guess that changed way back in #7
@paul this seems like an unfortunate complication of there not being a proper "builder" type for the client, allowing options which don't require the creation of e.g. sockets to be performed the way you want.
It seems like retrofitting an API like that is possible, by splitting up the the HTTP::Chainable#branch method to return HTTP::Options in cases where it doesn't need to eagerly instantiate a client.
The only thing that might not work correctly with that idea is keep-alive (persistent) clients...
@ixti those could still return HTTP::Client, but instantiating them might need a special branch-like operation
@tarcieri On one hand, yes. On another chances that somebody will try to cache persistent client are also high:
GithubClient = HTTP.persistent("https://github.com")
I might sound like crazy, but why can't we keep @connection in thread-local variable?
Ignore me - just imagined the code for that and it's complexity will be pretty bad :((
@ixti I think it's ok to tell users of persistent connections they need to use something like the connection_pool gem if they want thread safety, and possibly document it with an example
@tarcieri I guess I agree!
FWIW, we are seeing this issue without a persistent connection, just a shared client.
HTTP::Client is not thread safe, regardless of whether persistent connections are used
@tarcieri I think we should refactor API a bit. All the options changing methods should return some sort of client builder rather than client instance, just the way it was. With one exception - #persistent which is a kind of special case. :D
@ixti absolutely. That was the original intent, however as I noted earlier in this issue it seems it's been broken since #7
I've added a link to this issue on https://github.com/httprb/http/wiki/Thread-Safety since the text is not valid anymore.
I stumbled over this today as well, after using the gem for multiple years where i never ran into that. Would it maybe make sense to at least branch in the methods of Chainable which actually perform the request, namely Chainable#get, Chainable#post, and so on?
module Chainable
def get(uri, options = {})
# request :get, uri, options
branch(default_options).request(:get, uri, options)
end
# ...
end
Then one could at least use
http = HTTP.auth(...).timeout(25).headers(...)
10.times { Thread.new { http.get('https://...') } }
without running into thread-safety issues. My workaround for now is to do:
http = HTTP.auth(...).timeout(25).headers(...)
10.times { Thread.new { http.headers({}).get('https://...') } }
But that feels a bit weird and unneccessary.