http icon indicating copy to clipboard operation
http copied to clipboard

Request thread safety issue

Open paul opened this issue 6 years ago • 27 comments

We're getting weird connection issues in our exception tracker:

HTTP::ConnectionError: error reading from socket: stream closed in another thread
 per_operation.rb  79 wait_readable(...)
[GEM_ROOT]/gems/http-4.1.1/lib/http/timeout/per_operation.rb:79:in `wait_readable'
Caused by:
 IOError: stream closed in another thread
 per_operation.rb  79 wait_readable(...)
[GEM_ROOT]/gems/http-4.1.1/lib/http/timeout/per_operation.rb:79:in `wait_readable'

The logs for that single job don't show anything interesting, but if you look at the logs for all the workers in that process:

[2019-07-12T15:40:15.589244 #4]  INFO -- : [03e56584-c7a1-45ee-9ca4-c86334cd7e75] > POST https://onesignal.com/api/v1/notifications 
[2019-07-12T15:40:15.886281 #4]  INFO -- : [1bf0df80-9dde-4724-a377-5ed9271284de] > POST https://onesignal.com/api/v1/notifications 
[2019-07-12T15:40:15.887557 #4]  INFO -- : [03e56584-c7a1-45ee-9ca4-c86334cd7e75] HTTP::ConnectionError: error reading from socket: stream closed in another thread 
[2019-07-12T15:40:15.889464 #4] ERROR -- : [03e56584-c7a1-45ee-9ca4-c86334cd7e75] Error performing WisperActiveJobBroadcaster::Job (Job ID: 03e56584-c7a1-45ee-9ca4-c86334cd7e75) from Shoryuken(tesseract-production-background) in 691.77ms: HTTP::ConnectionError (error reading from socket: stream closed in another thread): 
[2019-07-12T15:40:16.268402 #4]  INFO -- : [1bf0df80-9dde-4724-a377-5ed9271284de] < 200 OK 

Job 03e5 started posting to the endpoint, and 300ms later another thread (job 1bf0) posted to the same endpoint with a different body. This somehow interrupted the first request in progress, raising the IOError in the first 1.5ms after the 2nd request started. The 2nd request then finished normally.

Here's how we're making that request:

# global default HTTP client
  MyAppHTTP =
    ::HTTP
    .timeout(connect: 2.seconds, read: 10.seconds, write: 10.seconds)
    .use(logging: { logger: Rails.logger })

# class-specific HTTP client

  def http
    @http ||= MyAppHTTP.
              headers(headers).
              timeout(write: 30, connect: 10, read: 40).
              auth("Basic \"#{api_key}\"")
  end

# called like this:

      http.post(url("/notifications"), json: payload.to_json(:web)).status.success?

We're using the shoryuken gem as our job runner, which uses concurrent-ruby for the worker thread pool. I'm not clear how making a request in one worker thread could interfere with another. The wiki says "Thread safety comes into play when you Make an HTTP request", but doesn't offer any details, or instruction how to mitigate it.

This happens more often than I would have expected, too:

image

paul avatar Jul 12 '19 17:07 paul

Reduced to a simple test case:

require "thread"
require "http"

@http = HTTP.headers("X-Test" => "1")

5.times.map do
  Thread.new { r = @http.post("https://httpbin.org/post"); puts r.status}
end.map(&:join)

This breaks in a similar way:

/home/rando/.rubies/ruby-2.6.3/lib/ruby/2.6.0/openssl/buffering.rb:125:in `sysread': stream closed in another thread (IOError)
	8: from test.rb:11:in `block (2 levels) in <main>'
	7: from /home/rando/.gem/ruby/2.6.3/gems/http-4.1.1/lib/http/chainable.rb:27:in `post'
	6: from /home/rando/.gem/ruby/2.6.3/gems/http-4.1.1/lib/http/client.rb:31:in `request'
	5: from /home/rando/.gem/ruby/2.6.3/gems/http-4.1.1/lib/http/client.rb:75:in `perform'
	4: from /home/rando/.gem/ruby/2.6.3/gems/http-4.1.1/lib/http/connection.rb:103:in `read_headers!'
	3: from /home/rando/.gem/ruby/2.6.3/gems/http-4.1.1/lib/http/connection.rb:212:in `read_more'
	2: from /home/rando/.gem/ruby/2.6.3/gems/http-4.1.1/lib/http/timeout/null.rb:45:in `readpartial'
	1: from /home/rando/.rubies/ruby-2.6.3/lib/ruby/2.6.0/openssl/buffering.rb:125:in `readpartial'
/home/rando/.rubies/ruby-2.6.3/lib/ruby/2.6.0/openssl/buffering.rb:125:in `sysread': error reading from socket: stream closed in another thread (HTTP::ConnectionError)

Full console output

While modifying the thread block to use the HTTP constant directly works just fine.

5.times.map do
  Thread.new { r = HTTP.post("https://httpbin.org/post"); puts r.status}
end.map(&:join)

I would have expected to be able to configure the HTTP client once and be able to re-use it amongst threads, the same way calling the HTTP class methods work. Do I need to set up a concurrent-ruby thread pool of these configured clients?

paul avatar Jul 12 '19 18:07 paul

Possibly related to https://github.com/httprb/http/issues/371

tarcieri avatar Jul 12 '19 18:07 tarcieri

To try to answer this:

I would have expected to be able to configure the HTTP client once and be able to re-use it amongst threads, the same way calling the HTTP class methods work. Do I need to set up a concurrent-ruby thread pool of these configured clients?

The original goal of the HTTP::Options API builder was for it to be immutable and thread safe, and have additional chained options make copies of the original state rather than mutating. Re #371, it seems like there are places this isn't happening and the state is being shared? I'm not entirely sure.

HTTP::Client isn't thread safe, and stores multiple different kinds of mutable state. Persistent connections are likewise not thread safe, and if you'd like to use them in a multithreaded context I'd suggest using something like the connection_pool gem.

tarcieri avatar Jul 22 '19 14:07 tarcieri

@tarcieri So do you think the HTTP gem should allow the simple test case in my comment to work or not? (The one where I re-use the same @http ivar in multiple threads to make simultaneous requests, no persistent connections involved.)

I'll start using the ConnectionPool for this, I just found the behavior surprising.

paul avatar Jul 22 '19 21:07 paul

@paul if I’m reading your example right, @http should be an instance of HTTP::Options, and therefore your example should work. I’m curious if you think #371 is related/the same problem

tarcieri avatar Jul 22 '19 22:07 tarcieri

It's entirely possible. Maybe Ruby in 2015 couldn't detect this problem so just returned an empty string, and modern Ruby IO notices somehow and is able to raise the IOError.

paul avatar Jul 22 '19 23:07 paul

Re-reviewing #371 again, I think I was thinking of a different issue...

tarcieri avatar Jul 22 '19 23:07 tarcieri

@paul so the real question is...

HTTP::Options is (or was at least originally intended to be) thread-safe by virtue of being immutable.

Are you encountering thread-safety problems with it? (i.e. is it your @http)

tarcieri avatar Jul 23 '19 00:07 tarcieri

FWIW HTTP::Client itself isn't thread-safe:

https://github.com/httprb/http/blob/6240672bc84b23339fc9a9878040fcb45db78fb5/lib/http/client.rb#L71-L76

ixti avatar Jul 24 '19 19:07 ixti

In order to make thread-safe concurrent calls with same HTTP::Client instance, you will need to use mutex, something like this:

@mutex = Mutex.new

# ...

res = @mutex.synchronize { http.get(...).flush }

In other words you need to guarantee that there's no concurrent request between current request and Response#flush call.

ixti avatar Jul 24 '19 19:07 ixti

Yes, so HTTP::Options was originally intended to be immutable/thread safe, and HTTP::Client is not since it owns the (mutable) connection state, however the idea was you could use HTTP::Options as a request builder and then pass the result to different threads and instantiate thread-specific HTTP::Client instances there by making requests.

If I'm understanding correctly though, @paul is trying to do the first thing, which is share HTTP::Options between threads, and then instantiate new clients in each thread... and having thread safety issues with that.

@paul can you confirm that's the case?

tarcieri avatar Jul 24 '19 19:07 tarcieri

Mmm, but the snippet shared above is absolutely not re-use of options, but re-use of client:

@http = HTTP.headers("X-Test" => "1")

5.times.map do
  Thread.new { r = @http.post("https://httpbin.org/post"); puts r.status}
end.map(&:join)

ixti avatar Jul 24 '19 20:07 ixti

@ixti oh, now I see...

https://github.com/httprb/http/blob/master/lib/http/chainable.rb#L250

I guess that changed way back in #7

tarcieri avatar Jul 25 '19 00:07 tarcieri

@paul this seems like an unfortunate complication of there not being a proper "builder" type for the client, allowing options which don't require the creation of e.g. sockets to be performed the way you want.

It seems like retrofitting an API like that is possible, by splitting up the the HTTP::Chainable#branch method to return HTTP::Options in cases where it doesn't need to eagerly instantiate a client.

tarcieri avatar Jul 31 '19 15:07 tarcieri

The only thing that might not work correctly with that idea is keep-alive (persistent) clients...

ixti avatar Aug 01 '19 11:08 ixti

@ixti those could still return HTTP::Client, but instantiating them might need a special branch-like operation

tarcieri avatar Aug 01 '19 13:08 tarcieri

@tarcieri On one hand, yes. On another chances that somebody will try to cache persistent client are also high:

GithubClient = HTTP.persistent("https://github.com")

ixti avatar Aug 01 '19 16:08 ixti

I might sound like crazy, but why can't we keep @connection in thread-local variable?

ixti avatar Aug 01 '19 16:08 ixti

Ignore me - just imagined the code for that and it's complexity will be pretty bad :((

ixti avatar Aug 01 '19 16:08 ixti

@ixti I think it's ok to tell users of persistent connections they need to use something like the connection_pool gem if they want thread safety, and possibly document it with an example

tarcieri avatar Aug 01 '19 16:08 tarcieri

@tarcieri I guess I agree!

ixti avatar Aug 01 '19 20:08 ixti

FWIW, we are seeing this issue without a persistent connection, just a shared client.

ianks avatar Feb 27 '20 04:02 ianks

HTTP::Client is not thread safe, regardless of whether persistent connections are used

tarcieri avatar Feb 27 '20 04:02 tarcieri

@tarcieri I think we should refactor API a bit. All the options changing methods should return some sort of client builder rather than client instance, just the way it was. With one exception - #persistent which is a kind of special case. :D

ixti avatar Feb 27 '20 19:02 ixti

@ixti absolutely. That was the original intent, however as I noted earlier in this issue it seems it's been broken since #7

tarcieri avatar Feb 27 '20 21:02 tarcieri

I've added a link to this issue on https://github.com/httprb/http/wiki/Thread-Safety since the text is not valid anymore.

lunks avatar Mar 18 '22 14:03 lunks

I stumbled over this today as well, after using the gem for multiple years where i never ran into that. Would it maybe make sense to at least branch in the methods of Chainable which actually perform the request, namely Chainable#get, Chainable#post, and so on?

module Chainable
    def get(uri, options = {})
      # request :get, uri, options

       branch(default_options).request(:get, uri, options)
    end

    # ...
end

Then one could at least use

http = HTTP.auth(...).timeout(25).headers(...)

10.times { Thread.new { http.get('https://...') } }

without running into thread-safety issues. My workaround for now is to do:

http = HTTP.auth(...).timeout(25).headers(...)

10.times { Thread.new { http.headers({}).get('https://...') } }

But that feels a bit weird and unneccessary.

mrkamel avatar Aug 19 '22 07:08 mrkamel