async
async copied to clipboard
Tasks signaling Conditions leave suspended Fibers behind
I've learned that left-over, suspended fibers can retain ActiveRecord connections, causing the connection pool to run out of available connections (example: https://github.com/rmosolgo/graphql-ruby/issues/4739#issuecomment-1866671563).
So, I added a test to my project that makes sure that all Fibers created during execution end up not alive?. It currently doesn't pass when I use async as the "backend" for GraphQL::Dataloader. Here's the test: https://github.com/rmosolgo/graphql-ruby/blob/e60d22b138d91c0193e2b24eeb7b20b60648698f/spec/graphql/dataloader_spec.rb#L1020-L1026
In that test, I have to pause GC in order to catch the suspended fiber, because otherwise it will get GC'ed, since the task assigns @fiber = nil.
But, in Rails, that suspended Fiber can be the "owner" of an ActiveRecord connection, and since it's suspended, ActiveRecord thinks that it might still use the connection -- and therefore doesn't reuse the connection 😖 .
Here's a small replication (the condition.signal seems to be the special sauce):
require "bundler/inline"
gemfile do
gem "async", "~>2.6"
gem "sqlite3", "~>1.7"
gem "rails", "~>7.1"
end
require "active_record"
ActiveSupport::IsolatedExecutionState.isolation_level = :fiber
ActiveRecord::Base.establish_connection(adapter: "sqlite3", database: "__example.db")
ActiveRecord::Schema.define do
self.verbose = false
create_table :things, force: true do |t|
t.column :name, :string
end
end
class Thing < ActiveRecord::Base
end
Thing.create!(name: "Pogo Stick")
pp ActiveRecord::Base.connection_pool.stat
def run_process(number)
puts "#{number} - begin process"
Sync do |root_task|
condition = Async::Condition.new
t1 = root_task.async do |inner_task|
condition.wait
end
root_task.async do |inner_task|
Thing.all.to_a
condition.signal
end.wait
t1.wait
end
puts "#{number} - end process"
end
run_process(1)
run_process(2)
run_process(3)
# For clarity's sake, remove connections which ActiveRecord knows aren't in use:
ActiveRecord::Base.connection_pool.reap
puts "\n\nSome AR connections are in use, even though the process exited:"
pp ActiveRecord::Base.connection_pool.connections.map { |c|
{ in_use: !!c.in_use?, owner_alive?: c.owner&.alive?, owner: c.owner, }
}
# For exmaple:
# [{:in_use=>true, :owner_alive?=>true, :owner=>#<Fiber:0x0000000107a1f1f0 (resumed)>},
# {:in_use=>true, :owner_alive?=>true, :owner=>#<Fiber:0x000000010b845060 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.6.5/lib/async/task.rb:326 (suspended)>},
# {:in_use=>true, :owner_alive?=>true, :owner=>#<Fiber:0x000000010b8e4020 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.6.5/lib/async/task.rb:326 (suspended)>},
# {:in_use=>true, :owner_alive?=>true, :owner=>#<Fiber:0x000000010b80ded0 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.6.5/lib/async/task.rb:326 (suspended)>}]
run_process(4)
# This process will fail to acquire a connection:
run_process(5)
run_process(6)
It prints like this:
$ ruby async_issue.rb
/Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/io-event-1.3.3/lib/io/event/support.rb:24: warning: IO::Buffer is experimental and both the Ruby and C interface may change in the future!
{:size=>5, :connections=>1, :busy=>1, :dead=>0, :idle=>0, :waiting=>0, :checkout_timeout=>5.0}
1 - begin process
1 - end process
2 - begin process
2 - end process
3 - begin process
3 - end process
Some AR connections are in use, even though the process exited:
[{:in_use=>true, :owner_alive?=>true, :owner=>#<Fiber:0x0000000101b2f2a8 (resumed)>},
{:in_use=>true, :owner_alive?=>true, :owner=>#<Fiber:0x0000000105d11ce8 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.6.5/lib/async/task.rb:326 (suspended)>},
{:in_use=>true, :owner_alive?=>true, :owner=>#<Fiber:0x0000000105f18d98 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.6.5/lib/async/task.rb:326 (suspended)>},
{:in_use=>true, :owner_alive?=>true, :owner=>#<Fiber:0x0000000105f13b40 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.6.5/lib/async/task.rb:326 (suspended)>}]
4 - begin process
4 - end process
5 - begin process
(It never exits and for some reason it also doesn't raise with a timeout error 🤷 )
I think it would be better if this Fiber was terminated, so that ActiveRecord could reap the connection it was using. What do you think?
Thanks for the report, I agree we should do something, but exactly what I'm not sure yet.
Thanks for taking a look. I also just saw your answer over at https://bugs.ruby-lang.org/issues/20081 (thanks for that, too 😅 ). That ensure suggestion makes this script work, too:
+ ensure
condition.signal
I'm going to see if I can integrate that into my GraphQL-Ruby code in the meantime 🍻 !