Unexpectedly and often getting Concurrent::IllegalOperationError
Here is the request log from Rails 7.0 and Postgres 14:
Started GET "/post/1?include=children" for 127.0.0.1 at 2022-03-18 22:50:53 +0100
Processing by PostsController#show as */*
Parameters: {"id"=>"1"}
1m error: Falcon::Adapters::Rack [oid=0x57d0] [ec=0x57e4] [pid=39666] [2022-03-18 22:50:53 +0100]
| Concurrent::IllegalOperationError: Cannot release a read lock which is not held
| → /Users/emil/.rbenv/versions/3.1.1/lib/ruby/gems/3.1.0/gems/concurrent-ruby-1.1.9/lib/concurrent-ruby/concurrent/atomic/reentrant_read_write_lock.rb:244 in `release_read_lock'
Or alternatively:
2.85s error: Falcon::Adapters::Rack [oid=0x3070] [ec=0x3084] [pid=39709] [2022-03-18 22:54:43 +0100]
| NameError: uninitialized constant Rack::Mime
|
| content_type = ::Rack::Mime.mime_type(ext, nil)
They seem to appear when requests are coming in simultaneously from the same client.
Any thoughts?
Let me know if there's anything else that I need to provide.
Thanks 🤝
I will take a look.
Hey, @ioquatix , any progress on this? Is there anything I can do to improve the bug report? I would love to be able to use Falcon rather than Puma.
If you have time to make a minimal repro within just Async and Concurrent ruby that would be awesome.
The 2nd issue reported is probably the same issue as https://github.com/socketry/async/issues/157
@ioquatix, behavior does seem alike. But possibly not identical.
I don't have the debug gem in my project, the removal of which seems to be the solution for the other issue.
My understanding is the debug gem introduces autoload which can cause the described hang.
@ioquatix, it seems reasonable. Not sure what autoload it introduces because I will have to go through a lot of gems to detect that.
We need to fix the problem in Ruby.
Just chiming in here to say that I'm experiencing this also. I was getting it a lot, and then I upgraded to ruby 3.2 preview 2 and that seemed to eliminate the problem. But then I started working on a page that deals with a lot of ActiveStorage previews, and my problems came back. It may be relevant that I'm using the DiskStorage with ActiveStorage. I haven't noticed the hanging in production which uses S3 for storage.
It's not consistent, but I usually get 2-3 page loads before it hangs and I have to kill the server and restart it.
Here's my error output from falcon:
{
"time":"2022-10-16T17:30:47-05:00",
"severity":"error",
"class":"Protocol::Rack::Adapter::Rack2",
"oid":30660,
"pid":29941,
"subject":"#<Protocol::Rack::Adapter::Rack2:0x000000010682aa58>",
"message":"Cannot release a read lock which is not held",
"error":{
"kind":"Concurrent::IllegalOperationError",
"message":"Cannot release a read lock which is not held",
"stack":"/Users/will/.gem/ruby/3.2.0/gems/concurrent-ruby-1.1.10/lib/concurrent-ruby/concurrent/atomic/reentrant_read_write_lock.rb:244:in `release_read_lock'\n/Users/will/.gem/ruby/3.2.0/gems/actionview-7.0.3/lib/action_view/cache_expiry.rb:21:in `complete'\n/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/execution_wrapper.rb:37:in `before'\n/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/callbacks.rb:423:in `block in make_lambda'\n/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/callbacks.rb:199:in `block (2 levels) in halting'\n/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/callbacks.rb:687:in `block (2 levels) in default_terminator'\n/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/callbacks.rb:686:in `catch'\n/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/callbacks.rb:686:in `block in default_terminator'\n/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/callbacks.rb:200:in `block in halting'\n/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/callbacks.rb:595:in `block in invoke_before'\n/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/callbacks.rb:595:in `each'\n/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/callbacks.rb:595:in `invoke_before'\n/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/callbacks.rb:106:in `run_callbacks'\n/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/execution_wrapper.rb:143:in `complete'\n/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/execution_wrapper.rb:137:in `complete!'\n/Users/will/.gem/ruby/3.2.0/gems/actionpack-7.0.3/lib/action_dispatch/middleware/executor.rb:15:in `block in call'\n/Users/will/.gem/ruby/3.2.0/gems/rack-2.2.4/lib/rack/body_proxy.rb:29:in `close'\n/Users/will/.gem/ruby/3.2.0/gems/protocol-rack-0.1.6/lib/protocol/rack/body.rb:45:in `block in wrap'\n<internal:kernel>:90:in `tap'\n/Users/will/.gem/ruby/3.2.0/gems/protocol-rack-0.1.6/lib/protocol/rack/body.rb:44:in `wrap'\n/Users/will/.gem/ruby/3.2.0/gems/protocol-rack-0.1.6/lib/protocol/rack/response.rb:66:in `wrap'\n/Users/will/.gem/ruby/3.2.0/gems/protocol-rack-0.1.6/lib/protocol/rack/adapter/generic.rb:108:in `call'\n/Users/will/.gem/ruby/3.2.0/gems/protocol-http-0.23.10/lib/protocol/http/middleware.rb:50:in `call'\n/Users/will/.gem/ruby/3.2.0/gems/protocol-rack-0.1.6/lib/protocol/rack/rewindable.rb:75:in `call'\n/Users/will/.gem/ruby/3.2.0/gems/protocol-http-0.23.10/lib/protocol/http/middleware.rb:50:in `call'\n/Users/will/.gem/ruby/3.2.0/gems/async-http-0.59.2/lib/async/http/server.rb:67:in `block in accept'\n/Users/will/.gem/ruby/3.2.0/gems/async-http-0.59.2/lib/async/http/protocol/http1/server.rb:62:in `each'\n/Users/will/.gem/ruby/3.2.0/gems/async-http-0.59.2/lib/async/http/server.rb:56:in `accept'\n/Users/will/.gem/ruby/3.2.0/gems/async-io-1.34.0/lib/async/io/server.rb:32:in `block in accept_each'\n/Users/will/.gem/ruby/3.2.0/gems/async-io-1.34.0/lib/async/io/socket.rb:73:in `block in accept'\n/Users/will/.gem/ruby/3.2.0/gems/async-2.0.3/lib/async/task.rb:255:in `block in schedule'\n"
}
}
If there's anything else I can do to help shed some light on this, I'm game.
Thanks your report was really useful.
So, I played around with this and could reproduce the the same error/failure.
https://github.com/ruby-concurrency/concurrent-ruby/issues/962#issuecomment-1281614388
What we need to know is whether the Rails code does the same thing i.e. does it execute the acquire and release on different threads.
If anyone can reproduce the issue, can you monkeypatch the Rails file bundle open actionview and edit lib/action_view/cache_expiry.rb with:
module ActionView
class CacheExpiry
class Executor
def initialize(watcher:)
@execution_lock = Concurrent::ReentrantReadWriteLock.new
@cache_expiry = ViewModificationWatcher.new(watcher: watcher) do
clear_cache
end
end
def run
ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
@cache_expiry.execute_if_updated
Console.logger.info(self, "Acquire read lock", thread: Thread.current, backtrace: Thread.current.backtrace)
@execution_lock.acquire_read_lock
end
end
def complete(_)
Console.logger.info(self, "Release read lock", thread: Thread.current, backtrace: Thread.current.backtrace)
@execution_lock.release_read_lock
end
private
def clear_cache
Console.logger.info(self, "With write lock", thread: Thread.current, backtrace: Thread.current.backtrace)
@execution_lock.with_write_lock do
ActionView::LookupContext::DetailsKey.clear
end
end
end
@ekampp I re-read your initial issue description, and it turns out there are two issues - the first one which was recently reproduced by @willcosgrove and the 2nd one which is a bug in Ruby autoload. The autoload bug is fixed by forcing Rails to load everything at boot. The 2nd issue is solved by Ruby 3.2 or config.eager_load=true (IIRC).
After looking at the backtrace for a while and reading the Executor documentation, I think I see part of the problem.
The assumption is that the request is generated on the same thread as the response.
"/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/execution_wrapper.rb:143:in `complete'\n" +
"/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/execution_wrapper.rb:137:in `complete!'\n" +
"/Users/will/.gem/ruby/3.2.0/gems/actionpack-7.0.3/lib/action_dispatch/middleware/executor.rb:15:in `block in call'\n" +
"/Users/will/.gem/ruby/3.2.0/gems/rack-2.2.4/lib/rack/body_proxy.rb:29:in `close'\n" +
"/Users/will/.gem/ruby/3.2.0/gems/protocol-rack-0.1.6/lib/protocol/rack/body.rb:45:in `block in wrap'\n" +
In order for this to have failed, some how the BodyProxy#close would need to be invoked in a way that causes the thread local storage to have been invalidated or some how be on the wrong thread, which is really odd because Falcon runs all fibers on the same thread. So it would have to be multiple fibers locking and releasing, which some how causes problems. Maybe the same thread calling lock and unlock is causing issues...
Okay, looking at the code for the CacheExpiry, there looks to be a background thread polling the watcher. Following this, I could make a synthetic repro:
#!/usr/bin/env ruby
require 'concurrent'
require 'async'
lock = Concurrent::ReentrantReadWriteLock.new
Q = 0.01
watcher = Thread.new do
loop do
lock.with_write_lock do
sleep Q
end
sleep Q
end
end
Async do |task|
2.times do |i|
task.async do
loop do
lock.with_read_lock do
sleep Q
end
sleep Q
end
end
end
end
It seems like the problem is in concurrent-ruby now.
Okay, I was able to propose a fix: https://github.com/ruby-concurrency/concurrent-ruby/issues/962#issuecomment-1281676806
Thanks for digging into this! I applied your monkey patch listed here https://github.com/ruby-concurrency/concurrent-ruby/issues/962#issuecomment-1281676806 and I got some new behavior.
I still get errors, but the server doesn't hang. Before, when this would happen, the server would completely hang up, and not serve any other requests. This still errored, but the server would still serve subsequent requests. I refreshed several times, still with sporadic errors, and then it eventually settled into a steady state of no errors.
I recorded a screencast to try and capture the behavior. Pay no attention to the contents of my silly little demo app 😇 https://wco.sg/v6LpIc
You can see that it starts out loading with sporadic errors (which show up as images not loading, the images are served through the app from active storage). It eventually gets to a point where I cannot get it to error. Then I kill and restart the server and the behavior starts over.
I also experienced the hanging behaviour in my test case, after the concurrent ruby lock fails, it stops working for everything else too.
I'm just fixing another bug, once I'm done, I'll review your updates in more detail! Thanks.
Just chiming back in to report that I'm unfortunately still experiencing this even after updating concurrent-ruby to 1.2.0.
Stacktrace:
/Users/will/.gem/ruby/3.2.0/gems/concurrent-ruby-1.2.0/lib/concurrent-ruby/concurrent/atomic/reentrant_read_write_lock.rb:246:in `release_read_lock'
/Users/will/.gem/ruby/3.2.0/gems/actionview-7.0.3/lib/action_view/cache_expiry.rb:21:in `complete'
/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/execution_wrapper.rb:37:in `before'
/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/callbacks.rb:423:in `block in make_lambda'
/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/callbacks.rb:199:in `block (2 levels) in halting'
/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/callbacks.rb:687:in `block (2 levels) in default_terminator'
/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/callbacks.rb:686:in `catch'
/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/callbacks.rb:686:in `block in default_terminator'
/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/callbacks.rb:200:in `block in halting'
/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/callbacks.rb:595:in `block in invoke_before'
/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/callbacks.rb:595:in `each'
/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/callbacks.rb:595:in `invoke_before'
/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/callbacks.rb:106:in `run_callbacks'
/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/execution_wrapper.rb:143:in `complete'
/Users/will/.gem/ruby/3.2.0/gems/activesupport-7.0.3/lib/active_support/execution_wrapper.rb:137:in `complete!'
/Users/will/.gem/ruby/3.2.0/gems/actionpack-7.0.3/lib/action_dispatch/middleware/executor.rb:15:in `block in call'
/Users/will/.gem/ruby/3.2.0/gems/rack-2.2.6.2/lib/rack/body_proxy.rb:29:in `close'
/Users/will/.gem/ruby/3.2.0/gems/protocol-rack-0.2.4/lib/protocol/rack/body.rb:29:in `block in wrap'
<internal:kernel>:90:in `tap'
/Users/will/.gem/ruby/3.2.0/gems/protocol-rack-0.2.4/lib/protocol/rack/body.rb:28:in `wrap'
/Users/will/.gem/ruby/3.2.0/gems/protocol-rack-0.2.4/lib/protocol/rack/response.rb:53:in `wrap'
/Users/will/.gem/ruby/3.2.0/gems/protocol-rack-0.2.4/lib/protocol/rack/adapter/rack2.rb:91:in `call'
/Users/will/.gem/ruby/3.2.0/gems/protocol-http-0.24.0/lib/protocol/http/middleware.rb:33:in `call'
/Users/will/.gem/ruby/3.2.0/gems/protocol-rack-0.2.4/lib/protocol/rack/rewindable.rb:58:in `call'
/Users/will/.gem/ruby/3.2.0/gems/protocol-http-0.24.0/lib/protocol/http/middleware.rb:33:in `call'
/Users/will/.gem/ruby/3.2.0/gems/async-http-0.60.1/lib/async/http/server.rb:67:in `block in accept'
/Users/will/.gem/ruby/3.2.0/gems/async-http-0.60.1/lib/async/http/protocol/http1/server.rb:62:in `each'
/Users/will/.gem/ruby/3.2.0/gems/async-http-0.60.1/lib/async/http/server.rb:56:in `accept'
/Users/will/.gem/ruby/3.2.0/gems/async-io-1.34.3/lib/async/io/server.rb:32:in `block in accept_each'
/Users/will/.gem/ruby/3.2.0/gems/async-io-1.34.3/lib/async/io/socket.rb:73:in `block in accept'
/Users/will/.gem/ruby/3.2.0/gems/async-2.3.1/lib/async/task.rb:107:in `block in run'
/Users/will/.gem/ruby/3.2.0/gems/async-2.3.1/lib/async/task.rb:248:in `block in schedule'
If there's any other information I can provide to help debug this, let me know.
Hmmm, thanks for this, I'll take a look.
I can't get the above test script to fail with v1.2.0 of concurrent-ruby, so it must be something else.
Ah, great, I was able to reproduce this with the default Rails website rails new ..., changing puma to falcon and accessing the home page.
I did the monkey thing and patched it:
class Executor
def initialize(watcher:)
@execution_lock = Concurrent::ReentrantReadWriteLock.new
@cache_expiry = ViewModificationWatcher.new(watcher: watcher) do
clear_cache
end
end
def run
ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
@cache_expiry.execute_if_updated
Console.logger.info(self, "run:acquire_read_lock")
@execution_lock.acquire_read_lock
end
end
def complete(_)
Console.logger.info(self, "complete:release_read_lock")
@execution_lock.release_read_lock
end
private
def clear_cache
Console.logger.info(self, "clear_cache:with_write_lock")
@execution_lock.with_write_lock do
ActionView::LookupContext::DetailsKey.clear
end
end
end
2.41s info: ActionView::CacheExpiry::Executor [oid=0x3728] [ec=0x373c] [pid=12809] [2023-02-05 21:33:46 +1300]
| run:acquire_read_lock
2.43s info: ActionView::CacheExpiry::Executor [oid=0x3728] [ec=0x53c0] [pid=12809] [2023-02-05 21:33:46 +1300]
| complete:release_read_lock
We can see the ec is different between the acquire and release. That's a problem.
It looks like something we might be able to fix in falcon/rack-protocol. It looks like falcon isn't calling close on the same fiber that invoked app.call(env) which messes up locking that expects to hold a lock for the entire response being written.
Acquire
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/concurrency/share_lock.rb:187:in `yield_shares'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/dependencies/interlock.rb:41:in `permit_concurrent_loads'
/home/samuel/.gem/ruby/3.2.0/gems/actionview-7.0.4.2/lib/action_view/cache_expiry.rb:14:in `run'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/execution_wrapper.rb:29:in `before'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/callbacks.rb:423:in `block in make_lambda'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/callbacks.rb:199:in `block (2 levels) in halting'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/callbacks.rb:687:in `block (2 levels) in default_terminator'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/callbacks.rb:686:in `catch'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/callbacks.rb:686:in `block in default_terminator'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/callbacks.rb:200:in `block in halting'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/callbacks.rb:595:in `block in invoke_before'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/callbacks.rb:595:in `each'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/callbacks.rb:595:in `invoke_before'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/callbacks.rb:106:in `run_callbacks'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/execution_wrapper.rb:129:in `run'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/execution_wrapper.rb:125:in `run!'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/execution_wrapper.rb:78:in `block in run!'
\u003cinternal:kernel\u003e:90:in `tap'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/execution_wrapper.rb:75:in `run!'
/home/samuel/.gem/ruby/3.2.0/gems/actionpack-7.0.4.2/lib/action_dispatch/middleware/executor.rb:12:in `call'
/home/samuel/.gem/ruby/3.2.0/gems/actionpack-7.0.4.2/lib/action_dispatch/middleware/static.rb:23:in `call'
/home/samuel/.gem/ruby/3.2.0/gems/rack-2.2.6.2/lib/rack/sendfile.rb:110:in `call'
/home/samuel/.gem/ruby/3.2.0/gems/actionpack-7.0.4.2/lib/action_dispatch/middleware/host_authorization.rb:137:in `call'
/home/samuel/.gem/ruby/3.2.0/gems/railties-7.0.4.2/lib/rails/engine.rb:530:in `call'
/home/samuel/.gem/ruby/3.2.0/gems/protocol-rack-0.2.4/lib/protocol/rack/adapter/rack2.rb:82:in `call'
/home/samuel/.gem/ruby/3.2.0/gems/protocol-http-0.24.0/lib/protocol/http/middleware.rb:33:in `call'
/home/samuel/.gem/ruby/3.2.0/gems/protocol-rack-0.2.4/lib/protocol/rack/rewindable.rb:58:in `call'
/home/samuel/.gem/ruby/3.2.0/gems/protocol-http-0.24.0/lib/protocol/http/middleware.rb:33:in `call'
/home/samuel/.gem/ruby/3.2.0/gems/protocol-http-0.24.0/lib/protocol/http/content_encoding.rb:29:in `call'
/home/samuel/.gem/ruby/3.2.0/gems/protocol-http-0.24.0/lib/protocol/http/middleware.rb:33:in `call'
/home/samuel/.gem/ruby/3.2.0/gems/async-http-0.60.1/lib/async/http/server.rb:67:in `block in accept'
/home/samuel/.gem/ruby/3.2.0/gems/async-http-0.60.1/lib/async/http/protocol/http2/server.rb:74:in `block in each'
/home/samuel/.gem/ruby/3.2.0/gems/async-2.3.1/lib/async/task.rb:107:in `block in run'
/home/samuel/.gem/ruby/3.2.0/gems/async-2.3.1/lib/async/task.rb:248:in `block in schedule'
Release
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/execution_wrapper.rb:37:in `before'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/callbacks.rb:423:in `block in make_lambda'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/callbacks.rb:199:in `block (2 levels) in halting'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/callbacks.rb:687:in `block (2 levels) in default_terminator'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/callbacks.rb:686:in `catch'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/callbacks.rb:686:in `block in default_terminator'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/callbacks.rb:200:in `block in halting'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/callbacks.rb:595:in `block in invoke_before'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/callbacks.rb:595:in `each'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/callbacks.rb:595:in `invoke_before'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/callbacks.rb:106:in `run_callbacks'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/execution_wrapper.rb:143:in `complete'
/home/samuel/.gem/ruby/3.2.0/gems/activesupport-7.0.4.2/lib/active_support/execution_wrapper.rb:137:in `complete!'
/home/samuel/.gem/ruby/3.2.0/gems/actionpack-7.0.4.2/lib/action_dispatch/middleware/executor.rb:15:in `block in call'
/home/samuel/.gem/ruby/3.2.0/gems/rack-2.2.6.2/lib/rack/body_proxy.rb:29:in `close'
/home/samuel/.gem/ruby/3.2.0/gems/protocol-rack-0.2.4/lib/protocol/rack/body/enumerable.rb:58:in `close'
/home/samuel/.gem/ruby/3.2.0/gems/protocol-http-0.24.0/lib/protocol/http/body/wrapper.rb:32:in `close'
/home/samuel/.gem/ruby/3.2.0/gems/protocol-http-0.24.0/lib/protocol/http/body/deflate.rb:36:in `close'
/home/samuel/.gem/ruby/3.2.0/gems/async-http-0.60.1/lib/async/http/protocol/http2/output.rb:109:in `passthrough'
/home/samuel/.gem/ruby/3.2.0/gems/async-2.3.1/lib/async/task.rb:107:in `block in run'
/home/samuel/.gem/ruby/3.2.0/gems/async-2.3.1/lib/async/task.rb:248:in `block in schedule'
Okay the root cause is not just that the concurrent-ruby lock wasn't working, but that it's too specific. In the case of Rails, it's not so much a lock as it is a barrier to prevent the cache being cleared while there are requests in flight. I made a PR to rails to simplify this to use a counter, which should in theory fix the issue. Local tests look okay to me, but maybe you can monkey patch it into your stack and see what happens?
@ioquatix it looks like that might have fixed it for me! I haven't been able to reproduce it since patching in your Rails change, and I was previously able to get it to break in under a minute of just refreshing my app.
I did run into a new issue though 😄 Looks like something may be off with ActiveSupport::CurrentAttributes + falcon. Ah ok no, sorry I'm writing this comment and researching at the same time. Looks like I should be configuring rails to use fiber isolation level when using Falcon, and I hadn't seen that specified before now. I'm sure that's my issue here.
# config/application.rb
# https://guides.rubyonrails.org/configuring.html#config-active-support-isolation-level
config.active_support.isolation_level = :fiber
Thanks so much for digging into this! 🙏
Unfortunately I realised the current implementation is not re-entrant for the write lock mechanism. Back to the drawing board.
Okay, I managed to figure out how to implement the read_lock -> write_lock upgrade, the latest version https://github.com/rails/rails/pull/47257 should be the correct solution.
Okay, apparently the correct way to fix this is here: https://github.com/rails/rails/pull/47347
Hopefully that's sufficient, I'll test it after it's merged.
Okay, I've confirmed this is fixed by the above PR which is now merged. Please open a new issue if you see the problem again.
@ioquatix, thank you for looking into this, and persistently working to fix it! 🤝