concurrent-ruby
Deadlock on wait_for_termination
Hey, I have a deadlock problem. The code looks like this:
thread_pool = Concurrent::FixedThreadPool.new(3)
res = Concurrent::ThreadLocalVar.new({})
while @queue.size > 0 do
  thread_pool.post do
    cur = @queue.pop
    build_res = start(cur)
    do_sth(res)
  end
end
thread_pool.shutdown
thread_pool.wait_for_termination
Then here is what I got:
/RHEL5_64/DEV.STD.PTHREAD/build/private/env/ruby2.3.x/ruby2.3.x/lib/ruby/gems/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/synchronization/mri_lockable_object.rb:43:in `sleep': No live threads left. Deadlock? (fatal)
from /RHEL5_64/DEV.STD.PTHREAD/build/private/env/ruby2.3.x/ruby2.3.x/lib/ruby/gems/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/synchronization/mri_lockable_object.rb:43:in `wait'
from RHEL5_64/DEV.STD.PTHREAD/build/private/env/ruby2.3.x/ruby2.3.x/lib/ruby/gems/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/synchronization/mri_lockable_object.rb:43:in `ns_wait'
from RHEL5_64/DEV.STD.PTHREAD/build/private/env/ruby2.3.x/ruby2.3.x/lib/ruby/gems/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/synchronization/abstract_lockable_object.rb:43:in `ns_wait_until'
from RHEL5_64/DEV.STD.PTHREAD/build/private/env/ruby2.3.x/ruby2.3.x/lib/ruby/gems/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/atomic/event.rb:87:in `block in wait'
from RHEL5_64/DEV.STD.PTHREAD/build/private/env/ruby2.3.x/ruby2.3.x/lib/ruby/gems/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `block in synchronize'
from RHEL5_64/DEV.STD.PTHREAD/build/private/env/ruby2.3.x/ruby2.3.x/lib/ruby/gems/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `synchronize'
from RHEL5_64/DEV.STD.PTHREAD/build/private/env/ruby2.3.x/ruby2.3.x/lib/ruby/gems/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `synchronize'
from RHEL5_64/DEV.STD.PTHREAD/build/private/env/ruby2.3.x/ruby2.3.x/lib/ruby/gems/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/atomic/event.rb:84:in `wait'
from RHEL5_64/DEV.STD.PTHREAD/build/private/env/ruby2.3.x/ruby2.3.x/lib/ruby/gems/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/executor/ruby_executor_service.rb:49:in `wait_for_termination'
from package_builder.rb:64:in `bulk_build'
from lib/test.rb:152:in `<main>'
rake aborted!
Can someone help me out?
Hi guys, here is an update: I found the cause of the deadlock. It is stupid... it was a data race on the queue. The updated code looks like this:
begin
  thread_pool = Concurrent::FixedThreadPool.new(3)
  build_res = Concurrent::ThreadLocalVar.new({})
  while @build_queue.size > 0 do
    thread_pool.post do
      if @build_queue.size > 0
        cur_pkg = @build_queue.pop
        build_res = WeakRef.new(start_build(cur_pkg))
        cur_pkg.build_req = build_res[:req_id]
        cur_pkg.status = build_res[:status]
        cur_pkg.need_build = false
      end
    end
  end
  thread_pool.shutdown
  thread_pool.wait_for_termination
rescue Exception => e
  return
end
However, I am now running into a serious memory usage problem. The weird thing is that if the task queue (@build_queue) holds more items than I have threads, say 4, memory consumption becomes huge (it keeps growing until it hits the limit while the process runs). But if the queue size is less than or equal to the number of threads (<= 3), memory usage is very stable and does not grow rapidly. What is the right way to consume the task queue?
The usual way is to do:
while job = @queue.pop
  work_on job
end
Currently there is still a data race between size and pop: another thread can empty the queue after the size check but before the pop. Could you try that first, before we investigate the memory issue?
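For illustration, here is a minimal sketch of that worker-loop pattern. To keep it runnable without the gem it uses only the standard library's Thread and Queue instead of Concurrent::FixedThreadPool, and it assumes the jobs live in a Queue that can be closed (Queue#close, available since Ruby 2.3). The `process` method and the integer jobs are hypothetical stand-ins for start_build and the packages.

```ruby
require 'thread' # Queue; already loaded on newer Rubies

# Hypothetical stand-in for the real build step (start_build in the post).
def process(job)
  job * 2
end

queue = Queue.new
(1..10).each { |n| queue << n }
# Closing the queue means "no more jobs": once it is drained, pop returns
# nil instead of blocking forever, so the worker loops below can finish.
queue.close

results = Queue.new

# A fixed number of workers, each draining the shared queue. pop is a
# single atomic operation, so there is no separate size check that another
# thread could invalidate.
workers = Array.new(3) do
  Thread.new do
    while (job = queue.pop)
      results << process(job)
    end
  end
end
workers.each(&:join)

# All workers have finished, so results can be drained from this thread.
collected = []
collected << results.pop until results.empty?
```

The same loop body could be posted to the pool once per worker (three times here) rather than once per iteration of a `while size > 0` loop. That also keeps the number of submitted tasks fixed, whereas the posting loop may keep enqueuing tasks for as long as the queue is non-empty. Note that the loop treats nil and false as "done", so jobs themselves must be truthy.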