threadz
Document common pitfalls regarding passing data to threads
Sample program:
pool = Threadz::ThreadPool.new.new_batch
foo = [1,2,3,4,5,6,7,8,9,10]
while bar = foo.pop
  pool << lambda { puts "bar = #{bar}" }
end
pool.wait_until_done
puts "DONE!"
Expected output:
bar = 10
bar = 9
bar = 8
bar = 7
bar = 6
bar = 5
bar = 4
bar = 3
bar = 2
bar = 1
DONE!
Actual output:
bar = 10
bar = 8
bar = 8
bar = 6
bar = 6
bar = 4
bar = 4
bar = 2
bar = 2
bar =
DONE!
Interestingly, the following code does produce the expected output above:
pool = Threadz::ThreadPool.new.new_batch
foo = [1,2,3,4,5,6,7,8,9,10]
#while bar = foo.pop
foo.reverse.each do |bar|
  pool << lambda { puts "bar = #{bar}" }
end
pool.wait_until_done
puts "DONE!"
pool = Threadz::ThreadPool.new.new_batch
foo = [1,2,3,4,5,6,7,8,9,10]
while bar = foo.pop
#foo.reverse.each do |bar|
  puts "queueing bar: #{bar}"
  pool << lambda { puts "bar = #{bar}" }
end
pool.wait_until_done
puts "DONE!"
Output:
queueing bar: 10
bar = 10
queueing bar: 9
queueing bar: 8
bar = 8
bar = 8
queueing bar: 7
queueing bar: 6
bar = 6
bar = 6
queueing bar: 5
queueing bar: 4
bar = 4
bar = 4
queueing bar: 3
queueing bar: 2
bar = 2
bar = 2
queueing bar: 1
bar =
DONE!
To make the problem even more visible, one can pass :latent => true to the batch:
pool = Threadz::ThreadPool.new.new_batch(:latent => true)
foo = [1,2,3,4,5,6,7,8,9,10]
while bar = foo.pop
#foo.reverse.each do |bar|
  puts "queueing bar: #{bar}"
  pool << Proc.new { puts "bar = #{bar}" }
end
pool.start
pool.wait_until_done
puts "DONE!"
Output:
queueing bar: 10
queueing bar: 9
queueing bar: 8
queueing bar: 7
queueing bar: 6
queueing bar: 5
queueing bar: 4
queueing bar: 3
queueing bar: 2
queueing bar: 1
bar =
bar =
bar =
bar =
bar =
bar =
bar =
bar =
bar =
bar =
DONE!
It would appear that with the while/Array#pop loop, the value of bar is only looked up when the lambda finally runs, not when it is queued? That seems weird.
Hmmmmmmmm:
foo = [1,2,3,4,5,6,7,8,9,10]
baz = []
while bar2 = foo.pop
  baz << Proc.new { puts "bar2 from Proc is #{bar2}" }
  baz << lambda { puts "bar2 from lambda is #{bar2}" }
  baz << proc { puts "bar2 from proc is #{bar2}" }
end
baz.each {|x| x.call}
Output:
bar2 from Proc is
bar2 from lambda is
bar2 from proc is
bar2 from Proc is
bar2 from lambda is
bar2 from proc is
bar2 from Proc is
bar2 from lambda is
bar2 from proc is
bar2 from Proc is
bar2 from lambda is
bar2 from proc is
bar2 from Proc is
bar2 from lambda is
bar2 from proc is
bar2 from Proc is
bar2 from lambda is
bar2 from proc is
bar2 from Proc is
bar2 from lambda is
bar2 from proc is
bar2 from Proc is
bar2 from lambda is
bar2 from proc is
bar2 from Proc is
bar2 from lambda is
bar2 from proc is
bar2 from Proc is
bar2 from lambda is
bar2 from proc is
I think we've just discovered a limitation (or rather, behavior) of the way closures are evaluated. It's the same in Ruby 1.9. So...I guess this isn't actually a bug. I'll add some documentation to the main README, or to a wiki page (say, Common Pitfalls), to document this.
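To make the closure behavior concrete, here is a minimal sketch in plain Ruby (no Threadz involved). The names counter, reader, and late_value are made up for illustration. It suggests that a lambda holds on to the variable itself rather than a snapshot of its value, so a later reassignment is visible when the lambda is finally called.

```ruby
# A closure captures the variable, not the value it had at creation time.
counter = 1
reader = lambda { counter }   # closes over the local variable `counter`
counter = 2                   # reassign after the lambda has been created
late_value = reader.call      # the lambda reads the current value
puts "reader sees #{late_value}"
```

The same thing would happen with Proc.new or proc, which matches the three-way test above.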
Bizarre!!!
I've circumvented the issue by calling array.each rather than array.pop, but still....it seems like some very bizarre behavior....
Especially since this works fine:
foo = [1,2,3,4,5]
while bar = foo.pop
  Thread.new do
    puts "bar = #{bar}"
  end
end
Ah, but even that example demonstrates an interesting behavior -- when you create a new thread like that, it runs immediately. If you insert a sleep 0.1 before it prints (and then join on the thread later), the "bar" value is again absent!
foo = [1,2,3,4,5,6,7,8,9,10]
threads = []
while bar2 = foo.pop
  threads << Thread.new do
    sleep 0.1
    puts "bars from thread is #{bar2}"
  end
end
threads.each { |t| t.join }
Output:
bars from thread is
bars from thread is
bars from thread is
bars from thread is
bars from thread is
bars from thread is
bars from thread is
bars from thread is
bars from thread is
bars from thread is
Neat huh :P
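Dude, I just figured it out. It's because of the while loop itself. while is a keyword, not a method that takes a block, so its body doesn't get its own scope/binding (there's no do...end or {...} block involved). So all of the threads are getting the exact same binding, the one where bar2 is reassigned at the start of each iteration! By the time the delayed code runs, the loop has already finished because foo.pop returned nil, so bar2 is nil and prints as nothing. Since Array#each takes a block, each call of the block has its own binding and thus avoids this problem.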
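Here is a rough side-by-side sketch of that explanation, using plain lambdas instead of threads so the result is deterministic. The names popped, element, from_while, and from_each are made up for this example. The lambdas are only called after both loops have finished, which mimics work that runs late.

```ruby
# while: the body is not a block, so every lambda shares the one `popped`.
items = [1, 2, 3]
from_while = []
while popped = items.pop
  from_while << lambda { popped }
end
# The loop ended because items.pop returned nil, so `popped` is now nil.

# each: the block parameter `element` is fresh for every call of the block.
from_each = []
[3, 2, 1].each do |element|
  from_each << lambda { element }
end

while_values = from_while.map { |l| l.call }
each_values  = from_each.map { |l| l.call }

puts "while: #{while_values.inspect}"   # prints: while: [nil, nil, nil]
puts "each:  #{each_values.inspect}"    # prints: each:  [3, 2, 1]
```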
Very strange... were I the more proactive type, I'd send a message to the Ruby mailing list... but chances are I won't bother! :)
D'oh!
That makes perfect sense!
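For anyone who wants to keep the while loop, one possible workaround (a sketch, not something Threadz itself provides) is to hand the current value to Thread.new as an argument. Thread.new passes its arguments on to the block, so the block parameter, here called value as an illustrative name, belongs to that one thread. The results array is also just for this example.

```ruby
foo = [1, 2, 3, 4, 5]
threads = []
while bar2 = foo.pop
  # bar2 is evaluated right now and handed to the block as `value`,
  # so later reassignments of bar2 do not affect this thread.
  threads << Thread.new(bar2) do |value|
    sleep 0.1
    "bar from thread is #{value}"
  end
end

# Thread#value joins the thread and returns the block's result.
results = threads.map { |t| t.value }
puts results
```

With a thread pool that only accepts a lambda, a similar effect could probably be had by building the lambda inside a small helper method that takes the value as a parameter, since each method call gets its own local variables.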