drb icon indicating copy to clipboard operation
drb copied to clipboard

Use a weak equality map instead of _id2ref

Open headius opened this issue 4 years ago • 7 comments

This PR moves away from using _id2ref for object references, instead using a weak-valued Hash with appropriate locking and cleaning logic.

From the primary commit:

The ObjectSpace._id2ref method is considered an internal API of
CRuby and is difficult to implement efficiently on implementations
that do not have direct control over the garbage-collected heap.
On JRuby, for example, arbitrary _id2ref object lookup is normally
disabled, as the implementation requires maintaining a parallel
mapping of object IDs to weak object references, adding a very
large amount of overhead to all object ID uses. As a result, Ruby
standard libraries should try to avoid using _id2ref.

This patch introduces a weak map into DRb for managing object
references without _id2ref. The map is weak-valued, with a clean
subroutine to scrub out defunct entries. Puts initiate a clean,
hopefully ensuring that very few stale entries accumulate.

This patch does not use WeakMap do to limitations of its API.
WeakMap specifies that keys are identity-based, and as of Ruby 3.0
allows immediate values like Symbols and Integers. In order to
support this behavior, those values must be idempotent, which on
JRuby is impossible to support for fixnum-ranged Integers and
flonum-ranged Floats, rendering it useless as an object ID-keyed
cache.

This implementation currently hangs partway through the tests, and I am unsure why. I wanted to get this implementation out there to get help refining it and getting tests passing, since _id2ref is very problematic on JRuby and JRuby would like to work with the released versions of DRb when we release our Ruby 3.0-compatible release later this year.

headius avatar Nov 18 '21 12:11 headius

cc @hsbt @nobu for help... the logic here has been working for many years on JRuby, albeit without the locking and interrupt handling. I am not sure yet why the specs hang partway through (apparently in test_02_basic_object.

cc @eregon because I assume _id2ref is undesirable on TruffleRuby as well.

headius avatar Nov 18 '21 12:11 headius

@jeremyevans Review request was suggested by GH... feel free to ignore if you have no interest in this library.

headius avatar Nov 18 '21 12:11 headius

Those runs were clearly not completing so I will try to figure out why it hangs with my version.

headius avatar Nov 18 '21 13:11 headius

I see now there is a WeakIdConv that implements something similar using WeakMap. That may be acceptable if users realize that fixnum/flonum keys on JRuby have no guarantee of idempotency.

So I thought I would try to run the tests using that as the default idconv but I don't think I'm hooking it up right. Is more than this needed?

diff --git a/test/drb/test_drbobject.rb b/test/drb/test_drbobject.rb
index 2b0e206..57d9783 100644
--- a/test/drb/test_drbobject.rb
+++ b/test/drb/test_drbobject.rb
@@ -36,7 +36,7 @@ class TestDRbObject < Test::Unit::TestCase
   include DRbObjectTest
 
   def setup
-    DRb.start_service
+    DRb.start_service(nil, nil, {:idconv => DRb::WeakIdConv.new})
   end
 end
 

Any reason we wouldn't want to just make this the default idconv? Maybe only on JRuby and other impls where _id2ref is problematic?

headius avatar Nov 18 '21 13:11 headius

cc @seki who is the official maintainer of drb (https://github.com/ruby/ruby/blob/master/doc/maintainers.rdoc)

eregon avatar Nov 18 '21 14:11 eregon

So I thought I would try to run the tests using that as the default idconv but I don't think I'm hooking it up right. Is more than this needed?

A quick git grep leads to:

lib/drb/drb.rb:327:# default implementation is provided by DRb::DRbIdConv.  It uses an
lib/drb/drb.rb:359:  class DRbIdConv
lib/drb/drb.rb:1351:    @@idconv = DRbIdConv.new
lib/drb/drb.rb:1380:    # See #new().  The initial default value is a DRbIdConv instance.
lib/drb/drb.rb:1426:    #            to an instance of the class DRb::DRbIdConv.
lib/drb/drb.rb:1870:  # This is expected to be an instance such as DRb::DRbIdConv that responds to
lib/drb/drb.rb:1937:DRbIdConv = DRb::DRbIdConv
lib/drb/gw.rb:34:  class GWIdConv < DRbIdConv
lib/drb/timeridconv.rb:15:  class TimerIdConv < DRbIdConv
lib/drb/weakidconv.rb:11:  class WeakIdConv < DRbIdConv

Line 1351 seems relevant.

eregon avatar Dec 07 '21 20:12 eregon

@seki The main outstanding question is whether we should just replace the id2ref implementation with the WeakMap implementation, since that version should be more reliable and id2ref is problematic and effectively deprecated. If this is acceptable, I can modify this PR to use the WeakIdConv and remove the current default id2ref logic.

headius avatar Dec 07 '21 20:12 headius