truffleruby icon indicating copy to clipboard operation
truffleruby copied to clipboard

Introduce accessor method cache to OpenStruct

Open MattAlp opened this issue 3 years ago • 2 comments

define_method when used with a block, generates a new call target every time. This is because the captured environment for the block needs to be injected into the method.

This PR rewrites the implementation of ostruct to instead of using define_method every time, instead to cache the generated accessor methods as mixin modules within the class. This removes the constant re-generations of methods and more importantly, call targets, that currently occurs when using ostruct.

This is accomplished by creating a hash of property names and corresponding modules that contain a getter and setter method for that property, and including these generated methods upon subsequent member definitions. This means no new call targets are generated when the application is stable.

For example this code:

require 'ostruct'

puts RUBY_DESCRIPTION

def foo(o)
  o.bar
end

loop do
  o = OpenStruct.new
  o.bar = 14
  foo(o)
end

Used to generate a new pair of call targets for every iteration. After this PR, it does not.

This further allows users of the objects to have stable call target caches (the lookup of the call target is still megamorphic - due to singleton classes - we're fixing this as well.)

In this graph, we can see that the accessor was a megamorphic call before, as the call targets were fresh each time. After this PR, in the second graph, the call target is stable (as we already said, the call target lookup is not, that's a separate issue, we're fixing.)

Before: image

After: image

1245 is the read from the hash table.

before.pdf after.pdf

MattAlp avatar Aug 08 '22 15:08 MattAlp

Interesting approach. The trade-off is a footprint increase to have an extra module per such generated pair of accessors. It'd be good to have benchmark numbers in addition to just the graphs, could you make a benchmark-ips or benchmark-interface bench from your example and report numbers for this PR (and also measure on CRuby)?

Do you know why this is less problematic on CRuby? Maybe they don't create extra call targets for define_method(name) {} (and define_singleton_method) ? CRuby still pays the extra extra cost of creating a singleton class for every OpenStruct instance, which sounds pretty bad already. I always thought OpenStruct is implemented in a weird way, which probably only makes sense on CRuby and even there is likely suboptimal.

In general I think we should rely on having a fast method_missing for ostruct, so we don't even need to define methods. I think we should try this approach and compare. Would you like to try that or you'd prefer if I give it a try?

It seems like a perfect example of why optimizing method_missing and splitting all accesses (i.e. calls to method_missing) unlocks optimizations and removes megamorphism.

It also seems like a perfect example of why different Shapes for the same class are useful. ostruct currently uses a Hash for storage instead of ivars, so that's a separate step.

eregon avatar Aug 09 '22 11:08 eregon

BTW there is a long list of caveats for OpenStruct including performance: https://github.com/oracle/truffleruby/blob/4054469bbd985db7dc5c86299004ff86a2835baf/lib/mri/ostruct.rb#L67-L107

Creating an open struct from a small Hash and accessing a few of the entries can be 200 times slower than accessing the hash directly.

So it sounds like CRuby does not have a fast OpenStruct at all if those numbers still hold.

It'd be great if we can optimize OpenStruct accesses as good or even better than a plain Hash's accesses, that'd be worth a blog post etc.

Not defining methods dynamically would also avoid the issue about leaking symbols for methods.

eregon avatar Aug 09 '22 11:08 eregon