jruby
Convert all global variables to invokedynamic call sites.
Summary
This is an experiment in using invokedynamic for global variable storage. In the old logic for invokedynamic global variable access, there were a number of challenges to maintaining a SwitchPoint, invalidating it appropriately, and not racing with value updates. This commit makes all internal "accessors" actually be MutableCallSite instances via the new GlobalSite class. These sites customize their own behavior, wiring up appropriate transformations for read-only, lazily-calculated, non-global and other types of global variables.
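As a rough illustration of the idea of a call site that customizes its own behavior, here is a minimal Java sketch. It is not the GlobalSite class from this commit: `GlobalReadSiteSketch`, `Variable`, and `Kind` are hypothetical names, and only two binding strategies are shown (a constant for read-only variables, a permanent uncached getter for everything else).

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

// Hypothetical sketch of a self-customizing read site; not JRuby's GlobalSite.
final class GlobalReadSiteSketch extends MutableCallSite {
    enum Kind { READ_ONLY, FRAME_LOCAL, ORDINARY }

    // Stand-in for a global variable accessor (assumed shape).
    static final class Variable {
        final Kind kind;
        volatile Object value;
        Variable(Kind kind, Object value) { this.kind = kind; this.value = value; }
        Object get() { return value; }
    }

    private static final MethodHandle BIND;
    private static final MethodHandle GET;
    static {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType noArgs = MethodType.methodType(Object.class);
            BIND = lookup.findVirtual(GlobalReadSiteSketch.class, "bind", noArgs);
            GET = lookup.findVirtual(Variable.class, "get", noArgs);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    private final Variable variable;

    GlobalReadSiteSketch(Variable variable) {
        super(MethodType.methodType(Object.class));
        this.variable = variable;
        // The first call lands in bind(), which picks a strategy and rewires the site.
        setTarget(BIND.bindTo(this));
    }

    // Runs on the first read only: chooses the target based on the variable's kind.
    Object bind() {
        Object current = variable.get();
        if (variable.kind == Kind.READ_ONLY) {
            // The value can never change, so fold it into the site as a constant.
            setTarget(MethodHandles.constant(Object.class, current));
        } else {
            // Frame-local and ordinary variables: bind permanently to the accessor, no caching.
            setTarget(GET.bindTo(variable));
        }
        return current;
    }

    Object read() {
        try {
            return dynamicInvoker().invoke();
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        Variable verbose = new Variable(Kind.ORDINARY, Boolean.FALSE);
        GlobalReadSiteSketch site = new GlobalReadSiteSketch(verbose);
        System.out.println(site.read());
        verbose.value = Boolean.TRUE;
        System.out.println(site.read());
    }
}
```

In this sketch an ordinary global is never cached; caching it safely needs a guard that writers can invalidate, which is where SwitchPoint comes in.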
Performance
It's unclear how much of a performance gain a typical case will see, given the limited use of global variables in the wild. However, some variables see more frequent use, such as $VERBOSE or $DEBUG in logging frameworks.
I've done limited experiments with a loop that checks a global:
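$foo = true
def foo
  i = 0
  while i < 10_000_000
    i += 1 # if $foo   (uncomment the condition to measure the global read)
  end
end
# Average seconds per loop iteration over 1000 runs
p 1000.times.map {
  t = Time.now
  foo
  Time.now - t
}.sum / 1000 / 10_000_000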
On JRuby 9.2.4.0 on C2, the loop dominates and each iteration averages 5.50ns, while the version with the globals averages 5.67ns, for an increased cost of 0.17ns.
With this patch on C2, the same values are 5.50ns and 5.56ns, increasing the cost by only 0.06ns, a savings of 0.11ns.
JRuby 9.2.4.0 on Graal JIT averages 0.91ns and 2.53ns, a more obvious hit of 1.62ns.
This patch on Graal JIT averages 0.90ns and 0.92ns, an increase of 0.02ns and a savings of 1.60ns.
These numbers were gathered on my laptop, so throttling and such may have an impact here, but global reads are consistently less of a hit on the patch.
Caveats
This initial patch also makes non-global variables such as $~ and $_ into call sites, but they just permanently bind to a thread or frame-local accessor. No attempt is made to cache their values.
I have not installed any sort of failover to the global call sites to prevent constant deoptimization of methods that access frequently-modified globals. This will likely still be needed, since updating a MutableCallSite should have similar overhead to a SwitchPoint invalidation. It should be possible to simply force such call sites to bind permanently as with the non-globals above.
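The failover described above could look roughly like the following Java sketch. It is not part of this patch: `FailoverSiteSketch`, `Global`, and the `MAX_FAIL` threshold are hypothetical, and the site simply counts how many times its cached value was invalidated before binding permanently to the uncached accessor.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;
import java.lang.invoke.SwitchPoint;

// Hypothetical sketch of a failover policy; names and threshold are assumptions.
final class FailoverSiteSketch extends MutableCallSite {
    static final int MAX_FAIL = 3; // assumed threshold, not taken from JRuby

    // Stand-in for a global variable with a SwitchPoint that writes invalidate.
    static final class Global {
        volatile Object value;
        volatile SwitchPoint switchPoint = new SwitchPoint();
        Global(Object value) { this.value = value; }
        Object get() { return value; }
        synchronized void set(Object newValue) {
            value = newValue;                 // publish the value first
            SwitchPoint old = switchPoint;
            switchPoint = new SwitchPoint();  // later caches guard on a fresh SwitchPoint
            SwitchPoint.invalidateAll(new SwitchPoint[] { old });
        }
    }

    private static final MethodHandle FALLBACK;
    private static final MethodHandle GET;
    static {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType noArgs = MethodType.methodType(Object.class);
            FALLBACK = lookup.findVirtual(FailoverSiteSketch.class, "fallback", noArgs);
            GET = lookup.findVirtual(Global.class, "get", noArgs);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    private final Global global;
    private boolean bound;      // false until the first bind, which is not a failure
    private int failures;       // number of times a cached value was invalidated
    private boolean permanent;  // true once the site has given up caching

    FailoverSiteSketch(Global global) {
        super(MethodType.methodType(Object.class));
        this.global = global;
        setTarget(FALLBACK.bindTo(this));
    }

    synchronized Object fallback() {
        if (bound) failures++;
        bound = true;
        if (failures >= MAX_FAIL) {
            // Too many invalidations: bind permanently to the uncached volatile read.
            permanent = true;
            setTarget(GET.bindTo(global));
            return global.get();
        }
        SwitchPoint sp = global.switchPoint;  // read the guard before the value
        Object current = global.value;
        MethodHandle cached = MethodHandles.constant(Object.class, current);
        setTarget(sp.guardWithTest(cached, FALLBACK.bindTo(this)));
        return current;
    }

    synchronized boolean isPermanent() { return permanent; }

    Object read() {
        try {
            return dynamicInvoker().invoke();
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        Global counter = new Global(0);
        FailoverSiteSketch site = new FailoverSiteSketch(counter);
        for (int i = 1; i <= 5; i++) {
            counter.set(i);
            System.out.println(site.read() + " permanent=" + site.isPermanent());
        }
    }
}
```

Once the site is permanent, writes still invalidate a SwitchPoint, but nothing at this site depends on it any more, so methods that read the global are no longer deoptimized on each write.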
See #4808, #5525.
Is a MutableCallSite correct here, or should it be a VolatileCallSite? I guess MutableCallSite is fine when going to the fallback, as other threads would notice the SwitchPoint is invalid, but what happens when the global variable's value changes while the failure count is still < maxfail? Those updates should be observed immediately by other threads.
@eregon I believe this is OK, since the call site never holds an unguarded cached value. It always points to one of three things: the fallback, the uncached lookup, or a SwitchPoint-guarded constant value. The first two always go back to the global variable accessor's volatile field, and the SwitchPoint has volatile semantics that prevent any thread from traversing it once it has been invalidated. Even if a thread happens to cache the guarded constant in a non-volatile way, it will simply branch back into the fallback logic once the SwitchPoint has been invalidated.
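The argument about stale handles can be tried out with a small Java sketch. It assumes a simplified accessor (`GuardedGlobalSketch`, a hypothetical name) with a volatile value field and one SwitchPoint per cached value; it is not the code in this PR. A reader keeps a reference to the guarded handle, and after a write invalidates the SwitchPoint the same handle takes the fallback branch and reads the volatile field.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.SwitchPoint;

// Hypothetical sketch of a SwitchPoint-guarded constant; not JRuby's accessor.
final class GuardedGlobalSketch {
    private static final MethodHandle SLOW_GET;
    static {
        try {
            SLOW_GET = MethodHandles.lookup().findVirtual(
                GuardedGlobalSketch.class, "slowGet", MethodType.methodType(Object.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    private volatile Object value;
    private volatile SwitchPoint switchPoint = new SwitchPoint();

    GuardedGlobalSketch(Object initial) { this.value = initial; }

    // Uncached path: always reads the volatile field.
    Object slowGet() { return value; }

    // Writer: store the new value, then invalidate every handle guarding the old one.
    synchronized void set(Object newValue) {
        value = newValue;
        SwitchPoint old = switchPoint;
        switchPoint = new SwitchPoint();
        SwitchPoint.invalidateAll(new SwitchPoint[] { old });
    }

    // Builds "constant while the SwitchPoint is valid, otherwise slowGet".
    MethodHandle cachedReader() {
        SwitchPoint sp = switchPoint;  // read the guard before the value
        MethodHandle constant = MethodHandles.constant(Object.class, value);
        return sp.guardWithTest(constant, SLOW_GET.bindTo(this));
    }

    static Object call(MethodHandle reader) {
        try {
            return reader.invoke();
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        GuardedGlobalSketch global = new GuardedGlobalSketch("old");
        MethodHandle stale = global.cachedReader(); // held as if cached by some thread
        System.out.println(call(stale));
        global.set("new");
        System.out.println(call(stale));            // same handle, now takes the fallback
    }
}
```

This simplified writer still leaves a short window between the field store and the invalidation in which the old constant can be returned, which is the window the question above is about.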
@headius Where is the SwitchPoint for the constant case? I don't see it in https://github.com/jruby/jruby/pull/5536/files#diff-d527c65cb35e7631dd5519a097ca51a3
Moved to 9.4 as part of larger perf work there.
I attempted to rebase this one on master but it has gotten too far off during 9.4 changes. I will endeavor to recreate these optimizations after we release 9.4 later this month.