Values icon indicating copy to clipboard operation
Values copied to clipboard

Optimize `#eql?`

Open ms-ati opened this issue 8 years ago • 5 comments

NOTE: Currently this commit is based of of PR #57, but can be separated if necessary.

Speed improvements:

When you have... Delta IPS
Same instance 15.6x
Different last field 10.2x
Equal fields 1.4x

Benchmarking gist: https://gist.github.com/ms-ati/3e62cca57ca32cd17ad2e2e2b14cc65e

Explanation of changes:

It came as a surprise to see the humble #eql? comparison in profiling -- we had not realized that in Ruby, if one wants to short-circuit comparison of a single instance of an object to itself, for example, that this logic must be explicitly added to #eql?. This accounts for 15.6x difference in iterations-per-second of same-instance comparison.

Furthermore, the cost of pre-calculating hash is already being paid at instantiation of every value object. So we add an early exit if the hash does not match. This accounts for 10.2x difference in iterations-per-second of comparisons of instances that differ in only the last field.

Finally, the cost of instantiating two temporary Array instances is significant when comparing two distinct instances which have equal field values. So we iterate the field comparisons without the temporary Arrays. This accounts for the 1.4x difference in iterations-per-second of comparisons of distinct instances with equal field values.

ms-ati avatar Apr 07 '17 20:04 ms-ati

Codecov Report

Merging #58 into master will decrease coverage by 0.35%. The diff coverage is 96%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #58      +/-   ##
==========================================
- Coverage   96.42%   96.07%   -0.36%     
==========================================
  Files           1        1              
  Lines          84      102      +18     
==========================================
+ Hits           81       98      +17     
- Misses          3        4       +1
Impacted Files Coverage Δ
lib/values.rb 96.07% <96%> (-0.36%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 121e18d...a627364. Read the comment docs.

codecov-io avatar Apr 07 '17 20:04 codecov-io

One potential downside is that this code enforces that fields which have non-equal #hash values will cause the Value objects containing them to be unequal.

What is the Ruby language policy on unequal hash implies not eql? I believe this is the case in Java but not sure here.

ms-ati avatar Apr 07 '17 20:04 ms-ati

Ok, interesting. One data point on "can unequal hash imply not eql?"

A very common example of instances of different classes attempting full interchangeability with one another is ActiveSupport's TimeWithZone, which goes to some lengths to be interchangeable at the instance level with Ruby's Time. And in fact, it does ensure that both hash and eql? match, so enclosing Value objects do compare as equal:

tz = Time.zone.now
t = tz.to_time

[t.class == tz.class, t == tz, t.hash == tz.hash]
# [false, true, true]

A = Value.new(time)
A.new(tz) == A.new(t)
# true

So far so good! I am looking for authoritative documentation that Ruby classes are expected to honor the contract that eql? being true implies equal hash, and will post here if I can find it.

ms-ati avatar Apr 07 '17 21:04 ms-ati

Ok, as I had hoped, this is part of the Ruby language specification for Object#hash:

Generates a Fixnum hash value for this object. This function must have the property that a.eql?(b) implies a.hash == b.hash.

ms-ati avatar Apr 07 '17 21:04 ms-ati

So... even more interesting. It turns out that, because the existing Values gem pre-computes #hash but does not enforce the requirement that all attribute values are frozen, the existing gem allows inconsistent behavior in the edge case where an attribute value is mutated. For example:

# existing gem, not this branch
A = Value.new(:arr)

x = A.new([1])
y = A.new([1])

[x == y, x.hash == y.hash]
#=> [true, true]

# EDGE CASE: mutate value of attribute of y
y.arr << 2
y
#=> #<A arr=[1, 2]>

[x == y, x.hash == y.hash]
#=> [false, true]

ms-ati avatar Apr 08 '17 16:04 ms-ati