UnitsNet icon indicating copy to clipboard operation
UnitsNet copied to clipboard

GetHashCode returns different hash for Equal units

Open dschuermans opened this issue 3 years ago • 16 comments

Describe the bug .GetHashCode() returns different hash for 2 units for which .Equals() return true

To Reproduce

Length inCm = Length.FromCentimeters(100);
Length inM = Length.FromMeters(1);

Assert.AreEqual(inCm, inM);
Assert.AreNotEqual(inCm.GetHashCode(), inM.GetHashCode());

Expected behavior The GetHashCode() should return the same value because the .Equals() returns true

Additional context From the remarks:

Two objects that are equal return hash codes that are equal. However, the reverse is not true: equal hash codes do not imply object equality, because different (unequal) objects can have identical hash codes.

I've peeked in the sources and I think the issue is, that the value from which the unit is constructed is used when calculating the hashcode:

https://github.com/angularsen/UnitsNet/blob/7f67ffba914705eb27000b7d2962a90e0aebf70a/UnitsNet/GeneratedCode/Quantities/Length.g.cs#L1025-L1032

Shouldn't this be normalized to the unit's base unit or something along those lines?

Reason I'm asking is because I'm doing IEquatable implementations on my objects and in the .Equals(...) method, I'm comparing Units to a max of 5 decimals while in the GetHashCode() method I'm simply using the GetHashCode() method from the unit.

During the PR someone pointed out that for the GetHashCode(), the unit should also be rounded using the same logic as in the .Equals(...) method:

        /// <inheritdoc />
        public bool Equals(ResultAnalysis1DInternalForces other)
        {
            if (ReferenceEquals(null, other))
            {
                return false;
            }

            if (ReferenceEquals(this, other))
            {
                return true;
            }

            return base.Equals(other)
                && Equals(Member, other.Member)
                && Equals(MemberRib, other.MemberRib)
                && Index == other.Index
                && Section.UnitsNetEquals(other.Section)
                && N.UnitsNetEquals(other.N)
                && Vy.UnitsNetEquals(other.Vy)
                && Vz.UnitsNetEquals(other.Vz)
                && Mx.UnitsNetEquals(other.Mx)
                && My.UnitsNetEquals(other.My)
                && Mz.UnitsNetEquals(other.Mz);
        }

        /// <inheritdoc />
        public override bool Equals(object obj)
        {
            return ReferenceEquals(this, obj) || obj is ResultAnalysis1DInternalForces other && Equals(other);
        }

        /// <inheritdoc />
        public override int GetHashCode()
        {
            unchecked
            {
                int hashCode = base.GetHashCode();
                hashCode = (hashCode * 397) ^ (Member != null ? Member.GetHashCode() : 0);
                hashCode = (hashCode * 397) ^ (MemberRib != null ? MemberRib.GetHashCode() : 0);
                hashCode = (hashCode * 397) ^ Index;
                hashCode = (hashCode * 397) ^ Section.GetHashCode();
                hashCode = (hashCode * 397) ^ N.GetHashCode();
                hashCode = (hashCode * 397) ^ Vy.GetHashCode();
                hashCode = (hashCode * 397) ^ Vz.GetHashCode();
                hashCode = (hashCode * 397) ^ Mx.GetHashCode();
                hashCode = (hashCode * 397) ^ Mz.GetHashCode();
                return hashCode;
            }
        }
        public static bool UnitsNetEquals<TUnit>(this TUnit value, TUnit other)
        {
            if (value == null && other == null)
            {
                return true;
            }

            if (value != null && other != null && value is IQuantity thisUnit && other is IQuantity otherUnit)
            {
                try
                {
                    double thisValue = thisUnit.Value;
                    double otherValueInThisUnits = otherUnit.As(thisUnit.Unit);

                    return Comparison.Equals(thisValue, otherValueInThisUnits, ComparingConstants.DoubleComparisionDelta, ComparisonType.Absolute);
                }
                catch (ArgumentException e)
                {
                    return false;
                }
            }

            return EqualityComparer<TUnit>.Default.Equals(value, other);
        }

dschuermans avatar Jan 25 '22 08:01 dschuermans

I managed to "fix" it, at least for my needs, with the following extension:

public static int GetUnitsNetHashCode<TUnit>(this TUnit unit)
{
	if (unit is IQuantity quantity)
	{
		var valueOfUnitAsBaseUnitRounded = Math.Round(quantity.As(quantity.QuantityInfo.BaseUnitInfo.Value), ComparingConstants.DecimalPlaces, MidpointRounding.AwayFromZero);

		return (new
		{
			quantity.QuantityInfo.Name,
			valueOfUnitAsBaseUnitRounded,
			quantity.QuantityInfo.BaseUnitInfo.Value
		}).GetHashCode();
	}
	else
	{
		return unit?.GetHashCode() ?? 0;
	}
}

The following test now succeeds:

[TestMethod]
public void MyTestMethod()
{
	Length inCm = Length.FromCentimeters(100);
	Length inM = Length.FromMeters(1);

	Assert.IsTrue(inCm.Equals(inM));
	Assert.AreNotEqual(inCm.GetHashCode(), inM.GetHashCode());
	Assert.AreEqual(inCm.GetUnitsNetHashCode(), inM.GetUnitsNetHashCode());
}

So instead of using the regular .GetHashCode() in my own GetHashCode() implementations, I'm using GetUnitsNetHashCode() instead:

/// <inheritdoc />
public override int GetHashCode()
{
	unchecked
	{
		int hashCode = base.GetHashCode();
		hashCode = (hashCode * 397) ^ (Member2D != null ? Member2D.GetHashCode() : 0);
		hashCode = (hashCode * 397) ^ Edge;
		hashCode = (hashCode * 397) ^ Index;
		hashCode = (hashCode * 397) ^ Section.GetUnitsNetHashCode();
		hashCode = (hashCode * 397) ^ Mx.GetUnitsNetHashCode();
		hashCode = (hashCode * 397) ^ My.GetUnitsNetHashCode();
		hashCode = (hashCode * 397) ^ Mxy.GetUnitsNetHashCode();
		hashCode = (hashCode * 397) ^ Vx.GetUnitsNetHashCode();
		hashCode = (hashCode * 397) ^ Vy.GetUnitsNetHashCode();
		hashCode = (hashCode * 397) ^ Nx.GetUnitsNetHashCode();
		hashCode = (hashCode * 397) ^ Ny.GetUnitsNetHashCode();
		hashCode = (hashCode * 397) ^ Nxy.GetUnitsNetHashCode();
		return hashCode;
	}
}

image

dschuermans avatar Jan 25 '22 09:01 dschuermans

Another case where it goes wrong:

Dictionary<Length, string> test = new Dictionary<Length, string>();

Length one = Length.FromMeters(1);
Length two = Length.FromCentimeters(100);

test.Add(one, "The one with meters");
test.Add(two, "The one with centimeters");

Assert.AreEqual(2, test.Count);
Assert.AreEqual("The one with meters", test[one]);
Assert.AreEqual("The one with centimeters", test[two]);
Assert.AreNotEqual(one.GetHashCode(), two.GetHashCode());
Assert.AreEqual(one, two);

There should only be 1 item in the dictionary? Or a duplicate key error at the very least.

dschuermans avatar Jan 25 '22 13:01 dschuermans

Luckily, my fix doesn't seem to break stuff (found this test in #541)

var length = new Length(1.0, (LengthUnit)2);
var area = new Area(1.0, (AreaUnit)2);

Assert.AreNotEqual(length, area);
Assert.AreNotEqual(length.GetHashCode(), area.GetHashCode());
Assert.AreNotEqual(length.GetUnitsNetHashCode(), area.GetUnitsNetHashCode());

dschuermans avatar Jan 25 '22 15:01 dschuermans

@angularsen or @tmilnthorp any thoughts?

dschuermans avatar Jan 27 '22 13:01 dschuermans

Hi, will try to respond soon, need to digest it all first

angularsen avatar Jan 27 '22 18:01 angularsen

This is a big can of worms.

I probably can't summarize it very well, it's been a long time, but I believe it goes something like this:

Current state

Equals(a, b) converts b to the unit of a and then compares their double/decimal values exactly.

This often bites people in the ass due to subtle rounding errors. Also, it is not apparent to the consumer that some quantities use decimal and others use double as the internal value representation. Many folks are not even familiar with rounding errors in floating-point arithmetic.

After much back and forth we just haven't been able to agree on a way to implement it that feels correct and intuitive.

3 solutions

Two of these are discussed in #717 and #838, the PR that got closest to addressing this before it lost its momentum.

1 Strict equality

Let Equals/GetHashCode be based on new { QuantityName, Value, Unit }.

If the value + unit doesn't match exactly, they are not equal. Simple, arguably correct, will 100% piss off consumers trying to equate 100 cm to 1 meter.

Unintuitively, these are also not equal: Length.Zero and Length.FromCentimeters(0)

2 Rounded equality

Perform some rounding to mitigate most rounding errors in practice. Similar to the current implementation, Equals(a, b) converts b to the unit of a, but performs rounding before comparing their double/decimal values.

However, let's say we agree on rounding to integers just to make a point.

  • Equals({0.8 m}, {0.9 m}) => Equals({1 m}, {1 m}) == true with equal hash codes. Easy.
  • Equals({0.5 m}, {0.51 m}) => Equals({0 m}, {1 m}) == false. Unintuitive edge-case, but correct.

Now, let's agree to round to 1e-5 as was proposed as a sane default. If you are working with very small numeric values like 1e-9, a precision of 1e-5 is pretty much useless in terms of comparing values this small. The same with very large values.

#838 tried to find a universal rounding function and it kind of seemed to work with several test cases around it, but it all got very complicated to understand and be sure that would work well for all sizes of numbers and units, including double vs decimal. In the end the author also agreed that this may not be the way to go.

3 Cop out and remove IEquatable

This is currently where we are headed in #982 unless someone can champion a better solution.

We offer overloads of Equals() that take a double tolerance and the option to perform absolute or relative comparisons. It is the better choice to use this overload unless you absolutely need to work with the IEquatable interface.

Closing words

It's a big wall of text, but since you are currently facing this, what do you think would be the better solution here?

cc @lipchev for comments

angularsen avatar Jan 28 '22 20:01 angularsen

I think the approach that @dschuermans is using in the ResultAnalysis1DInternalForces is the correct one- I'm also doing this for some of my classes- like when comparing two "measurements" (e.g. from an analytical balance)- I know what the resolution is, so I can safely implement a value based equality contract (think about rounding to the nearest nanogram). In other cases I known that the relative difference between the expected and the target concentration should not be more than some % - so I implement it accordingly (typically in a IEqualityComparer of the given class).

I don't know if there is a reasonable default rounding precision that can be assumed when comparing using the default Equals/GetHashCode methods. By the way- I just recently discovered how Delphi 6 implements the SameValue function:

const
  FuzzFactor = 1000;
  ExtendedResolution = 1E-19 * FuzzFactor;
  DoubleResolution   = 1E-15 * FuzzFactor;
  SingleResolution   = 1E-7 * FuzzFactor;

function SameValue(const A, B: Double; Epsilon: Double): Boolean;
begin
  if Epsilon = 0 then
    Epsilon := Min(Abs(A), Abs(B)) * DoubleResolution;
  Result := Abs(A - B) <= Epsilon;
end;

However in #838 I didn't actually go with the rounding approach- but was rather relying (at least initially) on the use of the IComparable interface - which is still not fixed (this test would fail in the current code-base).

This main problem with the IComparable approach was that it isn't transitive - X.ToUnit(..).ToUnit(X.Unit) doesn't necessarily convert back to X so- having a conversion function in the Equals method would seem like a bad idea.

As much as I'd like to have the proper value-based-unit-agnostic-equality-comparison-contract, it just doesn't seem like it's possible with the double arithmetic (I personally think that a decimal / rational type would have been better suited for use with real quantities- but that's another can of worms).

As for fixing the Doubles- I think the best approach would be to have the strict equality contract (with an exception for Zero) and a Roslyn Analyzer that should generate a warning when 'comparing floating point numbers'.

lipchev avatar Jan 29 '22 00:01 lipchev

This is a big can of worms.

Why do I always seem to be asking the hard questions? 😅

I can understand the difficulty in providing a solution that works for all, hence I'm not really asking for a "fix this now" I was just pointing out that something didn't work as I expected it to work and was wondering if you guys were aware of this.

That being said, I will most likely not be the last one to bump into these kind of issues so it might not be a bad idea to do -something- with it.

Maybe have some explicit mention of these "issues" in the docs and provide workarounds for the problem? (like my extension method) Or come up with a way so that consumers can have a choice on how Equals and GetHashCode behave, rounding values up to a certain amount of decimals etc.

In our case for example, we decided that 1E-05 is to be used when comparing UnitsNet units. All our objects implement IEquatable so for each UnitsNet property, we use the .Equals(other, tolerance, comparisontype) overload when checking equality.

dschuermans avatar Jan 31 '22 08:01 dschuermans

I don't think it's a big can of worms at all 😃 I'm all for strict equality.

The documentation is clear:

The following statements must be true for all implementations of the Equals(Object) method. In the list, x, y, and z represent object references that are not null.

  • x.Equals(x) returns true.
  • x.Equals(y) returns the same value as y.Equals(x).
  • x.Equals(y) returns true if both x and y are NaN.
  • If (x.Equals(y) && y.Equals(z)) returns true, then x.Equals(z) returns true.
  • Successive calls to x.Equals(y) return the same value as long as the objects referenced by x and y are not modified.
  • x.Equals(null) returns false.

x.Equals(y) returns the same value as y.Equals(x) means we can not have any conversion in the default Equals method.

Doing any sort of rounding or "near values" in GetHashCode increases the chance of collisions and reduces efficiency for HashSets, Dictionaries, etc.

A lot of methods on collections support passing in custom comparison/IComparer implementations. And the hash based collections can use a custom IEqualityComparer. However the default must be strict per the Framework rules.

The overloaded Equals can help with the cases such as 1m.Equals(100cm).

On a related note, I'm not a fan of an exception for "Zero" quantities. For example, a length of 0 and a force of 0 mean an absence of value, but in temperatures 0F != 0C. In fact, I think we should remove the Zero property entirely.

tmilnthorp avatar Jan 31 '22 12:01 tmilnthorp

I am leaning towards the same. Strict equality. It is the most consistent with the docs you posted, as well as how record types work.

My only problem with it, is that IEquatable interface on quantities becomes pretty much useless. I don't know when I would ever use it, so one could argue we might as well remove the interface? But I guess it is .NET conventions for value types.

At least you know what you gun get. - Tom Hanks, probably.

angularsen avatar Feb 02 '22 07:02 angularsen

I think we should remove the Zero property entirely.

I would not take it that far. As long as a quantity has a well defined base unit, which all ours have, there can be a consistent Zero value. For Temperature, it is 0 Kelvins.

However, yes, if the special case is implemented as As(BaseUnit) == 0, so you may have to convert 0 Celsius of 0 Fahrenheit to Kelvins first, then it involves possible rounding errors and I agree we should not have this special case for zero.

It is easier if it is consistent, with its quirks and warts. I think we all realize by now, that there is no one perfect solution here that isn't going to trip some people up. The best we can do is be very clear in xmldoc and wiki.

In summary, I am for:

  • Strict equality, without "zero" exception. (alternative 1)
  • ..or removing IEquatable (alternative 3)

angularsen avatar Feb 02 '22 07:02 angularsen

My only problem with it, is that IEquatable interface on quantities becomes pretty much useless. I don't know when I would ever use it, so one could argue we might as well remove the interface? But I guess it is .NET conventions for value types.

Regardless of the Equals(object) implementation, IEquatable<T> is not useless. It's merely the same Equals method, but with a strong type.

Also, see here:

The IEquatable<T> interface is used by generic collection objects such as Dictionary<TKey,TValue>, List<T>, and LinkedList<T> when testing for equality in such methods as Contains, IndexOf, LastIndexOf, and Remove. It should be implemented for any object that might be stored in a generic collection.

It is also slightly more performant than Equals(object).

tmilnthorp avatar Feb 02 '22 20:02 tmilnthorp

We can keep the interface, but I think the main argument is to conform to .NET conventions.

To avoid value boxing and support equality checks in generic collections is nice on paper, but I can't really see a real world scenario for using strict value+unit equality checks myself. Except maybe trivial examples where the unit is always the same and values are not subject to rounding errors.

People should use the overload that allows to specify tolerance or passing in their own EqualityComparer, outside this interface.

But enough of that. Can we land on keeping IEquatable then, with strict equality checks. Alternative 1?

angularsen avatar Feb 02 '22 22:02 angularsen

We can keep the interface, but I think the main argument is to conform to .NET conventions.

To avoid value boxing and support equality checks in generic collections is nice on paper, but I can't really see a real world scenario for using strict value+unit equality checks myself. Except maybe trivial examples where the unit is always the same and values are not subject to rounding errors.

People should use the overload that allows to specify tolerance or passing in their own EqualityComparer, outside this interface.

But enough of that. Can we land on keeping IEquatable then, with strict equality checks. Alternative 1?

That sounds good to me!

tmilnthorp avatar Feb 02 '22 22:02 tmilnthorp

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Apr 16 '22 05:04 stale[bot]

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Sep 21 '22 02:09 stale[bot]