DeepCloner icon indicating copy to clipboard operation
DeepCloner copied to clipboard

HashSet contains method return false after deep clone

Open DemoJameson opened this issue 5 years ago • 8 comments

var clonedHashSet = new HashSet<object>{new object()}.DeepClone();
var clonedObject = clonedHashSet.First();
clonedHashSet.Contains(clonedObject); // false

clonedHashSet.Clear();
clonedHashSet.Add(clonedObject);
clonedHashSet.Contains(clonedObject); // true

DemoJameson avatar Aug 20 '20 07:08 DemoJameson

This is complex issue, related to missing GetHashCode implementation for object. Without implementation — object reference is used. So, after cloning clonedObject has another reference and GetHashCode for it returns another value. In real situation correct implementation of GetHashCode is very recommended. I'll think about possible variants of mitigation of this problem.

force-net avatar Aug 22 '20 14:08 force-net

Ran into this issue after switching to DeepCloner from BinaryFormatter. BinaryFormatter had the same fundamental issue, but I was able to work around it by using the [OnDeserialized] hook like this:

public class MyObject<T> : IList<T>
{
    private HashSet<T> _uniqueItemTracker = new HashSet<T>();

    //Omitting IList<T> implementation

    [OnDeserialized]
    private void OnDeserialized(StreamingContext context)
    {
        //After cloning, OnDeserialized invoked and the _uniqueItemTracker is re-populated by creating a new HashSet off the current contents
        _uniqueItemTracker = new HashSet<T>(this);
    }
}

This doesn't solve the fundamental problem, but a post-DeepClone hook would at least allow us to write code to fix it ourselves.

zane-woodard-drc avatar Sep 23 '21 23:09 zane-woodard-drc

Currently, I'm trying to find best solution to mitigate this issue. E.g. this code will not work by design (we clone set entirely but check with non-cloned object).

var key = new object();
var set = new HashSet<object> {key};
var cloned = set.DeepClone();
Assert.That(cloned.Contains(key)); // failed

This case can occur in any complex cloning and cannot be solved without custom handlers.

For you case, you can wrap set with new set, but EqualityComparer can be passed to HashSet (so, we need to pass it to new Set). Also, there are a lot of Dictionaries and Sets implementations with custom behavior, so, it hard to make correct general cloning.

force-net avatar Oct 20 '21 19:10 force-net

Is there any update on this? There's some weird things going on with strings as well.

Note that this only fails on .net 6.0 and passes on netcoreapp3.1. It could probably have something to do with an optimization in HashSet<T> for when T == typeof(string).

var set = new HashSet<string> { "value" };
set.Contains("value").Should().BeTrue(); // pass

var cloned = set.DeepClone();
cloned.Contains("value").Should().BeTrue(); // pass

var copyOfSet = new HashSet<string>(set, set.Comparer);
copyOfSet.Contains("value").Should().BeTrue(); // pass

var copyOfCloned = new HashSet<string>(cloned, cloned.Comparer);
(copyOfCloned.ToArray()[0] == "value").Should().BeTrue(); // pass

copyOfCloned.Contains("value").Should().BeTrue(); // fail

cguedel avatar Apr 22 '22 09:04 cguedel

If I hack together a separate processing method for HashSet, where I just return new HashSet<string>(objFrom, objFrom.Comparer), this test works. so it's probably only related to strings...

cguedel avatar Apr 22 '22 10:04 cguedel

It's really strange, but interesting issue. When HashSet created it checks:

  1. is comparers are same (for strings it should be true, but there are lot of internal magic to create Comparer.Default) - in this case it just copies internal HashSet objects
  2. if it differs code like to new HashSet(comparer).UnionWith(source.AsEnumerable())

In both cases result should be identical. So, I'll try to reproduce problem and found a source of this issue.

force-net avatar Apr 22 '22 15:04 force-net

I found an issue and fix it (will do some more tests and will publish new version in some days). Problem only with strings (but another code can also cause similar problems). HashSet in .NET6.0 uses special comparer for strings if no specific comparer is provided. Also, it returns non-real Comparer when you ask for it (it uses NonRandomizedStringEqualityComparer but returns GenericEqualityComparer). When you create new instance of HashSet, it checks comparer == EqualityComparer<string>.Default - and it fails for cloned comparer. So, HashSet began to use GenericEqualityComparer instead of default and it produces different hash code So, it real .NET magic and it needs to be analyzed.

force-net avatar Apr 23 '22 14:04 force-net

Yeah I saw that too, not sure what's exactly going on there. Looking forward to a new release. I've tested a workaround where in the deep cloner code I detect instances of HashSet and simply use a new HashSet<string>(source, source.Comparer) to create the clone, this seems to work pretty good.

cguedel avatar Apr 25 '22 08:04 cguedel