[Comment] GetHashCode Made Easy

Open RehanSaeed opened this issue 5 years ago • 24 comments

https://rehansaeed.com/gethashcode-made-easy/

RehanSaeed avatar May 12 '20 10:05 RehanSaeed

Nathan commented on 2014-06-13 12:52:08

Very cool class!

Savvas commented on 2014-07-30 13:23:01

Very very nice! Thank you.

Trenton McFarlane commented on 2016-07-27 02:18:55

Very clean! Thank you! I've found but little help elsewhere on a clean implementation like this. Well done!

Leonid Efremov commented on 2017-09-06 11:51:07

int hashCode = items.Select(x => GetHashCode(x)).Aggregate((x, y) => CombineHashCodes(x, y));

This will throw an InvalidOperationException for an empty array. Maybe it is better to use:

var hashCode = items.Any() ? items.Select(GetHashCode).Aggregate(CombineHashCodes) : 0;

Muhammad Rehan Saeed commented on 2017-09-12 15:28:06

> int hashCode = items.Select(x => GetHashCode(x)).Aggregate((x, y) => CombineHashCodes(x, y));
>
> This will throw an InvalidOperationException for an empty array. Maybe it is better to use:
>
> var hashCode = items.Any() ? items.Select(GetHashCode).Aggregate(CombineHashCodes) : 0;

Thanks, I've updated the post.

Crispin commented on 2018-01-09 14:30:51

> Thanks, I've updated the post.

The suggested correction is missing a null check on items. Enumerable.Any null-checks its argument and throws an ArgumentNullException.

Muhammad Rehan Saeed commented on 2018-01-23 11:51:52

> The suggested correction is missing a null check on items. Enumerable.Any null-checks its argument and throws an ArgumentNullException.

Added your suggestion. Thanks.
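
For readers following the thread, here is a minimal sketch of a guarded version. `CombineHashCodes` and `ItemHash` are hypothetical stand-ins rather than the post's actual helpers, and returning 0 for a null collection is an assumption. It uses the seeded `Aggregate` overload instead of the `Any()` check, so the resulting values differ from the one-liner quoted above.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class GuardedAggregateSketch
{
    // Hypothetical combiner; any deterministic int mixer would do here.
    private static int CombineHashCodes(int h1, int h2)
    {
        unchecked
        {
            return ((h1 << 5) + h1) ^ h2;
        }
    }

    // Hypothetical per-item hash: null items hash to 0.
    private static int ItemHash<T>(T item)
    {
        return item == null ? 0 : item.GetHashCode();
    }

    public static int HashOf<T>(IEnumerable<T> items)
    {
        // LINQ operators throw ArgumentNullException for a null source,
        // so check for null first (assumption: a null collection hashes to 0).
        if (items == null)
        {
            return 0;
        }

        // The seeded Aggregate overload returns the seed for an empty
        // sequence instead of throwing InvalidOperationException.
        return items
            .Select(item => ItemHash(item))
            .Aggregate(0, (hash, next) => CombineHashCodes(hash, next));
    }
}
```

With this shape, `HashOf<string>(null)` and `HashOf(new int[0])` both return 0 rather than throwing.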

Union commented on 2018-03-24 01:45:03

Very cool! You should make this a NuGet package.

Simeon commented on 2018-08-14 15:20:18

This struct is so great that it should go straight into the .NET Framework! Thank you, Rehan, for sharing it with us! However, I could not find any license information about it. Could you please let us know how we are allowed to use it?

Muhammad Rehan Saeed commented on 2018-08-14 16:12:53

> This struct is so great that it should go straight into the .NET Framework! Thank you, Rehan, for sharing it with us! However, I could not find any license information about it. Could you please let us know how we are allowed to use it?

Thanks! You are in fact the third person to ask about licensing. I have updated the blog post with information about a new helper available in .NET Core and also licensing information.

Matt Whitfield commented on 2018-08-16 09:30:28

Pretty sure this is wrong: 'Two equal objects are supposed to have the same hash code but sometimes this is not the case due to some maths that I'm not going to go into'. I think you mean 'Two objects that are not equal are supposed to have different hash codes'. Because having two objects that are equal and do not have the same hash code breaks equality.

Muhammad Rehan Saeed commented on 2018-08-16 11:25:58

> Pretty sure this is wrong: 'Two equal objects are supposed to have the same hash code but sometimes this is not the case due to some maths that I'm not going to go into'. I think you mean 'Two objects that are not equal are supposed to have different hash codes'. Because having two objects that are equal and do not have the same hash code breaks equality.

Sometimes there are hash collisions where objects which are not equal to each other produce the same hash.

Brandon commented on 2018-08-16 12:32:08

> Sometimes there are hash collisions where objects which are not equal to each other produce the same hash.

Hi, great post, and thanks for it. For clarification, I thought Matt was indicating a possible need to update the article. I also think "Two equal objects are supposed to have the same hash code but sometimes this is not the case due to some maths that I'm not going to go into" is stated incorrectly. Two equal objects do indeed have the same hash. We are assuming you meant "Two unequal objects are supposed to have different hash codes but sometimes this is not the case (a hash collision) due to some maths that I'm not going to go into". By definition equal objects can't have unequal hashes, correct?

Matt Whitfield commented on 2018-08-16 13:27:04

> Sometimes there are hash collisions where objects which are not equal to each other produce the same hash.

So might be an idea to update the phrase that's wrong above then?

Muhammad Rehan Saeed commented on 2018-08-16 14:55:21

> So might be an idea to update the phrase that's wrong above then?

I re-read your comment and yes, you are totally right. Updated, thank you!
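
To make the distinction in this exchange concrete, here is a small sketch. The `Point` class is hypothetical and its XOR hash is deliberately weak so that a collision is easy to produce. Equal objects always share a hash code, while unequal objects may still collide.

```csharp
using System;

public sealed class Point : IEquatable<Point>
{
    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }

    public int X { get; }

    public int Y { get; }

    public bool Equals(Point other)
    {
        return other != null && X == other.X && Y == other.Y;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Point);
    }

    // Deliberately weak: XOR is symmetric, so (1, 2) and (2, 1) collide.
    public override int GetHashCode()
    {
        return X ^ Y;
    }
}
```

Here `new Point(1, 2)` and another `new Point(1, 2)` are equal and both hash to 3. `new Point(2, 1)` is not equal to either, yet it also hashes to 3, which is a collision that `Equals` then has to resolve.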

Nadav commented on 2018-12-11 15:01:56

Shouldn't:

public HashCode AndEach(IEnumerable items)

Be:

public HashCode AndEach<T>(IEnumerable<T> items)

Also, if you want the code to be as fast as possible, you should not use LINQ. Using

var temp = this.value;
foreach (T item in items)
{
    temp = CombineHashCodes(temp, GetHashCode(item));
}
return new HashCode(temp);

is much faster.
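
A self-contained sketch of the loop-based approach, for anyone who wants to try it. `HashCodeSketch`, its combiner, and the choice to return the current value unchanged for a null collection are all assumptions made for illustration, not the post's actual struct.

```csharp
using System.Collections.Generic;

public readonly struct HashCodeSketch
{
    private readonly int value;

    public HashCodeSketch(int value)
    {
        this.value = value;
    }

    public int Value => this.value;

    public HashCodeSketch AndEach<T>(IEnumerable<T> items)
    {
        // Assumption: a null collection leaves the running hash unchanged.
        if (items == null)
        {
            return this;
        }

        // Plain foreach: no LINQ iterator or delegate allocations.
        var temp = this.value;
        foreach (T item in items)
        {
            temp = CombineHashCodes(temp, item == null ? 0 : item.GetHashCode());
        }

        return new HashCodeSketch(temp);
    }

    // Hypothetical combiner; stands in for the post's CombineHashCodes.
    private static int CombineHashCodes(int h1, int h2)
    {
        unchecked
        {
            return ((h1 << 5) + h1) ^ h2;
        }
    }
}
```

An empty collection needs no special case here, because the loop body simply never runs.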

Oliver commented on 2018-12-14 09:46:36

Just another idea (which I used in my implementation): HashCode has to be fast, and collisions for different objects are allowed (then Equals() has to be called). So for a collection it is not always wise to iterate over all elements (maybe there are millions in there), and it may be enough to consider just the first n elements when generating the hash code. For this purpose it would be great to enhance the .AndEach() call to take a second, optional parameter, AndEach<T>(IEnumerable<T> items, int count = 10), which would simply be added to the LINQ statement as items.Take(count).Select(GetHashCode). At the call site you can then provide a meaningful number for your use case. This update may change the behaviour of existing implementations. To avoid this, the optional parameter could instead default to int.MaxValue to preserve the current behaviour.
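
A rough sketch of this idea, under stated assumptions: `BoundedHashSketch` and `HashFirst` are hypothetical names, the combiner is a stand-in, and the default of `int.MaxValue` follows the backwards-compatible option mentioned above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BoundedHashSketch
{
    public static int HashFirst<T>(IEnumerable<T> items, int count = int.MaxValue)
    {
        if (count < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(count));
        }

        // Assumption: a null collection hashes to 0.
        if (items == null)
        {
            return 0;
        }

        var hash = 0;

        // Take(count) stops enumerating after the first 'count' elements.
        foreach (T item in items.Take(count))
        {
            unchecked
            {
                hash = ((hash << 5) + hash) ^ (item == null ? 0 : item.GetHashCode());
            }
        }

        return hash;
    }
}
```

The trade-off is that collections differing only after the first `count` elements always collide, so `Equals` ends up doing more work for such inputs.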

Muhammad Rehan Saeed commented on 2018-12-14 11:40:03

> Shouldn't:
>
> public HashCode AndEach(IEnumerable items)
>
> Be:
>
> public HashCode AndEach<T>(IEnumerable<T> items)
>
> Also, if you want the code to be as fast as possible, you should not use LINQ. Using
>
> var temp = this.value;
> foreach (T item in items)
> {
>     temp = CombineHashCodes(temp, GetHashCode(item));
> }
> return new HashCode(temp);
>
> is much faster.

Good point about not using LINQ.

Muhammad Rehan Saeed commented on 2018-12-14 11:41:37

> Just another idea (which I used in my implementation): HashCode has to be fast, and collisions for different objects are allowed (then Equals() has to be called). So for a collection it is not always wise to iterate over all elements (maybe there are millions in there), and it may be enough to consider just the first n elements when generating the hash code. For this purpose it would be great to enhance the .AndEach() call to take a second, optional parameter, AndEach<T>(IEnumerable<T> items, int count = 10), which would simply be added to the LINQ statement as items.Take(count).Select(GetHashCode). At the call site you can then provide a meaningful number for your use case. This update may change the behaviour of existing implementations. To avoid this, the optional parameter could instead default to int.MaxValue to preserve the current behaviour.

That's an interesting idea. Not sure how often this use case pops up.

Laksh commented on 2019-07-11 21:46:51

Are any of the hash code methods you mentioned deterministic? The HashCode class from .NET Core is not deterministic.

Muhammad Rehan Saeed commented on 2019-07-16 10:14:47

> Are any of the hash code methods you mentioned deterministic? The HashCode class from .NET Core is not deterministic.

GetHashCode can return different values every time an application is restarted (in .NET Core, for example, string hash codes and System.HashCode are seeded randomly per process). So it is not strictly deterministic and you should not rely on it being so. It sounds like you want to use one of the hash algorithms in the System.Security.Cryptography namespace, which are deterministic.
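
As an illustration of that suggestion, here is a minimal sketch that derives a stable 32-bit value from a string with SHA-256. `StableHashSketch` and `Compute` are hypothetical names, and truncating the digest to its first four bytes is an assumption made for brevity. It is much slower than GetHashCode, so it suits persisted or cross-process keys rather than hash tables.

```csharp
using System.Security.Cryptography;
using System.Text;

public static class StableHashSketch
{
    public static int Compute(string text)
    {
        using (SHA256 sha = SHA256.Create())
        {
            // UTF-8 gives the same bytes on every platform.
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(text));

            // Assemble the first four bytes explicitly (little-endian order)
            // so the result does not depend on the machine's endianness.
            return digest[0]
                | (digest[1] << 8)
                | (digest[2] << 16)
                | (digest[3] << 24);
        }
    }
}
```

The same input produces the same value across restarts and machines, which the per-process GetHashCode results do not guarantee.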

Nawfal commented on 2020-05-06 20:36:52

Hi Rehan, is your HashCode implementation published as a NuGet package?

Muhammad Rehan Saeed commented on 2020-05-07 12:48:30

> Hi Rehan, is your HashCode implementation published as a NuGet package?

No. Maybe I'll add it though. Seems strange to do so for one simple class.

Nawfal commented on 2020-05-07 19:34:39

> No. Maybe I'll add it though. Seems strange to do so for one simple class.

Rehan, alright. It's not really required. I could just copy the class. Asking just in case.
