Why, when overriding the Equals() method, is GetHashCode() also advised to override the GetHashCode() method?
And one more following question:
Having the following fields

 public readonly int x, y; 

On msdn.com, the GetHashCode() method GetHashCode() overridden as follows:

 public override int GetHashCode() { return x ^ y; } 

And if the fields would look like this:

 public readonly int x, y, z; 

Will that override method look like this?

 public override int GetHashCode() { return x ^ y ^ z; } 

Do I understand correctly? And what if the data type of the fields will be different?

    1 answer 1

    The fact is that any object in .NET can be put in a hash table.

    Suppose now that you have a class X in which the Equals method is redefined, but not GetHashCode redefined. What will happen in this case?

    Suppose you have two objects x1 and x2 type X , which coincide according to the Equals function. Since you did not redefine GetHashCode , most likely their hashcodes will be different .

    If you put the value of x1 in HashSet<X> , and then you search for x2 there, then this value will not be found , despite the fact that x1 and x2 are equal. This happens because when searching in HashSet<X> , the hash bucket that matches the hashcode is first searched, and only the element itself is searched for in it - this is the basic principle of the hash table, ensuring its effectiveness. Thus, elements of class X cannot be stored normally in hash tables.

    The same applies to Dictionary<X, T> , which is also based on a hash table and uses the GetHashCode method.


    Regarding the correct definition of GetHashCode , advice from MSDN is not very good. The fact is that if an object has two fields, then often these are small numbers, which are also often equal. Therefore, with this definition, there will be too few different hashcode values, and many X instances will fall into the same hash bucket (which reduces the effectiveness of the hash tables).

    The recommended method appears to be:

     public override int GetHashCode() { var hash = 19; // избегаем хэш-коллизий, используем (небольшое) простое число hash = hash * 37 + x.GetHashCode(); hash = hash * 37 + y.GetHashCode(); // обобщается на произвольное число полей return hash; } 

    (19 and 37 are small different prime numbers) Here the field types are unimportant, since on each of them the hashcode is taken.

    Be sure to follow the rule: if objects are equal in Equals , they must have the same hash codes.


    Another important note: if you put an element in a HashSet<T> , you should not change its state (field values) so that its hashcode changes! It is best, of course, not to change it at all. Otherwise, when you search for this item, it simply will not be found.


    Additional reading on the topic:

    • "Guidelines and rules for GetHashCode" - translation - mals
    • @MaLS: Thank you! Include in the answer. - VladD