I would like to know how an object's hash code is formed in C#. For example, take this test class:

public class A
{
    public int b = 0;
    public int c;
}

Later, somewhere in the code:

 A a1 = new A();
 A a2 = new A();
 a1.c = 10;
 a2.c = 10;

By any reasonable logic, a1 and a2 are completely identical. But not for C#: their hashes are different. Of course, this distinction could be justified by, say, different addresses in memory, or by creation time, after all :). So I want to know exactly how object hashes are formed, and what justifies the difference for identical user objects.

    1 answer

    To begin with, since your class A does not override GetHashCode(), this function is inherited from object.

    The exact algorithm by which GetHashCode() is computed for an object is not specified:

    The default implementation of GetHashCode does not guarantee unique values for different objects. In addition, the .NET Framework does not guarantee that the default implementation of GetHashCode, including the values it returns, will not change when the framework version changes. Accordingly, the default value of the GetHashCode function should not be used as a unique object identifier for the purpose of hashing.

    The fact that the two objects seem identical to you does not guarantee that they will have the same hash code under the default implementation. Judge for yourself: if two people have the same first name and surname, does that mean they are one and the same person? Similarly, when all the fields of two instances of class A coincide, the .NET framework does not yet consider those instances to be the same object. Accordingly, the default implementation may use anything at all, for example the address at creation or the time of creation; the documentation says nothing about the exact method. If you are not comfortable with the default behavior, you should override it.

    If your class does not use Equals and is never used as a key in hash tables, you can in principle do nothing; the default behavior should suit you. But if your class serves as a key, you need to override both Equals and GetHashCode(), and do so consistently: if x.Equals(y) returns true, then the hash codes of x and y must match.
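To see why this matters for keys, here is a minimal sketch of what a hash-based collection does with a class that keeps the defaults. The names PlainA and DefaultKeyDemo are hypothetical and exist only for this illustration; PlainA has the same fields as A and no overrides.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class for illustration: same fields as A, no overrides.
public class PlainA
{
    public int b = 0;
    public int c;
}

public static class DefaultKeyDemo
{
    // Adds two field-identical instances to a HashSet and reports
    // how many elements the set ends up holding.
    public static int CountDistinct()
    {
        var a1 = new PlainA { c = 10 };
        var a2 = new PlainA { c = 10 };

        // The inherited Equals compares references, so a1 and a2
        // are two different keys regardless of their hash codes.
        var set = new HashSet<PlainA> { a1, a2 };
        return set.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountDistinct()); // prints 2
    }
}
```

The set keeps both instances because the inherited Equals never reports them as equal, whatever their fields contain.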

    It follows that you must override the GetHashCode() function along with the Equals function. For example:

     public class A
     {
         public int b = 0;
         public int c;

         public override bool Equals(object obj)
         {
             return Equals(obj as A);
         }

         public bool Equals(A that)
         {
             if (that == null)
                 return false;
             return this.b == that.b && this.c == that.c;
         }

         public override int GetHashCode()
         {
             return b.GetHashCode() * 13 + c.GetHashCode();
         }
     }
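As a usage sketch, a class with these overrides can be looked up in a dictionary through a different but field-identical instance. KeyA and OverriddenKeyDemo are hypothetical names chosen for this example; KeyA follows the same Equals/GetHashCode pattern as the class above, with the arithmetic wrapped in unchecked so that overflow wraps around even in projects compiled with overflow checking.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical name; follows the same override pattern as class A above.
public class KeyA
{
    public int b = 0;
    public int c;

    public override bool Equals(object obj)
    {
        return Equals(obj as KeyA);
    }

    public bool Equals(KeyA that)
    {
        if (ReferenceEquals(that, null))
            return false;
        return b == that.b && c == that.c;
    }

    public override int GetHashCode()
    {
        // unchecked: an overflow in the multiplication simply wraps around.
        unchecked
        {
            return b.GetHashCode() * 13 + c.GetHashCode();
        }
    }
}

public static class OverriddenKeyDemo
{
    // Stores a value under one instance and looks it up with another
    // instance that has the same field values.
    public static string Lookup()
    {
        var dict = new Dictionary<KeyA, string>();
        dict[new KeyA { c = 10 }] = "found";

        string value;
        return dict.TryGetValue(new KeyA { c = 10 }, out value) ? value : "missing";
    }

    public static void Main()
    {
        Console.WriteLine(Lookup()); // prints found
    }
}
```

The lookup succeeds because equal field values now produce both an equal hash code and a true result from Equals, which is the consistency the dictionary relies on.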

    In C#, unlike in many other languages, there are two kinds of objects: those that are completely defined by their attributes (that is, fields), and those that are independent entities, not reducible to the values of their attributes.

    An example of the first kind is numbers: if you write

     x = 5; y = 5; 

    - there is no difference between the first five and the second; it is one and the same entity. In .NET such things are called value types and are declared with the struct keyword. For them, the default Equals method compares the values of all fields and, if they are equal, concludes that the structures are equal. Therefore structures can be copied by value, and a structure can be replaced by its copy without serious consequences. (The default GetHashCode method is defined accordingly: for equal structures it is guaranteed to return the same hash.)
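A small sketch of this behavior, using a hypothetical struct named Pair that overrides nothing (StructEqualityDemo is likewise an illustrative name):

```csharp
using System;

// Hypothetical value type for illustration; it overrides nothing.
public struct Pair
{
    public int X;
    public int Y;
}

public static class StructEqualityDemo
{
    // True when two separately created structs with equal fields
    // compare equal and report the same hash code.
    public static bool EqualFieldsMeanEqualStructs()
    {
        var p = new Pair { X = 5, Y = 7 };
        var q = new Pair { X = 5, Y = 7 };
        return p.Equals(q) && p.GetHashCode() == q.GetHashCode();
    }

    // Assignment copies the value, so changing the copy
    // leaves the original untouched.
    public static int OriginalAfterCopyIsModified()
    {
        var p = new Pair { X = 5, Y = 7 };
        var copy = p;
        copy.X = 100;
        return p.X;
    }

    public static void Main()
    {
        Console.WriteLine(EqualFieldsMeanEqualStructs());  // prints True
        Console.WriteLine(OriginalAfterCopyIsModified());  // prints 5
    }
}
```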

    An example of the second kind is a car, whose attributes are brand, year of manufacture, engine power, and so on. Two cars with the same technical parameters are still different cars. In C# such objects are called reference types and are declared with the class keyword. For classes, accordingly, the default Equals implementation has no right to assume that identical field values automatically mean equality; each instance of the class is considered unique.
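The car example can be sketched as follows. Car and ReferenceEqualityDemo are hypothetical names, and the field values are arbitrary; the point is only that the inherited Equals looks at identity, not at fields.

```csharp
using System;

// Hypothetical reference type for illustration; it overrides nothing.
public class Car
{
    public string Brand = "";
    public int Year;
}

public static class ReferenceEqualityDemo
{
    // Two separately created cars with the same attributes.
    public static bool TwinsAreEqual()
    {
        var first = new Car { Brand = "Lada", Year = 2010 };
        var second = new Car { Brand = "Lada", Year = 2010 };
        return first.Equals(second);
    }

    // One car reached through two variables.
    public static bool AliasIsEqual()
    {
        var car = new Car { Brand = "Lada", Year = 2010 };
        var sameCar = car; // copies the reference, not the object
        return car.Equals(sameCar) && ReferenceEquals(car, sameCar);
    }

    public static void Main()
    {
        Console.WriteLine(TwinsAreEqual()); // prints False
        Console.WriteLine(AliasIsEqual());  // prints True
    }
}
```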

    Think about whether you actually need a structure instead of the class A.

    The question of whether we humans are all defined solely by our set of parameters, or whether there is something beyond it, is not so simple.

    • Thanks. This is undoubtedly very interesting :) But you mostly talked about what I already knew, or about what is obvious (the "two kinds" of objects, for example). As for how the GetHashCode() and Equals() methods work for structures, to be honest, I had never even thought about it (though I think I could have guessed). However, I still do not understand where the difference comes from in the hashes returned by GetHashCode() for user types (classes) with an identical set of fields... What is the hash generated from? That is what I really want to know. - AseN
    • @Hancock: The documentation explicitly states that the algorithm is not specified; I quoted it. This means that GetHashCode may use any calculation algorithm, not necessarily one based on field values. The current version of .NET has used the freedom given to it and ignores the field values; the next version may take them into account, and both are correct. How exactly the current version computes the hash, the documentation does not say, and we do not need to know, since the algorithm can change at any moment. - VladD
    • @Hancock: If you need the algorithm to have certain properties (for example, to take field values into account), you need to override Equals and GetHashCode. - VladD
    • Ok, thanks, I understood: none of the developers really knows how the GetHashCode() method works. - AseN
    • @Hancock: here is an article for you from one of the C# developers: blogs.msdn.com/b/ruericlippert/archive/2011/03/20/… - mals