Integers are sequentially added to the table and it is not known in advance how many will be — maybe 20, or maybe 1000. Therefore, it is necessary, each time all cells are filled, to increase the size of the table. The hash function will be the simplest:

return number%hashSize; 

where hashSize is the size of the table at the current moment, number is the current number to be added.

Let initially the size of the table 10. Add the number 15. It goes into the fifth cell (15% 10 = 5). Then, after some time, when the size of the table becomes different, for example, 20, we add once again the number 15. And it will already go to another cell, in the 15th (15% 20 = 15). Although for correct work I had to go to the 5th.

What do they do in such cases?

  • one
    Maybe a hash function based on the current size is not the best choice? - MBo
  • one
    It is necessary to redistribute existing records between cells. - Nofate

1 answer 1

In creating a hash for an object, it is worth adhering to one very important rule - the hash should be calculated based only on the properties of the object itself . In your case, this rule is not respected, because The hash of your object (number) is calculated based on external factors (the size of the hash table).

In most languages, programming a 4-byte hash for an integer is that number. But your operation

 return number%hashSize; 

It should not be a hash calculation, but a search for a list sequence number (in Java, this is called a bucket and is a linked list, this is called a cell ), to which the new element will go.

With regard to increasing the size of the hash table, then there is no getting away from the redistribution of elements between the cells within the hash table, while this increase is going on.

  • "In most languages, programming a hash for an integer size of 4 bytes is that number." So what if the number is larger than the size of the array? I will get out of borders - zhukov
  • one
    If by the notion размера массива meant the number of cells in a hash table, then you will not get out. In other words, the process of adding an element to a hash table consists of two stages : calculating the hash of an element and calculating the sequence number of the cell into which the element falls. The second operation is the remainder of the hash division by the size of the hash table (ala number%hashSize ). And the remainder of the division can never be greater than the divider - Temka, too,