Dear colleagues, faced with such a question regarding the hash table, please tell me:

It is clear that each cell of the array can be either a linked list or a tree. And how is the length of the array itself determined? Is there any generalized principle that we could say something like: "the context in which the given table will be used, such and such, and therefore we consider the length of its array by such and such a formula."

Or any firmly justifying the decision on the length of the criteria that must be followed when creating your table?

    2 answers 2

    And how is the length of the array itself determined?

    By default, an array of 16 items (baskets) is created.

    HashMap has attributes such as:

    • capacity - capacity / capacity (the current size of the array is not a class field );
    • size - the number of elements in the HashMap ;
    • loadFactor - load factor;
    • threshold is the threshold for the number of HashMap elements.

    In this case, the threshold calculated as capacity * load factor .

    After adding a pair to the HashMap , a check is performed:

     if (size++ >= threshold) 

    and, if the number of HashMap elements is equal to or greater than the threshold value, a new array is created, the size of which is twice the size of the old array:

     resize(2 * table.length); 

    at the same time, all elements of the old array are distributed to the new array and the capacity value is changed and the threshold recalculated.

    Example:

    Created a HashMap with default parameters:

    • capacity = 16
    • size = 0
    • loadFactor = 0.75
    • threshold = (int)(16 * 0.75f) = 12

    Added 11 different pairs:

    • capacity = 16
    • size = 11
    • loadFactor = 0.75
    • threshold = 12

    After adding the 12th pair we get:

    • capacity = 32
    • size = 12
    • loadFactor = 0.75
    • threshold = (int)(32 * 0.75f) = 24

    All of the above is true for JDK 7 and below, but in JDK 8 the table is expanded a little differently.

    • Thank you very much for your answer. Could you explain a little more? 1. What is the idea of ​​'loadFactor' - the load factor is the constant '0.75' for calculating the 'threshold'? Or is he somewhere else? 2. Is 'capacity' and 'table.length' the same? - Pavel
    • one
      1. loadFactor is a coefficient that determines when the hash table is rebuilt (used only to calculate the threshold ). A value of 0.75 contributes to better performance in terms of memory and time. A somewhat loose explanation, but I think the essence was informed. 2. Yes, they both have the same meaning. - post_zeew

    If we create a Map hashMap = new Hashmap()

    This is the default busket size -a 16 and a load factor of 0.75 (load factor). When the number of entries in the hash table exceeds the load factor, it doubles the busket -a busket .

    You can also set the size of the busket -a

     HashMap(int initialCapacity) HashMap(int initialCapacity, float loadFactor) 

    https://docs.oracle.com/javase/7/docs/api/java/util/HashMap.html

    • one
      It is exhaustive about how , but not a word about the essence of the question (although the TC accepted the answer), i.e. about what exactly should be guided (or at least what to think about yourself) by choosing specific values ​​of initialCapacity and loadFactor - avp
    • one
      То по умолчанию размер busket-a 16 - what is the size of busket -a? Maybe all the same size of the array? Когда число записей в хэш-таблице превышает коэффициента нагрузки - the number of records in the hash table will exceed the value of the load factor after adding one element (with default parameters). - post_zeew