Describe a pair of string hashing algorithms. If there is, then with the code in java or pascal.

  • five
    In my opinion the cons are not fair. The question is quite correct and clear. Man wants to know simple hashing algorithms + implementation. - Artem Konovalov

2 answers 2

Please open the source code for the java.lang.String class, find the hashCode method there:

 public int hashCode() { int h = hash; if (h == 0 && value.length > 0) { char val[] = value; for (int i = 0; i < value.length; i++) { h = 31 * h + val[i]; } hash = h; } return h; } 
  • And what is interesting is, is the conversion of 31 * h into (h << 5) - h into javac ( at least in C (Ubuntu, gcc -O3, Intel (R) Core (TM) i5-2500 CPU @ 3.30 GHz) code with a shift more than one and a half times faster ) or the developers of the package chose to preserve the obviousness of the algorithm. - avp
  • What is the hash variable? - Nick
  • in String it is declared as private int hash; - Artem Konovalov
  • @Nick, this is the cache value of the hash. Since the strings are immutable, it is enough to calculate it once and write it into an instance variable, then you can simply just give it away without calculating it again each time. - iksuy
  • @iksuy is still cleverly done there, the field is not volatile, i.e. implied race data. But in order not to kill the performance, the developers deliberately went for it, finding that a few calculations of the function are less expensive than the cost of a volatile field. - Artem Konovalov

When there was no unordered_map in the C ++ standard, in its implementation I used the CRC quite successfully (it was then that I used CRC16).

Guntherot described the following variant in Optimization of C ++ Programs:

 struct hash_c_string { void hash_combine(size_t& seed, T const& v) { seed ^= v + 0x9e3779b9 + (seed << 6) + (seed >> 2); } std::size_t operator() (char const* p) const { size_t hash = 0; for (; *p; ++p) hash_combine(hash, *p); return hash; } }; 

But in general, it is necessary to experiment , for a specific set, a seemingly completely bad hash can pass, well, and vice versa ...

After that , you can see such a thing as GPERF - although this is far from the simplest hash :)