How to get a hash from an arbitrary class for an unordered container?

Question

In order to use your own class as a key in unordered_map (for example) you must define a hash function. Cppreference.com example

 template<> struct hash<S> { typedef S argument_type; typedef std::size_t result_type; result_type operator()(argument_type const& s) const { result_type const h1 ( std::hash<std::string>()(s.first_name) ); result_type const h2 ( std::hash<std::string>()(s.last_name) ); return h1 ^ (h2 << 1); } };

this concerns two fields, I am confused by this moment h1 ^ (h2 << 1) I saw examples without shift and now I don’t quite understand how I project this example on a larger number of fields, according to which rule. Can I just do h1 ^ h2 ^ ... ^ h(n) ? or do each subsequent hash I need to shift? h1 ^ ( h2 << 1 ) ^ ... ^ ( h(n) << n ) ?

I would not like to encounter conflicts, they will be very difficult to detect, but they will spoil the work significantly.

Community spirit ♦ one · Accepted Answer · 2015-07-16T14:34:59

In fact, it's best to try to “mix” hashcodes as much as possible. For example, Jon Skeet gives this example :

 result_type hash = (result_type)2166136261; hash = hash * 16777619 ^ std::hash<std::string>()(s.first_name); hash = hash * 16777619 ^ std::hash<std::string>()(s.second_name); hash = hash * 16777619 ^ std::hash<std::string>()(s.third_name); return hash;

Code

 h1 ^ h2 ^ ... ^ h(n)

It is considered incorrect, because quite often the fields have the same values and have equal hashes (and for integer fields, the field itself is often used as a hash). Since during the operation ^ equal values are mutually destroyed, then we get a smaller number of hash values and, accordingly, a lot of collisions.

Also popular option is this:

 result_type hash = start; hash = hash * factor + std::hash<std::string>()(s.first_name); hash = hash * factor + std::hash<std::string>()(s.second_name); hash = hash * factor + std::hash<std::string>()(s.third_name); return hash;

(it uses Java and .NET) for a suitable start and factor value. As a factor , a small prime number is usually used, like 13 or there 29. start should be in theory mutually simple with a factor (for example, another simple number, usually more).

Here is a good overview article (in English) on various hash calculation methods.

Community spirit ♦ one · Answer 2 · 2015-07-20T08:07:52

Here's another option for the link using boost

 #include <boost/functional/hash.hpp> struct KeyHasher { std::size_t operator()(const Key& k) const { using boost::hash_value; using boost::hash_combine; // Start with a hash value of 0 . std::size_t seed = 0; // Modify 'seed' by XORing and bit-shifting in // one member of 'Key' after the other: hash_combine(seed,hash_value(k.first)); hash_combine(seed,hash_value(k.second)); hash_combine(seed,hash_value(k.third)); // Return the result. return seed; } };

How to get a hash from an arbitrary class for an unordered container?

2 answers 2

More articles: