Having a large prefix tree (about 300 meg), how can you quickly get a unique hash of the whole tree after a point change? How can you make a unique cast (SHA3 for example) of the whole tree?

    1 answer 1

    I have such an idea: what exactly we are going to hash is not very important, it is actually some sort of byte set. For simplicity, we assume that the tree is not dynamic, but is packed into an array (so that the data continuity). You can do without it.

    After that, any standard hashing algorithm is not hard to apply (we transfer the raw data array, it will be hashed).

    Now about the first question. Usually, hashes are specifically designed to make such an operation difficult. But you can use a series of polynomial hashes. They are not very reliable, but this can be offset by their number.

    A polynomial hash is calculated like this:

    for (char x : _data) hash = (hash * BASE + (size_t)x ) % MOD; 

    Where BASE and MOD are hash parameters.

    The advantage of this particular hash in your task is that it can be "cut". Suppose we changed the element in _data[K] . Then the new hash will be (old_hash + (new_data[K] - old_data[K]) * BASE ^ (data_size - K - 1) ) % MOD To quickly calculate it, it is enough to have an array of pre-calculated values Pow[K] = Pow[K-1] * BASE % MOD then any change can be made in O (1). The time for the pre-calculation will be no more than the initial calculation of the hash.

    This is rather a subjective answer, in a way there are more convenient and reliable ways.