Several methods could solve your problem: binary search over polynomial hashes, or a prefix tree.
Polynomial hashes. A hash function is a mapping that assigns a number to an object, in our case a string. A polynomial hash computes that number as a polynomial built from the character codes. Generally speaking, any function can be used, but then two different strings may end up with the same value (a collision). Polynomial hashes reduce this risk. The hash is computed as follows:
f(s) = s[0] + s[1]*p + s[2]*p^2 + ... + s[n-1]*p^(n-1),
where s[i] is the code of the i-th character and p is a fixed base (an integer larger than the alphabet size is a common choice).
As you can see, this involves many exponentiations, which is slow. If speed matters, fast exponentiation can be applied. In addition, high powers produce very large numbers, which cost both time and memory, so the polynomial hash is usually taken modulo k:
[f(s)]_k, where k is a large prime number and [.]_k denotes the remainder after division by k.
Note that the remainder can be taken for each term separately:
[f(s)]_k = [ [s[0]]_k + [s[1]*p]_k + [s[2]*p^2]_k + ... + [s[n-1]*p^(n-1)]_k ]_k
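A minimal sketch of such a hash in Python, using the common base-p form of a polynomial hash (each character code multiplied by a power of a base). The function name poly_hash, the base p = 257 and the modulus k = 1_000_000_007 are illustrative assumptions, not values taken from the question. The power of p is updated incrementally and reduced modulo k at every step, so no large intermediate numbers appear.

```python
def poly_hash(s: str, p: int = 257, k: int = 1_000_000_007) -> int:
    """Return (s[0] + s[1]*p + ... + s[n-1]*p^(n-1)) mod k.

    p and k are assumed example values: p is the base, k a large prime.
    """
    h = 0
    power = 1  # holds p^i mod k for the current position i
    for ch in s:
        # Reduce after every term so intermediate values stay below k*k.
        h = (h + ord(ch) * power) % k
        power = (power * p) % k
    return h
```

If a single power is needed on its own, Python's built-in three-argument pow(p, i, k) performs fast modular exponentiation.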
Having computed the hashes, sort the strings by hash. The desired element can then be found in the sorted array with binary search. The search itself then takes O(max(log N, log M)), where M and N are the string lengths; computing the hash of a string takes time linear in its length.
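The sort-then-search idea can be sketched as follows. This is only an illustration under assumptions: the helper names build_index and contains are hypothetical, and the hash uses an assumed base of 257 and modulus 1_000_000_007. The index stores (hash, string) pairs, so when two different strings share a hash the string itself breaks the tie and a collision cannot produce a false match.

```python
from bisect import bisect_left

def poly_hash(s, p=257, k=1_000_000_007):
    """Polynomial hash with assumed example base p and prime modulus k."""
    h, power = 0, 1
    for ch in s:
        h = (h + ord(ch) * power) % k
        power = (power * p) % k
    return h

def build_index(strings):
    """Return a list of (hash, string) pairs sorted by hash, then by string."""
    return sorted((poly_hash(s), s) for s in strings)

def contains(index, query):
    """Binary-search the sorted index for the query string."""
    key = (poly_hash(query), query)
    i = bisect_left(index, key)  # leftmost position where key could be inserted
    return i < len(index) and index[i] == key
```

For example, after index = build_index(["banana", "apple", "cherry"]), the call contains(index, "apple") returns True and contains(index, "grape") returns False.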
Prefix tree (trie). This tree stores all prefixes of the strings, so a lookup in the tree takes O(L), where L is the maximum string length.
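A minimal sketch of a prefix tree in Python, assuming a dictionary of children per node. The class and method names (TrieNode, Trie, insert, contains, has_prefix) are illustrative choices, not part of the question. Each operation walks at most one node per character, which gives the bound stated above.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # maps a character to the child node
        self.is_end = False  # True if a stored string ends at this node

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        """Add a word, creating nodes for any missing characters."""
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_end = True

    def _walk(self, s):
        """Follow s from the root; return the last node, or None if the path breaks."""
        node = self.root
        for ch in s:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains(self, word):
        """True if exactly this word was inserted."""
        node = self._walk(word)
        return node is not None and node.is_end

    def has_prefix(self, prefix):
        """True if some inserted word starts with this prefix."""
        return self._walk(prefix) is not None
```

For example, after inserting "car", "cart" and "dog", contains("car") is True, contains("ca") is False, and has_prefix("ca") is True.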