It is clear that no strict algorithm can be proposed here. You could try a variant of the "collapse into heaps" (clustering) algorithm proposed by M. Weinzweig and M. Bongard in 1973. I will not describe the theory here (it is not very simple), but you can judge the implementation for yourself.
- Create a function that computes the difference between two words. For example, it could work on the following principle:
  - If the words are the same length, compare their letters pairwise; each mismatch increases the difference value. The closer the compared letters are to the center of the word, the greater the weight of the mismatch.
  - If the words have different lengths, slide the shorter one along the longer one, padding the free positions with spaces, and take the minimum value of the difference function over all positions.
- Compute the pairwise differences for all words in the list.
- Perform the actual collapse:
  - Take a random word from the source list and build a new list (a heap) of the words whose difference from it is less than some threshold "delta".
  - Remove the selected words from the source list.
  - Repeat the process until the source list is empty.
- In each heap, choose the shortest word.
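
The steps above can be sketched roughly as follows in Python. This is only an illustration under my own assumptions, not the original Weinzweig and Bongard method: the function names (`same_length_diff`, `word_diff`, `collapse`, `representatives`) are hypothetical, the weight formula (1 at the edges of the word, growing linearly toward the center) is just one possible choice, and the differences are computed on demand instead of being stored in a pairwise table first.

```python
import random


def same_length_diff(a, b):
    """Weighted mismatch count for two words of equal length."""
    center = (len(a) - 1) / 2
    total = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            # Assumed weight: 1 at the edges, larger closer to the center.
            total += 1.0 + (center - abs(i - center))
    return total


def word_diff(a, b):
    """Slide the shorter word along the longer one, padding with spaces,
    and return the minimum difference over all offsets."""
    short, long_ = sorted((a, b), key=len)
    gap = len(long_) - len(short)
    return min(
        same_length_diff(" " * off + short + " " * (gap - off), long_)
        for off in range(gap + 1)
    )


def collapse(words, delta, rng=random):
    """Greedy collapse: pick a random seed word, gather every remaining word
    whose difference from the seed is below delta, remove them, repeat."""
    remaining = list(words)
    heaps = []
    while remaining:
        seed = rng.choice(remaining)
        # The seed always joins its own heap, so the loop is guaranteed to end.
        heap = [w for w in remaining if w == seed or word_diff(seed, w) < delta]
        remaining = [w for w in remaining if w not in heap]
        heaps.append(heap)
    return heaps


def representatives(heaps):
    """Pick the shortest word of each heap."""
    return [min(heap, key=len) for heap in heaps]
```

For example, `representatives(collapse(["cat", "cats", "dog", "dogs"], delta=2))` groups the words into two heaps and picks one word from each. The result depends heavily on "delta" and, for less clear-cut lists, on which random word happens to be chosen as the seed, so expect to tune it by hand.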
Clearly, this is not something to write in bash. What you are asking for does, in fact, require elements of AI.