There is N-gram search. It treats a string as a set of overlapping substrings of length N, and the relevance measure is the number of such substrings shared between the document and the search query.
This approach tolerates minor typos in words and can find a word by any sufficiently long fragment of it (of length N + 1 or more).
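As a rough illustration, here is a minimal sketch of that relevance count in plain SQL (PostgreSQL syntax; the strings 'somethng', a query with a typo, and 'something', the document, are made-up examples):

    WITH query_grams AS (
        SELECT DISTINCT substr('somethng', i, 3) AS gram
        FROM generate_series(1, length('somethng') - 2) AS i
    ), doc_grams AS (
        SELECT DISTINCT substr('something', i, 3) AS gram
        FROM generate_series(1, length('something') - 2) AS i
    )
    -- the count of shared trigrams is the relevance indicator
    SELECT count(*) AS common_trigrams
    FROM query_grams JOIN doc_grams USING (gram);

Despite the typo, four trigrams (som, ome, met, eth) still match, so the misspelled query still finds the document.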
There are plenty of implementations to choose from.
ElasticSearch has an N-gram tokenizer. So does Apache Solr. In Sphinx, N-gram search can be enabled (it is argued that it mainly makes sense for Korean, Japanese, and Chinese, where word segmentation is the hard part).
As you can see, practically every well-known search engine supports it, and if you need really powerful search, it is better to use a product built specifically for that purpose. Digging through the documentation of whichever one you pick, you may find other algorithms you like even more.
And now something less ordinary.
For PostgreSQL there is a trigram search module (pg_trgm). As you might guess, this is N-gram search with N = 3.
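A quick look at what the module actually produces (show_trgm and similarity ship with pg_trgm; note that pg_trgm pads words with spaces, so its trigram set differs slightly from the naive sketch above):

    CREATE EXTENSION IF NOT EXISTS pg_trgm;

    -- show_trgm() exposes the trigrams pg_trgm extracts (note the space padding)
    SELECT show_trgm('cat');                    -- {"  c"," ca","at ","cat"}
    -- similarity() returns a match score between 0 and 1
    SELECT similarity('something', 'somethng');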
In practice it requires a separate GIN or GiST index built with the operator class provided by the module. GIN is quite large, fairly slow to update, but fast to search; GiST is more compact and faster to update, but can return false matches (which have to be rechecked). So for rarely updated data, GIN is the better choice.
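For example (the table docs and column title here are assumptions, not from the original):

    -- GIN variant; swap in "USING gist (title gist_trgm_ops)" for GiST
    CREATE INDEX docs_title_trgm_idx ON docs USING gin (title gin_trgm_ops);

    -- % is the pg_trgm similarity operator; it and LIKE '%...%' patterns
    -- can both use this index
    SELECT title FROM docs WHERE title % 'somethng';
    SELECT title FROM docs WHERE title ILIKE '%ethin%';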
This is a good option if you already use PostgreSQL, it is not under heavy load (or you can arrange for that), and you do not need much intelligence from the search.
PostgreSQL also has a full-text search implementation, but that no longer has anything to do with N-grams.
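For contrast, a one-line taste of that built-in full-text search, which works on stemmed words rather than substrings:

    -- true: 'searched' stems to 'search', so the query matches
    SELECT to_tsvector('english', 'something searched') @@ to_tsquery('english', 'search');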
like '%text' and like 'text%' work just fine via indexes. Besides, directories are usually not very large, and even searching without indexes is fast - Viktorov
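A sketch of what the commenter likely means (the table dirs and column name are hypothetical): a plain B-tree index with text_pattern_ops covers LIKE 'text%', and the usual trick of an expression index on reverse(name) covers LIKE '%text':

    CREATE INDEX dirs_name_idx ON dirs (name text_pattern_ops);
    CREATE INDEX dirs_name_rev_idx ON dirs (reverse(name) text_pattern_ops);

    SELECT * FROM dirs WHERE name LIKE 'text%';                    -- uses dirs_name_idx
    SELECT * FROM dirs WHERE reverse(name) LIKE reverse('%text');  -- uses the reversed index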