Sphinx can search for words that occur within the same sentence. For example, take this text:
Vasya is a good lad (молодец), he ate a cucumber (огурец) because (т.к.) he got hungry (проголодался). So it goes.
If you query
молодец SENTENCE огурец
this text is found. If you query
молодец SENTENCE проголодался
this text is no longer found, apparently because Sphinx splits text into sentences in a simple way and treats the first period it encounters as the end of the sentence. Hence the question.
How can Sphinx be configured to split text into sentences more intelligently when building an index? Any option will do: a setting in the configs, or plugging in an external package for sentence splitting, for example Tomita Parser from Yandex.
UPDATE
One idea was to split the text into sentences beforehand with Tomita Parser and tell Sphinx to use a line break as the sentence separator, but judging by the Sphinx source code, this is unlikely to work.
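A lighter preprocessing option, short of full sentence splitting, might be to rewrite known abbreviations before the text reaches the indexer, so that their dots never look like sentence ends. Below is a minimal Python sketch of that idea. The function name `protect_abbreviations`, the abbreviation list, and the Russian sample sentence are all assumptions made for illustration. The uppercase check is only a heuristic for "this abbreviation probably ends a sentence", not anything Sphinx itself does.

```python
import re

# Hypothetical abbreviation list (keys must be lowercase); extend it for your corpus.
ABBREVIATIONS = {"т.к.": "тк", "т.п.": "тп", "т.д.": "тд"}


def protect_abbreviations(text, abbreviations=ABBREVIATIONS):
    """Remove the dots from known abbreviations unless they seem to end a sentence.

    Heuristic (an assumption, not Sphinx behaviour): an abbreviation that is
    followed by an uppercase letter, or by the end of the text, is treated as
    closing a sentence and is left untouched so the boundary survives.
    """
    # Try longer abbreviations first so overlapping entries match greedily.
    alternatives = "|".join(
        re.escape(abbr) for abbr in sorted(abbreviations, key=len, reverse=True)
    )
    # (?<!\w) keeps us from matching in the middle of a longer word.
    pattern = re.compile(r"(?<!\w)(?:" + alternatives + r")", re.IGNORECASE)

    def replace(match):
        rest = text[match.end():].lstrip()
        if not rest or rest[0].isupper():
            return match.group(0)  # probably a sentence end: keep the dot
        return abbreviations[match.group(0).lower()]

    return pattern.sub(replace, text)


if __name__ == "__main__":
    # Assumed sample sentence, modelled on the example in the question.
    print(protect_abbreviations("Вася молодец, съел огурец, т.к. проголодался. Такие дела."))
    # Abbreviation at a likely sentence end is left as it is.
    print(protect_abbreviations("Одно предложение и т.п. Другое предложение"))
```

The text would have to be run through such a step before it is handed to the indexer. The heuristic still misfires when an abbreviation is followed by a proper noun, so it reduces the problem rather than solving it.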
Have you tried stopwords, or are they used only for the search string? - Makarenko_I_V
I just tried stopwords, but it did not work. It did work with exceptions, though. I tried setting
т.к. => тк
and it helped. But this is very much a compromise, because it risks gluing sentences together when a sentence ends with an abbreviation ("Одно предложение и т.п. Другое предложение", i.e. "One sentence, etc. Another sentence"). - mnv