I urgently need to split Russian text read from a file into sentences. A simple split on ".", "!" or "?" will not work, because a period does not always end a sentence. The splitter has to account for abbreviations such as "т. о.", "и др." and so on; for abbreviations placed before a proper name (for example, "г. Москва"); and for initials, as in "Иванов И. И.". At the moment my regular expression code looks like this:
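string[] splitSentences = Regex.Split(sTemp, @"(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=\.|\?)(\s|[A-Z].*)");
Clearly this is not enough: the character classes cover only Latin letters, "!" is not treated as a sentence terminator, and none of the abbreviation cases listed above are handled. Any help would be appreciated.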
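For reference, here is a rough sketch of the direction I am considering instead of a single Regex.Split call: find every candidate boundary (end punctuation, whitespace, then an uppercase letter) and reject the candidate when the word before the period is a single uppercase initial or a known abbreviation. The class name SentenceSplitter, its methods, and the abbreviation list are all hypothetical and made up for illustration; the list in particular is far from complete. This is a heuristic under those assumptions, not a finished solution.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SentenceSplitter
{
    // Hypothetical, deliberately short list of abbreviations that usually
    // precede a name or a reference and therefore do not end a sentence.
    private static readonly HashSet<string> NonTerminalAbbreviations =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { "г", "ул", "пр", "им", "проф", "акад", "тов", "гр", "см", "ср" };

    // Candidate boundary: whitespace that follows end punctuation and is
    // followed by an uppercase letter (optionally after an opening quote or bracket).
    private static readonly Regex Candidate =
        new Regex(@"(?<=[.!?…])\s+(?=[«""(]?\p{Lu})");

    // The word immediately before a trailing period.
    private static readonly Regex LastWord = new Regex(@"(\p{L}+)\.$");

    public static List<string> Split(string text)
    {
        var sentences = new List<string>();
        int start = 0;
        foreach (Match m in Candidate.Matches(text))
        {
            string head = text.Substring(start, m.Index - start);
            // Skip the candidate: the period belongs to an initial or abbreviation.
            if (EndsWithProtectedToken(head)) continue;
            sentences.Add(head.Trim());
            start = m.Index + m.Length;
        }
        string tail = text.Substring(start).Trim();
        if (tail.Length > 0) sentences.Add(tail);
        return sentences;
    }

    private static bool EndsWithProtectedToken(string head)
    {
        Match m = LastWord.Match(head);
        if (!m.Success) return false; // ends with "!", "?", "…" or "..."
        string word = m.Groups[1].Value;
        bool isInitial = word.Length == 1 && char.IsUpper(word[0]);
        return isInitial || NonTerminalAbbreviations.Contains(word);
    }
}
```

It would be called as SentenceSplitter.Split(sTemp). Abbreviations like "т. о." are not split because the next letter is lowercase, so no candidate boundary is produced there. The known weak spot is a sentence that ends with initials or with a listed abbreviation ("... приехал Иванов И. И. Он ..."): the sketch merges it with the following sentence, and I do not see how to resolve that with pattern matching alone.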