I have the following regular expression:

Regex regex = new Regex("^[А-Яа-я]+$"); 

It works correctly, but it does not "understand" Ukrainian. That is, it accepts "фывапролдж" but rejects Ukrainian input (transliterated: "ilililia").

How can I make this regex "understand" Ukrainian?

Marked as a duplicate by ReinRaus, VladD (c#) on Aug 26 '15 at 7:39.

2 answers

You need to add the characters ёЁЇїІіЄєҐґ to the character class:

 "^[А-Яа-яёЁЇїІіЄєҐґ]+$" 

This is a character class for the letters of the Ukrainian alphabet only (I took the information from Wikipedia):

 [а-щА-ЩЬьЮюЯяЇїІіЄєҐґ] 

You may also need to add the apostrophe ' to it (see Vlad's comment).
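As a rough illustration of the Ukrainian-only class above, here is a minimal sketch. It assumes the apostrophe is wanted in the class and that the whole string must match (hence the `^...$` anchors, as in the question); the class name `UkrainianLettersDemo` and the method `IsUkrainianWord` are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

static class UkrainianLettersDemo
{
    // Ukrainian letters only, plus the apostrophe (an assumption for this sketch).
    // The range а-щ stops before ъ, ы, ь, э, ю, я, so ь, ю, я are listed explicitly.
    static readonly Regex UkrainianOnly =
        new Regex("^[а-щА-ЩЬьЮюЯяЇїІіЄєҐґ']+$");

    public static bool IsUkrainianWord(string s) => UkrainianOnly.IsMatch(s);

    static void Main()
    {
        Console.WriteLine(IsUkrainianWord("їжак"));   // True: all letters are in the class
        Console.WriteLine(IsUkrainianWord("п'ять"));  // True: the apostrophe is allowed
        Console.WriteLine(IsUkrainianWord("сыр"));    // False: ы is not a Ukrainian letter
    }
}
```

Because ы and э fall outside the listed characters, Russian-only words are rejected, which is the difference from the extended class built from the question's regex.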

Concerning ёЁ: in the Unicode table these letters lie outside the range of the other Russian letters, so they must be specified separately. The capital letters of the Russian alphabet occupy U+0410 - U+042F and the lowercase letters U+0430 - U+044F, whereas Ё has the code U+0401 and ё has U+0451.
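The code points can be checked directly. The sketch below is only an illustration; `YoCodePointDemo`, `CodePoint` and `InBasicRange` are names invented for it.

```csharp
using System;
using System.Text.RegularExpressions;

static class YoCodePointDemo
{
    // Formats a character's UTF-16 code unit in the usual U+XXXX notation.
    public static string CodePoint(char c) => "U+" + ((int)c).ToString("X4");

    // The regex from the question: only the ranges А-Я and а-я.
    public static bool InBasicRange(string s) => Regex.IsMatch(s, "^[А-Яа-я]+$");

    static void Main()
    {
        Console.WriteLine(CodePoint('А'));  // U+0410, start of the capital range
        Console.WriteLine(CodePoint('я'));  // U+044F, end of the lowercase range
        Console.WriteLine(CodePoint('Ё'));  // U+0401, before the capital range
        Console.WriteLine(CodePoint('ё'));  // U+0451, after the lowercase range

        Console.WriteLine(InBasicRange("елка"));  // True
        Console.WriteLine(InBasicRange("ёлка"));  // False: ё is outside both ranges
    }
}
```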

Regarding \p{IsCyrillic}: in .NET, you can match all Cyrillic characters with \p{IsCyrillic} (U+0400 - U+04FF) and, for completeness, \p{IsCyrillicSupplement} (U+0500 - U+052F). The expression then takes the form

 @"^[\p{IsCyrillic}\p{IsCyrillicSupplement}]+$" 

Or, with the apostrophe allowed as well: @"^['\p{IsCyrillic}\p{IsCyrillicSupplement}]+$" .
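A small sketch of how the named-block version might be used. The class name `CyrillicBlocksDemo` and the method `IsCyrillicWord` are hypothetical; the pattern is the apostrophe variant from this answer.

```csharp
using System;
using System.Text.RegularExpressions;

static class CyrillicBlocksDemo
{
    // Verbatim string (@"...") so that \p reaches the regex engine unchanged.
    static readonly Regex CyrillicWord =
        new Regex(@"^['\p{IsCyrillic}\p{IsCyrillicSupplement}]+$");

    public static bool IsCyrillicWord(string s) => CyrillicWord.IsMatch(s);

    static void Main()
    {
        Console.WriteLine(IsCyrillicWord("ёлка"));   // True: ё lies inside the Cyrillic block
        Console.WriteLine(IsCyrillicWord("п'ять"));  // True: the apostrophe is allowed
        Console.WriteLine(IsCyrillicWord("hello"));  // False: Latin letters
    }
}
```

Note that the Cyrillic block also contains a few characters that are not letters (for example, the thousands sign U+0482 and combining marks), so this pattern is broader than "letters only".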

  • It may make sense to add IsCyrillic, as in this answer. (Although that one is about Java.) - VladD
  • Sorry, I noticed one inaccuracy: the regular expression you provided for the Ukrainian language includes a letter of the Russian alphabet, "ы". - Evgeniy Miroshnichenko
  • @EvgeniyMiroshnichenko [а-щА-ЩЬьЮюЯяЇїІіЄєҐґ] does not match ы . ^[А-Яа-яёЁЇїІіЄєҐґ]+$ is the regular expression from the question with the Ukrainian letters added, since the question was how to make that regex ( ^[А-Яа-я]+$ ) "understand" Ukrainian. - Wiktor Stribiżew

I would recommend this expression:

 Regex regex = new Regex("^[\u0400-\u052F\u2DE0-\u2DFF\uA640-\uA69F']+$"); 

It matches all Cyrillic letters in Unicode, for example:

  • Russian (including ы, ё, э)
  • Ukrainian (УкраїнськаІІЇїЄє')
  • Belarusian
  • Serbian
  • other Slavic letters

and so on.
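For illustration, a minimal sketch exercising the explicit ranges. The names `AllCyrillicDemo` and `IsAnyCyrillic` are invented here, and the sample strings are my own. In Unicode, U+0400 - U+052F covers the Cyrillic and Cyrillic Supplement blocks, U+2DE0 - U+2DFF is Cyrillic Extended-A, and U+A640 - U+A69F is Cyrillic Extended-B.

```csharp
using System;
using System.Text.RegularExpressions;

static class AllCyrillicDemo
{
    // In a regular (non-verbatim) C# string the compiler turns each \uXXXX
    // escape into the character itself before the regex engine sees it.
    static readonly Regex AnyCyrillic =
        new Regex("^[\u0400-\u052F\u2DE0-\u2DFF\uA640-\uA69F']+$");

    public static bool IsAnyCyrillic(string s) => AnyCyrillic.IsMatch(s);

    static void Main()
    {
        Console.WriteLine(IsAnyCyrillic("Українська"));    // True
        Console.WriteLine(IsAnyCyrillic("ўђљ"));           // True: Belarusian and Serbian letters
        Console.WriteLine(IsAnyCyrillic("Кириллица123"));  // False: digits are not in the class
    }
}
```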

If you only need Ukrainian and Russian letters, this one is enough:

 new Regex("^[А-Яа-яЁёЇїІіЄєҐґ']+$");