I need to insert into a regular expression pattern:

 $@"^(?:[^\p{{L}}]|[{exclusion}])+$" // Goal: disallow any letters in the string except those given in the exclusion variable

a string variable:

 string exclusion;

in which all special characters are escaped, so that they do not break the regular expression.

I found the Regex.Escape() method, but it does not meet my needs. For example, if exclusion = @"[text]" is passed to Regex.Escape(), it returns the string "\\[text]". After substituting this string for exclusion in the pattern:

 $@"^(?:[^\p{{L}}]|[{exclusion}])+$" // Goal: disallow any letters in the string except those given in the exclusion variable

it takes the following form:

 $@"^(?:[^\p{{L}}]|[\[text]])+$" 

As a result, the regular expression does not work correctly. I suspect the cause is the extra unescaped ] character.

How can I escape all special characters in the string? Is there another method besides Regex.Escape()? Or did I use it incorrectly and miss my own mistake?

  • The problem is more likely the @ - Grundy
  • @Grundy I tried removing the @ symbol at the beginning of the regular expression: $"^(?:[^\\p{{L}}]|[{exclusion}])+$". That did not help. Thanks. - Evgeniy Miroshnichenko
  • Are you sure it did not help? Did you try writing the pattern directly, without a variable? Did the behavior of the regular expression change? Maybe the regular expression itself is simply incorrect - Grundy
  • @Grundy I am sure. As soon as I remove the square brackets from the exclusion variable, the regular expression works fine. I also tried substituting test characters directly, without a variable. For example: ^(?:[^\P{Lu}]|[ABC])+$ (works), ^(?:[^\P{Lu}]|[\[ABC]])+$ (after escaping with Regex.Escape, does not work), ^(?:[^\P{Lu}]|[\[ABC\]])+$ (with both brackets escaped, works). Everything points to the square brackets being the problem, namely that Regex.Escape() escapes only one of them. - Evgeniy Miroshnichenko
  • @Grundy, Evgeniy Miroshnichenko brackets inside brackets must be escaped, but Regex.Escape is written on the assumption that its result will be used to search for the escaped string directly, not as a set of characters inside a character class. Therefore it escapes . $ ^ { [ ( | ) * + ? \ but not ] and }: these closing brackets are not on the list of special characters (those that do not match themselves) - docs.microsoft.com/en-us/dotnet/standard/base-types/… - PashaPash

1 answer

Regex.Escape escapes those characters that are considered special outside character classes:

Escapes a minimal set of characters (\, *, +, ?, |, {, [, (, ), ^, $, ., #, and white space) by replacing them with their escape codes.
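This behavior is easy to check directly. The sketch below is only an illustration; the class name EscapeDemo is made up for the example, and the inputs are arbitrary test strings:

```csharp
using System;
using System.Text.RegularExpressions;

static class EscapeDemo
{
    static void Main()
    {
        // The opening bracket is escaped, the closing one is left as is.
        Console.WriteLine(Regex.Escape("[text]"));  // prints \[text]

        // Neither '-' nor ']' is on the list, so nothing is escaped here.
        Console.WriteLine(Regex.Escape("a-z]"));    // prints a-z]
    }
}
```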

Inside character classes, however, only the following characters are special:

  • ^ - negates the character class if it comes immediately after the opening [
  • ] - closes the character class
  • \ - escapes special characters
  • - - defines a character range or a character class subtraction

To escape these characters, it is enough to use

 exclusion.Replace("\\", @"\\").Replace("^", @"\^").Replace("-", @"\-").Replace("]", @"\]") 

or

 Regex.Replace(exclusion, @"[]^\\-]", "\\$&") 

Solution:

 var pattern = $@"^(?:[^\p{{L}}]|[{Regex.Replace(exclusion, @"[]^\\-]", "\\$&")}])+$"; 

Or (since [^\p{L}] = \P{L} ):

 var pattern2 = $@"^[\P{{L}}{Regex.Replace(exclusion, @"[]^\\-]", "\\$&")}]+$";
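For reference, here is the second variant wrapped into a small runnable sketch. The names LetterFilter, EscapeForCharClass and Build are hypothetical and exist only for this example; the test strings are arbitrary:

```csharp
using System;
using System.Text.RegularExpressions;

static class LetterFilter
{
    // Hypothetical helper: escapes only the characters that are
    // special inside a character class: ] ^ \ -
    public static string EscapeForCharClass(string s) =>
        Regex.Replace(s, @"[]^\\-]", "\\$&");

    // Builds the pattern: non-letters plus the characters from exclusion.
    public static Regex Build(string exclusion) =>
        new Regex($@"^[\P{{L}}{EscapeForCharClass(exclusion)}]+$");

    static void Main()
    {
        var regex = Build("[text]");

        Console.WriteLine(EscapeForCharClass("[text]"));  // prints [text\]
        Console.WriteLine(regex.IsMatch("[te]xt 123"));   // prints True
        Console.WriteLine(regex.IsMatch("abc"));          // prints False
    }
}
```

Note that [ itself does not need escaping here: inside a character class it is treated as a literal unless it follows an unescaped -, and the helper escapes every -.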