I am working on URL handling for a site.

There is a string (a URL) of the form:

site.ru/novosti/page-2/ 

There is a first expression: #^/novosti/page-([0-9]+)/# , which catches the URL and processes it.

But there is another, higher-priority rule that prevents the first one from working.

The second expression: #^/novosti/#

In theory the task is simple (as far as I understand it): add an exception to the second expression so that the first one can do its job. Roughly speaking, if the text that follows the last slash matched by the second expression is "page-" plus any number up to a thousand, then the second expression should return false, i.e. simply not match.

  • Are these expressions for htaccess or for a script? - splash58
  • Bitrix handles this, so it turns out they are for a script. - Crabobass
  • I do not know how the Bitrix router works, but might it be enough to change their order? - splash58
  • But no, you can try this direction - splash58
  • Tried that; it sorts them. - Crabobass

2 answers

For this purpose, regular expressions have a feature called an assertion (also known as a lookaround).

There are two classes of assertions:

  • lookbehind assertions impose restrictions on the text that precedes them.
  • lookahead assertions impose restrictions on the text that follows them.

Each assertion can be either positive or negative. Each type is written differently:

  • Positive lookbehind: (?<=foo)bar
  • Negative lookbehind: (?<!foo)bar
  • Positive lookahead: foo(?=bar)
  • Negative lookahead: foo(?!bar)

For example, a regular expression with a negative lookahead, foo(?!bar), matches the string foo only when it is not followed by the string bar (it matches in foofoo, but not in foobar).
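
The four forms above can be checked side by side with a small sketch. It uses Python's `re` module rather than PHP, on the assumption that the lookaround syntax behaves the same way for these simple patterns; the sample string `bazbar` is added here only to give the negative lookbehind something to match.

```python
import re

# Sample strings; "bazbar" is an extra example, not from the answer above.
samples = ["foobar", "foofoo", "bazbar"]

patterns = {
    "positive lookbehind": r"(?<=foo)bar",  # bar preceded by foo
    "negative lookbehind": r"(?<!foo)bar",  # bar not preceded by foo
    "positive lookahead": r"foo(?=bar)",    # foo followed by bar
    "negative lookahead": r"foo(?!bar)",    # foo not followed by bar
}

# For every pattern, record whether it matches anywhere in each sample.
results = {
    name: [bool(re.search(rx, s)) for s in samples]
    for name, rx in patterns.items()
}

for name, hits in results.items():
    print(name, dict(zip(samples, hits)))
```

Note that an assertion consumes no characters: in `foo(?=bar)` the matched text is only `foo`, even though `bar` must follow it.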

In your particular case, the regular expression might look like:

 #^/novosti/(?!page-[0-9]+).*$# 

And here is the link to the working example on regex101.

If you do not need to capture the entire line, this shorter expression is enough:

 #^/novosti/(?!page-[0-9]+)# 
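
As a rough check, both variants can be run against a few paths. This is a sketch in Python's `re` (a stand-in for PHP's `preg_match`, with the `#` delimiters dropped); every sample path except `/novosti/page-2/` is made up for illustration.

```python
import re

with_tail = re.compile(r"^/novosti/(?!page-[0-9]+).*$")  # captures the whole line
prefix_only = re.compile(r"^/novosti/(?!page-[0-9]+)")   # captures only the prefix

paths = [
    "/novosti/",
    "/novosti/some-article/",   # hypothetical article URL
    "/novosti/page-about/",     # "page-" without digits is not excluded
    "/novosti/page-2/",         # excluded by the negative lookahead
    "site.ru/novosti/page-2/",  # never matches: "^" requires a leading "/novosti/"
]

def matched_text(pattern, path):
    """Return the text the pattern matched in path, or None if there is no match."""
    m = pattern.search(path)
    return m.group(0) if m else None

full = {p: matched_text(with_tail, p) for p in paths}
short = {p: matched_text(prefix_only, p) for p in paths}

for p in paths:
    print(p, "->", full[p], "|", short[p])
```

The last path shows one thing to watch for when testing by hand: if the subject string includes the host name, the `^` anchor alone makes the match fail, regardless of the lookahead.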
  • @peter, is this expression for htaccess, or can it be used in a script? - Crabobass
  • This is for PHP, which is what you wanted. - Dmitriy Simushev
  • @peter if (preg_match("#^/novosti/(?!page-[0-9]+).*$#", "site.ru/novosti/page-2/")) { echo "true"; } else { echo "false"; } This construction returns false. Or am I doing something wrong? - Crabobass
  • @peter, but that is exactly what you asked for: " Roughly speaking, if in the second expression after the last slash there are characters <...>, then such an expression should return false " - Dmitriy Simushev
  • Well, I'm not @peter , but @DmitriySimushev =) - Dmitriy Simushev

If the system supports lookahead, then the second regular expression should be replaced with:

 #^/novosti/(?!page-[0-9]+)#
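
With the exclusion in place, the two rules no longer overlap, so the order in which the system sorts them should stop mattering. A minimal sketch of that idea, assuming a router that simply takes the first matching rule (how Bitrix actually resolves rules is not confirmed here; the handler names and the `resolve` function are hypothetical, and Python's `re` stands in for PHP):

```python
import re

# Hypothetical rule table: (pattern, handler name).
RULES = [
    (re.compile(r"^/novosti/(?!page-[0-9]+)"), "news_section"),
    (re.compile(r"^/novosti/page-([0-9]+)/"), "news_page"),
]

def resolve(path, rules):
    """Return the handler name of the first rule that matches path, else None."""
    for pattern, handler in rules:
        if pattern.search(path):
            return handler
    return None

# Try both orderings of the rule table; the outcome is the same.
for rules in (RULES, list(reversed(RULES))):
    print([resolve(p, rules) for p in ("/novosti/", "/novosti/page-2/")])
```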