Hello. I have a string like mysite.com/somepage, mysite.com/anotherpage, ... From it I want to extract into an array only the names of the pages on mysite.com, that is, somepage and anotherpage. I use the following regular expression:

 /(?:mysite\.com)(?:\/\S*)?/ 

How can I require that mysite.com comes before the page name without including mysite.com in the result? When I run

 preg_match_all("/(?:mysite\.com)(?:\/\S*)?/", $str, $res); 

the output array contains the domain as well. In theory, though, ?: should exclude the domain from the result. What am I doing wrong?

    1 answer

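    In addition to the capture groups in parentheses, preg_match_all() also returns an array of the full pattern matches. By default, $res[0] contains the array of full matches of the pattern. That is what you are seeing. Besides it, there are additional arrays ($res[1], $res[2], ...) when the expression contains parentheses without ?: , that is, capturing groups. The ?: marker only stops a group from getting its own array; it does not remove the group's text from the full match.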
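To make the structure of the result concrete, here is a minimal sketch that runs the pattern from the question. The sample string is an assumption: it uses space-separated URLs so that \S* does not pick up trailing commas.

```php
<?php
// Hypothetical sample input (not from the question): URLs separated by spaces.
$str = "mysite.com/somepage mysite.com/anotherpage";

// Pattern from the question: both groups are non-capturing,
// so the only array preg_match_all() fills is $res[0].
preg_match_all('/(?:mysite\.com)(?:\/\S*)?/', $str, $res);

// $res has a single element, the list of full matches, domain included.
print_r($res);
```

Since there are no capturing groups, $res contains only index 0, and each entry in it is the whole matched text, such as mysite.com/somepage.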

    There are several solutions:

    1. Use the expression mysite\.com(\/\S*)? and read $res[1] instead of $res[0]. $res[1] contains the matches of the first subpattern (\/\S*) .
    2. Use the \K escape sequence. When it appears in an expression, the full match ($res[0]) is only what comes after it. Whatever stands before it must still be present, but is not included in the match. The pattern becomes mysite\.com\K(?:\/\S*)?
    3. Use a lookbehind assertion (?<=) . Whatever is enclosed in it is only checked for presence before the part we need and is not captured. The pattern becomes (?<=mysite\.com)(?:\/\S*)?
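The three options can be compared side by side in a short sketch. The sample string and the variable names are assumptions made for illustration, and the last pattern is a possible variation rather than part of the answer: it moves the slash in front of \K so that the slash is dropped from the result too.

```php
<?php
// Hypothetical sample input: URLs separated by spaces.
$str = "mysite.com/somepage mysite.com/anotherpage";

// 1. Capturing group: ignore $res[0] and read the first group from $res[1].
preg_match_all('/mysite\.com(\/\S*)?/', $str, $res);
$viaGroup = $res[1];

// 2. \K: everything matched before it is dropped from the full match.
preg_match_all('/mysite\.com\K(?:\/\S*)?/', $str, $res);
$viaK = $res[0];

// 3. Lookbehind: the domain must precede the match but is never part of it.
preg_match_all('/(?<=mysite\.com)(?:\/\S*)?/', $str, $res);
$viaLookbehind = $res[0];

// Variation (an assumption, not from the answer): put the slash before \K
// and require at least one non-space character after it.
preg_match_all('/mysite\.com\/\K\S+/', $str, $res);
$withoutSlash = $res[0];

print_r([$viaGroup, $viaK, $viaLookbehind, $withoutSlash]);
```

With this input, the first three approaches return the page paths with their leading slash, while the variation returns the bare page names. If the URLs in the real string are separated by commas, \S* will also pick up the trailing comma, so a narrower character class such as [^\s,]* may be a better fit.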
    • Thank you very much, now everything works :) - LNK