Friends, I'm writing a link (URL) parser. Sample links:

 "https://domain.ru:8080/folder/subfolder/../././?var1=val1&var2=val2",
 "https://http.google.com/folder/././?var1=val1&var2=val2",
 "ftp://mail.ru/?hello=world&url=https://http.google.com/folder//././?var1=val1&var2=val2",

What needs to be fixed in the regular expression so that the result is correct when any part of the link is missing (when only the protocol is missing, everything already works), and so that the regex does not reach into the query parameters?

Here is my regular expression:

 $_re = "~^
     (?:(?<protocol>http|https|ftp):\\/\\/)?
     (?<domain_name>
         (?<domain_2_lvl>[a-z0-9\.]+) \. (?<zone>[a-z]{2,4}) :?(?<port>[0-9]{4})?
     )?
     |(?<raw_folder> ((?<=\/)([a-z.]+)(?=\/)))
     |(?<script>script.php)?
     |\?
     |(?<key>[a-z0-9?\/_]+)=(?<value>[a-z0-9?\/_]+)?
 ~ix";
 preg_match_all($_re, $__url, $matches);
  • Why a regular expression, and what was wrong with parse_url? - Deonis
  • And what do you want to achieve? PHP has a function for parsing URLs, and it is called exactly that ;) parse_url - Alexey Shchepkin
  • I want to work with regular expressions. This is just a learning task, to understand how everything works. - CuriousBoy
  • mathiasbynens.be/demo/url-regex has examples of all kinds of URLs and regular expressions for them - Alexey Shchepkin
  • Thanks! I think it will help - CuriousBoy
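
For comparison with any hand-written pattern, here is a minimal sketch of the baseline the commenters refer to: the built-in parse_url(), plus parse_str() for the query string. The variable names ($parts, $params, $folders) are only illustrative, and the splitting of the path into folders is an assumption about what the question calls raw_folder.

```php
<?php
// Baseline for comparison: PHP's built-in URL parsing, no regular expressions.
$url = 'https://domain.ru:8080/folder/subfolder/../././?var1=val1&var2=val2';

// parse_url() returns only the components that are present in the link.
$parts = parse_url($url);

// parse_str() splits the query string into key => value pairs.
$params = [];
parse_str($parts['query'] ?? '', $params);

// Path segments, keeping "." and ".." but dropping the empty strings
// produced by the leading and trailing slashes.
$folders = array_values(array_filter(
    explode('/', $parts['path'] ?? ''),
    static fn(string $segment): bool => $segment !== ''
));

print_r($parts);
print_r($params);
print_r($folders);
```

The output of a regex-based parser can be checked against these three arrays for each sample link.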

2 answers

You can add an alternative with | that uses lookbehind and lookahead assertions, and then filter the empty elements out of the resulting arrays:

 $url = 'https://domain.ru:8080/folder/subfolder/../././?var1=val1&var2=val2';
 $patt = '~^
     (?:(?<protocol>(?:ht|f)tps?)://)?
     (?<domain_name>
         (?<domain_2_lvl>[\pL\d.-]+)? \. (?<zone>\pL{2,4}) (?::(?<port>\d{4}))?
     )?
     | (?<raw_folder>(?<=/)[\pL.]+(?=/))
 ~ix';
 preg_match_all($patt, $url, $matches);
 $matches = array_map('array_filter', $matches);

Result:

 var_dump(
     $matches['protocol'],     // https
     $matches['domain_name'],  // domain.ru:8080
     $matches['domain_2_lvl'], // domain
     $matches['zone'],         // ru
     $matches['port'],         // 8080
     $matches['raw_folder']
     /* array (size=5)
          1 => string 'folder' (length=6)
          2 => string 'subfolder' (length=9)
          3 => string '..' (length=2)
          4 => string '.' (length=1)
          5 => string '.' (length=1)
     */
 );
  • It does not work with this link: "mail.ru/?hello=world&url= http.google.com/folder//././?var1=val1&var2=val2?mail=ru " - CuriousBoy
  • @CuriousBoy there are no such links in your question. The example is written for /folder/subfolder/../././, which is what you asked about. - Edward
  • I agree. Then tell me how to handle the case where the link has no path or, to be precise, where any part of the link may be missing - CuriousBoy
  • @CuriousBoy your regex combined with mine copes with this task quite well. Patterns exist for most URLs; you can try to foresee everything, but why, when there are built-in tools for this? - Edward
  • I'm interested in understanding this. I can't get it to work, but I want to do it all the same. Tell me how to solve the problem. - CuriousBoy
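
One possible way to handle links where any part may be missing, as a sketch rather than a definitive solution: make every component an optional group in a single anchored pattern, cut the query string off at the first "?", and only then look for folders in the path and for key=value pairs in the query. That way the folder and domain patterns never see the parameters. The function name parseLink, the path and query groups, and the \d{1,5} port width are assumptions made for this illustration; the other group names follow the patterns above.

```php
<?php
// Hypothetical helper: split a link into parts; any part may be absent.
// Returns null when the string does not fit the pattern at all.
function parseLink(string $url): ?array
{
    $re = '~^
        (?:(?<protocol>https?|ftp)://)?
        (?<domain_name>
            (?<domain_2_lvl>[a-z0-9-]+(?:\.[a-z0-9-]+)*)
            \.
            (?<zone>[a-z]{2,})
            (?::(?<port>\d{1,5}))?
        )?
        (?<path>/[^?]*)?
        (?:\?(?<query>.*))?
    $~ix';

    // PREG_UNMATCHED_AS_NULL reports absent groups as null instead of "".
    if (!preg_match($re, $url, $m, PREG_UNMATCHED_AS_NULL)) {
        return null;
    }

    // Folders are the segments enclosed by slashes, searched in the path only.
    preg_match_all('~(?<=/)[^/]+(?=/)~', $m['path'] ?? '', $f);

    // key=value pairs are searched in the query string only.
    preg_match_all('~(?:^|&)(?<key>[^=&]+)=(?<value>[^&]*)~', $m['query'] ?? '', $q);

    return [
        'protocol'     => $m['protocol'] ?? null,
        'domain_name'  => $m['domain_name'] ?? null,
        'domain_2_lvl' => $m['domain_2_lvl'] ?? null,
        'zone'         => $m['zone'] ?? null,
        'port'         => $m['port'] ?? null,
        'folders'      => $f[0],
        'params'       => array_combine($q['key'], $q['value']),
    ];
}

print_r(parseLink('mail.ru/?hello=world'));
print_r(parseLink('https://http.google.com/folder/././?var1=val1&var2=val2'));
```

Two limits of this sketch: a bare name such as script.php is indistinguishable from a domain and is parsed as one, and an unencoded "&" inside a nested URL value still starts a new top-level parameter, which is the same thing parse_str() does.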

Combining the two patterns, this is what I ended up with:

 $_re = "~^
     (?:(?<protocol>http|https|ftp):\\/\\/)?
     (?<domain_name>
         (?<domain_2_lvl>[a-z0-9\.]+) \. (?<zone>[a-z]{2,4}) :?(?<port>[0-9]{4})?
     )?
     |(?<raw_folder> ((?<=\/)([a-z.]+)(?=\/)))
     |(?<script>script.php)?
     |(?<key>[a-z0-9\/_]+)=(?<value>[a-z0-9?\/_:.]+)
 ~ix";

Correct me if I'm wrong.