Here is what I want to achieve: I need to strip UTF-8 BOM characters from all my links.

The guys from the local SB helped me write a couple of regular expressions:

    // Removes BOM characters from URLs
    $buffer = preg_replace('/[^[:print:]]/', '', $buffer);

Discussion about UTF-8-BOM characters can be found here.

    // Finds all opening <a ...> tags with attributes and, for now, replaces them with the same values.
    // As far as I understand, this only works with double quotes
    $buffer = preg_replace('/<a(.*?)href="(.*?)"(.*?)>/','<a$1href="$2"$3>',$buffer);

Now I need the BOM-removing regular expression to be applied to the value of <a href>, i.e. to $2.

    $buffer = <<<HTML
    AAA
    <a href="http://www.newsru.com/sport/10feb2017/zozu.html%EF%BB%BF">AAA</a>
    <div>TEXT</div>
    AAA
    <a href="http://www.newsru.com/sport/10feb2017/zozu.html%EF%BB%BF">AAA</a>
    <div>TEXT</div>
    <hr/>
    HTML;

    echo $buffer;
    $buffer = preg_replace('/<a(.*?)href="(.*?)"(.*?)>/','<a$1href="'.preg_replace('/[^[:print:]]/', '', '$2').'"$3>',$buffer);
    $buffer = preg_replace("/<a(.*?)href='(.*?)'(.*?)>/",'<a$1href="'.preg_replace('/[^[:print:]]/', '', '$2').'"$3>',$buffer);
    echo $buffer;

Am I applying it correctly? If not, please show a working version and, if possible, correct my code. I apply the replacement twice, once for single quotes and once for double quotes.

Thanks.

  • The lines with \xA0 are not needed; [^[:print:]] already strips those. - vp_arth
  • The most interesting question is where these links come from, and how non-printable characters end up in them. - vp_arth
  • @vp_arth 1) %EF%BB%BF, 2) %C2%A0. [^[:print:]] cuts out one of them. I checked - user216109
  • Instead of handling each quote type separately, you can write [\'\"] - vp_arth
  • Go back to Demo, and look at the second line. - vp_arth

1 answer

You can use the preg_replace_callback function:

    // urldecode() turns %EF%BB%BF and %C2%A0 into raw bytes for the test data
    $url = urldecode("http://example.org/%EF%BB%BFWord/%C2%A0Word");
    $buffer = <<<HTML
    <body>
      <a href="$url?q=1">Url 1</a>
      <a href="$url?q=2">Url 2</a>
      <a href='$url?q=3'>Url 3</a>
    </body>
    HTML;

    $buffer = preg_replace_callback(
        '/<a(.*?)href=[\'"](.*?)[\'"](.*?)>/',
        function ($matches) {
            // Clean only the captured URL, then rebuild the opening tag
            return "<a{$matches[1]}href=\"" . preg_replace('/[^[:print:]]/', '', $matches[2])."\"{$matches[3]}>";
        },
        $buffer
    );
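A note on why the attempt in the question changes nothing: PHP evaluates the inner `preg_replace('/[^[:print:]]/', '', '$2')` before the outer call runs, so it only ever sees the literal two-character string `$2` and returns it unchanged. A callback avoids that because it receives the actual captured text. Also keep in mind that the sample links in the question contain the percent-encoded text `%EF%BB%BF`, which consists entirely of printable ASCII, so `[^[:print:]]` leaves it alone unless the URL is decoded first. Below is a minimal sketch that handles both forms. The function name `strip_bom_from_hrefs` is made up for illustration, and I am assuming that only the BOM and the no-break space need to go. It is a regex sketch, not a real HTML parser.

```php
<?php
// Hypothetical helper (not from the original answer): removes a UTF-8 BOM and
// a no-break space from every <a href>, in raw-byte or percent-encoded form.
function strip_bom_from_hrefs(string $html): string
{
    $result = preg_replace_callback(
        // Group 2 captures the opening quote; \2 requires the same closing quote.
        '/<a\b([^>]*?\s)href=([\'"])(.*?)\2([^>]*)>/is',
        function (array $m): string {
            // Percent-encoded forms are printable ASCII, so replace them
            // explicitly (case-insensitively: %ef%bb%bf is also valid).
            $url = str_ireplace(['%EF%BB%BF', '%C2%A0'], '', $m[3]);
            // Raw bytes of the same two characters.
            $url = str_replace(["\xEF\xBB\xBF", "\xC2\xA0"], '', $url);
            // Rebuild the tag, keeping the original quote style.
            return "<a{$m[1]}href={$m[2]}{$url}{$m[2]}{$m[4]}>";
        },
        $html
    );

    // preg_replace_callback() returns null on a PCRE error; fall back to the input.
    return $result ?? $html;
}

$html = '<a class="x" href=\'http://example.org/page.html%EF%BB%BF\'>AAA</a>';
echo strip_bom_from_hrefs($html), PHP_EOL;
// <a class="x" href='http://example.org/page.html'>AAA</a>
```

Unlike `[^[:print:]]` without the `u` modifier, this leaves other non-ASCII bytes in the URL untouched, which matters if your links can legitimately contain, say, Cyrillic characters.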
  • Nice one, thank you so much - user216109