There is a log from which I need to parse the characters' nicknames. Here is a snippet of its HTML markup. I need to write a regular expression that extracts the nicknames, which are marked in bold italics. I do not even know what to anchor on in this situation.

<font class='B9'> <font class='B1'> <IMG src=http://img.combats.ru/i/align25.gif width=12 height=15> <IMG src=http://img.combats.ru/i/klan/MadSquirrels.gif width=24 height=15> Manchester Utd </font> </font> [1065/3497], <font class='B9'> <font class='B1'> <IMG src=http://img.combats.ru/i/align25.gif width=12 height=15> <IMG src=http://img.combats.ru/i/klan/KnightsOfTheBalance.gif width=24 height=15> Frans </font> </font> [50/3963] <SPAN style='color: red; font-weight: bold; '>против</SPAN> <font class='B9'> <font class='B2'> <IMG src=http://img.combats.ru/i/align14.gif width=12 height=15> <IMG src=http://img.combats.ru/i/klan/TerriblePower.gif width=24 height=15> рвот </font> </font> [1519/4144]<HR> 
  • habrahabr.ru/post/110112 Such things are not done with regular expressions. - Stanislav Komar
  • Oh, HTML is being parsed with regular expressions once again. Remember once and for all: regular expressions are not an appropriate tool for parsing HTML. - VladD

2 answers

 /([^>]+)<\/font><\/font>/g 

http://regex101.com/r/vY9vR9
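
For readers who want to try the pattern locally, here is a minimal sketch of how it could be applied in PHP. The `$html` string is a shortened copy of the markup from the question, and the `\s*` between the closing tags is an assumption that is not part of the pattern above: it is added only so the pattern also matches the snippet as displayed in the question, where the two closing tags are separated by a space.

```php
<?php
// Shortened copy of the markup from the question.
$html = "<font class='B9'> <font class='B1'> <IMG src=http://img.combats.ru/i/align25.gif width=12 height=15> Manchester Utd </font> </font> [1065/3497], "
      . "<font class='B9'> <font class='B1'> <IMG src=http://img.combats.ru/i/align25.gif width=12 height=15> Frans </font> </font> [50/3963]";

// [^>]+ captures the text after the previous '>' up to the closing tags.
// \s* between the tags is an assumption, not part of the original pattern.
preg_match_all('/([^>]+)<\/font>\s*<\/font>/', $html, $matches);

// Group 1 holds the nicknames; trim() removes the surrounding spaces.
$nicks = array_map('trim', $matches[1]);
print_r($nicks);
```

The capture works because `[^>]` cannot cross a `>`, so the group can only start after the last tag that precedes the nickname.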

  • Thanks a lot, you always help out)) Now I just need to understand the regular expression itself - quaresma89
  • I already wrote this to you, but I will remind you once again: use any library that works with the DOM. It will greatly simplify your life, especially if you have trouble writing regular expressions. Look at how simple it is: Habrahabr.ru/post/176635 And another tip: if you received a satisfying answer to your question, mark it as “correct” (the tick to the left of the answer) - vanchester
  • Thanks, I'll try! It’s not that I’m looking for easy, ready-made solutions. On the one hand, I want to learn regular expressions properly and master them step by step. On the other hand, if it’s foolish to solve such problems with regular expressions, that's a different matter) - quaresma89
  • By the way, a small question about your regular expression: after it runs, I get 2 identical arrays, as if 2 groups were being captured - quaresma89
  • If you use PHP and preg_match_all(), you should get an array whose first key holds the strings that match the whole regular expression, and whose second key holds the “captured” parts, i.e. what is in parentheses. Array ([0] => Array ([0] => Manchester Utd</font></font> [1] => Frans</font></font> [2] => рвот</font></font>) [1] => Array ([0] => Manchester Utd [1] => Frans [2] => рвот)) - vanchester
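
The comments above recommend a DOM library without naming one. As a rough sketch of that idea, the following uses PHP's bundled `DOMDocument` and `DOMXPath` classes; choosing them is an assumption, as is the XPath query, which relies on every nickname sitting in a `<font>` nested inside `<font class='B9'>` as in the question's snippet. The sample uses ASCII nicknames only, because `loadHTML()` needs an explicit charset hint for other encodings.

```php
<?php
// Shortened copy of the markup from the question.
$html = "<font class='B9'> <font class='B1'> <IMG src=http://img.combats.ru/i/align25.gif width=12 height=15> Manchester Utd </font> </font> [1065/3497], "
      . "<font class='B9'> <font class='B1'> <IMG src=http://img.combats.ru/i/align25.gif width=12 height=15> Frans </font> </font> [50/3963]";

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Select every <font> that is a direct child of <font class="B9">.
$xpath = new DOMXPath($doc);
$nicks = [];
foreach ($xpath->query('//font[@class="B9"]/font') as $node) {
    // The <IMG> elements carry no text, so textContent is just the nickname.
    $nicks[] = trim($node->textContent);
}
print_r($nicks);
```

Unlike the regular expression, this does not depend on the exact whitespace between the closing tags.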
 Array ( [0] => Array ( [0] => Кре [1] => Жирный Зомби (5) [2] => Бес Гнева (8) [3] => Проклятый Оруженосец (4) ) [1] => Array ( [0] => Кре [1] => Жирный Зомби (5) [2] => Бес Гнева (8) [3] => Проклятый Оруженосец (4) ) ) 

For some reason, this is the array I get.

  • show the code - vanchester
  • If you are using PHP, the code should be something like this: preg_match_all('/([^>]+)<\/font><\/font>/', $text, $matches); print_r($matches); - vanchester
  • <?php if (!empty($_POST['url'])) { $url = $_POST['url']; $log = zlib_decode(file_get_contents($url)); $regExpPl = "/([^>]+)<\/font><\/font>/"; preg_match_all($regExpPl, $log, $matches); foreach ($matches[1] as $key => $player) { echo $player . "</br>"; } } ?> - quaresma89
  • Probably the nicknames in $log are repeated. I see no other reason - vanchester
  • > int preg_match_all(string $pattern, string $subject [, array &$matches [, int $flags = PREG_PATTERN_ORDER [, int $offset = 0]]]) ... > flags > PREG_PATTERN_ORDER > Orders the results so that $matches[0] is an array of full pattern matches, $matches[1] is an array of strings matched by the first parenthesized subpattern, and so on. Apparently the browser simply hides the closing font tags, and those tags are the only thing that makes the first set different. - etki
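
The explanation in the last comment can be checked with a small sketch. The one-nickname `$html` string below is a made-up sample, not taken from the real log; it only shows that `$matches[0]` and `$matches[1]` differ by the closing tags, and that escaping the dump with `htmlspecialchars()` makes those tags visible in a browser.

```php
<?php
// Hypothetical one-nickname sample, not taken from the real log.
$html = "<font class='B1'>Frans</font></font> [50/3963]";
preg_match_all('/([^>]+)<\/font><\/font>/', $html, $matches);

// $matches[0] holds the full matches, $matches[1] the first capture group.
$full  = $matches[0][0];
$group = $matches[1][0];

// A browser treats the closing tags in $full as markup and does not show
// them, so both arrays look identical on the page. Escaping the dump
// prints the tags as text.
echo htmlspecialchars(print_r($matches, true)), "\n";
```

Viewing the page source instead of the rendered page would show the same difference without escaping.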