I get the string

'<div class="authors">Редактор: <a href="/authors/111070/">Цветкова Т. В.</a></div>' 

or '<div class="authors">Редактор: <a href="/authors/111070/">Федоров-Немов Владимир Иванович</a></div>'

or '<div class="authors">Редактор: <a href="/authors/111070/">Федоров-Немов В.</a></div>'

The names can differ. How can I write a regular expression that extracts the full name, or the last name with initials, from the string?

The last name always begins with a capital letter.

    2 answers

    Parsing HTML with regular expressions is a very bad idea. There are tools designed specifically for parsing HTML.

    But if you really want to use a regular expression, the most primitive option is:
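As an illustration of the "dedicated tool" route, here is a minimal sketch that uses `html.parser` from the Python standard library. The class name `AuthorLinkParser` and the helper `extract_names` are made up for this example, and the code assumes the `authors` div contains no nested `div` elements.

```python
from html.parser import HTMLParser


class AuthorLinkParser(HTMLParser):
    """Collects the text of every <a> inside <div class="authors">."""

    def __init__(self):
        super().__init__()
        self.in_authors_div = False
        self.in_link = False
        self.names = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples
        if tag == "div" and ("class", "authors") in attrs:
            self.in_authors_div = True
        elif tag == "a" and self.in_authors_div:
            self.in_link = True
            self.names.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_link = False
        elif tag == "div":
            # Assumption: no nested <div> inside the authors block
            self.in_authors_div = False

    def handle_data(self, data):
        if self.in_link:
            self.names[-1] += data


def extract_names(html):
    """Hypothetical helper: returns the link texts as a list of names."""
    parser = AuthorLinkParser()
    parser.feed(html)
    parser.close()
    return [name.strip() for name in parser.names]


sample = '<div class="authors">Редактор: <a href="/authors/111070/">Цветкова Т. В.</a></div>'
print(extract_names(sample))  # ['Цветкова Т. В.']
```

Because the parser collects one entry per link, the same helper also returns both names when the div contains two author links.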

     \/">(.*)<\/a 

    Check here.
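A quick way to try this pattern against the three strings from the question is a short Python script. This is only a sketch: the function name `link_text` is invented for the example, and Python's `re` module is assumed as the regex engine.

```python
import re

# The pattern from the answer, unchanged: everything between '/">' and '</a'
AUTHOR_LINK = re.compile(r'\/">(.*)<\/a')


def link_text(html):
    """Hypothetical helper: returns the captured link text, or None if there is no match."""
    match = AUTHOR_LINK.search(html)
    return match.group(1) if match else None


samples = [
    '<div class="authors">Редактор: <a href="/authors/111070/">Цветкова Т. В.</a></div>',
    '<div class="authors">Редактор: <a href="/authors/111070/">Федоров-Немов Владимир Иванович</a></div>',
    '<div class="authors">Редактор: <a href="/authors/111070/">Федоров-Немов В.</a></div>',
]

for s in samples:
    print(link_text(s))
# Цветкова Т. В.
# Федоров-Немов Владимир Иванович
# Федоров-Немов В.
```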

    • Thank you very much, but could you clarify for this case: '<div class="authors">Author: <a href="/authors/176417/">Lavrova Lyubov Nikolaevna</a>, <a href="/authors/176418/">Chebotareva Irina Vasilyevna</a></div>' - shatoidil
    • @shatoidil, Do you need only the first name or both? - post_zeew
    • I did it like this: \/">(.*?)<\/a and it seemed to help, but I need to get both last names. Did I do it right? - shatoidil
    • @shatoidil, Yes, you can do it that way. - post_zeew
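The difference between the greedy `(.*)` and the lazy `(.*?)` discussed in these comments can be seen on the two-author string. The following is a small sketch using Python's `re.findall`; the variable names are chosen for this example only.

```python
import re

html = ('<div class="authors">Author: '
        '<a href="/authors/176417/">Lavrova Lyubov Nikolaevna</a>, '
        '<a href="/authors/176418/">Chebotareva Irina Vasilyevna</a></div>')

# Greedy: .* runs to the LAST '</a', so both links end up in one capture
greedy = re.findall(r'\/">(.*)<\/a', html)

# Lazy: .*? stops at the FIRST '</a' after each '/">', one capture per link
lazy = re.findall(r'\/">(.*?)<\/a', html)

print(len(greedy))  # 1
print(lazy)         # ['Lavrova Lyubov Nikolaevna', 'Chebotareva Irina Vasilyevna']
```

With the lazy quantifier, `findall` returns one entry per author link, which is what is needed when the div lists several people.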

    For your example, you can use:

     /\/">(\D+)\ (\D+)\ (\D+)<\/a/ 

    Or like this:

     /\/">([А-Яа-я-.]+)\ ([А-Яа-я-.]+)\ ([А-Яа-я-.]+)<\/a/
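([А-Яа-я-.]+)">
To see what the second pattern captures, here is a sketch in Python. Two things are assumptions of this example rather than part of the answer: the helper name `split_name`, and the hyphen being moved to the end of the character class (`[А-Яа-я.-]`), which matches the same characters but is unambiguous in Python's `re`. Note that the range `А-Яа-я` does not include `Ё`/`ё`.

```python
import re

# Three space-separated parts made of Cyrillic letters, dots and hyphens
NAME_PARTS = re.compile(r'\/">([А-Яа-я.-]+) ([А-Яа-я.-]+) ([А-Яа-я.-]+)<\/a')


def split_name(html):
    """Hypothetical helper: returns a 3-tuple of name parts, or None if there is no match."""
    match = NAME_PARTS.search(html)
    return match.groups() if match else None


full = '<div class="authors">Редактор: <a href="/authors/111070/">Федоров-Немов Владимир Иванович</a></div>'
initials = '<div class="authors">Редактор: <a href="/authors/111070/">Цветкова Т. В.</a></div>'
short = '<div class="authors">Редактор: <a href="/authors/111070/">Федоров-Немов В.</a></div>'

print(split_name(full))      # ('Федоров-Немов', 'Владимир', 'Иванович')
print(split_name(initials))  # ('Цветкова', 'Т.', 'В.')
print(split_name(short))     # None
```

The last call returns `None` because the pattern expects exactly three parts; a name with only two parts, such as the third string from the question, would need the third group to be optional or a simpler pattern that captures the whole link text.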