There is a text:

"something here mail@mail.com, mail@mail.com, mail@mail.com something here"

The text can contain both "clean" addresses and addresses mixed with other words. For example, a person is asked to simply give their address and writes "mail@mail.com", or they write "my address is mail@mail.com". I need to extract only the phrases that consist of a "clean" address, without any surrounding words.

How can I select only the phrase that contains nothing but "mail@mail.com"?

    2 answers

    You can do it like this:

    \b\S+@\S+\.\S+\b 

    See the demo

    This expression finds every substring that starts with one or more non-whitespace characters, followed by @ , then one or more non-whitespace characters, a period, and again one or more non-whitespace characters. \b is a word boundary: it forces the match to start and end at a word character, so leading and trailing punctuation is left out.
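A minimal Ruby sketch of how this pattern behaves on the sample text from the question. The constant name `EMAIL_LIKE` and the use of `String#scan` are illustrative choices, not part of the original answer.

```ruby
# Pattern from the answer: non-space run, "@", non-space run, ".", non-space run,
# with \b at both ends so trailing punctuation stays outside the match.
EMAIL_LIKE = /\b\S+@\S+\.\S+\b/

text = "something here mail@mail.com, mail@mail.com, mail@mail.com something here"

# String#scan returns every non-overlapping match of the pattern.
matches = text.scan(EMAIL_LIKE)
p matches
```

Note that all three addresses are returned, including those that sit next to other words, so this pattern alone does not separate "clean" phrases from mixed ones.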

    If the goal is to find only the last occurrence of an email address, add a negative lookahead:

     \b\S+@\S+\.\S+\b(?!.*\S+@\S+\.\S+\b)
                     ^^^^^^^^^^^^^^^^^^^^

    Demo
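The effect of the negative lookahead is easier to see when the addresses differ. A small Ruby sketch, assuming made-up addresses on example domains (the sample string and the names `LAST_EMAIL_LIKE`, `sample`, and `last` are hypothetical):

```ruby
# Same pattern as above, plus a negative lookahead that rejects a match
# if another address-like substring appears later on the same line.
LAST_EMAIL_LIKE = /\b\S+@\S+\.\S+\b(?!.*\S+@\S+\.\S+\b)/

# Hypothetical sample with three distinct addresses.
sample = "write to first@example.com, second@example.org or third@example.net today"

# String#[] with a regexp returns the first match, which here can only be
# the last address, because every earlier one fails the lookahead.
last = sample[LAST_EMAIL_LIKE]
p last
```

Since `.` does not match a newline by default in Ruby, "last" here means last on its line; multi-line input would need the `m` flag.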

    • The first demo captures the commas, and it does not cover Cyrillic. - Jean-Claude
    • There are no commas in this case: he writes "mail@mail.com", or he can write "my address is mail@mail.com" . Secondly, Cyrillic can only be discussed once the regular expression library / programming language in use is known. - Wiktor Stribiżew
    • Phrases with "admixtures" should be filtered out together with their addresses; from the example above, only the second phrase should remain. By phrases I mean the comma-separated parts of the line; correct my terminology if I have put it wrong. - titov_andrei
    • I.e. something here mail@mail.com, mail@mail.com, mail@mail.com something here -> mail@mail.com something here ? Add your code and a language tag to the question. - Wiktor Stribiżew
    • The output should be only "mail@mail.com", without the other two occurrences. The question is not tied to any specific code or language. - titov_andrei

    Try this: (^|,)\s*([\w_\.-]+?@[\w_\.-]+?)\s*(,|$)

     # Capture group 2 holds the address without the surrounding commas and spaces.
     puts <<EOT.match(/(^|,)\s*([\w_\.-]+?@[\w_\.-]+?)\s*(,|$)/)[2]
     something here mail@mail_1.com, mail@mail_2.com, ,, mail@mail_3.com something here mail@mail_4.com
     EOT

    On ideone .
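The same idea, that a "clean" address must fill an entire comma-separated part of the line, can also be sketched by splitting first and validating each part afterwards. This is an alternative illustration, not the answer's method; the helper `clean_addresses` and the constant `CLEAN_ADDRESS` are hypothetical names, and the pattern is deliberately simplistic (it does not validate real email syntax).

```ruby
# Assumption: a "clean" part is only word characters, dots, and hyphens
# around a single "@", anchored to the whole string with \A and \z.
CLEAN_ADDRESS = /\A[\w.-]+@[\w.-]+\z/

# Hypothetical helper: keep only the comma-separated parts that are
# nothing but an address once surrounding whitespace is removed.
def clean_addresses(line)
  line.split(",").map(&:strip).select { |part| part.match?(CLEAN_ADDRESS) }
end

line = "something here mail@mail_1.com, mail@mail_2.com, ,, mail@mail_3.com something here"
result = clean_addresses(line)
p result
```

Splitting first avoids having to share one comma between two neighbouring matches, which a single pattern that consumes its delimiters cannot do when two clean addresses follow each other.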

    • It captures a space and the commas - regex101.com/r/BmH4t3/1 - titov_andrei
    • I corrected it a bit and put a test on ideone in Ruby - Majestio
    • So the source text was changed: 1-4 were added to the addresses, whereas initially they were all identical - titov_andrei
    • That was done on purpose, so that it is visible which one was found. If you change it back, you get mail@mail.com . But if they are all the same, how can you tell which one it was? - Majestio