Can anyone suggest a working regular expression (.NET) for hyperlinks? I currently use this one:
@"(?<protocol>http(s)?)://(?<server>([A-Za-z0-9-]+\.)*(?<basedomain>[A-Za-z0-9-]+\.[A-Za-z0-9]+))+((:)?(?<port>[0-9]+)?(/?)(?<path>(?<dir>[A-Za-z0-9\._\-/]+)(/){0,1}[A-Za-z0-9.-/_]*)){0,1}"
Unfortunately, it does not parse all links. For example, it fails on this one:
http://regexlib.com/%28A%28-umS_xFKaqoVVf9qVIJUf1Zy7GjFNovSUv_QSprOszdQi3qMJRTXbLp9XIzGZOdY9B8Xq3gtGPTkGYEe5C6Rg6XjA0fwU_JkMeaaE2ONmrRbxhFFfAt9Y-AEfyujh9NpzsN268y6Dh25xbgqyzTzjkY8AKB8_7uLPmDk2wgufsFSxx39e269HFBLoTs8wMhX0%29%29/DisplayPatterns.aspx
That link can be parsed by a different regular expression:
"^(ht|f)tp(s?)\:\/\/[0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*(:(0-9)*)*(\/?)([a-zA-Z0-9\-\.\?\,\'\/\\\+&%\$#_]*)?$"
But that one fails when it encounters text of the form:
''http://edition.cnn.com/\"javascript:CNN_handleOverlay('opt_out_cnn')\"/''.
http://www.tyres.spb.ru/index.php?mid=31
=> protocol => 'http', path => 'index.php', server => 'www.tyres.spb.ru', basedomain => 'spb.ru', dir => 'index.php' - chernomyrdin
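
To see where the first pattern stops, a minimal C# sketch along these lines may help. It runs the first regex from the question, unchanged, against the two URLs quoted above and prints each named group plus whatever part of the input was left unmatched. The class name `HyperlinkRegexDemo` and the output layout are assumptions made for illustration only. Since the character classes in the `path` and `dir` groups contain no `%`, `?`, or `=`, the match should end just before the first of those characters.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; only the pattern and the URLs come from the question.
class HyperlinkRegexDemo
{
    // The first pattern from the question, copied unchanged.
    const string Pattern =
        @"(?<protocol>http(s)?)://(?<server>([A-Za-z0-9-]+\.)*(?<basedomain>[A-Za-z0-9-]+\.[A-Za-z0-9]+))+((:)?(?<port>[0-9]+)?(/?)(?<path>(?<dir>[A-Za-z0-9\._\-/]+)(/){0,1}[A-Za-z0-9.-/_]*)){0,1}";

    static void Main()
    {
        string[] urls =
        {
            "http://www.tyres.spb.ru/index.php?mid=31",
            "http://regexlib.com/%28A%28-umS_xFKaqoVVf9qVIJUf1Zy7GjFNovSUv_QSprOszdQi3qMJRTXbLp9XIzGZOdY9B8Xq3gtGPTkGYEe5C6Rg6XjA0fwU_JkMeaaE2ONmrRbxhFFfAt9Y-AEfyujh9NpzsN268y6Dh25xbgqyzTzjkY8AKB8_7uLPmDk2wgufsFSxx39e269HFBLoTs8wMhX0%29%29/DisplayPatterns.aspx",
        };
        string[] groupNames = { "protocol", "server", "basedomain", "port", "path", "dir" };

        foreach (string url in urls)
        {
            // The pattern has no ^ or $ anchors, so Match returns the first partial match.
            Match m = Regex.Match(url, Pattern);

            Console.WriteLine("input: " + url);
            Console.WriteLine("match: " + m.Value);
            foreach (string name in groupNames)
            {
                Console.WriteLine("  " + name + " => '" + m.Groups[name].Value + "'");
            }

            // Whatever follows the matched text was not consumed by the pattern.
            Console.WriteLine("rest:  " + url.Substring(m.Index + m.Length));
            Console.WriteLine();
        }
    }
}
```

The `rest:` line shows which part of each link the pattern drops, which points at the characters missing from the path character classes (percent-encoded bytes and the query string).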