HTTPS Link Regular Expression

Question

I parse the link with the following code:

 m = re.search('http\://([^/]*)/?.*', url)
 host = m.group(1)

But if I pass in a link that uses https, an error occurs:

 AttributeError: 'NoneType' object has no attribute 'group'

Is it possible to rewrite the code so that re.search matches both http and https links?

You can do it like this: re.search('http[s]?://([^/]*)/?.*', url), but it is better to use a dedicated module for this.
I also advise checking whether there was a match: m = re.search(...), then if m: host = m.group(1). Otherwise such errors cannot be avoided.
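
The check described above could look roughly like this. It is only a sketch: the helper name extract_host and the constant URL_HOST_RE are made up for illustration and are not part of the original code.

```python
import re

# 's?' makes the trailing "s" optional, so both http and https match.
# Hypothetical names: URL_HOST_RE and extract_host are illustrative only.
URL_HOST_RE = re.compile(r'https?://([^/]*)')


def extract_host(url):
    """Return the host part of an http(s) URL, or None if nothing matched."""
    m = URL_HOST_RE.search(url)
    if m is None:
        # re.search returns None when there is no match; calling
        # m.group(1) here would raise the AttributeError from the question.
        return None
    return m.group(1)
```

With this guard, a URL with an unsupported scheme yields None instead of raising an exception, and the caller decides how to handle it.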

Dmitry Petukhov 567 6 silver marks 19 bronze marks · Accepted Answer · 2016-07-31T18:19:41

Of course it is possible:

 m = re.search('https?://([^/]*)/?.*', url)
 host = m.group(1)

ReinRaus 16k 3 gold marks 32 silver marks 77 bronze marks · Answer 2 · 2016-07-31T18:33:12

You can change the regular expression to this (the colon does not need to be escaped):

 'https?://([^/]*)/?.*'

However, since you are not yet familiar with the basics of regular expressions, I do not recommend using them for real-world tasks.
Python's urllib.parse module provides the urlparse function for this:

 >>> import urllib.parse as urlparse
 >>> urlparse.urlparse('https://google.com/q=').netloc
 'google.com'
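
As a slightly fuller sketch of the same approach, the function below wraps urlparse and rejects anything that is not http or https. The name host_of is hypothetical. It returns the hostname attribute rather than netloc: netloc keeps any port and user info, while hostname drops them and lowercases the host.

```python
from urllib.parse import urlparse


def host_of(url):
    """Return the lowercased host of an http(s) URL, or None otherwise.

    Hypothetical helper for illustration; not part of the original answer.
    """
    parts = urlparse(url)
    # Without a scheme, urlparse puts everything into .path and leaves
    # .netloc empty, so such input is rejected here as well.
    if parts.scheme not in ('http', 'https'):
        return None
    return parts.hostname
```

For the URL from the answer, host_of('https://google.com/q=') gives 'google.com', the same value that netloc returns there, because that URL contains no port or credentials.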
