I need to split the following line into parts (and would like to get to grips with regular expressions while I'm at it):

Location: http://www.google.ru/?gfe_rd=cr&ei=wHKcWInnDurA7gThta_YBw 

into:

  • the protocol: http
  • the address: google.ru
  • the parameters: ?gfe_rd=cr&ei=wHKcWInnDurA7gThta_YBw ...

I did it with STL algorithms (two calls to std::find_if and three to std::copy), but the result looks clumsy.

How can this be done with a regular expression, and will it be faster?

1 answer

Regular expressions do not necessarily make a program faster; ordinary string methods often run faster than pattern matching.
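For comparison, here is a minimal sketch of the plain string-method route, assuming C++17 (std::string_view, std::optional). The names LocationParts and parse_location are made up for this illustration and are not from the question. It is not a full URL parser, and it is slightly more lenient than the pattern given further down (for example, it does not insist on whitespace after "Location:").

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical result type; the field names are illustrative only.
struct LocationParts {
    std::string protocol;  // e.g. "http"
    std::string host;      // e.g. "google.ru" (a leading "www." is dropped)
    std::string path;      // text between the first '/' and '?', may be empty
    std::string query;     // text after '?', may be empty
};

// Splits a "Location: <scheme>://<host>/<path>?<query>" line using only
// std::string_view searches, without regular expressions.
std::optional<LocationParts> parse_location(std::string_view line) {
    constexpr std::string_view prefix = "Location:";
    if (line.substr(0, prefix.size()) != prefix) return std::nullopt;
    line.remove_prefix(prefix.size());

    // Skip whitespace after the header name and trim it at the end.
    const auto first = line.find_first_not_of(" \t");
    if (first == std::string_view::npos) return std::nullopt;
    line.remove_prefix(first);
    const auto last = line.find_last_not_of(" \t\r\n");
    line = line.substr(0, last + 1);

    // Protocol: everything before "://".
    const auto sep = line.find("://");
    if (sep == std::string_view::npos || sep == 0) return std::nullopt;
    LocationParts parts;
    parts.protocol = std::string(line.substr(0, sep));
    line.remove_prefix(sep + 3);

    // Query: everything after the first '?', if there is one.
    if (const auto q = line.find('?'); q != std::string_view::npos) {
        parts.query = std::string(line.substr(q + 1));
        line = line.substr(0, q);
    }

    // Host and path: split at the first '/'.
    const auto slash = line.find('/');
    std::string_view host = line.substr(0, slash);
    if (slash != std::string_view::npos)
        parts.path = std::string(line.substr(slash + 1));

    // Drop an optional leading "www." from the host.
    constexpr std::string_view www = "www.";
    if (host.substr(0, www.size()) == www) host.remove_prefix(www.size());
    if (host.empty()) return std::nullopt;
    parts.host = std::string(host);
    return parts;
}
```

Calling parse_location on the line from the question should yield the protocol http, the host google.ru, an empty path and the query string without the leading question mark. Whether this actually beats a regular expression depends on the standard library and the input, so it is worth measuring both.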

In this particular case, the regular expression can be written as follows:

 ^Location:\s+([a-z]+)://(?:www\.)?([^/]+)/?([^?]*)(?:\?(.*))?$ 


  • ^ - start-of-line anchor
  • Location: - a literal sequence of characters
  • \s+ - 1+ whitespace characters
  • ([a-z]+) - (capturing group 1) 1+ lowercase Latin letters
  • :// - a literal sequence of characters
  • (?:www\.)? - an optional substring www.
  • ([^/]+) - (capturing group 2) 1+ characters other than /
  • /? - an optional slash
  • ([^?]*) - (capturing group 3) 0+ characters other than ?
  • (?:\?(.*))? - an optional sequence ...
    • \? - a question mark
    • (.*) - (capturing group 4) 0+ characters of any kind except line breaks
  • $ - end-of-line anchor

An example of using the regular expression in C++ (regex_match requires the whole string to match, so the anchors are not needed):

    #include <regex>
    #include <string>
    #include <iostream>
    using namespace std;

    int main() {
        string s("Location: http://www.gogggle.ru/ru/index.php?gfe_rd=cr&ei=wHKcWInnDurA7gThta_YBw");
        // No ^ and $ here: regex_match only succeeds on a full-string match.
        regex r(R"(Location:\s+([a-z]+)://(?:www\.)?([^/]+)/?([^?]*)(?:\?(.*))?)");
        smatch matches;
        if (regex_match(s, matches, r)) {
            cout << "Protocol: " << matches.str(1) << endl;      // group 1
            cout << "Domain: " << matches.str(2) << endl;        // group 2
            cout << "Path: " << matches.str(3) << endl;          // group 3
            cout << "Query string: " << matches.str(4) << endl;  // group 4
        }
        return 0;
    }
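If the same pattern is applied to many lines, it may help to wrap it in a function that builds the std::regex only once, since constructing a regex object is comparatively expensive. The following is a sketch under that assumption, using C++17; the names ParsedLocation and match_location are hypothetical. It also uses the sub_match member matched to tell a missing query string apart from an empty one, because group 4 sits inside an optional group.

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical result type; the names are illustrative only.
struct ParsedLocation {
    std::string protocol;
    std::string host;
    std::string path;
    std::optional<std::string> query;  // no value when the line has no '?'
};

std::optional<ParsedLocation> match_location(const std::string& line) {
    // Function-local static: the pattern is compiled on the first call only.
    static const std::regex re(
        R"(Location:\s+([a-z]+)://(?:www\.)?([^/]+)/?([^?]*)(?:\?(.*))?)");

    std::smatch m;
    if (!std::regex_match(line, m, re)) return std::nullopt;

    ParsedLocation out{m.str(1), m.str(2), m.str(3), std::nullopt};
    // Group 4 did not participate in the match if there was no '?'.
    if (m[4].matched) out.query = m.str(4);
    return out;
}
```

A caller can then test the returned optional and read the fields directly, for example checking result->query.has_value() before printing the query string.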
