Python search engine [closed]

Question

It is necessary to check dozens of sites daily for the presence of specific texts within them (the text can be located on an arbitrary web page within the site). I would like to make such a search engine in Python. Help, please, where to start implementation?

Sites are new every day, the input parameter to start the script should be exactly the url.
Site structures and everything related to their content are undefined.

garrythehotdog garrythehotdog 385 ten · Answer 1 · 2018-08-18T16:43:50

It is worth studying: 1. Requests- for working with web-resources. 2. Beautiful Soup - for parsing sites (if you know the structure) 3. Re - regular expressions for text search. You can use other search methods, such as algorithms a la Boera-Moore-Horpsula

The easiest way is to get the text of the pages you need and immediately, without parsing, search for the text you want.

Python search engine [closed]

Closed due to the fact that the issue is too general for participants Suvitruf ♦ , Sergey Gornostaev , 0xdb , insolor , Pavel Durmanov Aug 18 '18 at 18:54 .

1 answer 1

More articles: