Hello, I have the following question. I search a site with the query site:http://ask.fm москва (i.e., all pages containing the word "Moscow"). Google reports that 230,000 results were found, but it only shows me the first 40-60 pages. Is it possible to somehow get the rest, or at least a couple of thousand pages? Or are there other search engines that return all the pages found?
2 answers
Google's estimate of the number of results is very rough. It bears some relation to reality only for trivial queries such as "foo bar", or better still just "foo".
For me, Google shows only 7 pages, which is a sure sign that it cannot find anything else: google.com/search?q=site:http://ask.fm+Moscow&num=100&start=600&filter=0
Usually, when Google reaches the last page, it changes the counter to a real value:
- https://www.google.com/search?q=ycuen&num=100&nfpr=1&start=0 - 41,200 results
- https://www.google.com/search?q=ycuen&num=100&nfpr=1&start=100 - 128 results
With a site: filter, for some reason, it does not do this.
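The paging in the links above can be sketched as follows. This is only an illustration that builds URLs and fetches nothing; the parameter names `num`, `start` and `filter` are taken from the links in this answer, the function name `google_page_urls` is made up for the example, and whether Google still honors these parameters is an assumption.

```python
from urllib.parse import urlencode


def google_page_urls(query, pages, per_page=100):
    """Yield search URLs for successive result pages of one query."""
    for page in range(pages):
        params = {
            "q": query,
            "num": per_page,           # results per page, as in the links above
            "start": page * per_page,  # offset of the first result on this page
            "filter": 0,               # as in the link above (assumed: do not hide similar results)
        }
        yield "https://www.google.com/search?" + urlencode(params)


for url in google_page_urls("site:ask.fm москва", pages=3):
    print(url)
```

Stepping `start` until a page comes back empty is one way to see how many results are really available for a query.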
If you want perfectly accurate results, write your own crawler. Or negotiate with the site so that it gives you the data you need (of course, this is only possible if you want to help the site rather than harm it, and the scale of your own project should be comparable to the site's).
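As a rough idea of what the core of such a crawler might look like, here is a minimal sketch that works on a page that has already been downloaded: it checks whether the page mentions a keyword and collects links that stay on the same host. The names `LinkCollector` and `scan_page` and the sample HTML are invented for the example; downloading, politeness delays and robots.txt handling are deliberately left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the <a href="..."> tags of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, href))


def scan_page(url, html, keyword):
    """Return (mentions_keyword, same_site_links) for one downloaded page."""
    collector = LinkCollector(url)
    collector.feed(html)
    host = urlsplit(url).hostname
    same_site = [link for link in collector.links
                 if urlsplit(link).hostname == host]
    return keyword.lower() in html.lower(), same_site


# Hypothetical page content, used only to exercise the functions.
sample = '<a href="/user1">u</a> <a href="http://other.example/x">o</a> Привет из Москвы'
print(scan_page("http://ask.fm/", sample, "москв"))
```

A full crawler would keep a queue of the same-site links, fetch each one once, and record the pages where the keyword check succeeds.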
Search engines place no restrictions on which pages are shown. Technically, the result count and the result pages themselves are produced separately, which is why the two disagree.
It is better to use site:example.com (without http). In that case the search engine will show everything it has in its index. Sometimes www.example.com and example.com are treated as different domains, so I recommend checking both. This happens when the domains have not been 'glued' together, i.e. the search engine has not recognized that both serve identical content.
Do not limit yourself to a single search engine: crawl intensity and the rules for adding pages to the index differ from one engine to another.
To summarize: the search engine shows all the pages it has in its 'memory'; if pages are missing, that is not an imposed limit.
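The advice about dropping the scheme and checking both host variants could be sketched like this. The helper name `site_queries` is made up for the example; it only builds query strings and sends nothing to a search engine.

```python
from urllib.parse import urlsplit


def site_queries(address, term):
    """Return site: queries for both the bare host and the www host."""
    # Accept input with or without a scheme ("http://ask.fm" or "ask.fm").
    host = urlsplit(address if "//" in address else "//" + address).hostname
    bare = host[4:] if host.startswith("www.") else host
    return [f"site:{bare} {term}", f"site:www.{bare} {term}"]


print(site_queries("http://ask.fm", "москва"))
```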
- This is what Google keeps telling me: “Sorry, but Google does not give out more than 1000 results, and you requested results from the number 1100.”, and also “We have hidden some results that are very similar to those already presented above (597).” And every search engine I try (Yandex, Rambler, Yakao, Nigma) shows only 40-90 pages. - inkorpus
- In this case, it is better to use variable queries. Google gives you the opportunity to see the options. - GrayHoax
- Can you tell me how to do this? - inkorpus
- A variable query is when one big query, such as site:ask.fm москва, is split into smaller, more specific ones, for example site:ask.fm химки. Each of those returns fewer results. Parse them, then run the next query. - GrayHoax
- So what should I do if I need to get a couple of thousand pages from Moscow? - inkorpus
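The "variable query" idea from the comments above might look roughly like this in code: one broad query is replaced by several narrower ones, and the result lists are merged without duplicates. The function names, the second term "мытищи" and the placeholder result lists are assumptions made for the illustration; how the results of each query are actually fetched and parsed is not shown.

```python
def variable_queries(site, terms):
    """Turn one broad query into several narrower ones, one per term."""
    return [f"site:{site} {term}" for term in terms]


def merge_results(result_lists):
    """Combine URL lists from several queries, dropping duplicates, keeping order."""
    seen = set()
    merged = []
    for urls in result_lists:
        for url in urls:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


queries = variable_queries("ask.fm", ["химки", "мытищи"])
print(queries)

# Placeholder results standing in for whatever each query returned.
results = merge_results([
    ["http://ask.fm/a", "http://ask.fm/b"],
    ["http://ask.fm/b", "http://ask.fm/c"],
])
print(results)
```

Since each narrow query stays well under the limit on displayed results, the merged list can grow past what a single broad query would show.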