I have about 100 sites that I need to index in advance and then search across. In other words, I need to write a complete search engine in Java, including a crawler bot and an indexer. Can anyone describe, or link to material that describes, how a search engine works: its structure and its search and indexing algorithms? Any information would be helpful.
1 answer
You may want the Apache Lucene full-text search library. It is widely used (and therefore well supported), as evidenced at least by the publication of the corresponding "... in Action" book and by its Powered By list. However, it focuses on indexing and searching the content you hand to it; crawling, for example, you will have to implement yourself (or with a third-party library).
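As for the principle behind indexing and searching, most full-text engines are built around an inverted index: a map from each term to the documents that contain it. Below is a minimal sketch of that idea using only the Java standard library. The class `TinyIndex` and its methods are hypothetical names made up for this illustration; this is not Lucene's API, and it leaves out ranking, stemming, and on-disk storage, which a real library handles for you.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy class for illustration only; this is NOT Lucene's API.
class TinyIndex {
    // term -> sorted set of ids of the documents that contain the term
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Lowercase the text and split it on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Indexing step: record the document id under every term it contains.
    void add(String docId, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Search step: return ids of documents that contain ALL query terms.
    List<String> search(String query) {
        Set<String> result = null;
        for (String term : tokenize(query)) {
            Set<String> docs = postings.getOrDefault(term, new TreeSet<>());
            if (result == null) {
                result = new TreeSet<>(docs);   // first term: start from its postings
            } else {
                result.retainAll(docs);         // later terms: intersect
            }
        }
        return result == null ? new ArrayList<>() : new ArrayList<>(result);
    }
}
```

In this sketch a crawler would call `add(url, pageText)` for every page it fetches, and the search form would call `search(userQuery)`. Document ids come back in alphabetical order because the postings are kept in a `TreeSet`; a real engine would order results by relevance instead.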
- Lucene is just what is needed. Thanks! - Badaboom