I need to decide how to implement search on a site, hence this question. It may seem rather trivial, but still.
A couple of years ago I built a project on Orchard CMS (.NET MVC). There the framework used Lucene.Net for full-text search over site content. From the developer's point of view, hooking up indexing was quite simple and looked something like this:
    OnIndexing<EduPart>((ctx, ep) => ctx.DocumentIndex
        .Add("EduPart_Description", ep.Description).RemoveTags().Analyze()
        .Add("EduPart_Mission", ep.Mission).RemoveTags().Analyze()
    );

I do not even remember where the index was stored, either in the database or in files. The point is that this search had no external calls or dependencies.
The projects I build are fairly small; indexing Wikipedia is not the goal. The main framework I use is CakePHP, but that doesn't matter, since all popular MVC frameworks implement the same things anyway.
What search options are there, and how applicable is each:
plain use of the LIKE operator. The simplest and least capable option for searching content. Yes, you can search titles this way, for example, but this is far from full-text search. Not considered as a solution.
using the full-text search built into the database engine. The problem (albeit solvable) here is the use of different DBMSs. In development I use only MS SQL Server and MySQL; both engines support full-text indexes, although I honestly admit I have never used them. In general, a separate search implementation would have to be written for each DBMS.
using search engines such as Elasticsearch / Sphinx / Solr, etc. These are separate services / daemons that are accessed through an API. The downside: the choice of hosting that projects can be moved to may be limited. CakePHP itself has a plugin for Elasticsearch which, oddly enough, requires changes to the inheritance hierarchy of the application's models / entities, although it would seem that a behavior reacting to content add / edit / delete events would be enough.
The last option (the one I originally counted on) was the PHP port of Lucene. It used to be part of Zend Framework under the name Zend.Search. The trouble is that this project is not compatible with PHP 7 and is no longer supported. The last PHP version it was written for is, apparently, 5.3.
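To make the "one implementation per DBMS" point from the second option concrete, here is a rough sketch. It is written in Python only for brevity, and the function name `fulltext_where` and the dialect labels are made up for illustration. It assumes the full-text indexes already exist on the listed columns and that the driver accepts `?` placeholders (many drivers use a different placeholder style).

```python
def fulltext_where(dialect, columns):
    """Return (where_fragment, placeholder_count) for a text search.

    Hypothetical helper: each '?' is meant to be bound to the search text.
    """
    cols = ", ".join(columns)
    if dialect == "mysql":
        # MySQL: expects a FULLTEXT index on these columns.
        return f"MATCH ({cols}) AGAINST (? IN NATURAL LANGUAGE MODE)", 1
    if dialect == "sqlserver":
        # MS SQL Server: expects a full-text index on the table.
        return f"FREETEXT (({cols}), ?)", 1
    if dialect == "like":
        # Portable fallback: substring match only, one placeholder per
        # column, each bound to a pattern such as '%word%'.
        return " OR ".join(f"{c} LIKE ?" for c in columns), len(columns)
    raise ValueError(f"unsupported dialect: {dialect}")


if __name__ == "__main__":
    for name in ("mysql", "sqlserver", "like"):
        print(name, "->", fulltext_where(name, ["title", "body"]))
```

Even in this toy form the branches differ in syntax, in index requirements and in how the search text is passed, which is exactly the maintenance cost mentioned above.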
It follows from these considerations that the task is to find a full-text search solution for the site (for all types of documents / content) that does not use third-party search engines and does not depend on the SQL dialect in use.
Who can share experience on this? What other solutions to this problem are there, or libraries that fit these conditions?
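For reference, the kind of thing being asked for boils down to an in-process inverted index. Below is a deliberately tiny sketch of the idea in Python. It is not Lucene and not any existing library; the names `strip_tags`, `analyze` and `TinyIndex` are invented here and only loosely mirror the RemoveTags / Analyze steps from the Orchard snippet. There is no stemming, ranking or persistence; the postings could be serialized to a file or to an ordinary table in any DBMS.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(html):
    """Drop markup, keep the visible text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def analyze(text):
    """Very naive analyzer: lowercase and split into word tokens."""
    return re.findall(r"\w+", text.lower())


class TinyIndex:
    """Inverted index held in memory: term -> set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, html):
        for term in analyze(strip_tags(html)):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return ids of documents that contain every query term."""
        terms = analyze(query)
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


if __name__ == "__main__":
    index = TinyIndex()
    index.add("page:1", "<p>Our <b>mission</b> is education</p>")
    index.add("page:2", "<p>Course description and schedule</p>")
    print(index.search("mission education"))  # {'page:1'}
```

The point of the sketch is only that indexing can be triggered from content add / edit / delete events and queried without any external service or SQL-specific feature; a real library would add word-form handling, relevance ranking and an on-disk format.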