Finding a row in a table by a single unique key (for example, "car") is no problem. But how should the data be stored so that it can be searched quickly by a set of keywords (for example, "red car", "car moscow", or "moscow car loan")?

In some cases we want only the rows that match the maximum number of keywords; in other cases we want the rows sorted by the number of matches (that is, first the rows that match all three keywords from "moscow car loan", then the rows that match only two of them, and so on).

    1 answer

    What you are describing is full-text search with ranking.
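To make the two requested behaviours concrete, here is a minimal in-memory sketch in Python. It is not how MySQL or Sphinx rank results; the function names and sample rows are hypothetical, and matching is plain whole-word comparison with no morphology.

```python
def match_count(text, keywords):
    """Number of distinct keywords that occur as whole words in text."""
    words = set(text.lower().split())
    return sum(1 for kw in set(keywords) if kw.lower() in words)


def rank_rows(rows, keywords):
    """Rows with at least one match, most matches first (ties keep input order)."""
    scored = [(match_count(text, keywords), row_id, text) for row_id, text in rows]
    scored = [item for item in scored if item[0] > 0]
    return sorted(scored, key=lambda item: -item[0])


def best_rows(rows, keywords):
    """Only the rows that reach the maximum number of matches."""
    ranked = rank_rows(rows, keywords)
    if not ranked:
        return []
    top = ranked[0][0]
    return [item for item in ranked if item[0] == top]


# Hypothetical sample data: (id, text)
rows = [
    (1, "red car"),
    (2, "car loan moscow"),
    (3, "moscow weather"),
    (4, "bicycle"),
]
query = ["moscow", "car", "loan"]
ranked = rank_rows(rows, query)  # all matching rows, sorted by match count
best = best_rows(rows, query)    # only the rows with the highest match count
```

A full-text engine does the same job with an index instead of a scan over every row, and with a relevance score that is usually more elaborate than a simple match count.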

    MySQL has MATCH, but it works only on columns covered by a full-text index, for example FULLTEXT (title,body). It used to be MyISAM-only; since MySQL 5.6 it is also available in InnoDB.

     SELECT * FROM articles WHERE MATCH (title,body) AGAINST ('москва автомобиль кредит');

    But there will be problems with word morphology: a query for москва will no longer find в москве ("in Moscow"). The usual suggestion is to run the whole text through a stemmer and search the stemmed text instead. A so-so solution.
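As a rough illustration of the stemming workaround, here is a toy Python sketch. The suffix list is a made-up, deliberately tiny assumption that only handles a few Russian noun endings; a real stemmer (Snowball, for example) has far more rules. The idea is simply that both the stored text and the query are reduced to stems before comparison.

```python
# Toy suffix list, longest first. Illustration only, not a real Russian stemmer.
SUFFIXES = ("ами", "ях", "ов", "ей", "а", "е", "и", "у", "ы", "ь", "я")


def toy_stem(word):
    """Strip one known ending, keeping at least three characters of stem."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stem_text(text):
    """Stem every whitespace-separated word; this is what would be stored and indexed."""
    return " ".join(toy_stem(word) for word in text.split())


stored = stem_text("кредит в москве")  # stemmed copy of the original text
query = toy_stem("москва")             # the query goes through the same stemmer
found = query in stored.split()
```

The stemmed copy would live in its own column with its own full-text index, which is part of why this approach is awkward: every write has to keep two versions of the text in sync.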

    You will probably not be satisfied with the quality of the search. The fact is that general-purpose databases are not designed for full-text search and ranking; they evolve in other directions.

    Correct solution

    Therefore, you should use a specialized solution: Sphinx, its fork Manticore Search, or Elasticsearch.

    Sphinx has an SQL-like query language (SphinxQL), and clients connect to it over the MySQL protocol.

     SELECT * FROM index WHERE MATCH('москва автомобиль кредит');

    Like it? Use it.

    More Sphinx features:

    • RT (real-time) indexes let you update the Sphinx index on the fly. Updated a news item in MySQL? Update the RT index with a second command.
    • snippets: highlighting of the keywords in the found text
    • faceted search
    • Thank you for your reply! The tools you suggest are indispensable for searching through articles, news, and so on. But analyzing the entire text of a table cell is not always reasonable. For example, I do not know how search is implemented in Evernote, but in general you can search for notes there only by tags, not by content. In that case it makes sense to store the tags in a database cell as a serialized array, which is much easier to work with than free text. Are there any tools for this case? - Bokov Gleb
    • "but it only works on MyISAM". Your information is very outdated: full-text search was added to InnoDB back in 5.6. As for search quality, though, I do not know whether there have been any improvements. - Small
    • @Small, thanks, corrected. @BokovGleb, I don't know what is inside Evernote, but it looks like the Lucene Search Index (Server side Lucene search index for user content), on which Elasticsearch is also built. Do you need tags? Store them as :москва:автомобиль:кредит: and search in MySQL with LIKE '%:москва:%' OR ... Want morphology and ranking? The answer is above. - Total Pusher
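
The delimiter-wrapped tag idea from the last comment can be sketched as follows. This uses Python's built-in sqlite3 rather than MySQL so that it runs without a server; the `notes` table, its columns, and the `search_by_tags` helper are hypothetical. Summing the LIKE results to rank by the number of matched tags is an addition for illustration, not something the comment itself proposes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, tags TEXT)")
conn.executemany(
    "INSERT INTO notes (id, tags) VALUES (?, ?)",
    [
        (1, ":москва:автомобиль:кредит:"),
        (2, ":москва:автомобиль:"),
        (3, ":кредит:"),
        (4, ":погода:"),
    ],
)


def search_by_tags(conn, tags):
    """Return (id, matches) for notes with at least one tag, best match first."""
    if not tags:
        return []
    # Each LIKE comparison yields 0 or 1, so the sum is the number of matched tags.
    score = " + ".join("(tags LIKE ?)" for _ in tags)
    sql = (
        f"SELECT id, matches FROM (SELECT id, {score} AS matches FROM notes) AS scored "
        "WHERE matches > 0 ORDER BY matches DESC, id"
    )
    # The surrounding colons stop a tag from matching inside a longer tag.
    params = [f"%:{tag}:%" for tag in tags]
    return conn.execute(sql, params).fetchall()


result = search_by_tags(conn, ["москва", "автомобиль", "кредит"])
```

Because every pattern starts with a wildcard, an ordinary B-tree index on the column cannot help and each query scans the whole table, so this is likely to be acceptable only for small tables. Tags that contain % or _ would also need escaping before being placed in the pattern.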