A column in the table stores product category ids as a comma-separated list (don't ask why; unfortunately this cannot be changed, it is too late and the database is huge).

Something like this:

 productID  categories
 ---------  ------------
 13730      101,103
 15336      101,109
 15320      103,104
 15310      104,105
 1314       19,20
 348        19,26
 1309       20,21
 4521       21,25
 3739       29,68
 4019       32,69

In PHP there is an array, for example $categories = Array(20, 21, 101, 103);

Task: find all products that have at least one category from the given array.

You can go through the array elements one by one and build something like:

 WHERE (FIND_IN_SET(20, table.categories)>0 OR FIND_IN_SET(21, table.categories)>0 OR FIND_IN_SET(101, table.categories)>0 OR FIND_IN_SET(103, table.categories)>0) 

But I'm afraid that with a large array (and this may well happen) the query will be slow: after all, this is text processing rather than a comparison of integers, as one would like ...

Does anyone have any other options?

  • Not only can it be changed; you can simply duplicate all the data in normalized form and search there. - etki
  • @Mike the second option is great! Just what I need! Please post it as an answer ... - cyadvert

2 answers

Join your array in PHP with implode('|', $categories), then build an SQL query of the form:

 select * from table where table.categories regexp '(^|,)(20|21|101|103)(,|$)' 
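As a rough illustration of how such a pattern could be assembled and checked, here is a minimal sketch. It is written in Python rather than PHP purely for illustration (the `"|".join(...)` call plays the role of `implode`), the helper name `build_category_pattern` is hypothetical, and Python's `re` stands in for MySQL's REGEXP, which should behave the same way for a pattern this simple.

```python
import re

def build_category_pattern(categories):
    """Return a pattern like (^|,)(20|21|101|103)(,|$) for a list of ids."""
    # int() raises ValueError on non-numeric strings, so arbitrary text
    # cannot end up inside the pattern (and later inside the SQL string).
    ids = [str(int(c)) for c in categories]
    if not ids:
        raise ValueError("at least one category id is required")
    return "(^|,)(" + "|".join(ids) + ")(,|$)"

pattern = build_category_pattern([20, 21, 101, 103])
print(pattern)  # (^|,)(20|21|101|103)(,|$)

# Sample rows from the question: productID -> categories
rows = {
    13730: "101,103", 15336: "101,109", 15320: "103,104", 15310: "104,105",
    1314: "19,20", 348: "19,26", 1309: "20,21", 4521: "21,25",
    3739: "29,68", 4019: "32,69",
}

# Keep the products whose list contains at least one requested id
# as a whole element (not as part of a longer number).
matched = [pid for pid, cats in rows.items() if re.search(pattern, cats)]
print(matched)  # [13730, 15336, 15320, 1314, 1309, 4521]
```

The `(^|,)` and `(,|$)` groups are what keep an id such as 20 from matching inside 120 or 201.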

But if you need to search very quickly, there are not too many categories, and each product has only 2-4 of them, then, to avoid changing the existing system that produces this categories field, I can suggest the following kludge-oriented solution:

  1. Create a "category groups" table, something like (group_id, catlist varchar). That is, it will contain every distinct set of categories that occurs.
  2. Fill the table with select distinct table.categories from table
  3. Add a group_id column to the main table, build an index on it, and fill it with the ids from the "category groups" table.
  4. Create a trigger on changes to the categories field in the main table. First, it guarantees that the comma-separated categories are in ascending order; second, when the field changes, it looks up the group_id for this list of categories (adding a row to the groups table if necessary) and stores it in the column we added.
  5. Create a "categories in groups" table of the form (group_id, category_id). That is, the group "103,104" gets 2 rows in it: 103 and 104.
  6. Create a trigger on "category groups" that fills the "categories in groups" table.
  7. Build indexes, search by category id directly in "categories in groups", and use the resulting group_id values to go straight to the main table.
  8. Optionally, run a monthly procedure to remove unused groups.
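The steps above can be sketched roughly as follows. This is only an illustration under several assumptions: it uses Python with an in-memory SQLite database instead of MySQL, the table and column names (`products`, `category_groups`, `group_categories`) are made up for the example, and a plain function `save_product` stands in for the triggers from steps 4 and 6.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY,
                           categories TEXT, group_id INTEGER);
    CREATE TABLE category_groups (group_id INTEGER PRIMARY KEY,
                                  catlist TEXT UNIQUE);
    CREATE TABLE group_categories (group_id INTEGER, category_id INTEGER,
                                   PRIMARY KEY (category_id, group_id));
    CREATE INDEX idx_products_group ON products (group_id);
""")

def normalize(catlist):
    """Sort ids numerically so '21,20' and '20,21' share one group."""
    return ",".join(str(n) for n in sorted({int(c) for c in catlist.split(",")}))

def save_product(product_id, catlist):
    """Stand-in for the triggers: resolve (or create) the group, then store it."""
    catlist = normalize(catlist)
    row = conn.execute("SELECT group_id FROM category_groups WHERE catlist = ?",
                       (catlist,)).fetchone()
    if row is None:
        group_id = conn.execute("INSERT INTO category_groups (catlist) VALUES (?)",
                                (catlist,)).lastrowid
        conn.executemany("INSERT INTO group_categories VALUES (?, ?)",
                         [(group_id, int(c)) for c in catlist.split(",")])
    else:
        group_id = row[0]
    conn.execute("INSERT OR REPLACE INTO products VALUES (?, ?, ?)",
                 (product_id, catlist, group_id))

def find_products(categories):
    """Integer lookup in group_categories, then straight to products by group_id."""
    marks = ",".join("?" * len(categories))
    sql = ("SELECT product_id FROM products WHERE group_id IN "
           "(SELECT group_id FROM group_categories WHERE category_id IN ("
           + marks + ")) ORDER BY product_id")
    return [r[0] for r in conn.execute(sql, list(categories))]

for pid, cats in [(13730, "101,103"), (15336, "101,109"), (15320, "103,104"),
                  (15310, "104,105"), (1314, "19,20"), (348, "19,26"),
                  (1309, "21,20"), (4521, "21,25")]:
    save_product(pid, cats)

print(find_products([20, 21, 101, 103]))
# [1309, 1314, 4521, 13730, 15320, 15336]
```

The search itself never touches the text column: it compares integers in `group_categories` and then follows the indexed `group_id`.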

If every product has 3 categories and there are 150 possible categories, then at most about 551k groups are possible (C(150, 3) = 551,300), even though the total number of ordered triples is 150^3 ≈ 3.4M. This is because, first, there are no repeated numbers like "1,1,1", and second, we sorted the categories, so a variant like "3,2,1" cannot occur. So for 150 categories we get about 6 times fewer possible values than the maximum. Since in practice many categories are incompatible with each other, there will be far fewer variants, and you can expect the number of category groups to be 10% of the number of rows in the main table, or even less. A query like select count(1),count(distinct categories) from table will give you the exact ratio for your data, and you can decide whether the game is worth the candle.
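The estimate can be checked with a few lines of arithmetic. A small Python sketch, assuming exactly 3 distinct categories per product out of 150 (the variable names are just for illustration):

```python
from math import comb

categories = 150
per_product = 3

# Every ordered triple, repeats allowed: 150^3
ordered_with_repeats = categories ** per_product
# Sorted triples without repeats: C(150, 3)
sorted_no_repeats = comb(categories, per_product)

print(ordered_with_repeats)  # 3375000
print(sorted_no_repeats)     # 551300
print(round(ordered_with_repeats / sorted_no_repeats, 2))  # 6.12
```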

    Mike, thank you very much: you helped solve the problem, and you simplified my previous solution:

     WHERE field_cats REGEXP('^{$id_cat}$|^{$id_cat}\\\.|^\\\.{$id_cat}$|\\\.{$id_cat}\\\.|\\\.{$id_cat}$') 

    That was for finding a single id, for example the id of some category. But I also needed to display several checkbox parameters depending on the category. (That is, each parameter has a field with category ids like 1.5.3.4, and when you enter a category, all of its nested subcategories are selected, i.e. an array of their ids. You then need to find the matches and display only those parameters whose categories match, including the nested ones.)

    And this code (with the separator written as [.], since a bare . in a regexp matches any character)

     WHERE field_cats REGEXP('(^|[.])($id_cat)([.]|$)') 

    handles the task quite elegantly.

    And, as I said, it also simplifies the code even when you need to find a single id (you no longer have to add 4 possible variants to the regexp) ...

    I hope I'm not celebrating too early)))) So far it works as it should.

    Thanks.
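One thing worth checking with dot-separated lists: in a regular expression a bare `.` matches any character, so the separator needs to be a literal dot, for example `[.]`. A minimal Python sketch compares the two variants; Python's `re` stands in for MySQL's REGEXP here, and the helper name `dotted_pattern` is hypothetical.

```python
import re

def dotted_pattern(ids):
    """Match any of the given ids as a whole element of a list like '1.5.3.4'."""
    alternatives = "|".join(str(int(i)) for i in ids)
    # [.] is a literal dot; a bare . would match any character.
    return "(^|[.])(" + alternatives + ")([.]|$)"

strict = dotted_pattern([5])
loose = "(^|.)(5)(.|$)"  # same shape, but with unescaped dots

# Columns: value, match with literal dots, match with bare dots
for value in ["1.5.3.4", "5", "15.3", "1.53"]:
    print(value, bool(re.search(strict, value)), bool(re.search(loose, value)))
# 1.5.3.4 True True
# 5 True True
# 15.3 False True
# 1.53 False True
```

With bare dots, id 5 is also found inside 15 and 53, which is usually not what is wanted.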