Hello. I have the following task.

Suppose there are two multidimensional arrays.

One of them (call it $array1):

Array (
    [299292] => Array (
        [ID] => 299292
        [ID_EL] =>
        [ARTICUL] => 336867
        [TOVAR] => Шезлонг Garden Way LOUIS-A 76015 черный
        [MODEL] => LOUIS-A 76015 черный
        [ROD] => Спорт и отдых
        [CATEGORY] => Спорт и отдых
        [SUBCATEGORY] => Качели и гамаки
        [PRICE] => 7200
        [BRAND] => Garden Way
        [NAME] => Шезлонг
        [RRC] => РРЦ
    )
    [299291] => Array (
        [ID] => 299291
        [ID_EL] =>
        [ARTICUL] => 336866
        [TOVAR] => Шезлонг Garden Way LOUIS-A 76015 бежевый
        [MODEL] => LOUIS-A 76015 бежевый
        [ROD] => Спорт и отдых
        [CATEGORY] => Спорт и отдых
        [SUBCATEGORY] => Качели и гамаки
        [PRICE] => 7200
        [BRAND] => Garden Way
        [NAME] => Шезлонг
        [RRC] => РРЦ
    )
    [299290] => Array (
        [ID] => 299290
        [ID_EL] =>
        [ARTICUL] => 336865
        [TOVAR] => Шезлонг Garden Way BRIGO 770505 черный
        [MODEL] => BRIGO 770505 черный
        [ROD] => Спорт и отдых
        [CATEGORY] => Спорт и отдых
        [SUBCATEGORY] => Качели и гамаки
        [PRICE] => 5700
        [BRAND] => Garden Way
        [NAME] => Шезлонг
        [RRC] => РРЦ
    )
    /* etc. */
)

The second array is $array2:

 Array (
     [111113] => Array (
         [ID] => 1314124
         [NAME] => Шезлонг Garden Way LOUIS-A123 76015 черный
     )
     [111112] => Array (
         [ID] => 299291
         [NAME] => Шезлонг Garden Way LOUIS-A12 76015 бежевый
     )
     [11111] => Array (
         [ID] => 13253325
         [NAME] => Шезлонг Garden Way BRIGO 770505 черный
     )
     /* etc. */
 )

Each array contains 10,000+ nested arrays.

I need to compare the first array by its TOVAR key with the second array by its NAME key. If there is a match, I perform some simple operations, but that's not the point. The question is: how do I do this as quickly and efficiently as possible?

After all, a brute-force comparison like this:

 foreach ($array1 as $row)
     foreach ($array2 as $cat)
         if ($row['TOVAR'] == $cat['NAME'])
             echo 'Equal: ' . $row['TOVAR'] . ' and ' . $cat['NAME'] . '<br/>';
         else
             echo 'Not equal: ' . $row['TOVAR'] . ' and ' . $cat['NAME'] . '<br/>';

will spin through a huge number of useless iterations, and that will hurt speed badly. I also don't know how to use array_intersect() in my case, because the comparison is on different keys (and I don't need a separate result array). I think the operations should be performed right in the loop, on the fly (for example, on a match I need to copy the ID from the second array over the ID in the first array).

What do you advise? Thanks!

    3 answers

     foreach($array1 as $row) foreach ($array2 as $cat){} 

    The main problem with this algorithm is its quadratic complexity.
    Put simply, the number of iterations here is N*M, where N and M are the sizes of the two arrays.

    To make it linear, let's create an index:

     // Map each TOVAR value to the keys of the $array1 rows that carry it.
     $index = [];
     foreach ($array1 as $key => $row) {
         if (!isset($index[$row['TOVAR']])) $index[$row['TOVAR']] = [];
         $index[$row['TOVAR']][] = $key;
     }

    Now that we have an index, we can look values up in it:

     foreach ($array2 as $cat) {
         if (!empty($index[$cat['NAME']])) {
             echo 'Matches found: ' . count($index[$cat['NAME']]) . "\n";
         }
     }
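The question also asks to copy the ID from the second array into the first on a match. A minimal sketch of that variant follows, assuming the index is built over $array2 by NAME instead; the sample rows and the $idByName variable are hypothetical, and a duplicate NAME simply overwrites the earlier entry.

```php
<?php
// Hypothetical sample rows shaped like the arrays in the question.
$array1 = [
    299292 => ['ID' => 299292, 'TOVAR' => 'Chair A black'],
    299291 => ['ID' => 299291, 'TOVAR' => 'Chair A beige'],
];
$array2 = [
    111113 => ['ID' => 1314124, 'NAME' => 'Chair A black'],
    111112 => ['ID' => 555, 'NAME' => 'Chair B'],
];

// Build the index once: NAME => ID from $array2 (one pass over M rows).
$idByName = [];
foreach ($array2 as $cat) {
    $idByName[$cat['NAME']] = $cat['ID']; // a later duplicate NAME overwrites an earlier one
}

// One pass over $array1 (N rows); rows are modified in place by reference.
foreach ($array1 as &$row) {
    if (isset($idByName[$row['TOVAR']])) {
        $row['ID'] = $idByName[$row['TOVAR']];
    }
}
unset($row); // drop the dangling reference left by foreach

print_r($array1);
```

Only the row whose TOVAR equals a NAME in $array2 gets a new ID; the other rows stay untouched.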
    • But what is the asymptotic cost of building the index? If it's like bubble sort, roughly NxN, or say NxNx0.4 with a well-tuned algorithm ... in total you get (NxNx0.4 + M) instead of NxM. Have you gained much? - Eugene Bartosh February
    • 1
      In the code above the complexity is N + M: inserting an element into an array (in PHP an array is something like a HashMap) is a constant-time operation. What does bubble sort have to do with it? - vp_arth
    • I mean the bubble sort algorithm - Eugene Bartosh
    • I mean, where would you even use it here? - vp_arth
    • vp_arth, I only brought up bubble sort for "clarity"; I was hinting that building the index is not free :-) - Eugene Bartosh
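The cost argument in the comments above can be checked by counting loop iterations directly. This is a rough sketch with synthetic data; the sizes and the counter variables are assumptions chosen for illustration, and it counts iterations only, not wall-clock time.

```php
<?php
// Hypothetical sizes; names are synthetic and the first $m names appear in both arrays.
$n = 300;
$m = 200;
$array1 = [];
$array2 = [];
for ($i = 0; $i < $n; $i++) { $array1[] = ['TOVAR' => "item $i"]; }
for ($j = 0; $j < $m; $j++) { $array2[] = ['NAME' => "item $j"]; }

// Brute force: every pair is compared, N*M iterations.
$nestedSteps = 0;
$nestedMatches = 0;
foreach ($array1 as $row) {
    foreach ($array2 as $cat) {
        $nestedSteps++;
        if ($row['TOVAR'] === $cat['NAME']) { $nestedMatches++; }
    }
}

// Index: one pass to build it, one pass to probe it, N+M iterations.
$indexSteps = 0;
$indexMatches = 0;
$index = [];
foreach ($array1 as $key => $row) {
    $indexSteps++;
    $index[$row['TOVAR']][] = $key;
}
foreach ($array2 as $cat) {
    $indexSteps++;
    if (isset($index[$cat['NAME']])) { $indexMatches += count($index[$cat['NAME']]); }
}

echo "nested: $nestedSteps steps, $nestedMatches matches\n";
echo "index:  $indexSteps steps, $indexMatches matches\n";
```

Both approaches find the same matches, but the nested loops take 60000 iterations here while the index takes 500, which is the N*M versus N+M difference discussed above.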

    After reading this, I want to fight with a tiger ...

    1. Where did two (TWO, Karl!) arrays of 10k+ elements come from in PHP? You must not pull batches like that into PHP! You should take 20, 50, or 100 records at a time (page by page). Do you want to kill the server? Say a million users show up and each one runs this ... what then, half of your database hanging around in every session? What will happen to the server?
    2. All these comparisons in PHP are monkey work. You just need to write a proper SQL query (yes, a query of 5-10 lines) that does the same thing at the DBMS level, which was built precisely for selecting things, comparing them against other things, and doing so in large batches.

    I would make SQL a compulsory subject starting from the 6th grade of secondary school, and yet here this is being written in PHP! If, darn it, you don't want to / can't / don't have time to write the query, then post here and explain the task; within a day people will write several versions for you ... You just can't mock the Universal Mind like that!
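For readers who want to see what such a query might look like, here is a small sketch of a join on the product name. Everything about the schema is an assumption: the question never shows a database, so the tables goods and catalog and their columns are hypothetical, and the example uses an in-memory SQLite database through PDO, which requires the pdo_sqlite extension.

```php
<?php
// Assumption: pdo_sqlite is available; table and column names are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE goods (id INTEGER PRIMARY KEY, tovar TEXT)');
$pdo->exec('CREATE TABLE catalog (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec("INSERT INTO goods VALUES (299292, 'Chair A black'), (299291, 'Chair A beige')");
$pdo->exec("INSERT INTO catalog VALUES (1314124, 'Chair A black'), (13253325, 'Chair B')");

// The join performs the TOVAR = NAME comparison inside the DBMS.
$sql = 'SELECT g.id AS goods_id, c.id AS catalog_id, g.tovar
        FROM goods g
        JOIN catalog c ON c.name = g.tovar';
$matches = $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);

foreach ($matches as $match) {
    echo "{$match['goods_id']} -> {$match['catalog_id']}: {$match['tovar']}\n";
}
```

On real tables an index on the joined column would normally matter for speed, but that depends on the DBMS and the schema, neither of which the question specifies.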

    • 2
      You're writing nonsense. You do understand that the task, like the data, can come from anywhere. Maybe some granny collected these 10k rows in a text file; it doesn't matter. What matters is the logical core of the question: "how do I do this faster than brute force". Clearly, if you get to think the problem through from the start, before you even have the data, you will design it better. But what if you already have the data? Then you have to work from that, and the implementation language has absolutely nothing to do with it (taken from my comments on another answer) - Vasily Barbashev
    • @Vasily Barbashev, I'm not writing nonsense, and I wish the same for you ;-) - Eugene Bartosh
    • From the arrays themselves it's clear that the data was selected by a query from a database and that it is typical store data. There's no need to fantasize about grannies, or to use "good" advice about algorithms to lead novice developers into even deeper misconceptions than they already have. And then people have to shop at these stores, and the owners wonder why sales aren't happening ... - Eugene Bartosh
    • I wrote a general comment about answering questions, and you are looking only at this one question. Think more broadly. And let's not argue; in principle we are both right. - Vasily Barbashev
    • 1
      Too toxic. Who told you there is a server, a DBMS, and so on? The musings about school curricula are entirely beside the point. The world is not limited to the class of problems that you solve. - user236014

    I fully agree with the previous answer: this should be done on the database side.

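    But if you really want to do it in PHP, array_uintersect will help (array_udiff is its counterpart and returns the difference instead).

    A little explanation: the callback returns 0 if we consider the elements equal, and a negative or positive integer otherwise, like a sort comparator. It may also be handed two elements of the same array, so it has to read the name from whichever key is present.

    So we need to get the elements that are present in both arrays:

     // Rows of $array1 carry TOVAR; rows of $array2 only have NAME.
     function item_name($row) { return $row['TOVAR'] ?? $row['NAME']; }
     function compare_func($a, $b) { return strcmp(item_name($a), item_name($b)); }
     $in1andIn2 = array_uintersect($array1, $array2, 'compare_func');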
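As a hedged, self-contained illustration of a callback-based intersection, the sketch below uses array_uintersect, which keeps the elements of the first array that have an equal element in the second. The sample rows and the closures $nameOf and $compare are hypothetical. PHP may call the comparison callback with two rows from the same array, and it expects a negative, zero, or positive integer, so the callback normalizes both rows to a name before comparing.

```php
<?php
// Hypothetical sample rows; note that $array1 rows carry both TOVAR and NAME.
$array1 = [
    299292 => ['ID' => 299292, 'TOVAR' => 'Chair A black', 'NAME' => 'Chair'],
    299291 => ['ID' => 299291, 'TOVAR' => 'Chair A beige', 'NAME' => 'Chair'],
];
$array2 = [
    111113 => ['ID' => 1314124, 'NAME' => 'Chair A black'],
    111112 => ['ID' => 555, 'NAME' => 'Chair B'],
];

// The callback may receive two rows from the same array, so it picks
// TOVAR when present and falls back to NAME.
$nameOf = function (array $row): string {
    return $row['TOVAR'] ?? $row['NAME'];
};
$compare = function (array $a, array $b) use ($nameOf): int {
    return strcmp($nameOf($a), $nameOf($b));
};

// Rows of $array1 whose TOVAR equals some NAME in $array2; keys are preserved.
$inBoth = array_uintersect($array1, $array2, $compare);
print_r(array_keys($inBoth));
```

With this data only the row with key 299292 survives. Unlike the index approach, this builds a separate result array and sorts internally, so it is not a linear-time solution.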
    • There is not a word about a database in the question. What makes you think there is one? - mJeevas
    • And where does the data come from? Out of thin air? Well, maybe from a file, etc. Then you should load it into a database and work with it there, unless this is a one-off task, of course. For a one-off task it's not worth it, I agree. - Maxim Stepanov
    • 1
      return 1; else return 0; - unnecessary letters. - vp_arth
    • Because the question is the wrong one, and the right question is half the answer, so we are trying to hint at that to the author, directly and indirectly) - Eugene Bartosh
    • And if this is a one-off task, then you can use anything: Python, C++, BASIC, just not PHP, agreed? )) - Eugene Bartosh