There is an array:
[ {ids: [1]}, {ids: [9,10]}, {ids: [3]}, {ids: [4,5]}, {ids: [1,2]}, {ids: [5,8]}, {ids: [8]}, {ids: [2,3]}, {ids: [10]}, {ids: [11]} ]

Each element of a given array (node) can be associated with another node by common elements of the ids array.
For example: the 0th and 4th nodes are connected by element 1, in turn the 4th node is connected to the 7th element 2, and the 7th node is connected to the 2nd element 3.
It turned out such a chain of nodes 0-2-4-7, containing elements 1, 2 and 3.
It is necessary to find all such chains and get an output on the output:

[ [ {ids: [1]}, {ids: [3]}, {ids: [1,2]}, {ids: [2,3]} ], [ {ids: [4,5]}, {ids: [5,8]}, {ids: [8]} ], [ {ids: [9,10]}, {ids: [10]} ], [ {ids: [11]} ] ]

The number of nodes at the entrance can be up to 100,000.
The task is to do it in the shortest possible time, without unnecessary iterations.
I feel that graph theory may be useful here, but unfortunately, it is not strong in it ...
Suggest algorithms :)

=============== update =================

@VladD , @a_gura , thanks for the tips. Here's what happened:

 <pre><code>class UnionFind { public $groups = array(); public function Put($ids, $el) { $index = $this->find($ids); $this->groups[$index]['ids'] = array_unique(array_merge($this->groups[$index]['ids'], $ids)); $this->groups[$index]['els'][] = $el; foreach ($this->groups as $index2=>$group) { if ($index !== $index2 && array_intersect($group['ids'], $this->groups[$index]['ids'])) { $index = $this->union($index, $index2); } } } private function make() { $index = uniqid(); $this->groups[$index] = array( 'ids' => array(), 'els' => array() ); return $index; } private function find($ids) { foreach ($this->groups as $key=>$group) { if (array_intersect($ids, $group['ids'])) { return $key; } } return $this->make(); } private function union($x, $y) { $index = $this->make(); $this->groups[$index]['ids'] = array_merge($this->groups[$x]['ids'], $this->groups[$y]['ids']); $this->groups[$index]['els'] = array_merge($this->groups[$x]['els'], $this->groups[$y]['els']); unset($this->groups[$x]); unset($this->groups[$y]); return $index; } } $set = new UnionFind(); foreach ($data as $el) { $set->Put($el['ids'], $el); }</code></pre> 

The results of the work with a small number of groups (7):

total count: 1000 1,2,3: [{"ids":[1,2]},{"ids":[2]},{"ids":[2,3]},... 4,5,6: [{"ids":[4]},{"ids":[4,5]},{"ids":[5]},{"... 8,9: [{"ids":[8]},{"ids":[8,9]},{"ids":[8]},{"id... 10: [{"ids":[10]},{"ids":[10]},{"ids":[10]},{"id... 12: [{"ids":[12]},{"ids":[12]},{"ids":[12]},{"id... 13: [{"ids":[13]},{"ids":[13]},{"ids":[13]},{"id... 7,14,11: [{"ids":[7]},{"ids":[14,7]},{"ids":[11,... total time: 0.12882208824158 seconds

The results of the work with a large number of groups (1000 - all ids in the source array are unique):

total count: 1000 1: [{"ids":[1]}] 2: [{"ids":[2]}] 3: [{"ids":[3]}] ... ... 998: [{"ids":[998]}] 999: [{"ids":[999]}] 1000: [{"ids":[1000]}] total time: 9.0800588130951 seconds

How can you improve performance with a large number of groups?

============== update ==============

The @Johny solution turned out better:
The results of the work with a small number of groups (7):
total time: 0.073331117630005 seconds

The results of the work with a large number of groups (1000 - all ids in the source array are unique):
total time: 3.4532918930054 seconds

  • I understand that it is not so important, but I am curious in what programming language do you plan to implement all this? ) - Johny
  • Language - PHP - Artem Ryzhov
  • one
    Self-evident. Get a lot of current groups, in each id belonging to it and elements of the original array. Process the source array one by one, merging groups, if necessary. - VladD

2 answers 2

The decision is easier if you introduce the concept of "group" and begin to act on it.

We get an array of $groups . In it, each subarray will be considered a separate group. We place in the first group the numbers from the first array of the initial data $data , unset($data[0]) this element (well, the element itself is added to where you need it).

Next, we take in turn the numbers from the first group, run through the original $ data array and look for elements with this number. If we find - we rake the numbers that are with it in the current element and add them to the same group, and the element itself is again doing unset() .

When the array is completed, we take the next number from this group and repeat the previous procedure. At the end of the group, two options are possible: either the $data array will be empty, and therefore all the numbers belong to the same group, or there are still elements left. In the second case, start everything from the beginning.

IMPORTANT! The numbers themselves are best preserved as keys of group arrays, this will avoid duplication of data, and you can additionally calculate how many each number is encountered in the original data.

An example implementation . Take measurements on your data and show plz results :)

  • How it has become fashionable now: a solution in 40 lines! :)))) - Johny
  • one
    Thanks, the results posted! - Artem Ryzhov

Very similar to Union-Find .