How can I implement an algorithm for finding the intersection of lists consisting of integers? Through the set I tried, but then does not display duplicate elements. I know that it is possible to go through the lists with nested loops and output the same elements, but I cannot implement it. It is necessary that the complexity of the algorithm is O (len (list1) + len (list2)) . If anyone helps, I will be grateful. Example:

list1 = [1,3,7,9,10] list2 = [2,3,4,7,8,11] 

Answer: [3, 7]

An example with repetitions:

 list1 = [1,3,3,4,6] list2 = [3,3,3,6,8,9] 

then should withdraw [3,3,6] .

  • one
    Give an example with repeating elements and that the expected output is m9_psy
  • You have everything already sorted in both lists, in this case it is better to use an associative array index: value, then you can run on the longest list and check for existence in another one by index; the index and there and there will be equal to the value, that is, the index for the value of 3 will be equal to 3. - Daniel Protopopov
  • @danielprotopopov I didn’t quite understand what kind of associative array should be created, and also not sure that the specified algorithm complexity will be performed. - sleet
  • @ m9_psy for example list1 = [1,3,3,4,6] list2 = [3,3,3,6,8,9] then should output [3,3,6] - sleet
  • one
    Although your comment will not work. I do not even know, BUT there is already a ready-made intersection method, for this you have to convert the list to set. Again, I can not say what the complexity of the search will be there. ( stackoverflow.com/questions/642763/… ) - Daniel Protopopov

4 answers 4

If the lists are sorted, then you can:

 def find_intersection(l1, l2): out = [] i1 = 0 i2 = 0 while (i1 < len(l1)) and (i2 < len(l2)): if l1[i1] > l2[i2]: i2 += 1 elif l1[i1] < l2[i2]: i1 += 1 else: #l1[i1] == l2[i2] out.append(l1[i1]) i1 += 1 i2 += 1 return out 
  • And if not sorted, then I doubt the reachability of O (N) - andy.37
  • @ andy.37 through the hash tables should work, just do something like this. Duplicate elements displayed - Flowneee
  • Thanks for the help. I don’t consider the situation with unsorted lists, so the algorithm is correct. - sleet

To find the intersection of lists with regard to repetitions, you can use instead of a simple set (unique elements) a class with multiset semantics (elements can appear more than once) such as collections.Counter :

 from collections import Counter common_items = list((Counter(list1) & Counter(list2)).elements()) # -> [3, 3, 6] 
  • one
    in my opinion, this is the most professional solution (of the above) - MaxU

For unsorted lists, assuming that the operation d.get(key, None) for an arbitrary dictionary d has complexity O (1):

 def count(l): d = {} for i in l: v = d.get(i, 0) d[i] = v + 1 return d l1 = [1, 1, 2, 3, 4, 3] # count(l1) = {1:2, 2:1, 3:2, 4:1} l2 = [1, 1, 3, 3, 5, 1, 2] # count(l2) = {1:3, 2:1, 3:2, 5:1} r = [] d = count(l2) for i, v in count(l1).items(): w = d.get(i, 0) for j in range(min(v, w)): r.append(i) print r # r = [1, 1, 2, 3, 3] 

If you write shorter:

 def count(l): d = {} # d = collections.defaultdict(int) for i in l: d[i] = d.get(i, 0) + 1 # d[i] += 1 return d r = [] d = count(l2) for i, v in count(l1).items(): r += [i] * min(v, d.get(i, 0)) 

The list generator here (IMHO) does not fit in any way.

  • Such horrors need nothing (there is dict.setdefault , even better there are collections.defaultdict and even better collections.Counter ) - jfs
  • @jfs is generally funny, that in general, knowing the python well, I went completely past the collections module. But in five lines it says how collection.Counter works )) - andy.37
  • ++ for your implementation of Counter ;-) It would be interesting to compare the performance for large lists - I think Counter will be much faster - MaxU
  • @MaxU even without Counter, try count () to write using defaultdict and compare the code. Other variable names could improve code readability. A list generator could also be used here. - jfs
  • @jfs, I agree, much more can be improved here ... - MaxU
 print(i for i in list1 if i in list2) 

Scrolling just one array

  • Print <generator object <genexpr> at 0x7f3cfa25fd00> - andreymal
  • So the specified complexity of the algorithm is not achieved - sleet
  • 2
    N * M complexity ... - andy.37
  • one
    And with duplicate elements it will work incorrectly - it will deduce all the repetitions from the first list, if there is at least one such element in the second one. - Xander