There was a similar topic, but the answer is not clear to me. Why doesn't this code remove all the values from the list?

    data = [1, 2, 3, 4, 5, 6]
    for i in data:
        if data.count(i) == 1:
            data.remove(i)
    print data

It prints [2, 4, 6].
The most important rule: never change the size of a list while iterating over it.
Let's see how resizing an array affects the logic of a loop:
    In [3]: l = list(range(6))

    In [4]: for x in l:
       ...:     print(x)
       ...:     l.remove(x)
       ...:
    0
    2
    4

Let's look at it through the prism of wonderful ASCII drawings:
    +---+---+---+---+---+---+
    | 0 | 1 | 2 | 3 | 4 | 5 | <- l
    +---+---+---+---+---+---+
      ^
      x

Print x and remove it from the list:

    # print(0)
    +---+---+---+---+---+
    | 1 | 2 | 3 | 4 | 5 | <- l
    +---+---+---+---+---+
      ^
      x

Let's move on to the next element, as the great Guido van Rossum has bequeathed to us:

    +---+---+---+---+---+
    | 1 | 2 | 3 | 4 | 5 | <- l
    +---+---+---+---+---+
          ^
          x

To reinforce this, let's repeat the steps: print, remove, and move on to the next iteration of the loop:

    # print(2)
    +---+---+---+---+
    | 1 | 3 | 4 | 5 | <- l
    +---+---+---+---+
          ^
          x (before the move)

    +---+---+---+---+
    | 1 | 3 | 4 | 5 | <- l
    +---+---+---+---+
              ^
              x (after the move)

Clearly, changing the size of a list while iterating over it gives rise to yet another small evil that can lead (and does lead) to errors.
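The skipping can also be reproduced without a for loop. The sketch below assumes the loop keeps an internal position counter that advances by one on every step, which is how CPython's list iterator behaves; the function name `trace_remove_while_iterating` and the variable `pos` are purely illustrative:

```python
def trace_remove_while_iterating(items):
    """Mimic `for x in items: items.remove(x)` with an explicit position counter."""
    items = list(items)      # work on a copy so the caller's data is untouched
    visited = []
    pos = 0                  # hypothetical stand-in for the iterator's internal index
    while pos < len(items):  # the length is re-checked on every step
        x = items[pos]
        visited.append(x)
        items.remove(x)      # everything to the right shifts one slot left
        pos += 1             # the position still advances, so one element is skipped
    return visited, items


visited, remaining = trace_remove_while_iterating(range(6))
print(visited)    # [0, 2, 4]
print(remaining)  # [1, 3, 5]
```

The visited values match the output of the loop above, and the values left behind are exactly the ones that were stepped over.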
The shortest working equivalent of this loop was presented by @andreymal:
    data = [x for x in data if data.count(x) > 1]

But the solutions presented share one drawback: they have quadratic complexity.
    In [9]: data = list(range(10000))

    In [10]: %timeit [x for x in data if data.count(x) > 1]
    1 loops, best of 3: 1.66 s per loop

The standard Python library provides the Counter class, which counts the number of occurrences of each element. With it, the running time becomes linear (strictly speaking, amortized linear):
    In [11]: from collections import Counter

    In [12]: def f(xs):
       ....:     counter = Counter(xs)
       ....:     return [x for x in xs if counter[x] > 1]
       ....:

    In [13]: %timeit f(range(10000))
    100 loops, best of 3: 2.33 ms per loop

As the previous answer explains, the list changes but the index does not, so every time an element is deleted the loop effectively takes an extra step to the next element.
When I'm too lazy to be clever and the list needs cleaning up, I iterate over a copy of the list:
    for i in tuple(data):
        if data.count(i) == 1:
            data.remove(i)

(tuple instead of list, because it is said to be faster)
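As a Python 3 sketch of the same idea (the function name `remove_unique_in_place` is made up for illustration), iterating over a snapshot lets the loop visit every original element while the list itself shrinks:

```python
def remove_unique_in_place(data):
    """Remove every value that occurs exactly once, mutating `data`."""
    for item in tuple(data):         # snapshot: the loop no longer depends on `data`
        if data.count(item) == 1:    # counted against the live, shrinking list
            data.remove(item)
    return data


print(remove_unique_in_place([1, 2, 3, 4, 5, 6]))  # []
print(remove_unique_in_place([1, 2, 2, 3, 3, 4]))  # [2, 2, 3, 3]
```

This variant is still quadratic, because `count` and `remove` each scan the list on every step.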
When I'm not too lazy to think, I collect the items to delete in a separate list:
    rm = []
    for i in data:
        if data.count(i) == 1:
            rm.append(i)
    for x in rm:
        data.remove(x)

When I remember that list comprehensions exist, I write a one-liner:
    data = [x for x in data if data.count(x) > 1]

The fourth option I know of is given in another answer.
... data small, then it is better: non_uniq = [item for item, count in Counter(data).items() if count > 1] (linear algorithm) - jfs

If you add a print:
    data = [1, 2, 3, 4, 5, 6]
    for i in data:
        print i
        if data.count(i) == 1:
            data.remove(i)
    print data

you get:
    1
    3
    5

Apparently, in Python, when you delete an element the index stays where it is, so you step over one element. That is, after a deletion you have to check the item at the same index again. The easiest way is to iterate with a descending index.
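The remark about checking the same index again after a deletion can also be sketched as a forward loop. This is a minimal Python 3 illustration, not code from the thread; the function name `remove_unique_forward` is hypothetical:

```python
def remove_unique_forward(data):
    """Remove values that occur exactly once, walking forward by index."""
    i = 0
    while i < len(data):
        if data.count(data[i]) == 1:
            del data[i]   # the next element slides into position i, so do not advance
        else:
            i += 1        # nothing was removed, so move on
    return data


print(remove_unique_forward([1, 2, 3, 4, 5, 6]))  # []
print(remove_unique_forward([1, 2, 2, 3, 1, 4]))  # [1, 2, 2, 1]
```

The index only advances when nothing was deleted, so no element is stepped over.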
Here is the descending-index version, which deletes everything (my first Python code :), it can surely be made prettier):
    data = [1, 2, 3, 4, 5, 6]
    i = len(data) - 1
    while i >= 0:
        if data.count(data[i]) == 1:
            data.remove(data[i])
        i = i - 1
    print data

Source: https://ru.stackoverflow.com/questions/433673/