Dictionary elements: how to count identical values across dictionaries for a given key?

Question

What is the optimal way to count how many dictionaries share the same value for a given key?
For example, given these dictionaries:

    # keys: 'имя' = name, 'дом' = house
    S1 = {'имя': 'женя', 'дом': '77'}
    S2 = {'имя': 'егор', 'дом': '77'}
    S3 = {'имя': 'иван', 'дом': '55'}
    S4 = {'имя': 'вася', 'дом': '44'}
    S5 = {'имя': 'жора', 'дом': '33'}

Expected result: 2 people live in house 77, 1 in house 55, 1 in house 44, and 1 in house 33.

Accepted Answer · 2016-10-15T07:17:08

"Optimal" can mean different things: optimal in speed, in memory, in how easy the code is to understand, or in the number of libraries used.

Here is one option. In my opinion it is easy to follow, and it is memory-friendly because the source dictionaries are not copied (sorted() builds a new list, but that list only holds references to the same dictionaries):

    from itertools import groupby

    a = [
        {'имя': 'женя', 'дом': '77'},
        {'имя': 'егор', 'дом': '77'},
        {'имя': 'иван', 'дом': '55'},
        {'имя': 'вася', 'дом': '44'},
        {'имя': 'жора', 'дом': '33'},
    ]

    # groupby merges only adjacent items, so sort by house number first
    r = groupby(sorted(a, key=lambda x: x['дом']), lambda x: x['дом'])
    for k, g in r:
        print(k, len(list(g)))  # house number and how many dictionaries share it

Comment: I think it is important to note that groupby groups only consecutive items, so this solution works only on a list sorted by house number.
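The point about sorting can be seen in a small sketch. The list `houses` and the helper `count_runs` are hypothetical names introduced only for this illustration; the data is a deliberately unsorted sample of house numbers like those in the question.

```python
from itertools import groupby

# Hypothetical sample: house numbers in a deliberately unsorted order.
houses = ['77', '55', '77', '44']


def count_runs(values):
    """Return (value, run_length) pairs for runs of consecutive equal values."""
    return [(key, len(list(group))) for key, group in groupby(values)]


# Without sorting, the two '77' entries are not adjacent,
# so groupby reports them as two separate groups.
unsorted_counts = count_runs(houses)

# After sorting, equal values are adjacent and each house gets one group.
sorted_counts = count_runs(sorted(houses))

print(unsorted_counts)
print(sorted_counts)
```

On unsorted input the same house number shows up more than once in the result, which is why the answer sorts before grouping.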

Community spirit ♦ one · Answer 2 · 2016-10-15T07:21:07

I would use collections.Counter():

    In [10]: from collections import Counter

    In [11]: Counter([d['дом'] for d in [S1, S2, S3, S4, S5]])
    Out[11]: Counter({'33': 1, '44': 1, '55': 1, '77': 2})

or, as @TimofeyBondarev advised, a more economical option that passes a generator expression instead of a list:

    In [4]: Counter(d['дом'] for d in [S1, S2, S3, S4, S5])
    Out[4]: Counter({'33': 1, '44': 1, '55': 1, '77': 2})

Comment: If you use a generator expression, you avoid creating an additional intermediate list and the code becomes simpler: Counter(d['дом'] for d in [S1, S2, S3, S4, S5])
Comment: Counter(map(itemgetter('дом'), dicts)) is also worth mentioning, but it is a less readable alternative (the version in the answer is better).
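For completeness, the itemgetter variant from the comment could look roughly like this. The name `dicts` comes from the comment itself; its contents here are assumed to be the five dictionaries from the question.

```python
from collections import Counter
from operator import itemgetter

# Assumption: `dicts` holds the five dictionaries from the question.
dicts = [
    {'имя': 'женя', 'дом': '77'},
    {'имя': 'егор', 'дом': '77'},
    {'имя': 'иван', 'дом': '55'},
    {'имя': 'вася', 'дом': '44'},
    {'имя': 'жора', 'дом': '33'},
]

# itemgetter('дом') builds a callable equivalent to lambda d: d['дом'];
# map applies it lazily, so no intermediate list is created.
counts = Counter(map(itemgetter('дом'), dicts))

for house, residents in counts.items():
    print(house, residents)
```

Like the generator expression, this feeds Counter one value at a time; the difference is mainly one of readability.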

Answer 3 · 2016-10-15T17:19:52

    sg = tuple(s.get('дом') for s in [S1, S2, S3, S4, S5])
    x = {s: sg.count(s) for s in set(sg)}
    print(x)

out:

 {'55': 1, '44': 1, '33': 1, '77': 2}

Speed comparison:

    import random
    import timeit
    from collections import Counter
    from itertools import groupby


    def T_tuple():
        sg = tuple(s.get('дом') for s in S)
        return {s: sg.count(s) for s in set(sg)}


    def C_Counter():
        return Counter([d['дом'] for d in S])


    def G_groupby():
        r = groupby(sorted(S, key=lambda x: x['дом']), lambda x: x['дом'])
        return [(k, len(list(g))) for k, g in r]


    if __name__ == '__main__':
        fns = C_Counter, G_groupby, T_tuple
        # s = number of dictionaries, h = number of possible house values
        for s, h in [(10000, 10), (10000, 100), (10000, 1000),
                     (10, 10), (10, 100), (10, 1000)]:
            S = [{'имя': r, 'дом': random.randrange(h)} for r in range(s)]
            print('\nlen(S):', s, 'len(дом):', h)
            t = [(fn.__name__, timeit.Timer(fn).timeit(10)) for fn in fns]
            # print the functions ranked from fastest to slowest
            for e, (n, tmt) in enumerate(sorted(t, key=lambda r: r[1]), start=1):
                print("{}' {:.4} {}".format(e, tmt, n))

out:

    len(S): 10000 len(дом): 10
    1' 0.01853 C_Counter
    2' 0.05015 T_tuple
    3' 0.06918 G_groupby

    len(S): 10000 len(дом): 100
    1' 0.01672 C_Counter
    2' 0.08368 G_groupby
    3' 0.2862 T_tuple

    len(S): 10000 len(дом): 1000
    1' 0.02135 C_Counter
    2' 0.09862 G_groupby
    3' 2.634 T_tuple

    len(S): 10 len(дом): 10
    1' 7.795e-05 T_tuple
    2' 0.0001365 G_groupby
    3' 0.000157 C_Counter

    len(S): 10 len(дом): 100
    1' 0.0001153 C_Counter
    2' 0.0001153 T_tuple
    3' 0.000209 G_groupby

    len(S): 10 len(дом): 1000
    1' 0.0001025 T_tuple
    2' 0.0001058 C_Counter
    3' 0.0001683 G_groupby

Comment: This solution is far from optimal: it creates a temporary tuple of house numbers and a temporary set of house numbers, and sg.count(s) rescans the whole tuple for every distinct value, which makes the algorithm quadratic in the worst case.
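To illustrate the criticism, here is a minimal sketch of a single-pass count that needs neither the temporary tuple nor the temporary set. The function name `count_by_key` is hypothetical, and this is essentially what collections.Counter does for you; it is shown only to make the difference from the repeated sg.count(s) scans visible.

```python
def count_by_key(dicts, key):
    """Count how many dictionaries hold each value under `key` in one pass."""
    counts = {}
    for d in dicts:
        # Like the answer's s.get('дом'), a missing key is counted as None.
        value = d.get(key)
        counts[value] = counts.get(value, 0) + 1
    return counts


# Assumption: the same five dictionaries as in the question.
S1 = {'имя': 'женя', 'дом': '77'}
S2 = {'имя': 'егор', 'дом': '77'}
S3 = {'имя': 'иван', 'дом': '55'}
S4 = {'имя': 'вася', 'дом': '44'}
S5 = {'имя': 'жора', 'дом': '33'}

result = count_by_key([S1, S2, S3, S4, S5], 'дом')
print(result)
```

Each dictionary is visited once, so the work grows linearly with the number of dictionaries regardless of how many distinct house numbers there are.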
