Is it possible to refer to the dictionary keys generically, rather than by name, so that the function also works with dictionaries whose keys have other names?

Question

    Probabilities = {'A': 0.25, 'C': 0.25, 'G': 0.25, 'T': 0.25}

    def WeightedDie(Probabilities):
        import random
        r = random.uniform(0, 1)
        letter = ''
        for symbol in "ACGT":
            if r >= 0 and r <= Probabilities['A']:
                letter = 'A'
            elif r >= Probabilities['A'] and r <= (Probabilities['A'] + Probabilities['C']):
                letter = 'C'
            elif r >= (Probabilities['A'] + Probabilities['C']) and r <= (Probabilities['A'] + Probabilities['C'] + Probabilities['G']):
                letter = 'G'
            elif r >= (Probabilities['A'] + Probabilities['C'] + Probabilities['G']) and r <= (Probabilities['A'] + Probabilities['C'] + Probabilities['G'] + Probabilities['T']):
                letter = 'T'
        return letter

I need a function that randomly selects events with given probabilities. To set up the intervals, I look up each probability by its key. However, the keys may have other names, not necessarily ACGT, and then my function will not work. That is the problem.

Use the PEP 8 naming convention unless you have a particular reason not to.

Answer 1 · 2016-12-05T05:22:59

Comments to your code:

Your random numbers lie in the range from 0 to 1, so there is no need to check that r >= 0 (this condition always holds).
If the condition r <= x was not satisfied, the following elif no longer needs to check that r > x (you have r >= x).
You can simply loop over the key-value pairs, accumulating the value against which the random number is compared at each step. As soon as the random number falls within the range, return the corresponding key.

My version of the implementation:

    import random

    probabilities = {'A': 0.25, 'C': 0.25, 'G': 0.25, 'T': 0.25}

    def WeightedDie(prob):
        r = random.uniform(0, 1)
        x = 0
        for letter, p in prob.items():
            x += p
            if r <= x:
                return letter

    print(WeightedDie(probabilities))

In practice, it is better to use a ready-made implementation, for example the one proposed by @jfs.

It is better to use random.random() instead of random.uniform(0, 1), and r < x (a strict inequality that excludes the right endpoint).
By the way, in general the weights can be summed without loss of precision using an algorithm similar to math.fsum().
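The suggestions in these comments can be put together roughly as follows. This is only a sketch: the name weighted_die and the fallback to the last key are assumptions made for illustration, and math.fsum() is used here only for the total weight, while the running sum is still accumulated naively.

```python
import math
import random


def weighted_die(prob):
    """Return a key of `prob`, chosen with probability proportional to its value."""
    # math.fsum() gives an accurately rounded total; scaling by it means
    # the weights do not have to sum to exactly 1.
    r = random.random() * math.fsum(prob.values())
    x = 0.0
    letter = None
    for letter, p in prob.items():
        x += p
        if r < x:  # strict inequality: the right endpoint is excluded
            return letter
    # Rounding in the running sum can leave r >= x after the last step;
    # fall back to the last key (None for an empty dictionary).
    return letter


print(weighted_die({'A': 0.25, 'C': 0.25, 'G': 0.25, 'T': 0.25}))
```

Because the function only iterates over prob.items(), it does not depend on what the keys are called.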

Community spirit ♦ one · Answer 2 · 2016-12-04T19:32:18

To select a random value from a non-uniform distribution defined by the given weights, you can use a weighted_choice(weights) function that returns a random index:

    def WeightedDie(Probabilities):  #XXX non pep-8 names!
        letters = list(Probabilities)
        weights = Probabilities.values()
        return letters[weighted_choice(weights)]
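The body of weighted_choice() is not shown in this answer. A minimal sketch of what such a function might look like, assuming it takes an iterable of non-negative weights and returns an index, is:

```python
import random


def weighted_choice(weights):
    """Return a random index i with probability weights[i] / sum(weights).

    Hypothetical implementation; the original function is not shown here.
    """
    weights = list(weights)  # accept any iterable, e.g. dict.values()
    r = random.random() * sum(weights)
    for i, w in enumerate(weights):
        r -= w
        if r < 0:
            return i
    # Guard against floating-point rounding in the subtraction above.
    return len(weights) - 1


probabilities = {'A': 0.25, 'C': 0.25, 'G': 0.25, 'T': 0.25}
letters = list(probabilities)
print(letters[weighted_choice(probabilities.values())])
```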

This approach is O(n) in memory and time (which, for n = 4, is most likely efficient enough). In the general case, with large n, instead of summing all the weights on every call, one can compute the partial sums once, as shown in the documentation for the random module:

    >>> import itertools
    >>> weighted_choices = {'A': 0.25, 'C': 0.25, 'G': 0.25, 'T': 0.25}
    >>> choices, weights = zip(*weighted_choices.items())
    >>> cumdist = list(itertools.accumulate(weights))
    >>> cumdist
    [0.25, 0.5, 0.75, 1.0]

Then, each choice requires only O(log n) instead of O(n) steps, using the bisect module that implements a binary search on a sorted sequence:

    >>> import bisect
    >>> import random
    >>> x = random.random() * cumdist[-1]
    >>> choices[bisect.bisect(cumdist, x)]
    'G'
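To reuse the partial sums across many draws, the two interactive snippets above can be wrapped in a helper. The following is a sketch; make_sampler is a hypothetical name, and the min() guard is an extra precaution that the snippets above do not include.

```python
import bisect
import itertools
import random


def make_sampler(weighted_choices):
    """Precompute the partial sums once and return a function that draws one key."""
    choices, weights = zip(*weighted_choices.items())
    cumdist = list(itertools.accumulate(weights))

    def draw():
        x = random.random() * cumdist[-1]
        # Binary search: O(log n) per draw. min() guards the upper bound.
        return choices[min(bisect.bisect(cumdist, x), len(choices) - 1)]

    return draw


draw = make_sampler({'A': 0.25, 'C': 0.25, 'G': 0.25, 'T': 0.25})
print([draw() for _ in range(5)])
```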

In Python 3.6 and later, this can be written as random.choices(choices, weights)[0] or, equivalently, using the names from the code in the question:

    import random

    def WeightedDie(Probabilities):
        return random.choices(*zip(*Probabilities.items()))[0]

You can request several values at once (pass the k keyword argument) and pass cum_weights explicitly so that the cumulative weights are not recalculated on every call.
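For example, with the probabilities from the question (the variable names here are chosen for illustration only):

```python
import itertools
import random

probabilities = {'A': 0.25, 'C': 0.25, 'G': 0.25, 'T': 0.25}

letters = list(probabilities)
# Compute the cumulative weights once and reuse them for every call.
cum_weights = list(itertools.accumulate(probabilities.values()))

# Draw ten letters in a single call.
sample = random.choices(letters, cum_weights=cum_weights, k=10)
print(sample)
```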

While the implementation of random.choices() was being discussed, more complex methods were also considered, such as the alias method, which yields random values in O(1) after O(n log n) or O(n) initialization.
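For reference, here is a rough sketch of the alias method with O(n) initialization (the variant usually attributed to Vose). The function names are hypothetical, and this is not the implementation that random.choices() uses.

```python
import random


def build_alias_table(weights):
    """Build the probability and alias tables in O(n)."""
    n = len(weights)
    total = sum(weights)
    # Scale the weights so that their average is exactly 1.
    scaled = [w * n / total for w in weights]
    prob = [1.0] * n
    alias = list(range(n))
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s = small.pop()
        g = large.pop()
        prob[s] = scaled[s]  # keep index s with this probability
        alias[s] = g         # otherwise redirect the draw to index g
        scaled[g] -= 1.0 - scaled[s]
        (small if scaled[g] < 1.0 else large).append(g)
    # Indices left over keep prob 1.0 and alias to themselves.
    return prob, alias


def alias_draw(prob, alias):
    """Draw one index in O(1): pick a column, then flip a biased coin."""
    i = random.randrange(len(prob))
    return i if random.random() < prob[i] else alias[i]


probabilities = {'A': 0.1, 'C': 0.2, 'G': 0.3, 'T': 0.4}
letters = list(probabilities)
prob, alias = build_alias_table(list(probabilities.values()))
print(letters[alias_draw(prob, alias)])
```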
