I have a dictionary in which different keys map to vectors of different sizes. I need to find the difference between each value under one key and each value under every other key. The structure of the dictionary depends on the incoming dataset, i.e. the algorithm must work with values of varying sizes. More precisely, the keys are the indices of the points that are centroids, and the values are vectors of the indices of the points that belong to that cluster, in case this information helps :) I tried to use broadcasting. It worked on a toy example, but not on a real one.

in:

    from collections import defaultdict
    import numpy as np

    a = [5, 6, 6, 6, 7]
    b = [4,3,4,5,6,7,8,9,0,5,2,46]
    Centroids = defaultdict(list)
    # zip stops at the shorter list, so only the first 5 items of b are used
    for i, j in zip(a, b):
        Centroids[i].append(j)
    Centroids

out:

    defaultdict(list, {5: [4], 6: [3, 4, 5], 7: [6]})

in:

    k = []
    for i in list(Centroids.values()):
        k.append(np.array(i))
    # arrays of different lengths need dtype=object in recent NumPy versions
    print(np.array(k, dtype=object))
    a = []
    for i in range(3):
        for j in range(3):
            a.append(k[i] - k[j])
    print(a)

out:

    [array([4]) array([3, 4, 5]) array([6])]
    [array([0]), array([ 1, 0, -1]), array([-2]), array([-1, 0, 1]), array([0, 0, 0]), array([-3, -2, -1]), array([2]), array([3, 2, 1]), array([0])]

This is an example that did not work:

in:

    k = centers(data)

out:

    [array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 64, 65, 67, 68, 69, 70, 71, 72, 75]),
     array([21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40]),
     array([41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 73, 74]),
     array([59, 60, 63, 66]),
     array([61, 62])]

in:

    a = []
    for i in range(5):
        for j in range(5):
            a.append(k[i] - k[j])

This raises the error:

 ValueError: operands could not be broadcast together with shapes (29,) (20,) 

The question is: how can these differences be found? My failed attempts are shown above. Thank you in advance.

  • What difference are you trying to calculate? Between point indices? The distance between points? Or something else? - MaxU
  • @MaxU I'm having trouble with any operation between the values. The question is how to easily find the differences between the values of the keys in a case like this: {a: [1,2,3], b: [4,5]}. Find [1-4, 1-5, 2-4, 2-5, 3-4, 3-5]. - iamfina

1 answer

The answer to the question from the comment:

The question is how to easily find the differences between the values of the keys in a case like this: {a: [1,2,3], b: [4,5]}. Find [1-4, 1-5, 2-4, 2-5, 3-4, 3-5].

You can use NumPy broadcasting:

    In [299]: a = np.array([1,2,3])
    In [300]: b = np.array([4, 5])
    In [301]: (a[:, None] - b).ravel()
    Out[301]: array([-3, -4, -2, -3, -1, -2])

as a 2D array:

    In [302]: (a[:, None] - b)
    Out[302]:
    array([[-3, -4],
           [-2, -3],
           [-1, -2]])
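
The same trick might be extended to every pair of clusters in the dictionary. Subtracting two 1D arrays directly only works when their lengths match or one of them has length 1, which is probably why the toy example ran while shapes (29,) and (20,) failed; adding an axis with `[:, None]` makes any two lengths compatible. Below is a minimal sketch, assuming the clusters are stored as a plain dict of lists. The name `pairwise_differences` and the sample `clusters` dict are hypothetical, introduced only for illustration.

```python
from itertools import product

import numpy as np


def pairwise_differences(clusters):
    """Map each ordered pair of keys (ki, kj) to a 2D array whose
    element [m, n] equals clusters[ki][m] - clusters[kj][n].

    Hypothetical helper: not part of NumPy, just a sketch.
    """
    arrays = {key: np.asarray(values) for key, values in clusters.items()}
    return {
        # [:, None] turns shape (m,) into (m, 1), which broadcasts with (n,)
        (ki, kj): arrays[ki][:, None] - arrays[kj]
        for ki, kj in product(arrays, repeat=2)  # includes ki == kj, like the loops in the question
    }


# Hypothetical sample data taken from the comment in the question
clusters = {"a": [1, 2, 3], "b": [4, 5]}
diffs = pairwise_differences(clusters)

print(diffs[("a", "b")].ravel())  # [-3 -4 -2 -3 -1 -2]
print(diffs[("b", "a")].shape)    # (2, 3)
```

Each result keeps the 2D layout, so row `m` holds the differences for the `m`-th point of the first cluster; call `.ravel()` on an entry if a flat vector is preferred.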