There is an area with a bunch of scattered points; we need to unite in the area those points that are at a distance less than a certain constant from each other.
Something like that:
There is an area with a bunch of scattered points; we need to unite in the area those points that are at a distance less than a certain constant from each other.
Something like that:
The use of the distance matrix as a solution to the clustering problem in many cases is justified, although it leads, in general, to redundant memory reservation due to the so-called data mirroring effect. However, consider this method.
In the question tags there is no mention of the implementation language, and therefore, probably, it is permissible to use any, in particular C ++.
To begin with, we will create a structure traditionally used in such cases (it is possible and a class, but there is no special sense):
struct Point { Point() : _x(0), _y(0) {} Point(int x, int y) : _x(x), _y(y) {} int _x, _y; }; We will experiment with the same points that are indicated in the question (of course, any others are possible):
std::vector<Point> pnts; pnts.push_back(Point(45,45)); pnts.push_back(Point(0,0)); pnts.push_back(Point(21,21)); pnts.push_back(Point(29,29)); pnts.push_back(Point(5,5)); pnts.push_back(Point(33,33)); pnts.push_back(Point(40,40)); pnts.push_back(Point(80,80)); pnts.push_back(Point(20,20)); pnts.push_back(Point(2,2)); pnts.push_back(Point(25,25)); The distance matrix is ββcreated by enumerating each point with each and saving, in fact, the distance between them:
const int n = pnts.size(); std::vector<std::vector<int> > distances(n); for(int i = 0; i < n; ++i) { const Point &p1 = pnts.at(i); // ΠΠΎ ΡΠΌΠΎΠ»ΡΠ°Π½ΠΈΡ Π·Π°ΠΏΠΎΠ»Π½ΡΠ΅ΠΌ ΠΊΠ°ΠΆΠ΄ΡΡ ΠΊΠΎΠ»ΠΎΠ½ΠΊΡ Π½ΡΠ»ΡΠΌΠΈ. std::vector<int> dist(n,0); for(int j = 0; j < n; ++j) { const Point &p2 = pnts.at(j); if(i == j) { // ΠΡΠ»ΠΈ ΡΠ΅ΡΡ ΠΎΠ± ΠΎΠ΄Π½ΠΎΠΉ ΠΈ ΡΠΎΠΉ ΠΆΠ΅ ΡΠΎΡΠΊΠ΅, // ΡΠΎ ΠΈ Π½Π΅ΡΠ΅Π³ΠΎ Π²ΡΡΠΈΡΠ»ΡΡΡ, ΡΠ°ΡΡΡΠΎΡΠ½ΠΈΠ΅ - 0. continue; } else if(i > j) { // ΠΡΠ»ΠΈ ΡΠΆΠ΅ ΡΠ°Π½Π΅Π΅ ΡΡΠΏΠ΅Π»ΠΈ Π²ΡΡΠΈΡΠ»ΠΈΡΡ ΡΠ°ΡΡΡΠΎΡΠ½ΠΈΠ΅ ΠΌΠ΅ΠΆΠ΄Ρ // Π΄Π²ΡΠΌΡ ΡΠΎΡΠΊΠ°ΠΌΠΈ, ΡΠΎ ΠΏΡΠΎΡΡΠΎ ΠΊΠΎΠΏΠΈΡΡΠ΅ΠΌ Π·Π½Π°ΡΠ΅Π½ΠΈΠ΅. dist[j] = distances[j][i]; } else { // Π‘ΠΎΠ±ΡΡΠ²Π΅Π½Π½ΠΎ, Π²ΡΡΠΈΡΠ»Π΅Π½ΠΈΠ΅ ΡΠ°ΡΡΡΠΎΡΠ½ΠΈΡ. dist[j] = std::sqrt(std::pow(p1._x-p2._x,2) + std::pow(p1._y-p2._y,2)); } } distances[i] = dist; } The resulting matrix on the existing data set will look like this:
0 63 33 22 56 16 7 49 35 60 28 63 0 29 41 7 46 56 113 28 2 35 33 29 0 11 22 16 26 83 1 26 5 22 41 11 0 33 5 15 72 12 38 5 56 7 22 33 0 39 49 106 21 4 28 16 46 16 5 39 0 9 66 18 43 11 07 56 26 15 49 9 0 56 28 53 21 49 113 83 72 106 66 56 0 84 110 77 35 28 1 12 21 18 28 84 0 25 7 60 2 26 38 4 43 53 110 25 0 32 28 35 5 5 28 11 21 77 7 32 0 As is well seen, it is mirror, and therefore redundant, and it is this factor that can be the main reason for the impossibility of using the method in question.
I deliberately do not use double type for distance storage, since accuracy with an integer ( int ) is more than enough, plus some memory savings, plus this same vector can be easily used as an index matrix.
The matrix of indices serves exactly the task of reducing the number of iterations of an array of points. This is actually a map of indices, sorted by distance. Thus, it turns out that for each point the closest to it other points will always be in the first places.
// ΠΠ΄Π΅ΡΡ ΠΌΠΎΠΆΠ½ΠΎ ΡΡΠΊΠΎΠ½ΠΎΠΌΠΈΡΡ ΠΈ Π½Π΅ ΡΠΎΠ·Π΄Π°Π²Π°ΡΡ ΠΎΡΠ΄Π΅Π»ΡΠ½ΡΠΉ Π²Π΅ΠΊΡΠΎΡ // ΠΏΠΎΠ΄ ΠΌΠ°ΡΡΠΈΡΡ ΠΈΠ½Π΄Π΅ΠΊΡΠΎΠ², ΠΏΡΠΎΡΡΠΎ ΠΏΠ΅ΡΠ΅Π·Π°ΠΏΠΈΡΠ°Π² Π·Π½Π°ΡΠ΅Π½ΠΈΡ Π²Π΅ΠΊΡΠΎΡΠ° ΡΠ°ΡΡΡΠΎΡΠ½ΠΈΠΉ, // Π΅ΡΠ»ΠΈ ΠΊΠΎΠ½Π΅ΡΠ½ΠΎ Π½Π΅ ΠΏΠ»Π°Π½ΠΈΡΡΠ΅ΡΡΡ Π΅Π³ΠΎ ΠΈΡΠΏΠΎΠ»ΡΠ·ΠΎΠ²Π°ΡΡ Π΄Π°Π»Π΅Π΅. std::vector<std::vector<int> > indexes(n); // Π¨Π°Π±Π»ΠΎΠ½ ΠΊΠΎΠ»ΠΎΠ½ΠΊΠΈ ΠΈΠ½Π΄Π΅ΠΊΡΠΎΠ². std::vector<int> tpl(n); for(int i = 0; i < n; ++i) tpl[i] = i; for(int i = 0; i < n; ++i) { std::vector<int> vct = tpl; const std::vector<int> &dist = distances[i]; std::sort(vct.begin(), vct.end(), [&dist](int i1, int i2) { return dist[i1] < dist[i2]; }); indexes[i] = vct; } The result is the following set of indices of points:
0 6 5 3 10 2 8 7 4 9 1 1 9 4 8 2 10 3 5 6 0 7 2 8 10 3 5 4 6 9 1 0 7 3 5 10 2 8 6 0 4 9 1 7 4 9 1 8 2 10 3 5 6 0 7 5 3 6 10 0 2 8 4 9 1 7 6 0 5 3 10 2 8 4 9 1 7 7 0 6 5 3 10 2 8 4 9 1 8 2 10 3 5 4 9 1 6 0 7 9 1 4 8 2 10 3 5 6 0 7 10 2 3 8 5 6 0 4 9 1 7 The first column (index "0") of the matrix under consideration contains the indices of points in relation to themselves. The distance here is obviously zero and it is for this reason that after sorting the cluster formation cycle can be started from the column under the index "1".
The second column (index "1") is the index of the closest point with respect to the point with the index in the first column. Similarly, the contents of all subsequent columns, but with increasing distance.
Having on hand a matrix of sorted indexes, nothing more remains, how to proceed, in fact, to clustering.
The problem's author has a feature that requires consideration of different maxima for horizontal and vertical pixel distances:
const int hrz_max = 10; const int vrt_max = 10; Clustering is convenient to perform recursively:
void check(const std::vector<Point> &pnts , const std::vector<std::vector<int> > &indexes , int pnt_idx, std::vector<int> &d) { const Point &p1 = pnts[pnt_idx]; d.push_back(pnt_idx); // ... ΡΠΎ Π²ΡΠΎΡΠΎΠΉ ΠΊΠΎΠ»ΠΎΠ½ΠΊΠΈ. for(int i = 1, n = pnts.size(); i < n; ++i) { const int pnt2_idx = indexes[pnt_idx][i]; if(std::find(d.begin(),d.end(),pnt2_idx) != d.end()) continue; const Point &p2 = pnts[pnt2_idx]; if(std::abs(p1._x-p2._x) <= hrz_max && std::abs(p1._y-p2._y) <= vrt_max) { check(pnts, indexes, pnt2_idx, d); } else { // ΠΠ΅Π·Π°ΡΠ΅ΠΌ ΠΊΡΡΡΠΈΡΡ ΡΠΈΠΊΠ» Π΄Π°Π»Π΅Π΅, Ρ.ΠΊ. ΡΠ°ΡΡΡΠΎΡΠ½ΠΈΡ // ΠΎΡΡΠΎΡΡΠΈΡΠΎΠ²Π°Π½Ρ ΠΏΠΎ Π²ΠΎΠ·ΡΠ°ΡΡΠ°Π½ΠΈΡ Π·Π½Π°ΡΠ΅Π½ΠΈΡ. break; } } } Function Arguments:
const std::vector<Point> &pnts - source point vector;
const std::vector<std::vector<int> > &indexes - matrix of indices;
int pnt_idx - index of the point for which we are forming the cluster;
std::vector<int> &d is a vector that will contain indices of points of a single cluster.
wrote my bike in javascript, can anyone else come in handy
<script type="text/javascript"> /* ΠΏΡΠ΅Π²Π΄ΠΎΠΊΠ»Π°ΡΡ ΡΠΎΡΠΊΠΈ */ function Point(x,y) { this._x = x; this._y = y; } Point.prototype.getX = function() { return this._x; } Point.prototype.getY = function() { return this._y; } /* ΡΠ°ΡΡΠ΅ΡΡ */ var points = new Array(); // ΠΌΠ°ΡΡΠΈΠ² ΡΠΎΡΠ΅ΠΊ points.push(new Point(45, 45)); points.push(new Point(0, 0)); points.push(new Point(21, 21)); points.push(new Point(29, 29)); points.push(new Point(5, 5)); points.push(new Point(33, 33)); points.push(new Point(40, 40)); points.push(new Point(80, 80)); points.push(new Point(20, 20)); points.push(new Point(2, 2)); points.push(new Point(25, 25)); const mX = 5; // ΠΌΠ°ΠΊΡΠΈΠΌΠ°Π»ΡΠ½ΠΎΠ΅ ΡΠ°ΡΡΠΎΡΠ½ΠΈΠ΅ ΠΏΠΎ X const mY = 5; // ΠΌΠ°ΠΊΡΠΈΠΌΠ°Π»ΡΠ½ΠΎΠ΅ ΡΠ°ΡΡΠΎΡΠ½ΠΈΠ΅ ΠΏΠΎ Y /* ΡΠ°ΡΡΠΎΡΠ½ΠΈΠ΅ ΠΌΠ΅ΠΆΠ΄Ρ Π΄Π²ΡΠΌΡ ΡΠΎΡΠΊΠ°ΠΌΠΈ */ function calcDistance(p1, p2) { return Math.sqrt(Math.pow(p1.getX() - p2.getX(), 2) + Math.pow(p1.getY() - p2.getY(), 2)) } /* ΡΡΠ»ΠΎΠ²ΠΈΡ ΠΎΠ±ΡΠ΅Π΄ΠΈΠ½Π΅Π½ΠΈΡ */ function isNear(p1, p2, mX, mY) { return Math.abs(p1.getX() - p2.getX()) <= mX && Math.abs(p1.getY() - p2.getY()) <= mY ? true : false; } /* ΡΠ°ΡΡΠ΅Ρ ΠΌΠ°ΡΡΠΈΡΡ ΡΠ°ΡΡΡΠΎΡΠ½ΠΈΠΉ */ function calcMasDistances(mas) { var l = mas.length; var r = []; for (var i1 = 0; i1 < l; i1++) { r[i1] = []; for (var i2 = 0; i2 < l; i2++) { if (i1 == i2) { r[i1][i2] = 0; } else if (i1 < i2) { r[i1][i2] = calcDistance(mas[i1], mas[i2]); } else { r[i1][i2] = r[i2][i1]; } } } return r; } /* ΡΠΌΠ΅Π½Π° ΡΠ»Π΅ΠΌΠ΅Π½ΡΠΎΠ² ΠΌΠ°ΡΡΠΈΠ²Π° ΠΌΠ΅ΡΡΠ°ΠΌΠΈ */ function swap(n1, n2, mas) { var x; x = mas[n1]; mas[n1] = mas[n2]; mas[n2] = x; return mas; } /* ΡΠ°ΡΡΠ΅Ρ ΠΌΠ°ΡΡΠΈΡΡ ΠΈΠ½Π΄Π΅ΠΊΡΠΎΠ² Π±Π»ΠΈΠ·ΠΎΡΡΠΈ */ function calcMasIndexes(mas) { var l = mas.length; var r = []; var n = []; for (var i = 0; i < l; i++) { n[i] = i; } for (var i = 0; i < l; i++) { var ns = n.slice(); var ds = mas[i].slice(); for (var i1 = 0; i1 < l - 1; i1++) { for (var i2 = i1 + 1; i2 < l; i2++) { if (ds[i1] > ds[i2]) { ns = swap(i1, i2, ns); ds = swap(i1, i2, ds); } } } r[i] = ns.slice(); } return r; } /* ΡΡΡΠ΅ΡΡΠ²ΠΎΠ²Π°Π½ΠΈΠ΅ ΡΠ»Π΅ΠΌΠ΅Π½ΡΠ° Π² ΠΌΠ°ΡΡΠΈΠ²Π΅ */ function exist(n, mas) { for (var i = 0; i < mas.length; i++) { if (n == mas[i]) { return true; } } return false; } /* ΠΊΠ»Π°ΡΡΠ΅ΡΠΈΠ·Π°ΡΠΈΡ */ function clastering(points, indexes, used, pn, cl) { if (!used[pn]) { cl[cl.length] = pn; used[pn] = true; for (var i = 1; i < points.length; i++) { var pn2 = indexes[pn][i]; if (exist(pn2, cl)) { continue; } // ΡΡΠ»ΠΎΠ²ΠΈΠ΅ Π²ΡΡ
ΠΎΠ΄Π° ΠΈΠ· ΡΠΈΠΊΠ»Π° if(isNear(points[pn], points[pn2], mX, mY)) { clastering(points, indexes, used, pn2, cl); } else { break; } } } } var distances = calcMasDistances(points); // ΠΌΠ°ΡΡΠΈΡΠ° ΡΠ°ΡΡΠΎΡΠ½ΠΈΠΉ var indexes = calcMasIndexes(distances); // ΠΌΠ°ΡΡΠΈΡΠ° ΠΈΠ½Π΄Π΅ΠΊΡΠΎΠ² var used = []; // ΠΌΠ°ΡΡΠΈΠ² Π·Π°Π΄Π΅ΠΉΡΡΠ²ΠΎΠ²Π°Π½Π½ΡΡ
ΡΠ»Π΅ΠΌΠ΅Π½ΡΠΎΠ² for (var i = 0; i < points.length; i++) { used[i] = false; } var clasters = []; // ΠΌΠ°ΡΡΠΈΡΠ° ΠΊΠ»Π°ΡΡΠ΅ΡΠΎΠ² /* ΠΊΠ»Π°ΡΡΠ΅ΡΠΈΠ·Π°ΡΠΈΡ */ for (var i = 0; i < points.length; i++) { var cl = []; clastering(points, indexes, used, i, cl); if (cl.length > 0) { clasters.push(cl.slice()); } } /* Π²ΡΠ²ΠΎΠ΄ */ distances.forEach(function(item1, i1, distances) { item1.forEach(function(item2, i2, item1) { document.write(Math.round(parseFloat(item2)) + (i2 != item1.length - 1 ? ',' : '')); }); document.write('</br>'); }); document.write('<hr>'); indexes.forEach(function(item1, i1, indexes) { document.write(item1 + '</br>'); }); document.write('<hr>'); clasters.forEach(function(item1, i1, clasters) { document.write(item1 + '</br>'); }); </script> Conclusion:
distance matrix
0,64,34,23,57,17,7,49,35,61,28 64,0,30,41,7,47,57,113,28,3,35 34,30,0,11,23,17,27,83,1,27,6 23,41,11,0,34,6,16,72,13,38,6 57,7,23,34,0,40,49,106,21,4,28 17,47,17,6,40,0,10,66,18,44,11 7,57,27,16,49,10,0,57,28,54,21 49,113,83,72,106,66,57,0,85,110,78 35,28,1,13,21,18,28,85,0,25,7 61,3,27,38,4,44,54,110,25,0,33 28,35,6,6,28,11,21,78,7,33,0 proximity matrix
0,6,5,3,10,2,8,7,4,9,1 1,9,4,8,2,10,3,5,6,0,7 2,8,10,3,5,4,9,6,1,0,7 3,5,10,2,8,6,0,4,9,1,7 4,9,1,8,2,10,3,5,6,0,7 5,3,6,10,0,2,8,4,9,1,7 6,0,5,3,10,2,8,4,9,7,1 7,0,6,5,3,10,2,8,4,9,1 8,2,10,3,5,4,9,6,1,0,7 9,1,4,8,2,10,3,5,6,0,7 10,3,2,8,5,6,0,4,9,1,7 cluster matrix
0,6 1,9,4 2,8,10,3,5 7 Source: https://ru.stackoverflow.com/questions/524772/
All Articles