I compared all the images pairwise and got an N x N matrix of numbers (scaled to 0-254 for convenience). How do I now arrange the images so that similar ones end up next to each other?

    int r[3][3] = { { 255, 100, 93 }, { 100, 255, 113 }, { 93, 113, 255 } };

In this example the order 1, 0, 2 seems to be correct. The number of images n will be less than 100,000. The numbers in the array say how different the images are, i.e. the first row holds 0 vs 0 (a special case, = 255), 0 vs 1, 0 vs 2, and so on.

Option 1: for each image, a search function looks for the most similar insertion point. It works not badly at all, better than exhaustive search (brute force). On a long sequence the result was interesting overall, like a preview or a rough cut. It would be better, of course, if the result were one continuous piece; as it is, the resulting scenes ought to connect at their borders but do not (either a bug, or the comparison is not exact).

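    of.push_back(0);
    for (int i = 1; i < n; i++) {
        // Find the slot for image j: it is always inserted AFTER some element,
        // so it can never land in front of of[0].
        auto find_bm = [&](int j) {
            // { slot, best sum so far }; only sums below 255 can replace slot 0
            std::pair<int, int> m = { 0, 255 };
            for (int i = 0; i < of.size(); i++) {
                // x: difference to the left neighbour, y: to the right one
                // (at the end of the chain there is no right neighbour, so y = x)
                int _1 = of[i], x = r[j][_1], _2, y = x;
                if (i < of.size() - 1) _2 = of[i + 1], y = r[j][_2];
                if (m.second > x + y) m = { i, x + y };
            }
            return of.begin() + m.first + 1;
        };
        auto it = find_bm(i);
        of.emplace(it, i);
    }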
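
For comparison, here is a minimal sketch of a closely related greedy heuristic, usually called "cheapest insertion". It is not the code above: the names `Matrix` and `cheapest_insertion_order` are made up for illustration, and it assumes the matrix is symmetric and stored as a vector of vectors. The differences are that the front and the back of the chain are also candidate slots, and that a slot between neighbours a and b is scored by how much the chain gets longer (`r[a][img] + r[img][b] - r[a][b]`).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// r[i][j] = how different images i and j are (smaller = more similar).
using Matrix = std::vector<std::vector<int>>;

// Hypothetical helper: insert each image where it lengthens the chain the least.
std::vector<int> cheapest_insertion_order(const Matrix& r) {
    assert(!r.empty());
    const int n = static_cast<int>(r.size());
    std::vector<int> order = { 0 };
    for (int img = 1; img < n; ++img) {
        // Candidate 1: put img in front of the chain.
        std::size_t best_pos = 0;
        int best_cost = r[img][order.front()];
        // Candidate 2: append img to the end of the chain.
        if (r[order.back()][img] < best_cost) {
            best_cost = r[order.back()][img];
            best_pos = order.size();
        }
        // Candidate 3: split an existing neighbour pair a-b into a-img-b.
        for (std::size_t k = 0; k + 1 < order.size(); ++k) {
            const int a = order[k], b = order[k + 1];
            const int cost = r[a][img] + r[img][b] - r[a][b];
            if (cost < best_cost) {
                best_cost = cost;
                best_pos = k + 1;
            }
        }
        order.insert(order.begin() + static_cast<std::ptrdiff_t>(best_pos), img);
    }
    return order;
}
```

On the 3 x 3 example from the question this gives the order 1, 0, 2. It is still a heuristic with quadratic running time, so it does not guarantee the best possible chain.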

Matrix creation. The code is tightly integrated with the rest of my project, so I hope this excerpt is clear enough:

    srcc = avs_vfbuf_t(new avs_vfbuf(_srcc, env));
    cv::Ptr<cv::img_hash::ImgHashBase> func = cv::img_hash::BlockMeanHash::create();
    std::vector<cv::Mat> img_hash(n);
    for (int i = 0; i < n; i++) {
        avs_vf_t srcf = srcc->get(i);
        s_vf_plane srcp = srcf->plane();
        cv::Mat img(srcp.h, srcp.w, CV_8U, srcp.p(), srcp.pitch);
        func->compute(img, img_hash[i]);
    }
    for (int i = 0; i < n - 1; i++) {
        for (int j = i + 1; j < n; j++)
            r[i][j] = r[j][i] = std::min(int(func->compare(img_hash[i], img_hash[j])), 254);
    }

This option (brute force over all permutations) does not work for more than ~15 images, and I'm not sure it is correct:

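    std::vector<int> of = { 0, 1, 2 };  // must start sorted for next_permutation
    std::pair<int, std::vector<int>> bm = { INT32_MAX, of };  // { best total, best order }
    do {
        // total difference between neighbours for this permutation
        int _ = 0;
        for (int i = 1; i < n; i++) _ += r[of[i - 1]][of[i]];
        if (_ < bm.first) bm = { _, of };
    } while (std::next_permutation(of.begin(), of.end()));
    of = bm.second;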

    2 answers

    Bitwise comparison of images is akin to comparing two people atom by atom to decide whether they look alike. Neither makes sense, and both will work only for absolutely identical objects. Imagine the camera shot a frame, then moved a couple of centimeters (or turned a couple of degrees) and immediately shot a second frame. Do the pictures look alike? Without a doubt: anyone who looks will say "the same", unless they peer really closely. But a bit-by-bit comparison will give complete nonsense, with no matches at all. Likewise, an atom-by-atom comparison of two twins will give nonsense: one caught a cold, the other ate something bad, and they will have a minimum of matches, although they look very, very similar.

    There are many methods of comparing images: based on color balance (histograms), on contour features, on key points, and so on. The most common (but not the best) method is key-point comparison; it is covered in detail, with examples, in the OpenCV documentation.

    • No, the comparison quality is quite sufficient. The problem is not in comparing but in sorting/ordering. I tried several ways and it turns out badly. The best is full brute force, or the old method, which I will add to the question. The main thing in the question is "that similar ones should be near"; the rest is not so important - J. Doe
    • I don't know what you are doing there or what counts as good for you)) To find just more-or-less similar images, compute N average colors (a histogram for each picture, i.e. a vector of dimension N) and compare the vectors, for example by minimum Euclidean distance. - Eugene Bartosh
    • Yes, I do find them)) (through OpenCV's img_hash, by the way). Only afterwards it turns out that, for example, 1 looks like 2 and like 3, but 3 does not look like 2, or something like that; I didn't even understand the reason. I need to sort them "in ascending order") - J. Doe
    • You need to understand what this hash is. I have never used img_hash, so I don't know how it is formed; if you want, here is the primary source: phash.org/docs/pubs/thes_zauner.pdf :-))) - Eugene Bartosh
    • "The vector is very glad" (the vector is very smooth :-)) Be glad yourself, please :-)) The terms programmers invent when they have to deal with mathematics... serves you right for forgetting everything after uni) P.S. Next, give me a deep triangle and a shaggy square - Eugene Bartosh
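
    The histogram suggestion from the comments can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses a grey-level histogram rather than average colors, takes the pixels as a flat byte array, and the names `grey_histogram`, `euclidean_distance` and the `bins` parameter are made up here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Normalised grey-level histogram: element k is the share of pixels in bucket k.
// `bins` (the N from the comment) is a hypothetical parameter.
std::vector<double> grey_histogram(const std::vector<std::uint8_t>& pixels, int bins) {
    assert(bins > 0 && bins <= 256 && !pixels.empty());
    std::vector<double> hist(static_cast<std::size_t>(bins), 0.0);
    for (std::uint8_t p : pixels) hist[static_cast<std::size_t>(p * bins / 256)] += 1.0;
    for (double& v : hist) v /= static_cast<double>(pixels.size());
    return hist;
}

// Euclidean distance between two histograms of the same dimension.
double euclidean_distance(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return std::sqrt(sum);
}
```

    A smaller distance means more similar pictures, so the result could fill the same kind of N x N difference matrix as in the question. It does not solve the ordering problem by itself.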

    I believe that, in general, trying to turn a bunch of images into a linear order will not give an adequate result. I would divide the images into similarity groups instead.

    This option (brute force over all permutations) does not work for more than ~15 images, and I'm not sure it is correct

    I didn't understand what this code is doing. To reduce the total, I would try swapping images in the sequence with one another and keeping the swaps that help. However, on a large number of images (already around 15) this is unlikely to work adequately.
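
    One possible reading of the swap idea, as a minimal sketch: try exchanging two positions and keep the exchange only if the total difference between neighbours goes down. The names `Matrix`, `chain_cost` and `improve_by_swaps` are made up for illustration. The total is recomputed from scratch after every swap for clarity, which is far too slow for tens of thousands of images.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// r[i][j] = how different images i and j are (smaller = more similar).
using Matrix = std::vector<std::vector<int>>;

// Sum of differences between neighbouring images in the chain.
long long chain_cost(const Matrix& r, const std::vector<int>& order) {
    long long total = 0;
    for (std::size_t i = 1; i < order.size(); ++i) total += r[order[i - 1]][order[i]];
    return total;
}

// Hypothetical helper: pairwise-swap local search, stops at a local minimum.
void improve_by_swaps(const Matrix& r, std::vector<int>& order) {
    assert(order.size() <= r.size());
    long long best = chain_cost(r, order);
    bool improved = true;
    while (improved) {
        improved = false;
        for (std::size_t i = 0; i < order.size(); ++i) {
            for (std::size_t j = i + 1; j < order.size(); ++j) {
                std::swap(order[i], order[j]);
                const long long cost = chain_cost(r, order);
                if (cost < best) {
                    best = cost;        // keep the swap
                    improved = true;
                } else {
                    std::swap(order[i], order[j]);  // undo the swap
                }
            }
        }
    }
}
```

    Starting from 0, 1, 2 on the matrix from the question, this ends at 1, 0, 2 with a total of 193. In general it only finds a local minimum, not necessarily the best order.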

    Also, why use int if you reduce the values to 0-254? In that case uchar is more logical (although it will not speed up the calculations).

    In general, I advise you to think about dividing the images into 5-10 similarity groups and only then lining them up within each group. Besides, building a full matrix of 100 by 100 images takes a lot of time, while dividing into similarity groups takes less (there is no need to fill the table completely).
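
    As a minimal sketch of grouping without a full table, here is one simple scheme (often called "leader" clustering), which is an assumption of mine and not necessarily what is meant above. Each image is compared only with one representative per group, so most pairs are never computed. The function name, the `dist` callback and `threshold` are hypothetical; the number of groups depends on the threshold rather than being fixed at 5-10.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Returns the group index of every image. dist(i, j) is a caller-supplied
// difference function; an image joins the first group whose leader is within
// `threshold`, otherwise it becomes the leader of a new group.
std::vector<int> group_by_leader(int n, const std::function<int(int, int)>& dist,
                                 int threshold) {
    assert(n >= 0);
    std::vector<int> leaders;                                 // one representative per group
    std::vector<int> group(static_cast<std::size_t>(n), -1);  // -1 = not assigned yet
    for (int i = 0; i < n; ++i) {
        for (std::size_t g = 0; g < leaders.size(); ++g) {
            if (dist(i, leaders[g]) <= threshold) {
                group[static_cast<std::size_t>(i)] = static_cast<int>(g);
                break;
            }
        }
        if (group[static_cast<std::size_t>(i)] < 0) {
            group[static_cast<std::size_t>(i)] = static_cast<int>(leaders.size());
            leaders.push_back(i);
        }
    }
    return group;
}
```

    The result depends on the order in which images are visited and on the chosen threshold, so it is a rough first pass rather than a proper clustering.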

    • The matrix is created quickly, even at 40,000 images. The code reduces the total sum of differences, and I think that is incorrect: for example, the best-matching pair is selected and further ones are attached to it like train carriages; the error is that each is only slightly worse than the initial pair, but this way the chain can drift far, far away (probably; maybe it works correctly) - J. Doe
    • At 40,000 I don't believe it: that is 40 * 40 MB of memory. A matrix like that can only be filled quickly on a supercomputer. If the problem is at the beginning of the sequence, then add the possibility of inserting at the beginning. But again, it is impossible to line the images up adequately. I would rather try to place them in a 3-dimensional, or better 5-dimensional, space and then convert that into a vector. - Nikita Samoukov
    • 2 GB is fine (uint8_t); the complexity is quadratic, and either way it takes me ~15-25 minutes, i.e. a reasonable time. I think they can be lined up adequately; it is a matrix, after all. Splitting into groups is a fallback: there are usually no more than 2 similar frames, and they will still have to be sorted somehow afterwards - J. Doe
    • You're right. As I already wrote: for example, this person img.morphthing.com/i/40178073/2/0/add03860/… comes out 100% similar to both Tom Cruise and Bruce Willis, although those two are not similar to each other :-) There are a lot of similarity criteria... in some cases some criteria dominate, in others, others - Eugene Bartosh
    • @EugeneBartosh Tom first, that one in the middle, Bruce after: this is exactly the sorting I need) - J. Doe
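
    The memory figures argued about in the comments can be checked with a short sketch. For 40,000 images a full matrix of uint8_t takes 40000 * 40000 = 1,600,000,000 bytes (the "40 * 40 MB" above). Since the matrix is symmetric, storing only the pairs with i < j halves that. The class below is a hypothetical illustration, not code from the thread; it returns 255 on the diagonal to match the special case in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Bytes for a full n x n matrix of uint8_t.
constexpr std::uint64_t full_bytes(std::uint64_t n) { return n * n; }
// Bytes when only the pairs with i < j are stored.
constexpr std::uint64_t triangle_bytes(std::uint64_t n) { return n * (n - 1) / 2; }

// Symmetric difference matrix that stores only the upper triangle.
class TriangularMatrix {
public:
    explicit TriangularMatrix(std::size_t n) : n_(n), data_(n * (n - 1) / 2, 0) {
        assert(n > 0);
    }
    void set(std::size_t i, std::size_t j, std::uint8_t v) {
        assert(i != j);
        data_[index(i, j)] = v;
    }
    std::uint8_t get(std::size_t i, std::size_t j) const {
        return i == j ? std::uint8_t{ 255 } : data_[index(i, j)];
    }

private:
    // Row-major position of the pair (i, j) inside the upper triangle.
    std::size_t index(std::size_t i, std::size_t j) const {
        if (i > j) std::swap(i, j);
        assert(j < n_);
        return i * n_ - i * (i + 1) / 2 + (j - i - 1);
    }
    std::size_t n_;
    std::vector<std::uint8_t> data_;
};
```

    With this layout the 40,000-image case needs 799,980,000 bytes instead of 1,600,000,000. With int (4 bytes per entry) the full matrix would take four times as much as the uint8_t one.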