Which neural network should I choose for this problem?

Question

I recently began studying neural networks. As it turns out, there is a huge number of different types. I am not able to choose one on my own, so I'm asking for help here. Which neural network do I need?

The task is roughly as follows: the input is an image that contains four other images of the same size. They are all arranged in a single row, as in the example below.

There may or may not be gaps between them. The background can also vary.

There is a catalog of all possible images (~50) that can appear in the input image. The catalog looks like this:

The output should be the names of the files whose images match the ones found. In short, you need to find four images from the catalog in the input image and return their file names. In the input data, the images to be recognized may contain a little noise in the form of an inscription in the corner or in the center.

Thank you in advance.

If there really are only 50 samples and there will never be any others, it is better to do without a neural network.
Neural networks are designed to work with unknown input data (for example, recognizing a swallowtail butterfly), whereas your input data is known in advance.
Compute the image centers and compare their neighborhoods, for example.
Besides, you can always tag your images in a special way (with steganography, for example) =)

Accepted Answer · 2017-02-09T17:01:53

If you are working with images, then definitely OpenCV. The neural network module is just one of many there, and understandably so: an image needs to be properly prepared before it is fed into a neural network, and it is quite likely that the task can be solved without one (as is usually the case). Nevertheless, versions 3.1 and 3.2 paid some attention to the neural network module, so one can hope for further development.

As for your task, a neural network is definitely not needed. If you need to find pictures that match exactly, compute and compare the sums (or averages) of pixel colors by rows or by columns. If the match is not exact, you need to find and compare contours (findContours). If you need to find one picture inside another, use template matching (matchTemplate). The list could go on until about lunchtime tomorrow :-)
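The row/column comparison mentioned above could look roughly like this. It is only a sketch using NumPy, and it assumes each tile has already been cropped out of the input image and has the same size as the catalog images; `profile_signature` and `best_match` are made-up names for illustration. Averages are used instead of raw sums so the numbers stay on the 0..255 scale.

```python
import numpy as np


def profile_signature(image: np.ndarray) -> np.ndarray:
    """Concatenate the mean pixel value of every row and every column."""
    gray = image.astype(np.float64)
    if gray.ndim == 3:  # collapse color channels to a single intensity
        gray = gray.mean(axis=2)
    return np.concatenate([gray.mean(axis=1), gray.mean(axis=0)])


def best_match(tile: np.ndarray, catalog: dict[str, np.ndarray]) -> tuple[str, float]:
    """Return (name, distance) of the catalog image closest to the tile."""
    target = profile_signature(tile)
    distances = {
        name: float(np.abs(profile_signature(img) - target).mean())
        for name, img in catalog.items()
    }
    name = min(distances, key=distances.get)
    return name, distances[name]
```

A distance of 0 means the profiles are identical. A small inscription in a corner only shifts a few rows and columns, so the correct image should normally still have the smallest distance, but any acceptance threshold would have to be tuned on real data.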

Can findContours find the outline of an image if it is not rectangular but diamond-shaped, for example?
@Newbie, if findContours only searched for rectangles, it would be called findRectangles :-) An image itself cannot be diamond-shaped, but an image can contain a diamond-shaped figure.
To identify this figure, you need to have its edges outlined; edges, in turn, can be extracted in many ways: Canny, Laplacian, threshold, color filter, etc. For your task, look at template matching; there are many examples, such as docs.opencv.org/3.1.0/d4/dc6/tutorial_py_template_matching.html
I recently recommended this here: excellent video tutorials on detecting objects with OpenCV, intorobotics.com/how-to-detect-and-track-object-with-opencv
And here is what I would do in your place: scanning the template images every time is a bad idea (there may be 500 of them), so extract the keypoints and descriptors when a template is added and save them in a database (the format needs to be designed).
Then, when searching, the keypoints and descriptors are likewise extracted from the analyzed image once and compared in turn with the keypoints/descriptors in the database; choose the best result and check it against a minimum threshold.

Community spirit ♦ one · Answer 2 · 2017-02-10T06:36:31

Not an obvious answer from @Eugene Bartosh. When the task is image recognition, there is one very simple answer: a convolutional neural network. It is the best option and will work in all cases. When I read comments along the lines of "a neural network is not quite appropriate here" or "it is better not to use one at all", I am very surprised. For the most part, such "advisers" do not know neural networks, because there is no task where a neural network would be useless and unnecessary. Yes, in some cases a neural network may be overkill, but then you will not be relying on manual pixel comparison and other tricks that break immediately if the task changes even slightly (in other words, it scales). A neural network does not care at what coordinates the image is located or what size it is, and you absolutely DO NOT need to prepare the image for processing. You just feed a photo to the network, and if what you are looking for is there, the network will simply find it. You do not need to worry about how it does this; the main thing is to train it to. Everything you need to know about convolutional neural networks can be read in my answer to this question.

What you described at the link, as implemented in the TensorFlow library, is a primitive variety of the LBP and HAAR algorithms.
The "level of recognition" can be judged from their demo video: they did not even show face recognition... close-up, high-contrast inscriptions... well, not impressive at all... an API of 10 functions... but everything is "simple" and you don't need to prepare anything :-))
If you remove from your answer the criticism of another participant's neighboring answer (which you should not post as an answer), what remains is: "A convolutional neural network. Everything you need to know about convolutional neural networks can be read in my answer to this question."
"What size it is" - yeah, sure, because the input data can be of any size.
