Hello! At the very first stage, when the original image arrives at the input (for example, a photo of a letter), we need to slide an n * n window over it and multiply each window by the kernel (the convolution matrix) to build the feature maps. But nowhere is it written what values the kernel itself should contain, that is, which values the window is multiplied by. Can the matrix below be used as a convolution kernel for detecting edges?

[image: the matrix in question]

Also, if the input image is 30 * 30, can it be traversed with a 5 * 5 window, and is that enough to achieve optimal recognition accuracy?

Which convolution kernel should a region of the input image be multiplied by to get the highest recognition accuracy? Or are all values in the kernel matrix initially zero? May I also ask by what rule or formula the number of feature maps is determined? Or, if the task is to recognize the 26 letters of the English alphabet, should there be exactly 26 feature maps at each stage? Thank you in advance!

1 answer

First, fill the kernel with random values; then, during training, those values are adjusted.
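
To illustrate the idea, here is a minimal sketch in plain Python, not a real CNN. It uses the sizes from the question (a 30 * 30 input and a 5 * 5 window), starts the kernel from small random values, and adjusts them by gradient descent. Everything else is an assumption made for the demo: the input is random noise standing in for a photo, `edge_kernel` is a hand-picked vertical-edge matrix (not the one from the question's image), and the kernel is fitted to reproduce a known feature map. In an actual network the correction signal would come from the classification error through backpropagation instead.

```python
import random


def conv2d_valid(image, kernel):
    """Slide an n x n kernel over the image (stride 1, no padding)."""
    n = len(kernel)
    out_h = len(image) - n + 1
    out_w = len(image[0]) - n + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(n) for b in range(n))
             for j in range(out_w)]
            for i in range(out_h)]


def train_kernel(image, target, n, steps=60, lr=0.3, seed=0):
    """Fit an n x n kernel by gradient descent on the mean squared error."""
    rng = random.Random(seed)
    # Start from small random values, not zeros.
    kernel = [[rng.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(n)]
    out_h, out_w = len(target), len(target[0])
    count = out_h * out_w
    losses = []
    for _ in range(steps):
        out = conv2d_valid(image, kernel)
        err = [[out[i][j] - target[i][j] for j in range(out_w)]
               for i in range(out_h)]
        losses.append(sum(e * e for row in err for e in row) / count)
        # Gradient of the mean squared error with respect to each kernel value.
        for a in range(n):
            for b in range(n):
                grad = 2.0 / count * sum(err[i][j] * image[i + a][j + b]
                                         for i in range(out_h)
                                         for j in range(out_w))
                kernel[a][b] -= lr * grad
    return kernel, losses


# Hypothetical demo data: a 30 x 30 noise image in place of a real photo.
rng = random.Random(42)
image = [[rng.uniform(-1.0, 1.0) for _ in range(30)] for _ in range(30)]

# Hand-picked vertical-edge kernel, used only to produce a target feature map.
edge_kernel = [[-1.0, -1.0, 0.0, 1.0, 1.0] for _ in range(5)]
target = conv2d_valid(image, edge_kernel)

kernel, losses = train_kernel(image, target, n=5)
print(len(target), len(target[0]))  # 26 26
print(losses[0], losses[-1])        # error before and after training
```

A 5 * 5 window does fit a 30 * 30 image: without padding, each feature map comes out as 26 * 26, since 30 - 5 + 1 = 26.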

  • 9
    Try to write more detailed answers. Explain what your statement is based on. - Nicolas Chabanovsky