Hello! At the very first stage, when we receive the original image as input (for example, a photo of a letter), we need to slide an n * n window over it and multiply each window by the kernel (the convolution matrix) to build the feature maps. But nowhere is it written what values the kernel itself should contain, that is, what values each window should be multiplied by. Can this matrix be used as a convolution kernel for detecting edges?
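To make clear which operation I mean, here is a minimal sketch in plain Python, assuming stride 1 and no padding. The function name `convolve2d_valid`, the Sobel-style vertical-edge kernel, and the toy 4 * 4 image are my own illustrative choices, not values taken from any tutorial.

```python
def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map.

    Note: like most CNN libraries, this does not flip the kernel,
    so strictly speaking it is a cross-correlation.
    """
    n = len(kernel)
    out_h = len(image) - n + 1
    out_w = len(image[0]) - n + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the n * n window element-wise by the kernel and sum.
            total = 0
            for a in range(n):
                for b in range(n):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical hand-picked kernel (Sobel-style, responds to vertical edges).
kernel = [[-1, 0, 1],
          [-2, 0, 2],
          [-1, 0, 1]]

# Toy image: dark left half (0), bright right half (1).
image = [[0, 0, 1, 1] for _ in range(4)]

feature_map = convolve2d_valid(image, kernel)
print(feature_map)  # [[4, 4], [4, 4]]
```

Every window here straddles the dark-to-bright boundary, so every output value is nonzero; on a uniform image the same kernel would give all zeros.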
Also, if the input image is 30 * 30, can I slide a 5 * 5 window over it, and is that enough to achieve optimal recognition accuracy?
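For reference, the size arithmetic as I understand it can be sketched as follows. The helper name `conv_output_size` is hypothetical, and stride 1 with no padding is my assumption for the default case.

```python
def conv_output_size(input_size, window_size, stride=1, padding=0):
    """Side length of the feature map for a square input and a square window."""
    return (input_size + 2 * padding - window_size) // stride + 1


# 30 * 30 input, 5 * 5 window, stride 1, no padding.
side = conv_output_size(30, 5)
print(side)  # 26, i.e. a 26 * 26 feature map

# With 2 pixels of padding on each side, the output keeps the input size.
padded_side = conv_output_size(30, 5, padding=2)
print(padded_side)  # 30
```

So the window does fit geometrically; whether 5 * 5 is a good choice for accuracy is the part I am unsure about.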
Which convolution kernel is best to multiply each region of the input image by for the highest recognition accuracy? Or are all values in the kernel matrix initially zero? May I also ask: by what rule or formula is the number of feature maps determined? Or, if the task is to recognize the 26 letters of the English alphabet, should there be exactly 26 feature maps at each stage? Thank you in advance!