RGB and RGBA dimensions after convolution

Question

Suppose we have a 500x500x3 image (or x4 for RGBA) and a 3x3x3 convolution kernel.

Why does the output have more channels after the convolution (500x500x9)? How does this happen, and what values are written to the output?

Good afternoon. I would recommend asking on the Data Science site: there are many answers on this topic there, whereas on this site the question may not be understood by the community or may be closed.

MaxU · Answer 1 · 2018-10-05T22:52:48

One of the parameters of a convolutional layer is the number of filters, which sets the depth of that layer's output (the input to the next layer). In general, the layer depth is not the number of color channels; rather, it can be viewed as a set of detected features (for example, vertical, horizontal, or diagonal lines, lines at a certain angle X, Y, Z, arcs, circles, ellipses, etc.). The more convolutional layers we have, and the more filters in each, the more complex the features the ANN can learn to recognize (for example, a human eye, a bird's beak, or the outline of a motorcycle or car). In your case each filter has dimensions 3x3x3, and judging by the 500x500x9 output, this convolutional layer had 9 filters.
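The shape arithmetic can be sketched in a few lines of plain Python. The helper name `conv_output_shape` is hypothetical, and the sketch assumes stride 1 and square kernels. It also shows the RGBA case: the filter depth has to match the number of input channels (3x3x4 instead of 3x3x3), but the output depth still depends only on the number of filters.

```python
def conv_output_shape(height, width, in_channels, kernel_size, num_filters, padding="same"):
    """Return (output_shape, weights_per_filter) for a stride-1 2D convolution.

    Hypothetical helper for illustration; assumes a square kernel and stride 1.
    """
    if padding == "same":
        # Zero padding keeps the spatial size unchanged.
        out_h, out_w = height, width
    elif padding == "valid":
        # No padding: the kernel must fit entirely inside the image.
        out_h = height - kernel_size + 1
        out_w = width - kernel_size + 1
    else:
        raise ValueError("padding must be 'same' or 'valid'")
    # Each filter spans all input channels.
    weights_per_filter = kernel_size * kernel_size * in_channels
    # The output depth equals the number of filters, not the input channels.
    return (out_h, out_w, num_filters), weights_per_filter


rgb_shape, rgb_weights = conv_output_shape(500, 500, 3, 3, 9)
rgba_shape, rgba_weights = conv_output_shape(500, 500, 4, 3, 9)

print(rgb_shape, rgb_weights)    # (500, 500, 9) 27
print(rgba_shape, rgba_weights)  # (500, 500, 9) 36
```

So switching from RGB to RGBA changes the number of weights per filter (27 versus 36), not the shape of the output.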

If we apply a single 3x3x3 convolution filter with padding='same' to a color image of dimension 500x500x3, we get a 2D matrix / tensor of dimension 500x500x1. The last dimension corresponds to the number of filters.
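As for which values end up in the output: each output value is the sum of the 27 products between one 3x3x3 filter and the 3x3x3 patch of the image under it, so the three color channels are collapsed into one number per filter. Below is a minimal sketch in plain Python, assuming stride 1, zero padding, no bias and no activation. The function name `conv2d_same` is made up for illustration, and a small 6x6 image is used so the loops run quickly. As in most deep learning frameworks, the kernel is not flipped (strictly speaking this is cross-correlation).

```python
def conv2d_same(image, filters):
    """Stride-1 convolution with zero padding ('same').

    image:   H x W x C nested lists
    filters: list of k x k x C nested lists (k odd)
    Returns an H x W x len(filters) nested list.
    """
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    k = len(filters[0])
    pad = k // 2
    output = [[[0.0] * len(filters) for _ in range(width)] for _ in range(height)]
    for f_idx, filt in enumerate(filters):
        for y in range(height):
            for x in range(width):
                total = 0.0
                for dy in range(k):
                    for dx in range(k):
                        yy, xx = y + dy - pad, x + dx - pad
                        # Positions outside the image count as zeros.
                        if 0 <= yy < height and 0 <= xx < width:
                            for c in range(channels):
                                total += image[yy][xx][c] * filt[dy][dx][c]
                # One number per pixel per filter: all channels are summed.
                output[y][x][f_idx] = total
    return output


# A 6x6 RGB image of ones and a 3x3x3 filter of ones (toy data).
image = [[[1.0, 1.0, 1.0] for _ in range(6)] for _ in range(6)]
ones_filter = [[[1.0, 1.0, 1.0] for _ in range(3)] for _ in range(3)]

single = conv2d_same(image, [ones_filter])
nine = conv2d_same(image, [ones_filter] * 9)

print(len(single), len(single[0]), len(single[0][0]))  # 6 6 1
print(len(nine), len(nine[0]), len(nine[0][0]))        # 6 6 9
print(single[2][2][0])  # 27.0 (interior pixel: 3 * 3 * 3 products)
print(single[0][0][0])  # 12.0 (corner: only a 2x2 window lies inside the image)
```

With nine different filters, each of the nine output channels would hold the response of its own filter at every pixel.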
