The task is to replicate the behavior of the SpatialMaxPooling and SpatialMaxUnpooling layers in Theano as precisely as possible.

Specifically, SpatialMaxUnpooling fills only those cells that correspond to the indices of the maximum values found by the matching SpatialMaxPooling.

For example, here is an input image.

Image before pooling

SpatialMaxPooling saves, from each 2x2 section, the pixel with the maximum value together with its index.

SpatialMaxUnpooling then writes a value only to the pixels that correspond to those indices. That is, the output will be

image after unpooling
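The semantics described above can be illustrated in plain NumPy (a sketch of the behavior only, not the Theano implementation; `pool_unpool_2x2` is a hypothetical helper name):

```python
import numpy as np

def pool_unpool_2x2(x):
    # x: (H, W) with even H, W. Illustrates SpatialMaxPooling followed
    # immediately by SpatialMaxUnpooling.
    H, W = x.shape
    # Group pixels into 2x2 blocks: shape (H//2, W//2, 4)
    blocks = (x.reshape(H // 2, 2, W // 2, 2)
               .transpose(0, 2, 1, 3)
               .reshape(H // 2, W // 2, 4))
    idx = blocks.argmax(axis=2)   # index of the max inside each 2x2 block
    vals = blocks.max(axis=2)     # pooled values
    # Unpooling: write each max back only at its stored index
    out = np.zeros_like(blocks)
    h, w = np.indices(idx.shape)
    out[h, w, idx] = vals
    # Undo the block grouping back to an (H, W) image
    out = (out.reshape(H // 2, W // 2, 2, 2)
              .transpose(0, 2, 1, 3)
              .reshape(H, W))
    return vals, idx, out

x = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12],
              [13, 14, 15, 16]], dtype=float)
vals, idx, out = pool_unpool_2x2(x)
```

Every pixel except the per-block maxima ends up zero in `out`, which is exactly the fill pattern shown in the image above.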

Here is the implementation I put together:

    def pooling2d_2x2(self, x):
        reshaped = x.reshape([x.shape[0], x.shape[1],
                              x.shape[2] // 2, 2,
                              x.shape[3] // 2, 2])
        max_values, max_indices = T.max_and_argmax(reshaped, (3, 5))
        return max_values, max_indices

    def unpooling2d_2x2(self, pooled, indices):
        tmp_shape = [pooled.shape[0], pooled.shape[1],
                     pooled.shape[2], 2, pooled.shape[3], 2]
        # Resize image
        resized = pooled.repeat(2, 2).repeat(2, 3)
        pooled_reshaped = resized.reshape(tmp_shape)
        # Resize indices
        indices_repeated = indices.repeat(2, 2).repeat(2, 3).reshape(tmp_shape)
        # Calculate output: keep a value only where the stored argmax
        # matches the position inside its 2x2 block
        result = pooled_reshaped * 0.0
        result = T.set_subtensor(result[:, :, :, 0, :, 0],
                                 pooled_reshaped[:, :, :, 0, :, 0] * T.eq(indices_repeated[:, :, :, 0, :, 0], 0))
        result = T.set_subtensor(result[:, :, :, 0, :, 1],
                                 pooled_reshaped[:, :, :, 0, :, 1] * T.eq(indices_repeated[:, :, :, 0, :, 1], 1))
        result = T.set_subtensor(result[:, :, :, 1, :, 0],
                                 pooled_reshaped[:, :, :, 1, :, 0] * T.eq(indices_repeated[:, :, :, 1, :, 0], 2))
        result = T.set_subtensor(result[:, :, :, 1, :, 1],
                                 pooled_reshaped[:, :, :, 1, :, 1] * T.eq(indices_repeated[:, :, :, 1, :, 1], 3))
        result_shape = [pooled.shape[0], pooled.shape[1],
                        pooled.shape[2] * 2, pooled.shape[3] * 2]
        return result.reshape(result_shape)

But it turned out not to be particularly fast (by the way, I would welcome recommendations on how to profile this). Hence the question: what can be improved here?
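On the profiling question: a Theano function compiled with `theano.function(..., profile=True)` prints a per-op timing summary, which is usually the first place to look. For quick end-to-end comparisons of two variants, the standard-library `timeit` module also works; a minimal sketch, using NumPy stand-ins for the compiled functions (the `unpool_v1`/`unpool_v2` names are hypothetical):

```python
import timeit
import numpy as np

# NumPy stand-ins for two compiled Theano functions; in practice,
# compile each variant with theano.function and time those instead.
def unpool_v1(a):
    # Repeat every element into a 2x2 block via two repeats
    return np.repeat(np.repeat(a, 2, axis=0), 2, axis=1)

def unpool_v2(a):
    # Same result via a Kronecker product with a 2x2 block of ones
    return np.kron(a, np.ones((2, 2)))

a = np.random.rand(64, 64)
t1 = timeit.timeit(lambda: unpool_v1(a), number=100)
t2 = timeit.timeit(lambda: unpool_v2(a), number=100)
print(t1, t2)
```

Timing several input sizes, not just one, gives a better picture of which variant wins in practice.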

1 answer

    The following replacement sped things up somewhat (as far as I understand Theano, and that understanding is probably quite mediocre :-), this is because we no longer allocate memory for a new tensor and instead write directly into the "enlarged" input tensor using the indices). Still, are there other possible improvements?

    def unpooling2d_2x2(self, pooled, indices):
        tmp_shape = [pooled.shape[0], pooled.shape[1],
                     pooled.shape[2], 2, pooled.shape[3], 2]
        # Resize image
        resized = pooled.repeat(2, 2).repeat(2, 3)
        pooled_reshaped = resized.reshape(tmp_shape)
        # Resize indices
        indices_repeated = indices.repeat(2, 2).repeat(2, 3).reshape(tmp_shape)
        # Calculate output in place on the enlarged tensor, without
        # allocating a separate zero tensor first
        result = T.set_subtensor(pooled_reshaped[:, :, :, 0, :, 0],
                                 pooled_reshaped[:, :, :, 0, :, 0] * T.eq(indices_repeated[:, :, :, 0, :, 0], 0))
        result = T.set_subtensor(result[:, :, :, 0, :, 1],
                                 pooled_reshaped[:, :, :, 0, :, 1] * T.eq(indices_repeated[:, :, :, 0, :, 1], 1))
        result = T.set_subtensor(result[:, :, :, 1, :, 0],
                                 pooled_reshaped[:, :, :, 1, :, 0] * T.eq(indices_repeated[:, :, :, 1, :, 0], 2))
        result = T.set_subtensor(result[:, :, :, 1, :, 1],
                                 pooled_reshaped[:, :, :, 1, :, 1] * T.eq(indices_repeated[:, :, :, 1, :, 1], 3))
        result_shape = [pooled.shape[0], pooled.shape[1],
                        pooled.shape[2] * 2, pooled.shape[3] * 2]
        return result.reshape(result_shape)
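One further improvement worth trying: the four `set_subtensor` calls can be replaced with a single elementwise multiply by comparing the repeated indices against a fixed [[0, 1], [2, 3]] pattern tiled over the image. Sketched here in NumPy so the idea is easy to check; the Theano version would use `T.eq` and broadcasting the same way (the function name is hypothetical, not the original code):

```python
import numpy as np

def unpooling2d_2x2_masked(pooled, indices):
    # pooled, indices: (batch, channels, H, W)
    resized = pooled.repeat(2, axis=2).repeat(2, axis=3)
    idx_rep = indices.repeat(2, axis=2).repeat(2, axis=3)
    # Local index of every pixel inside its 2x2 block: [[0, 1], [2, 3]],
    # matching the argmax order over the two block axes
    h2, w2 = resized.shape[2], resized.shape[3]
    pattern = np.tile(np.array([[0, 1], [2, 3]]), (h2 // 2, w2 // 2))
    # Keep a value only where the stored argmax matches the local index
    return resized * (idx_rep == pattern)

# Tiny demo: one 2x2 pooled map where every max sat at block position 3
pooled = np.array([[[[6.0, 8.0], [14.0, 16.0]]]])
indices = np.array([[[[3, 3], [3, 3]]]])
out = unpooling2d_2x2_masked(pooled, indices)
```

This turns the whole selection into one comparison plus one multiply, which a Theano graph can fuse into a single elementwise op instead of four sequential subtensor updates.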