Convolutional neural networks (CNNs) have driven recent breakthroughs in many vision tasks. Feature pooling layers are widely used in CNNs to reduce the spatial dimensions of the feature maps of the hidden layers. This gives CNNs a degree of spatial invariance, speeds up computation, and reduces over-fitting. However, it also causes significant information loss: all existing feature pooling layers follow a one-step procedure for spatial pooling, and the information discarded in that step hurts overall performance. Little work has been done on efficient feature pooling in CNNs. To reduce the information lost at this critical operation, we propose a new EDS layer (Expansion Downsampling learnable-Scaling) to replace the existing pooling mechanism. The EDS layer follows a two-step procedure that minimizes information loss by increasing the number of channels during the pooling operation. It also applies feature scaling to highlight the most relevant channels (feature maps). Our results show a significant improvement over commonly used pooling methods such as MaxPool, AvgPool, and StridePool (strided convolutions with stride > 1). We ran experiments on image classification and object detection tasks. ResNet-50 with the proposed EDS layer performs comparably to ResNet-152 with stride pooling on the ImageNet dataset.
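The idea of expanding channels, downsampling, and then scaling can be illustrated with a toy example. The abstract does not specify the operators used, so the sketch below rests on assumptions: expansion is modelled as a 1x1 channel mixing, downsampling as a non-overlapping 2x2 average, and scaling as one fixed factor per channel (learned in a real network). The function names `expand`, `downsample`, `scale`, and `eds_sketch` are hypothetical and are not taken from the paper.

```python
# Toy sketch only: a feature map is a list of channels, each an H x W nested list.

def expand(x, weights):
    """Assumed expansion: 1x1 channel mixing from C_in to C_out channels.
    weights is a C_out x C_in matrix with C_out > C_in."""
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    return [
        [[sum(row[c] * x[c][i][j] for c in range(c_in)) for j in range(w)]
         for i in range(h)]
        for row in weights
    ]

def downsample(x, k=2):
    """Assumed downsampling: non-overlapping k x k average per channel."""
    h, w = len(x[0]), len(x[0][0])
    return [
        [[sum(ch[i + di][j + dj] for di in range(k) for dj in range(k)) / (k * k)
          for j in range(0, w - k + 1, k)]
         for i in range(0, h - k + 1, k)]
        for ch in x
    ]

def scale(x, gammas):
    """Assumed scaling: multiply each channel by its own factor."""
    return [[[g * v for v in row] for row in ch] for ch, g in zip(x, gammas)]

def eds_sketch(x, weights, gammas, k=2):
    """Expansion, then downsampling, then per-channel scaling."""
    return scale(downsample(expand(x, weights), k), gammas)

# One 4x4 input channel holding the values 0..15.
x = [[[float(4 * i + j) for j in range(4)] for i in range(4)]]
weights = [[1.0], [-1.0], [0.5]]   # expand 1 channel to 3 channels
gammas = [1.0, 1.0, 2.0]           # fixed here; trainable in a real model
y = eds_sketch(x, weights, gammas)

print(len(y), len(y[0]), len(y[0][0]))  # channels, height, width of the output
```

In this toy setting the spatial size halves in each dimension while the channel count triples, which is the trade-off the abstract describes: spatial resolution is reduced, but more channels remain available to carry the information.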
Recommended citation: P. Singh, P. Raj and V.P. Namboodiri, “EDS Pooling Layer”, Journal of Image and Vision Computing, Volume 98, June 2020, 103923