Grouped Pointwise Convolutions Reduce Parameters in Convolutional Neural Networks

In deep convolutional neural networks (DCNNs), the number of parameters in a pointwise convolution grows rapidly because it is the product of the number of filters and the number of input channels coming from the previous layer. Our proposal makes pointwise convolutions parameter-efficient by grouping filters into parallel branches, or groups, where each branch processes a fraction of the input channels. Grouping alone, however, degrades the learning capability of the DCNN. To avoid this effect, we suggest interleaving the outputs of filters from different branches at the intermediate layers of consecutive pointwise convolutions. We applied our improvement to the EfficientNet, DenseNet-BC L100, MobileNet and MobileNet V3 Large architectures, and trained them on the CIFAR-10, CIFAR-100, Cropped-PlantDoc and Oxford-IIIT Pet datasets. When training from scratch, we obtained test accuracies similar to those of the original EfficientNet and MobileNet V3 Large architectures while saving up to 90% of the parameters and 63% of the floating-point operations (FLOPs).
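
The arithmetic behind the parameter saving can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `pointwise_params` and `interleave`, the channel counts, and the group count are hypothetical, biases are ignored, and the interleaving is assumed here to be a simple round-robin reordering of channels across groups, which may differ from the exact scheme used in the paper.

```python
def pointwise_params(in_channels, filters, groups=1):
    """Weight count of a 1x1 (pointwise) convolution, biases ignored.

    Each of the `groups` branches sees in_channels/groups inputs and
    holds filters/groups filters, so the total is in_channels*filters/groups.
    """
    if in_channels % groups or filters % groups:
        raise ValueError("groups must divide both in_channels and filters")
    return groups * (in_channels // groups) * (filters // groups)


def interleave(channels, groups):
    """Round-robin reordering (an assumed scheme): consecutive output
    positions are taken from different groups, so the next grouped layer
    receives channels produced by every branch."""
    if len(channels) % groups:
        raise ValueError("groups must divide the number of channels")
    per_group = len(channels) // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


# Hypothetical layer: 256 input channels, 256 filters, 8 groups.
dense = pointwise_params(256, 256)
grouped = pointwise_params(256, 256, groups=8)
print(dense, grouped, 1 - grouped / dense)

# Channels 0-3 come from branch 0 and channels 4-7 from branch 1.
print(interleave(list(range(8)), groups=2))
```

In this sketch the weight count falls by a factor equal to the number of groups, and the reordering places channels from both branches next to each other so that a following grouped pointwise convolution mixes information across branches.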

Keywords

EfficientNet, Deep Learning, Computer Vision, CNN, DCNN

Citation

Mendel. 2022, vol. 28, no. 2, pp. 23-31. ISSN 1803-3814
https://mendel-journal.org/index.php/mendel/article/view/169

Document type

Peer-reviewed

Document version

Published version

Language of document

en

Document licence

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license
http://creativecommons.org/licenses/by-nc-sa/4.0