[Paper Reading] GhostNet: More Features from Cheap Operations (GhostNet Study Notes) - 白红宇



Published: 2019-04-28



Huawei Noah's Ark Lab has proposed GhostNet, a new neural network architecture for on-device deployment. Paper link:

The paper's abstract gives a good overview of GhostNet:

Deploying convolutional neural networks (CNNs) on embedded devices is difficult due to the limited memory and computation resources. The redundancy in feature maps is an important characteristic of those successful CNNs, but has rarely been investigated in neural architecture design. This paper proposes a novel Ghost module to generate more feature maps from cheap operations. Based on a set of intrinsic feature maps, we apply a series of linear transformations with cheap cost to generate many ghost feature maps that could fully reveal information underlying intrinsic features. The proposed Ghost module can be taken as a plug-and-play component to upgrade existing convolutional neural networks. Ghost bottlenecks are designed to stack Ghost modules, and then the lightweight GhostNet can be easily established. Experiments conducted on benchmarks demonstrate that the proposed Ghost module is an impressive alternative of convolution layers in baseline models, and our GhostNet can achieve higher recognition performance (e.g. 75.7% top-1 accuracy) than MobileNetV3 with similar computational cost on the ImageNet ILSVRC-2012 classification dataset.

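To make sure the model understands its input thoroughly, a trained deep neural network usually contains rich, even redundant, feature maps. The paper's figure of ResNet-50 shows that the feature maps produced by the first residual block contain many similar "feature map pairs", annotated with boxes of the same color. If one map of a pair can be obtained from the other by a cheap operation (drawn as a wrench in that figure), it can be regarded as a "ghost" of the other. This redundancy helps the network perform well, but producing all of these feature maps with ordinary convolution layers takes a lot of computation. The abstract's sentence "This paper proposes a novel Ghost module to generate more feature maps from cheap operations" states the motivation: generate the same number of feature maps as an ordinary convolution layer with fewer parameters.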

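In other words, not every feature map has to come from a convolution; the "ghost" feature maps can be generated by cheaper operations. This is the idea behind the Ghost module, the building block of GhostNet. It needs less computation than an ordinary convolution layer, so plugging it into an existing network design lowers the computational cost.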

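The paper's figure compares an ordinary convolution and a Ghost module that output the same number of feature maps. The Ghost module splits an ordinary convolution layer into two parts. The first part holds the intrinsic feature maps, which are produced by an ordinary convolution; this set is usually small. Given the input X, the following convolution produces the m intrinsic feature maps Y', where f' denotes the convolution filters and * the convolution operation:

$$ Y' = X * f' $$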

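The second part applies a series of cheap linear operations to each feature map in Y' to obtain n feature maps:

$$ y_{ij} = \Phi_{i,j}(y'_i), \quad i = 1, \dots, m, \; j = 1, \dots, s $$

Here y'_i is the i-th intrinsic feature map in Y' and \Phi_{i,j} is the j-th linear operation applied to it, so n = m · s.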
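The second step can be sketched in plain Python. This is only a minimal illustration, not the authors' implementation. The intrinsic maps are passed in directly, standing in for the output of the first convolution. The helper names `depthwise_conv` and `ghost_module` and the example 3x3 kernel are assumptions made for this example. As in the paper, the identity mapping is counted as one of the s operations, so each intrinsic map is kept in the output. As a simplification, every intrinsic map shares the same cheap operations here, whereas each pair (i, j) can have its own.

```python
from typing import Callable, List

FeatureMap = List[List[float]]  # one 2-D feature map (height x width)


def depthwise_conv(fmap: FeatureMap, kernel: List[List[float]]) -> FeatureMap:
    """Apply one d x d kernel to a single feature map (zero padding, stride 1)."""
    h, w = len(fmap), len(fmap[0])
    d = len(kernel)
    pad = d // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(d):
                for kx in range(d):
                    yy, xx = y + ky - pad, x + kx - pad
                    if 0 <= yy < h and 0 <= xx < w:  # outside the map counts as zero
                        acc += kernel[ky][kx] * fmap[yy][xx]
            out[y][x] = acc
    return out


def ghost_module(intrinsic: List[FeatureMap],
                 cheap_ops: List[Callable[[FeatureMap], FeatureMap]]) -> List[FeatureMap]:
    """Expand m intrinsic maps into n = m * s maps, with s = len(cheap_ops) + 1.

    The "+ 1" is the identity mapping that keeps each intrinsic map.
    """
    outputs: List[FeatureMap] = []
    for y_prime in intrinsic:
        outputs.append(y_prime)                            # identity: keep the intrinsic map
        outputs.extend(op(y_prime) for op in cheap_ops)    # ghost maps
    return outputs


# Example: m = 2 intrinsic maps and one cheap operation (s = 2) give n = 4 maps.
blur = [[1 / 9] * 3 for _ in range(3)]  # assumed example kernel, not from the paper
intrinsic_maps = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
ghost_out = ghost_module(intrinsic_maps, [lambda f: depthwise_conv(f, blur)])
print(len(ghost_out))  # number of output feature maps
```

Only the per-map cheap operations run on top of the intrinsic maps; no extra full convolution over all input channels is needed for the ghost maps.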

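Compared with an ordinary convolution layer, the Ghost module needs fewer parameters and less computation while producing output feature maps of the same size.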
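To get a feel for the size of the saving, here is a rough count in Python. It is a sketch under stated assumptions: costs are counted as multiply-adds with bias terms ignored, the input has c channels, the ordinary convolution uses k x k kernels, the cheap operations are d x d depthwise kernels, and n is divisible by s. The function names and the example sizes are made up for illustration.

```python
def conv_cost(c, n, k, h_out, w_out):
    """Parameters and multiply-adds of an ordinary k x k convolution (bias ignored)."""
    params = n * c * k * k
    return params, params * h_out * w_out


def ghost_cost(c, n, k, d, s, h_out, w_out):
    """Parameters and multiply-adds of a Ghost module with the same n output maps."""
    if n % s != 0:
        raise ValueError("n must be divisible by s in this simplified sketch")
    m = n // s                     # number of intrinsic feature maps
    primary = m * c * k * k        # ordinary convolution that produces Y'
    cheap = m * (s - 1) * d * d    # s - 1 cheap d x d operations per intrinsic map
    params = primary + cheap
    return params, params * h_out * w_out


# Assumed example sizes, chosen only for illustration.
conv_params, conv_flops = conv_cost(c=64, n=128, k=3, h_out=56, w_out=56)
ghost_params, ghost_flops = ghost_cost(c=64, n=128, k=3, d=3, s=2, h_out=56, w_out=56)
print(conv_params, ghost_params)
print(round(conv_flops / ghost_flops, 2))
```

With these assumed sizes the ratio comes out just under 2, that is, close to s. The cheap operations add only a small cost on top of the reduced primary convolution.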

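The Ghost module is the core of GhostNet. On top of it, the team designed the Ghost bottleneck specifically for small CNNs; it combines several convolutional layers and a shortcut. A Ghost bottleneck mainly consists of two stacked Ghost modules. The first acts as an expansion layer that increases the number of channels. The second reduces the number of channels to match the shortcut path. The shortcut then connects the input of the first Ghost module to the output of the second. This describes the stride = 1 case shown in the paper's figure. For stride = 2, the shortcut path goes through a downsampling layer, and a depthwise convolution with stride = 2 is inserted between the two Ghost modules.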
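The two bottleneck variants can be written down as a small layout function. This is a sketch of the structure only, following the paper's description in which the stride-2 depthwise convolution sits between the two Ghost modules. The function name `ghost_bottleneck_layout`, the layer labels, and the channel numbers in the example are hypothetical, and the labels are plain strings rather than calls into any real library. For stride = 1 the sketch assumes the input and output channel counts match, so the shortcut is an identity.

```python
from typing import List, Tuple


def ghost_bottleneck_layout(in_ch: int, exp_ch: int, out_ch: int,
                            stride: int) -> Tuple[List[str], List[str]]:
    """Return (main_path, shortcut_path) as lists of descriptive layer labels."""
    if stride not in (1, 2):
        raise ValueError("stride must be 1 or 2")
    if stride == 1 and in_ch != out_ch:
        raise ValueError("this sketch assumes in_ch == out_ch when stride is 1")

    main = [f"GhostModule({in_ch} -> {exp_ch})"]           # expansion layer
    if stride == 2:
        main.append(f"DepthwiseConv({exp_ch}, stride=2)")  # spatial downsampling
    main.append(f"GhostModule({exp_ch} -> {out_ch})")      # reduce to match the shortcut

    if stride == 1:
        shortcut = ["Identity"]
    else:
        shortcut = [f"Downsample({in_ch} -> {out_ch}, stride=2)"]
    return main, shortcut


# Hypothetical channel sizes, for illustration only.
main_s1, shortcut_s1 = ghost_bottleneck_layout(16, 48, 16, stride=1)
main_s2, shortcut_s2 = ghost_bottleneck_layout(16, 48, 24, stride=2)
for name, path in [("stride 1", main_s1), ("stride 2", main_s2)]:
    print(name, "->", " | ".join(path))
```

In both cases the block output is the sum of the main path and the shortcut path, which is why the second Ghost module has to bring the channel count back to what the shortcut delivers.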

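In addition, for efficiency, the primary convolution in the Ghost module is a pointwise convolution here.

Based on the Ghost bottleneck, the team proposed GhostNet. It follows the basic architecture of MobileNetV3 because of its strengths, and replaces the bottleneck in MobileNetV3 with the Ghost bottleneck. The first layer is a standard convolution layer with 16 filters, followed by a series of Ghost bottlenecks with gradually increasing channels. The Ghost bottlenecks are grouped into stages according to the size of their input feature maps. All of them use stride = 1, except the last one in each stage, which uses stride = 2. Finally, global average pooling and a convolution layer turn the feature maps into a 1280-dimensional feature vector for the final classification. The SE (squeeze-and-excitation) module is also applied to the residual layer in some Ghost bottlenecks. Unlike MobileNetV3, GhostNet uses ReLU instead of the Hard-swish activation function.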


Reposted from: http://thdlf.baihongyu.com/
