Pelee: A Real-Time Object Detection System on Mobile Devices
ICLR 2018
Code: https://github.com/Robert-JunWang/Pelee
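Running CNN models on embedded devices has become a trend, and models such as MobileNet, ShuffleNet, and NASNet-A have been proposed for this purpose. They rely mainly on depthwise separable convolution, which lacks an efficient implementation in the current mainstream deep learning frameworks. Here we propose PeleeNet, a network architecture built on conventional convolution instead.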
Key building blocks of PeleeNet:
1) Two-Way Dense Layer: used to get receptive fields at different scales.
2) Stem Block: this block effectively improves feature expression ability without adding much computational cost.
3) Dynamic Number of Channels in Bottleneck Layer: the number of channels is decided dynamically from the input shape. This saves up to 28.5% of the computational cost with only a small impact on accuracy.
4) Transition Layer without Compression: our experiments show that compression in the transition layers hurts feature expression.
5) Composite Function: to improve speed, we use post-activation (Convolution - Batch Normalization - ReLU). With post-activation, all batch normalization layers can be merged into the convolution layers at the inference stage, which greatly speeds up inference.
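The batch normalization merge mentioned in item 5 can be illustrated with a minimal sketch. It is not taken from the Pelee code base: it uses plain Python lists, treats a 1x1 convolution at a single pixel as a small matrix product, and all function names and numeric values are made up for illustration. The point is only that Convolution followed by Batch Normalization collapses into one convolution with rescaled weights and a shifted bias.

```python
import math

def conv1x1(x, weights, bias):
    """1x1 convolution at one pixel: weights[o][i] maps input channel i to output o."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-time batch normalization using fixed running statistics."""
    return [g * (v - m) / math.sqrt(s + eps) + b
            for v, g, b, m, s in zip(y, gamma, beta, mean, var)]

def relu(y):
    return [max(0.0, v) for v in y]

def fold_bn_into_conv(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Return conv weights and bias that reproduce conv followed by batch norm."""
    folded_w, folded_b = [], []
    for o, row in enumerate(weights):
        scale = gamma[o] / math.sqrt(var[o] + eps)
        folded_w.append([w * scale for w in row])
        folded_b.append(beta[o] + (bias[o] - mean[o]) * scale)
    return folded_w, folded_b

# Illustrative values only (3 input channels, 2 output channels).
x = [1.0, -2.0, 0.5]
weights = [[0.2, -0.1, 0.4], [-0.3, 0.5, 0.1]]
bias = [0.05, -0.02]
gamma, beta = [1.5, 0.8], [0.1, -0.2]
mean, var = [0.3, -0.4], [0.9, 1.6]

# Training-style graph: Convolution - Batch Normalization - ReLU.
reference = relu(batch_norm(conv1x1(x, weights, bias), gamma, beta, mean, var))

# Inference-style graph: a single convolution with folded parameters, then ReLU.
folded_w, folded_b = fold_bn_into_conv(weights, bias, gamma, beta, mean, var)
merged = relu(conv1x1(x, folded_w, folded_b))

outputs_match = all(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
                    for a, b in zip(reference, merged))
print("outputs match:", outputs_match)
```

The fold works because batch normalization directly follows the convolution with no nonlinearity in between. With pre-activation (Batch Normalization - ReLU - Convolution), the ReLU sits between the two linear operations, so the same merge is not available.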
For object detection, we plug PeleeNet into SSD. The main points are:
1) Feature Map Selection: we use feature maps at 5 scales (19 x 19, 10 x 10, 5 x 5, 3 x 3, and 1 x 1). To save computation, we do not use the 38 x 38 feature map.
2) Residual Prediction Block: a residual block (ResBlock) is added before prediction is performed.
3) Small Convolutional Kernel for Prediction: 1x1 kernels reduce the computational cost by 21.5%.
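A rough count shows why items 1 and 3 save computation. The sketch below is an illustration under stated assumptions, not the paper's accounting: it assumes a "same"-padded prediction convolution evaluated at every position of every feature map, a 3x3 kernel as the baseline for comparison (the note does not name the baseline), and hypothetical channel counts that are identical on all maps.

```python
def total_locations(map_sizes):
    """Number of spatial positions at which the prediction head is evaluated."""
    return sum(s * s for s in map_sizes)

def head_macs(map_sizes, kernel, in_channels, out_channels):
    """Nominal multiply-accumulates of one k x k prediction conv over all maps."""
    per_location = kernel * kernel * in_channels * out_channels
    return total_locations(map_sizes) * per_location

PELEE_MAPS = [19, 10, 5, 3, 1]   # scales listed in the note
WITH_38 = [38] + PELEE_MAPS      # the same list plus the skipped 38 x 38 map

# Hypothetical channel counts, chosen only for illustration.
IN_CHANNELS = 256
OUT_CHANNELS = 6 * 21            # e.g. 6 boxes per location x 21 class scores

locations_kept = total_locations(PELEE_MAPS)
locations_with_38 = total_locations(WITH_38)
share_of_38 = (locations_with_38 - locations_kept) / locations_with_38

macs_1x1 = head_macs(PELEE_MAPS, 1, IN_CHANNELS, OUT_CHANNELS)
macs_3x3 = head_macs(PELEE_MAPS, 3, IN_CHANNELS, OUT_CHANNELS)

print("locations on the 5 selected maps:", locations_kept)
print("locations if 38 x 38 were added: ", locations_with_38)
print("share contributed by 38 x 38:     %.1f%%" % (100 * share_of_38))
print("3x3 / 1x1 head cost ratio:       ", macs_3x3 // macs_1x1)
```

Under these assumptions the 38 x 38 map alone would account for roughly three quarters of all prediction positions, and a 3x3 prediction kernel costs 9 times as much as a 1x1 kernel per position. The 21.5% figure quoted above is smaller than that per-layer ratio suggests, which is consistent with it being measured over a whole model in which the prediction layers are only one part of the cost.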
Overview of PeleeNet architecture