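I thought about deconvolution a long time ago, but it came up again today and I realized my understanding was still fuzzy. Specifically, I still did not know how Caffe or TensorFlow actually implement deconvolution. Then I came across this paper: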
https://arxiv.org/ftp/arxiv/papers/1609/1609.07009.pdf
Two figures in it cleared up all of my confusion at once:
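Fig.2 shows the convolution process written as a matrix multiplication; this is essentially what im2col does in Caffe.
Fig.3 shows the deconvolution/transposed convolution process. The simple way to understand it: take the C matrix used for the earlier convolution and transpose it.
(PS: I call it the C matrix here because that is the name this paper uses.)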
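To make the "transpose the C matrix" idea concrete, here is a minimal sketch in plain Python. It is not how Caffe or TensorFlow implement it internally; it only shows the arithmetic. The helper names (conv_matrix, matvec, transpose) are made up for illustration, and the sketch assumes a single channel, stride 1, and no padding.

```python
def conv_matrix(kernel, in_h, in_w):
    """Build C so that C times the row-major flattened input equals the
    'valid', stride-1 sliding-window product of the input with kernel."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h, out_w = in_h - k_h + 1, in_w - k_w + 1
    C = [[0.0] * (in_h * in_w) for _ in range(out_h * out_w)]
    for oy in range(out_h):
        for ox in range(out_w):
            row = oy * out_w + ox              # one row per output pixel
            for ky in range(k_h):
                for kx in range(k_w):
                    col = (oy + ky) * in_w + (ox + kx)  # input pixel touched
                    C[row][col] = kernel[ky][kx]
    return C


def matvec(M, v):
    """Matrix-vector product on nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


kernel = [[1.0, 2.0], [3.0, 4.0]]
x = [float(i) for i in range(9)]      # a 3x3 input, flattened row-major
C = conv_matrix(kernel, 3, 3)         # 4 rows (outputs) x 9 columns (inputs)

y = matvec(C, x)                      # convolution: 9 values -> 4 values
x_up = matvec(transpose(C), y)        # transposed convolution: 4 -> 9 values
```

Note that multiplying by the transpose restores the input shape (9 values), not the input values; x_up is generally different from x.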
OK. The two links in this post pretty much explain deconvolution. Theano also has a dedicated convolution arithmetic tutorial that covers it, but the two links here should be enough :)
An article from Google Brain on Distill: deconvolution is poison... it can cause checkerboard patterns
https://distill.pub/2016/deconv-checkerboard/
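One way to see where the checkerboard comes from is to count how many kernel windows land on each output position of a transposed convolution. The sketch below does this in 1D; overlap_counts is a made-up helper name, and the sizes are arbitrary examples. When the kernel size is not divisible by the stride, the counts alternate, which is the uneven overlap the Distill article describes.

```python
def overlap_counts(in_len, kernel_size, stride):
    """For a 1D transposed convolution without padding, count how many
    kernel windows write to each output position."""
    out_len = (in_len - 1) * stride + kernel_size
    counts = [0] * out_len
    for i in range(in_len):            # each input value stamps one window
        for k in range(kernel_size):
            counts[i * stride + k] += 1
    return counts


uneven = overlap_counts(in_len=4, kernel_size=3, stride=2)
even = overlap_counts(in_len=4, kernel_size=4, stride=2)

print(uneven)  # [1, 1, 2, 1, 2, 1, 2, 1, 1]  alternating in the interior
print(even)    # [1, 1, 2, 2, 2, 2, 2, 2, 1, 1]  uniform in the interior
```

An even overlap does not guarantee an artifact-free result, since the learned weights can still be unequal, but it removes this particular built-in imbalance.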
The article describes one approach they tried for getting rid of the checkerboard pattern:
We’ve had our best results with nearest-neighbor interpolation, and had difficulty making bilinear resize work. This may simply mean that, for our models, the nearest-neighbor happened to work well with hyper-parameters optimized for deconvolution. It might also point at trickier issues with naively using bilinear interpolation, where it resists high-frequency image features too strongly. We don’t necessarily think that either approach is the final solution to upsampling, but they do fix the checkerboard artifacts.
Resize-convolution layers can be easily implemented in TensorFlow using tf.image.resize_images(). For best results, use tf.pad() before doing convolution with tf.nn.conv2d() to avoid boundary artifacts.
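The resize-then-convolve idea in the quote can be sketched without TensorFlow, just to show the three steps (resize, pad, convolve) on nested lists. This is only an illustration and not the article's code: the function names are made up, reflect padding is my assumption (the quote only says to pad), and it assumes a single channel and an odd-sized square kernel.

```python
def upsample_nearest(img, factor):
    """Nearest-neighbor resize: repeat each pixel `factor` times per axis."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def pad_reflect(img, p):
    """Reflect-pad by p pixels on every side (edge pixel not repeated).
    Assumes p is smaller than both image dimensions."""
    rows = [row[p:0:-1] + row + row[-2:-p - 2:-1] for row in img]
    return rows[p:0:-1] + rows + rows[-2:-p - 2:-1]


def conv2d_valid(img, kernel):
    """Stride-1 sliding-window product with no implicit padding."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h, out_w = len(img) - k_h + 1, len(img[0]) - k_w + 1
    return [[sum(kernel[ky][kx] * img[oy + ky][ox + kx]
                 for ky in range(k_h) for kx in range(k_w))
             for ox in range(out_w)]
            for oy in range(out_h)]


def resize_conv(img, kernel, factor=2):
    """Upsample, pad so the convolution keeps the upsampled size, convolve."""
    pad = len(kernel) // 2             # assumes an odd, square kernel
    return conv2d_valid(pad_reflect(upsample_nearest(img, factor), pad), kernel)


img = [[1.0, 2.0], [3.0, 4.0]]
box = [[1.0 / 9.0] * 3 for _ in range(3)]   # simple 3x3 averaging kernel
out = resize_conv(img, box)                 # 2x2 input -> 4x4 output
```

Because every output pixel is produced by one full kernel window over the upsampled image, there is no uneven overlap to begin with.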