
What's the difference between convolution in Keras vs Caffe?

wr1ttenyu zhao · CNN · 2022-05-10 14:51

The question

I'm trying to replicate a large Caffe network in Keras (with the TensorFlow backend), but I'm having trouble doing it even for a single convolutional layer.

A simple convolution in general:

Let's say we have a 4D input of shape (1, 500, 500, 3), and we have to perform a single convolution on this input with 96 filters, a kernel size of 11, and 4x4 strides.
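
As a quick sanity check on the shapes (this arithmetic is implied but never shown above): with 'valid' padding, the output spatial size is floor((input - kernel) / stride) + 1, which is where the 123 in the reshape further down comes from.

# output size of a 'valid' (no-padding) convolution: floor((in - kernel) / stride) + 1
out_size = (500 - 11) // 4 + 1  # = 123
# so Keras (channels-last) produces a (1, 123, 123, 96) output,
# while Caffe (channels-first) produces (1, 96, 123, 123)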

Let's set up our weight and input variables:

import numpy as np

w = np.random.rand(11, 11, 3, 96)  # weights 1
b = np.random.rand(96)             # weights 2 (bias)
x = np.random.rand(500, 500, 3)

A simple convolution in Keras:

This is how it could be defined in Keras:

import keras
import numpy as np
from keras.layers import Input
from keras.layers import Conv2D

inp = Input(shape=(500, 500, 3))
conv1 = Conv2D(filters=96, kernel_size=11, strides=(4, 4),
               activation=keras.activations.relu, padding='valid')(inp)
model = keras.Model(inputs=[inp], outputs=conv1)
model.layers[1].set_weights([w, b])  # set the weights of the convolutional layer

predicted = model.predict([x.reshape(1, 500, 500, 3)])
print(predicted.reshape(1, 96, 123, 123))  # reshape the Keras output into Caffe's form

A simple convolution in Caffe:

simple.prototxt:

name: "simple" input: "inp" input_shape { dim: 1 dim: 3 dim: 500 dim: 500 } layer { name: "conv1" type: "Convolution" bottom: "inp" top: "conv1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 96 kernel_size: 11 pad: 0 stride: 4 } } layer { name: "relu1" type: "ReLU" bottom: "conv1" top: "conv1" } 

Caffe in Python:

import caffe

net = caffe.Net('simple.prototxt', caffe.TEST)
net.params['conv1'][0].data[...] = w.reshape(96, 3, 11, 11)  # set weights 1
net.params['conv1'][1].data[...] = b                         # set weights 2 (bias)
net.blobs['inp'].reshape(1, 3, 500, 500)  # reshape the input blob to fit our input array x
print(net.forward(inp=x.reshape(1, 3, 500, 500)).get('conv1'))

The problem:

If we execute both snippets of code, we will notice that the outputs differ from each other. I understand that there are a few differences, such as Caffe's symmetric padding, but I didn't even use padding here. Yet Caffe's output is different from Keras's output...

Why is this so? I know that the Theano backend doesn't use correlation the way Caffe does, and hence requires the kernel to be rotated by 180 degrees, but is the same true for TensorFlow? As far as I know, both TensorFlow and Caffe use cross-correlation instead of convolution.
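
As a side illustration of the correlation-vs-convolution distinction mentioned here (using SciPy rather than either framework): cross-correlating with a kernel gives the same result as convolving with that kernel rotated by 180 degrees.

import numpy as np
from scipy.signal import convolve2d, correlate2d

x = np.arange(9, dtype=float).reshape(3, 3)
k = np.array([[1., 2.], [3., 4.]])

# correlation with k == convolution with k rotated by 180 degrees
print(np.allclose(correlate2d(x, k, mode='valid'),
                  convolve2d(x, np.rot90(k, 2), mode='valid')))  # True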

How can I make two identical models in Keras and Caffe that use convolution?

Any help would be appreciated, thanks!

The answer

I found the problem, but I'm not sure how to fix it yet...

The difference between these two convolutional layers is the alignment of their items. This alignment problem only occurs when the number of filters N satisfies N > 1 && N > S, where S is the dimension of the filter. In other words, the problem only occurs when the convolution yields a multi-dimensional array whose number of rows and number of columns are both greater than 1.

To see this, I simplified my input and output data so that we can better analyze the mechanics of both layers.

simple.prototxt:

input: "input" input_shape { dim: 1 dim: 1 dim: 2 dim: 2 } layer { name: "conv1" type: "Convolution" bottom: "input" top: "conv1" convolution_param { num_output: 2 kernel_size: 1 pad: 0 stride: 1 } } layer { name: "relu1" type: "ReLU" bottom: "conv1" top: "conv1" } 

simple.py:

import keras
import caffe
import numpy as np
from keras.layers import Input, Conv2D
from keras.activations import relu
from keras import Model

filters = 2   # greater than 1 and greater than ker_size
ker_size = 1

_input = np.arange(2 * 2).reshape(2, 2)
_weights = [np.reshape([[2 for _ in range(filters)] for _ in range(ker_size * ker_size)],
                       (ker_size, ker_size, 1, filters)),
            np.reshape([0 for _ in range(filters)], (filters,))]
# weights for Keras: the main weight is an array of 2's, the bias is an array of 0's
_weights_caffe = [_weights[0].T, _weights[1].T]  # just transpose them for Caffe

# Keras setup
keras_input = Input(shape=(2, 2, 1), dtype='float32')
keras_conv = Conv2D(filters=filters, kernel_size=ker_size, strides=(1, 1),
                    activation=relu, padding='valid')(keras_input)
model = Model(inputs=[keras_input], outputs=keras_conv)
model.layers[1].set_weights([_weights[0], _weights[1]])

# Caffe setup
net = caffe.Net("simple.prototxt", caffe.TEST)
net.params['conv1'][0].data[...] = _weights_caffe[0]
net.params['conv1'][1].data[...] = _weights_caffe[1]
net.blobs['input'].data[...] = _input.reshape(1, 1, 2, 2)

# Predictions
print("Input:\n---")
print(_input)
print(_input.shape)
print("\n")

print("Caffe:\n---")
print(net.forward()['conv1'])
print(net.forward()['conv1'].shape)
print("\n")

print("Keras:\n---")
print(model.predict([_input.reshape(1, 2, 2, 1)]))
print(model.predict([_input.reshape(1, 2, 2, 1)]).shape)
print("\n")

Output:

Input:
---
[[0 1]
 [2 3]]
(2, 2)

Caffe:
---
[[[[0. 2.]
   [4. 6.]]
  [[0. 2.]
   [4. 6.]]]]
(1, 2, 2, 2)

Keras:
---
[[[[0. 0.]
   [2. 2.]]
  [[4. 4.]
   [6. 6.]]]]
(1, 2, 2, 2)

Analysis:

If you look at the Caffe model's output, you'll notice that our 2x2 array is first doubled (so that we have an array of two 2x2 arrays), and then matrix multiplication with our weight matrix is performed on each of those two arrays. Something like this:

Original:

[[[[0. 2.]
   [4. 6.]]
  [[0. 2.]
   [4. 6.]]]]

Transformed:

[[[[(0 * 2) (2 * 2)]
   [(4 * 2) (6 * 2)]]
  [[(0 * 2) (2 * 2)]
   [(4 * 2) (6 * 2)]]]]

TensorFlow does something different: after doing the same thing Caffe does, it seems to first align the 2D vectors of the output in ascending order. This seems like weird behavior, and I'm unable to understand why they would do such a thing.

I have answered my own question as to the cause of the problem, but I don't have a clean solution for it yet. I still don't find my answer satisfying enough, so I'm going to accept whichever answer contains the actual solution.

The only solution I know of is to create a custom layer, which is not a very neat solution to me.
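
A possible lighter-weight direction, offered as an assumption rather than a confirmed fix: since Caffe stores convolution kernels as (out, in, kh, kw) while Keras stores them as (kh, kw, in, out), weights could be ported with an axis permutation instead of a flat reshape. The helper names below are hypothetical.

import numpy as np

def caffe_to_keras_kernel(w_caffe):
    # hypothetical helper: Caffe kernel (out, in, kh, kw) -> Keras kernel (kh, kw, in, out)
    return np.transpose(w_caffe, (2, 3, 1, 0))

def caffe_to_keras_output(y_caffe):
    # hypothetical helper: Caffe output (n, c, h, w) -> Keras output (n, h, w, c)
    return np.transpose(y_caffe, (0, 2, 3, 1))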

