Author: Yann LeCun
Applications:
Example with zero padding:
Given
x[i] = [6, 2] and h[i] = [1, 2, 5, 4]
With zero padding on h and the filter x reversed to [2 6] (without the reversal, the operation would be cross-correlation).
First step:

    [2  6]
     |  |
     V  V
     0 [1  2  5  4]

= 2 * 0 + 6 * 1 = 6

Second step:

       [2  6]
        |  |
        V  V
     0 [1  2  5  4]

= 2 * 1 + 6 * 2 = 14 (the arrows represent the connection between the kernel and the input)

Third step:

          [2  6]
           |  |
           V  V
     0 [1  2  5  4]

= 2 * 2 + 6 * 5 = 34

Fourth step:

             [2  6]
              |  |
              V  V
     0 [1  2  5  4]

= 2 * 5 + 6 * 4 = 34

Fifth step:

                [2  6]
                 |  |
                 V  V
     0 [1  2  5  4] 0

= 2 * 4 + 6 * 0 = 8

The result of the convolution for this case, listing all the steps, would then be: Y = [6 14 34 34 8]
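
The five steps can be reproduced with a short loop. This is a minimal sketch rather than the notebook's own code: the helper name `full_convolve` is hypothetical, and it assumes 1-D inputs with `len(x) - 1` zeros padded on each side of `h`.

```python
def full_convolve(x, h):
    """Slide the reversed filter x across zero-padded h (hypothetical helper)."""
    pad = len(x) - 1                          # zeros added on each side of h
    padded = [0] * pad + list(h) + [0] * pad  # [0, 1, 2, 5, 4, 0] for the example
    flipped = list(x)[::-1]                   # [2, 6]; without the flip this is cross-correlation
    steps = len(padded) - len(x) + 1          # number of window positions
    return [
        sum(k * v for k, v in zip(flipped, padded[i:i + len(x)]))
        for i in range(steps)
    ]

x = [6, 2]
h = [1, 2, 5, 4]
y = full_convolve(x, h)
print(y)  # [6, 14, 34, 34, 8]
```

Each list element corresponds to one window position, so the five entries line up with the five steps worked by hand.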
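$n \times n$ image, $f \times f$ kernel: $(n-f+1) \times (n-f+1)$ result (valid padding = no padding)
With padding:
For padding $p$: $(n+2p-f+1) \times (n+2p-f+1)$
Same padding: $p=(f-1)/2$ (output size equals input size; $f$ must be odd)
With stride $s$: $\left(\lfloor (n+2p-f)/s \rfloor + 1\right) \times \left(\lfloor (n+2p-f)/s \rfloor + 1\right)$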
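
The size formulas can be checked numerically. The sketch below assumes square inputs and kernels and rounds the strided division down; the helper names `conv_output_size` and `same_padding` are hypothetical.

```python
def conv_output_size(n, f, p=0, s=1):
    """Output side length for an n x n input and f x f kernel (hypothetical helper)."""
    return (n + 2 * p - f) // s + 1

def same_padding(f):
    """Padding that preserves the input size at stride 1; assumes f is odd."""
    return (f - 1) // 2

n, f = 6, 3
valid = conv_output_size(n, f)                    # no padding: n - f + 1
same = conv_output_size(n, f, p=same_padding(f))  # same padding: output equals input
strided = conv_output_size(7, 3, p=0, s=2)        # stride 2
print(valid, same, strided)  # 4 6 3
```

With $s = 1$ the strided formula reduces to $n + 2p - f + 1$, so one function covers all three cases.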
Convolution over a volume: $(6 \times 6 \times 3) * (3 \times 3 \times 3) = 4 \times 4$
With $n_c'$ filters, the result is $(n-f+1) \times (n-f+1) \times n_c'$
Use many filters to detect multiple features
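
The shape bookkeeping for volumes and multiple filters can be illustrated with NumPy. This is a sketch under stated assumptions: `conv_volume` is a hypothetical helper, it uses valid padding and stride 1, omits bias and activation, and, like most deep learning libraries, does not flip the kernel.

```python
import numpy as np

def conv_volume(image, filters):
    """Valid convolution over a volume (hypothetical helper).

    image:   (n, n, n_c)
    filters: (f, f, n_c, n_f), one f x f x n_c kernel per output channel
    returns: (n - f + 1, n - f + 1, n_f)
    """
    n, _, n_c = image.shape
    f, _, f_c, n_f = filters.shape
    assert n_c == f_c, "kernel depth must match the number of image channels"
    out = n - f + 1
    result = np.zeros((out, out, n_f))
    for i in range(out):
        for j in range(out):
            patch = image[i:i + f, j:j + f, :]  # f x f x n_c window
            # Sum over height, width and channels for every filter at once.
            result[i, j, :] = np.tensordot(patch, filters, axes=3)
    return result

rng = np.random.default_rng(0)
image = rng.standard_normal((6, 6, 3))
one_filter = rng.standard_normal((3, 3, 3, 1))
two_filters = rng.standard_normal((3, 3, 3, 2))
print(conv_volume(image, one_filter).shape)   # (4, 4, 1)
print(conv_volume(image, two_filters).shape)  # (4, 4, 2)
```

Each filter spans all input channels and produces one output channel, which is why the output depth equals the number of filters rather than the input depth.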

    import numpy as np

    x = [6, 2]
    h = [1, 2, 5, 4]

    # full method: zero padding on both sides
    np.convolve(x, h, "full")   # array([ 6, 14, 34, 34,  8])

    # same method: no zero padding at end
    np.convolve(x, h, "same")   # array([ 6, 14, 34, 34])

    # valid method: no zero padding
    np.convolve(x, h, "valid")  # array([14, 34, 34])


    import tensorflow as tf  # written against the TensorFlow 1.x graph API

    # Building graph
    # input shape:  [batch, height, width, channels]
    # filter shape: [height, width, in_channels, out_channels]
    input = tf.Variable(tf.random_normal([1, 10, 10, 1]))
    filter = tf.Variable(tf.random_normal([3, 3, 1, 1]))
    op = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='VALID')  # output 1 x 8 x 8 x 1
    op2 = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='SAME')  # output 1 x 10 x 10 x 1

    # Initialization and session
    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init)

        print("Input \n")
        print('{0} \n'.format(input.eval()))
        print("Filter/Kernel \n")
        print('{0} \n'.format(filter.eval()))
        print("Result/Feature Map with valid positions \n")
        result = sess.run(op)
        print(result)
        print('\n')
        print("Result/Feature Map with padding \n")
        result2 = sess.run(op2)
        print(result2)

Fixed hyper-parameters:
Typical values: $f=2, s=2$
Usually no padding is used.
The number of channels (depth) is unchanged.
In deep networks: $7 \times 7 \times 1000 \Rightarrow 1 \times 1 \times 1000$
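
Reading these notes as a description of a pooling layer (an assumption, since the layer is not named here), the hyper-parameters can be sketched in NumPy. The helper `max_pool` is hypothetical; it uses no padding, has no learnable parameters, and pools each channel independently.

```python
import numpy as np

def max_pool(volume, f=2, s=2):
    """Max pooling without padding, applied per channel (hypothetical helper)."""
    n_h, n_w, n_c = volume.shape
    out_h = (n_h - f) // s + 1
    out_w = (n_w - f) // s + 1
    pooled = np.zeros((out_h, out_w, n_c), dtype=volume.dtype)
    for i in range(out_h):
        for j in range(out_w):
            window = volume[i * s:i * s + f, j * s:j * s + f, :]
            pooled[i, j, :] = window.max(axis=(0, 1))  # depth is preserved
    return pooled

volume = np.arange(4 * 4 * 3).reshape(4, 4, 3)
print(max_pool(volume).shape)  # (2, 2, 3)

# Pooling over the full 7 x 7 window collapses each channel to a single value.
deep = np.ones((7, 7, 1000))
print(max_pool(deep, f=7, s=1).shape)  # (1, 1, 1000)
```

Replacing `window.max` with `window.mean` would give average pooling with the same output shapes.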