Convolutional neural network

Author: Yann LeCun

Applications:

Convolution operation

Example with zero padding:

Given

x[i] = [6,2]
h[i] = [1,2,5,4]

With zero padding and the filter x flipped (without the flip, the operation would be cross-correlation).

First step:

[2  6]
 |  |
 V  V
 0 [1 2 5 4]
= 2 * 0 + 6 * 1 = 6

Second step:

  [2  6]  
   |  |  
   V  V  
0 [1  2  5  4]  
= 2 * 1 + 6 * 2 = 14 (the arrows represent the connection between the kernel and the input)

Third step:

     [2  6]  
      |  |  
      V  V  
0 [1  2  5  4]  
= 2 * 2 + 6 * 5 = 34

Fourth step:

        [2  6]
         |  |
         V  V
0 [1  2  5  4]  
= 2 * 5 + 6 * 4 = 34

Fifth step:

           [2  6]
            |  |
            V  V
0 [1  2  5  4] 0  
= 2 * 4 + 6 * 0 = 8

The result of the convolution for this case, listing all the steps, would then be: Y = [6 14 34 34 8]
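The five steps above can be reproduced with a short flip-and-slide loop. This is a minimal sketch in plain Python; the function name `conv1d_full` is made up for illustration and does not come from any library.

```python
def conv1d_full(x, h):
    """Full 1D convolution of filter x with signal h (flip x, pad h, slide)."""
    k = len(x)
    flipped = list(x)[::-1]                            # [6, 2] -> [2, 6]
    padded = [0] * (k - 1) + list(h) + [0] * (k - 1)   # zero padding on both sides
    # One dot product per position of the flipped filter over the padded signal.
    return [
        sum(f * v for f, v in zip(flipped, padded[i:i + k]))
        for i in range(len(padded) - k + 1)
    ]


x = [6, 2]
h = [1, 2, 5, 4]
print(conv1d_full(x, h))  # [6, 14, 34, 34, 8]
```

Each list element corresponds to one of the steps drawn above, from the first (only the left zero and 1 covered) to the fifth (only 4 and the right zero covered).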

Result size

$n \times n$ image, $f \times f$ kernel: $(n-f+1) \times (n-f+1)$ result (valid padding = no padding)

With padding:

For padding $p$: $(n+2p-f+1) \times (n+2p-f+1)$

Same padding: $p=(f-1)/2$ (the output has the same size as the input; $f$ is usually odd)

With strides

For stride $s$: $\left(\lfloor (n+2p-f)/s \rfloor + 1\right) \times \left(\lfloor (n+2p-f)/s \rfloor + 1\right)$
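The size formulas in this section can be checked with a small helper. This is only a sketch: the names `conv_output_size` and `same_padding` are hypothetical, and it assumes the common convention of rounding down when the stride does not divide evenly.

```python
def conv_output_size(n, f, p=0, s=1):
    """Side length of the result for an n x n input, f x f kernel, padding p, stride s."""
    return (n + 2 * p - f) // s + 1   # floor division: assumed rounding convention


def same_padding(f):
    """Padding that keeps the output size equal to the input size (stride 1, odd f)."""
    return (f - 1) // 2


print(conv_output_size(6, 3))                     # valid padding: 4
print(conv_output_size(6, 3, p=same_padding(3)))  # same padding: 6
print(conv_output_size(7, 3, s=2))                # stride 2: 3
```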

With Volumes:

$6 \times 6 \times 3$ * $3 \times 3 \times 3$ = $4 \times 4$ (one filter; its depth must match the number of channels of the input)

With $n_c'$ filters the result is $(n-f+1) \times (n-f+1) \times n_c'$.

Use many filters to detect multiple features.
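The volume case can be sketched in NumPy as follows. The function `conv_volume` and the array layouts are assumptions made for this example; like most deep learning frameworks it does not flip the filters, so strictly speaking it computes a cross-correlation, with valid padding and stride 1.

```python
import numpy as np


def conv_volume(image, filters):
    """image: (n, n, n_c); filters: (f, f, n_c, n_filters). Valid padding, stride 1."""
    n, _, n_c = image.shape
    f, _, f_c, n_filters = filters.shape
    assert n_c == f_c, "filter depth must match the number of input channels"
    out = n - f + 1
    result = np.zeros((out, out, n_filters))
    for i in range(out):
        for j in range(out):
            patch = image[i:i + f, j:j + f, :]   # (f, f, n_c) window of the input
            # Sum over height, width and channels, once per filter.
            result[i, j, :] = np.tensordot(patch, filters, axes=3)
    return result


rng = np.random.default_rng(0)
image = rng.normal(size=(6, 6, 3))
filters = rng.normal(size=(3, 3, 3, 2))    # two 3 x 3 x 3 filters
print(conv_volume(image, filters).shape)   # (4, 4, 2)
```

Each filter collapses the three input channels into a single $4 \times 4$ map; stacking the maps gives the output depth $n_c'$.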

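In Python with NumPy

import numpy as np

x = np.array([6, 2])
h = np.array([1, 2, 5, 4])

# full: every position where the two signals overlap (zero padding on both sides)
np.convolve(x, h, "full")
# array([ 6, 14, 34, 34,  8])
# same: output has the length of the longer input (here: no zero padding at the end)
np.convolve(x, h, "same")
# array([ 6, 14, 34, 34])
# valid: only positions where the signals overlap completely (no zero padding)
np.convolve(x, h, "valid")
# array([14, 34, 34])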
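The earlier remark about flipping the filter can also be checked numerically. The sketch below uses `np.correlate` to show that convolution equals cross-correlation with a flipped filter, and that skipping the flip gives a different result; the variable names are chosen only for this example.

```python
import numpy as np

x = np.array([6, 2])
h = np.array([1, 2, 5, 4])

conv = np.convolve(x, h, "full")
corr_flipped = np.correlate(h, x[::-1], "full")   # filter flipped: same as convolution
corr_plain = np.correlate(h, x, "full")           # no flip: cross-correlation

print(conv.tolist())          # [6, 14, 34, 34, 8]
print(corr_flipped.tolist())  # [6, 14, 34, 34, 8]
print(corr_plain.tolist())    # [2, 10, 22, 38, 24]
```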

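In TensorFlow

# This example uses the TensorFlow 1.x graph/session API.
# Importing it through the compat module lets it run under TensorFlow 2.x as well.
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()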
 
#Building graph
 
input = tf.Variable(tf.random_normal([1,10,10,1]))
filter = tf.Variable(tf.random_normal([3,3,1,1]))
op = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='VALID')
op2 = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='SAME')
 
#Initialization and session
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
 
    print("Input \n")
    print('{0} \n'.format(input.eval()))
    print("Filter/Kernel \n")
    print('{0} \n'.format(filter.eval()))
    print("Result/Feature Map with valid positions \n")
    result = sess.run(op)
    print(result)
    print('\n')
    print("Result/Feature Map with padding \n")
    result2 = sess.run(op2)
    print(result2)

Max Pooling

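Fixed hyper-parameters (nothing is learned): filter size $f$ and stride $s$.

Typical values: $f=2, s=2$

Usually no padding is used.

The number of channels (depth) stays the same: pooling is applied to each channel independently.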
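A minimal NumPy sketch of max pooling with the typical values $f=2, s=2$ and no padding. The function name `max_pool` and the (height, width, channels) layout are assumptions made for this example.

```python
import numpy as np


def max_pool(volume, f=2, s=2):
    """Max pooling over a (height, width, channels) volume, no padding."""
    n_h, n_w, n_c = volume.shape
    out_h = (n_h - f) // s + 1
    out_w = (n_w - f) // s + 1
    pooled = np.zeros((out_h, out_w, n_c), dtype=volume.dtype)
    for i in range(out_h):
        for j in range(out_w):
            window = volume[i * s:i * s + f, j * s:j * s + f, :]
            pooled[i, j, :] = window.max(axis=(0, 1))   # per-channel maximum
    return pooled


volume = np.arange(16).reshape(4, 4, 1)
print(max_pool(volume)[:, :, 0].tolist())  # [[5, 7], [13, 15]]
```

The height and width are halved while the channel count is left untouched.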

Average Pooling

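In deep networks it is sometimes used to collapse a whole feature map into a single value per channel: $7 \times 7 \times 1000$ ⇒ $1 \times 1 \times 1000$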
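The reduction from 7x7x1000 to 1x1x1000 corresponds to averaging over the two spatial axes. A short NumPy sketch, with random data standing in for real feature maps:

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.random((7, 7, 1000))   # placeholder feature maps, not real activations

# Average over height and width; keepdims keeps the 1 x 1 spatial axes.
pooled = features.mean(axis=(0, 1), keepdims=True)
print(pooled.shape)  # (1, 1, 1000)
```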

Winning competitions