Convolutional Neural Networks

UMaine COS 470/570 – Introduction to AI

Spring 2019

Created: 2019-04-29 Mon 21:45

Convolutional neural networks

  • One of the major kinds of ANNs in use
  • One of the reasons deep learning is so successful:
    • addresses computational tractability
    • addresses vanishing/exploding gradient problem
  • Started in the 1980s
  • First truly successful modern version: LeNet (LeCun, 1989)
  • LeNet-5 (LeCun, 1998): 7-layer CNN for reading numbers on checks

The problem

  • Goal: High-accuracy image recognition
  • Standard supervised learning with deep (fully-connected) networks:
    • Images require a connection from each pixel to each first-layer neuron
      • E.g., a 1028 × 768 image ⇒ 789,504 weights per neuron
      • Slow to train
      • Vanishing/exploding gradient problem
    • Also no spatial locality exploited
  • Can we take inspiration from biological vision systems?

Human visual system

lisa-analysis.png

Image credit: user Clock, CC BY-SA 3.0, via Wikimedia Commons

Convolutional layers

  • Instead of fully-connected layer, think of using a layer whose neurons each have a receptive field:

    conv-layer-conceptual.png

  • Overlapping receptive fields
  • Neurons then learn local features and have only a few weights each
  • Each convolutional layer typically has multiple feature-detecting filters, one per feature map
  • Problem:
    • Features should be location-independent
    • ⇒ Weights for nodes should be shared, learned together

Shared weights

  • So—how to compute the layer?
  • For an \(n \times n\) receptive field:
    • \(n\times n\) weights
    • If input layer is \(m \times m\), hidden layer is \((m-n+1) \times (m-n+1)\)
    • For hidden layer neuron at \(x,\ y\), activation is: \[\sigma(b + \sum_{i=0}^{n-1}\sum_{j=0}^{n-1} w_{i,j} a_{x+i, y+j})\]
  • Slide the kernel across and down the image by some stride
  • Shared weights \(w_{i,j}\) and bias \(b\) = the kernel or filter
  • Hidden layer = feature map
  • Update the shared weights based on the loss computed over the entire hidden layer (feature map)
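As a concrete (if inefficient) sketch, the activation formula above can be implemented directly in NumPy. The function names here are my own, and a real framework would use a vectorized convolution rather than explicit loops:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feature_map(a, w, b):
    """Compute one feature map: slide an n x n kernel w (stride 1) over
    an m x m input a, adding bias b and applying the sigmoid, exactly as
    in sigma(b + sum_ij w_ij * a_{x+i, y+j})."""
    m, n = a.shape[0], w.shape[0]
    out = np.empty((m - n + 1, m - n + 1))
    for x in range(m - n + 1):
        for y in range(m - n + 1):
            out[x, y] = sigmoid(b + np.sum(w * a[x:x+n, y:y+n]))
    return out

a = np.random.rand(28, 28)   # e.g., an MNIST-sized input
w = np.random.rand(5, 5)     # 5 x 5 kernel: only 25 shared weights
h = feature_map(a, w, 0.0)
print(h.shape)               # (24, 24), i.e., (28-5+1) x (28-5+1)
```

Note that the kernel has only 25 shared weights (plus one bias) regardless of the input size, versus one weight per pixel for each fully-connected neuron.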

Why convolutional layer?

  • Learns local spatial features of input
  • Location-independent (location-invariant)
  • Example (from Nielsen, M: Neural Networks and Deep Learning):

    net_full_layer_0.png

  • Typically \(> 1\) feature map/layer ⇒ learn different kinds of features

Pooling layers

  • Convolutional layers are coupled with pooling layers
  • Each node of pooling layer connected to some \(i\times j\) region of feature map

    pooling-layer.png

Pooling layers

  • Pool based on some function—max, average, etc.

    Max_pooling.png

    (Aphex34 [CC BY-SA 4.0], via Wikimedia Commons)

  • Purpose(s):
    • Reduce # weights needed
    • Blur/average/smooth feature map
    • Determine whether a feature is present in a particular region (rather than exactly where)
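A minimal NumPy sketch of non-overlapping max pooling (the helper name is my own, not a library call):

```python
import numpy as np

def max_pool(fm, p=2):
    """p x p non-overlapping max pooling of a square feature map."""
    m = fm.shape[0] // p * p                        # drop any ragged edge
    blocks = fm[:m, :m].reshape(m // p, p, m // p, p)
    return blocks.max(axis=(1, 3))                  # max within each block

fm = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool(fm))   # max of each 2 x 2 block: [[5, 7], [13, 15]]
```

Average pooling is the same computation with `.mean(axis=(1, 3))` in place of the max.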

Using the features

  • Pooled layers’ output ⇒ fully-connected layer – e.g., for MNIST:

    1conv-layer.png

    (From Nielsen)

  • Learn configuration of features
  • Could have multiple fully-connected layers, too

Learning in CNNs

  • Backpropagation learning, gradient descent
  • Equations for fully-connected nets have to be modified, though
  • Theano, TensorFlow, PyTorch – all have support for training CNNs

Multiple convolutional layers

  • LeNet-5:
    • 7 layers
    • Recognize numbers on checks

      LeNet-5.png

  • Recall that the DQN we discussed used CNNs
  • Many additional variants of CNNs now
  • ResNet: 152 layers, general image recognition, lots of additions to LeNet’s basic architecture
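ResNet's central addition is the residual (skip) connection, which lets very deep networks train without vanishing gradients. A tiny NumPy sketch of the idea (not ResNet's actual convolutional blocks), assuming the input and output shapes match:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def residual_block(x, W1, W2):
    """Sketch of ResNet's key idea: the block learns a residual f(x)
    and adds the input back through a skip connection."""
    f = relu(x @ W1) @ W2        # two-layer transformation f(x)
    return relu(x + f)           # output = relu(x + f(x))

rng = np.random.default_rng(0)
x = rng.random(8)
W1, W2 = rng.random((8, 8)), rng.random((8, 8))
y = residual_block(x, W1, W2)
print(y.shape)   # (8,)
```

With all weights zero the block reduces to the identity on a non-negative input, which is why stacking many such blocks stays trainable: each block only needs to learn a correction to the identity.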

Feature detection in CNNs

  • From ConvNet:

    feature-extraction.png

Progress in image recognition competition

progression-of-cnn-layers.png

(Aphex34 [CC BY-SA 4.0], via Wikimedia Commons)

Your turn

  1. Build a CNN
    • Get into groups, one of whom has a laptop with Keras on it
    • Create a simple CNN for MNIST
  2. Explain a CNN
    • Get into groups with at least 2 laptops
    • Part of group: Look up an “inception” layer in (e.g.) GoogLeNet
    • Other part: Look up ResNet
    • Explain them to each other after a few minutes