Convolutional Neural Networks
UMaine COS 470/570 – Introduction to AI
Spring 2019
Created: 2019-04-29 Mon 21:45
Convolutional neural networks
- One of the major kinds of ANNs in use
- One of the reasons deep learning is so successful:
- addresses computational tractability
- addresses vanishing/exploding gradient problem
- Origins in the 1980s
- First truly successful modern version: LeNet (LeCun, 1989)
- LeNet-5 (LeCun, 1998): 7-layer CNN for reading numbers on checks
The problem
- Goal: High-accuracy image recognition
- Standard supervised learning with deep (fully-connected) networks:
- Images require connections from each pixel → each neuron
- E.g., 1028 × 768 image ⇒ about 789,504 weights per neuron
- Slow to train
- Vanishing/exploding gradient problem
- Fully-connected layers also exploit no spatial locality
- Can we take inspiration from biological vision systems?
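The scale of the problem is easy to check directly. A quick sketch of the parameter counts (the 5×5 kernel size here is an illustrative assumption, not from the slides):

```python
# Weights for ONE fully-connected neuron over the image from the slide
width, height = 1028, 768
fc_weights_per_neuron = width * height
print(fc_weights_per_neuron)        # 789504

# A shared convolutional kernel, by contrast, is reused at every position
kernel = 5                          # assumed 5x5 receptive field
conv_params = kernel * kernel + 1   # shared weights + one bias
print(conv_params)                  # 26
```

The same 26 parameters produce an entire feature map, which is the tractability win the slides refer to.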
Human visual system
Image credit: user Clock, CC BY-SA 3.0, via Wikimedia Commons
Shared weights
- So—how to compute the layer?
- For an \(n \times n\) receptive field:
- \(n \times n\) weights
- If the input layer is \(m \times m\), the hidden layer is \((m-n+1) \times (m-n+1)\)
- For hidden layer neuron at \(x,\ y\), activation is:
\[\sigma\left(b + \sum_{i=0}^{n-1}\sum_{j=0}^{n-1} w_{i,j}\, a_{x+i,\,y+j}\right)\]
- Slide the kernel across and down the image by some stride
- The shared weights \(w_{i,j}\) (together with the bias \(b\)) = the kernel or filter
- Hidden layer = feature map
- Because the weights are shared, each update sums the gradient contributions from every position in the feature map
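The activation formula above translates directly into NumPy (stride 1, "valid" borders; the logistic sigmoid stands in for \(\sigma\)):

```python
import numpy as np

def feature_map(a, w, b):
    """Activation at (x, y) = sigma(b + sum_ij w[i,j] * a[x+i, y+j]),
    i.e. the slide's formula applied at every valid position."""
    m, n = a.shape[0], w.shape[0]
    out = np.empty((m - n + 1, m - n + 1))
    for x in range(m - n + 1):
        for y in range(m - n + 1):
            out[x, y] = b + np.sum(w * a[x:x+n, y:y+n])
    return 1.0 / (1.0 + np.exp(-out))    # logistic sigmoid

a = np.arange(36, dtype=float).reshape(6, 6)   # toy 6x6 "image"
w = np.ones((3, 3)) / 9.0                      # toy 3x3 averaging kernel
print(feature_map(a, w, b=0.0).shape)          # (4, 4), i.e. (m-n+1, m-n+1)
```

Note the output size matches the \((m-n+1) \times (m-n+1)\) count: a 6×6 input with a 3×3 kernel yields a 4×4 feature map.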
Pooling layers
Pool based on some function—max, average, etc.
(Aphex34 [CC BY-SA 4.0], via Wikimedia Commons)
- Purpose(s):
- Reduce # weights needed
- Blur/average/smooth feature map
- Determine whether a feature is present somewhere in a region (rough translation invariance)
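Max pooling, for instance, reduces each non-overlapping block to its maximum (a NumPy sketch; 2×2 blocks are assumed):

```python
import numpy as np

def max_pool(fmap, size=2):
    """Non-overlapping max pooling: each size x size block of the
    feature map is reduced to its largest value."""
    m = fmap.shape[0] // size
    # Split rows and columns into (block, within-block) axes, then max them
    return fmap[:m*size, :m*size].reshape(m, size, m, size).max(axis=(1, 3))

fmap = np.array([[1., 2., 0., 1.],
                 [3., 4., 1., 0.],
                 [0., 0., 2., 1.],
                 [1., 0., 0., 3.]])
print(max_pool(fmap))   # [[4. 1.]
                        #  [1. 3.]]
```

A 4×4 feature map shrinks to 2×2, quartering the inputs the next layer must handle.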
Learning in CNNs
- Backpropagation learning, gradient descent
- Equations for fully-connected nets have to be modified, though
- Theano, TensorFlow, PyTorch – all have support for training CNNs
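In Keras, for example, a small CNN is a few lines (layer sizes here are illustrative assumptions, not LeNet-5's exact architecture):

```python
# Minimal Keras CNN sketch: conv -> pool -> fully-connected classifier
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),                    # MNIST-sized grayscale input
    layers.Conv2D(8, kernel_size=3, activation="relu"), # 8 feature maps, 3x3 kernels
    layers.MaxPooling2D(pool_size=2),                   # 2x2 max pooling
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),             # 10 digit classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```

The framework handles the modified backpropagation equations; you only declare the layer structure.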
Multiple convolutional layers
- LeNet-5: alternating convolutional and pooling (subsampling) layers, followed by fully-connected layers
- Recall that the DQN we discussed used CNNs
- Many additional variants of CNNs now
- ResNet: 152 layers, general image recognition, lots of additions to LeNet’s basic architecture
Feature detection in CNNs
(Figure: feature visualizations from a trained ConvNet)
Your turn
- Build a CNN
- Get into groups, one of whom has a laptop with Keras on it
- Create a simple CNN for MNIST
- Explain a CNN
- Get into groups with at least 2 laptops
- Part of the group: Look up an “inception” layer in (e.g.) GoogLeNet
- Other part: Look up ResNet
- Explain them to each other after a few minutes