Reading:
All artificial neural networks can be viewed as combinations of some kind of "perceptron" arranged in various architectures, perhaps with small changes to the perceptron's activation function.
A key to a given architecture's success (or lack thereof) is the existence of a training algorithm.
(also a lot of trial and error; other ideas may have been easy to train but just didn't work well)
Note that in the diagram the images are 6 × 6 pixels, so the neural network on the left should contain 36 neurons (and 648 connections)
Hebb’s rule: for each training image, the weight between two neurons is increased if the corresponding pixels are both on or both off, but decreased if one pixel is on and the other is off.
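A minimal sketch of this rule, assuming each training image is flattened to a vector of +1 (on) / −1 (off) pixels: the outer product x xᵀ is +1 where two pixels agree and −1 where they differ, so accumulating it implements exactly the increase/decrease described above (self-connections are zeroed out).

```python
import numpy as np

def hebb_train(images):
    """Hebb's rule: accumulate pairwise pixel agreements over all training images.

    images: array of shape (n_images, n_pixels) with entries +1 (on) or -1 (off).
    Returns a symmetric weight matrix with zero diagonal (no self-connections).
    """
    n_pixels = images.shape[1]
    W = np.zeros((n_pixels, n_pixels))
    for x in images:
        W += np.outer(x, x)        # +1 if two pixels agree, -1 if they differ
    np.fill_diagonal(W, 0.0)       # a neuron is not connected to itself
    return W / len(images)

def recall(W, x, n_steps=10):
    """Iteratively update neurons to denoise/complete a corrupted pattern."""
    x = x.copy()
    for _ in range(n_steps):
        x = np.sign(W @ x)
        x[x == 0] = 1              # break ties arbitrarily
    return x
```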
Neurons separated into two groups: visible units (receive inputs and produce outputs) and hidden units
Image correction/denoising
Generative model for classification.
A Boltzmann machine in which there are no connections between visible units or between hidden units, only between visible and hidden units
A very efficient training algorithm, Contrastive Divergence, was introduced in 2005 by Miguel Á. Carreira-Perpiñán and Geoffrey Hinton.
$$ \mathbf W \leftarrow \mathbf W + \eta \, (\mathbf x \, \mathbf h^T - \mathbf x' \, \mathbf h'^T) $$
where $\mathbf x$ and $\mathbf h$ are the visible and hidden activations obtained from a training instance, and $\mathbf x'$ and $\mathbf h'$ are the activations obtained after reconstructing the visible units from $\mathbf h$.
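A minimal sketch of one such update (CD-1) for a binary RBM, assuming sigmoid units and omitting bias terms for brevity; x and h come from the data (positive phase), while x' and h' come from a one-step reconstruction (negative phase).

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_step(W, x, eta=0.01):
    """One contrastive-divergence (CD-1) update for a binary RBM (biases omitted).

    W: (n_visible, n_hidden) weight matrix
    x: (n_visible,) binary input vector
    """
    # Positive phase: hidden activations driven by the data.
    h_prob = sigmoid(x @ W)
    h = (rng.random(h_prob.shape) < h_prob).astype(float)

    # Negative phase: reconstruct the visibles, then re-infer the hiddens.
    x_prob = sigmoid(W @ h)
    x_neg = (rng.random(x_prob.shape) < x_prob).astype(float)
    h_neg = sigmoid(x_neg @ W)

    # W <- W + eta (x h^T - x' h'^T)
    W += eta * (np.outer(x, h_prob) - np.outer(x_neg, h_neg))
    return W
```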
Combine the data with the labels as inputs to the first RBM (see the stacking sketch below)
Motivation - a baby: it learns mostly without labels, and needs only a few labeled examples to name what it has already learned to recognize.
A benefit of the semi-supervised approach is that you don't need much labeled training data.
If the unsupervised RBMs do a good enough job, then only a small amount of labeled training instances per class will be necessary.
A DBN can also be run generatively: activate one of the label units and propagate the signal down through the network to produce a new instance. This new instance will usually look like a regular instance of the class whose label unit you activated.
Due to the stochastic nature of RBMs and DBNs, the caption will keep changing randomly, but it will generally be appropriate for the image. If you generate a few hundred captions, the most frequently generated ones will likely be a good description of the image.
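A minimal sketch of the greedy layer-wise stacking described above, assuming one-hot labels are simply concatenated with the data for the first RBM and each RBM is trained with the CD-1 rule sketched earlier (biases again omitted); each trained RBM's hidden activations become the next RBM's input.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def train_rbm(data, n_hidden, eta=0.01, n_epochs=5):
    """Train one RBM with CD-1 (biases omitted for brevity); returns its weights."""
    W = 0.01 * rng.standard_normal((data.shape[1], n_hidden))
    for _ in range(n_epochs):
        for x in data:
            h_prob = sigmoid(x @ W)
            h = (rng.random(n_hidden) < h_prob).astype(float)
            x_neg = (rng.random(data.shape[1]) < sigmoid(W @ h)).astype(float)
            W += eta * (np.outer(x, h_prob) - np.outer(x_neg, sigmoid(x_neg @ W)))
    return W

def train_dbn(data, labels_onehot, layer_sizes):
    """Greedy layer-wise pretraining: data (+ labels) -> RBM1 -> RBM2 -> ..."""
    layer_input = np.hstack([data, labels_onehot])   # labels joined with the data
    weights = []
    for n_hidden in layer_sizes:
        W = train_rbm(layer_input, n_hidden)
        weights.append(W)
        layer_input = sigmoid(layer_input @ W)       # hidden activations feed the next RBM
    return weights
```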
Unsupervised: works by having all the neurons compete against each other.
This algorithm tends to make nearby neurons gradually specialize in similar inputs.
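A minimal sketch of one common form of this competitive learning (a self-organizing-map-style update, which is an assumption here), with the neurons laid out on a line: the neuron closest to the input wins, and it and its neighbors are pulled toward that input, which is what makes nearby neurons specialize in similar inputs.

```python
import numpy as np

def competitive_learning(data, n_neurons=10, eta=0.1, sigma=1.0, n_epochs=20):
    """Winner-take-all learning with a neighborhood (self-organizing-map style).

    data: (n_samples, n_features); returns one weight vector per neuron.
    """
    rng = np.random.default_rng(0)
    W = rng.standard_normal((n_neurons, data.shape[1]))
    positions = np.arange(n_neurons)                 # neurons laid out on a line
    for _ in range(n_epochs):
        for x in data:
            winner = np.argmin(np.linalg.norm(W - x, axis=1))   # neuron that "wins" the competition
            # Neurons near the winner (on the line) are also nudged toward x.
            neighborhood = np.exp(-((positions - winner) ** 2) / (2 * sigma ** 2))
            W += eta * neighborhood[:, None] * (x - W)
    return W
```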
Use a random network to generate a collection of pseudorandom sequences
Use a regression layer at the output to form the desired waveform by linear combination
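A minimal sketch of this setup in the reservoir-computing style (an assumed reading of these two bullets): a fixed random recurrent network is driven by an input to produce many pseudorandom traces, and only the regression layer is fit, here by least squares, so the desired waveform is a linear combination of those traces.

```python
import numpy as np

rng = np.random.default_rng(0)

# A fixed random recurrent network ("reservoir") generates many pseudorandom traces.
n_neurons, n_steps = 200, 1000
W_res = rng.standard_normal((n_neurons, n_neurons)) / np.sqrt(n_neurons)
W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))   # keep the dynamics stable
w_in = rng.standard_normal(n_neurons)

u = np.sin(np.linspace(0, 20 * np.pi, n_steps))            # driving input
states = np.zeros((n_steps, n_neurons))
for t in range(1, n_steps):
    states[t] = np.tanh(W_res @ states[t - 1] + w_in * u[t])

# Desired waveform, e.g. a squared version of the input.
target = u ** 2

# Regression (readout) layer: fit a linear combination of the traces to the target.
w_out, *_ = np.linalg.lstsq(states, target, rcond=None)
prediction = states @ w_out
```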
All can be viewed as combinations of "perceptrons" in networks, perhaps with small changes to the perceptron in terms of the activation function.
A key to their success is the existence of a training algorithm.
(also a lot of trial and error; other ideas may have been easy to train but just didn't work well)
What are the inputs and outputs?
How might you use this for multi-class classification?
An MLP is composed of one (passthrough) input layer, one or more hidden layers, and one final output layer.
Every layer except the output layer includes a bias neuron and is fully connected to the next layer.
When an ANN has two or more hidden layers, it is called a deep neural network (DNN).
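A minimal sketch of such a network's forward pass, assuming sigmoid activations and an arbitrary 36-20-20-3 layout; the bias vectors play the role of the bias neurons, and each layer is fully connected to the next.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def init_layer(n_in, n_out):
    """One fully connected layer: a weight matrix plus a bias (the 'bias neuron')."""
    return rng.standard_normal((n_in, n_out)) * 0.1, np.zeros(n_out)

def forward(x, layers):
    """Propagate an input through every layer; the last layer is the output layer."""
    for W, b in layers:
        x = sigmoid(x @ W + b)
    return x

# 36 inputs -> two hidden layers (hence a deep network) -> 3 output neurons.
layers = [init_layer(36, 20), init_layer(20, 20), init_layer(20, 3)]
y = forward(rng.random(36), layers)
```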
The perceptron training algorithm doesn't extend to MLPs.
No good training algorithm for many years.