# Restricted Boltzmann Machines

- A Boltzmann machine with the restriction that the neurons form a bipartite graph: each visible unit has a symmetric connection to each hidden unit, but there are no connections within a group (e.g. no connections between hidden units).
- Shallow, two-layer structure
- Building block for deep belief nets

- Unsupervised learning
- Dimensionality reduction
- No labels needed
- Related to autoencoders (learns a compressed representation of the input)

## Programming

Units in an RBM have binary states: they're either on or off. We'll represent those states with the number 1 for the on state and the number 0 for the off state.

`visible_state_to_hidden_probabilities`

When we have the (binary) state of all visible units in an RBM, the conditional probability for each hidden unit to turn on (conditional on the states of the visible units) can be calculated: it is the logistic sigmoid of that hidden unit's total input from the visible units.
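A minimal sketch of this function, assuming the weight matrix is stored with shape `(n_hidden, n_visible)` and visible states are given as a `(n_visible, n_cases)` binary matrix (both layouts are assumptions, not specified above):

```python
import numpy as np

def visible_state_to_hidden_probabilities(rbm_w, visible_state):
    # rbm_w: weight matrix of shape (n_hidden, n_visible) -- assumed layout
    # visible_state: binary matrix of shape (n_visible, n_cases)
    # Returns p(h_j = 1 | v) for every hidden unit j and every case:
    # the logistic sigmoid of each hidden unit's total input.
    total_input = rbm_w @ visible_state
    return 1.0 / (1.0 + np.exp(-total_input))
```

With zero weights every hidden unit's total input is 0, so each conditional probability comes out as 0.5, as expected for the sigmoid.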

Contrastive Divergence gradient estimator with 1 full Gibbs update, a.k.a. CD-1

CD-1 is a sampling-based estimator: every time we calculate a conditional probability for a unit, we sample a binary state for that unit from the conditional probability (using the function `sample_bernoulli`), and then we forget about the conditional probability itself.
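A sketch of what `sample_bernoulli` might look like (the name comes from the text; this particular implementation and the use of a seeded generator are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, assumed for reproducibility

def sample_bernoulli(probabilities):
    # For each unit, draw state 1 with the given probability, else 0.
    # Compares uniform noise in [0, 1) against the probabilities.
    return (rng.random(probabilities.shape) < probabilities).astype(float)
```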

The CD-1 procedure:

1. Sample a binary state for the hidden units conditional on the data.
2. Sample a binary state for the visible units conditional on that binary hidden state (this is sometimes called the "reconstruction" for the visible units).
3. Sample a binary state for the hidden units conditional on that binary visible "reconstruction" state.

Then we base our gradient estimate on all those sampled binary states.
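The steps above can be sketched end to end. This is a hedged sketch, not a definitive implementation: it assumes biases are omitted, the `(n_hidden, n_visible)` weight layout from before, and the usual CD-1 gradient form (positive-phase statistics minus negative-phase statistics, averaged over cases):

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, an assumption

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_bernoulli(probabilities):
    # Draw state 1 with the given probability, else 0.
    return (rng.random(probabilities.shape) < probabilities).astype(float)

def cd1(rbm_w, visible_data):
    # rbm_w: (n_hidden, n_visible) weights; visible_data: binary (n_visible, n_cases)
    n_cases = visible_data.shape[1]
    # Step 1: sample hidden states conditional on the data.
    h0 = sample_bernoulli(sigmoid(rbm_w @ visible_data))
    # Step 2: sample visible states conditional on those hidden states
    # (the "reconstruction").
    v1 = sample_bernoulli(sigmoid(rbm_w.T @ h0))
    # Step 3: sample hidden states conditional on the reconstruction.
    h1 = sample_bernoulli(sigmoid(rbm_w @ v1))
    # Gradient estimate: <h v^T> on the data minus <h v^T> on the
    # reconstruction, averaged over the training cases.
    return (h0 @ visible_data.T - h1 @ v1.T) / n_cases
```

The returned matrix has the same shape as `rbm_w`, so a learning step is simply `rbm_w += learning_rate * cd1(rbm_w, batch)`.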