====== Hopfield nets ======

Architecture:

  * Recurrent connections
  * Binary threshold units

Idea: If the connections are **symmetric**, there is a **global energy function**. Each binary configuration has an energy, and the binary threshold decision rule causes the network to settle to a **minimum of this energy function**.

===== Energy function =====

**Global energy:**

$E = - \sum_i s_i b_i - \sum_{i<j} s_i s_j w_{ij}$

  * $w_{ij}$ is the connection weight between units $i$ and $j$.
  * $s_i, s_j$ are the binary states of the two neurons.
  * $b_i$ is the bias of unit $i$.

Local computation for each unit (the energy gap):

$\Delta E_i = E(s_i=0) - E(s_i=1) = b_i + \sum_j s_j w_{ij}$

(The energy function and the update rule are sketched in code at the end of this page.)

===== Storing memories =====

  * Memories could be energy minima of a neural net.
  * The binary threshold decision rule can then be used to clean up incomplete or corrupted memories.
  * Use activities of 1 and -1 to store a binary state vector, by incrementing the weight between any two units by the product of their activities. The bias is treated as a weight from a permanently-on unit.
  * $\Delta w_{ij} = s_i s_j$

===== Storage capacity =====

A net of $N$ units can store only about $0.15N$ memories (at $N$ bits per memory, that is only $0.15N^2$ bits).

  * The net has $N^2$ weights and biases.
  * After storing $M$ memories, each connection weight has an integer value in the range $[-M, M]$.
  * Number of bits required to store the weights and biases: $N^2 \log_2(2M+1)$.

===== Spurious minima limit capacity =====

Each time a new configuration is memorized, we hope to create a new energy minimum. If two nearby minima merge into one, capacity decreases.

==== Avoiding spurious minima by unlearning ====

Let the net settle from a random initial state, then do unlearning (apply the storage rule with a negative sign to whatever configuration the net settled into); this removes spurious minima.

A better storage rule: instead of trying to store vectors in one shot, cycle through the training set many times (pseudo-likelihood technique).

===== Hopfield nets with hidden units =====

Instead of memories, store interpretations of sensory input:

  * The input is represented by the visible units.
  * The interpretation is represented by the states of the hidden units.
  * The badness of the interpretation is represented by the energy.

Issues:

  * How to avoid getting trapped in poor local minima of the energy function?
  * How to learn the weights on the connections to the hidden units and between the hidden units?

===== Improve search with stochastic units =====

A Hopfield net always reduces the energy, so it gets trapped in local minima. Adding random noise helps (see the annealing sketch at the end of the page):

  * With a lot of noise it is easy to cross energy barriers.
  * Slowly reduce the noise so that the system ends up in a deep minimum (simulated annealing).

Stochastic binary units:

  * Replace the binary threshold units with binary stochastic units that make biased random decisions. A "temperature" controls the amount of noise; raising the noise level is equivalent to decreasing all the energy gaps between configurations.

$p(s_i=1) = \frac{1}{1+e^{-\Delta E_i/T}}$

===== Thermal equilibrium =====

We use thermal equilibrium at a temperature of 1. Reaching thermal equilibrium is a difficult concept: it does not mean the system settles into one configuration, but that the **probability distribution over configurations** settles down to a stationary distribution.

Intuitively: imagine a huge ensemble of systems that all have the same energy function. The probability of a configuration is just the fraction of the systems that are in that configuration.

Approaching equilibrium:

  * Start with any distribution over all the identical systems.
  * Apply the stochastic update rule to pick the next configuration for each individual system.
  * We may reach a situation where the fraction of systems in each configuration remains constant.
  * This stationary distribution is called thermal equilibrium.
  * Any given system keeps changing its configuration, but the fraction of systems in each configuration does not change.
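===== Code sketches =====

The energy function and the threshold update rule translate directly into code. Below is a minimal NumPy sketch (all function and variable names are illustrative, not from the original notes), assuming a symmetric weight matrix with zero diagonal and 0/1 states:

<code python>
import numpy as np

def energy(s, W, b):
    """Global energy E = -sum_i s_i b_i - sum_{i<j} s_i s_j w_ij.
    W must be symmetric with zero diagonal; the factor 0.5 corrects
    for each pair being counted twice in s @ W @ s."""
    return -s @ b - 0.5 * s @ W @ s

def settle(s, W, b, max_sweeps=100):
    """Apply the binary threshold decision rule (0/1 states)
    asynchronously until no unit changes: a local energy minimum."""
    s = s.copy()
    for _ in range(max_sweeps):
        changed = False
        for i in np.random.permutation(len(s)):
            gap = b[i] + W[i] @ s        # energy gap ΔE_i = E(s_i=0) - E(s_i=1)
            new = 1 if gap > 0 else 0    # turn the unit on iff that lowers E
            if new != s[i]:
                s[i], changed = new, True
        if not changed:
            break
    return s
</code>

With symmetric weights, each such update can only lower (or keep) the energy, which is why the net settles.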
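The one-shot storage rule $\Delta w_{ij} = s_i s_j$ for activities of 1 and -1, plus recall by cleaning up a corrupted pattern, might look like this (same caveats as above; for simplicity the biases are taken as 0 here):

<code python>
import numpy as np

def store(patterns):
    """One-shot Hebbian storage: increment w_ij by the product
    of the two activities, for each ±1 pattern (rows of `patterns`)."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)
    np.fill_diagonal(W, 0)               # no self-connections
    return W

def recall(probe, W, max_sweeps=100):
    """Clean up an incomplete or corrupted ±1 vector with
    binary threshold updates until the net settles."""
    s = probe.copy()
    for _ in range(max_sweeps):
        changed = False
        for i in np.random.permutation(len(s)):
            new = 1 if W[i] @ s > 0 else -1
            if new != s[i]:
                s[i], changed = new, True
        if not changed:
            break
    return s
</code>

Storing much more than about $0.15N$ patterns this way makes recall unreliable, which is the capacity limit quoted above.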
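The stochastic update rule $p(s_i=1) = 1/(1+e^{-\Delta E_i/T})$ and a simulated-annealing schedule, again as a sketch (the temperature schedule and sweep counts are arbitrary illustrative choices):

<code python>
import numpy as np

def stochastic_sweep(s, W, b, T, rng):
    """One sweep of biased random decisions at temperature T;
    as T -> 0 this becomes the deterministic threshold rule."""
    for i in rng.permutation(len(s)):
        gap = b[i] + W[i] @ s                    # ΔE_i
        p_on = 1.0 / (1.0 + np.exp(-gap / T))    # p(s_i = 1)
        s[i] = 1 if rng.random() < p_on else 0
    return s

def anneal(s, W, b, temps=(4.0, 2.0, 1.0, 0.5, 0.25), sweeps=20, seed=0):
    """Start with lots of noise (easy to cross energy barriers),
    then slowly reduce it so the system ends in a deep minimum."""
    rng = np.random.default_rng(seed)
    s = s.copy()
    for T in temps:
        for _ in range(sweeps):
            s = stochastic_sweep(s, W, b, T, rng)
    return s
</code>

Running many copies of `stochastic_sweep` at a fixed temperature is exactly the ensemble picture from the thermal-equilibrium section: the fraction of copies in each configuration converges to the stationary distribution.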
===== Boltzmann machine =====

Given: a training set of binary vectors. Fit a model that will assign a probability to every possible binary vector. This is useful for deciding whether other binary vectors come from the same distribution (e.g. to detect unusual behaviour).
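For a tiny, fully visible net, the probability the model assigns to each binary vector can be computed by brute-force enumeration (a sketch only: the cost is exponential in the number of units, and hidden units would additionally have to be summed out):

<code python>
import itertools
import numpy as np

def boltzmann_probs(W, b):
    """Enumerate all 0/1 configurations and normalize e^{-E}
    by the partition function; exponential cost, illustration only."""
    n = len(b)
    configs = np.array(list(itertools.product([0, 1], repeat=n)))
    E = np.array([-s @ b - 0.5 * s @ W @ s for s in configs])
    p = np.exp(-E)
    return configs, p / p.sum()
</code>

Vectors that receive very low probability under the fitted model can then be flagged as unusual.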