data_mining:neural_network:backpropagation

==== Overfitting (How well does the network generalize?) ====
See [[data_mining:neural_network:overfitting|Overfitting & Parameter tuning]]

Possible causes of overfitting:
  * Target values unreliable?
  * Sampling errors (accidental regularities of particular training cases)

Regularization methods:
  * Weight decay (small weights, simpler model)
  * Weight sharing (several connections forced to use the same weights)
  * Early stopping (monitor performance on a held-out validation set and stop training once it gets worse)
  * Model averaging
  * Bayesian fitting (similar to model averaging)
  * Dropout (randomly omit hidden units; see the sketch after this list)
  * Generative pre-training
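
As a concrete illustration of the dropout entry above, here is a minimal NumPy sketch of randomly omitting hidden units during training; the keep probability, the array shapes, and the inverted-dropout rescaling are illustrative assumptions, not part of the original notes.

<code python>
import numpy as np

rng = np.random.default_rng(seed=0)

def dropout(a, keep_prob=0.8):
    """Randomly omit hidden units: zero each activation with probability
    1 - keep_prob and rescale the survivors (inverted dropout) so the
    expected activation is unchanged."""
    mask = rng.random(a.shape) < keep_prob  # True = keep this unit
    return (a * mask) / keep_prob

# Activations of one hidden layer: 4 units, batch of 3 examples
a_hidden = rng.standard_normal((4, 3))
print(dropout(a_hidden))   # used during training only
print(a_hidden)            # at test time the activations are left unchanged
</code>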
=== $L_1$ and $L_2$ regularization ===

== Example for logistic regression ==

$L_2$ regularization:

For logistic regression, add to the cost function $J$: $\dots + \frac{\lambda}{2m} ||w||^2_2$, where $||w||^2_2 = \sum_{j=1}^{n_x} w_j^2 = w^T w$.

$L_1$ regularization:

$\frac{\lambda}{2m} ||w||_1$

$w$ will be sparse.

Use a hold-out (dev) set to choose the regularization hyperparameter $\lambda$.
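
A minimal sketch of the $L_2$-regularized logistic-regression cost above, assuming $X$ has shape $(n_x, m)$; the variable names and data are purely illustrative.

<code python>
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_l2(w, b, X, y, lam):
    """Cross-entropy cost J plus the L2 penalty (lambda / 2m) * ||w||_2^2."""
    m = X.shape[1]
    y_hat = sigmoid(w.T @ X + b)                    # predictions, shape (1, m)
    cross_entropy = -np.sum(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat)) / m
    l2_penalty = (lam / (2 * m)) * np.sum(w ** 2)   # = (lambda / 2m) * w^T w
    return cross_entropy + l2_penalty

# Tiny illustrative example: n_x = 2 features, m = 4 examples
X = np.array([[0.5, -1.0, 1.5, 0.2],
              [1.0,  0.3, -0.5, 2.0]])
y = np.array([[1, 0, 1, 0]])
w = np.zeros((2, 1))
print(cost_l2(w, b=0.0, X=X, y=y, lam=0.7))
</code>

For $L_1$ regularization the penalty term would instead be (lam / (2 * m)) * np.sum(np.abs(w)).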
== Neural Network ==

Cost function:

$J(W^{[1]}, b^{[1]}, \dots, W^{[L]}, b^{[L]}) = \frac{1}{m} \sum_{i=1}^m L(\hat{y}^{(i)}, y^{(i)}) + \frac{\lambda}{2m} \sum_{l=1}^L ||W^{[l]}||_F^2$

Frobenius norm: $||W^{[l]}||_F^2 = \sum_{i=1}^{n^{[l]}} \sum_{j=1}^{n^{[l-1]}} (W_{ij}^{[l]})^2$

For gradient descent:

$dW^{[l]} = \dots + \frac{\lambda}{m} W^{[l]}$

Called **weight decay**: the update $W^{[l]} := W^{[l]} - \alpha \, dW^{[l]}$ multiplies the weights by the factor $(1 - \frac{\alpha \lambda}{m})$ before the usual gradient step, so they shrink a little on every iteration.

Large $\lambda$: the weights and hence $z$ stay in a small range of values where the activation function (e.g. tanh) is roughly linear, so every layer behaves approximately linearly.
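
A minimal sketch of how the Frobenius penalty and the weight-decay term in $dW^{[l]}$ above can be added to an existing implementation; keeping the weights and gradients in dictionaries keyed by layer index is an assumption made for illustration.

<code python>
import numpy as np

def add_l2_to_cost(cost, weights, lam, m):
    """Add (lambda / 2m) * sum_l ||W^[l]||_F^2 to the unregularized cost."""
    frob = sum(np.sum(W ** 2) for W in weights.values())  # squared Frobenius norms
    return cost + (lam / (2 * m)) * frob

def add_weight_decay(grads, weights, lam, m):
    """Add the weight-decay term (lambda / m) * W^[l] to each dW^[l]."""
    return {l: dW + (lam / m) * weights[l] for l, dW in grads.items()}

# Illustrative two-layer network
weights = {1: np.ones((3, 2)), 2: np.ones((1, 3))}
grads   = {1: np.zeros((3, 2)), 2: np.zeros((1, 3))}  # dW^[l] from backpropagation
m, lam = 10, 0.5

print(add_l2_to_cost(0.8, weights, lam, m))          # 0.8 + (0.5 / 20) * 9 = 1.025
print(add_weight_decay(grads, weights, lam, m)[1])   # every entry equals 0.5 / 10 = 0.05
</code>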
==== History of backpropagation ====
  