data_mining:neural_network:backpropagation

  
==== Overfitting (How well does the network generalize?) ====
See also [[data_mining:neural_network:overfitting|Overfitting & Parameter tuning]].

Sources of overfitting:
  * Target values may be unreliable
  * Sampling error (accidental regularities of the particular training cases)

Regularization methods:
  * Weight decay (small weights, simpler model)
  * Weight sharing (constrain groups of weights to be equal)
  * Early stopping (monitor a held-out validation set; stop training when its performance gets worse)
  * Model averaging
  * Bayesian fitting (like model averaging)
  * Dropout (randomly omit hidden units; see the sketch after this list)
  * Generative pre-training

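A minimal sketch of the dropout idea (inverted dropout), assuming NumPy; the function and array names are illustrative, not from any particular library:

<code python>
import numpy as np

def dropout_forward(a, keep_prob=0.8, training=True):
    """Randomly omit hidden units (inverted dropout) -- illustrative sketch.

    a         : activations of one hidden layer, shape (n_units, n_examples)
    keep_prob : probability that a unit is kept active
    """
    if not training:
        return a  # no units are dropped at test time
    mask = np.random.rand(*a.shape) < keep_prob   # True = keep the unit, False = drop it
    return (a * mask) / keep_prob                 # rescale so the expected activation is unchanged

# Illustrative usage on random activations
a_hidden = np.random.randn(4, 5)
a_dropped = dropout_forward(a_hidden, keep_prob=0.8)
</code>

Dividing by keep_prob ("inverted" dropout) keeps the expected activation unchanged, so nothing needs to be rescaled at test time.
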
=== Weight decay ===

== Example for logistic regression ==

$L_2$ regularization:

E.g. for logistic regression, add to the cost function $J$: $\dots + \frac{\lambda}{2m} ||w||^2_2$, where $||w||^2_2 = \sum_{j=1}^{n_x} w_j^2 = w^T w$.

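A minimal sketch of the $L_2$-regularized cost for logistic regression, assuming NumPy, with $X$ of shape $(n_x, m)$ and $y$ a row vector of 0/1 labels (all names are illustrative):

<code python>
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_cost_l2(w, b, X, y, lambd):
    """Cross-entropy cost plus the L2 penalty (lambda / 2m) * ||w||_2^2."""
    m = X.shape[1]
    y_hat = sigmoid(w.T @ X + b)                       # predictions, shape (1, m)
    cross_entropy = -np.sum(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat)) / m
    l2_penalty = (lambd / (2 * m)) * np.sum(w ** 2)    # equals (lambda / 2m) * w^T w
    return cross_entropy + l2_penalty

# Illustrative usage with random data: 3 features, 5 examples
X = np.random.randn(3, 5)
y = (np.random.rand(1, 5) > 0.5).astype(float)
w = np.zeros((3, 1))
cost = logistic_cost_l2(w, b=0.0, X=X, y=y, lambd=0.1)
</code>

The corresponding gradient gains an extra $\frac{\lambda}{m} w$ term, which shrinks the weights a little on every update (hence "weight decay").
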
$L_1$ regularization:

$\frac{\lambda}{2m} ||w||_1$

$w$ will be sparse (many weights are driven to exactly zero).

Use a hold-out set to tune the regularization hyperparameter $\lambda$.

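The $L_1$ term and its subgradient in the same illustrative setup; the $\text{sign}(w)$ in the gradient is what drives many weights to exactly zero:

<code python>
import numpy as np

def l1_penalty(w, lambd, m):
    """L1 term (lambda / 2m) * ||w||_1, as written above."""
    return (lambd / (2 * m)) * np.sum(np.abs(w))

def l1_subgradient(w, lambd, m):
    """Subgradient of the L1 term w.r.t. w: (lambda / 2m) * sign(w)."""
    return (lambd / (2 * m)) * np.sign(w)
</code>
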
== Neural Network ==

Cost function:

$J(\dots) = \frac{1}{m} \sum_{i=1}^m L(\hat{y}^{(i)}, y^{(i)}) + \frac{\lambda}{2m} \sum_{l=1}^L ||W^l||^2_F$

where $||W^l||^2_F$ is the squared Frobenius norm, i.e. the sum of the squared entries of $W^l$.

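A sketch of this regularized cost, assuming NumPy, where weights is a list of the per-layer matrices $W^l$ and cross_entropy_cost is the unregularized $\frac{1}{m} \sum_i L(\hat{y}^{(i)}, y^{(i)})$ computed elsewhere (names are illustrative):

<code python>
import numpy as np

def regularized_nn_cost(cross_entropy_cost, weights, lambd, m):
    """Add (lambda / 2m) * sum_l ||W^l||_F^2 to the unregularized cost."""
    frobenius_sum = sum(np.sum(W ** 2) for W in weights)  # squared Frobenius norm of each layer matrix
    return cross_entropy_cost + (lambd / (2 * m)) * frobenius_sum
</code>

During backpropagation this term adds $\frac{\lambda}{m} W^l$ to each $dW^l$, so every gradient step also shrinks the weights slightly (weight decay).
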
==== History of backpropagation ====
  