$min \dots + \lambda \sum_{j=1}^n \theta_j^2$
=== L2 Regularization ===
For large $\lambda$, the penalty drives $W^{[l]} \to 0$.
Another effect: when $W$ is small, $z$ has a smaller range, so the resulting activation (e.g. tanh) behaves almost linearly.
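A quick numerical check of this claim (a hypothetical illustration, not from the page): for $z$ near zero, $\tanh(z) \approx z$, while for larger $|z|$ the activation saturates.

```python
import numpy as np

# With small weights W, pre-activations z stay close to zero,
# where tanh(z) ~ z (near-linear regime).
z_small = np.linspace(-0.1, 0.1, 5)
z_large = np.linspace(-3.0, 3.0, 5)

# Maximum deviation of tanh from the identity on each range.
dev_small = np.max(np.abs(np.tanh(z_small) - z_small))
dev_large = np.max(np.abs(np.tanh(z_large) - z_large))

print(dev_small)  # tiny: tanh is essentially linear here
print(dev_large)  # large: saturation dominates
```

With many near-linear activations the network collapses toward a linear model, which is why strong L2 regularization reduces its effective capacity.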
+ | |||
=== Dropout ===
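The page gives no details here; as a placeholder, a minimal sketch of the standard inverted-dropout forward pass (function name and `keep_prob` value are assumptions):

```python
import numpy as np

def dropout_forward(a, keep_prob=0.8, rng=np.random.default_rng(0)):
    """Inverted dropout: randomly zero units, rescale the survivors
    by 1/keep_prob so the expected activation is unchanged."""
    mask = rng.random(a.shape) < keep_prob  # True = keep, False = drop
    return a * mask / keep_prob

a = np.ones((3, 4))
out = dropout_forward(a)  # entries are either 0 or 1/keep_prob
```

At test time no mask is applied; the rescaling during training is what keeps the expected pre-activations consistent between the two phases.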
+ | |||
+ | |||
=== Gradient descent (Linear Regression) ===