E.g., turn a 3-class problem into 3 binary problems. $h_\theta^{(i)}(x) = P(y=i \mid x; \theta)$
Then choose the class $i$ with $\max_i h_\theta^{(i)}(x)$.
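A minimal sketch of this prediction step in Python, assuming the per-class parameter vectors have already been trained and stacked into a matrix ''Theta'' with one row per class (the names are illustrative, not from the notes):

<code python>
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_one_vs_all(Theta, x):
    """Pick the class whose binary classifier is most confident.

    Theta : (K, n+1) matrix, one trained parameter row per class
    x     : (n+1,) feature vector including the bias term x_0 = 1
    """
    # h_theta^{(i)}(x) = P(y = i | x; theta) for each of the K classifiers
    probs = sigmoid(Theta @ x)
    # choose the class i with max_i h_theta^{(i)}(x)
    return np.argmax(probs)
</code>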
===== Addressing Overfitting =====
$\min \dots + \lambda \sum_{j=1}^n \theta_j^2$
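Written out in full for linear regression (consistent with the gradient update below), the regularized cost is:

$$J(\theta) = \frac{1}{2m} \left[ \sum_{i=1}^m \left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda \sum_{j=1}^n \theta_j^2 \right]$$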
=== Gradient descent (Linear Regression) ===

Additional term in the gradient for $j \geq 1$: $\dots + \frac{\lambda}{m} \theta_j$

$\theta_j := \theta_j \left(1 - \alpha \frac{\lambda}{m}\right) - \alpha \frac{1}{m} \sum^m_{i=1} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}$
=== Normal equation (Linear Regression) ===
$$\theta = \left(X^T X + \lambda
\begin{bmatrix}
0 & 0 & \dots & 0 \\
0 & 1 & \dots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \dots & 1
\end{bmatrix}\right)^{-1} X^T y$$
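The same closed-form solution as a NumPy sketch (for illustration; the zeroed top-left entry of the regularization matrix keeps $\theta_0$ unregularized):

<code python>
import numpy as np

def normal_equation_regularized(X, y, lam):
    """Closed-form regularized solution: (X^T X + lambda*L)^{-1} X^T y.

    X : (m, n+1) design matrix, first column all ones (bias)
    L : identity matrix with the (0, 0) entry set to zero,
        so theta_0 is not regularized
    """
    L = np.eye(X.shape[1])
    L[0, 0] = 0.0
    # solve the linear system instead of forming an explicit inverse
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)
</code>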
+ | |||
=== Gradient descent (Logistic Regression) ===

Distinguish between $\theta_0$ and $\theta_j$ ($j \geq 1$): only $\theta_j$ is regularized.

For $\theta_j$: $\dots + \frac{\lambda}{m} \theta_j$
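As a sketch, the full regularized gradient for logistic regression, keeping the $\theta_0$ / $\theta_j$ distinction explicit (names are illustrative):

<code python>
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_gradient(theta, X, y, lam):
    """Regularized gradient for logistic regression.

    Identical in form to the linear case, but with the sigmoid
    hypothesis; theta_0 (the bias) gets no lambda/m * theta term.
    """
    m = X.shape[0]
    h = sigmoid(X @ theta)
    grad = (X.T @ (h - y)) / m
    grad[1:] += (lam / m) * theta[1:]  # regularize theta_j for j >= 1 only
    return grad
</code>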