Differences
This shows you the differences between two versions of the page.
data_mining:neural_network:debugging [2017/08/19 22:05] – phreazer | data_mining:neural_network:debugging [2017/08/19 22:12] (current) – phreazer | ||
---|---|---|---|
Line 3: | Line 3: | ||
Approximating derivatives: | Approximating derivatives: | ||
- | (Large triangle (+/- triangle)) | + | (Large triangle (+/- triangle, two-sided difference)) |
- | $\frac{f(\Theta + \epsilon) - f(\Theta - \epsilon)}{2 \epsilon}$ | + | $\frac{f(\Theta + \epsilon) - f(\Theta - \epsilon)}{2 \epsilon}$ |
$\approx f'(\Theta)$ | $\approx f'(\Theta)$ | ||
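Approximation error is in $O(\epsilon^2)$ (a one-sided difference is only $O(\epsilon)$) | Approximation error is in $O(\epsilon^2)$ (a one-sided difference is only $O(\epsilon)$) | ||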
+ | |||
+ | Take $W^{[1]}, b^{[1]}, \dots, W^{[L]}, b^{[L]}$ and reshape them into one big vector $\Theta$. | ||
+ | |||
+ | Take $dW^{[1]}, db^{[1]}, \dots, dW^{[L]}, db^{[L]}$ and reshape them into one big vector $d\Theta$. | ||
+ | |||
+ | $J$ is now a function of the parameter vector: $J(\Theta) = J(\Theta_1, \dots)$ | ||
+ | |||
+ | For each i: | ||
+ | |||
+ | $d\Theta_{approx}[i] = \frac{J(\dots, \Theta_i + \epsilon, \dots) - J(\dots, \Theta_i - \epsilon, \dots)}{2 \epsilon}$ | ||
+ | |||
+ | $\epsilon = 10^{-7}$ |
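
The steps in the added lines can be sketched in plain Python. This is a minimal illustration, not the notes' own implementation. The names `flatten`, `numerical_gradient`, `relative_difference`, `toy_cost` and `toy_gradient` are hypothetical, and the toy cost stands in for a real network's $J$. The relative-difference measure at the end is a commonly used convention and is an assumption here, since the notes do not say how $d\Theta_{approx}$ and $d\Theta$ are compared.

```python
import math

def flatten(params):
    """Concatenate all parameter lists (sorted by key) into one flat vector Theta."""
    theta = []
    for key in sorted(params):
        theta.extend(params[key])
    return theta

def numerical_gradient(cost, theta, eps=1e-7):
    """Two-sided difference: perturb one component Theta_i at a time."""
    approx = []
    for i in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[i] += eps
        minus[i] -= eps
        approx.append((cost(plus) - cost(minus)) / (2 * eps))
    return approx

def relative_difference(a, b):
    """||a - b|| / (||a|| + ||b||); an assumed, commonly used comparison."""
    num = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    den = math.sqrt(sum(x * x for x in a)) + math.sqrt(sum(y * y for y in b))
    return num / den

# Hypothetical toy cost standing in for a network's J(Theta).
def toy_cost(theta):
    return sum(t * t for t in theta) + theta[0] * theta[1]

# Analytic gradient of toy_cost, playing the role of backprop's dTheta.
def toy_gradient(theta):
    grad = [2 * t for t in theta]
    grad[0] += theta[1]
    grad[1] += theta[0]
    return grad

params = {"W1": [0.5, -1.2], "b1": [0.3]}   # made-up parameter values
theta = flatten(params)                      # Theta
d_theta = toy_gradient(theta)                # dTheta
d_theta_approx = numerical_gradient(toy_cost, theta)
diff = relative_difference(d_theta_approx, d_theta)
print("relative difference:", diff)
```

A very small relative difference suggests the analytic gradient agrees with the numerical one. A much larger value points to a bug in the gradient code.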