Differences
This shows you the differences between two versions of the page.
data_mining:neural_network:deep_neural_nets [2017/08/20 17:25] – [Backwad prop] phreazer | data_mining:neural_network:deep_neural_nets [2017/08/20 18:04] (current) – [Vectorized] phreazer | ||
---|---|---|---|
Line 24: | Line 24: | ||
$A^{[0]}$ is input set. | $A^{[0]}$ is input set. | ||
- | ===== Backwad | + | ===== Backward |
Input: $da^{[l]}$ | Input: $da^{[l]}$ | ||
Line 31: | Line 31: | ||
- | $dz^{[l]} = da^{[l]} * g'(z^{[l]})$ | + | $dZ^{[l]} = dA^{[l]} * g'(Z^{[l]})$ |
- | $dW^{[l]} = dz^{[l]} * a^{[l-1]}$ | + | $dW^{[l]} = 1/m * dZ^{[l]} * A^{[l-1]^T}$ |
- | $db^{[l]} = dz^{[l]}$ | + | $db^{[l]} = 1/m * np.sum(dZ^{[l]}, axis=1, keepdims=True)$ |
- | $da^{[l-1]} = W^{[l]^T} * dz^{[l]}$ | + | $dA^{[l-1]} = W^{[l]^T} * dZ^{[l]}$ |
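The vectorized backward step above can be sketched in NumPy roughly as follows. This is a minimal illustration, not the page author's code: the function names `layer_backward` and `relu_prime` are hypothetical, ReLU is assumed as the activation $g$, and the first `*` in $dZ^{[l]}$ is read as an elementwise product while the other products are matrix products.

```python
import numpy as np

def relu_prime(Z):
    # Derivative of ReLU (assumed activation): 1 where Z > 0, else 0.
    return (Z > 0).astype(Z.dtype)

def layer_backward(dA, Z, A_prev, W, g_prime):
    """Backward step for one layer l, vectorized over m examples."""
    m = A_prev.shape[1]
    dZ = dA * g_prime(Z)                         # elementwise, shape (n_l, m)
    dW = (dZ @ A_prev.T) / m                     # shape (n_l, n_{l-1})
    db = np.sum(dZ, axis=1, keepdims=True) / m   # shape (n_l, 1)
    dA_prev = W.T @ dZ                           # shape (n_{l-1}, m)
    return dA_prev, dW, db

# Small random example: layer with 3 inputs, 5 units, 4 training examples.
rng = np.random.default_rng(0)
n_prev, n_l, m = 3, 5, 4
A_prev = rng.standard_normal((n_prev, m))
W = rng.standard_normal((n_l, n_prev))
b = np.zeros((n_l, 1))
Z = W @ A_prev + b
dA = rng.standard_normal((n_l, m))

dA_prev, dW, db = layer_backward(dA, Z, A_prev, W, relu_prime)
```

Each gradient has the same shape as the quantity it belongs to, which is a quick sanity check when debugging.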
+ | ===== Flow ===== | ||
+ | Forward: | ||
+ | |||
+ | X -> ReLU -> ReLU -> Sigmoid -> $\hat{y}$ -> $L(\hat{y}, y)$ | ||
+ | |||
+ | Initialize backprop with the derivative of $L$ with respect to $\hat{y}$. | ||
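A rough sketch of this flow in NumPy, under stated assumptions: the page does not name the loss, so binary cross-entropy is assumed here, and the layer sizes, the helper names (`forward`, `init_backprop`), and the parameter layout are all hypothetical.

```python
import numpy as np

def relu(Z):
    return np.maximum(0, Z)

def sigmoid(Z):
    return 1.0 / (1.0 + np.exp(-Z))

def forward(X, params):
    """X -> ReLU -> ReLU -> Sigmoid -> y_hat; params is a list of (W, b)."""
    A = X
    for (W, b), g in zip(params, [relu, relu, sigmoid]):
        A = g(W @ A + b)
    return A

def cross_entropy(y_hat, y):
    # Assumed loss: binary cross-entropy averaged over the m examples.
    m = y.shape[1]
    return float(-np.sum(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat)) / m)

def init_backprop(y_hat, y):
    # Per-example derivative of the assumed loss w.r.t. y_hat, used as dA^{[L]}.
    return -(y / y_hat) + (1 - y) / (1 - y_hat)

rng = np.random.default_rng(1)
sizes = [2, 3, 3, 1]  # hypothetical layer sizes n^0 .. n^3
params = [
    (rng.standard_normal((n, n_prev)) * 0.1, np.zeros((n, 1)))
    for n_prev, n in zip(sizes[:-1], sizes[1:])
]
X = rng.standard_normal((2, 4))
y = np.array([[1, 0, 1, 0]])

y_hat = forward(X, params)
loss = cross_entropy(y_hat, y)
dAL = init_backprop(y_hat, y)
```

`dAL` is then fed into the backward step of the last layer, and each layer passes its $dA^{[l-1]}$ on to the layer before it.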
+ | |||
+ | |||
+ | ===== Matrix dimensions ===== | ||
+ | $L=5$ layers, with layer sizes $n^0$ to $n^5$: | ||
+ | 2-3-5-4-2-1 | ||
+ | |||
+ | $Z^1 = W^1 * x + b^1 $ | ||
+ | |||
+ | $Z^1 :(3,1)$ | ||
+ | |||
+ | $x : (2,1)$ | ||
+ | |||
+ | $W^1 :(n^1,n^0) \Rightarrow W^1 : (3,2),\ W^2 : (5,3)$ | ||
+ | |||
+ | $W^l :(n^l, n^{l-1})$ | ||
+ | |||
+ | $b^1 : (3,1)$ | ||
+ | |||
+ | $b^l : (n^l, 1)$ | ||
+ | |||
+ | Analogously, $dW^l$ and $db^l$ have the same shapes as $W^l$ and $b^l$. | ||
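The shape rules above can be checked mechanically. The sketch below builds parameters for the 2-3-5-4-2-1 network and inspects their shapes; `init_params`, the dictionary keys, and the small random initialization are illustrative assumptions, not something the page prescribes.

```python
import numpy as np

layer_sizes = [2, 3, 5, 4, 2, 1]  # n^0 .. n^5

def init_params(layer_sizes, seed=0):
    """Create W^l with shape (n^l, n^{l-1}) and b^l with shape (n^l, 1)."""
    rng = np.random.default_rng(seed)
    params = {}
    for l in range(1, len(layer_sizes)):
        n_l, n_prev = layer_sizes[l], layer_sizes[l - 1]
        params[f"W{l}"] = rng.standard_normal((n_l, n_prev)) * 0.01
        params[f"b{l}"] = np.zeros((n_l, 1))
    return params

params = init_params(layer_sizes)
shapes = {name: p.shape for name, p in params.items()}

# Single example x of shape (n^0, 1) gives Z^1 of shape (n^1, 1).
x = np.ones((2, 1))
z1 = params["W1"] @ x + params["b1"]
```

Asserting these shapes at the start of training is a cheap way to catch a transposed weight matrix early.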
+ | |||
+ | ==== Vectorized ==== | ||
+ | |||
+ | $Z^1 : (n^1,m)$ | ||
+ | |||
+ | $W^1 :(n^1, n^0)$ | ||
+ | |||
+ | $X : (n^0, m)$ | ||
+ | |||
+ | $b^1: (n^1,1)$, broadcast to $(n^1,m)$ when added to $W^1 X$ | ||
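In NumPy, a bias stored as a column of shape $(n^1, 1)$ behaves like an $(n^1, m)$ matrix in the sum, because it is broadcast across the $m$ columns. A small sketch with made-up values ($n^0 = 2$, $n^1 = 3$, $m = 4$ are arbitrary choices for illustration):

```python
import numpy as np

n0, n1, m = 2, 3, 4
X = np.arange(n0 * m, dtype=float).reshape(n0, m)   # shape (n0, m)
W1 = np.ones((n1, n0))                              # shape (n1, n0)
b1 = np.array([[1.0], [2.0], [3.0]])                # stored as (n1, 1)

# b1 is broadcast across the m columns, so Z1 has shape (n1, m).
Z1 = W1 @ X + b1

# Explicit (n1, m) view of the same bias, for comparison only.
b1_expanded = np.broadcast_to(b1, (n1, m))
```

Storing the bias as a column and relying on broadcasting avoids tying the parameter shape to the batch size $m$.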