Differences
This shows you the differences between two versions of the page.
data_mining:neural_network:deep_neural_nets [2017/08/20 17:29] – [Backward prop for layer l] phreazer | data_mining:neural_network:deep_neural_nets [2017/08/20 18:04] (current) – [Vectorized] phreazer | ||
---|---|---|---|
Line 39: | Line 39: | ||
$dA^{[l-1]} = W^{[l]^T} * dZ^{[l]}$ | $dA^{[l-1]} = W^{[l]^T} * dZ^{[l]}$ | ||
+ | ===== Flow ===== | ||
+ | |||
+ | Forward: | ||
+ | |||
+ | X -> ReLU -> ReLU -> Sigmoid -> $\hat{y}$ -> $L(\hat{y}, y)$ | ||
+ | |||
+ | Initialize backprop with the derivative of the loss $L$ with respect to $\hat{y}$. | ||
+ | |||
+ | |||
+ | ===== Matrix dimensions ===== | ||
+ | $L=5$ | ||
+ | Layer sizes ($n^0$ to $n^5$): 2-3-5-4-2-1 | ||
+ | |||
+ | $Z^1 = W^1 * x + b^1 $ | ||
+ | |||
+ | $Z^1 :(3,1)$ | ||
+ | |||
+ | $x : (2,1)$ | ||
+ | |||
+ | $W^1 : (n^1,n^0) \Rightarrow W^1 : (3,2),\ W^2 : (5,3)$ | ||
+ | |||
+ | $W^l :(n^l, n^{l-1})$ | ||
+ | |||
+ | $b^1 : (3,1)$ | ||
+ | |||
+ | $b^l : (n^l, 1)$ | ||
+ | |||
+ | $dW^l$ and $db^l$ have the same dimensions as $W^l$ and $b^l$. | ||
+ | |||
+ | ==== Vectorized ==== | ||
+ | |||
+ | $Z^1 : (n^1,m)$ | ||
+ | |||
+ | $W^1 :(n^1, n^0)$ | ||
+ | |||
+ | $X : (n^0, m)$ | ||
+ | |||
+ | $b^1 : (n^1,1)$, broadcast to $(n^1,m)$ when added to $W^1 * X$ | ||
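
The dimension rules above can be checked numerically. The following is a minimal NumPy sketch, not part of the original notes: the helper names `init_params` and `forward`, the batch size `m = 7`, and the choice of ReLU for hidden layers with a sigmoid output are assumptions made for illustration. Each $b^l$ is stored as $(n^l, 1)$ and NumPy broadcasting expands it across the $m$ columns, so every $Z^l$ comes out as $(n^l, m)$.

```python
import numpy as np

def init_params(layer_sizes, seed=0):
    """Return {l: (W, b)} with W^l of shape (n^l, n^{l-1}) and b^l of shape (n^l, 1)."""
    rng = np.random.default_rng(seed)
    params = {}
    for l in range(1, len(layer_sizes)):
        n_prev, n_cur = layer_sizes[l - 1], layer_sizes[l]
        W = rng.standard_normal((n_cur, n_prev)) * 0.01  # small random weights
        b = np.zeros((n_cur, 1))                         # one bias per unit
        params[l] = (W, b)
    return params

def forward(X, params):
    """Vectorized forward pass; returns all Z^l matrices and the final activation."""
    A = X
    L = len(params)
    Zs = []
    for l in range(1, L + 1):
        W, b = params[l]
        Z = W @ A + b  # b has shape (n^l, 1) and is broadcast over the m columns
        Zs.append(Z)
        # Assumed activations: ReLU in hidden layers, sigmoid in the output layer.
        A = 1.0 / (1.0 + np.exp(-Z)) if l == L else np.maximum(0.0, Z)
    return Zs, A

layer_sizes = [2, 3, 5, 4, 2, 1]  # n^0 .. n^5 from the example above
m = 7                             # hypothetical number of training examples
params = init_params(layer_sizes)
X = np.ones((layer_sizes[0], m))  # X : (n^0, m)
Zs, y_hat = forward(X, params)

for l, Z in enumerate(Zs, start=1):
    W, b = params[l]
    print(f"layer {l}: W {W.shape}, b {b.shape}, Z {Z.shape}")
```

Running the loop prints one line per layer, which makes it easy to compare each shape against $W^l : (n^l, n^{l-1})$ and $Z^l : (n^l, m)$ by eye.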