Deep neural networks
Notation
$L = 4$ layers $n^{[l]} = \text{number of units in layer } l$
$n^{[0]} = 3$ $n^{[1]} = 5$ $n^{[2]} = 5$ $n^{[3]} = 3$ $n^{[4]} = n^{[L]} = 1$
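The layer sizes above fix the parameter shapes: for $Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}$ to be well defined, $W^{[l]}$ must have shape $(n^{[l]}, n^{[l-1]})$ and $b^{[l]}$ shape $(n^{[l]}, 1)$. A minimal NumPy sketch follows. The helper name `init_params`, the dictionary keys and the 0.01 scaling of the random weights are assumptions for illustration, not part of the notes.

```python
import numpy as np

# Layer sizes from the notation: n[0]=3 inputs, then 5, 5, 3 and 1 units.
layer_dims = [3, 5, 5, 3, 1]

def init_params(layer_dims, seed=0):
    """Hypothetical helper: W[l] has shape (n[l], n[l-1]), b[l] has shape (n[l], 1)."""
    rng = np.random.default_rng(seed)
    params = {}
    for l in range(1, len(layer_dims)):
        # Small random weights (the 0.01 factor is an assumed choice); zero biases.
        params[f"W{l}"] = rng.standard_normal((layer_dims[l], layer_dims[l - 1])) * 0.01
        params[f"b{l}"] = np.zeros((layer_dims[l], 1))
    return params

params = init_params(layer_dims)
shapes = {name: value.shape for name, value in params.items()}
```

Here `shapes` maps each parameter name to its shape, which makes it easy to check the dimensions layer by layer.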
Forward prop
Input: $a^{[l - 1]}$
Output: $a^{[l]}$; cache $z^{[l]}$ together with $W^{[l]}$ and $b^{[l]}$ for use in backward prop
$Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}$
$A^{[l]} = g^{[l]}(Z^{[l]})$
$A^{[0]} = X$ is the input set.
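The forward step for a single layer can be sketched in NumPy as below. The function name `layer_forward` is hypothetical, and storing $A^{[l-1]}$ in the cache alongside $Z^{[l]}$, $W^{[l]}$, $b^{[l]}$ is an assumption made here because the backward step needs it to compute $dW^{[l]}$.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def layer_forward(A_prev, W, b, g):
    """One forward step: Z = W A_prev + b, A = g(Z). Returns A and a cache."""
    Z = W @ A_prev + b           # b of shape (n[l], 1) broadcasts over the m columns
    A = g(Z)
    cache = (A_prev, W, b, Z)    # assumed cache layout, reused in the backward step
    return A, cache

# Two examples (columns) with three input features each.
A0 = np.array([[1.0, -2.0],
               [0.5,  0.0],
               [-1.0, 3.0]])
W1 = np.full((5, 3), 0.1)
b1 = np.zeros((5, 1))
A1, cache1 = layer_forward(A0, W1, b1, relu)
```

Each column of `A1` holds the five activations of layer 1 for one example.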
Backward prop for layer $l$
Input: $da^{[l]}$
Output: $da^{[l-1]}, dW^{[l]}, db^{[l]}$
$dZ^{[l]} = dA^{[l]} \odot g'^{[l]}(Z^{[l]})$ ($\odot$ is the element-wise product)
$dW^{[l]} = \frac{1}{m} dZ^{[l]} A^{[l-1]T}$ (matrix product)
$db^{[l]} = \frac{1}{m} \, \text{np.sum}(dZ^{[l]}, \text{axis}=1, \text{keepdims}=\text{True})$
$dA^{[l-1]} = W^{[l]T} dZ^{[l]}$ (matrix product)
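The four backward equations translate almost line for line into NumPy. This is a sketch under the assumption that the cache holds `(A_prev, W, b, Z)`; the names `layer_backward` and `relu_grad` are hypothetical.

```python
import numpy as np

def relu_grad(Z):
    # Derivative of ReLU: 1 where Z > 0, else 0.
    return (Z > 0).astype(Z.dtype)

def layer_backward(dA, cache, g_prime):
    """One backward step for layer l; returns dA_prev, dW, db."""
    A_prev, W, b, Z = cache
    m = A_prev.shape[1]                          # number of examples
    dZ = dA * g_prime(Z)                         # element-wise product
    dW = (dZ @ A_prev.T) / m                     # matrix product, shape of W
    db = np.sum(dZ, axis=1, keepdims=True) / m   # shape of b
    dA_prev = W.T @ dZ                           # matrix product, shape of A_prev
    return dA_prev, dW, db

A_prev = np.array([[1.0, 2.0], [0.0, 1.0], [-1.0, 0.0]])
W = np.array([[1.0, 0.0, 0.0], [0.0, -1.0, 0.0]])
b = np.zeros((2, 1))
Z = W @ A_prev + b
dA = np.ones((2, 2))
dA_prev, dW, db = layer_backward(dA, (A_prev, W, b, Z), relu_grad)
```

Note that each gradient has the same shape as the quantity it belongs to, which is a quick sanity check when implementing these steps.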
Flow
Forward:
X → ReLU → ReLU → Sigmoid → $\hat{y}$ → $L(\hat{y}, y)$
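Backward: initialize backprop with the derivative of the loss $L(\hat{y}, y)$ with respect to $\hat{y}$, then apply the backward step layer by layer in reverse order.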
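The whole flow can be sketched end to end as follows. Several things here are assumptions rather than part of the notes: the loss is taken to be binary cross-entropy, so that $dA^{[L]} = -\frac{y}{\hat{y}} + \frac{1 - y}{1 - \hat{y}}$; the layer sizes `[3, 5, 5, 1]` are chosen to match the three activations in the flow; and all function names are hypothetical.

```python
import numpy as np

def relu(z): return np.maximum(0, z)
def relu_grad(z): return (z > 0).astype(float)
def sigmoid(z): return 1.0 / (1.0 + np.exp(-z))
def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)

# Activation and derivative per layer, following X -> ReLU -> ReLU -> Sigmoid.
ACTS = [(relu, relu_grad), (relu, relu_grad), (sigmoid, sigmoid_grad)]

def forward(X, params):
    A, caches = X, []
    for (W, b), (g, _) in zip(params, ACTS):
        Z = W @ A + b
        caches.append((A, W, Z))   # A here is the layer input A[l-1]
        A = g(Z)
    return A, caches               # final A is y_hat

def cost(AL, Y):
    # Assumed loss: binary cross-entropy averaged over the m examples.
    return float(-np.mean(Y * np.log(AL) + (1 - Y) * np.log(1 - AL)))

def backward(AL, Y, caches):
    m = Y.shape[1]
    dA = -(Y / AL) + (1 - Y) / (1 - AL)   # derivative of the assumed loss w.r.t. y_hat
    grads = []
    for (A_prev, W, Z), (_, g_prime) in zip(reversed(caches), reversed(ACTS)):
        dZ = dA * g_prime(Z)
        dW = dZ @ A_prev.T / m
        db = np.sum(dZ, axis=1, keepdims=True) / m
        dA = W.T @ dZ
        grads.append((dW, db))
    return grads[::-1]                    # ordered from layer 1 to layer L

rng = np.random.default_rng(0)
dims = [3, 5, 5, 1]
params = [(rng.standard_normal((dims[i + 1], dims[i])) * 0.5,
           np.zeros((dims[i + 1], 1))) for i in range(3)]
X = rng.standard_normal((3, 4))           # 4 examples, 3 features
Y = np.array([[1.0, 0.0, 1.0, 0.0]])
AL, caches = forward(X, params)
grads = backward(AL, Y, caches)
```

A common way to validate such a sketch is a finite-difference gradient check: perturb one weight by a small amount, recompute the cost, and compare the slope with the corresponding entry of `grads`.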