Deep neural networks

Notation

$L = 4$ layers; $n^{[l]} = \text{number of units in layer } l$

$n^{[0]} = 3$, $n^{[1]} = 5$, $n^{[2]} = 5$, $n^{[3]} = 3$, $n^{[4]} = n^{[L]} = 1$
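As a rough illustration of this notation, the sketch below builds parameters for the layer sizes listed above. The shapes follow from $Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}$: $W^{[l]}$ is $n^{[l]} \times n^{[l-1]}$ and $b^{[l]}$ is $n^{[l]} \times 1$. The helper name `init_params`, the `0.01` scaling and the dictionary keys are assumptions made for the example, not part of these notes.

```python
import numpy as np

# Layer sizes n[0]..n[4] taken from the notation above.
layer_dims = [3, 5, 5, 3, 1]

def init_params(layer_dims, seed=0):
    """Hypothetical helper: small random weights, zero biases."""
    rng = np.random.default_rng(seed)
    params = {}
    for l in range(1, len(layer_dims)):
        # W[l] maps n[l-1] inputs to n[l] units.
        params[f"W{l}"] = rng.standard_normal((layer_dims[l], layer_dims[l - 1])) * 0.01
        # b[l] is a column vector so it broadcasts over the m examples.
        params[f"b{l}"] = np.zeros((layer_dims[l], 1))
    return params

params = init_params(layer_dims)
for l in range(1, len(layer_dims)):
    print(l, params[f"W{l}"].shape, params[f"b{l}"].shape)
```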

Forward propagation for layer $l$

Input: $a^{[l-1]}$

Output: $a^{[l]}$; cache $z^{[l]}$ together with $W^{[l]}$ and $b^{[l]}$ for the backward pass

$Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}$

$A^{[l]} = g^{[l]}(Z^{[l]})$

$A^{[0]} = X$ is the input set.
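The forward step for one layer could be sketched in NumPy as follows. The function name `layer_forward`, the contents of the cache tuple and the random test data are assumptions for illustration; only the two equations come from the notes.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def layer_forward(A_prev, W, b, g):
    """One forward step: returns A and a cache for the backward pass."""
    Z = W @ A_prev + b   # b has shape (n_l, 1) and broadcasts over the m columns
    A = g(Z)
    cache = (A_prev, W, b, Z)   # assumed cache layout
    return A, cache

# Usage with made-up data: m = 4 examples, n[0] = 3 inputs, n[1] = 5 units.
rng = np.random.default_rng(0)
m = 4
A0 = rng.standard_normal((3, m))
W1 = rng.standard_normal((5, 3)) * 0.01
b1 = np.zeros((5, 1))
A1, cache1 = layer_forward(A0, W1, b1, relu)
print(A1.shape)  # (5, 4)
```

Each column of `A1` holds the activations of one training example, which is why `A0` is laid out as features by examples.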

Backward propagation for layer $l$

Input: $da^{[l]}$ (plus the cached $z^{[l]}$)

Output: $da^{[l-1]}, dW^{[l]}, db^{[l]}$

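$dZ^{[l]} = dA^{[l]} * g'^{[l]}(Z^{[l]})$ (here $*$ is the element-wise product; the products in the lines that follow are matrix products)

$dW^{[l]} = \frac{1}{m} dZ^{[l]} A^{[l-1]T}$

$db^{[l]} = \frac{1}{m} \text{np.sum}(dZ^{[l]}, \text{axis}=1, \text{keepdims}=\text{True})$

$dA^{[l-1]} = W^{[l]T} dZ^{[l]}$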
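A minimal NumPy sketch of this backward step, assuming a ReLU layer and a cache holding `(A_prev, W, b, Z)`. The function name `layer_backward`, the cache layout and the random inputs are assumptions for the example. Note that `*` is element-wise while `@` is the matrix product.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def relu_grad(z):
    # Derivative of ReLU: 1 where z > 0, else 0.
    return (z > 0).astype(float)

def layer_backward(dA, cache, g_prime):
    """One backward step: returns dA_prev, dW, db for this layer."""
    A_prev, W, b, Z = cache
    m = A_prev.shape[1]
    dZ = dA * g_prime(Z)                          # element-wise product
    dW = (dZ @ A_prev.T) / m                      # matrix product, averaged over examples
    db = np.sum(dZ, axis=1, keepdims=True) / m    # keepdims keeps the (n_l, 1) shape
    dA_prev = W.T @ dZ                            # gradient passed to layer l-1
    return dA_prev, dW, db

# Usage with made-up data: layer with 3 inputs, 5 units, m = 4 examples.
rng = np.random.default_rng(1)
m = 4
A_prev = rng.standard_normal((3, m))
W = rng.standard_normal((5, 3))
b = rng.standard_normal((5, 1))
Z = W @ A_prev + b
dA = rng.standard_normal((5, m))   # stand-in for the gradient arriving from layer l+1
dA_prev, dW, db = layer_backward(dA, (A_prev, W, b, Z), relu_grad)
print(dA_prev.shape, dW.shape, db.shape)  # (3, 4) (5, 3) (5, 1)
```

Each gradient has the same shape as the quantity it differentiates, which is a quick sanity check when debugging.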

Forward:

X → ReLU → ReLU → Sigmoid → $\hat{y}$ → $\mathcal{L}(\hat{y}, y)$

Initialize backprop with the derivative of the loss $\mathcal{L}$ with respect to $\hat{y}$.
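The chain above can be sketched end to end as follows. This is an illustration under stated assumptions, not the course's reference code: the layer sizes `[3, 5, 5, 1]`, the random data and the choice of binary cross-entropy as the loss are all assumed. With that loss, backprop starts from $-\frac{y}{\hat{y}} + \frac{1 - y}{1 - \hat{y}}$.

```python
import numpy as np

def relu(z): return np.maximum(0, z)
def relu_grad(z): return (z > 0).astype(float)
def sigmoid(z): return 1.0 / (1.0 + np.exp(-z))
def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)

rng = np.random.default_rng(0)
m = 8
dims = [3, 5, 5, 1]   # assumed sizes for the three-layer chain
acts = [(relu, relu_grad), (relu, relu_grad), (sigmoid, sigmoid_grad)]
X = rng.standard_normal((dims[0], m))
Y = rng.integers(0, 2, size=(1, m)).astype(float)   # made-up binary labels
Ws = [rng.standard_normal((dims[l], dims[l - 1])) * 0.5 for l in range(1, len(dims))]
bs = [np.zeros((dims[l], 1)) for l in range(1, len(dims))]

# Forward pass: X -> ReLU -> ReLU -> Sigmoid -> Y_hat, caching (A_prev, Z) per layer.
A = X
caches = []
for W, b, (g, _) in zip(Ws, bs, acts):
    Z = W @ A + b
    caches.append((A, Z))
    A = g(Z)
Y_hat = A

# Assumed loss: binary cross-entropy, averaged over the m examples.
cost = -np.mean(Y * np.log(Y_hat) + (1 - Y) * np.log(1 - Y_hat))

# Initialize backprop with the derivative of the loss with respect to Y_hat.
dA = -(Y / Y_hat) + (1 - Y) / (1 - Y_hat)

# Backward pass: walk the layers in reverse order.
grads = []
for W, (A_prev, Z), (_, g_prime) in zip(reversed(Ws), reversed(caches), reversed(acts)):
    dZ = dA * g_prime(Z)
    dW = (dZ @ A_prev.T) / m
    db = np.sum(dZ, axis=1, keepdims=True) / m
    dA = W.T @ dZ
    grads.append((dW, db))
grads.reverse()   # grads[i] now belongs to layer i + 1

print(cost, [dW.shape for dW, _ in grads])
```

For the sigmoid output with this loss, `dZ` of the last layer simplifies to `Y_hat - Y`, which gives a convenient check on the initialization.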

Forward prop