data_mining:neural_network:deep_neural_nets

Differences

This shows you the differences between two versions of the page.

data_mining:neural_network:deep_neural_nets [2017/08/20 19:13] phreazerdata_mining:neural_network:deep_neural_nets [2017/08/20 20:04] – [Matrix dimensions] phreazer
Line 4: Line 4:
  
 $l = 4$ layers $l = 4$ layers
-$n^{[l]} = \text{# of units in layer} l$+$n^{[l]} = \text{# of units in layer } l$
  
 $n^{[0]} = 3$ $n^{[0]} = 3$
Line 11: Line 11:
 $n^{[3]} = 3$ $n^{[3]} = 3$
 $n^{[4]} = n^{[l]} = 1$ $n^{[4]} = n^{[l]} = 1$
 +
 +===== Forward prop =====
 +
 +Input: $a^{[l - 1]}$
 +
 +Output: $a^{[l]}$; cache $z^{[l]}$ together with $W^{[l]}$ and $b^{[l]}$
 +
 +$Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}$
 +
 +$A^{[l]} = g^{[l]}(Z^{[l]})$
 +
 +$A^{[0]} = X$ is the input set.
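
A minimal NumPy sketch of one forward step, assuming a ReLU activation and a hypothetical layer of 4 units fed by 3 inputs. The function name `layer_forward` and the choice to also keep `A_prev` in the cache (it is needed later for $dW^{[l]}$) are assumptions for illustration, not part of the page.

```python
import numpy as np

def relu(z):
    # Element-wise max(0, z)
    return np.maximum(0, z)

def layer_forward(A_prev, W, b, g):
    """Forward step for one layer: Z = W A_prev + b, A = g(Z)."""
    Z = W @ A_prev + b          # b has shape (n_l, 1) and broadcasts over the m columns
    A = g(Z)
    cache = (A_prev, W, b, Z)   # kept for the backward pass
    return A, cache

rng = np.random.default_rng(0)
A0 = rng.standard_normal((3, 5))          # A^[0] = X: 3 features, m = 5 examples
W1 = rng.standard_normal((4, 3)) * 0.01   # hypothetical layer with 4 units
b1 = np.zeros((4, 1))

A1, cache1 = layer_forward(A0, W1, b1, relu)
print(A1.shape)  # (4, 5)
```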
 +
 +===== Backward prop for layer l =====
 +
 +Input: $da^{[l]}$
 +
 +Output: $da^{[l-1]}, dW^{[l]}, db^{[l]}$
 +
 +
 +$dZ^{[l]} = dA^{[l]} * g'^{[l]}(Z^{[l]})$ # element-wise product
 +
 +$dW^{[l]} = \frac{1}{m} dZ^{[l]} A^{[l-1]T}$ # matrix product
 +
 +$db^{[l]} = \frac{1}{m} \, \text{np.sum}(dZ^{[l]}, \text{axis}=1, \text{keepdims}=\text{True})$
 +
 +$dA^{[l-1]} = W^{[l]T} dZ^{[l]}$ # matrix product
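
The backward step for one layer could be sketched in NumPy as follows. The function name `layer_backward`, the cache layout `(A_prev, W, b, Z)` and the ReLU derivative are assumptions for illustration. Only the first formula uses an element-wise product; the $dW$ and $dA$ formulas are matrix products (`@`).

```python
import numpy as np

def relu_prime(Z):
    # Derivative of ReLU: 1 where Z > 0, else 0
    return (Z > 0).astype(float)

def layer_backward(dA, cache, g_prime):
    """Backward step for one layer; returns dA_prev, dW, db."""
    A_prev, W, b, Z = cache
    m = A_prev.shape[1]                          # number of examples
    dZ = dA * g_prime(Z)                         # element-wise product
    dW = (dZ @ A_prev.T) / m                     # matrix product, shape of W
    db = np.sum(dZ, axis=1, keepdims=True) / m   # shape of b
    dA_prev = W.T @ dZ                           # matrix product, shape of A_prev
    return dA_prev, dW, db

rng = np.random.default_rng(1)
A_prev = rng.standard_normal((3, 5))
W = rng.standard_normal((4, 3))
b = np.zeros((4, 1))
Z = W @ A_prev + b
dA = rng.standard_normal((4, 5))

dA_prev, dW, db = layer_backward(dA, (A_prev, W, b, Z), relu_prime)
print(dA_prev.shape, dW.shape, db.shape)  # (3, 5) (4, 3) (4, 1)
```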
 +
 +===== Flow =====
 +
 +Forward:
 +
 +X -> ReLU -> ReLU -> Sigmoid -> $\hat{y}$ -> $L(\hat{y}, y)$
 +
 +Initialize backprop with the derivative of $L$ with respect to $\hat{y}$.
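
As a rough illustration of this flow, the sketch below runs X through ReLU, ReLU and Sigmoid layers and then computes the value that starts backprop. It assumes the binary cross-entropy loss $L(\hat{y}, y) = -(y \log \hat{y} + (1-y) \log(1-\hat{y}))$, which the page does not state; the layer sizes and random data are hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
layer_sizes = [3, 4, 4, 1]            # hypothetical sizes: input, two hidden, output
activations = [relu, relu, sigmoid]   # X -> ReLU -> ReLU -> Sigmoid
m = 6
X = rng.standard_normal((layer_sizes[0], m))
Y = rng.integers(0, 2, size=(1, m))   # hypothetical binary labels

# Forward pass through all layers
A = X
for l in range(1, len(layer_sizes)):
    W = rng.standard_normal((layer_sizes[l], layer_sizes[l - 1])) * 0.1
    b = np.zeros((layer_sizes[l], 1))
    A = activations[l - 1](W @ A + b)
Y_hat = A

# Assumed loss: binary cross-entropy, averaged over the m examples
cost = -np.mean(Y * np.log(Y_hat) + (1 - Y) * np.log(1 - Y_hat))

# Derivative of the per-example loss w.r.t. y_hat: the input to the last backward step
dA_last = -(Y / Y_hat) + (1 - Y) / (1 - Y_hat)
print(Y_hat.shape, dA_last.shape)  # (1, 6) (1, 6)
```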
 +
 +
 +===== Matrix dimensions =====
 +$l = 5$ layers
 +
 +Units per layer, $n^0$ to $n^5$: 2-3-5-4-2-1
 +
 +$Z^1 = W^1 x + b^1$
 +
 +$Z^1 :(3,1)$
 +
 +$x : (2,1)$
 +
 +$W^1 : (n^1, n^0) \Rightarrow W^1 : (3,2),\ W^2 : (5,3)$
 +
 +$W^l :(n^l, n^{l-1})$
 +
 +$b^1 : (3,1)$
 +
 +$b^l : (n^l, 1)$
 +
 +Analogously, $dW^l$ has the same shape as $W^l$, and $db^l$ the same shape as $b^l$.
 +
 +==== Vectorized ====
 +
 +$Z^1 : (n^1, m)$
 +
 +$W^1 :(n^1, n^0)$
 +
 +$X : (n^0, m)$
 +
 +$b^1 : (n^1, 1)$, broadcast to $(n^1, m)$
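
The dimension rules can be checked with a short NumPy sketch for the 2-3-5-4-2-1 network. The batch size `m = 7` and the zero-filled arrays are arbitrary assumptions; activations are omitted because they do not change shapes. Each bias is stored as $(n^l, 1)$ and NumPy broadcasts it across the $m$ columns.

```python
import numpy as np

layer_sizes = [2, 3, 5, 4, 2, 1]   # n^0 .. n^5 from the example above
m = 7                              # arbitrary number of examples

# W^l : (n^l, n^{l-1}),  b^l : (n^l, 1)
params = {}
for l in range(1, len(layer_sizes)):
    params[f"W{l}"] = np.zeros((layer_sizes[l], layer_sizes[l - 1]))
    params[f"b{l}"] = np.zeros((layer_sizes[l], 1))

A = np.zeros((layer_sizes[0], m))  # X : (n^0, m)
z_shapes = []
for l in range(1, len(layer_sizes)):
    Z = params[f"W{l}"] @ A + params[f"b{l}"]   # (n^l, m) after broadcasting b
    z_shapes.append(Z.shape)
    A = Z                                       # activation omitted: shape is unchanged

print(params["W1"].shape, params["W2"].shape)   # (3, 2) (5, 3)
print(z_shapes)  # [(3, 7), (5, 7), (4, 7), (2, 7), (1, 7)]
```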
  • data_mining/neural_network/deep_neural_nets.txt
  • Last modified: 2017/08/20 20:04
  • by phreazer