data_mining:neural_network:deep_neural_nets

===== Forward prop =====
  
Input: $a^{[l-1]}$

Output: $a^{[l]}$, cache $(z^{[l]})$ and $W^{[l]}$, $b^{[l]}$

$Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}$

$A^{[l]} = g^{[l]}(Z^{[l]})$

$A^{[0]}$ is the input set.
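
A minimal numpy sketch of this step (the name ''forward_layer'' is an illustrative assumption; $A^{[l-1]}$ is cached in addition to $z^{[l]}, W^{[l]}, b^{[l]}$, since the $dW^{[l]}$ formula below needs it):

<code python>
import numpy as np

def forward_layer(A_prev, W, b, g):
    # Z^[l] = W^[l] A^[l-1] + b^[l]
    Z = W @ A_prev + b
    # A^[l] = g^[l](Z^[l])
    A = g(Z)
    # Cache for backprop; A^[l-1] is kept because dW^[l] needs it
    cache = (A_prev, W, b, Z)
    return A, cache

# Example: one layer with a ReLU activation
A_prev = np.random.randn(2, 1)                  # a^[0] = x : (n^[0], 1)
W, b = np.random.randn(3, 2), np.zeros((3, 1))  # W^[1] : (3,2), b^[1] : (3,1)
A, cache = forward_layer(A_prev, W, b, lambda Z: np.maximum(0, Z))
</code>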

===== Backward prop for layer l =====

Input: $dA^{[l]}$

Output: $dA^{[l-1]}, dW^{[l]}, db^{[l]}$

$dZ^{[l]} = dA^{[l]} * g'^{[l]}(Z^{[l]})$ (element-wise product)

$dW^{[l]} = 1/m * dZ^{[l]} (A^{[l-1]})^T$

$db^{[l]} = 1/m * np.sum(dZ^{[l]}, axis=1, keepdims=True)$

$dA^{[l-1]} = (W^{[l]})^T dZ^{[l]}$
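
A matching numpy sketch (''backward_layer'' and ''g_prime'' are illustrative names; it reuses the cache layout from the forward sketch above):

<code python>
import numpy as np

def backward_layer(dA, cache, g_prime):
    A_prev, W, b, Z = cache                     # as stored by forward_layer
    m = A_prev.shape[1]
    dZ = dA * g_prime(Z)                        # element-wise product
    dW = (dZ @ A_prev.T) / m                    # dW^[l] = 1/m dZ^[l] (A^[l-1])^T
    db = np.sum(dZ, axis=1, keepdims=True) / m  # db^[l]
    dA_prev = W.T @ dZ                          # dA^[l-1] = (W^[l])^T dZ^[l]
    return dA_prev, dW, db
</code>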

===== Flow =====

Forward:

X -> ReLU -> ReLU -> Sigmoid -> $\hat{y}$ -> $L(\hat{y}, y)$

Backprop is initialized with the derivative of $L$ with respect to $\hat{y}$.
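
A compact sketch of this flow (layer sizes, parameter naming, and the cross-entropy form of $L$ are assumptions for illustration):

<code python>
import numpy as np

def relu(Z):
    return np.maximum(0, Z)

def sigmoid(Z):
    return 1 / (1 + np.exp(-Z))

def forward_flow(X, params):
    # X -> ReLU -> ReLU -> Sigmoid -> y_hat
    A = X
    for l, g in [(1, relu), (2, relu), (3, sigmoid)]:
        A = g(params["W%d" % l] @ A + params["b%d" % l])
    return A

# Backprop is initialized with dL/dy_hat; assuming the cross-entropy loss
# L = -(y log(y_hat) + (1 - y) log(1 - y_hat)):
def init_backprop(y_hat, y):
    return -(y / y_hat - (1 - y) / (1 - y_hat))
</code>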

===== Matrix dimensions =====

Example network with $L=5$ layers and layer sizes 2-3-5-4-2-1 (i.e. $n^{[0]}=2$ inputs):

$z^{[1]} = W^{[1]} x + b^{[1]}$

$z^{[1]} : (3,1)$

$x : (2,1)$

$W^{[1]} : (n^{[1]}, n^{[0]}) \Rightarrow W^{[1]} : (3,2), W^{[2]} : (5,3)$

$W^{[l]} : (n^{[l]}, n^{[l-1]})$

$b^{[1]} : (3,1)$

$b^{[l]} : (n^{[l]}, 1)$

Analogous for $dW^{[l]}$ and $db^{[l]}$.
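
A quick shape check of these rules in numpy (the random initialization is illustrative):

<code python>
import numpy as np

sizes = [2, 3, 5, 4, 2, 1]   # n^[0] .. n^[5] for the example above

params = {}
for l in range(1, len(sizes)):
    params["W%d" % l] = np.random.randn(sizes[l], sizes[l - 1]) * 0.01  # (n^[l], n^[l-1])
    params["b%d" % l] = np.zeros((sizes[l], 1))                         # (n^[l], 1)

assert params["W1"].shape == (3, 2)
assert params["W2"].shape == (5, 3)
assert params["b1"].shape == (3, 1)
</code>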

==== Vectorized ====

$Z^{[1]} : (n^{[1]}, m)$

$W^{[1]} : (n^{[1]}, n^{[0]})$

$X : (n^{[0]}, m)$

$b^{[1]}$ is stored as $(n^{[1]}, 1)$ and broadcast to $(n^{[1]}, m)$ when added.
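
A sketch of how broadcasting yields these shapes (''m = 10'' is an arbitrary choice):

<code python>
import numpy as np

m = 10                      # m examples stacked as columns
X = np.random.randn(2, m)   # X : (n^[0], m)
W1 = np.random.randn(3, 2)  # W^[1] : (n^[1], n^[0])
b1 = np.zeros((3, 1))       # b^[1] stored as (n^[1], 1)

Z1 = W1 @ X + b1            # numpy broadcasts b1 across the m columns
assert Z1.shape == (3, m)   # Z^[1] : (n^[1], m)
</code>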