====== Deep neural nets ======

  
===== Forward prop =====

$Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}$

$A^{[l]} = g^{[l]}(Z^{[l]})$

$A^{[0]}$ is the input set.
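
A minimal NumPy sketch of this forward step; the function name, the activation choices and the cache are illustrative assumptions, not part of the original notes.

<code python>
import numpy as np

def layer_forward(A_prev, W, b, activation="relu"):
    """One forward step: Z[l] = W[l] A[l-1] + b[l], A[l] = g[l](Z[l])."""
    Z = W @ A_prev + b                    # linear part
    if activation == "relu":
        A = np.maximum(0, Z)              # g(Z) = max(0, Z)
    else:                                 # "sigmoid"
        A = 1.0 / (1.0 + np.exp(-Z))      # g(Z) = sigma(Z)
    cache = (A_prev, W, b, Z)             # kept for the backward pass
    return A, cache
</code>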

===== Backward prop for layer l =====

Input: $dA^{[l]}$

Output: $dA^{[l-1]}, dW^{[l]}, db^{[l]}$

$dZ^{[l]} = dA^{[l]} * g'^{[l]}(Z^{[l]})$ (element-wise product)

$dW^{[l]} = \frac{1}{m} dZ^{[l]} {A^{[l-1]}}^T$

$db^{[l]} = \frac{1}{m} \, \text{np.sum}(dZ^{[l]}, \text{axis}=1, \text{keepdims=True})$

$dA^{[l-1]} = {W^{[l]}}^T dZ^{[l]}$
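
A matching NumPy sketch of this backward step, assuming the cache from the forward sketch above and a ReLU activation (names are illustrative):

<code python>
import numpy as np

def layer_backward(dA, cache):
    """Backward step for one layer with g = ReLU."""
    A_prev, W, b, Z = cache
    m = A_prev.shape[1]
    dZ = dA * (Z > 0)                                   # dA[l] * g'[l](Z[l]), element-wise
    dW = (1.0 / m) * dZ @ A_prev.T                      # shape (n[l], n[l-1])
    db = (1.0 / m) * np.sum(dZ, axis=1, keepdims=True)  # shape (n[l], 1)
    dA_prev = W.T @ dZ                                  # passed on to layer l-1
    return dA_prev, dW, db
</code>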

===== Flow =====

Forward:

X -> ReLU -> ReLU -> Sigmoid -> $\hat{y}$ -> $L(\hat{y}, y)$

Initialize backprop with the derivative of $L$ with respect to $\hat{y}$.
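
A sketch of how backprop could be initialized at the output, assuming a binary cross-entropy loss $L(\hat{y}, y) = -(y \log \hat{y} + (1-y) \log(1-\hat{y}))$ for the sigmoid output (the variable names and sample values are assumptions):

<code python>
import numpy as np

AL = np.array([[0.8, 0.1]])   # example sigmoid outputs y_hat of the last layer
Y  = np.array([[1.0, 0.0]])   # example labels
dAL = -(np.divide(Y, AL) - np.divide(1 - Y, 1 - AL))  # dL/dy_hat
# dAL is then fed into the backward step of the last (sigmoid) layer.
</code>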

===== Matrix dimensions =====

Example network with $L=5$ layers and layer sizes 2-3-5-4-2-1, i.e. $n^{[0]}=2, n^{[1]}=3, n^{[2]}=5, n^{[3]}=4, n^{[4]}=2, n^{[5]}=1$:

$Z^{[1]} = W^{[1]} x + b^{[1]}$

$Z^{[1]} : (3,1)$

$x : (2,1)$

$W^{[1]} : (n^{[1]}, n^{[0]}) \Rightarrow W^{[1]} : (3,2), W^{[2]} : (5,3)$

$W^{[l]} : (n^{[l]}, n^{[l-1]})$

$b^{[1]} : (3,1)$

$b^{[l]} : (n^{[l]}, 1)$

The same dimensions apply to $dW^{[l]}$ and $db^{[l]}$.
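
A small sketch that initializes parameters for the 2-3-5-4-2-1 example and checks the shapes stated above (the random initialization is an assumption):

<code python>
import numpy as np

layer_dims = [2, 3, 5, 4, 2, 1]            # n[0] ... n[5]
params = {}
for l in range(1, len(layer_dims)):
    params["W" + str(l)] = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
    params["b" + str(l)] = np.zeros((layer_dims[l], 1))

assert params["W1"].shape == (3, 2)        # (n[1], n[0])
assert params["W2"].shape == (5, 3)        # (n[2], n[1])
assert params["b1"].shape == (3, 1)        # (n[1], 1)
</code>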

==== Vectorized ====

With $m$ training examples:

$Z^{[1]} : (n^{[1]}, m)$

$W^{[1]} : (n^{[1]}, n^{[0]})$

$X : (n^{[0]}, m)$

$b^{[1]} : (n^{[1]}, 1)$, broadcast to $(n^{[1]}, m)$
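
A sketch of the vectorized shapes for $m$ examples; $b^{[1]}$ is stored as $(n^{[1]}, 1)$ and NumPy broadcasting expands it across the $m$ columns (the sizes below are assumptions):

<code python>
import numpy as np

m = 10                          # number of examples
n0, n1 = 2, 3                   # n[0], n[1] from the example network
X  = np.random.randn(n0, m)     # (n[0], m)
W1 = np.random.randn(n1, n0)    # (n[1], n[0])
b1 = np.zeros((n1, 1))          # (n[1], 1), broadcast over the m columns
Z1 = W1 @ X + b1
assert Z1.shape == (n1, m)      # (n[1], m)
</code>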