data_mining:neural_network:initialization

Weights need to be randomly initialized. For the bias, zero is ok.
If the weights are zero, then in backprop $dz_1$ and $dz_2$ are the same, so the hidden units compute the same function (= they are symmetric).
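A minimal numpy sketch of this symmetry problem, assuming a toy 2-2-1 network with a tanh hidden layer; a constant value is used instead of strict zeros so the shared gradients are non-trivial (with exact zeros they are all zero, which is trivially symmetric too):

```python
import numpy as np

np.random.seed(0)
X = np.random.randn(2, 5)          # 2 input features, 5 examples (illustrative)
y = np.random.randn(1, 5)

# Constant (symmetric) initialization -- every hidden unit starts identical
W1 = np.full((2, 2), 0.5); b1 = np.zeros((2, 1))
W2 = np.full((1, 2), 0.5); b2 = np.zeros((1, 1))

# Forward pass: tanh hidden layer, linear output
Z1 = W1 @ X + b1
A1 = np.tanh(Z1)
Z2 = W2 @ A1 + b2

# Backprop for squared error
dZ2 = Z2 - y
dZ1 = (W2.T @ dZ2) * (1 - A1**2)
dW1 = dZ1 @ X.T / X.shape[1]

print(np.allclose(dZ1[0], dZ1[1]))  # True: both hidden units get the same gradient
print(np.allclose(dW1[0], dW1[1]))  # True: the rows of dW1 are identical
```

Since both rows of $W1$ receive identical updates, gradient descent can never make the hidden units differ; random initialization breaks this tie.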
Solution: $W^{[i]} = np.random.randn(2, 2) * 0.01$ (note: `np.random.randn` takes the dimensions as separate arguments, not a tuple).

The factor $0.01$ keeps the weights small; with large values we would end up at the flat ends of the activation function, where the slopes are small and learning is slow.
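A quick check of why the small factor matters, for a tanh activation (layer sizes, seed, and the deliberately large counter-example scale are illustrative assumptions):

```python
import numpy as np

np.random.seed(1)
X = np.random.randn(3, 1000)              # 3 features, 1000 examples

W_small = np.random.randn(4, 3) * 0.01    # the recommended small scale
W_large = np.random.randn(4, 3) * 10.0    # deliberately too large

A_small = np.tanh(W_small @ X)
A_large = np.tanh(W_large @ X)

# tanh'(z) = 1 - tanh(z)^2: near 1 in the linear region, near 0 when saturated
print((1 - A_small**2).mean())   # prints a value near 1 -> healthy gradients
print((1 - A_large**2).mean())   # prints a value near 0 -> vanishing slopes
```

With the $0.01$ scale the pre-activations stay near zero, where tanh is roughly linear and its slope is close to 1; with large weights most units saturate and the average slope collapses.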
  • Last modified: 2017/08/19 22:38
  • by phreazer