Differences
This shows you the differences between two versions of the page.
data_mining:neural_network:initialization [2017/08/19 20:37] – [Random initialization] phreazer | data_mining:neural_network:initialization [2017/08/19 20:38] (current) – [Random initialization] phreazer | ||
---|---|---|---|
Line 8: | Line 8: | ||
Solution: $W^{[i]}=np.random.randn((2, | Solution: $W^{[i]}=np.random.randn((2, | ||
- | $0.01$ because else we would end up at ends of activation function values (and slopes would be small). | + | $0.01$ because else we would end up at ends of activation function values (and slopes would be small), e.g. if values would be large. |
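The reasoning in the changed line can be checked numerically. The following is a minimal sketch, not the page author's code: it assumes a tanh activation, a hypothetical 2x2 weight matrix and 2x5 input, and an arbitrary "large" scale of 100 for contrast. Note that `np.random.randn` takes the dimensions as separate integers (`randn(2, 2)`), not as a tuple.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the sketch is reproducible


def tanh_slope(z):
    """Derivative of tanh at z: 1 - tanh(z)^2."""
    return 1.0 - np.tanh(z) ** 2


# Hypothetical toy input: 2 features, 5 examples (assumption, not from the page).
x = np.random.randn(2, 5)

# Small random weights, scaled by 0.01 as in the note.
W_small = np.random.randn(2, 2) * 0.01
# Large random weights; the factor 100 is an arbitrary choice for contrast.
W_large = np.random.randn(2, 2) * 100.0

# Average slope of the activation at the pre-activations z = W x.
slope_small = tanh_slope(W_small @ x).mean()
slope_large = tanh_slope(W_large @ x).mean()

print("mean tanh slope, small weights:", slope_small)
print("mean tanh slope, large weights:", slope_large)
```

With small weights the pre-activations stay near zero, where tanh is steep, so gradients are usable. With large weights most pre-activations land in the flat tails of tanh, where the slope is close to zero and learning slows down.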