Differences
This shows you the differences between two versions of the page.
data_mining:neural_network:model_combination [2017/04/01 13:29] – [Dropout] phreazer | data_mining:neural_network:model_combination [2017/08/19 20:12] (current) – [Approximating full Bayesian learning in a NN] phreazer | ||
---|---|---|---|
Line 85: | Line 85: | ||
More complicated but more effective methods than plain MCMC exist: they don't need to wander the space for as long. | More complicated but more effective methods than plain MCMC exist: they don't need to wander the space for as long. | ||
- | If we compute gradient of cost function on a **random mini-batch**, | + | If we compute gradient of cost function on a **random mini-batch**, |
====== Dropout ====== | ====== Dropout ====== | ||
- | Ways to combine output of multiple models: | + | See [[data_mining:neural_network:regularization|Regularization]] |
- | * MIXTURE: Combine models by averaging their output probabilities. | + | |
- | * PRODUCT: by geometric mean (typically less than one) $\sqrt{x*y}/ | + |
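
The two combination rules listed in the removed lines (MIXTURE and PRODUCT) can be illustrated with a minimal Python sketch. The function names and the example probability vectors are hypothetical and not taken from the page. The PRODUCT formula is truncated in the diff, so the renormalization step used here (dividing the geometric means by their sum so that the result is again a distribution) is an assumption.

```python
import math


def mixture(p, q):
    """MIXTURE: arithmetic mean of two models' output probabilities."""
    return [(a + b) / 2 for a, b in zip(p, q)]


def geometric_means(p, q):
    """Element-wise geometric mean sqrt(x*y); these usually sum to less than one."""
    return [math.sqrt(a * b) for a, b in zip(p, q)]


def product(p, q):
    """PRODUCT: geometric means, renormalized to sum to one (assumed normalization)."""
    g = geometric_means(p, q)
    total = sum(g)
    return [v / total for v in g]


# Hypothetical output distributions of two models over three classes.
model_a = [0.3, 0.2, 0.5]
model_b = [0.1, 0.8, 0.1]

mix = mixture(model_a, model_b)
raw = geometric_means(model_a, model_b)
prod = product(model_a, model_b)

print("mixture:", mix)
print("geometric means (unnormalized):", raw, "sum =", sum(raw))
print("product:", prod)
```

In this sketch the mixture keeps some probability on every class that either model supports, while the product concentrates on classes that both models consider plausible.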