data_mining:neural_network:model_combination

    * Full Bayesian learning is expensive, but works much better than maximum likelihood (ML) learning when the posterior is vague or multimodal (data is scarce).
  
Monte Carlo method
  
Idea: Might be good enough to sample weight vectors according to their posterior probabilities.

$p(y_{\text{test}} | \text{input}_\text{test}, D) = \sum_i p(W_i|D) p(y_{\text{test}} | \text{input}_\text{test}, W_i)$
  
Sample weight vectors $W_i$ with probability $p(W_i|D)$.
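
A rough sketch of that averaging step (assuming a hypothetical ''predict(W, x)'' function returning class probabilities and a list of weight vectors already sampled from the posterior): because each $W_i$ is drawn with probability $p(W_i|D)$, the posterior-weighted sum turns into a plain average over the samples.

<code python>
import numpy as np

def mc_predictive(sampled_weights, predict, x_test):
    """Approximate p(y_test | input_test, D) by averaging predictions
    over weight vectors sampled from the posterior p(W|D)."""
    preds = np.stack([predict(W, x_test) for W in sampled_weights])
    return preds.mean(axis=0)  # equal weights: each W_i was drawn with prob. p(W_i|D)

# Toy usage: the "network" is a single logistic unit on two inputs.
def predict(W, x):
    p = 1.0 / (1.0 + np.exp(-x @ W))   # probability of class 1
    return np.array([1.0 - p, p])

rng = np.random.default_rng(0)
samples = [rng.normal(size=2) for _ in range(100)]  # stand-in for posterior samples
print(mc_predictive(samples, predict, np.array([0.5, -1.0])))
</code>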

In backpropagation, we keep moving the weights in the direction that decreases the cost.

With sampling: add some Gaussian noise to the weight vector after each update.
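
A minimal sketch of that modified update (the gradient function, learning rate, and noise scale are hypothetical): the only change from an ordinary gradient step is the Gaussian noise added after it.

<code python>
import numpy as np

rng = np.random.default_rng(0)

def noisy_update(W, grad_cost, lr=0.01, noise_std=0.01):
    """One update: a gradient step that decreases the cost, followed by
    Gaussian noise so the weight vector keeps exploring."""
    W = W - lr * grad_cost(W)                              # ordinary backprop-style step
    return W + rng.normal(scale=noise_std, size=W.shape)   # added Gaussian noise

# Toy usage: quadratic cost with its minimum at (1, -2).
grad_cost = lambda W: 2.0 * (W - np.array([1.0, -2.0]))
W = np.zeros(2)
for _ in range(1000):
    W = noisy_update(W, grad_cost)
print(W)  # hovers around the minimum instead of settling exactly on it
</code>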

Markov Chain Monte Carlo

If we use just the right amount of noise, and if we let the weight vector wander around for long enough before we take a sample, we will get an unbiased sample from the true posterior over weight vectors.

There are more complicated and more effective methods than this basic MCMC approach that do not need to let the weight vector wander around the space for as long.

If we compute the gradient of the cost function on a **random mini-batch**, we will get an unbiased estimate of the full gradient, with sampling noise.
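
A small numerical check of that claim on a made-up squared-error cost: each mini-batch gradient is noisy, but averaged over many mini-batches it matches the full-data gradient.

<code python>
import numpy as np

rng = np.random.default_rng(0)

# Made-up linear regression data with a squared-error cost.
X = rng.normal(size=(1000, 2))
y = X @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=1000)

def grad(W, idx):
    """Gradient of the mean squared error over the examples indexed by idx."""
    err = X[idx] @ W - y[idx]
    return 2.0 * X[idx].T @ err / len(idx)

W = np.zeros(2)
full = grad(W, np.arange(len(X)))                                    # full-data gradient
mini = [grad(W, rng.choice(len(X), 32, replace=False)) for _ in range(500)]

print(full)                   # true gradient of the cost
print(np.mean(mini, axis=0))  # unbiased: the mini-batch average is close to the full gradient
print(np.std(mini, axis=0))   # but any single mini-batch estimate carries sampling noise
</code>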