data_mining:neural_network:model_combination

Differences

data_mining:neural_network:model_combination [2017/04/01 13:16] – [Full Bayesian Learning] phreazer
data_mining:neural_network:model_combination [2017/08/19 22:11] phreazer
Line 65:
   * NN with few parameters: put a grid over the parameter space and evaluate $p(W|D)$ at each grid point (expensive, but no local-optimum issues).
   * After evaluating each grid point, we use all of them to make predictions on test data.
-    * Expensive, but works much better than ML learning when the posterior is vague or multimodal (data is scarce).
+    * Expensive, but works much better than ML learning when the posterior is vague or multimodal (data is scarce)
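The grid procedure in the list above can be sketched for a toy case. This is only a minimal sketch under assumptions that are not in the notes: a logistic model with two weights, a Gaussian prior with standard deviation `prior_std`, and the hypothetical helper name `grid_posterior_predict`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grid_posterior_predict(X, y, X_test, grid_1d, prior_std=5.0):
    """Grid-based Bayesian prediction for a 2-weight logistic model (illustrative)."""
    # Enumerate every grid point W = (w0, w1); shape (G*G, 2).
    w0, w1 = np.meshgrid(grid_1d, grid_1d, indexing="ij")
    W = np.stack([w0.ravel(), w1.ravel()], axis=1)

    # Log-likelihood of the training data at each grid point.
    p_train = sigmoid(X @ W.T)                      # shape (N, G*G)
    eps = 1e-12                                     # avoids log(0)
    log_lik = (y[:, None] * np.log(p_train + eps)
               + (1.0 - y[:, None]) * np.log(1.0 - p_train + eps)).sum(axis=0)

    # Assumed zero-mean Gaussian prior over the weights.
    log_prior = -0.5 * (W ** 2).sum(axis=1) / prior_std ** 2

    # Normalise over the grid to obtain p(W_i | D) at each grid point.
    log_post = log_lik + log_prior
    post = np.exp(log_post - log_post.max())
    post /= post.sum()

    # Prediction: sum_i p(W_i | D) * p(y_test = 1 | input_test, W_i).
    p_test = sigmoid(X_test @ W.T)                  # shape (M, G*G)
    return p_test @ post, post
```

Here `X` is expected to carry a constant 1 in its first column so that `w0` acts as a bias. The number of grid points grows exponentially with the number of weights, which is why this only works for very small networks.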
 + 
 +Monte Carlo method 
 + 
 +Idea: Instead of evaluating every grid point, it might be good enough to sample weight vectors according to their posterior probabilities. 
 + 
 +$p(y_{\text{test}} | \text{input}_\text{test}, D) = \sum_i p(W_i|D) p(y_{\text{test}} | \text{input}_\text{test}, W_i)$ 
 + 
 +Sample weight vectors $W_i$ according to the posterior $p(W_i|D)$. 
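One common way to write the sampled version of the sum above, assuming $N$ weight vectors drawn from the posterior ($N$ is a symbol introduced here, not in the notes): because the samples already occur in proportion to their posterior probability, the weighting by $p(W_i|D)$ is replaced by a plain average.

$$p(y_{\text{test}} | \text{input}_\text{test}, D) \approx \frac{1}{N} \sum_{i=1}^{N} p(y_{\text{test}} | \text{input}_\text{test}, W_i), \qquad W_i \sim p(W|D)$$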
 + 
 +In backpropagation, we keep moving the weights in the direction that decreases the cost. 
 + 
 +With sampling: Add some Gaussian noise to the weight vector after each update. 
 + 
 +Markov Chain Monte Carlo: 
 + 
 +If we use just the right amount of noise, and if we let the weight vector wander around for long enough before we take a sample, we will get an unbiased sample from the true posterior over weight vectors. 
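The "gradient step plus Gaussian noise" idea can be sketched with unadjusted Langevin dynamics, a technique swapped in here for illustration; the notes do not name a specific sampler. The function name `langevin_samples`, its parameters, and the noise scale $\sqrt{2 \cdot \text{step}}$ are assumptions of this sketch, and the samples are only approximately distributed according to the posterior for a small step size.

```python
import numpy as np

def langevin_samples(grad_neg_log_post, w0, step=0.05, burn_in=1000,
                     thin=50, n_samples=500, seed=0):
    """Draw approximate posterior samples via noisy gradient descent (illustrative).

    grad_neg_log_post(w) must return the gradient of -log p(w | D),
    i.e. the gradient of the cost including the prior term.
    """
    rng = np.random.default_rng(seed)
    w = np.array(w0, dtype=float)
    samples = []
    total_steps = burn_in + thin * n_samples
    for t in range(1, total_steps + 1):
        # Ordinary gradient step on the cost ...
        w = w - step * grad_neg_log_post(w)
        # ... followed by Gaussian noise scaled to match the step size.
        w = w + np.sqrt(2.0 * step) * rng.normal(size=w.shape)
        # Let the chain wander (burn-in), then keep every `thin`-th state.
        if t > burn_in and (t - burn_in) % thin == 0:
            samples.append(w.copy())
    return np.array(samples)
```

As a sanity check, calling `langevin_samples(lambda w: w, np.zeros(2))` targets a standard Gaussian, so the returned samples should have mean near 0 and variance near 1 in each coordinate.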
 + 
 +There are more complicated but more efficient variants of this method: they don't need to let the weight vector wander around the space for as long before taking a sample. 
 + 
 +If we compute the gradient of the cost function on a **random mini-batch**, we get an unbiased estimate of the full gradient, with sampling noise.
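The unbiasedness claim can be checked numerically. The following is a minimal sketch assuming a linear model with mean-squared-error cost and uniformly sampled mini-batches; the function names `full_gradient` and `minibatch_gradient` are hypothetical.

```python
import numpy as np

def full_gradient(w, X, y):
    """Gradient of the cost 0.5 * mean((X w - y)^2) over the whole data set."""
    return X.T @ (X @ w - y) / len(y)

def minibatch_gradient(w, X, y, batch_size, rng):
    """Same gradient, computed on a uniformly sampled random mini-batch."""
    idx = rng.choice(len(y), size=batch_size, replace=False)
    Xb, yb = X[idx], y[idx]
    return Xb.T @ (Xb @ w - yb) / batch_size
```

A single mini-batch gradient differs from the full gradient because of sampling noise, but averaging many independent mini-batch gradients at the same `w` should approach `full_gradient(w, X, y)`.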
  • data_mining/neural_network/model_combination.txt
  • Last modified: 2017/08/19 22:12
  • by phreazer