data_mining:crossvalidation

data_mining:crossvalidation [2018/05/10 14:09] – created phreazer
data_mining:crossvalidation [2018/05/10 14:30] (current) – [Mismatched train/test distribution] phreazer
Line 1: Line 1:
 ====== Crossvalidation ======
 +===== Train/Dev/Test sets =====
  
 Training set | Hold-out / Dev set | Test set
Line 7: Line 8:
  
 For NNs with large data sets: Often 98%/1%/1%
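A minimal sketch of such a split, assuming the examples fit in memory as a plain list. The function name ''train_dev_test_split'' and its parameters are hypothetical, chosen only for illustration; the default fractions mirror the 98%/1%/1% ratio mentioned above.

```python
import random


def train_dev_test_split(items, dev_frac=0.01, test_frac=0.01, seed=0):
    """Shuffle items and cut them into train/dev/test lists.

    Hypothetical helper: the defaults give a 98%/1%/1% split.
    """
    if dev_frac < 0 or test_frac < 0 or dev_frac + test_frac >= 1:
        raise ValueError("dev_frac and test_frac must be >= 0 and sum to < 1")

    shuffled = list(items)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n_dev = round(len(shuffled) * dev_frac)
    n_test = round(len(shuffled) * test_frac)

    dev = shuffled[:n_dev]
    test = shuffled[n_dev:n_dev + n_test]
    train = shuffled[n_dev + n_test:]
    return train, dev, test


if __name__ == "__main__":
    train, dev, test = train_dev_test_split(range(100_000))
    print(len(train), len(dev), len(test))
```

With very large data sets, 1% is still many examples in absolute terms, which is why such a small dev/test share can be enough.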
 +
 +===== Mismatched train/test distribution =====
 +  * Train and test sets should come from the same distribution, but sometimes the training set needs more data, which then has to come from a different distribution
 +      * In that case, the dev and test sets should still come from the same distribution as each other
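The points above could be sketched as follows. This is an illustration under stated assumptions, not a prescribed recipe: ''target'' stands for examples from the distribution you actually care about, ''extra'' for additional data from a different distribution, and the function name ''split_with_extra_training_data'' is made up for this example. Dev and test are drawn only from ''target''; everything else goes into training.

```python
import random


def split_with_extra_training_data(target, extra, n_dev, n_test, seed=0):
    """Build train/dev/test sets when extra training data is off-distribution.

    Hypothetical helper: dev and test come only from `target`, so both
    share one distribution; `extra` is used for training only.
    """
    target = list(target)
    if n_dev + n_test > len(target):
        raise ValueError("not enough target examples for dev and test")

    shuffled = target[:]
    random.Random(seed).shuffle(shuffled)

    dev = shuffled[:n_dev]
    test = shuffled[n_dev:n_dev + n_test]
    # Leftover target examples join the off-distribution data for training.
    train = list(extra) + shuffled[n_dev + n_test:]
    return train, dev, test


if __name__ == "__main__":
    # Each example is tagged with its source so the split can be inspected.
    target = [("target", i) for i in range(10_000)]
    extra = [("extra", i) for i in range(200_000)]
    train, dev, test = split_with_extra_training_data(target, extra, 2_500, 2_500)
    print(len(train), len(dev), len(test))
```

The training set then no longer matches the dev/test distribution, which is the trade-off the list describes.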
 +
 +Sometimes there is no test set, so you don't have an **unbiased** estimate of performance (you might overfit to the dev set).
  • data_mining/crossvalidation.txt
  • Last modified: 2018/05/10 14:30
  • by phreazer