Training set | Hold-out / Dev set | Test set

  • Goal of dev set: Decide which algorithm (or hyperparameter setting) does better
  • Goal of test set: Unbiased performance estimate of the final classifier

For NNs with large data sets: Often 98% / 1% / 1% (train / dev / test)
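
A minimal sketch of such a split, assuming the examples can be addressed by index and that one million examples are available. The function name `split_indices`, the seed, and the data set size are illustrative assumptions, not part of the notes:

```python
import random

def split_indices(n, dev_frac=0.01, test_frac=0.01, seed=0):
    """Shuffle the indices 0..n-1 and cut them into train/dev/test lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # fixed seed so the split is reproducible
    n_dev = round(n * dev_frac)
    n_test = round(n * test_frac)
    dev = idx[:n_dev]
    test = idx[n_dev:n_dev + n_test]
    train = idx[n_dev + n_test:]      # everything else goes to training
    return train, dev, test

# Assumed data set size: 1,000,000 examples.
train, dev, test = split_indices(1_000_000)
print(len(train), len(dev), len(test))  # 980000 10000 10000
```

With a million examples, 1% still leaves 10,000 examples each for dev and test, which is usually enough to compare algorithms and to estimate final performance.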

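  • All three sets should come from the same distribution, but sometimes the training set needs more data, which can only be taken from a different distribution
    • In that case, dev and test set should still come from the same distribution as each other (the one you care about)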
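
One way to set this up, as a sketch under assumptions: `target` stands for the (small) data from the distribution you care about, `extra` for the additional data from a different distribution. Both names, the function `split_with_extra_data`, and all sizes are hypothetical. Dev and test are drawn only from `target`; the extra data goes into training only.

```python
import random

def split_with_extra_data(target, extra, n_dev, n_test, seed=0):
    """Draw dev/test only from `target`; add `extra` to the training set."""
    rng = random.Random(seed)
    target = list(target)
    rng.shuffle(target)
    dev = target[:n_dev]
    test = target[n_dev:n_dev + n_test]
    # Remaining target data plus all extra data is used for training.
    train = target[n_dev + n_test:] + list(extra)
    rng.shuffle(train)
    return train, dev, test

# Hypothetical sizes: 10,000 target examples, 200,000 extra examples.
target = [("target", i) for i in range(10_000)]
extra = [("extra", i) for i in range(200_000)]
train, dev, test = split_with_extra_data(target, extra, n_dev=2_500, n_test=2_500)
print(len(train), len(dev), len(test))  # 205000 2500 2500
```

Here the training distribution differs from dev/test, but dev and test both measure performance on the distribution of interest.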

Sometimes there is no test set. Then you don't have an unbiased performance estimate, because choices tuned on the dev set might overfit to it.