Train / Dev / Test set
Traditionally: 60 / 20 / 20
Big Data (>1 million cases): 99.5 / 0.4 / 0.1
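A minimal sketch of how such a split could be produced, assuming the data can be addressed by index and that a plain random shuffle is acceptable. The function name `split_indices` and its parameters are hypothetical, not taken from any library.

```python
import random

def split_indices(n, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle range(n) and cut it into train/dev/test index lists.

    `fractions` is (train, dev, test) and must sum to 1.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    n_dev = round(n * fractions[1])
    n_test = round(n * fractions[2])
    n_train = n - n_dev - n_test  # train takes whatever is left after rounding
    train = indices[:n_train]
    dev = indices[n_train:n_train + n_dev]
    test = indices[n_train + n_dev:]
    return train, dev, test

# Traditional 60 / 20 / 20 split on a small (hypothetical) data set
train, dev, test = split_indices(1000)
print(len(train), len(dev), len(test))
```

For a very large data set the same function can be called with much smaller dev and test fractions, e.g. `fractions=(0.995, 0.004, 0.001)`, as long as the three values sum to 1.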
Another problem: make sure that the dev and test sets come from the same distribution (e.g. low-quality user-uploaded pictures vs. high-res pictures).
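One simple way to get dev and test sets from the same distribution is to draw both from a single shuffled pool. The sketch below assumes a held-out pool that mixes both picture sources; the names `make_dev_test`, `share`, and the `(source, id)` tuples are made up for illustration.

```python
import random

def make_dev_test(pool, seed=0):
    """Shuffle one pool of examples and cut it in half into dev and test."""
    shuffled = list(pool)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def share(examples, source):
    """Fraction of examples that come from the given source."""
    return sum(1 for s, _ in examples if s == source) / len(examples)

# Hypothetical held-out pool: (source, id) pairs from both picture sources
pool = [("user_upload", i) for i in range(500)] + [("high_res", i) for i in range(500)]
dev, test = make_dev_test(pool)

# Both sets should contain roughly the same mix of sources
print(share(dev, "user_upload"), share(test, "user_upload"))
```

Because both sets are cut from the same shuffled pool, neither ends up with only user uploads or only high-res pictures.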