====== Evaluation metrics and train/dev/test set ======

===== Using a single evaluation metric =====

Precision (% of examples recognized as class 1 that actually are class 1)
Use a **dev set** + a **single-number evaluation metric** to speed up iterative improvement.
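
Not from the original notes: a minimal sketch of one common single-number metric, the F1 score, which combines precision and recall into one number so candidate models can be ranked directly.

<code python>
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall -- a common single-number metric."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Classifier A: high precision, low recall; classifier B: more balanced.
print(f1_score(0.95, 0.60))  # ~0.735
print(f1_score(0.85, 0.80))  # ~0.824 -> B ranks higher on the single metric
</code>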
  
===== Metric tradeoffs =====
  
Maximize accuracy, subject to runningTime <= 100ms.
N metrics: 1 optimizing, N-1 satisficing (reaching some threshold).
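
A small sketch (model names and values are made up) of picking a model with one optimizing metric (accuracy) and one satisficing metric (running time <= 100 ms):

<code python>
# Hypothetical (name, accuracy, running_time_ms) results for three candidate models.
models = [
    ("A", 0.90, 80),
    ("B", 0.92, 95),
    ("C", 0.95, 1500),  # most accurate, but far too slow
]

THRESHOLD_MS = 100  # satisficing metric: runningTime <= 100 ms

# Keep only models that satisfy the constraint, then optimize accuracy among them.
feasible = [m for m in models if m[2] <= THRESHOLD_MS]
best = max(feasible, key=lambda m: m[1])
print(best)  # ('B', 0.92, 95)
</code>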
  
===== Train/Dev/Test set =====
  
Dev set / holdout set: try ideas on the dev set.
Solution (when dev and test data would otherwise come from different distributions): random shuffle (or stratified sample), so both reflect the same distribution.
  
==== Sizes ====
  * For 100 - 10,000 samples: 70% Train / 30% Test, or 60% Train / 20% Dev / 20% Test
  * For ~1,000,000 samples (NNs): 98% Train / 1% Dev / 1% Test (see the split sketch below)
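
A sketch of such a split using scikit-learn (assumed available); the 98/1/1 ratio and the stratification by label are just one way to implement the shuffle-and-split described above.

<code python>
from sklearn.model_selection import train_test_split

def split_98_1_1(X, y, seed=0):
    """Shuffled, stratified 98% / 1% / 1% train/dev/test split."""
    # Carve off 2% for dev+test, stratified by label so dev/test
    # come from the same distribution as the training data.
    X_train, X_rest, y_train, y_rest = train_test_split(
        X, y, test_size=0.02, stratify=y, random_state=seed)
    # Split the remaining 2% in half: 1% dev, 1% test.
    X_dev, X_test, y_dev, y_test = train_test_split(
        X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=seed)
    return (X_train, y_train), (X_dev, y_dev), (X_test, y_test)
</code>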
  
===== Change dev/test set and metric =====
  
Change the metric if the rank ordering it produces isn't "right" (i.e. it prefers models you actually consider worse).
  
E.g. the dev/test set contains high-quality images, but users upload low-quality images => change the metric and/or the dev/test set.
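
One possible way to "change the metric" in that example (the quality flag and the weight are hypothetical, not from the original notes) is to re-weight errors so that mistakes on low-quality, user-like images count more:

<code python>
import numpy as np

def weighted_error(y_true, y_pred, is_low_quality, low_quality_weight=10.0):
    """Classification error that weights mistakes on low-quality images more heavily.

    is_low_quality: hypothetical per-example boolean flag.
    """
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    weights = np.where(np.asarray(is_low_quality), low_quality_weight, 1.0)
    mistakes = (y_true != y_pred).astype(float)
    return float((weights * mistakes).sum() / weights.sum())
</code>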

====== Human level performance ======

Bayes optimal error: the best possible error any classifier could achieve.

Human-level error can be used as an estimate of Bayes error (e.g. in computer vision).

  * Human: 1%, Train: 8%, Dev: 10% => focus on bias reduction
  * Human: 7.5%, Train: 8%, Dev: 10% => focus on variance reduction (more data, regularization)

What is human-level error? Either the best performance achievable by a human (as a proxy for Bayes error) or whatever level is useful for the application.

Compare the gaps between human error, train error and dev error: train error minus human error is the avoidable bias, dev error minus train error is the variance (see the sketch below).
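
A small, illustrative sketch of using these gaps to decide what to work on, with human-level error as the Bayes-error estimate:

<code python>
def next_focus(human_error, train_error, dev_error):
    """Suggest bias vs. variance reduction from the error gaps."""
    avoidable_bias = train_error - human_error   # train error minus human-level error
    variance = dev_error - train_error           # dev error minus train error
    return "bias reduction" if avoidable_bias > variance else "variance reduction"

print(next_focus(0.010, 0.08, 0.10))  # bias reduction     (7% avoidable bias vs 2% variance)
print(next_focus(0.075, 0.08, 0.10))  # variance reduction (0.5% avoidable bias vs 2% variance)
</code>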
 +