data_mining:error_analysis

Differences

This shows you the differences between two versions of the page.

data_mining:error_analysis [2018/05/21 19:30] – [Working on most promising problems] phreazer
data_mining:error_analysis [2018/05/21 19:38] – [Working on most promising problems] phreazer
Line 130:
 Result: Calculate the percentage of each problem category (the potential improvement "ceiling")
  
 +====== Mislabeled data ======
 +
 +DL algorithms: if the % of label errors is //low// and the errors are //random//, the algorithms are robust to them.
 +
 +Add another column, "incorrectly labeled", to the error analysis spreadsheet.
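
The extra column can be tallied like any other problem category. Below is a minimal sketch, assuming the spreadsheet is exported as one row per misclassified dev example with one boolean column per category; the row data and the category names other than "incorrectly labeled" are hypothetical.

```python
# Hypothetical error-analysis sheet: one row per misclassified dev example,
# one boolean column per problem category (illustrative data only).
rows = [
    {"id": 1, "blurry": True, "incorrectly_labeled": False},
    {"id": 2, "blurry": False, "incorrectly_labeled": True},
    {"id": 3, "blurry": True, "incorrectly_labeled": False},
    {"id": 4, "blurry": False, "incorrectly_labeled": False},
]
CATEGORIES = ["blurry", "incorrectly_labeled"]


def category_ceilings(rows, categories):
    """Return the share of reviewed errors that fall into each category.

    Each share is the improvement "ceiling": the largest fraction of the
    reviewed errors that fixing this category alone could remove.
    """
    total = len(rows)
    if total == 0:
        return {c: 0.0 for c in categories}
    return {c: sum(1 for r in rows if r.get(c)) / total for c in categories}


ceilings = category_ceilings(rows, CATEGORIES)
for name, share in ceilings.items():
    print(f"{name}: {share:.0%} of reviewed errors")
```

If the share for "incorrectly labeled" is small compared with the other categories, fixing labels is probably not the most promising problem to work on first.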
 +
 +Principles when fixing labels:
 +
 +- Apply the same process to the dev and test sets (so they keep the same distribution)
 +- Also examine examples the algorithm got right (not only the ones it got wrong)
 +- Train and dev/test data may come from different distributions (not a problem if they differ only slightly)
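
The first two principles can be sketched as a single correction step run identically on both sets. This is only an illustration under stated assumptions: the example records, the ids, and the helper `apply_label_fixes` are hypothetical, and the corrections are assumed to come from a manual review that covered both correctly and incorrectly predicted examples.

```python
def apply_label_fixes(examples, corrections):
    """Return a copy of `examples` with reviewed labels overwritten.

    `corrections` maps example id -> corrected label; examples without an
    entry keep their original label. The input list is not modified.
    """
    return [
        {**ex, "label": corrections.get(ex["id"], ex["label"])}
        for ex in examples
    ]


# Hypothetical dev and test sets.
dev = [{"id": "d1", "label": "cat"}, {"id": "d2", "label": "dog"}]
test = [{"id": "t1", "label": "dog"}, {"id": "t2", "label": "cat"}]

# Hypothetical review result, covering examples the algorithm got right
# as well as ones it got wrong.
corrections = {"d2": "cat", "t1": "cat"}

# Same function, same corrections source, applied to both sets.
dev_fixed = apply_label_fixes(dev, corrections)
test_fixed = apply_label_fixes(test, corrections)
```

Running the same step on both sets keeps dev and test on the same distribution; the training set is left untouched here, which is why it may end up slightly different.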
  • data_mining/error_analysis.txt
  • Last modified: 2018/05/21 22:24
  • by phreazer