

Loss functions

Cross-entropy measures the difference between two probability distributions. For binary classification with true label $y \in \{0, 1\}$ and predicted probability $\hat{y}$, the loss is:

$-(y \log(\hat{y}) + (1-y) \log(1-\hat{y}))$

(The negative sign is needed because $\log$ of a probability in $(0, 1)$ is negative, so the loss stays positive.)
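
A minimal sketch of this loss in plain Python (the function name bce and the clipping constant eps are illustrative choices, not from the original note):

  import math

  def bce(y, y_hat, eps=1e-12):
      # Clip the prediction away from 0 and 1 so log() stays finite.
      y_hat = min(max(y_hat, eps), 1 - eps)
      # Negative log-likelihood of the true label y under the prediction y_hat.
      return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))

  print(bce(1, 0.9))  # ~0.105: confident and correct -> small loss
  print(bce(1, 0.1))  # ~2.303: confident and wrong -> large loss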

Recap entropy:

In binary classification, the entropy of a distribution $q(y)$ with 50:50 classes is $H(q) = \log(2)$.

For other distributions (and in general) with $C$ classes, the entropy of the distribution is $H(q) = -\sum_{c=1}^{C} q(y_c) \log(q(y_c))$

A small value of $q(y_c)$ produces a large negative log (which is then weighted by $q(y_c)$ itself): math.log(0.01) ≈ -4.6

A large value of $q(y_c)$ produces a small negative log: math.log(0.99) ≈ -0.01

The more classes, the higher the entropy: a uniform distribution over $C$ classes has entropy $\log(C)$, which grows with $C$.
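
A short sketch (the helper name entropy is an illustrative choice) that evaluates $H(q)$ for a few distributions; it reproduces $\log(2)$ for the 50:50 binary case and shows that a uniform distribution over more classes has higher entropy:

  import math

  def entropy(q):
      # H(q) = -sum_c q(y_c) * log(q(y_c)); zero-probability terms contribute 0.
      return -sum(p * math.log(p) for p in q if p > 0)

  print(entropy([0.5, 0.5]))    # 0.693... = log(2), binary 50:50
  print(entropy([0.99, 0.01]))  # ~0.056, skewed distribution -> low entropy
  print(entropy([0.25] * 4))    # 1.386... = log(4), more classes -> higher entropy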
