In training we have $N$ samples. For each particular example, the class it belongs to, and hence its true distribution, is known. The loss function should minimize the average cross-entropy.
+ | |||
+ | Outcome: Scalar [0,1] using sigmoid | ||
+ | |||
===== Cross-entropy =====

$-\sum_{i=1}^{C} y_i \log(\hat{y}_i)$

$C$ is the number of classes.

Outcome: vector of values in $[0,1]$ using softmax

===== Binary cross entropy with multiple labels =====

$-\sum_{i=1}^{C} \left(y_i \log(\hat{y}_i) + (1-y_i) \log(1-\hat{y}_i)\right)$

Outcome: vector of values in $[0,1]$ using sigmoid
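A sketch of the multi-label case, where each of the $C$ classes gets an independent sigmoid and several labels may be active at once (names are my own):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def multilabel_bce(y, y_hat, eps=1e-12):
    # y: binary indicator per class (not one-hot; multiple 1s allowed)
    # y_hat: one independent sigmoid probability per class
    y_hat = np.clip(y_hat, eps, 1.0 - eps)
    return -np.sum(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

y = np.array([1.0, 0.0, 1.0, 1.0])             # labels for C = 4 classes
y_hat = sigmoid(np.array([1.5, -2.0, 0.2, 3.0]))
loss = multilabel_bce(y, y_hat)
```

Unlike the softmax case, the class probabilities here need not sum to 1, since each class is treated as its own binary problem.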