Differences

This shows you the differences between two versions of the page.

--- data_mining:entropie [2013/09/15 17:20] – [Mutual information] phreazer
+++ data_mining:entropie [2017/09/09 12:53] (current) – phreazer
@@ Line 1: / Line 1: @@
-====== Entropie ======
+====== Entropy ======
-Claude Shannon (1948): Information hängt mit der Überraschung zusammen
+Claude Shannon (1948): Entropy as a measure of surprise / uncertainty.
-Nachricht über ein Ereignis mit Wahscheinlichkeit p umfasst $- \mathit{log}_2 p$ Bits an Informationen
+A message about an event that occurs with probability p carries $- \mathit{log}_2 p$ bits of information
-Beispiel für eine faire Münze : $- \mathit{log}_2 0.5 = 1$
+Example of a fair coin: $- \mathit{log}_2 0.5 = 1$
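The fair-coin figure can be checked with a few lines of Python. This is a minimal sketch; the function name `self_information` is illustrative and does not come from the wiki page.

```python
from math import log2

def self_information(p: float) -> float:
    """Information content, in bits, of an event with probability p (0 < p <= 1)."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -log2(p)

# Fair coin: each outcome has probability 0.5.
print(self_information(0.5))
# A rarer event (probability 0.25) carries more information.
print(self_information(0.25))
```

Rarer events yield larger values, which matches the reading of information as surprise.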
 ====== Mutual information ======
@@ Line 28: / Line 28: @@
 H(F) + H(B) - H(F,B)
-Smoothing
+Feature selection => pick the features with the highest MI, although this is too computationally expensive
+Proxies: IDF; iteratively, AdaBoost
+More features ->
+NBC (naive Bayes classifier) improves at first, then degrades.
+Redundant features violate the naive Bayes independence assumption
+====== Example ======
+p(+) = 10,000/15,000 = 2/3\\
+p(-) = 5,000/15,000 = 1/3\\
+p(hate) = 3,000/15,000 = 0.2\\
+p(~hate) = 0.8\\
+p(hate,+) = 1/15,000 \text{(does not occur in any positive comment; 1 instead of zero => smoothing)}\\
+p(~hate,+) = 10,000/15,000 = 2/3\\
+p(hate,-) = 3,000/15,000 = 1/5\\
+p(~hate,-) = 2,000/15,000 = 2/15
 $$
-p(+)=0,75\\
+  I(H,S) = p(hate,+) \cdot \log \frac{p(hate,+)}{p(hate)p(+)} + ... =
-p(-)=0,25\\
-p(hate)=800/8000\\
-p(~hate)=7200/8000\\
-p(hate,+)=1/8000 \text{(does not occur in any positive comment; 1 instead of zero => smoothing)}\\
-p(~hate,+)=6000/8000\\
-p(hate,-)=1200/8000\\
-p(~hate,-)=0,1
 $$
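The mutual information for the 15,000-comment example can be evaluated numerically. This is a minimal sketch under two assumptions not stated on the page: the logarithm is base 2 (the formula writes only `log`), and the elided terms are the remaining three cells of the joint table. The names `p_word`, `p_label`, `p_joint` and `mutual_information` are illustrative.

```python
from math import log2

# Probabilities copied from the example (p(hate,+) uses a count of 1 instead of 0).
p_word = {"hate": 0.2, "~hate": 0.8}
p_label = {"+": 2 / 3, "-": 1 / 3}
p_joint = {
    ("hate", "+"): 1 / 15000,
    ("~hate", "+"): 10000 / 15000,
    ("hate", "-"): 3000 / 15000,
    ("~hate", "-"): 2000 / 15000,
}

def mutual_information(joint, px, py):
    """Sum p(x,y) * log2(p(x,y) / (p(x) * p(y))) over all cells with p(x,y) > 0."""
    return sum(
        pxy * log2(pxy / (px[x] * py[y]))
        for (x, y), pxy in joint.items()
        if pxy > 0
    )

mi = mutual_information(p_joint, p_word, p_label)
print(f"I(H,S) = {mi:.4f} bits")
```

With these inputs the sum comes out at roughly 0.40 bits. Because of the smoothed cell, the joint table sums to slightly more than 1, so the value is an approximation rather than an exact mutual information.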
+====== Channel capacity ======
+Maximum mutual information that can be transmitted between sender and receiver per second
+Equivalent in ML: how much training data is needed -> depends on the concept