Differences

This shows you the differences between two versions of the page.

--- data_mining:mutual_information [2015/08/13 22:31] – [Naive KNN] phreazer
+++ data_mining:mutual_information [2015/08/14 00:21] (current) – [Beispiel:] phreazer
@@ Line 7: / Line 7: @@
 **Entropy-based**
 Difference between source entropy and equivocation, or between received entropy and irrelevance.
@@ Line 25: / Line 26: @@
 $D(P||Q) = KL(P,Q) = \sum_{x \in X} P(x) \log \frac{P(x)}{Q(x)}$
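As a small illustration of the KL formula, the following sketch evaluates it for two discrete distributions given as probability lists over the same outcomes. The function name `kl_divergence`, the use of the natural logarithm (nats), and the example distributions are assumptions made for illustration; they are not part of the page.

```python
import math

def kl_divergence(p, q):
    """D(P||Q) = sum_x P(x) * log(P(x) / Q(x)), in nats."""
    if len(p) != len(q):
        raise ValueError("p and q must cover the same outcomes")
    total = 0.0
    for px, qx in zip(p, q):
        if px == 0.0:
            continue  # convention: 0 * log(0 / q) = 0
        if qx == 0.0:
            return math.inf  # P puts mass where Q has none
        total += px * math.log(px / qx)
    return total

# Hypothetical example distributions over two outcomes.
p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl_divergence(p, q))  # D(P||Q)
print(kl_divergence(q, p))  # D(Q||P), generally a different value
```

The two printed values differ, which shows that the KL divergence is not symmetric in its arguments.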
+==== Example: ====
+F is the feature and T is the target => I(F,T)
+See also https://www.youtube.com/watch?v=hlGJ1M8T5oA
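For discrete data, the mutual information between a feature and a target is the KL divergence between their joint distribution and the product of their marginals. The sketch below estimates it from paired observations using empirical frequencies. The function name `mutual_information` and the toy data are assumptions chosen for illustration only.

```python
import math
from collections import Counter

def mutual_information(features, targets):
    """Plug-in estimate of I(F,T) in nats from paired discrete samples."""
    if len(features) != len(targets):
        raise ValueError("features and targets must have the same length")
    n = len(features)
    joint = Counter(zip(features, targets))  # counts of (f, t) pairs
    f_counts = Counter(features)             # marginal counts of F
    t_counts = Counter(targets)              # marginal counts of T
    mi = 0.0
    for (f, t), count in joint.items():
        p_ft = count / n
        p_f = f_counts[f] / n
        p_t = t_counts[t] / n
        mi += p_ft * math.log(p_ft / (p_f * p_t))
    return mi

# Hypothetical toy data: the first target is fully determined by the feature,
# the second target is independent of it.
feature = [0, 0, 1, 1]
print(mutual_information(feature, ["a", "a", "b", "b"]))
print(mutual_information(feature, [0, 1, 0, 1]))
```

In the first call the feature determines the target, so the estimate equals the entropy of a fair coin, log 2. In the second call the empirical joint distribution factorizes and the estimate is 0.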
 ===== Estimators =====
@@ Line 30: / Line 34: @@
 $x$ is a d-dimensional continuous random variable with pdf $p$ and marginal densities $p_j$ for each $x_j$.
-\begin{align}
+\begin{align}H(x)& = - \int_{\mathbb{R}^d} p(x) \log p(x) \, dx \\I(x)& = \int_{\mathbb{R}^d} p(x) \log \frac{p(x)}{\prod_{j=1}^{d} p_j(x_j)} \, dx\end{align}
-H(x)& = - \int_{\mathbb{R}^d} p(x) \log p(x) \, dx \\
-I(x)& = \int_{\mathbb{R}^d} p(x) \log \frac{p(x)}{\prod_{j=1}^{d} p_j(x_j)} \, dx
-\end{align}
+For $d>2$, the generalized MI is the total correlation or multi-information. Given N i.i.d. samples $X$, the goal is to estimate $I(x)$ from the samples.
+Naive KNN estimator:
-For $d>2$, the generalized MI is the total correlation or multi-information. Given N i.i.d. samples $\Chi$, the goal is to estimate $I(x)$ from the samples.
+- Asymptotically unbiased estimator
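The page does not spell out the naive KNN estimator. One common reading, assumed here, is a plug-in estimate: approximate the joint density and each marginal density at every sample by a k-nearest-neighbour density estimate, then average the log ratio over the samples. The sketch below follows that reading with brute-force neighbour search and the Euclidean norm. The function names, the default `k=5`, the sample size, and the Gaussian test data are all assumptions, and no bias correction is applied.

```python
import math
import random

def knn_log_density(points, k):
    """Log of the kNN density estimate at each sample.

    points: list of equal-length tuples; requires 1 <= k < len(points)
    and no duplicate points (a zero distance would give log(0)).
    """
    n, d = len(points), len(points[0])
    # Log volume of the d-dimensional Euclidean unit ball.
    log_unit_ball = (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1)
    log_p = []
    for i, x in enumerate(points):
        # Brute force: distances from x to all other samples, O(n^2) overall.
        dists = sorted(math.dist(x, y) for j, y in enumerate(points) if j != i)
        r_k = dists[k - 1]  # distance to the k-th nearest neighbour
        # p_hat(x) = k / ((n - 1) * unit_ball_volume * r_k^d)
        log_p.append(math.log(k / (n - 1)) - log_unit_ball - d * math.log(r_k))
    return log_p

def naive_knn_mi(samples, k=5):
    """Average of log p_hat(x_i) - sum_j log p_hat_j(x_ij) over all samples."""
    d = len(samples[0])
    log_joint = knn_log_density(samples, k)
    log_marginals = [
        knn_log_density([(x[j],) for x in samples], k) for j in range(d)
    ]
    total = 0.0
    for i in range(len(samples)):
        total += log_joint[i] - sum(m[i] for m in log_marginals)
    return total / len(samples)

# Hypothetical test data: 2-d Gaussian samples with and without correlation.
rng = random.Random(0)
n, rho = 400, 0.9
dependent, independent = [], []
for _ in range(n):
    a, b = rng.gauss(0, 1), rng.gauss(0, 1)
    dependent.append((a, rho * a + math.sqrt(1 - rho ** 2) * b))
    independent.append((a, b))

mi_dependent = naive_knn_mi(dependent)
mi_independent = naive_knn_mi(independent)
print(mi_dependent, mi_independent)
```

The estimate for the correlated samples should come out clearly larger than the one for the independent samples. For larger N, a tree-based neighbour search would be a more practical choice than the quadratic loop used here.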