====== Word embeddings ======

===== Word2Vec =====
  * Learn context c ("orange") => target t ("juice")
  * $o_c \Rightarrow E \Rightarrow e_c \Rightarrow \text{softmax} \Rightarrow \hat{y}$
  * The softmax has one parameter vector $\Theta_t$ per target word $t$
  * $L(\hat{y},y) = - \sum^{10000}_{i=1} y_i \log \hat{y}_i$ (see the sketch after this list)
  * $y$ is a one-hot vector (10,000-dimensional)
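
A minimal numpy sketch of this forward pass and loss (matrix names and shapes are assumptions, not from the page):

<code python>
import numpy as np

def skipgram_softmax_loss(c, t, E, Theta):
    """Softmax model for p(t | c); a sketch assuming
    E: (10000, d) embedding matrix, Theta: (10000, d) softmax parameters,
    c, t: integer word ids for context and target."""
    e_c = E[c]                  # o_c => E => e_c (embedding lookup)
    z = Theta @ e_c             # one score Theta_t^T e_c per target word
    y_hat = np.exp(z - z.max())
    y_hat /= y_hat.sum()        # softmax over the 10,000-word vocabulary
    return -np.log(y_hat[t])    # L(y_hat, y) with y one-hot at position t
</code>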

Problem with softmax classification: it is slow, because the normalization sums over the whole 10,000-word vocabulary.

Solution: hierarchical softmax, a tree of binary classifiers with cost $O(\log |V|)$ per prediction instead of $O(|V|)$. Common words are placed near the top, so the tree is not balanced.
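
A sketch of the tree-of-classifiers idea (the path encoding and all names are assumptions, not from the page): every internal node holds a binary classifier, and $p(t \mid c)$ is the product of the decisions along the root-to-leaf path of $t$, so only $O(\log |V|)$ classifiers are evaluated.

<code python>
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hierarchical_softmax_prob(e_c, path_nodes, path_signs, node_vecs):
    """p(target | context) as a product of binary decisions.
    path_nodes: internal-node ids on the root-to-leaf path of the target,
    path_signs: +1/-1 for branching left/right, node_vecs: (n_nodes, d)."""
    p = 1.0
    for n, s in zip(path_nodes, path_signs):
        p *= sigmoid(s * (node_vecs[n] @ e_c))  # one binary classifier per node
    return p
</code>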

=== How to sample context c? ===

When sampling uniformly at random, the drawn contexts are dominated by frequent words like "the", "of", "a", ...

In practice, heuristics are used that balance frequent and rare words.
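
One common such heuristic is the subsampling rule from the original word2vec paper (an assumption here; the page does not name a specific heuristic):

<code python>
import numpy as np

def keep_prob(freq, t=1e-5):
    """Probability of keeping a word with relative frequency `freq` when
    sampling contexts; very frequent words ("the", "of", ...) are kept rarely."""
    return np.minimum(1.0, np.sqrt(t / freq))
</code>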

==== Negative Sampling ====

Generate a data set (sketched below):
  * Pick 1 positive example (a context word and an actual target word): target = 1
  * Pick k negative examples
    * Choose random words from the dictionary that are not associated with the context word: target = 0
    * Sample from a heuristic distribution between the uniform and the observed word distribution, e.g. $P(w_i) \propto f(w_i)^{3/4}$

This replaces the 10,000-way softmax with 10,000 binary classification problems, of which only $k+1$ are trained per example.
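
A sketch of the data-set generation (integer word ids, unigram counts, and the rng setup are assumptions):

<code python>
import numpy as np

rng = np.random.default_rng(0)

def sampling_dist(counts):
    """Heuristic between the uniform and the observed distribution:
    f(w)^(3/4), normalized."""
    p = counts.astype(float) ** 0.75
    return p / p.sum()

def make_examples(context, target, counts, k=4):
    """1 positive pair (target = 1) plus k negative pairs (target = 0).
    For simplicity this may occasionally redraw the true target as a negative."""
    negs = rng.choice(len(counts), size=k, p=sampling_dist(counts))
    return [(context, target, 1)] + [(context, int(w), 0) for w in negs]
</code>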

==== GloVe word vectors ====

Global vectors for word representation.

$x_{ij}$: number of times word $i$ appears in the context of word $j$.

Minimize $\sum_{i=1}^{10000} \sum_{j=1}^{10000} f(x_{ij}) (\Theta_i^{T} e_j + b_i + b'_j - \log x_{ij})^2$

Weighting term $f(x_{ij})$: equals 0 when $x_{ij} = 0$ (so the $\log$ term is skipped) and balances the weight of very frequent and infrequent words.

$\Theta_w$ and $e_w$ play symmetric roles in the objective, so the final embedding averages them: $e^{final}_w = \frac{e_w + \Theta_w}{2}$
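
A numpy sketch of the objective (matrix shapes and names are assumptions):

<code python>
import numpy as np

def glove_objective(Theta, E, b, b_prime, X, f):
    """Weighted least-squares GloVe cost.
    Theta, E: (V, d) parameter/embedding matrices; b, b_prime: (V,) biases;
    X: (V, V) co-occurrence counts; f: (V, V) weights with f = 0 where X = 0."""
    logX = np.log(np.where(X > 0, X, 1.0))   # placeholder value where f == 0
    err = Theta @ E.T + b[:, None] + b_prime[None, :] - logX
    return np.sum(f * err ** 2)

def final_embeddings(Theta, E):
    """Theta and e are symmetric in the objective, so average them."""
    return (E + Theta) / 2
</code>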

===== Application =====

==== Sentiment classification ====

=== Simple model ===

  * Extract the embedding vector for each word
  * Sum or average those vectors
  * Pass the result to a softmax to obtain the output (1-5 stars)

Problem: the order/sequence of the words is ignored, so a review like "completely lacking in good taste" averages toward positive because of "good".
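
A sketch of this averaging model (shapes and names are assumptions):

<code python>
import numpy as np

def predict_stars(word_ids, E, W, c):
    """Average-embedding classifier. E: (V, d) embedding matrix,
    W: (5, d) softmax weights, c: (5,) biases; returns p(1..5 stars)."""
    avg = E[word_ids].mean(axis=0)   # average of the word embeddings
    z = W @ avg + c
    p = np.exp(z - z.max())
    return p / p.sum()               # softmax over the 5 ratings
</code>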

=== RNN for sentiment classification ===

  * Extract the embedding vector for each word
  * Feed the sequence into a many-to-one RNN with a softmax output at the last time step (sketch below)
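
A minimal many-to-one RNN sketch (all weight names are assumptions); because the words are fed in sequence, word order now matters:

<code python>
import numpy as np

def rnn_predict_stars(word_ids, E, Waa, Wax, ba, Wya, by):
    """E: (V, d) embeddings; Waa: (h, h), Wax: (h, d), ba: (h,) RNN weights;
    Wya: (5, h), by: (5,) output layer."""
    a = np.zeros(Waa.shape[0])
    for t in word_ids:                        # feed embeddings in sequence
        a = np.tanh(Waa @ a + Wax @ E[t] + ba)
    z = Wya @ a + by
    p = np.exp(z - z.max())
    return p / p.sum()                        # softmax at the final time step
</code>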

===== Debiasing word embeddings =====

Word embeddings pick up the biases present in the training text (e.g. gender stereotypes).

Addressing bias in word embeddings (a sketch of the projections follows after this list):

  - Identify the bias direction (e.g. gender)
    * e.g. average difference vectors such as $e_{he} - e_{she}$
  - Neutralize: for every word that is not definitional (i.e. has no legitimate gender component), project out the bias direction
  - Equalize pairs: for word pairs whose only difference should be gender (e.g. grandfather vs. grandmother), make both words equidistant from the bias-neutral subspace
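
A sketch of the neutralize and equalize projections (Bolukbasi et al. style; assumes unit-length embeddings and a known bias direction $g$, both assumptions):

<code python>
import numpy as np

def neutralize(e, g):
    """Remove the component of e along the bias direction g."""
    return e - (e @ g) / (g @ g) * g

def equalize(e1, e2, g):
    """Make a definitional pair differ only along g and sit equidistant
    from the bias-neutral subspace. Assumes unit-length embeddings."""
    g_unit = g / np.linalg.norm(g)
    mu_orth = neutralize((e1 + e2) / 2, g)        # shared, bias-free part
    r = np.sqrt(max(1.0 - mu_orth @ mu_orth, 0.0))
    s = 1.0 if (e1 - e2) @ g_unit >= 0 else -1.0  # keep each word on its side
    return mu_orth + r * s * g_unit, mu_orth - r * s * g_unit
</code>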
  
  