Word embeddings

Man → Woman is like King → ?

Example: 4-dimensional embedding (gender, royal, age, food):

  • $e_{man} - e_{woman} \approx (-2, 0, 0, 0)^T$
  • $e_{king} - e_{queen} \approx (-2, 0, 0, 0)^T$

Goal: Find the word $w$ that maximizes $sim(e_w, e_{king} - e_{man} + e_{woman})$

Cosine similarity is often used as the similarity function:

$sim(u,v) = \frac{u^T v}{||u||_2 ||v||_2}$
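
A minimal NumPy sketch of the analogy search; the toy 4-dimensional vectors and the helper name cosine_similarity are illustrative assumptions, not values from these notes:

  import numpy as np

  # Toy 4-dim embeddings (gender, royal, age, food) -- made-up values for illustration
  embeddings = {
      "man":   np.array([-1.00, 0.01, 0.03, 0.09]),
      "woman": np.array([ 1.00, 0.02, 0.02, 0.01]),
      "king":  np.array([-0.95, 0.93, 0.70, 0.02]),
      "queen": np.array([ 0.97, 0.95, 0.69, 0.01]),
  }

  def cosine_similarity(u, v):
      # sim(u, v) = u^T v / (||u||_2 ||v||_2)
      return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

  # Find the word w that maximizes sim(e_w, e_king - e_man + e_woman)
  target = embeddings["king"] - embeddings["man"] + embeddings["woman"]
  best = max(
      (w for w in embeddings if w not in {"king", "man", "woman"}),
      key=lambda w: cosine_similarity(embeddings[w], target),
  )
  print(best)  # -> queen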

Embedding matrix $E$ with dimensions 300 x 10,000 (so that $E \, o_j$ is a 300-dimensional vector):

  • Dictionary with 10,000 entries
  • 300 learned features per word

The embedding vector is obtained by multiplying $E$ with the one-hot encoding $o_j$: $E \, o_j = e_j$

Goal: Learn embedding matrix $E$.
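
A quick sketch of why $E \, o_j = e_j$: multiplying $E$ by a one-hot vector simply selects one column of $E$ (shapes assumed as above; in practice a lookup is used instead of the full matrix product):

  import numpy as np

  vocab_size, n_features = 10000, 300
  rng = np.random.default_rng(0)
  E = rng.normal(size=(n_features, vocab_size))   # embedding matrix (to be learned)

  j = 4217                          # dictionary index of some word (arbitrary example)
  o_j = np.zeros(vocab_size)
  o_j[j] = 1.0                      # one-hot encoding o_j

  e_j = E @ o_j                     # E * o_j = e_j ...
  assert np.allclose(e_j, E[:, j])  # ... i.e. the j-th column of E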

Embedding layer in Keras

Given the last 4 words of a sequence, predict the next word (using $E$ as a learned parameter).

Maximize likelihood with gradient descent.
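
A hedged Keras sketch of such a model: the last 4 words (as integer indices into the dictionary) pass through a trainable Embedding layer (the matrix $E$), are concatenated, and feed a softmax over the vocabulary. Layer sizes and variable names are assumptions, not from these notes:

  from tensorflow import keras
  from tensorflow.keras import layers

  vocab_size, embedding_dim, context_len = 10000, 300, 4

  model = keras.Sequential([
      layers.Input(shape=(context_len,), dtype="int32"),  # indices of the last 4 words
      layers.Embedding(vocab_size, embedding_dim),         # the embedding matrix E (learned)
      layers.Flatten(),                                    # concatenate the 4 embedding vectors
      layers.Dense(vocab_size, activation="softmax"),      # distribution over the next word
  ])

  # Maximizing the likelihood of the next word = minimizing cross-entropy
  model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
  # model.fit(contexts, next_words, ...)  # contexts: (N, 4) ints, next_words: (N,) ints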

Other contexts:

Different choices of context can also be used to learn a word embedding:

  • 4 words on the left and right
  • the last 1 word
  • 1 nearby word (“skip-gram”)

Context and Target

“I want a glass of orange juice to go along with my cereal.”

Context: “orange”. Pick a target at random within a window around it: “juice”, “glass”, …
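
A small sketch of sampling a (context, target) pair from the sentence, assuming a +/-5-word window (function and variable names are illustrative):

  import random

  sentence = "i want a glass of orange juice to go along with my cereal".split()

  def sample_context_target(words, window=5):
      # Pick a context position, then a target at random within +/- window of it
      c = random.randrange(len(words))
      lo, hi = max(0, c - window), min(len(words), c + window + 1)
      t = random.choice([i for i in range(lo, hi) if i != c])
      return words[c], words[t]

  print(sample_context_target(sentence))  # e.g. ('orange', 'juice') or ('orange', 'glass')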

Model:

  • Vocab size = 10,000
  • Learn: context $c$ (“orange”) ⇒ target $t$ (“juice”)
  • $o_c \Rightarrow E \Rightarrow e_c \Rightarrow \text{softmax} \Rightarrow \hat{y}$
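
A sketch of this model in Keras ($o_c \Rightarrow E \Rightarrow e_c \Rightarrow \text{softmax} \Rightarrow \hat{y}$); note that the full 10,000-way softmax is expensive at training time, which is why hierarchical softmax or negative sampling are used in practice:

  from tensorflow import keras
  from tensorflow.keras import layers

  vocab_size, embedding_dim = 10000, 300

  skip_gram = keras.Sequential([
      layers.Input(shape=(1,), dtype="int32"),         # index of the context word c
      layers.Embedding(vocab_size, embedding_dim),     # o_c => E => e_c
      layers.Flatten(),
      layers.Dense(vocab_size, activation="softmax"),  # softmax over all targets => y_hat
  ])
  skip_gram.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
  # skip_gram.fit(context_indices, target_indices, ...)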