Word embeddings

Man → Woman is like King → ?

Example: 4-dimensional embedding (gender, royal, age, food):

  • $e_{man} - e_{woman} \approx (-2, 0, 0, 0)^T$
  • $e_{king} - e_{queen} \approx (-2, 0, 0, 0)^T$

Goal: Find the word $w$ that maximizes $sim(e_w, e_{king} - e_{man} + e_{woman})$

Cosine similarity is often used as the similarity function:

$sim(u,v) = \frac{u^T v}{||u||_2 ||v||_2}$
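
A minimal NumPy sketch of the analogy search; the toy 4-dimensional vectors and the helper name cosine_similarity are illustrative assumptions, not values from these notes:

  import numpy as np

  # Toy 4-dim embeddings (gender, royal, age, food) -- made-up values for illustration
  embeddings = {
      "man":   np.array([-1.00, 0.01, 0.03, 0.09]),
      "woman": np.array([ 1.00, 0.02, 0.02, 0.01]),
      "king":  np.array([-0.95, 0.93, 0.70, 0.02]),
      "queen": np.array([ 0.97, 0.95, 0.69, 0.01]),
  }

  def cosine_similarity(u, v):
      # sim(u, v) = u^T v / (||u||_2 ||v||_2)
      return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

  # Find the word w that maximizes sim(e_w, e_king - e_man + e_woman)
  target = embeddings["king"] - embeddings["man"] + embeddings["woman"]
  best = max(
      (w for w in embeddings if w not in {"king", "man", "woman"}),
      key=lambda w: cosine_similarity(embeddings[w], target),
  )
  print(best)  # -> queen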

Embedding matrix $E$ with dimensions 300 x 10,000 (so that $E \, o_j$ is a 300-dimensional vector):

  • Dictionary with 10,000 entries
  • 300 learned features per word

The embedding vector is obtained by multiplying $E$ with the one-hot encoding $o_j$: $E \, o_j = e_j$

Goal: Learn embedding matrix $E$.
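
A quick sketch of why $E \, o_j = e_j$: multiplying $E$ by a one-hot vector simply selects one column of $E$ (shapes assumed as above; in practice a lookup is used instead of the full matrix product):

  import numpy as np

  vocab_size, n_features = 10000, 300
  rng = np.random.default_rng(0)
  E = rng.normal(size=(n_features, vocab_size))   # embedding matrix (to be learned)

  j = 4217                          # dictionary index of some word (arbitrary example)
  o_j = np.zeros(vocab_size)
  o_j[j] = 1.0                      # one-hot encoding o_j

  e_j = E @ o_j                     # E * o_j = e_j ...
  assert np.allclose(e_j, E[:, j])  # ... i.e. the j-th column of E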

Embedding layer in Keras

Given the last 4 words of a sequence, predict the next word (using $E$ as a learned parameter).

Maximize likelihood with gradient descent.
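
A hedged Keras sketch of such a model: the last 4 words (as integer indices into the dictionary) pass through a trainable Embedding layer (the matrix $E$), are concatenated, and feed a softmax over the vocabulary. Layer sizes and variable names are assumptions, not from these notes:

  from tensorflow import keras
  from tensorflow.keras import layers

  vocab_size, embedding_dim, context_len = 10000, 300, 4

  model = keras.Sequential([
      layers.Input(shape=(context_len,), dtype="int32"),  # indices of the last 4 words
      layers.Embedding(vocab_size, embedding_dim),         # the embedding matrix E (learned)
      layers.Flatten(),                                    # concatenate the 4 embedding vectors
      layers.Dense(vocab_size, activation="softmax"),      # distribution over the next word
  ])

  # Maximizing the likelihood of the next word = minimizing cross-entropy
  model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
  # model.fit(contexts, next_words, ...)  # contexts: (N, 4) ints, next_words: (N,) ints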

Other contexts:

Different choices of context can also be used to learn a word embedding:

  • 4 words on the left and right
  • the last 1 word
  • 1 nearby word (“skip-gram”)

Context and Target

“I want a glass of orange juice to go along with my cereal.”

Context: “orange”. Pick a target at random within a window around it: “juice”, “glass”, …
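
A small sketch of sampling a (context, target) pair from the sentence, assuming a +/-5-word window (function and variable names are illustrative):

  import random

  sentence = "i want a glass of orange juice to go along with my cereal".split()

  def sample_context_target(words, window=5):
      # Pick a context position, then a target at random within +/- window of it
      c = random.randrange(len(words))
      lo, hi = max(0, c - window), min(len(words), c + window + 1)
      t = random.choice([i for i in range(lo, hi) if i != c])
      return words[c], words[t]

  print(sample_context_target(sentence))  # e.g. ('orange', 'juice') or ('orange', 'glass')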

Model:

  • Vocab size = 10,000
  • Learn: context $c$ (“orange”) ⇒ target $t$ (“juice”)
  • $o_c \Rightarrow E \Rightarrow e_c \Rightarrow \text{softmax} \Rightarrow \hat{y}$
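
A sketch of this model in Keras ($o_c \Rightarrow E \Rightarrow e_c \Rightarrow \text{softmax} \Rightarrow \hat{y}$); note that the full 10,000-way softmax is expensive at training time, which is why hierarchical softmax or negative sampling are used in practice:

  from tensorflow import keras
  from tensorflow.keras import layers

  vocab_size, embedding_dim = 10000, 300

  skip_gram = keras.Sequential([
      layers.Input(shape=(1,), dtype="int32"),         # index of the context word c
      layers.Embedding(vocab_size, embedding_dim),     # o_c => E => e_c
      layers.Flatten(),
      layers.Dense(vocab_size, activation="softmax"),  # softmax over all targets => y_hat
  ])
  skip_gram.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
  # skip_gram.fit(context_indices, target_indices, ...)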