A More Distinctive Word Vector Model (I): simpler glove

By Su Jianlin | November 19, 2017

If you ask me which is the most convenient and easiest-to-use word vector model, I think it should be word2vec. But if you ask me which is the most beautiful word vector model, I don't know; I feel that every model has its own deficiencies. Setting aside experimental results (which are often just a matter of evaluation metrics), purely from a theoretical perspective, no model can yet be called truly "beautiful."

This article discusses several questions about word vectors that many people care about. The conclusions involved were mostly discovered through experiment but still lack convincing explanations, including the following (the operations these questions refer to are sketched in code after the list):

How should one construct a word vector model?

Why use cosine similarity for synonym search? What does the inner product of vectors represent?

Does the norm (length) of a word vector have any special meaning?

Why do word vectors possess word analogy properties? (King - Man + Woman = Queen)

After obtaining word vectors, how should sentence vectors be constructed? What is the basis for using the sum of word vectors as a simple sentence vector?
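
To make the questions above concrete, here is a minimal sketch of the operations they refer to: cosine-similarity synonym search, the vector-offset analogy test, and a summed sentence vector. The vocabulary, the random vectors, and the helper names (`cosine`, `most_similar`) are hypothetical stand-ins, not part of the model discussed in this article; in practice the vectors would come from a trained model such as GloVe or word2vec.

```python
import numpy as np

# Hypothetical lookup table: word -> pre-trained vector.
# Random vectors are used here only so the snippet runs on its own.
rng = np.random.default_rng(0)
vocab = ["king", "man", "woman", "queen", "apple"]
word_vecs = {w: rng.normal(size=100) for w in vocab}

def cosine(u, v):
    # Cosine similarity = inner product of the two normalized vectors.
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# 1. Synonym search: rank the vocabulary by cosine similarity to a query word.
def most_similar(word, topn=3):
    q = word_vecs[word]
    scores = {w: cosine(q, v) for w, v in word_vecs.items() if w != word}
    return sorted(scores.items(), key=lambda kv: -kv[1])[:topn]

# 2. Word analogy via vector offsets: with well-trained vectors,
#    king - man + woman lands near queen (not with these random ones).
analogy = word_vecs["king"] - word_vecs["man"] + word_vecs["woman"]
print(cosine(analogy, word_vecs["queen"]))

# 3. A simple sentence vector: the sum of the word vectors
#    (words outside the vocabulary are skipped).
sentence = ["the", "king", "and", "the", "queen"]
sent_vec = sum(word_vecs[w] for w in sentence if w in word_vecs)
```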

These discussions are specific to the model proposed here, yet also general: some of the explanations may carry over directly to interpreting the properties of word vectors from the GloVe and Skip-gram models. Readers are encouraged to try this for themselves.

Centered on the discussion of these questions, this article proposes a new GloVe-like word vector model, referred to here as simpler glove. An implementation, based on modifications to Stanford's GloVe source code, is provided; the code is available on GitHub.

Why improve GloVe? To be sure, the ideas behind GloVe are very inspiring. However, although it claims to rival or even surpass word2vec, it is fundamentally a somewhat flawed model (we will explain why later), so there is room for improvement.

Content Overview: