Word Embeddings are dense vector representations of words that capture semantic relationships and meaning in a continuous vector space.
Key Properties:
• Dense - Typically 50-300 dimensions
• Learned - From large text corpora
• Semantic - Similar words have similar vectors
• Compositional - Can perform vector arithmetic
🧮 Famous Example
King - Man + Woman ≈ Queen
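This analogy can be reproduced with pretrained vectors. Below is a minimal sketch, assuming gensim is installed and the 50-dimensional GloVe vectors ("glove-wiki-gigaword-50") can be downloaded at runtime; gensim's most_similar performs exactly this kind of vector arithmetic.

```python
# A minimal sketch of the king - man + woman analogy, assuming gensim
# is installed and the pretrained GloVe vectors are available for
# download (a one-time fetch).
import gensim.downloader as api

# 50-dimensional GloVe vectors trained on Wikipedia + Gigaword.
vectors = api.load("glove-wiki-gigaword-50")

# most_similar adds the "positive" vectors, subtracts the "negative"
# ones, and returns the nearest remaining words by cosine similarity.
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
# -> [('queen', ...)] with a high similarity score
```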
Problems with Traditional Approaches
One-Hot Encoding
Sparse and high-dimensional, with no notion of semantic similarity between words
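A tiny sketch of the problem, using a made-up four-word vocabulary: every one-hot vector is as long as the vocabulary, and any two distinct words are orthogonal, so their measured similarity is always zero.

```python
# One-hot encoding: one vocabulary-sized vector per word, a single 1,
# and zero similarity between any pair of distinct words.
import numpy as np

vocab = ["cat", "dog", "car", "truck"]
one_hot = {w: np.eye(len(vocab))[i] for i, w in enumerate(vocab)}

print(one_hot["cat"])                   # [1. 0. 0. 0.]
print(one_hot["cat"] @ one_hot["dog"])  # 0.0 -- "cat" and "dog" look unrelated
```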
Bag of Words
Ignores word order and context, and treats every word as equally informative
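A quick demonstration using scikit-learn's CountVectorizer (an assumed dependency): two sentences with opposite meanings produce identical bag-of-words vectors because only counts survive.

```python
# Bag-of-words discards word order: "dog chased cat" and "cat chased dog"
# map to the same count vector.
from sklearn.feature_extraction.text import CountVectorizer

docs = ["the dog chased the cat", "the cat chased the dog"]
bow = CountVectorizer().fit_transform(docs).toarray()

print(bow[0])                    # counts per vocabulary word, e.g. [1 1 1 2]
print((bow[0] == bow[1]).all())  # True -- opposite meanings, identical vectors
```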
TF-IDF
Still sparse; weighting improves term relevance but captures little semantics and does not generalize to related or unseen words
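The sketch below, again assuming scikit-learn, shows that TF-IDF vectors stay vocabulary-sized and mostly zero, and that related words like "cats" and "dogs" still occupy disjoint dimensions.

```python
# TF-IDF re-weights counts, but the representation is still sparse and
# vocabulary-sized, with no shared dimensions between related words.
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["cats purr", "dogs bark", "cars have engines"]
tfidf = TfidfVectorizer().fit_transform(docs)

print(tfidf.shape)  # (3, 7): one column per vocabulary word
print(tfidf.nnz)    # 7 nonzeros out of 21 cells -- mostly zeros
# "cats" and "dogs" live in different columns, so their similarity is 0.
```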
Computational Problems
Vector size grows with the vocabulary, so memory and compute costs explode on large corpora
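A back-of-envelope calculation makes the gap concrete (the one-million-word vocabulary is an illustrative assumption):

```python
# Memory per word vector, stored as float32 (4 bytes per dimension).
vocab_size = 1_000_000
one_hot_bytes = vocab_size * 4  # one slot per vocabulary word: 4 MB
dense_bytes = 300 * 4           # a 300-dimensional embedding: 1.2 KB

print(f"one-hot:   {one_hot_bytes / 1e6:.1f} MB per word")
print(f"embedding: {dense_bytes / 1e3:.1f} KB per word")
print(f"ratio:     {one_hot_bytes // dense_bytes}x smaller")  # ~3333x
```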
From Words to Vectors
[Figure: "cat", "dog", "car", and "truck" mapped from discrete tokens (left) to points in embedding space (right), where cat/dog and car/truck form separate clusters]
Notice: Similar words (cat/dog, car/truck) are closer in the vector space!
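The toy snippet below mimics that picture with made-up 2-D vectors: cosine similarity is high within the cat/dog and car/truck clusters and much lower across them.

```python
# Toy 2-D embeddings (made-up values) showing that related words sit
# closer together under cosine similarity.
import numpy as np

emb = {
    "cat":   np.array([0.90, 0.80]),
    "dog":   np.array([0.85, 0.75]),
    "car":   np.array([-0.70, 0.60]),
    "truck": np.array([-0.75, 0.65]),
}

def cosine(a, b):
    """Cosine similarity: dot product of the vectors over their norms."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(emb["cat"], emb["dog"]))  # ~1.0 -- same cluster
print(cosine(emb["cat"], emb["car"]))  # near 0 -- different cluster
```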