CS5720 - Week 7

Introduction to Word Embeddings

What are Word Embeddings?

Word Embeddings are dense vector representations of words that capture semantic relationships and meaning in a continuous vector space.
Key Properties:

  • Dense - typically 50-300 dimensions
  • Learned - from large text corpora
  • Semantic - similar words have similar vectors
  • Compositional - support vector arithmetic (see the sketch after the example below)
🧮 Famous Example
King - Man + Woman ≈ Queen
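
A minimal sketch of this vector arithmetic, using made-up 4-dimensional vectors purely for illustration (real embeddings are learned from large corpora and have far more dimensions):

```python
import numpy as np

# Toy 4-dimensional embeddings (made-up values for illustration only;
# real embeddings are learned from large text corpora)
emb = {
    "king":  np.array([0.8, 0.9, 0.1, 0.2]),
    "queen": np.array([0.8, 0.1, 0.1, 0.9]),
    "man":   np.array([0.2, 0.9, 0.0, 0.1]),
    "woman": np.array([0.2, 0.1, 0.0, 0.8]),
}

def cosine(a, b):
    # Cosine similarity: 1.0 means the vectors point the same way
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Vector arithmetic: king - man + woman
target = emb["king"] - emb["man"] + emb["woman"]

# Find the nearest word to the result, excluding the query words
candidates = {w: v for w, v in emb.items() if w not in ("king", "man", "woman")}
best = max(candidates, key=lambda w: cosine(target, candidates[w]))
print(best)  # queen
```

With real pretrained embeddings (e.g. word2vec or GloVe) the same nearest-neighbor search over the full vocabulary recovers "queen" approximately rather than exactly.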

Problems with Traditional Approaches

  • One-Hot Encoding
    Sparse, high-dimensional, no semantic relationships (see the sketch after this list)
  • Bag of Words
    Ignores word order, no context, treats all words equally
  • TF-IDF
    Still sparse, limited semantic capture, no generalization
  • Computational Problems
    Vocabulary size explosion, memory inefficiency
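
To see why one-hot encoding captures no semantics: every pair of distinct one-hot vectors is orthogonal, so their similarity is zero no matter how related the words are. A small sketch over a hypothetical 4-word vocabulary:

```python
import numpy as np

# One-hot vectors over a tiny 4-word vocabulary (illustrative only;
# a real vocabulary has tens of thousands of words, hence the
# dimensionality and memory problems listed above)
vocab = ["cat", "dog", "car", "truck"]
one_hot = {w: np.eye(len(vocab))[i] for i, w in enumerate(vocab)}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Distinct one-hot vectors are orthogonal: similarity is 0
# regardless of meaning
print(cosine(one_hot["cat"], one_hot["dog"]))    # 0.0
print(cosine(one_hot["cat"], one_hot["truck"]))  # 0.0
```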

From Words to Vectors

[Figure: the words Cat, Dog, Car, Truck plotted as points in a 2-D vector space]
Notice: Similar words (cat/dog, car/truck) are closer in the vector space!
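
The same observation in code, with made-up 2-D coordinates standing in for the points in the figure:

```python
import numpy as np

# Made-up 2-D coordinates for the four words in the figure
# (illustration only; learned embeddings have 50-300 dimensions)
points = {
    "cat":   np.array([0.9, 0.8]),
    "dog":   np.array([0.8, 0.9]),
    "car":   np.array([-0.9, 0.7]),
    "truck": np.array([-0.8, 0.8]),
}

# Euclidean distance between each unordered pair of words
for a in points:
    for b in points:
        if a < b:
            d = np.linalg.norm(points[a] - points[b])
            print(f"{a:5s} <-> {b:5s}: {d:.2f}")

# cat <-> dog and car <-> truck come out roughly ten times closer
# than any animal-vehicle pair
```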
Prepared by Dr. Gorkem Kar