CS5720 - Week 7
Slide 132 of 140

Language Modeling Basics

What is Language Modeling?

Language Modeling is the task of learning the probability distribution over sequences of words or characters in a language, enabling prediction of the next word given previous context.
Core Objective:

• Estimate P(word | context)
• Learn language patterns and structure
• Capture syntax and semantics
• Enable text generation and understanding
🧠 Think of it as:
Teaching a computer to understand and predict language by learning from millions of examples of human writing.
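The core objective, estimating P(word | context), can be sketched with a minimal bigram model. The toy corpus and maximum-likelihood counting below are illustrative assumptions, not material from the slides; a real model would train on millions of sentences.

```python
from collections import Counter

# Toy corpus (an assumption for illustration only).
corpus = "the cat sat on the mat the dog sat on the rug".split()

# Count bigrams and context unigrams to estimate P(word | previous word).
bigram_counts = Counter(zip(corpus, corpus[1:]))
context_counts = Counter(corpus[:-1])

def p(word, context):
    """Maximum-likelihood estimate of P(word | context) for a bigram model."""
    return bigram_counts[(context, word)] / context_counts[context]

print(p("cat", "the"))  # "the" is followed by cat/mat/dog/rug once each -> 0.25
print(p("sat", "cat"))  # "cat" is always followed by "sat" -> 1.0
```

The same counting idea scales to any N: condition on the previous N-1 words instead of one.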

Types of Language Models

• 📊 N-gram Models — statistical models that predict based on the previous N-1 words
• 🧠 Neural Language Models — RNNs, LSTMs, and Transformers that learn distributed representations
• 🔀 Character vs. Word Level — different granularities of language modeling
📈 Evolution:
From simple n-grams → RNNs → LSTMs → Transformers (GPT, BERT)
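The character-versus-word distinction above is a choice of tokenization granularity. A quick sketch in plain Python (no tokenizer library assumed): word-level models see short sequences over a huge vocabulary, character-level models see long sequences over a tiny vocabulary.

```python
text = "language models"

# Word-level: predict one word at a time (short sequence, large vocabulary).
word_tokens = text.split()

# Character-level: predict one character at a time (long sequence, small vocabulary).
char_tokens = list(text)

print(word_tokens)       # ['language', 'models']
print(len(char_tokens))  # 15 characters, including the space
```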

Language Model Probability Examples

P("cat" | "The quick brown fox jumps over the lazy")
Probability of the word "cat" given the preceding context. A well-trained model would assign "cat" a low probability here, since "dog" is the conventional continuation of this sentence.
High Probability (P ≈ 0.8 - 0.9)
• "I went to the store"
• "The sun is bright"
• "She opened the door"
Medium Probability (P ≈ 0.4 - 0.6)
• "I went to the museum"
• "The sun is yellow"
• "She opened the window"
Low Probability (P ≈ 0.0 - 0.1)
• "I went to the purple"
• "The sun is sleeping"
• "She opened the elephant"
Key Insight: Good language models assign high probabilities to natural, grammatically correct continuations.
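A whole sentence's probability follows from the per-word probabilities via the chain rule, P(w1…wn) = ∏ P(wi | w1…wi-1). The numbers below are made-up illustrative values for the words of "I went to the store", not outputs of a trained model.

```python
import math

# Hypothetical conditional probabilities for the five words of
# "I went to the store" (illustrative values only).
word_probs = [0.2, 0.1, 0.4, 0.5, 0.3]

# Chain rule: multiply the conditional probabilities.
sentence_prob = math.prod(word_probs)

# In practice, log-probabilities are summed instead, to avoid
# numerical underflow on long sequences.
sentence_log_prob = sum(math.log(p) for p in word_probs)

print(sentence_prob)                # ~0.0012
print(math.exp(sentence_log_prob))  # same value, computed in log space
```

An unnatural continuation like "I went to the purple" would contribute a near-zero factor, collapsing the product — which is exactly why the low-probability examples above score so poorly.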
Prepared by Dr. Gorkem Kar