CS5720 - Week 7
Slide 132 of 140

Language Modeling Basics

What is Language Modeling?

Language Modeling is the task of learning the probability distribution over sequences of words or characters in a language, enabling prediction of the next word given previous context.
Core Objective:

• Estimate P(word | context)
• Learn language patterns and structure
• Capture syntax and semantics
• Enable text generation and understanding
🧠 Think of it as:
Teaching a computer to understand and predict language by learning from millions of examples of human writing.
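The core objective, estimating P(word | context), can be sketched with a minimal bigram model. The toy corpus and maximum-likelihood counting below are illustrative assumptions, not material from the slides; a real model would train on millions of sentences.

```python
from collections import Counter

# Toy corpus (an assumption for illustration only).
corpus = "the cat sat on the mat the dog sat on the rug".split()

# Count bigrams and context unigrams to estimate P(word | previous word).
bigram_counts = Counter(zip(corpus, corpus[1:]))
context_counts = Counter(corpus[:-1])

def p(word, context):
    """Maximum-likelihood estimate of P(word | context) for a bigram model."""
    return bigram_counts[(context, word)] / context_counts[context]

print(p("cat", "the"))  # "the" is followed by cat/mat/dog/rug once each -> 0.25
print(p("sat", "cat"))  # "cat" is always followed by "sat" -> 1.0
```

The same counting idea scales to any N: condition on the previous N-1 words instead of one.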

Types of Language Models

• 📊 N-gram Models — statistical models that predict based on the previous N-1 words
• 🧠 Neural Language Models — RNNs, LSTMs, and Transformers that learn distributed representations
• 🔀 Character vs. Word Level — different granularities of language modeling
📈 Evolution:
From simple n-grams → RNNs → LSTMs → Transformers (GPT, BERT)
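The character-versus-word distinction above is a choice of tokenization granularity. A quick sketch in plain Python (no tokenizer library assumed): word-level models see short sequences over a huge vocabulary, character-level models see long sequences over a tiny vocabulary.

```python
text = "language models"

# Word-level: predict one word at a time (short sequence, large vocabulary).
word_tokens = text.split()

# Character-level: predict one character at a time (long sequence, small vocabulary).
char_tokens = list(text)

print(word_tokens)       # ['language', 'models']
print(len(char_tokens))  # 15 characters, including the space
```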

Language Model Probability Examples

P("cat" | "The quick brown fox jumps over the lazy")
Probability of the word "cat" given the preceding context. A well-trained model would assign "cat" a low probability here, since "dog" is the conventional continuation of this sentence.
High Probability (P ≈ 0.8 - 0.9)
• "I went to the store"
• "The sun is bright"
• "She opened the door"
Medium Probability (P ≈ 0.4 - 0.6)
• "I went to the museum"
• "The sun is yellow"
• "She opened the window"
Low Probability (P ≈ 0.0 - 0.1)
• "I went to the purple"
• "The sun is sleeping"
• "She opened the elephant"
Key Insight: Good language models assign high probabilities to natural, grammatically correct continuations.
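A whole sentence's probability follows from the per-word probabilities via the chain rule, P(w1…wn) = ∏ P(wi | w1…wi-1). The numbers below are made-up illustrative values for the words of "I went to the store", not outputs of a trained model.

```python
import math

# Hypothetical conditional probabilities for the five words of
# "I went to the store" (illustrative values only).
word_probs = [0.2, 0.1, 0.4, 0.5, 0.3]

# Chain rule: multiply the conditional probabilities.
sentence_prob = math.prod(word_probs)

# In practice, log-probabilities are summed instead, to avoid
# numerical underflow on long sequences.
sentence_log_prob = sum(math.log(p) for p in word_probs)

print(sentence_prob)                # ~0.0012
print(math.exp(sentence_log_prob))  # same value, computed in log space
```

An unnatural continuation like "I went to the purple" would contribute a near-zero factor, collapsing the product — which is exactly why the low-probability examples above score so poorly.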
Prepared by Dr. Gorkem Kar