CS5720 - Week 7
Slide 122 of 140

LSTM Architecture: The Cell State

LSTM Innovation

The Key Breakthrough:

LSTMs introduce a dual-state architecture that separates:

• Cell State (C_t): Long-term memory storage
• Hidden State (h_t): Short-term working memory
Dual-State Design
Two separate information streams enable both short-term processing and long-term memory preservation.
Information Highway
The cell state acts as a highway where information can travel across time with minimal interference.
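
The highway idea can be illustrated with the standard LSTM cell-state update, C_t = f_t * C_{t-1} + i_t * g_t (element-wise). The sketch below is only an illustration: the gate values are set by hand, whereas a trained LSTM would compute them from data, and the helper name `cell_state_update` is hypothetical.

```python
import numpy as np

def cell_state_update(c_prev, f_t, i_t, g_t):
    """Standard LSTM cell-state update: C_t = f_t * C_{t-1} + i_t * g_t (element-wise)."""
    return f_t * c_prev + i_t * g_t

# Hand-picked values for illustration only (assumptions, not learned gates).
c_start = np.array([1.0, -2.0])
f_t = np.array([1.0, 1.0])   # forget gate fully open: keep everything
i_t = np.array([0.0, 0.0])   # input gate closed: write nothing new
g_t = np.array([0.5, 0.5])   # candidate values, ignored because i_t is 0

c_after = c_start.copy()
for _ in range(50):          # carry the state across 50 time steps
    c_after = cell_state_update(c_after, f_t, i_t, g_t)

print(c_after)               # unchanged from c_start
```

With the forget gate at 1 and the input gate at 0, the stored values pass through all 50 steps untouched. In practice the gates sit strictly between 0 and 1, so the state is preserved approximately rather than exactly.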

Core Components

Cell State (C_t)
The main memory component that carries information across time steps with minimal modification.
Hidden State (h_t)
The output at each time step: a filtered view of the cell state, h_t = o_t * tanh(C_t), where the output gate o_t is computed from the current input and the previous hidden state.
Three Smart Gates
Forget, Input, and Output gates control information flow with learned parameters.
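
To show how the three gates and the two states fit together, here is a minimal NumPy sketch of one LSTM time step using the standard formulation. The function name `lstm_step`, the stacked weight layout (forget, input, candidate, output), and the random weights are assumptions made for illustration; they do not come from the slides or from any particular library.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step (standard formulation).

    W has shape (4*H, D+H) and b has shape (4*H,). The rows are stacked in the
    assumed order: forget gate, input gate, candidate values, output gate.
    """
    H = h_prev.shape[0]
    z = W @ np.concatenate([x_t, h_prev]) + b   # every gate sees x_t and h_{t-1}
    f_t = sigmoid(z[0:H])                       # forget gate: what to keep from C_{t-1}
    i_t = sigmoid(z[H:2 * H])                   # input gate: how much new content to write
    g_t = np.tanh(z[2 * H:3 * H])               # candidate values to write
    o_t = sigmoid(z[3 * H:4 * H])               # output gate: what to expose as h_t
    c_t = f_t * c_prev + i_t * g_t              # cell state: long-term memory
    h_t = o_t * np.tanh(c_t)                    # hidden state: filtered view of C_t
    return h_t, c_t, (f_t, i_t, o_t)

# Usage with random (untrained) weights, for illustration only.
rng = np.random.default_rng(0)
D, H = 3, 4                                     # input size and hidden size (arbitrary)
W = rng.normal(scale=0.1, size=(4 * H, D + H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x_t in rng.normal(size=(5, D)):             # a sequence of 5 input vectors
    h, c, gates = lstm_step(x_t, h, c, W, b)
```

Because each gate is a sigmoid of learned weights applied to x_t and h_{t-1}, its values lie strictly between 0 and 1 and change from step to step with the context.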

Interactive LSTM Cell Architecture

[Diagram: input x_t → LSTM Cell (gates f_t, i_t, o_t; Cell State: C_t) → output h_t]
Selective Memory
Gates learn to selectively remember, forget, and output information based on context and importance.
Gradient Highway
The cell state provides a near-direct path for gradients across time steps, which mitigates (but does not fully eliminate) the vanishing gradient problem.
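
A rough numerical illustration of why this path helps: along the direct cell-state route, the gradient of C_T with respect to C_0 is the product of the forget-gate values (treating the gates as constants), whereas in a scalar vanilla RNN each step multiplies the gradient by w * tanh'(a). The per-step values below are assumptions chosen only to show the contrast; they are not measurements from a trained model.

```python
import numpy as np

T = 100  # number of time steps the gradient must travel back through

# Assumed per-step factors, for illustration only.
forget_gates = np.full(T, 0.99)  # forget gate close to 1 at every step
rnn_factors = np.full(T, 0.5)    # assumed |w * tanh'(a)| per step in a scalar vanilla RNN

# Gradient magnitude after T steps is the product of the per-step factors.
lstm_path_gradient = np.prod(forget_gates)  # d C_T / d C_0 along the direct cell path
rnn_path_gradient = np.prod(rnn_factors)    # d h_T / d h_0 for the scalar RNN

print(f"LSTM cell path: {lstm_path_gradient:.3e}")
print(f"Vanilla RNN:    {rnn_path_gradient:.3e}")
```

With these assumed values the LSTM path retains a usable gradient (roughly 0.37) after 100 steps, while the vanilla RNN gradient shrinks to a vanishingly small number. If the network learns forget gates well below 1, the LSTM gradient decays too, so the benefit depends on what the gates learn.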
Adaptive Processing
Each gate adapts its behavior based on current input and previous state, enabling context-aware processing.
Prepared by Dr. Gorkem Kar