CS5720 - Week 7

LSTM Gates Deep Dive

🚫
Forget Gate
"What should I forget?"
Controls what information to remove from the cell state. Outputs a value between 0 (forget completely) and 1 (keep completely) for each element of the cell state.
📥
Input Gate
"What should I learn?"
Decides what new information to store in the cell state. Works with candidate values to selectively add new information.
📤
Output Gate
"What should I share?"
Controls what parts of the cell state to output. Determines what information becomes the hidden state.

Gate Operation Sequence

1. Input Processing: Combine current input x_t with previous hidden state h_{t-1}
2. Gate Computation: Calculate forget, input, and output gate values using sigmoid
3. Candidate Values: Generate potential new information using tanh activation
4. Cell State Update: Apply forget and input gates to update cell state
5. Hidden State: Apply output gate to generate final hidden state
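
The five steps can be sketched as a single forward step in NumPy. This is a minimal illustration, not a reference implementation: the function names (lstm_step, init_params), the params dictionary layout, the layer sizes, and the small random initialization are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    # Logistic function: maps any real number into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    # Step 1: combine previous hidden state and current input
    z = np.concatenate([h_prev, x_t])
    # Step 2: forget, input, and output gates (sigmoid)
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])
    # Step 3: candidate values (tanh)
    c_tilde = np.tanh(params["W_C"] @ z + params["b_C"])
    # Step 4: cell state update (elementwise products)
    c_t = f_t * c_prev + i_t * c_tilde
    # Step 5: hidden state
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

def init_params(input_size, hidden_size, seed=0):
    # Hypothetical initialization: small Gaussian weights, zero biases
    rng = np.random.default_rng(seed)
    params = {}
    for name in ("f", "i", "o", "C"):
        params[f"W_{name}"] = rng.normal(
            0.0, 0.1, size=(hidden_size, hidden_size + input_size)
        )
        params[f"b_{name}"] = np.zeros(hidden_size)
    return params

# Run five time steps over a toy sequence of all-ones inputs
params = init_params(input_size=3, hidden_size=4)
h, c = np.zeros(4), np.zeros(4)
for x_t in np.ones((5, 3)):
    h, c = lstm_step(x_t, h, c, params)
print(h.shape, c.shape)
```

Each weight matrix has shape (hidden_size, hidden_size + input_size) because it multiplies the concatenated vector [h_{t-1}, x_t].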
LSTM Mathematical Formulation
Forget Gate
f_t = σ(W_f · [h_{t-1}, x_t] + b_f)
Sigmoid activation ensures values between 0 and 1
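
The bracket [h_{t-1}, x_t] denotes concatenation, so W_f can be read as two blocks placed side by side: one acting on h_{t-1} and one acting on x_t. The short check below illustrates that the two forms give the same forget gate. The sizes, random values, and the names W_fh and W_fx are assumptions chosen only for the demonstration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
hidden_size, input_size = 4, 3

W_f = rng.normal(size=(hidden_size, hidden_size + input_size))
b_f = rng.normal(size=hidden_size)
h_prev = rng.normal(size=hidden_size)
x_t = rng.normal(size=input_size)

# Concatenated form, as written on the slide
f_concat = sigmoid(W_f @ np.concatenate([h_prev, x_t]) + b_f)

# Equivalent split form: separate blocks for h_{t-1} and x_t
W_fh = W_f[:, :hidden_size]   # columns that multiply h_{t-1}
W_fx = W_f[:, hidden_size:]   # columns that multiply x_t
f_split = sigmoid(W_fh @ h_prev + W_fx @ x_t + b_f)

print(np.allclose(f_concat, f_split))
```

The same reading applies to W_i, W_C, and W_o.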
Input Gate
i_t = σ(W_i · [h_{t-1}, x_t] + b_i)
Controls what new information to store
Candidate Values
C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C)
Tanh creates values between -1 and 1
Cell State Update
C_t = f_t * C_{t-1} + i_t * C̃_t
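Here * is elementwise multiplication. Because the update is additive, the gradient reaches C_{t-1} scaled only by f_t, which helps mitigate vanishing gradients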
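
A small worked example may make the cell state update concrete. The numbers are made up for illustration, and the gate values 0 and 1 are limiting cases that a sigmoid only approaches.

```python
import numpy as np

# Illustrative values for a 3-element cell state
f_t     = np.array([1.0, 0.0, 0.5])    # forget gate: keep, erase, keep half
i_t     = np.array([0.0, 1.0, 0.5])    # input gate: ignore, write, write half
c_prev  = np.array([2.0, -3.0, 4.0])   # previous cell state C_{t-1}
c_tilde = np.array([0.5, -0.5, 1.0])   # candidate values

# C_t = f_t * C_{t-1} + i_t * C~_t, computed element by element
c_t = f_t * c_prev + i_t * c_tilde
print(c_t)  # values: 2.0, -0.5, 2.5
```

The first element is carried over unchanged, the second is overwritten by the candidate, and the third is an equal blend of old and new.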
Output Gate
o_t = σ(W_o · [h_{t-1}, x_t] + b_o)
Determines what to output from cell state
Hidden State
h_t = o_t * tanh(C_t)
The output gate filters the tanh-squashed cell state, so each element of h_t lies between -1 and 1
Prepared by Dr. Gorkem Kar