CS5720 - Week 1
Slide 16 of 20

Neural Network Terminology

🧠
Neuron
Basic Component
A computational unit that receives inputs, applies weights and bias, then passes the result through an activation function.
output = f(Σ(wᵢ × xᵢ) + b)
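The formula above can be sketched in a few lines of Python. The inputs, weights, bias, and the choice of ReLU as f are illustrative placeholders, not values from the course:

```python
# A single neuron: weighted sum of inputs plus bias, passed through an activation.
def relu(z):
    return max(0.0, z)

def neuron(x, w, b):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b  # Σ(wᵢ·xᵢ) + b
    return relu(z)                                # f(z)

# 0.5·1.0 + (−0.25)·2.0 + 0.1 = 0.1; ReLU leaves positives unchanged
y = neuron(x=[1.0, 2.0], w=[0.5, -0.25], b=0.1)
```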
📚
Layer
Architecture
A collection of neurons that process information together. Networks typically have input, hidden, and output layers.
Layer₁ → Layer₂ → ... → Output
⚖️
Weight
Parameters
Learnable parameters that determine the strength of connections between neurons. Adjusted during training.
w ∈ ℝ (real number)
🎯
Bias
Parameters
An offset value added to the weighted sum, allowing the activation function to shift left or right.
b ∈ ℝ (additive constant)
⚡
Activation Function
Functions
Non-linear function applied to neuron output, enabling networks to learn complex patterns.
f: ℝ → ℝ (e.g., ReLU, Sigmoid)
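Two of the most common activation functions, sketched as plain Python (a minimal illustration, not a complete catalogue):

```python
import math

# ReLU zeroes out negative inputs; sigmoid squashes any real number into (0, 1).
def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))
```

Both map ℝ → ℝ, but their non-linearity is what lets stacked layers represent more than a single linear transformation.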
➡️
Forward Propagation
Process
The process of passing input data through the network layers to produce an output prediction.
Input → Hidden → Output
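A forward pass through a tiny 2-input → 2-hidden → 1-output network can be written out directly. All weights below are made up for illustration:

```python
def relu(z):
    return max(0.0, z)

def forward(x, W1, b1, W2, b2):
    # Input → Hidden: each hidden neuron takes a weighted sum of x plus its bias.
    hidden = [relu(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    # Hidden → Output: a single linear output neuron.
    return sum(w * h for w, h in zip(W2, hidden)) + b2

y = forward(x=[1.0, 2.0],
            W1=[[0.5, -0.5], [0.25, 0.25]], b1=[0.0, 0.0],
            W2=[1.0, -1.0], b2=0.5)
```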
⬅️
Backpropagation
Training
Algorithm for training neural networks by propagating errors backward to update weights and biases.
∂L/∂w = ∂L/∂y × ∂y/∂w
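The chain rule above can be checked numerically for the simplest case, a single weight with squared loss L = (w·x − t)². The values of w, x, t are arbitrary examples:

```python
# For L = (w·x − t)²:  ∂L/∂w = ∂L/∂y · ∂y/∂w = 2(y − t) · x, where y = w·x.
def loss(w, x, t):
    return (w * x - t) ** 2

w, x, t = 0.5, 2.0, 3.0
y = w * x
analytic = 2.0 * (y - t) * x   # chain rule: ∂L/∂y = 2(y − t), ∂y/∂w = x

# Central finite difference as an independent check of the analytic gradient.
eps = 1e-6
numeric = (loss(w + eps, x, t) - loss(w - eps, x, t)) / (2 * eps)
```

Backpropagation applies exactly this chain-rule decomposition layer by layer, reusing intermediate derivatives instead of recomputing them.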
📉
Loss Function
Training
Measures how far the network's predictions are from the true values. Guides the learning process.
L(ŷ, y) = error measure
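One concrete error measure is mean squared error, a standard loss for regression (shown here as a sketch; other tasks use other losses, e.g. cross-entropy for classification):

```python
# Mean squared error: L = (1/n) · Σ (ŷᵢ − yᵢ)²
def mse(predictions, targets):
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)
```

A perfect prediction gives zero loss; larger errors are penalized quadratically.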
📈
Gradient Descent
Optimization
Optimization algorithm that updates parameters in the direction that minimizes the loss function.
w ← w − α·∇wL
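The update rule in action on a toy one-dimensional loss L(w) = (w − 3)², whose gradient is 2(w − 3) and whose minimum sits at w = 3 (starting point and learning rate are arbitrary):

```python
# Repeatedly step opposite the gradient: w ← w − α·∇L(w).
w, alpha = 0.0, 0.1
for _ in range(100):
    grad = 2.0 * (w - 3.0)
    w -= alpha * grad
# w converges toward the minimum at 3.0
```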
🔄
Epoch
Training
One complete pass through the entire training dataset during the learning process.
1 epoch = full dataset pass
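In practice each epoch is itself split into batches. A skeleton of the bookkeeping (dataset, batch size, and epoch count are placeholders; the actual parameter update is elided):

```python
dataset = list(range(10))   # 10 training examples (stand-in data)
batch_size = 4
n_epochs = 3

steps = 0
for epoch in range(n_epochs):                       # one epoch = full pass
    for start in range(0, len(dataset), batch_size):
        batch = dataset[start:start + batch_size]   # one iteration's batch
        steps += 1                                  # parameter update would go here
```

Here each epoch takes ceil(10 / 4) = 3 iterations, so 3 epochs perform 9 update steps.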
⚠️
Overfitting
Problem
When a model learns training data too well, failing to generalize to new, unseen data.
High train accuracy, Low test accuracy
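An extreme caricature of overfitting: a "model" that memorizes its training labels scores perfectly on training data but cannot handle unseen inputs. The data below is made up:

```python
train = {0: "a", 1: "b", 2: "a", 3: "b"}
test  = {4: "a", 5: "b"}

def memorizer(x):
    return train.get(x, "a")   # blind fallback on anything it has not seen

train_acc = sum(memorizer(x) == y for x, y in train.items()) / len(train)
test_acc  = sum(memorizer(x) == y for x, y in test.items()) / len(test)
# High train accuracy, low test accuracy — the signature of overfitting.
```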
🎛️
Hyperparameter
Configuration
Configuration settings that control the learning process, set before training begins.
α, batch_size, n_layers, etc.
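Unlike weights and biases, hyperparameters are chosen by the practitioner before training starts. A typical configuration might be collected like this (all values are placeholders):

```python
# Fixed before training; contrast with parameters (w, b), which are learned.
hyperparams = {
    "learning_rate": 0.01,   # α in the gradient-descent update
    "batch_size": 32,
    "n_layers": 3,
    "n_epochs": 10,
}
```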

Browse by Category

Architecture

Multi-Layer Perceptron (MLP)
Feedforward network with multiple hidden layers
Dense Layer
Fully connected layer where each neuron connects to all neurons in the previous layer
Input Layer
First layer that receives raw data features
Hidden Layer
Intermediate layers that extract and transform features
Output Layer
Final layer that produces predictions or classifications
Network Depth
Number of layers in the neural network

Training

Batch
Subset of training data processed together in one iteration
Learning Rate
Step size for parameter updates during optimization
Validation Set
Data used to tune hyperparameters and monitor training progress
Test Set
Held-out data for final model evaluation
Early Stopping
Technique to prevent overfitting by stopping training when validation performance degrades
Convergence
When the training process reaches a stable state with minimal loss changes

Optimization

SGD
Stochastic Gradient Descent - basic optimization algorithm
Adam
Adaptive optimization algorithm with momentum
Momentum
Technique to accelerate gradient descent by accumulating past gradients
Regularization
Techniques to prevent overfitting by constraining model complexity
Dropout
Regularization technique that randomly turns off neurons during training
Batch Normalization
Technique to normalize inputs to each layer for stable training

Evaluation

Accuracy
Percentage of correct predictions
Precision
True positives divided by predicted positives
Recall
True positives divided by actual positives
F1 Score
Harmonic mean of precision and recall
Confusion Matrix
Table showing correct vs predicted classifications
Cross Validation
Technique to assess model generalization using multiple train/test splits

Advanced

Transfer Learning
Using pre-trained models as starting points for new tasks
Ensemble Methods
Combining multiple models to improve performance
Attention Mechanism
Technique for models to focus on relevant parts of input
Autoencoder
Neural network that learns compressed representations
Embedding
Dense vector representations of discrete objects
Fine-tuning
Adapting pre-trained models to specific tasks
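The evaluation metrics above follow directly from confusion-matrix counts. The counts below are made-up examples:

```python
# tp = true positives, fp = false positives, fn = false negatives, tn = true negatives
tp, fp, fn, tn = 8, 2, 4, 6

precision = tp / (tp + fp)                        # of predicted positives, how many were right
recall    = tp / (tp + fn)                        # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
accuracy  = (tp + tn) / (tp + fp + fn + tn)       # fraction of all predictions correct
```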
Prepared by Dr. Gorkem Kar