Regularization

Techniques to prevent overfitting by controlling model complexity

Overview

Regularization is a set of techniques used in machine learning to prevent overfitting by introducing constraints or penalties during training. It helps models achieve better generalization by discouraging them from learning spurious or overly complex patterns in the data.

Core Concepts

  • Penalties for Complexity
    Encouraging simpler models by penalizing large weights (see the sketch after this list).
  • Controlling Weight Magnitudes
    Reduces the chance of memorizing outliers.
  • Bias-Variance Tradeoff
    Accepting a little more bias in exchange for lower variance, i.e., controlling how closely the model fits the training data.
  • Preventing Memorization
    Ensures that patterns, not noise, drive the model's predictions.
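
In practice, these ideas usually enter the training objective as an extra penalty term added to the data loss. A minimal sketch, assuming a linear model and NumPy (neither is prescribed by this article):

    import numpy as np

    def regularized_loss(w, X, y, lam=0.1):
        """Mean squared error plus an L2 penalty on the weights."""
        data_loss = np.mean((X @ w - y) ** 2)   # how well the model fits the training data
        penalty = lam * np.sum(w ** 2)           # grows with the magnitude of the weights
        return data_loss + penalty

The coefficient lam (the regularization strength) decides how heavily large weights are punished: lam = 0 recovers the unregularized loss, while a very large lam forces the weights toward zero and risks underfitting.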

Common Techniques

  • L1 (Lasso)
    Penalizes the sum of the absolute values of the weights, often encouraging sparse solutions (see the sketch after this list).
  • L2 (Ridge)
    Penalizes the sum of squared weights, promoting smaller but typically non-zero weights.
  • Elastic Net
    A blend of L1 and L2 to balance sparsity and stability.
  • Dropout
    Randomly “dropping” units (along with their connections) during training to reduce co-adaptation.
  • Early Stopping
    Stopping training when validation performance stops improving.
  • Weight Decay
    Shrinking weights by a small factor at each optimization step; equivalent to an L2 penalty under plain SGD.
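
The penalty-based techniques above differ only in the form of the penalty, while dropout and weight decay act directly on activations and weights during optimization. The following NumPy sketch illustrates each of them on a generic weight vector w (the function names are illustrative, not taken from any particular library):

    import numpy as np

    def l1_penalty(w):
        # Sum of absolute values; tends to drive some weights exactly to zero.
        return np.sum(np.abs(w))

    def l2_penalty(w):
        # Sum of squares; shrinks all weights toward zero without zeroing them.
        return np.sum(w ** 2)

    def elastic_net_penalty(w, alpha=0.5):
        # Convex blend of the two; alpha controls the mix of sparsity and stability.
        return alpha * l1_penalty(w) + (1 - alpha) * l2_penalty(w)

    def dropout(activations, rate=0.5, rng=None):
        # Randomly zero a fraction of units during training (inverted dropout:
        # surviving activations are rescaled so expected values stay the same).
        rng = np.random.default_rng() if rng is None else rng
        mask = rng.random(activations.shape) >= rate
        return activations * mask / (1.0 - rate)

    def weight_decay_step(w, grad, lr=0.01, decay=1e-4):
        # Gradient step that also shrinks the weights by a small factor each update.
        return w * (1.0 - lr * decay) - lr * grad

Any of the penalty functions can be added to the data loss exactly as in the earlier sketch; dropout is applied to layer activations at training time only, and the weight-decay step replaces the plain gradient update.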

Implementation

  • Applied During Model Training
    Usually integrated into the loss function.
  • Tuned via Hyperparameters
    E.g., the strength of L1 or L2 regularization.
  • Monitored with Validation
    Ensures that regularization is neither too weak nor too strong (see the early-stopping sketch after this list).
  • Balanced with Model Capacity
    Over-regularization can lead to underfitting.
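
One common way to act on validation monitoring is early stopping. A minimal sketch, assuming hypothetical train_one_epoch and validation_loss callables supplied by the surrounding training code:

    def train_with_early_stopping(model, train_one_epoch, validation_loss,
                                  max_epochs=100, patience=5):
        # Stop once the validation loss has not improved for `patience` epochs.
        best_loss = float("inf")
        epochs_without_improvement = 0
        for _ in range(max_epochs):
            train_one_epoch(model)
            current = validation_loss(model)
            if current < best_loss:
                best_loss = current
                epochs_without_improvement = 0
            else:
                epochs_without_improvement += 1
                if epochs_without_improvement >= patience:
                    break  # further training would likely just fit noise
        return model

The same validation signal can be used to tune the regularization strength itself, e.g., by trying a few values of lam and keeping the one with the lowest validation loss.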