Gradient Descent: From Intuition to Implementation

What is Gradient Descent?

Gradient descent searches for a (local) minimum of a differentiable function by repeatedly stepping along the negative gradient, which is the direction of steepest descent.

Imagine standing blindfolded on a hilly landscape. You want the lowest point. Strategy: feel the slope, step downhill, repeat.

Given a loss function $L(\theta)$, update parameters as:

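$$\theta \leftarrow \theta - \alpha \nabla L(\theta)$$

where $\alpha > 0$ is the learning rate, which sets the size of each step, and $\nabla L(\theta)$ is the gradient of the loss at the current parameters.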
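
To make the update concrete, here is a small worked example. The loss $L(\theta) = \theta^2$, the starting point $\theta_0 = 1$, and the learning rate $\alpha = 0.1$ are illustrative choices, not values from the text above; $\theta_t$ denotes the parameter after $t$ updates. Since $\nabla L(\theta) = 2\theta$:

$$\theta_1 = \theta_0 - \alpha \cdot 2\theta_0 = 1 - 0.1 \cdot 2 = 0.8$$

$$\theta_2 = \theta_1 - \alpha \cdot 2\theta_1 = 0.8 - 0.1 \cdot 1.6 = 0.64$$

For this particular loss, each update multiplies $\theta$ by $(1 - 2\alpha) = 0.8$, so the iterates shrink geometrically toward the minimum at $\theta = 0$.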

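def gradient_descent(grad_fn, theta, lr=0.01, steps=100):
    """Apply `steps` updates of theta <- theta - lr * grad_fn(theta)."""
    for _ in range(steps):
        # Rebind instead of using `-=` so an array passed in is not mutated in place.
        theta = theta - lr * grad_fn(theta)
    return theta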
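
As a rough usage sketch, the function can be tried on a loss whose minimum is known in advance. The quadratic $L(\theta) = (\theta - 3)^2$, the starting point, and the two learning rates below are assumptions chosen for illustration; the function is repeated (written without in-place mutation) so the snippet runs on its own.

```python
def gradient_descent(grad_fn, theta, lr=0.01, steps=100):
    # Same update rule as above: theta <- theta - lr * grad_fn(theta).
    for _ in range(steps):
        theta = theta - lr * grad_fn(theta)
    return theta


def grad_fn(theta):
    # Gradient of the illustrative loss L(theta) = (theta - 3)**2.
    return 2.0 * (theta - 3.0)


# A moderate learning rate: each step shrinks the distance to 3 by a factor of 0.8.
converged = gradient_descent(grad_fn, 0.0, lr=0.1, steps=100)
print(converged)  # very close to 3.0

# Too large a learning rate: each step overshoots, and the error grows by a factor of 1.2.
diverged = gradient_descent(grad_fn, 0.0, lr=1.1, steps=100)
print(abs(diverged - 3.0))  # very large
```

The second call shows why the learning rate matters: on this loss, any `lr` above 1.0 makes every step land farther from the minimum than the one before.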