stochastic-gradient-descent #
Shows SGD as repeated parameter updates on a 2D loss surface, each driven by a noisy mini-batch gradient estimate ĝ_t. The true (full-data) gradient direction is drawn alongside the stochastic estimate. The mini-batch size cycles to demonstrate variance reduction, while the estimate stays unbiased: E[ĝ_t] = ∇f(θ_t).
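The two properties named above, unbiasedness and variance reduction with batch size, can be checked numerically. The following is a minimal sketch, not the page's own code: the quadratic curvatures, the Gaussian noise model, the noise level, and every function name are assumptions chosen for illustration.

```python
import random
import statistics

# Assumed toy loss: f(theta) = 0.5 * (CURV_X * x^2 + CURV_Y * y^2).
CURV_X, CURV_Y = 1.0, 4.0
NOISE_STD = 1.0  # assumed standard deviation of per-example gradient noise


def true_gradient(theta):
    """Full-data gradient of the assumed quadratic loss."""
    x, y = theta
    return (CURV_X * x, CURV_Y * y)


def minibatch_gradient(theta, batch_size, rng):
    """Average batch_size noisy samples, each = true gradient + zero-mean noise."""
    gx, gy = true_gradient(theta)
    sx = sum(gx + rng.gauss(0.0, NOISE_STD) for _ in range(batch_size))
    sy = sum(gy + rng.gauss(0.0, NOISE_STD) for _ in range(batch_size))
    return (sx / batch_size, sy / batch_size)


def estimate_stats(theta, batch_size, trials=10000, seed=0):
    """Empirical mean and variance of the x-component of the estimate."""
    rng = random.Random(seed)
    xs = [minibatch_gradient(theta, batch_size, rng)[0] for _ in range(trials)]
    return statistics.fmean(xs), statistics.pvariance(xs)


if __name__ == "__main__":
    theta = (2.0, -1.0)
    print("true x-gradient:", true_gradient(theta)[0])
    for batch_size in (1, 4, 16):
        mean_x, var_x = estimate_stats(theta, batch_size)
        print(f"batch={batch_size:2d}  mean={mean_x:.3f}  variance={var_x:.4f}")
```

Under these assumptions the empirical mean should stay near the true gradient for every batch size, while the empirical variance should shrink roughly in proportion to 1/batch_size.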
practical uses #
- 01. Training neural networks efficiently on large datasets
- 02. Online/streaming learning where data arrives continuously
- 03. Optimization when full gradients are expensive (large n), using mini-batches for a speed/variance trade-off
technical notes #
Implements a simple convex quadratic loss whose per-example gradient is the true gradient plus zero-mean noise, which makes it unbiased. ĝ_t is computed by averaging batchSize samples, so its variance scales as 1/batchSize (its standard deviation as 1/sqrt(batchSize)); θ is then updated once per stepTime. Rendering snaps positions to a 4px grid for a retro blocky look; the animation interpolates θ between discrete updates using the provided ease(t) function.
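The pipeline described in these notes (noisy mini-batch gradient, one discrete update, eased interpolation, grid snapping) might look roughly like the sketch below. It is an illustration under stated assumptions rather than the page's actual source: the curvatures, learning rate, noise level, pixel scale and origin are made up, and the ease function here is a smoothstep stand-in for the page's ease(t), whose real definition is not given.

```python
import random

CURV = (1.0, 4.0)   # assumed curvatures of the convex quadratic loss
NOISE_STD = 1.0     # assumed per-example gradient noise level
GRID_PX = 4         # grid size named in the notes


def noisy_gradient(theta, batch_size, rng):
    """Mini-batch estimate: mean of (true gradient + Gaussian noise) samples."""
    grad = []
    for curvature, coord in zip(CURV, theta):
        samples = [curvature * coord + rng.gauss(0.0, NOISE_STD)
                   for _ in range(batch_size)]
        grad.append(sum(samples) / batch_size)
    return tuple(grad)


def sgd_step(theta, batch_size, lr, rng):
    """One discrete update: theta <- theta - lr * gradient estimate."""
    g = noisy_gradient(theta, batch_size, rng)
    return tuple(p - lr * gi for p, gi in zip(theta, g))


def ease(t):
    """Hypothetical stand-in for the page's ease(t): smoothstep on [0, 1]."""
    t = min(1.0, max(0.0, t))
    return t * t * (3.0 - 2.0 * t)


def snap(px):
    """Snap a pixel coordinate to the nearest multiple of GRID_PX."""
    return GRID_PX * round(px / GRID_PX)


def displayed_position(prev_theta, next_theta, frac,
                       scale=40.0, origin=(160.0, 120.0)):
    """Interpolate between two updates, map to pixels, then snap to the grid.

    frac is the fraction of the step interval that has elapsed; scale and
    origin are assumed canvas-mapping values.
    """
    w = ease(frac)
    return tuple(
        snap(o + scale * (a + w * (b - a)))
        for a, b, o in zip(prev_theta, next_theta, origin)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    theta = (2.0, -1.0)
    for step in range(5):
        nxt = sgd_step(theta, batch_size=8, lr=0.1, rng=rng)
        # Halfway through the step interval, the marker sits between updates.
        print(step, displayed_position(theta, nxt, frac=0.5))
        theta = nxt
```

Keeping the optimizer state (theta) separate from the displayed position means the snapping and easing only affect what is drawn, not the trajectory itself.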