Common Distributions

Common Distributions #

Probability & Statistics | Difficulty: ★★☆☆☆ | Depth: 5 | Unlocks: 59

Bernoulli, binomial, Poisson, uniform, normal distributions.

Core Concepts #

Key Symbols & Notation #

θ (generic parameter set): the parameters that define a specific distribution within a family (e.g., p, λ, μ, σ).

Essential Relationships #

Prerequisites (2) #

Random Variables (6 atoms), Variance (5 atoms)

Unlocks (10) #

Maximum Likelihood Estimation (lvl 3), Bayesian Inference (lvl 4), Entropy (lvl 3), Joint Distributions (lvl 3), Central Limit Theorem (lvl 3), Monte Carlo Methods (lvl 4), Conjugate Priors (lvl 4), Hypothesis Testing (lvl 3)

+2 more...

Referenced by (3) #

Where this concept shows up in the operating-finance and personal-finance graphs.

From Business (3) #

[Personal Finance (Business)](/business/personal-finance/): Options pricing (Black-Scholes) assumes log-normally distributed asset prices, that is, normally distributed log returns; options-basics is applied distribution theory.

[Long Tail (Business)](/business/long-tail/): The long tail is literally the tail of a power-law (Pareto) distribution. Understanding heavy-tailed versus normal distributions is the mathematical foundation for why a massive volume of rare items exists and why their aggregate value is significant.

[Return Distribution (Business)](/business/return-distribution/): Return distributions are specific instances of probability distributions: normal, lognormal, fat-tailed. Understanding Bernoulli, binomial, and especially normal distributions is the mathematical prerequisite for modeling investment returns.

Advanced Learning Details

Graph Position #

Depth Cost: 40

Fan-Out (ROI): 59

Bottleneck Score: 27

Chain Length: 5

Cognitive Load #

Atomic Elements: 6

Total Elements: 39

Percentile Level: L2

Atomic Level: L4

All Concepts (14) #

Teaching Strategy #

Deep-dive lesson - accessible entry point but dense material. Use worked examples and spaced repetition.

Most of probability and statistics is built on a surprisingly small “toolbox” of distributions. Learn a handful well, and you can model coin flips, counts of arrivals, measurement noise, and uncertainty around unknown quantities—with clean formulas for probabilities, means, and variances.

TL;DR:

A distribution is a parameterized family p(x | θ) describing how a random variable X behaves. Discrete distributions use PMFs and sum to get probabilities; continuous distributions use PDFs and integrate. This lesson covers Bernoulli, binomial, Poisson (discrete) and uniform, normal (continuous), including when to use each and their key formulas.

What Is a Common Distribution? #

Why you should care #

In real problems, you rarely invent a probability model from scratch. Instead, you pick a distribution family—a reusable pattern—and tune a few parameters θ to match the situation.

Examples:

Each family gives you:

Support: discrete vs continuous #

The first decision is the type of values X can take.

A common confusion: for continuous X, f(x) is not itself a probability. It’s a density. You must integrate over an interval.
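To make the density-versus-probability distinction concrete, here is a minimal Python sketch. It uses Uniform(0, 0.5) purely as an illustrative assumption, and the helper names `uniform_pdf` and `uniform_prob` are hypothetical rather than library functions. The density value is larger than 1, yet every interval probability stays between 0 and 1.

```python
def uniform_pdf(x, a, b):
    """Density of Uniform(a, b); zero outside [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_prob(c, d, a, b):
    """P(c <= X <= d) for X ~ Uniform(a, b): overlap length divided by (b - a)."""
    lo, hi = max(a, c), min(b, d)
    return max(0.0, hi - lo) / (b - a)


# A density value is not a probability: here it equals 2.0.
density_at_point = uniform_pdf(0.25, 0.0, 0.5)

# A probability comes from an interval: roughly 0.2 here.
prob_of_interval = uniform_prob(0.2, 0.3, 0.0, 0.5)

# The whole support carries total probability 1.
total_probability = uniform_prob(0.0, 0.5, 0.0, 0.5)
```

A density can exceed 1 whenever the support is narrower than 1 unit, because only the area under the curve has to equal 1.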

Parameterized families p(x | θ) #

A distribution is often written as p(x | θ) (or f(x | θ)). The parameter set θ chooses one specific member of the family.

Examples of θ:

Bernoulli: θ = p

Binomial: θ = (n, p)

Poisson: θ = λ

Uniform: θ = (a, b)

Normal: θ = (μ, σ²)

These parameters control “location” (where the mass sits) and “scale/spread” (how variable it is).

A quick comparison table #

| Distribution | Support | Type | Parameters θ | Typical meaning |
|---|---|---|---|---|
| Bernoulli(p) | {0, 1} | Discrete (PMF) | p | One trial: success/failure |
| Binomial(n, p) | {0, …, n} | Discrete (PMF) | n, p | # successes in n independent trials |
| Poisson(λ) | {0, 1, 2, …} | Discrete (PMF) | λ | # events in a fixed window |
| Uniform(a, b) | [a, b] | Continuous (PDF) | a, b | “Equally likely” in an interval |
| Normal(μ, σ²) | ℝ | Continuous (PDF) | μ, σ² | Measurement noise / sums / averages |

In the next sections, we’ll build each one slowly: story → support → formula → mean/variance → how to compute probabilities.

Core Mechanic 1: Discrete Distributions (Bernoulli, Binomial, Poisson) #

Why discrete models show up everywhere #

Many systems naturally produce counts:

Discrete distributions let you assign probability to each integer outcome and then sum the relevant probabilities.


Bernoulli(p) #

Story #

One trial. Two outcomes.

Support and PMF #

Support: X ∈ {0, 1}

PMF:

P(X = 1) = p and P(X = 0) = 1 − p, with 0 ≤ p ≤ 1.

A compact way to write the PMF is:

P(X = x) = p^x (1 − p)^(1 − x) for x ∈ {0, 1}.

Mean and variance #

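You’ll use these constantly:

E[X] = p

Var(X) = p(1 − p)

Why E[X] = p?

E[X] = 0 · P(X = 0) + 1 · P(X = 1) = 0 · (1 − p) + 1 · p = p.

For the variance, note that X² = X when X ∈ {0, 1}, so E[X²] = p and Var(X) = E[X²] − (E[X])² = p − p² = p(1 − p).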
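The Bernoulli mean and variance can also be checked by summing over the two outcomes directly. This is a minimal sketch with hypothetical helper names, and p = 0.7 is chosen only for illustration.

```python
def bernoulli_pmf(x, p):
    """P(X = x) for X ~ Bernoulli(p), using the compact form p^x (1-p)^(1-x)."""
    if x not in (0, 1):
        return 0.0
    return p**x * (1 - p) ** (1 - x)


def bernoulli_mean_var(p):
    """Mean and variance computed from the definition of expectation."""
    mean = sum(x * bernoulli_pmf(x, p) for x in (0, 1))
    second_moment = sum(x**2 * bernoulli_pmf(x, p) for x in (0, 1))
    return mean, second_moment - mean**2


# Should agree with E[X] = p and Var(X) = p(1 - p) up to rounding.
mean, var = bernoulli_mean_var(0.7)
```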


Binomial(n, p) #

Story #

Repeat a Bernoulli trial n times, independently, with the same success probability p. Let X be the number of successes.

Examples:

Support and PMF #

Support: X ∈ {0, 1, …, n}

PMF:

P(X = k) = (n choose k) pᵏ (1 − p)ⁿ⁻ᵏ for k = 0, 1, …, n,

where (n choose k) = n! / (k!(n−k)!).
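As a rough sketch, the binomial PMF translates almost directly into Python with `math.comb` from the standard library. The function name `binomial_pmf` and the values n = 5, p = 0.2 are illustrative assumptions.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p); zero outside 0..n."""
    if not 0 <= k <= n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Evaluate the PMF over the whole support {0, ..., 5}.
pmf_values = [binomial_pmf(k, 5, 0.2) for k in range(6)]

# The probabilities should sum to 1, and the mean should be close to n * p.
total = sum(pmf_values)
mean = sum(k * q for k, q in enumerate(pmf_values))
```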

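Intuition for the formula #

Any one specific sequence with k successes and n − k failures has probability pᵏ (1 − p)ⁿ⁻ᵏ, and there are (n choose k) such sequences.

Mean and variance #

E[X] = np

Var(X) = np(1 − p)

Both follow from writing X as a sum of n independent Bernoulli(p) variables.

A useful connection (preview of CLT): if n is large, a binomial can often be approximated by a normal:

Binomial(n, p) ≈ Normal(np, np(1 − p))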


Poisson(λ) #

Story #

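Poisson models counts of events in a fixed window when:

Events occur independently of one another.

The average rate of events is constant across the window.

Two events do not occur at exactly the same instant.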

Examples:

Support and PMF #

Support: X ∈ {0, 1, 2, …}

PMF:

P(X = k) = e^(−λ) λᵏ / k! for k = 0, 1, 2, …

Here λ is both the rate parameter and the mean count per window.

Mean and variance #

E[X] = λ

Var(X) = λ

That “mean = variance” fact is a diagnostic: if your observed counts have variance much bigger than the mean, a plain Poisson may be too simple.
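The mean-equals-variance diagnostic can be sketched in a few lines of Python. The sample `counts` below is hypothetical data made up for illustration, and `dispersion_index` is an assumed helper name. A ratio far above 1 hints that a plain Poisson may be too simple.

```python
from math import exp, factorial
from statistics import mean, variance


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)


def truncated_moments(lam, k_max=100):
    """Mean and variance from the PMF, truncating the infinite sum at k_max."""
    ks = range(k_max + 1)
    m = sum(k * poisson_pmf(k, lam) for k in ks)
    second = sum(k * k * poisson_pmf(k, lam) for k in ks)
    return m, second - m**2


def dispersion_index(counts):
    """Sample variance divided by sample mean; near 1 for Poisson-like data."""
    return variance(counts) / mean(counts)


# For the model itself, mean and variance should both be close to lambda = 3.
model_mean, model_var = truncated_moments(3.0)

# Hypothetical bursty counts: variance is much larger than the mean.
counts = [0, 1, 0, 9, 0, 12, 1, 0, 10, 2]
ratio = dispersion_index(counts)
```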

Connection to binomial (rare events) #

A classic approximation: if n is large and p is small but np stays moderate, then:

Binomial(n, p) ≈ Poisson(λ = np)

This is the “rare events” regime.
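One way to see the rare-events approximation is to compare the two PMFs term by term. The sketch below assumes n = 1000 and p = 0.003 (so np = 3); these values and the function names are illustrative choices, not taken from the lesson.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)


n, p = 1000, 0.003  # large n, small p, moderate n * p
lam = n * p

# Largest pointwise gap between the two PMFs over k = 0..20.
max_gap = max(
    abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(21)
)
```

Rerunning with a larger p (for example p = 0.3 and n = 10, which has the same mean) should show a visibly larger gap, since p is no longer small.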


Discrete probability calculations: summing #

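If X is discrete, you compute:

P(X ∈ A) = ∑ P(X = x), summed over all x in A.

Examples:

P(X ≤ k) = P(X = 0) + P(X = 1) + … + P(X = k) for a count variable X

P(X ≥ k) = 1 − P(X ≤ k − 1)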

In practice, you often use a CDF function from software for these sums, but it’s crucial to understand what is being summed and why.
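To show what a software CDF is summing, here is a small sketch that builds the sum by hand for any PMF on the nonnegative integers. The helper names `cdf_by_summing` and `prob_between` are hypothetical, and Poisson(3) is used only as an example input.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)


def cdf_by_summing(pmf, k):
    """P(X <= k): add up the PMF at 0, 1, ..., k."""
    return sum(pmf(i) for i in range(k + 1))


def prob_between(pmf, lo, hi):
    """P(lo <= X <= hi): add up the PMF at lo, ..., hi (inclusive)."""
    return sum(pmf(i) for i in range(lo, hi + 1))


def pmf(k):
    return poisson_pmf(k, 3.0)


p_at_most_1 = cdf_by_summing(pmf, 1)   # equals 4 * e^(-3)
p_at_least_2 = 1.0 - p_at_most_1       # complement rule
p_two_to_four = prob_between(pmf, 2, 4)
```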

Core Mechanic 2: Continuous Distributions (Uniform, Normal) and PDFs #

Why continuous models matter #

Many measurements are not naturally integer counts:

Even if the world is measured with finite precision, continuous distributions are often excellent approximations and give smooth, usable mathematics.

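The key mental shift: for a continuous X, any single point has probability zero, P(X = x) = 0. Probability belongs to intervals, and you get it by integrating the density f(x) over the interval.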


Uniform(a, b) #

Story #

“All values between a and b are equally plausible.”

Examples:

Support and PDF #

Support: X ∈ [a, b]

PDF:

f(x) = 1/(b − a) for a ≤ x ≤ b, and f(x) = 0 otherwise.

Probability of an interval #

For any c, d with a ≤ c ≤ d ≤ b:

P(c ≤ X ≤ d)

= ∫_c^d 1/(b − a) dx

= (d − c)/(b − a)

So probability is proportional to interval length.

Mean and variance #

E[X] = (a + b)/2

Var(X) = (b − a)²/12


Normal(μ, σ²) #

Story #

The normal (Gaussian) distribution models values that cluster around an average μ with symmetric noise.

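It also appears because of aggregation: sums and averages of many small, independent effects tend toward a normal shape, a fact made precise by the Central Limit Theorem.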

Examples:

Support and PDF #

Support: X ∈ ℝ

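PDF:

f(x) = 1/(σ√(2π)) · exp(−(x − μ)² / (2σ²))

Parameters:

μ: the mean (location of the peak)

σ²: the variance; σ > 0 is the standard deviation (spread)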

Standardization (Z-scores) #

A core technique is converting to the standard normal.

If X ∼ Normal(μ, σ²), define:

Z = (X − μ)/σ

Then Z ∼ Normal(0, 1).

This lets you use standard normal tables or software CDFs:

P(X ≤ x) = Φ((x − μ)/σ)

P(c ≤ X ≤ d) = Φ((d − μ)/σ) − Φ((c − μ)/σ)
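As a sketch of the software route, Φ can be written with the standard library's `math.erf` via the identity Φ(z) = ½(1 + erf(z/√2)). The function names are hypothetical, and μ = 100, σ = 15 are assumed only for illustration.

```python
from math import erf, sqrt


def standard_normal_cdf(z):
    """Phi(z) for Z ~ Normal(0, 1), expressed through the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ Normal(mu, sigma^2): standardize, then apply Phi."""
    z = (x - mu) / sigma
    return standard_normal_cdf(z)


def normal_interval_prob(c, d, mu, sigma):
    """P(c <= X <= d) as a difference of two CDF values."""
    return normal_cdf(d, mu, sigma) - normal_cdf(c, mu, sigma)


p_below_130 = normal_cdf(130, 100, 15)                 # Phi(2), about 0.977
p_85_to_115 = normal_interval_prob(85, 115, 100, 15)   # within one sigma
```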

The 68–95–99.7 rule (intuition) #

For X ∼ Normal(μ, σ²):

P(μ − σ ≤ X ≤ μ + σ) ≈ 0.68

P(μ − 2σ ≤ X ≤ μ + 2σ) ≈ 0.95

P(μ − 3σ ≤ X ≤ μ + 3σ) ≈ 0.997

This is not a definition—just a helpful memory.
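The rule can be reproduced with `statistics.NormalDist` from the Python standard library (available since Python 3.8). The helper `within_k_sigma` is a hypothetical name. The second call uses μ = 100, σ = 15 as an assumed example to show that the coverage does not depend on the parameters.

```python
from statistics import NormalDist


def within_k_sigma(k, mu=0.0, sigma=1.0):
    """P(mu - k*sigma <= X <= mu + k*sigma) for X ~ Normal(mu, sigma^2)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


# Coverage within 1, 2, and 3 standard deviations of the mean.
coverage = {k: within_k_sigma(k) for k in (1, 2, 3)}

# Same question for a different mean and spread; the answer should match.
coverage_shifted = within_k_sigma(2, mu=100.0, sigma=15.0)
```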


Continuous probability calculations: integrating #

If X is continuous with PDF f:

f(x) ≥ 0 for all x, and ∫ f(x) dx = 1 when integrating over all of ℝ.

For an interval [c, d]:

P(c ≤ X ≤ d) = ∫_c^d f(x) dx

For the uniform, this integral is easy.

For the normal, there is no elementary antiderivative, so we use the CDF Φ(z) numerically.
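To illustrate what "numerically" means here, the sketch below integrates the normal PDF with a plain trapezoid rule and compares it with the erf-based closed form. The trapezoid rule is just one simple choice of numerical method (an assumption of this example, not how libraries necessarily compute Φ), and μ = 100, σ = 15 are illustrative values.

```python
from math import erf, exp, pi, sqrt


def normal_pdf(x, mu, sigma):
    """Density of Normal(mu, sigma^2) at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))


def trapezoid(f, c, d, steps=10_000):
    """Approximate the integral of f over [c, d] with the trapezoid rule."""
    h = (d - c) / steps
    total = 0.5 * (f(c) + f(d))
    for i in range(1, steps):
        total += f(c + i * h)
    return total * h


# P(85 <= X <= 115) for X ~ Normal(100, 15^2), i.e. within one sigma.
numeric = trapezoid(lambda x: normal_pdf(x, 100.0, 15.0), 85.0, 115.0)

# Same probability via Phi(1) - Phi(-1) = erf(1 / sqrt(2)).
closed_form = erf(1.0 / sqrt(2.0))
```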

A practical comparison:

| Task | Discrete | Continuous |
|---|---|---|
| Probability at a point | P(X=x) can be > 0 | P(X=x)=0 |
| Probability over a range | sum PMF values | integrate PDF |
| Typical tool | ∑ and CDF tables | ∫ and CDF Φ |

Understanding this split (support → PMF/PDF → sum/integrate) prevents many downstream mistakes in statistics and ML.

Application/Connection: Choosing a Distribution + How This Unlocks Next Topics #

Why model choice matters #

A distribution is a compact set of assumptions. Choosing one is not just “picking a formula”—it’s deciding what outcomes are possible and what patterns are likely.

A good first pass is to match the data type and generative story.


Quick chooser: which distribution should I try? #

| If your variable X is… | And the story is… | Start with… |
|---|---|---|
| 0/1 outcome | one trial with success prob p | Bernoulli(p) |
| integer 0…n | n independent trials, constant p | Binomial(n, p) |
| nonnegative count | events in a window at average rate λ | Poisson(λ) |
| real in [a, b] | equally likely in an interval | Uniform(a, b) |
| real-valued | symmetric noise around μ | Normal(μ, σ²) |

Then sanity-check with:


How this connects to Maximum Likelihood Estimation (MLE) #

In MLE, you assume data x₁, …, xₙ came from a distribution family p(x | θ) and pick θ that makes the observed data most likely.

Examples you’ll soon see:

To do MLE well, you must recognize which likelihood matches your data (Bernoulli vs binomial vs Poisson, etc.).
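As a preview, here is a minimal sketch of MLE for a Bernoulli likelihood using a brute-force grid search over p. The dataset is hypothetical, and the grid search is a deliberately simple stand-in for the calculus-based derivation. For Bernoulli data the known closed-form answer is the sample mean, which the grid result should match.

```python
from math import log


def bernoulli_log_likelihood(p, data):
    """Log-likelihood of i.i.d. 0/1 data under Bernoulli(p), for 0 < p < 1."""
    successes = sum(data)
    failures = len(data) - successes
    return successes * log(p) + failures * log(1 - p)


# Hypothetical observations: 6 successes in 10 trials.
data = [1, 0, 0, 1, 1, 0, 1, 1, 0, 1]

# Candidate values of p strictly between 0 and 1.
grid = [i / 1000 for i in range(1, 1000)]

# Pick the candidate that makes the observed data most likely.
p_hat_grid = max(grid, key=lambda p: bernoulli_log_likelihood(p, data))

# Closed-form Bernoulli MLE: the fraction of successes.
p_hat_closed = sum(data) / len(data)
```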


How this connects to Bayesian inference #

Bayesian inference updates distributions with data:

posterior ∝ likelihood × prior, that is, p(θ | x) ∝ p(x | θ) p(θ).

The likelihood often comes from a “common distribution.” Examples:

Knowing the likelihood family is step 1.
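A rough sketch of how a likelihood family feeds a Bayesian update: the code below evaluates a Bernoulli likelihood on a grid of p values and normalizes. The flat prior, the grid approximation, and the dataset are all assumptions made for illustration; they are not the lesson's prescribed method.

```python
def grid_posterior(data, grid):
    """Posterior weights over grid values of p for Bernoulli data, flat prior."""
    successes = sum(data)
    failures = len(data) - successes
    prior = [1.0] * len(grid)  # assumption: every p equally plausible a priori
    unnormalized = [
        pr * p**successes * (1 - p) ** failures for p, pr in zip(grid, prior)
    ]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical observations: 6 successes in 10 trials.
data = [1, 0, 0, 1, 1, 0, 1, 1, 0, 1]

# Midpoints of 1000 equal-width bins covering (0, 1).
grid = [(i + 0.5) / 1000 for i in range(1000)]

posterior = grid_posterior(data, grid)
posterior_mean = sum(p * w for p, w in zip(grid, posterior))
```

With a flat prior and 6 successes in 10 trials, the exact posterior is Beta(7, 5), whose mean is 7/12; the grid estimate should land close to that value.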


How this connects to joint distributions and the CLT #


One more pacing note: models are approximations #

A distribution is rarely “true.” It’s a simplified story that is useful if:

As you learn more, you’ll add richer families, but these five are the workhorses you’ll keep returning to.

Worked Examples (3) #

Binomial probability: at least k successes #

A website A/B test shows a conversion on a visit with probability p = 0.2, assumed constant across visitors. You observe n = 5 independent visitors. Let X be the number of conversions. Compute P(X ≥ 2).

  1. Identify the distribution:

    X counts successes in n independent Bernoulli trials ⇒ X ∼ Binomial(n, p) with n=5, p=0.2.

  2. Use the complement to reduce work:

    P(X ≥ 2) = 1 − P(X ≤ 1)

    = 1 − (P(X=0) + P(X=1)).

  3. Compute P(X=0):

    P(X=0) = (5 choose 0) (0.2)⁰ (0.8)⁵

    = 1 · 1 · 0.8⁵

    = 0.32768.

  4. Compute P(X=1):

    P(X=1) = (5 choose 1) (0.2)¹ (0.8)⁴

    = 5 · 0.2 · 0.8⁴

    = 1 · 0.4096

    = 0.4096.

  5. Combine:

    P(X ≤ 1) = 0.32768 + 0.4096 = 0.73728

    P(X ≥ 2) = 1 − 0.73728 = 0.26272.

Insight: For discrete distributions, complements often avoid long sums. Here, summing k=2,3,4,5 is more work than subtracting k=0,1 from 1.
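The arithmetic in this example can be double-checked with exact rational numbers, which sidesteps floating-point rounding. This sketch uses `fractions.Fraction` from the standard library; the function name `binomial_pmf_exact` is hypothetical.

```python
from fractions import Fraction
from math import comb


def binomial_pmf_exact(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), exact when p is a Fraction."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p = Fraction(1, 5)  # p = 0.2 as an exact fraction
n = 5

# Complement route: 1 - (P(X=0) + P(X=1)).
at_most_one = binomial_pmf_exact(0, n, p) + binomial_pmf_exact(1, n, p)
at_least_two = 1 - at_most_one

# Direct route: P(X=2) + P(X=3) + P(X=4) + P(X=5).
direct_sum = sum(binomial_pmf_exact(k, n, p) for k in range(2, n + 1))
```

Both routes give the same exact fraction, 821/3125, which is 0.26272 as a decimal.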

Poisson probability: probability of 0 or 1 event #

A server receives requests at an average rate of λ = 3 requests per minute. Model the number of requests in a minute as X ∼ Poisson(3). Compute P(X ≤ 1).

  1. Write the PMF:

    P(X=k) = e^(−λ) λᵏ / k! with λ = 3.

  2. Compute P(X=0):

    P(X=0) = e^(−3) 3⁰ / 0!

    = e^(−3).

  3. Compute P(X=1):

    P(X=1) = e^(−3) 3¹ / 1!

    = 3e^(−3).

  4. Sum:

    P(X ≤ 1) = P(X=0) + P(X=1)

    = e^(−3) + 3e^(−3)

    = 4e^(−3)

    ≈ 4 · 0.049787

    ≈ 0.19915.

Insight: Poisson computations are often a few terms plus a small exponential factor. Also note how λ directly sets the typical count: with λ=3, getting ≤1 is fairly unlikely (~0.20).

Uniform PDF to probability: interval length #

Let X ∼ Uniform(10, 18). Compute P(12 ≤ X ≤ 15) and the mean and variance.

  1. Write the PDF:

    f(x) = 1/(18−10) = 1/8 for 10 ≤ x ≤ 18.

  2. Compute the probability by integrating:

    P(12 ≤ X ≤ 15) = ∫_12^15 (1/8) dx

    = (1/8)(15−12)

    = 3/8

    = 0.375.

  3. Compute the mean:

    E[X] = (a+b)/2 = (10+18)/2 = 14.

  4. Compute the variance:

    Var(X) = (b−a)²/12

    = (8)²/12

    = 64/12

    = 16/3

    ≈ 5.333.

Insight: For a uniform distribution, probabilities are purely about lengths of intervals—no calculus tricks required beyond “constant × width.”
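The closed-form mean and variance can also be recovered by integrating x·f(x) and x²·f(x) numerically. The sketch below uses a midpoint rule, which is one simple choice among many; the helper name `midpoint_integral` is hypothetical.

```python
def midpoint_integral(g, lo, hi, steps=10_000):
    """Approximate the integral of g over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))


a, b = 10.0, 18.0
density = 1.0 / (b - a)  # constant PDF value 1/8 on [10, 18]

# P(12 <= X <= 15): integrate the constant density over the interval.
prob = midpoint_integral(lambda x: density, 12.0, 15.0)

# E[X] and E[X^2] by integrating against the density over the support.
mean = midpoint_integral(lambda x: x * density, a, b)
second_moment = midpoint_integral(lambda x: x * x * density, a, b)
variance = second_moment - mean**2
```

The results should agree with 0.375, 14, and 16/3 up to a small numerical error.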

Key Takeaways #

Common Mistakes #

Practice #

easy

Let X ∼ Bernoulli(p) with p = 0.7. Compute E[X], Var(X), and P(X=0).

Hint: Use E[X]=p and Var(X)=p(1−p). Also P(X=0)=1−p.

Solution:

E[X]=0.7.

Var(X)=0.7(1−0.7)=0.7·0.3=0.21.

P(X=0)=1−0.7=0.3.

medium

A factory produces items with defect probability p = 0.05 independently. In a batch of n = 20 items, let X be the number of defects. Compute P(X=0) and P(X≥1).

Hint: Use X ∼ Binomial(20, 0.05). P(X≥1)=1−P(X=0).

Solution:

X ∼ Binomial(20,0.05).

P(X=0) = (20 choose 0)(0.05)⁰(0.95)²⁰ = 0.95²⁰ ≈ 0.3585.

P(X≥1)=1−0.95²⁰ ≈ 1−0.3585 = 0.6415.

medium

Let X ∼ Normal(μ, σ²) with μ = 100 and σ = 15. Compute P(X ≤ 130) in terms of the standard normal CDF Φ, and give a numerical approximation.

Hint: Convert to Z = (X−μ)/σ. Then P(X≤x)=Φ((x−μ)/σ). Use Φ(2)≈0.9772.

Solution:

Z = (X−100)/15 so Z ∼ Normal(0,1).

P(X ≤ 130) = P(Z ≤ (130−100)/15) = Φ(30/15) = Φ(2).

Numerically, Φ(2) ≈ 0.9772.

Connections #

Next nodes you can tackle:
