Conditional Probability


Conditional Probability #

Probability & Statistics · Difficulty: ★★☆☆☆ · Depth: 2 · Unlocks: 28

P(A|B) - probability of A given B has occurred.


Core Concepts #

Key Symbols & Notation #

P(A|B)

Essential Relationships #

Prerequisites (1) #

Basic Probability (6 atoms)

Unlocks (3) #

Bayes Theorem (lvl 2), Markov Chains (lvl 4), Independence (lvl 2)

Referenced by (7) #

Where this concept shows up in the operating-finance and personal-finance graphs.

From Business (6) #

[Close Rate](/business/close-rate/) (Business)

Close rate is literally P(placement | interview); data-driven recruiting means identifying which observable candidate signals maximize this conditional probability, and the 4:1 ratio is a direct expression of the conditional conversion rate through the funnel.

[Collateral](/business/collateral/) (Business)

Collateral fundamentally changes P(loss severity | default) - the conditional distribution of lender losses given borrower default is dramatically different with vs without pledged assets, which is the mathematical mechanism by which collateral reduces loan pricing.

[Interview-to-Placement Ratio](/business/interview-to-placement-ratio/) (Business)

The 4:1 ratio is literally P(placement | interview) = 0.25. Data-driven recruiting means decomposing this conditional probability through the funnel - P(pass screen | apply), P(pass interview | pass screen), etc. - and measuring which signals improve each stage's conversion rate.

[Time-to-Fill](/business/time-to-fill/) (Business)

Close rates and interview-stage ratios are conditional probabilities - P(accept | offer), P(onsite | phone screen) - and the entire hiring funnel is a chain of conditional transitions whose product gives P(hire | sourced).

[Contingent Liabilities](/business/contingent-liabilities/) (Business)

A contingent liability is literally a conditional obligation - it only materializes if a triggering event occurs, making P(loss | trigger event) the core quantity to estimate.

[Churn Rate](/business/churn-rate/) (Business)

Churn rate is P(leave | active customer in period t). Computing it correctly requires conditioning on the right population and time window - cohort-based churn vs blended rates are conditional probability problems.

From Money (1) #

[Disability Insurance](/money/disability-insurance/) (Money)

Disability probability is conditioned on occupation, age, and health.

Advanced Learning Details

Graph Position #

Depth Cost: 18

Fan-Out (ROI): 28

Bottleneck Score: 14

Chain Length: 2

Cognitive Load #

Atomic Elements: 6

Total Elements: 16

Percentile Level: L0

Atomic Level: L4

All Concepts (6) #

Teaching Strategy #

Deep-dive lesson - accessible entry point but dense material. Use worked examples and spaced repetition.

A fair coin has P(H) = 1/2. But if you’re told “the coin landed heads,” the probability of heads becomes 1. Conditional probability is the formal way to express this idea: once you learn B happened, you measure everything inside the world where B is true.

TL;DR:

Conditional probability is the probability of event A given event B: P(A|B). It means you restrict your attention to outcomes in B, and then ask what fraction of those outcomes also lie in A. Formally (when P(B) > 0): P(A|B) = P(A ∩ B) / P(B).

What Is Conditional Probability? #

Why we need it #

In basic probability, you pick a sample space Ω (all possible outcomes) and assign probabilities to events (subsets of Ω). That’s great when you have no extra information.

But real reasoning almost always comes with information:

That phrase “given that …” is exactly what conditional probability captures.

Intuition: shrinking the universe #

Think of an event B as a filter. Once you learn B occurred, outcomes not in B are impossible, so you throw them away.

Conditional probability asks: inside this new world B, how likely is A?

So P(A|B) is not “A and B” (that’s intersection). It’s “A measured within B.”

Formal definition #

If P(B) > 0, the conditional probability of A given B is

P(A|B) = P(A ∩ B) / P(B)

Read it slowly:

A picture in words #

Imagine 100 equally likely outcomes.

Then:

The key idea: you don’t compare A ∩ B to Ω anymore; you compare it to B.

The nonzero condition #

The definition requires P(B) > 0.

Why? Because if P(B) = 0, then “given B happened” is conditioning on something that never happens (in your probability model). The fraction P(A ∩ B)/P(B) would divide by zero.

At difficulty level 2, the important takeaway is: only condition on events with positive probability (for basic discrete problems).
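As a minimal sketch of the definition and its nonzero condition, the code below stores a discrete model as a dictionary from outcomes to probabilities. The helper names `prob` and `conditional_probability` are illustrative assumptions, not part of the lesson.

```python
from fractions import Fraction

def prob(pmf, event):
    """Total probability of the outcomes in `event` under the model `pmf`."""
    return sum(p for outcome, p in pmf.items() if outcome in event)

def conditional_probability(pmf, a, b):
    """Return P(A|B) = P(A ∩ B) / P(B); undefined when P(B) = 0."""
    p_b = prob(pmf, b)
    if p_b == 0:
        raise ValueError("P(B) = 0: conditioning on B is undefined here")
    return prob(pmf, a & b) / p_b

# Fair coin: once we learn "heads", heads is certain.
coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
print(conditional_probability(coin, {"H"}, {"H"}))  # 1
```

Raising an error for P(B) = 0 mirrors the rule above: in basic discrete problems, only condition on events with positive probability.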

Core Mechanic 1: Conditioning Restricts the Sample Space #

Why this matters #

Most mistakes with conditional probability come from not fully committing to the new sample space. Learners often keep using the original denominator (Ω) instead of the conditioned denominator (B).

Sample-space rewrite rule #

Once you know B happened:

In equally likely discrete settings:

P(A|B) = |A ∩ B| / |B|

This is the same as the earlier formula, just using counts instead of probabilities.

Example: die roll with a condition #

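Let Ω = {1,2,3,4,5,6}, with A = “the roll is 6” = {6} and B = “the roll is even” = {2,4,6}.

After conditioning on B, the new universe is {2,4,6}.

Within B, there are 3 equally likely outcomes, and exactly one of them (6) lies in A.

So:

P(A|B) = 1/3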

Notice how different this is from P(A) = 1/6. The evidence “even” made 6 more plausible.
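The same comparison can be sketched with sets, assuming (as the numbers above imply) that A is “the roll is 6” and B is “the roll is even”. The only thing that changes between the two probabilities is the denominator.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}             # fair six-sided die
A = {6}                                 # assumed event: "the roll is 6"
B = {n for n in omega if n % 2 == 0}    # "the roll is even"

p_a = Fraction(len(A), len(omega))           # denominator is the full space
p_a_given_b = Fraction(len(A & B), len(B))   # denominator is B

print(p_a, p_a_given_b)  # 1/6 1/3
```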

Two complementary views (and when each is useful) #

Conditional probability can be approached in two equivalent ways:

| View | What you do | When it’s easiest |
| --- | --- | --- |
| Restrict-and-count | Rewrite the sample space as B, then count A within it | Equally likely outcomes (dice, cards) |
| Use the formula | Compute P(A ∩ B) and P(B) from given probabilities | Non-uniform probabilities, word problems |

Relationship to complements #

If you are inside B, the complement of A becomes “not A, but still inside B.”

P(Aᶜ|B) = 1 − P(A|B)

This is often an easy way to compute a conditional probability when the direct event is awkward.

A common mental model: “re-normalize” #

Conditioning on B does two things:

  1. Deletes outcomes not in B

  2. Scales the remaining probabilities so they sum to 1

If the outcomes in B were originally equally likely, they remain equally likely relative to each other after conditioning.

If outcomes in B were not equally likely, you can still condition, but you must keep their original weights and then re-normalize.
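Here is a small sketch of the delete-then-rescale idea for a non-uniform model. The loaded-die weights and the helper name `condition_on` are made up for illustration.

```python
from fractions import Fraction

def condition_on(pmf, b):
    """Delete outcomes outside B, then rescale the rest so they sum to 1."""
    p_b = sum(p for outcome, p in pmf.items() if outcome in b)
    if p_b == 0:
        raise ValueError("cannot condition on an event of probability 0")
    return {outcome: p / p_b for outcome, p in pmf.items() if outcome in b}

# Hypothetical loaded die: faces 5 and 6 are heavier than the rest.
loaded = {1: Fraction(1, 10), 2: Fraction(1, 10), 3: Fraction(1, 10),
          4: Fraction(1, 10), 5: Fraction(2, 10), 6: Fraction(4, 10)}

given_even = condition_on(loaded, {2, 4, 6})
print(given_even[6])             # 2/3
print(sum(given_even.values()))  # 1
```

Inside B, face 6 is still four times as likely as face 2, just as it was before conditioning. Only the overall scale changed.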

Mini-derivation: why the formula makes sense #

Suppose we want a new probability measure P(·|B) that lives on the restricted universe B.

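We want two things: events outside B should get probability 0, and probabilities inside B should keep their original proportions. That means P(A|B) can depend only on A ∩ B and must be proportional to P(A ∩ B).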

So we set

P(A|B) = c · P(A ∩ B)

Choose c so that P(B|B) = 1:

1 = P(B|B) = c · P(B ∩ B) = c · P(B)

So c = 1 / P(B).

Therefore:

P(A|B) = P(A ∩ B) / P(B)

This is the cleanest justification for the definition: it’s the unique way to “renormalize” probabilities inside B.

Core Mechanic 2: Multiplication Rule (Turning Conditionals into Intersections) #

Why we need a second mechanic #

Sometimes you don’t want P(A|B). Instead, you want the probability that both events happen, P(A ∩ B). Conditional probability gives a direct bridge between these.

Starting from the definition:

P(A|B) = P(A ∩ B) / P(B)

Multiply both sides by P(B):

P(A ∩ B) = P(A|B) · P(B)

This is called the multiplication rule.

Two equivalent forms #

Be careful with order. Both forms are true whenever the conditional probability involved is defined (P(B) > 0 for the first, P(A) > 0 for the second).

P(A ∩ B) = P(A|B)P(B)

P(A ∩ B) = P(B|A)P(A)

They describe the same intersection, just conditioning in different directions.
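A quick numerical check of both forms, using a fair die with two events chosen only for illustration (A = “roll is at least 5”, B = “roll is odd”):

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
A = {5, 6}       # illustrative event: "roll is at least 5"
B = {1, 3, 5}    # illustrative event: "roll is odd"

def p(event):
    """Probability of an event on a fair die."""
    return Fraction(len(event), len(omega))

p_a_given_b = Fraction(len(A & B), len(B))   # 1/3
p_b_given_a = Fraction(len(A & B), len(A))   # 1/2

# Different conditionals, same intersection.
print(p_a_given_b * p(B), p_b_given_a * p(A), p(A & B))  # 1/6 1/6 1/6
```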

Why this is powerful #

The intersection P(A ∩ B) is often hard to estimate directly, but conditional probabilities are natural in real situations.

Example narrative:

Then P(A ∩ B) = probability it rains and traffic is bad.

Chaining more than two events #

Conditional probability lets you build probabilities step-by-step. For three events A, B, C with appropriate nonzero probabilities:

P(A ∩ B ∩ C) = P(A|B ∩ C) · P(B|C) · P(C)

Interpretation: start from C, then within C consider B, then within (B ∩ C) consider A.

At this node’s level, you don’t need to memorize the general chain rule, but it’s useful to see that conditional probability is the building block for multi-step reasoning.
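As an illustrative sketch of the three-event chain, take a standard 52-card deck (an example chosen here, not from the lesson) and let C, B, A be “the first, second, third card drawn is an ace”. Multiplying the step-by-step conditionals should agree with a direct combinatorial count.

```python
from fractions import Fraction
from math import comb

p_c = Fraction(4, 52)            # P(C): first card is an ace
p_b_given_c = Fraction(3, 51)    # P(B|C): second is an ace, one ace already gone
p_a_given_bc = Fraction(2, 50)   # P(A|B ∩ C): third is an ace, two aces gone

chain = p_a_given_bc * p_b_given_c * p_c

# Direct count: choose 3 of the 4 aces out of all 3-card hands.
direct = Fraction(comb(4, 3), comb(52, 3))

print(chain, direct)  # 1/5525 1/5525
```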

A note on symmetry (and why “given” is not symmetric) #

Intersection is symmetric:

A ∩ B = B ∩ A

But conditional probability generally is not:

P(A|B) ≠ P(B|A)

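Example: for a randomly chosen whole number, let A = “the number is divisible by 2” and B = “the number is divisible by 6”.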

If a number is divisible by 6, it must be divisible by 2, so P(A|B) = 1.

But if a number is divisible by 2, it is not necessarily divisible by 6, so P(B|A) < 1.

This non-symmetry is the entire reason Bayes’ Theorem is interesting later: it provides a way to relate the two directions.
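To see the two directions side by side, the sketch below assumes the number is drawn uniformly from 1 to 12 (a range picked only for illustration), with A = “divisible by 2” and B = “divisible by 6”.

```python
from fractions import Fraction

numbers = range(1, 13)                     # assumed range: 1..12
A = {n for n in numbers if n % 2 == 0}     # divisible by 2
B = {n for n in numbers if n % 6 == 0}     # divisible by 6

p_a_given_b = Fraction(len(A & B), len(B))   # among multiples of 6
p_b_given_a = Fraction(len(A & B), len(A))   # among even numbers

print(p_a_given_b, p_b_given_a)  # 1 1/3
```

The numerator |A ∩ B| is the same in both lines. Only the denominator changes, and that is what breaks the symmetry.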

Application/Connection: Reading Real Problems (Tests, Updates, and “Given” Language) #

Translating English to events #

A lot of conditional probability skill is language parsing.

A practical technique: rewrite the question as

“Among outcomes where B is true, what fraction have A true?”

Diagnostic tests (setup for Bayes) #

Consider medical testing language:

Two different conditionals:

These are not the same. Conditional probability makes that distinction precise.

Even before Bayes’ Theorem, you can see the structure:

This node equips you to keep the symbols straight so Bayes later feels like algebra, not magic.
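To make the two directions concrete, here is a sketch with made-up screening counts. The numbers and variable names are illustrative assumptions, not data from the lesson. It compares P(positive | disease) with P(disease | positive) by plain restrict-and-count.

```python
from fractions import Fraction

# Hypothetical counts for 1000 people (illustrative only).
disease_pos = 9     # has the disease, tests positive
disease_neg = 1     # has the disease, tests negative
healthy_pos = 99    # no disease, tests positive
healthy_neg = 891   # no disease, tests negative

# Condition on "has disease": denominator is everyone with the disease.
p_pos_given_disease = Fraction(disease_pos, disease_pos + disease_neg)

# Condition on "tests positive": denominator is everyone who tested positive.
p_disease_given_pos = Fraction(disease_pos, disease_pos + healthy_pos)

print(p_pos_given_disease, p_disease_given_pos)  # 9/10 1/12
```

Both fractions share the numerator 9. They differ only in which group was used as the conditioned universe.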

Independence as a special case #

Independence will be unlocked soon. Conditional probability is the quickest way to express it.

A and B are independent exactly when

P(A|B) = P(A)

(assuming P(B) > 0)

Interpretation: learning B doesn’t change your belief about A.

Equivalently:

P(A ∩ B) = P(A)P(B)

But conceptually, the conditional form is often more intuitive: no update.
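A small sketch of an independence check on a fair die, using the product form. The events and the helper name `is_independent` are chosen here for illustration.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}

def p(event):
    """Probability of an event on a fair die."""
    return Fraction(len(event), len(omega))

def is_independent(a, b):
    """True when P(A ∩ B) = P(A) P(B)."""
    return p(a & b) == p(a) * p(b)

even = {2, 4, 6}
at_most_4 = {1, 2, 3, 4}
at_least_4 = {4, 5, 6}

print(is_independent(even, at_most_4))   # True:  P(even | at most 4) = 1/2 = P(even)
print(is_independent(even, at_least_4))  # False: P(even | at least 4) = 2/3
```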

Markov chains preview #

A Markov chain is about transitions like

P(Xₜ₊₁ = j | Xₜ = i)

That is literally conditional probability: the next state given the current state.

So this node is foundational: without comfort reading and manipulating P(·|·), transition matrices and “memoryless” properties will feel opaque.
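As a rough sketch of how such transition probabilities relate to restrict-and-count, the code below estimates them from a short, made-up sequence of states ("S" and "R" are hypothetical labels). These are empirical frequencies from one toy sequence, not the parameters of any real chain.

```python
from collections import Counter
from fractions import Fraction

states = list("SSRSRRSS")                  # hypothetical observed sequence
pairs = list(zip(states, states[1:]))      # (current state, next state)

pair_counts = Counter(pairs)
from_counts = Counter(current for current, _ in pairs)

def transition(i, j):
    """Estimate P(next = j | current = i): restrict to steps that start in i."""
    return Fraction(pair_counts[(i, j)], from_counts[i])

print(transition("S", "R"), transition("R", "S"))  # 1/2 2/3
```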

Quick checklist for solving conditional probability problems #

  1. Identify events A and B clearly.

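  2. Confirm P(B) > 0 (in discrete problems, B must contain at least one outcome with positive probability).

  3. Decide approach: restrict-and-count when outcomes are equally likely, or the formula P(A ∩ B) / P(B) when probabilities are given.

  4. Be explicit about the denominator: after conditioning, your denominator is B.

  5. Sanity-check: 0 ≤ P(A|B) ≤ 1, and if A ⊂ B then P(A|B) = P(A)/P(B) and should be ≤ 1.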

That last point is a great self-check: if you compute something bigger than 1, your denominator or event interpretation is wrong.

Worked Examples (3) #

Worked Example 1: Conditioning by restricting the sample space (cards) #

A standard 52-card deck. Let A = “card is an Ace”. Let B = “card is a Spade”. Find P(A|B).

  1. Step 1: Translate the meaning.

    P(A|B) means: among the spades, what fraction are aces?

  2. Step 2: Count the conditioned universe.

    B = “Spade” → there are 13 spades.

    So |B| = 13.

  3. Step 3: Count the overlap.

    A ∩ B = “Ace and Spade” → only the Ace of Spades.

    So |A ∩ B| = 1.

  4. Step 4: Compute the conditional probability using counts.

    P(A|B) = |A ∩ B| / |B| = 1 / 13.

Insight: Conditioning turned a 52-outcome space into a 13-outcome space. The denominator must match the condition.
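The same count can be sketched by building the deck explicitly. The rank and suit labels below are just one convenient encoding.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = list(product(ranks, suits))                   # 52 (rank, suit) pairs

spades = [card for card in deck if card[1] == "spades"]       # the event B
aces_in_spades = [card for card in spades if card[0] == "A"]  # A ∩ B

p_ace_given_spade = Fraction(len(aces_in_spades), len(spades))
print(len(deck), len(spades), p_ace_given_spade)  # 52 13 1/13
```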

Worked Example 2: Using the formula and the multiplication rule #

Suppose P(B) = 0.30 and P(A|B) = 0.20. Find (1) P(A ∩ B) and (2) P(Aᶜ|B).

  1. Part (1): Use the multiplication rule.

    We know:

    P(A ∩ B) = P(A|B)P(B)

  2. Compute:

    P(A ∩ B) = 0.20 · 0.30 = 0.06

  3. Part (2): Use the complement rule inside the condition.

    P(Aᶜ|B) = 1 − P(A|B)

  4. Compute:

    P(Aᶜ|B) = 1 − 0.20 = 0.80

Insight: Once you know one conditional probability, you can often get several others quickly using algebraic identities (multiplication and complements).

Worked Example 3: “Given” changes the denominator (dice) #

Roll a fair six-sided die. Let A = “roll is greater than 3” and B = “roll is odd”. Compute P(A|B) and compare to P(A).

  1. Step 1: List outcomes.

    Ω = {1,2,3,4,5,6}

    A = {4,5,6}

    B = {1,3,5}

  2. Step 2: Restrict to B.

    Given B occurred, possible outcomes are {1,3,5}. So |B| = 3.

  3. Step 3: Find overlap A ∩ B.

    A ∩ B = {5}. So |A ∩ B| = 1.

  4. Step 4: Compute conditional.

    P(A|B) = |A ∩ B| / |B| = 1/3.

  5. Step 5: Compute unconditional for comparison.

    P(A) = |A| / |Ω| = 3/6 = 1/2.

Insight: Learning “odd” ruled out every even roll, including 4 and 6, which reduced the chance of being > 3 from 1/2 down to 1/3.

Key Takeaways #

Common Mistakes #

Practice #

easy

A fair die is rolled. Let A = “the roll is 2 or 3” and B = “the roll is less than 4”. Compute P(A|B).

Hint: Restrict the sample space to B first, then count outcomes in A within that restricted set.

Solution:

Ω = {1,2,3,4,5,6}

A = {2,3}

B = {1,2,3}

A ∩ B = {2,3}

P(A|B) = |A ∩ B| / |B| = 2/3.

medium

You are told that P(B) = 0.4 and P(A ∩ B) = 0.1. Compute P(A|B).

Hint: Use the definition P(A|B) = P(A ∩ B)/P(B).

Solution:

Given P(B) = 0.4 and P(A ∩ B) = 0.1,

P(A|B) = 0.1 / 0.4 = 0.25.

hard

A bag has 3 red balls and 2 blue balls. Two balls are drawn without replacement. Let A = “the second ball is red” and B = “the first ball is red”. Compute P(A|B).

Hint: After conditioning on B, update the bag’s composition before computing the probability of A.

Solution:

Initially: 3R, 2B (5 total).

Condition on B: the first ball is red, so remove one red.

Remaining bag: 2R, 2B (4 total).

Event A: second ball is red.

So P(A|B) = 2/4 = 1/2.
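One way to double-check this answer is to enumerate every ordered pair of draws, which are all equally likely, and count. The ball labels below are arbitrary names introduced for the sketch.

```python
from fractions import Fraction
from itertools import permutations

balls = ["R1", "R2", "R3", "B1", "B2"]     # 3 red, 2 blue (labels are arbitrary)
draws = list(permutations(balls, 2))       # 20 equally likely ordered pairs

first_red = [d for d in draws if d[0].startswith("R")]       # the event B
both_red = [d for d in first_red if d[1].startswith("R")]    # A ∩ B
second_red = [d for d in draws if d[1].startswith("R")]      # the event A

p_a_given_b = Fraction(len(both_red), len(first_red))
p_a = Fraction(len(second_red), len(draws))

print(p_a_given_b, p_a)  # 1/2 3/5
```

Without the condition, the second ball is red with probability 3/5. Learning that the first ball was red lowers this to 1/2.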

Connections #

Next nodes: Bayes Theorem, Independence, Markov Chains.

Quality: A (4.5/5)
