Binary Search

←Back to Tech Tree

inventorycoverage

Binary Search #

AlgorithmsDifficulty: ★★☆☆☆Depth: 3Unlocks: 0

O(log n) search in sorted array. Divide and conquer.

Interactive Visualization #

⏮◀◀▶▶STEP0.25x1xZOOM

t=0s

Core Concepts #

Key Symbols & Notation #

low (lowest index of current interval)high (highest index of current interval)

Essential Relationships #

Prerequisites (2) #

Arrays5 atomsBig O Notation6 atoms

Referenced by (1) #

Where this concept shows up in the operating-finance and personal-finance graphs.

From Business (1) #

[IRRBusiness

Bisection for IRR is literally binary search on a continuous domain - bracket the rate where NPV changes sign, halve the interval, repeat. The O(log n) convergence logic and divide-and-conquer structure are identical.](/business/irr/)

Advanced Learning Details

Graph Position #

38

Depth Cost

0

Fan-Out (ROI)

0

Bottleneck Score

3

Chain Length

Cognitive Load #

5

Atomic Elements

30

Total Elements

L1

Percentile Level

L3

Atomic Level

All Concepts (13) #

Teaching Strategy #

Self-serve tutorial - low prerequisites, straightforward concepts.

You have a sorted list of a million items and need to know whether a value is present. Scanning from left to right works—but it wastes the fact that the list is sorted. Binary search is the classic way to turn “sorted order” into a dramatic speedup: from O(n) down to O(log n).

TL;DR:

Binary search repeatedly halves a sorted array’s remaining search interval. Maintain an interval [low, high] that always contains the target’s index if it exists. Check mid; if mid is too small, move low up; if too large, move high down. This takes O(log n) time and O(1) extra space.

What Is Binary Search? #

The problem it solves #

You’re given an array A and a target value. You want to find the index of target (or decide it isn’t present).

Binary search is a divide-and-conquer algorithm for searching in a sorted array. The key idea: each comparison lets you discard half of the remaining candidates.

What “sorted” means here #

Assume the array is sorted in nondecreasing order:

Nondecreasing allows duplicates (e.g., [1, 2, 2, 2, 5]). Binary search still works, but you must be clear on what you want when duplicates exist (any occurrence? first? last?). We’ll cover variants later.

The mental model: shrinking an interval #

Binary search maintains a search interval of indices where the target could still be.

At all times, you try to preserve this loop invariant:

Invariant: If target appears in the array, then its index lies within [low, high].

Each step picks the middle index mid, compares A[mid] to target, and throws away the half that can’t contain the target.

Prerequisite note (to avoid hidden gaps) #

Before you code binary search confidently, you should be comfortable with:

  1. 1)Integer division / floor: computing mid using floor behavior (e.g., (low + high) // 2).
  2. 2)Loop invariants: understanding and maintaining “what is always true” about [low, high] each iteration.
  3. 3)Edge cases: empty arrays, single-element arrays, and duplicate values.
  4. 4)(Optional but useful) Comparator-based ordering: arrays sorted descending or by a custom key require adjusting comparisons.

These aren’t “extra topics”—they are exactly where most binary search bugs come from.

Why it’s O(log n) #

Each iteration cuts the interval size roughly in half.

If the array length is n:

We stop when the interval becomes empty or we find the target. The smallest k such that n/2ᵏ ≤ 1 is k ≥ log₂(n). So the number of iterations is O(log n).

Core Mechanic 1: Maintain the Search Interval [low, high] (Loop Invariant) #

Why the interval matters #

Binary search isn’t just “check middle and move left/right.” It’s a careful promise about what indices are still possible.

If you lose that promise (the invariant), you can:

So we’ll be explicit about the invariant and how each update preserves it.

The standard exact-match version (closed interval) #

We’ll use a closed interval [low, high] inclusive. Initialize:

Loop while the interval is non-empty:

Compute middle:

(The low + (high - low) form avoids overflow in languages with fixed-size integers; in Python it’s not necessary, but it’s a good habit.)

Compare:

If the loop ends, return “not found” (often -1).

Showing the invariant preservation #

Let’s spell out what each case means.

Assume the array is sorted nondecreasing.

Case 1: A[mid] < target #

Because the array is sorted, for every i ≤ mid, we have A[i] ≤ A[mid] < target.

So no index ≤ mid can hold target.

Therefore, if target exists, it must be in indices mid + 1 ... high. Setting low = mid + 1 preserves:

If target exists, it’s still in [low, high].

Case 2: A[mid] > target #

Similarly, for every i ≥ mid, we have A[i] ≥ A[mid] > target.

So no index ≥ mid can hold target.

Therefore, if target exists, it must be in indices low ... mid - 1. Setting high = mid - 1 preserves the invariant.

Case 3: A[mid] == target #

We’re done.

Termination: why it must end #

Each iteration strictly shrinks the interval:

Since low and high are integers and the interval length can’t shrink forever without becoming empty, the loop terminates.

Concrete pseudocode #

binary_search(A, target):
    low = 0
    high = len(A) - 1

    while low <= high:
        mid = low + (high - low) // 2

        if A[mid] == target:
            return mid
        else if A[mid] < target:
            low = mid + 1
        else:
            high = mid - 1

    return -1

Breathing room: trace one run #

Suppose A = [2, 5, 7, 12, 19, 23, 31], target = 19.

Notice how [low, high] always contained index 4 until we found it.

Core Mechanic 2: Getting Boundaries Right (mid, duplicates, and interval conventions) #

Binary search is famous for off-by-one errors. Most bugs come from mixing interval conventions or mishandling duplicates.

1) Computing mid safely and correctly #

The classic formula is:

In many languages, low + high can overflow if indices are large. A safer version is:

Both choose the “lower middle” when there are two middles.

2) Closed vs half-open intervals #

There are two common interval conventions:

ConventionIntervalLoop conditionUpdate when going rightUpdate when going left
Closed[low, high]low <= highlow = mid + 1high = mid - 1
Half-open[low, high)low < highlow = mid + 1 or low = mid (depends)high = mid

For beginners, closed intervals are often easier to reason about for exact-match search.

The most important rule is: pick one convention and stay consistent.

3) Duplicates: “any occurrence” vs “first/last occurrence” #

If duplicates exist, the exact-match binary search above can return any index where A[mid] == target.

Sometimes you need a stable boundary, like:

These are incredibly useful in practice (counting occurrences, insertion positions, range queries).

Lower bound (first position with A[i] ≥ target) #

A common pattern uses a half-open interval [low, high):

At the end, low is the smallest index with A[low] ≥ target (or n if none).

Why does high = mid (not mid - 1)? Because in half-open intervals, high is exclusive, and we want to keep mid as a candidate.

Upper bound (first position with A[i] > target) #

Same structure, just change the comparison:

Then the count of occurrences of target is:

count=upper_bound(A,target)−lower_bound(A,target)\text{count} = \text{upper\_bound}(A, target) - \text{lower\_bound}(A, target)count=upper_bound(A,target)−lower_bound(A,target)

4) Empty arrays and single-element arrays #

Binary search should handle these without special-casing if your loop condition is correct.

5) Custom orderings (optional but important) #

If the array is sorted descending, the direction flips:

If the array is sorted by a key (e.g., objects sorted by age), compare using that key consistently.

A useful way to stay sane is to think: you need a monotonic predicate (something that switches from false to true once). Lower/upper bound are built exactly on that principle.

Applications and Connections (Where Binary Search Shows Up) #

1) Fast lookup in sorted data #

Binary search is the core primitive behind many “fast membership” tasks when data is stored sorted:

Time complexity is O(log n) per query, with O(1) extra space.

2) Insertion position and maintaining sorted order #

Using lower bound, you can find where to insert an element while maintaining sorted order.

This is common in:

3) Range counting and frequency queries #

If the array is sorted, you can count how many elements equal target or lie in a range [L, R].

4) Binary search on the answer (a bigger idea) #

A powerful pattern is binary searching over a value space when you have a monotonic condition.

Example shape:

Then you can binary search k even if there is no array.

This is used for:

This lesson node focuses on array search, but learning lower/upper bound sets you up for that broader technique.

Worked Examples (3) #

Exact match search (closed interval) with a full trace #

Array A is sorted: A = [1, 4, 6, 9, 13, 18, 21, 27]. Find target = 13 using binary search with [low, high].

  1. Initialize:

    low = 0

    high = 7 (n-1)

    Invariant: if 13 exists, it is in indices [0,7].

  2. Iteration 1:

    mid = low + (high - low)//2 = 0 + (7-0)//2 = 3

    A[mid] = A[3] = 9

    Compare: 9 < 13, so discard indices ≤ 3.

    Update: low = mid + 1 = 4

    Now interval is [4,7].

  3. Iteration 2:

    mid = 4 + (7-4)//2 = 4 + 3//2 = 5

    A[5] = 18

    Compare: 18 > 13, so discard indices ≥ 5.

    Update: high = mid - 1 = 4

    Now interval is [4,4].

  4. Iteration 3:

    mid = 4 + (4-4)//2 = 4

    A[4] = 13

    Compare: 13 == 13 → found.

    Return index 4.

Insight: The invariant makes the updates feel inevitable: once A[mid] is too small, everything left of mid is too small because the array is sorted.

Lower/upper bound to count duplicates #

A = [1, 2, 2, 2, 3, 5, 5, 9]. Compute how many times target = 2 appears using lower_bound and upper_bound (half-open intervals).

  1. Goal:

    count(2) = upper_bound(2) - lower_bound(2)

    where:

    • •lower_bound finds first index with A[i] ≥ 2
    • •upper_bound finds first index with A[i] > 2
  2. Compute lower_bound(2):

    Initialize low=0, high=8 (n)

    While low < high:

    • •mid = 0 + (8-0)//2 = 4, A[4]=3

    A[mid] < 2? No (3 < 2 is false) → high = mid = 4

    • •mid = 0 + (4-0)//2 = 2, A[2]=2

    A[mid] < 2? No (2 < 2 false) → high = 2

    • •mid = 0 + (2-0)//2 = 1, A[1]=2

    A[mid] < 2? No → high = 1

    • •mid = 0 + (1-0)//2 = 0, A[0]=1

    A[mid] < 2? Yes → low = mid + 1 = 1

    Stop (low == high == 1). So lower_bound(2) = 1.

  3. Compute upper_bound(2):

    Initialize low=0, high=8

    While low < high:

    • •mid = 4, A[4]=3

    A[mid] <= 2? No → high=4

    • •mid = 2, A[2]=2

    A[mid] <= 2? Yes → low = mid + 1 = 3

    • •mid = 3, A[3]=2

    A[mid] <= 2? Yes → low = 4

    Stop (low == high == 4). So upper_bound(2) = 4.

  4. Compute count:

    count = upper_bound - lower_bound = 4 - 1 = 3

Insight: Lower/upper bound treat equality differently. That tiny change ( < vs ≤ ) is what turns “find a match” into “find a boundary,” which is often more useful than returning any one index.

Not found case and why the loop ends correctly #

A = [3, 8, 10, 14, 17]. Search for target = 9 using the closed-interval exact-match binary search.

  1. Initialize:

    low=0, high=4

  2. Iteration 1:

    mid = 0 + (4-0)//2 = 2

    A[2]=10

    10 > 9 → high = mid - 1 = 1

    Interval becomes [0,1]

  3. Iteration 2:

    mid = 0 + (1-0)//2 = 0

    A[0]=3

    3 < 9 → low = mid + 1 = 1

    Interval becomes [1,1]

  4. Iteration 3:

    mid = 1 + (1-1)//2 = 1

    A[1]=8

    8 < 9 → low = mid + 1 = 2

    Interval becomes [2,1] (empty because low > high)

  5. Loop ends because low <= high is false.

    Return -1 (not found).

Insight: The empty interval (low > high) is the precise signal that there is no index left where target could be, so “not found” is justified—not just a guess.

Key Takeaways #

Common Mistakes #

Practice #

easy

Implement the closed-interval binary search that returns the index of target in a sorted array A, or -1 if not found. Then trace it on A = [1, 2, 4, 8, 16] with target = 8.

Hint: Use low=0, high=n-1, loop while low <= high, and update low=mid+1 or high=mid-1.

Show solution

Algorithm:

mid=low+(high-low)//2

if A[mid]==target return mid

if A[mid]<target low=mid+1 else high=mid-1

Trace:

A=[1,2,4,8,16], target=8

low=0 high=4 mid=2 A[2]=4<8 → low=3

low=3 high=4 mid=3 A[3]=8==8 → return 3

medium

Given A = [1, 2, 2, 2, 3, 3, 10], find (a) lower_bound(3) and (b) upper_bound(3), and use them to compute how many 3s are in A.

Hint: Lower bound uses condition A[mid] < target; upper bound uses A[mid] <= target. Use a half-open interval [low, high) with high=n.

Show solution

A length n=7.

Count of 3s = 6 - 4 = 2.

hard

You are given a sorted array A (nondecreasing) and a target. Modify binary search to return the index of the FIRST occurrence of target (or -1 if absent). Describe the rule for updating boundaries when you find equality.

Hint: When A[mid] == target, don’t stop immediately; keep searching left while preserving that the first occurrence is still in the interval.

Show solution

One approach (half-open lower_bound style):

Compute i = lower_bound(A, target) (first index with A[i] ≥ target). If i < n and A[i] == target, return i else return -1.

If you implement directly with a closed interval:

Return ans at the end.

Key rule: on equality, shrink toward the side where the first occurrence could be (left), rather than terminating.

Connections #

Quality: A (4.5/5)

← back to treebrowse all →