←Back to Tech Tree
inventorycoverage
Binary Search #
AlgorithmsDifficulty: ★★☆☆☆Depth: 3Unlocks: 0
O(log n) search in sorted array. Divide and conquer.
Interactive Visualization #
⏮◀◀▶▶STEP0.25x1xZOOM
t=0s
Core Concepts #
- -Array is sorted (nondecreasing order).
- -Maintain a search interval [low,high] that, if target exists, always contains its index; stop when interval is empty or target is found.
Key Symbols & Notation #
low (lowest index of current interval)high (highest index of current interval)
Essential Relationships #
- -Compute mid = (low + high) div 2; compare A[mid] to target: if equal -> found; if A[mid] < target -> low = mid + 1; else high = mid - 1 (this halves the interval).
Prerequisites (2) #
Arrays5 atomsBig O Notation6 atoms
Referenced by (1) #
Where this concept shows up in the operating-finance and personal-finance graphs.
From Business (1) #
[IRRBusiness
Bisection for IRR is literally binary search on a continuous domain - bracket the rate where NPV changes sign, halve the interval, repeat. The O(log n) convergence logic and divide-and-conquer structure are identical.](/business/irr/)
Advanced Learning Details
Graph Position #
38
Depth Cost
0
Fan-Out (ROI)
0
Bottleneck Score
3
Chain Length
Cognitive Load #
5
Atomic Elements
30
Total Elements
L1
Percentile Level
L3
Atomic Level
All Concepts (13) #
- Sorted array (total order on elements; array must be ordered ascending or descending for binary search to work)
- Divide-and-conquer application: reduce problem by splitting the search interval in two and discarding one half
- Search interval representation: maintaining low and high indices that bound the current candidate subarray
- Midpoint selection: choosing a middle index of the current interval to test
- Halving property: each comparison reduces the size of the candidate interval by about half
- Termination conditions: when midpoint equals target (found) or when interval becomes empty (not found)
- Loop vs recursion control: implementing repeated halving either iteratively or recursively
- Invariants for correctness: e.g., if target exists in the array then it is always within A[low..high]
- Inclusive vs exclusive bounds variants (e.g., [low, high] vs [low, high) semantics) and their impact on updates
- Off-by-one issues: correct index updates (low = mid+1, high = mid-1, or their variants) to avoid skipping elements
- Integer-midpoint rounding: the need to floor/truncate when computing the middle index
- Overflow-safe midpoint computation: using low + (high - low) // 2 instead of (low + high) // 2 in fixed-width integer contexts
- Handling duplicates: distinguishing finding any occurrence versus finding first/last occurrence requires modified update rules
Teaching Strategy #
Self-serve tutorial - low prerequisites, straightforward concepts.
You have a sorted list of a million items and need to know whether a value is present. Scanning from left to right works—but it wastes the fact that the list is sorted. Binary search is the classic way to turn “sorted order” into a dramatic speedup: from O(n) down to O(log n).
TL;DR:
Binary search repeatedly halves a sorted array’s remaining search interval. Maintain an interval [low, high] that always contains the target’s index if it exists. Check mid; if mid is too small, move low up; if too large, move high down. This takes O(log n) time and O(1) extra space.
What Is Binary Search? #
The problem it solves #
You’re given an array A and a target value. You want to find the index of target (or decide it isn’t present).
- •With an unsorted array, the safest approach is linear search: check each element until you find
target or reach the end. That’s O(n). - •With a sorted array (nondecreasing order), you can do much better by using the ordering information.
Binary search is a divide-and-conquer algorithm for searching in a sorted array. The key idea: each comparison lets you discard half of the remaining candidates.
What “sorted” means here #
Assume the array is sorted in nondecreasing order:
- •For indices i < j, we have A[i] ≤ A[j].
Nondecreasing allows duplicates (e.g., [1, 2, 2, 2, 5]). Binary search still works, but you must be clear on what you want when duplicates exist (any occurrence? first? last?). We’ll cover variants later.
The mental model: shrinking an interval #
Binary search maintains a search interval of indices where the target could still be.
- •
low = lowest index still possible - •
high = highest index still possible
At all times, you try to preserve this loop invariant:
Invariant: If target appears in the array, then its index lies within [low, high].
Each step picks the middle index mid, compares A[mid] to target, and throws away the half that can’t contain the target.
Prerequisite note (to avoid hidden gaps) #
Before you code binary search confidently, you should be comfortable with:
- 1)Integer division / floor: computing
mid using floor behavior (e.g., (low + high) // 2). - 2)Loop invariants: understanding and maintaining “what is always true” about
[low, high] each iteration. - 3)Edge cases: empty arrays, single-element arrays, and duplicate values.
- 4)(Optional but useful) Comparator-based ordering: arrays sorted descending or by a custom key require adjusting comparisons.
These aren’t “extra topics”—they are exactly where most binary search bugs come from.
Why it’s O(log n) #
Each iteration cuts the interval size roughly in half.
If the array length is n:
- •after 1 step: ~n/2 candidates remain
- •after 2 steps: ~n/4
- •after k steps: ~n/2ᵏ
We stop when the interval becomes empty or we find the target. The smallest k such that n/2ᵏ ≤ 1 is k ≥ log₂(n). So the number of iterations is O(log n).
Core Mechanic 1: Maintain the Search Interval [low, high] (Loop Invariant) #
Why the interval matters #
Binary search isn’t just “check middle and move left/right.” It’s a careful promise about what indices are still possible.
If you lose that promise (the invariant), you can:
- •skip over the target
- •loop forever
- •read out of bounds
So we’ll be explicit about the invariant and how each update preserves it.
The standard exact-match version (closed interval) #
We’ll use a closed interval [low, high] inclusive. Initialize:
Loop while the interval is non-empty:
Compute middle:
- •
mid = low + (high - low) // 2
(The low + (high - low) form avoids overflow in languages with fixed-size integers; in Python it’s not necessary, but it’s a good habit.)
Compare:
- •If
A[mid] == target: return mid - •If
A[mid] < target: discard left half including mid → set low = mid + 1 - •If
A[mid] > target: discard right half including mid → set high = mid - 1
If the loop ends, return “not found” (often -1).
Showing the invariant preservation #
Let’s spell out what each case means.
Assume the array is sorted nondecreasing.
Case 1: A[mid] < target #
Because the array is sorted, for every i ≤ mid, we have A[i] ≤ A[mid] < target.
So no index ≤ mid can hold target.
Therefore, if target exists, it must be in indices mid + 1 ... high. Setting low = mid + 1 preserves:
If target exists, it’s still in [low, high].
Case 2: A[mid] > target #
Similarly, for every i ≥ mid, we have A[i] ≥ A[mid] > target.
So no index ≥ mid can hold target.
Therefore, if target exists, it must be in indices low ... mid - 1. Setting high = mid - 1 preserves the invariant.
Case 3: A[mid] == target #
We’re done.
Termination: why it must end #
Each iteration strictly shrinks the interval:
- •
low increases (to mid + 1) or - •
high decreases (to mid - 1) or - •we return
Since low and high are integers and the interval length can’t shrink forever without becoming empty, the loop terminates.
Concrete pseudocode #
binary_search(A, target):
low = 0
high = len(A) - 1
while low <= high:
mid = low + (high - low) // 2
if A[mid] == target:
return mid
else if A[mid] < target:
low = mid + 1
else:
high = mid - 1
return -1
Breathing room: trace one run #
Suppose A = [2, 5, 7, 12, 19, 23, 31], target = 19.
- •Start: low=0, high=6
- •mid=3 → A[3]=12 < 19 → low=4
- •Now low=4, high=6
- •mid=5 → A[5]=23 > 19 → high=4
- •Now low=4, high=4
- •mid=4 → A[4]=19 == 19 → found at 4
Notice how [low, high] always contained index 4 until we found it.
Core Mechanic 2: Getting Boundaries Right (mid, duplicates, and interval conventions) #
Binary search is famous for off-by-one errors. Most bugs come from mixing interval conventions or mishandling duplicates.
1) Computing mid safely and correctly #
The classic formula is:
In many languages, low + high can overflow if indices are large. A safer version is:
- •
mid = low + (high - low) // 2
Both choose the “lower middle” when there are two middles.
2) Closed vs half-open intervals #
There are two common interval conventions:
| Convention | Interval | Loop condition | Update when going right | Update when going left |
|---|
| Closed | [low, high] | low <= high | low = mid + 1 | high = mid - 1 |
| Half-open | [low, high) | low < high | low = mid + 1 or low = mid (depends) | high = mid |
For beginners, closed intervals are often easier to reason about for exact-match search.
The most important rule is: pick one convention and stay consistent.
3) Duplicates: “any occurrence” vs “first/last occurrence” #
If duplicates exist, the exact-match binary search above can return any index where A[mid] == target.
Sometimes you need a stable boundary, like:
- •lower bound: first index i where A[i] ≥ target
- •upper bound: first index i where A[i] > target
These are incredibly useful in practice (counting occurrences, insertion positions, range queries).
Lower bound (first position with A[i] ≥ target) #
A common pattern uses a half-open interval [low, high):
- •Initialize:
low = 0, high = n - •While
low < high: - •
mid = low + (high - low) // 2 - •If
A[mid] < target, discard left including mid → low = mid + 1 - •Else, keep mid (could be answer) →
high = mid
At the end, low is the smallest index with A[low] ≥ target (or n if none).
Why does high = mid (not mid - 1)? Because in half-open intervals, high is exclusive, and we want to keep mid as a candidate.
Upper bound (first position with A[i] > target) #
Same structure, just change the comparison:
- •If
A[mid] <= target, move low = mid + 1 - •Else,
high = mid
Then the count of occurrences of target is:
count=upper_bound(A,target)−lower_bound(A,target)\text{count} = \text{upper\_bound}(A, target) - \text{lower\_bound}(A, target)count=upper_bound(A,target)−lower_bound(A,target)
4) Empty arrays and single-element arrays #
Binary search should handle these without special-casing if your loop condition is correct.
- •Empty: n = 0 →
high = -1 → low <= high is false immediately → return -1. - •Single element: low=0, high=0 → one iteration checks that element.
5) Custom orderings (optional but important) #
If the array is sorted descending, the direction flips:
- •If
A[mid] < target, you must go left (because values decrease to the right).
If the array is sorted by a key (e.g., objects sorted by age), compare using that key consistently.
A useful way to stay sane is to think: you need a monotonic predicate (something that switches from false to true once). Lower/upper bound are built exactly on that principle.
Applications and Connections (Where Binary Search Shows Up) #
1) Fast lookup in sorted data #
Binary search is the core primitive behind many “fast membership” tasks when data is stored sorted:
- •searching in a sorted array
- •searching in a sorted list of timestamps
- •dictionary-like behavior when updates are rare but queries are frequent
Time complexity is O(log n) per query, with O(1) extra space.
2) Insertion position and maintaining sorted order #
Using lower bound, you can find where to insert an element while maintaining sorted order.
- •Find index i = lower_bound(A, x)
- •Insert at i (in an array, insertion itself is O(n) due to shifting, but the search for position is O(log n))
This is common in:
- •building sorted lists incrementally
- •coordinate compression (collect values, sort, then binary search indices)
3) Range counting and frequency queries #
If the array is sorted, you can count how many elements equal target or lie in a range [L, R].
- •occurrences of x:
ub(x) - lb(x) - •count in [L, R]:
lb(R+ε) - lb(L) (conceptually), or more directly ub(R) - lb(L)
4) Binary search on the answer (a bigger idea) #
A powerful pattern is binary searching over a value space when you have a monotonic condition.
Example shape:
- •You want the smallest value
k such that feasible(k) is true. - •If
feasible(k) is true, then feasible(k+1) is also true (monotonic).
Then you can binary search k even if there is no array.
This is used for:
- •minimum capacity / maximum speed problems
- •scheduling and allocation
- •optimization with monotonic constraints
This lesson node focuses on array search, but learning lower/upper bound sets you up for that broader technique.
Worked Examples (3) #
Exact match search (closed interval) with a full trace #
Array A is sorted: A = [1, 4, 6, 9, 13, 18, 21, 27]. Find target = 13 using binary search with [low, high].
Initialize:
low = 0
high = 7 (n-1)
Invariant: if 13 exists, it is in indices [0,7].
Iteration 1:
mid = low + (high - low)//2 = 0 + (7-0)//2 = 3
A[mid] = A[3] = 9
Compare: 9 < 13, so discard indices ≤ 3.
Update: low = mid + 1 = 4
Now interval is [4,7].
Iteration 2:
mid = 4 + (7-4)//2 = 4 + 3//2 = 5
A[5] = 18
Compare: 18 > 13, so discard indices ≥ 5.
Update: high = mid - 1 = 4
Now interval is [4,4].
Iteration 3:
mid = 4 + (4-4)//2 = 4
A[4] = 13
Compare: 13 == 13 → found.
Return index 4.
Insight: The invariant makes the updates feel inevitable: once A[mid] is too small, everything left of mid is too small because the array is sorted.
Lower/upper bound to count duplicates #
A = [1, 2, 2, 2, 3, 5, 5, 9]. Compute how many times target = 2 appears using lower_bound and upper_bound (half-open intervals).
Goal:
count(2) = upper_bound(2) - lower_bound(2)
where:
- •lower_bound finds first index with A[i] ≥ 2
- •upper_bound finds first index with A[i] > 2
Compute lower_bound(2):
Initialize low=0, high=8 (n)
While low < high:
- •mid = 0 + (8-0)//2 = 4, A[4]=3
A[mid] < 2? No (3 < 2 is false) → high = mid = 4
- •mid = 0 + (4-0)//2 = 2, A[2]=2
A[mid] < 2? No (2 < 2 false) → high = 2
- •mid = 0 + (2-0)//2 = 1, A[1]=2
A[mid] < 2? No → high = 1
- •mid = 0 + (1-0)//2 = 0, A[0]=1
A[mid] < 2? Yes → low = mid + 1 = 1
Stop (low == high == 1). So lower_bound(2) = 1.
Compute upper_bound(2):
Initialize low=0, high=8
While low < high:
A[mid] <= 2? No → high=4
A[mid] <= 2? Yes → low = mid + 1 = 3
A[mid] <= 2? Yes → low = 4
Stop (low == high == 4). So upper_bound(2) = 4.
Compute count:
count = upper_bound - lower_bound = 4 - 1 = 3
Insight: Lower/upper bound treat equality differently. That tiny change ( < vs ≤ ) is what turns “find a match” into “find a boundary,” which is often more useful than returning any one index.
Not found case and why the loop ends correctly #
A = [3, 8, 10, 14, 17]. Search for target = 9 using the closed-interval exact-match binary search.
Initialize:
low=0, high=4
Iteration 1:
mid = 0 + (4-0)//2 = 2
A[2]=10
10 > 9 → high = mid - 1 = 1
Interval becomes [0,1]
Iteration 2:
mid = 0 + (1-0)//2 = 0
A[0]=3
3 < 9 → low = mid + 1 = 1
Interval becomes [1,1]
Iteration 3:
mid = 1 + (1-1)//2 = 1
A[1]=8
8 < 9 → low = mid + 1 = 2
Interval becomes [2,1] (empty because low > high)
Loop ends because low <= high is false.
Return -1 (not found).
Insight: The empty interval (low > high) is the precise signal that there is no index left where target could be, so “not found” is justified—not just a guess.
Key Takeaways #
✓
Binary search requires a sorted array (nondecreasing is the usual assumption).
✓
Maintain a search interval: with a closed interval, the loop invariant is “if target exists, it’s in [low, high].”
✓
Use mid = low + (high - low) // 2 to avoid overflow and to get correct integer rounding.
✓
Closed interval exact-match pattern: while low <= high, then update with low = mid + 1 or high = mid - 1.
✓
Binary search runs in O(log n) time because each comparison discards about half the remaining candidates.
✓
With duplicates, exact-match may return any occurrence; use lower_bound / upper_bound to get first/last positions and counts.
✓
Be consistent about interval conventions ([low, high] vs [low, high)) to avoid off-by-one errors.
Common Mistakes #
✗
Using the wrong loop condition for your interval convention (e.g., using low < high with updates meant for [low, high]).
✗
Updating to low = mid or high = mid in a closed-interval exact-match search, which can cause infinite loops when low == mid or high == mid.
✗
Computing mid = (low + high) // 2 in a language where low + high can overflow, leading to negative or incorrect mid.
✗
For duplicates, expecting the exact-match search to return the first (or last) occurrence without implementing the boundary-search variant.
Practice #
easy
Implement the closed-interval binary search that returns the index of target in a sorted array A, or -1 if not found. Then trace it on A = [1, 2, 4, 8, 16] with target = 8.
Hint: Use low=0, high=n-1, loop while low <= high, and update low=mid+1 or high=mid-1.
Show solution
Algorithm:
- •low=0, high=n-1
- •while low<=high:
mid=low+(high-low)//2
if A[mid]==target return mid
if A[mid]<target low=mid+1 else high=mid-1
Trace:
A=[1,2,4,8,16], target=8
low=0 high=4 mid=2 A[2]=4<8 → low=3
low=3 high=4 mid=3 A[3]=8==8 → return 3
medium
Given A = [1, 2, 2, 2, 3, 3, 10], find (a) lower_bound(3) and (b) upper_bound(3), and use them to compute how many 3s are in A.
Hint: Lower bound uses condition A[mid] < target; upper bound uses A[mid] <= target. Use a half-open interval [low, high) with high=n.
Show solution
A length n=7.
- •lower_bound(3) returns first index with A[i] ≥ 3. The first 3 is at index 4, so lower_bound(3)=4.
- •upper_bound(3) returns first index with A[i] > 3. The first element >3 is 10 at index 6, so upper_bound(3)=6.
Count of 3s = 6 - 4 = 2.
hard
You are given a sorted array A (nondecreasing) and a target. Modify binary search to return the index of the FIRST occurrence of target (or -1 if absent). Describe the rule for updating boundaries when you find equality.
Hint: When A[mid] == target, don’t stop immediately; keep searching left while preserving that the first occurrence is still in the interval.
Show solution
One approach (half-open lower_bound style):
Compute i = lower_bound(A, target) (first index with A[i] ≥ target). If i < n and A[i] == target, return i else return -1.
If you implement directly with a closed interval:
- •Keep an
ans = -1. - •When A[mid] == target: set ans=mid and move left: high = mid - 1.
- •When A[mid] < target: low = mid + 1.
- •When A[mid] > target: high = mid - 1.
Return ans at the end.
Key rule: on equality, shrink toward the side where the first occurrence could be (left), rather than terminating.
Connections #
Quality: A (4.5/5)
← back to treebrowse all →