Binary Search
Key Points
- •Binary search quickly finds targets or boundaries in sorted or monotonic data by halving the search interval each step.
- •Use lowe to find the first index with and uppe to find the first index with .
- •Think in terms of finding the first true in a nondecreasing predicate p(x) that flips from false to true at some boundary.
- •Binary search on answer is powerful when checking feasibility is easier than constructing the optimal answer.
- •Handle integer overflow with mid = l + (r - l) / 2 and watch off-by-one errors by fixing interval conventions.
- •For doubles, stop when the interval width is below the error shrinks exponentially with iterations.
- •Standard library functions (std::lowe, std::uppe, std::binar) are safe and fast.
- •Typical complexity is O(log n) per query, or O(n log R) for answer-search with an O(n) check over range R.
Prerequisites
- →Ordering and comparisons — Binary search depends on a total order to decide which half to discard.
- →Arrays and indexing — Understanding zero-based indices and bounds is crucial for interval management.
- →Mathematical monotonicity — Binary search on answer requires a predicate that transitions once and only once.
- →Time complexity (Big-O notation) — To analyze why halving leads to logarithmic time and to estimate iterations.
- →Integer and floating-point arithmetic — Prevents overflow and manages precision when searching over reals.
- →Sorting algorithms — Data must be sorted before applying binary search on arrays.
Detailed Explanation
Tap terms for definitions01Overview
Binary search is a fundamental algorithm to locate a target or a transition point in a sorted or monotonic structure. The core idea is to maintain an interval [L, R] (or a variant like [L, R), (L, R], or (L, R)) that is guaranteed to contain the answer, and then repeatedly cut this interval in half by comparing the middle element (or testing a predicate at the midpoint). Because each step discards half of the remaining possibilities, the number of steps grows logarithmically with problem size. This makes binary search especially effective for large datasets or wide numeric ranges. In practical programming, binary search appears in two main forms. The first is searching sorted arrays: finding whether an element exists, counting duplicates, or locating boundaries such as the first element not less than a given value (lower_bound) or the first greater than a value (upper_bound). The second form is binary search on answer: we do not search a container, but a numeric domain (integers or reals) to find the smallest feasible value (or largest) with respect to a monotone feasibility predicate. C++ provides robust standard library tools—std::lower_bound, std::upper_bound, and std::binary_search—to implement searches safely. When searching over numeric ranges, developers typically write a custom loop with careful boundary updates and overflow-aware midpoint computation. The algorithm’s reliability hinges on maintaining a correct invariant and choosing consistent interval semantics.
02Intuition & Analogies
Imagine looking for a word in a huge dictionary. Rather than flipping pages one by one, you open the book in the middle. If your word comes alphabetically after the mid page, you ignore the left half and continue with the right half; otherwise, you focus on the left half. Each peek halves your uncertainty. This is binary search in a nutshell: narrowing down possibilities by repeatedly cutting the search space in half. Another analogy: a broken light string where bulbs are numbered from 1 to n, and bulbs from some point onward stop working. You can test any position i to know whether bulbs up to i are okay. If bulbs are working up to i but fail beyond, the first broken bulb is somewhere to the right; if they already fail at i, the first broken bulb is at i or to the left. By picking the middle bulb each time, you find the exact boundary efficiently. This captures the “first true/first false” framing: a predicate p(i) that transitions once—false, false, …, then true, true, …—and you want the switch point. A third picture: tuning a radio. Suppose turning the dial right increases signal clarity monotonically. You want the smallest dial position that gives “acceptable” clarity. If the mid setting is unacceptable, you must go right; if acceptable, you try to improve (go left) without dropping below the threshold. Again, you’re halving the dial range repeatedly. The same happens for real numbers: you shrink a continuous interval until it’s tiny enough to approximate the desired value within a tolerance ε.
03Formal Definition
04When to Use
Use binary search whenever you can order candidates and prove a one-way transition in a predicate:
- Sorted containers: presence checks, counting occurrences, and range queries. Examples: find the first timestamp ≥ t, locate insertion positions to keep a list sorted, or compute frequency via upper_bound - lower_bound.
- Monotone feasibility (“binary search on answer”): choose the minimal capacity/threshold where a feasibility check succeeds. Examples: minimum ship capacity to deliver packages within D days; maximum minimum distance to place k routers; smallest processing speed to finish tasks within T time.
- Real-valued optimization with monotone acceptability: maximize cut length of ropes while yielding at least k pieces; find minimal radius to cover points with m centers; compute square/cube roots when a monotone function crosses a target.
- Batch queries: pre-sort once and answer many queries in O(log n) each. This is common in competitive programming for CF 800–1400 problems. Avoid binary search when the predicate is not monotone (e.g., values go false→true→false); in such unimodal cases, consider ternary search or other techniques.
⚠️Common Mistakes
- Off-by-one and interval confusion: Mixing [L, R], [L, R), or (L, R) without consistent updates causes infinite loops or missed answers. Pick one convention and stick to it. For example, with [L, R), advance L = mid + 1 only when you are certain the answer is strictly to the right.
- Midpoint overflow: Computing mid = (L + R) / 2 can overflow 32-bit integers. Use mid = L + (R - L) / 2 and 64-bit types when ranges are large.
- Wrong direction on equality: For lower_bound, when a[mid] ≥ x you must move R = mid (not L), otherwise you skip the first valid position. For upper_bound, when a[mid] ≤ x you must move L = mid + 1.
- Predicate not monotone: If your feasibility check sometimes flips back to false as you increase the candidate, binary search may return the wrong boundary. Prove monotonicity before coding.
- Infinite loops on integers: With [L, R) and mid = (L + R) / 2, ensure at least one bound changes. If mid == L, you must update R = mid in the ‘true’ branch and L = mid + 1 in the ‘false’ branch.
- Precision issues on doubles: Don’t compare doubles for exact equality. Stop when R - L < ε, or iterate a fixed number of times. Use adequate ε, and be mindful of accumulated floating-point error.
- Forgetting to sort: Searching an unsorted array with binary search yields garbage results. Always ensure sorted order consistent with your comparisons.
Key Formulas
Binary Search Time
Explanation: Each iteration halves the search space, so it takes logarithmic steps in the number of elements n. This means doubling the input size adds only one extra step.
Overflow-Safe Midpoint
Explanation: This computes the middle without summing L and R directly, preventing integer overflow when L and R are large.
Lower Bound Definition
Explanation: The lower bound index is the first position where the array value is at least x. It’s the standard way to find the insertion point for x while keeping order.
Upper Bound Definition
Explanation: The upper bound index is the first position strictly greater than x. The count of x equals ub(x) − lb(x).
Interval Shrinkage
Explanation: After k iterations, the interval length is reduced by a factor of 2^k. This quantifies how quickly the uncertainty shrinks.
Iterations for Precision
Explanation: To reduce an initial interval to within error ε in real-valued search, you need at least this many iterations.
Answer Search Complexity
Explanation: If the feasibility check per candidate costs O(n) and the numeric range size is R, total time is O(n log R). This is typical for capacity or distance optimization problems.
First-True Boundary
Explanation: Binary search requires a single boundary where a nondecreasing predicate transitions from false to true. The algorithm finds b efficiently.
Counting Duplicates
Explanation: The number of occurrences of x in a sorted array is the difference between the upper and lower bound indices.
Batch Queries
Explanation: Answering q independent searches on a sorted container takes logarithmic time per query, totaling O(q log n).
Complexity Analysis
Code Examples
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Returns the first index i such that a[i] >= x; if not found, returns n 5 int lower_bound_index(const vector<int>& a, int x) { 6 int n = (int)a.size(); 7 int L = 0, R = n; // search in [L, R) 8 while (L < R) { 9 int mid = L + (R - L) / 2; // overflow-safe 10 if (a[mid] >= x) { 11 R = mid; // keep mid; answer is in [L, mid] 12 } else { 13 L = mid + 1; // answer is in (mid, R) 14 } 15 } 16 return L; // L == R: first >= x or n if none 17 } 18 19 // Returns the first index i such that a[i] > x; if not found, returns n 20 int upper_bound_index(const vector<int>& a, int x) { 21 int n = (int)a.size(); 22 int L = 0, R = n; // search in [L, R) 23 while (L < R) { 24 int mid = L + (R - L) / 2; 25 if (a[mid] > x) { 26 R = mid; // shrink right; mid could be the first > x 27 } else { 28 L = mid + 1; // need strictly larger, go right 29 } 30 } 31 return L; 32 } 33 34 int main() { 35 ios::sync_with_stdio(false); 36 cin.tie(nullptr); 37 38 vector<int> a = {1, 2, 2, 2, 4, 5}; 39 for (int x : {0, 2, 3, 6}) { 40 int lb = lower_bound_index(a, x); 41 int ub = upper_bound_index(a, x); 42 int cnt = ub - lb; 43 cout << "x=" << x 44 << " lb=" << lb 45 << " ub=" << ub 46 << " count=" << cnt << "\n"; 47 } 48 return 0; 49 } 50
We implement the classic boundary searches using a half-open interval [L, R). lower_bound finds the first index with a[i] ≥ x, while upper_bound finds the first index with a[i] > x. The difference ub − lb counts occurrences of x. Using [L, R) avoids special cases when L and R get adjacent and guarantees progress each loop.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Returns floor(sqrt(n)) for n >= 0 using binary search on answer. 5 // Predicate: p(x) = (x*x <= n), which is nondecreasing in x until boundary. 6 long long isqrt_floor(long long n) { 7 long long L = 0, R = min(n, (long long)2e9); // search [L, R] 8 long long ans = 0; 9 while (L <= R) { // closed interval 10 long long mid = L + (R - L) / 2; 11 // Avoid overflow: mid*mid <= n <=> mid <= n / mid (if mid > 0) 12 bool ok = (mid == 0) ? true : (mid <= n / mid); 13 if (ok) { 14 ans = mid; // mid is feasible, try to go higher 15 L = mid + 1; 16 } else { 17 R = mid - 1; // infeasible, go lower 18 } 19 } 20 return ans; 21 } 22 23 int main() { 24 ios::sync_with_stdio(false); 25 cin.tie(nullptr); 26 27 vector<long long> tests = {0, 1, 2, 3, 4, 8, 9, 10, (long long)1e12}; 28 for (auto n : tests) { 29 cout << "n=" << n << " floor_sqrt=" << isqrt_floor(n) << "\n"; 30 } 31 return 0; 32 } 33
We search the integer domain for the largest x such that x^2 ≤ n (equivalently, the first false from the right). To avoid overflow in mid*mid, we compare mid ≤ n / mid. The closed-interval [L, R] pattern with L <= R and updating ans on feasible mid is a standard approach for maximum-feasible searches.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Given stall positions and k cows, place cows to maximize the minimum distance between any two. 5 // We binary search the answer d and check feasibility greedily. 6 7 bool canPlace(const vector<int>& pos, int k, int d) { 8 int count = 1; // place first cow at pos[0] 9 int last = pos[0]; 10 for (size_t i = 1; i < pos.size(); ++i) { 11 if (pos[i] - last >= d) { 12 ++count; 13 last = pos[i]; 14 if (count >= k) return true; // early exit 15 } 16 } 17 return false; 18 } 19 20 int maximizeMinDistance(vector<int> pos, int k) { 21 sort(pos.begin(), pos.end()); 22 int n = (int)pos.size(); 23 int L = 0; // always feasible (distance 0) 24 int R = pos.back() - pos.front(); // maximum possible distance 25 int best = 0; 26 while (L <= R) { // closed interval for max-feasible 27 int mid = L + (R - L) / 2; // candidate minimum distance 28 if (canPlace(pos, k, mid)) { 29 best = mid; // feasible; try larger distance 30 L = mid + 1; 31 } else { 32 R = mid - 1; // infeasible; try smaller distance 33 } 34 } 35 return best; 36 } 37 38 int main() { 39 ios::sync_with_stdio(false); 40 cin.tie(nullptr); 41 42 vector<int> stalls = {1, 2, 8, 12, 17}; 43 int k = 3; 44 cout << "max min distance = " << maximizeMinDistance(stalls, k) << "\n"; // Expected 6 (1,8,17) 45 return 0; 46 } 47
We search over the answer (minimum pairwise distance d). The predicate canPlace is monotone: if you can place cows with distance d, then you can also place them with any smaller distance. We greedily place cows left-to-right and count placements; the binary search finds the maximum feasible d.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Given rope lengths, find the maximum piece length L so that we can cut at least k pieces. 5 // Real-valued binary search with tolerance epsilon. 6 7 double maxPieceLength(const vector<double>& a, long long k, double eps = 1e-7) { 8 double L = 0.0, R = 0.0; 9 for (double x : a) R = max(R, x); 10 // Invariant: answer in [L, R] 11 for (int it = 0; it < 200 && R - L > eps; ++it) { // fixed iterations + epsilon guard 12 double mid = (L + R) / 2.0; 13 if (mid == 0.0) { L = mid; continue; } 14 long long pieces = 0; 15 for (double x : a) pieces += (long long)(x / mid); 16 if (pieces >= k) { 17 L = mid; // feasible: we can try longer pieces 18 } else { 19 R = mid; // infeasible: need shorter pieces 20 } 21 } 22 return L; // within eps of the true optimum 23 } 24 25 int main() { 26 ios::sync_with_stdio(false); 27 cin.tie(nullptr); 28 29 vector<double> ropes = {4.0, 7.5, 9.2}; 30 long long k = 7; 31 cout.setf(std::ios::fixed); cout << setprecision(6); 32 cout << "max piece length ≈ " << maxPieceLength(ropes, k) << "\n"; 33 return 0; 34 } 35
We search over real lengths to maximize the piece size while ensuring we can cut at least k pieces. The monotone predicate is “pieces(mid) ≥ k”, which becomes false as mid grows. We iterate until the interval is smaller than ε, returning an approximation with bounded error. Using a fixed iteration cap helps avoid edge-case non-convergence.