⚙️ Algorithm · Intermediate

Binary Search

Key Points

  • Binary search quickly finds targets or boundaries in sorted or monotonic data by halving the search interval each step.
  • Use lower_bound to find the first index with a[i] ≥ x and upper_bound to find the first index with a[i] > x.
  • Think in terms of finding the first true in a nondecreasing predicate p(x) that flips from false to true at some boundary.
  • Binary search on answer is powerful when checking feasibility is easier than constructing the optimal answer.
  • Handle integer overflow with mid = l + (r - l) / 2 and watch off-by-one errors by fixing interval conventions.
  • For doubles, stop when the interval width is below ε; the error shrinks exponentially with iterations.
  • Standard library functions (std::lower_bound, std::upper_bound, std::binary_search) are safe and fast.
  • Typical complexity is O(log n) per query, or O(n log R) for answer-search with an O(n) check over range R.

Prerequisites

  • Ordering and comparisons: Binary search depends on a total order to decide which half to discard.
  • Arrays and indexing: Understanding zero-based indices and bounds is crucial for interval management.
  • Mathematical monotonicity: Binary search on answer requires a predicate that transitions once and only once.
  • Time complexity (Big-O notation): To analyze why halving leads to logarithmic time and to estimate iterations.
  • Integer and floating-point arithmetic: Prevents overflow and manages precision when searching over reals.
  • Sorting algorithms: Data must be sorted before applying binary search on arrays.

Detailed Explanation

01 Overview

Binary search is a fundamental algorithm to locate a target or a transition point in a sorted or monotonic structure. The core idea is to maintain an interval [L, R] (or a variant like [L, R), (L, R], or (L, R)) that is guaranteed to contain the answer, and then repeatedly cut this interval in half by comparing the middle element (or testing a predicate at the midpoint). Because each step discards half of the remaining possibilities, the number of steps grows logarithmically with problem size. This makes binary search especially effective for large datasets or wide numeric ranges.

In practical programming, binary search appears in two main forms. The first is searching sorted arrays: finding whether an element exists, counting duplicates, or locating boundaries such as the first element not less than a given value (lower_bound) or the first greater than a value (upper_bound). The second form is binary search on answer: we do not search a container, but a numeric domain (integers or reals) to find the smallest feasible value (or largest) with respect to a monotone feasibility predicate.

C++ provides robust standard library tools—std::lower_bound, std::upper_bound, and std::binary_search—to implement searches safely. When searching over numeric ranges, developers typically write a custom loop with careful boundary updates and overflow-aware midpoint computation. The algorithm’s reliability hinges on maintaining a correct invariant and choosing consistent interval semantics.
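As a quick, self-contained illustration of these standard library tools, the sketch below calls std::lower_bound, std::upper_bound, and std::binary_search on a sorted vector; the sample values are made up for demonstration.

#include <bits/stdc++.h>
using namespace std;

int main() {
    // Sorted input is a precondition for all three algorithms.
    vector<int> a = {1, 2, 2, 2, 4, 5};
    int x = 2;

    auto lb = lower_bound(a.begin(), a.end(), x); // iterator to first element >= x
    auto ub = upper_bound(a.begin(), a.end(), x); // iterator to first element > x

    cout << "first index with a[i] >= " << x << ": " << (lb - a.begin()) << "\n"; // 1
    cout << "first index with a[i] >  " << x << ": " << (ub - a.begin()) << "\n"; // 4
    cout << "count of " << x << ": " << (ub - lb) << "\n";                        // 3
    cout << "present? " << boolalpha << binary_search(a.begin(), a.end(), x) << "\n"; // true
    return 0;
}

The STL returns iterators; subtracting a.begin() converts them to zero-based indices, and ub - lb counts occurrences.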

02 Intuition & Analogies

Imagine looking for a word in a huge dictionary. Rather than flipping pages one by one, you open the book in the middle. If your word comes alphabetically after the mid page, you ignore the left half and continue with the right half; otherwise, you focus on the left half. Each peek halves your uncertainty. This is binary search in a nutshell: narrowing down possibilities by repeatedly cutting the search space in half.

Another analogy: a broken light string where bulbs are numbered from 1 to n, and bulbs from some point onward stop working. You can test any position i to know whether bulbs up to i are okay. If bulbs are working up to i but fail beyond, the first broken bulb is somewhere to the right; if they already fail at i, the first broken bulb is at i or to the left. By picking the middle bulb each time, you find the exact boundary efficiently. This captures the “first true/first false” framing: a predicate p(i) that transitions once—false, false, …, then true, true, …—and you want the switch point.

A third picture: tuning a radio. Suppose turning the dial right increases signal clarity monotonically. You want the smallest dial position that gives “acceptable” clarity. If the mid setting is unacceptable, you must go right; if acceptable, you try to improve (go left) without dropping below the threshold. Again, you’re halving the dial range repeatedly. The same happens for real numbers: you shrink a continuous interval until it’s tiny enough to approximate the desired value within a tolerance ε.

03 Formal Definition

Let S be a totally ordered set (e.g., sorted array indices, integers in [L, R], or reals in [L, R]). Let p: S → {false, true} be a nondecreasing predicate: for all x ≤ y, p(x) implies p(y). The binary search problem is to find the boundary point b such that p(x) = false for all x < b and p(x) = true for all x ≥ b (the “first true”), or analogously the last false. If p is nonincreasing, the roles invert.

In arrays, consider a sorted array a_0 ≤ a_1 ≤ … ≤ a_{n−1}. For a query value x, lower_bound(x) is the smallest index i such that a_i ≥ x, i.e., i = min{i | a_i ≥ x}, if it exists; upper_bound(x) is the smallest index i such that a_i > x. These are boundary-finding instances where p(i) := (a_i ≥ x) is nondecreasing in i. Classic membership testing uses binary search to check whether a_j = x for some j by inspecting midpoints and discarding halves based on comparisons.

On numeric domains, one maintains an invariant interval I_k whose width shrinks by about half each iteration: I_{k+1} ⊂ I_k and |I_{k+1}| ≈ |I_k|/2. The algorithm terminates when I_k is empty for integers (L > R) or sufficiently small for reals (|I_k| < ε), returning the respective boundary or an approximation thereof under the chosen interval semantics.
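To make the first-true boundary concrete, here is a minimal generic sketch over an integer range; the helper name firstTrue and the sample predicate are illustrative, not part of the definition above.

#include <bits/stdc++.h>
using namespace std;

// Smallest x in [lo, hi] with p(x) == true, or hi + 1 if p is false everywhere.
// Requires p to be nondecreasing: once it turns true, it stays true.
long long firstTrue(long long lo, long long hi, function<bool(long long)> p) {
    long long L = lo, R = hi + 1;        // search in [L, R); R = hi + 1 encodes "no true value"
    while (L < R) {
        long long mid = L + (R - L) / 2; // overflow-safe midpoint
        if (p(mid)) R = mid;             // mid is true: boundary is at mid or to its left
        else        L = mid + 1;         // mid is false: boundary is strictly to the right
    }
    return L;
}

int main() {
    // Illustrative predicate: the first x with x * x >= 50 is 8.
    cout << firstTrue(0, 100, [](long long x) { return x * x >= 50; }) << "\n"; // 8
    return 0;
}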

04 When to Use

Use binary search whenever you can order candidates and prove a one-way transition in a predicate:

  • Sorted containers: presence checks, counting occurrences, and range queries. Examples: find the first timestamp ≥ t, locate insertion positions to keep a list sorted, or compute frequency via upper_bound - lower_bound.
  • Monotone feasibility (“binary search on answer”): choose the minimal capacity/threshold where a feasibility check succeeds. Examples: minimum ship capacity to deliver packages within D days; maximum minimum distance to place k routers; smallest processing speed to finish tasks within T time (a sketch of this pattern follows the list).
  • Real-valued optimization with monotone acceptability: maximize cut length of ropes while yielding at least k pieces; find minimal radius to cover points with m centers; compute square/cube roots when a monotone function crosses a target.
  • Batch queries: pre-sort once and answer many queries in O(log n) each. This is common in competitive programming for CF 800–1400 problems.

Avoid binary search when the predicate is not monotone (e.g., values go false→true→false); in such unimodal cases, consider ternary search or other techniques.
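As a sketch of the binary-search-on-answer pattern for the minimum ship capacity example in the list above (the package weights, day limit, and helper names are illustrative assumptions):

#include <bits/stdc++.h>
using namespace std;

// Can all packages be shipped, in order, within 'days' days using capacity 'cap'?
// The check is monotone: a larger capacity never needs more days.
bool canShip(const vector<int>& w, int days, long long cap) {
    long long used = 1, load = 0;
    for (int x : w) {
        if (x > cap) return false;                // one package alone exceeds the capacity
        if (load + x > cap) { ++used; load = 0; } // start a new day
        load += x;
    }
    return used <= days;
}

// Smallest capacity that works: first true of a nondecreasing feasibility predicate.
long long minShipCapacity(const vector<int>& w, int days) {
    long long L = 1, R = accumulate(w.begin(), w.end(), 0LL); // answer lies in [L, R]
    while (L < R) {
        long long mid = L + (R - L) / 2;
        if (canShip(w, days, mid)) R = mid; // feasible: try a smaller capacity
        else                       L = mid + 1;
    }
    return L;
}

int main() {
    vector<int> w = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    cout << minShipCapacity(w, 5) << "\n"; // 15
    return 0;
}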

⚠️ Common Mistakes

  • Off-by-one and interval confusion: Mixing [L, R], [L, R), or (L, R) without consistent updates causes infinite loops or missed answers. Pick one convention and stick to it. For example, with [L, R), advance L = mid + 1 only when you are certain the answer is strictly to the right.
  • Midpoint overflow: Computing mid = (L + R) / 2 can overflow 32-bit integers. Use mid = L + (R - L) / 2 and 64-bit types when ranges are large.
  • Wrong direction on equality: For lower_bound, when a[mid] ≥ x you must move R = mid (not L), otherwise you skip the first valid position. For upper_bound, when a[mid] ≤ x you must move L = mid + 1.
  • Predicate not monotone: If your feasibility check sometimes flips back to false as you increase the candidate, binary search may return the wrong boundary. Prove monotonicity before coding.
  • Infinite loops on integers: The midpoint mid = L + (R - L) / 2 rounds down, so an update of the form L = mid can stall when R == L + 1 (mid stays equal to L). Ensure at least one bound changes every iteration: with the first-true convention, set R = mid in the ‘true’ branch and L = mid + 1 in the ‘false’ branch; a last-true variant that instead rounds the midpoint up is sketched after this list.
  • Precision issues on doubles: Don’t compare doubles for exact equality. Stop when R - L < ε, or iterate a fixed number of times. Use adequate ε, and be mindful of accumulated floating-point error.
  • Forgetting to sort: Searching an unsorted array with binary search yields garbage results. Always ensure sorted order consistent with your comparisons.
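When the natural update really is L = mid (a “last true” search, as in maximum-feasible problems), one standard way to avoid the infinite-loop pitfall above is to round the midpoint up. A minimal sketch, with an illustrative predicate:

#include <bits/stdc++.h>
using namespace std;

// Largest x in [lo, hi] with p(x) == true; assumes p(lo) is true and p is
// true-then-false (it never switches back to true as x grows).
long long lastTrue(long long lo, long long hi, function<bool(long long)> p) {
    long long L = lo, R = hi;                // closed interval [L, R]
    while (L < R) {
        long long mid = L + (R - L + 1) / 2; // round UP so L = mid always makes progress
        if (p(mid)) L = mid;                 // mid still true: answer is mid or to the right
        else        R = mid - 1;             // mid false: answer is strictly to the left
    }
    return L;
}

int main() {
    // Illustrative predicate: the largest x with x * x <= 50 is 7.
    cout << lastTrue(0, 100, [](long long x) { return x * x <= 50; }) << "\n"; // 7
    return 0;
}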

Key Formulas

Binary Search Time

T(n) = O(log n)

Explanation: Each iteration halves the search space, so it takes logarithmic steps in the number of elements n. This means doubling the input size adds only one extra step.

Overflow-Safe Midpoint

mid = L + (R − L) / 2

Explanation: This computes the middle without summing L and R directly, preventing integer overflow when L and R are large.

Lower Bound Definition

lb(x) = min{ i | a_i ≥ x }

Explanation: The lower bound index is the first position where the array value is at least x. It’s the standard way to find the insertion point for x while keeping order.

Upper Bound Definition

ub(x) = min{ i | a_i > x }

Explanation: The upper bound index is the first position strictly greater than x. The count of x equals ub(x) − lb(x).

Interval Shrinkage

|I_k| = |I_0| / 2^k

Explanation: After k iterations, the interval length is reduced by a factor of 2^k. This quantifies how quickly the uncertainty shrinks.

Iterations for Precision

k ≥ ⌈log2((R − L) / ε)⌉

Explanation: To reduce an initial interval to within error ε in real-valued search, you need at least this many iterations.

Answer Search Complexity

O(n · log R)

Explanation: If the feasibility check per candidate costs O(n) and the numeric range size is R, total time is O(n log R). This is typical for capacity or distance optimization problems.

First-True Boundary

b = min{ x | p(x) = true }

Explanation: Binary search requires a single boundary where a nondecreasing predicate transitions from false to true. The algorithm finds b efficiently.

Counting Duplicates

count(x) = ub(x) − lb(x)

Explanation: The number of occurrences of x in a sorted array is the difference between the upper and lower bound indices.

Batch Queries

O(q · log n)

Explanation: Answering q independent searches on a sorted container takes logarithmic time per query, totaling O(q log n).
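A small sketch of the batch-query pattern behind this bound (the data and query values are illustrative): sort once, then answer each query with the standard boundary searches.

#include <bits/stdc++.h>
using namespace std;

int main() {
    // Sort once: O(n log n).
    vector<int> a = {9, 4, 1, 7, 4, 4, 12};
    sort(a.begin(), a.end()); // {1, 4, 4, 4, 7, 9, 12}

    // Answer q independent queries in O(log n) each, for O(q log n) total.
    for (int x : {4, 5, 13}) {
        auto lb = lower_bound(a.begin(), a.end(), x);
        auto ub = upper_bound(a.begin(), a.end(), x);
        cout << "x=" << x
             << " first index with a[i] >= x: " << (lb - a.begin())
             << " count: " << (ub - lb) << "\n";
    }
    return 0;
}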

Complexity Analysis

For a sorted array of size n, binary search performs at most ⌊log2 n⌋ + 1 iterations because each step discards half of the remaining candidates. Consequently, time complexity per query is O(log n), and space complexity is O(1) since only a few indices and temporary variables are kept. When counting occurrences via two boundary searches (lower_bound and upper_bound), the time is still O(log n) and space O(1).

For binary search on answer across an integer range [L, R], the number of iterations is O(log(R − L + 1)). If each feasibility check runs in O(f(n)) time (commonly O(n) to scan an array), the total time becomes O(f(n) · log(R − L + 1)) with O(1) auxiliary space. Examples include maximizing a minimum distance or minimizing capacity subject to constraints; both typically use a linear check combined with logarithmic iterations.

For real-valued searches on [L, R] with tolerance ε, the interval length halves each iteration; after k steps it is (R − L)/2^k. To achieve error ≤ ε we need k ≥ ⌈log2((R − L)/ε)⌉ iterations. Each step costs the time of the predicate evaluation, often O(n), so the total is O(f(n) · log((R − L)/ε)). Floating-point operations do not change the big-O but require attention to numerical stability. In all cases, memory usage stays constant (O(1)) unless the predicate itself uses extra storage. Using C++ STL algorithms does not change asymptotic costs but reduces constant factors and errors.
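To put rough numbers on these bounds, here is a small illustrative calculation (the values n = 10^6, R − L = 10^9, and ε = 10^-7 are assumptions for the example):

#include <bits/stdc++.h>
using namespace std;

int main() {
    long long n = 1000000;           // sorted array size
    double range = 1e9, eps = 1e-7;  // numeric span and tolerance for a real-valued search

    // At most floor(log2 n) + 1 comparisons per membership query.
    int arrayIters = (int)floor(log2((double)n)) + 1;

    // k >= ceil(log2((R - L) / eps)) iterations to shrink the interval below eps.
    int realIters = (int)ceil(log2(range / eps));

    cout << "array search: at most " << arrayIters << " iterations\n"; // 20
    cout << "real search:  about "   << realIters  << " iterations\n"; // 54
    return 0;
}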

Code Examples

Manual lower_bound and upper_bound on a sorted array (duplicates handled)
#include <bits/stdc++.h>
using namespace std;

// Returns the first index i such that a[i] >= x; if not found, returns n
int lower_bound_index(const vector<int>& a, int x) {
    int n = (int)a.size();
    int L = 0, R = n; // search in [L, R)
    while (L < R) {
        int mid = L + (R - L) / 2; // overflow-safe
        if (a[mid] >= x) {
            R = mid;     // keep mid; answer is in [L, mid]
        } else {
            L = mid + 1; // answer is in (mid, R)
        }
    }
    return L; // L == R: first >= x or n if none
}

// Returns the first index i such that a[i] > x; if not found, returns n
int upper_bound_index(const vector<int>& a, int x) {
    int n = (int)a.size();
    int L = 0, R = n; // search in [L, R)
    while (L < R) {
        int mid = L + (R - L) / 2;
        if (a[mid] > x) {
            R = mid;     // shrink right; mid could be the first > x
        } else {
            L = mid + 1; // need strictly larger, go right
        }
    }
    return L;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    vector<int> a = {1, 2, 2, 2, 4, 5};
    for (int x : {0, 2, 3, 6}) {
        int lb = lower_bound_index(a, x);
        int ub = upper_bound_index(a, x);
        int cnt = ub - lb;
        cout << "x=" << x
             << " lb=" << lb
             << " ub=" << ub
             << " count=" << cnt << "\n";
    }
    return 0;
}

We implement the classic boundary searches using a half-open interval [L, R). lower_bound finds the first index with a[i] ≥ x, while upper_bound finds the first index with a[i] > x. The difference ub − lb counts occurrences of x. Using [L, R) avoids special cases when L and R get adjacent and guarantees progress each loop.

Time: O(log n) per search; Space: O(1)

First true on integers: floor square root without overflow
#include <bits/stdc++.h>
using namespace std;

// Returns floor(sqrt(n)) for n >= 0 using binary search on answer.
// Predicate: p(x) = (x*x <= n), true up to the boundary and false beyond,
// so we look for the largest feasible x.
long long isqrt_floor(long long n) {
    long long L = 0, R = min(n, (long long)2e9); // search [L, R]
    long long ans = 0;
    while (L <= R) { // closed interval
        long long mid = L + (R - L) / 2;
        // Avoid overflow: mid*mid <= n <=> mid <= n / mid (if mid > 0)
        bool ok = (mid == 0) ? true : (mid <= n / mid);
        if (ok) {
            ans = mid;   // mid is feasible, try to go higher
            L = mid + 1;
        } else {
            R = mid - 1; // infeasible, go lower
        }
    }
    return ans;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    vector<long long> tests = {0, 1, 2, 3, 4, 8, 9, 10, (long long)1e12};
    for (auto n : tests) {
        cout << "n=" << n << " floor_sqrt=" << isqrt_floor(n) << "\n";
    }
    return 0;
}

We search the integer domain for the largest x such that x^2 ≤ n (the last value at which the feasibility predicate is still true). To avoid overflow in mid*mid, we compare mid ≤ n / mid. The closed-interval [L, R] pattern with L <= R and updating ans on feasible mid is a standard approach for maximum-feasible searches.

Time: O(log n); Space: O(1)

Binary search on answer: maximize minimum distance (Aggressive Cows style)
#include <bits/stdc++.h>
using namespace std;

// Given stall positions and k cows, place cows to maximize the minimum distance between any two.
// We binary search the answer d and check feasibility greedily.

bool canPlace(const vector<int>& pos, int k, int d) {
    if (k <= 1) return true; // zero or one cow is always placeable
    int count = 1;           // place first cow at pos[0]
    int last = pos[0];
    for (size_t i = 1; i < pos.size(); ++i) {
        if (pos[i] - last >= d) {
            ++count;
            last = pos[i];
            if (count >= k) return true; // early exit
        }
    }
    return false;
}

int maximizeMinDistance(vector<int> pos, int k) {
    sort(pos.begin(), pos.end());
    int L = 0;                        // always feasible (distance 0)
    int R = pos.back() - pos.front(); // maximum possible distance
    int best = 0;
    while (L <= R) { // closed interval for max-feasible
        int mid = L + (R - L) / 2; // candidate minimum distance
        if (canPlace(pos, k, mid)) {
            best = mid;  // feasible; try larger distance
            L = mid + 1;
        } else {
            R = mid - 1; // infeasible; try smaller distance
        }
    }
    return best;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    vector<int> stalls = {1, 2, 8, 12, 17};
    int k = 3;
    cout << "max min distance = " << maximizeMinDistance(stalls, k) << "\n"; // Expected 7 (cows at 1, 8, 17)
    return 0;
}

We search over the answer (minimum pairwise distance d). The predicate canPlace is monotone: if you can place cows with distance d, then you can also place them with any smaller distance. We greedily place cows left-to-right and count placements; the binary search finds the maximum feasible d.

Time: O(n log R), where R is the coordinate span; Space: O(1) extra (O(n) for storing input)

Real-valued binary search: maximum equal rope cut length with precision
#include <bits/stdc++.h>
using namespace std;

// Given rope lengths, find the maximum piece length L so that we can cut at least k pieces.
// Real-valued binary search with tolerance epsilon.

double maxPieceLength(const vector<double>& a, long long k, double eps = 1e-7) {
    double L = 0.0, R = 0.0;
    for (double x : a) R = max(R, x);
    // Invariant: answer in [L, R]
    for (int it = 0; it < 200 && R - L > eps; ++it) { // fixed iterations + epsilon guard
        double mid = (L + R) / 2.0;
        if (mid == 0.0) { L = mid; continue; } // guard against dividing by zero
        long long pieces = 0;
        for (double x : a) pieces += (long long)(x / mid);
        if (pieces >= k) {
            L = mid; // feasible: we can try longer pieces
        } else {
            R = mid; // infeasible: need shorter pieces
        }
    }
    return L; // within eps of the true optimum
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    vector<double> ropes = {4.0, 7.5, 9.2};
    long long k = 7;
    cout.setf(std::ios::fixed); cout << setprecision(6);
    cout << "max piece length ≈ " << maxPieceLength(ropes, k) << "\n";
    return 0;
}

We search over real lengths to maximize the piece size while ensuring we can cut at least k pieces. The monotone predicate is “pieces(mid) ≥ k”, which becomes false as mid grows. We iterate until the interval is smaller than ε, returning an approximation with bounded error. Using a fixed iteration cap helps avoid edge-case non-convergence.

Time: O(n log((R - L)/ε)); Space: O(1) extra (O(n) for input)