Sweepline Technique
Key Points
- •The sweep line technique processes geometric or time-based events in sorted order and maintains an active set that reflects the current state at the sweep position.
- •It reduces higher-dimensional problems (often 2D) into a sequence of 1D subproblems, enabling efficient solutions.
- •Typical runtime is O(n n) because we sort events and update a balanced data structure per event.
- •Key applications include rectangle union area, closest pair of points, segment intersection counting, and interval overlap queries.
- •The core design is: build events, sort them, choose a data structure (BST, segment tree, or Fenwick tree), and carefully define tie-breaking and updates.
- •Coordinate compression is often necessary to shrink large or real-valued coordinates to manageable integer indices for trees.
- •Robustness depends on careful handling of degeneracies (equal coordinates, overlapping edges, endpoint inclusion).
- •Well-chosen invariants (what the active set represents) make the technique reliable and easy to reason about.
Prerequisites
- →Sorting and comparison-based ordering — Events must be sorted with custom tie-breaking to define processing order.
- →Balanced BSTs / sets — Active sets often need O(log n) insert/erase and ordered range queries.
- →Fenwick tree (BIT) and Segment tree — Range count/sum and covered-length maintenance require these structures.
- →Coordinate compression — Transforms large or real coordinates into manageable integer indices for trees.
- →Basic computational geometry — Understanding segments, rectangles, distances, and endpoint conventions is essential.
- →Asymptotic analysis — To reason about O(n log n) behavior and output-sensitive bounds like O((n+k) log n).
Detailed Explanation
Tap terms for definitions01Overview
The sweep line technique is a general algorithmic pattern for solving problems where objects change state across an ordered dimension, such as the x-axis in geometry or time in scheduling. Imagine sliding a vertical line from left to right across a plane: as you pass object boundaries (segment endpoints, rectangle edges, or timestamps), the set of objects currently intersecting the line—the active set—changes. By processing these changes (events) in sorted order, you transform a complex global question into a sequence of local updates and queries on a dynamic set. This reduction is powerful: many 2D problems become a series of 1D subproblems because at any fixed sweep position, only the cross-section matters. The method typically pairs with a data structure—balanced BST, Fenwick tree (BIT), or segment tree—that supports fast insert, delete, and query operations. The general pattern is: create events (with positions and types), sort them, initialize an empty active set, and then iterate through events, updating the active set and extracting answers as needed. Many classic problems—computing the union area of rectangles, counting intersections between horizontal and vertical segments, finding the closest pair of points, or tracking interval overlaps over time—are solved in O(n log n) time with this approach.
02Intuition & Analogies
Picture a street sweeper driving down a long boulevard. As it moves, new trash enters the sweeper’s front (events when the sweeper reaches them), remains inside while it is under the sweeper, and then exits at the back. The sweeper only cares about what is currently under it—the active set—because that determines how full the bin is right now. The road itself is like the number line (x-axis or time), and each piece of trash is an object (segment, rectangle edge, or interval) that starts to matter when the sweeper reaches it and stops when it passes. This mental model explains both why we sort events (we encounter them in order along the road) and why we maintain a data structure (to summarize what’s currently inside the sweeper). In 2D, the vertical sweep line at position x reduces the plane to a vertical slice: the problem’s complexity collapses to “what’s happening along this slice?” For example, to compute the area of a union of rectangles, we only need to know how much of the y-axis is covered by rectangles intersecting the current x. Multiply that covered y-length by how far we move in x, and accumulate the area. For counting intersections between horizontal and vertical segments, each time we reach a vertical segment’s x, we ask: how many active horizontal segments currently cross its y-range? Even the closest pair of points problem uses the same idea: keep a small neighborhood of candidates near the sweep line because far-away points can’t beat the current best distance. The sweep line is the “moving viewpoint” that keeps problems small and local.
03Formal Definition
04When to Use
Use the sweep line technique when a problem can be naturally expressed as ‘what changes as we move along one ordered dimension?’ Typical scenarios include: (1) 2D geometry with axis-parallel structure (rectangle union area, orthogonal segment intersection counting) where one axis can be swept and the other is summarized with a data structure; (2) Problems where events happen over time (interval scheduling, counting maximum overlap, merging time windows) and you only need to know the current active intervals; (3) Nearest neighbor or distance problems where only nearby elements around the sweep position matter (closest pair of points); (4) Offline range counting queries that can be turned into adds and queries sorted by one coordinate, with a Fenwick tree answering queries on the other coordinate; (5) Detecting intersections among general segments using an ordered set by y that changes as segments enter/exit. Prefer sweep line when sorting plus O(\log n) updates are possible and the number of events is manageable. Avoid it if events are hard to order unambiguously, if updates require non-logarithmic time, or if the state depends on non-local information that the active set cannot represent compactly.
⚠️Common Mistakes
- Incorrect tie-breaking: At the same position, processing deletions after insertions (or vice versa) can double count or miss events. Define a clear order, e.g., at coordinate x, process adds, then queries, then removes, or whatever your inclusion rules require.
- Endpoint inclusion/exclusion: Whether intervals are closed [l, r] or half-open [l, r) changes results. Be consistent in both event creation and data structure updates to avoid off-by-one behavior.
- Not compressing coordinates: Using raw large (or floating) coordinates with segment trees/Fenwick trees causes memory issues and precision errors. Perform coordinate compression and store differences for accurate lengths.
- Failing to remove stale items: In closest pair, points that are too far in x should be purged from the active set, otherwise the algorithm devolves toward quadratic behavior.
- Comparing doubles directly: Floating-point comparisons need epsilons or rationalized integers. When possible, use integer math or compare squared distances to avoid sqrt and precision errors.
- Overlooking degeneracies: Overlapping segments, identical coordinates, vertical/horizontal special cases, or zero-length objects demand explicit handling in event generation and tie-breaking.
- Expensive per-event work: Keep active-set operations O(\log n). If you repeatedly scan linear ranges, you lose the sweep line’s advantage. Choose the right data structure (balanced BST, segment tree, or BIT) for required operations.
- Not grouping events at the same coordinate: For accumulation tasks (like area), you must first account for delta between coordinates, then apply all updates at the new position (or vice versa), consistently across the sweep.
Key Formulas
Typical Sweep Line Complexity
Explanation: Most sweep line solutions sort O(n) events and perform O( n) data structure operations per event. This yields overall O(n log n) time.
Rectangle Union Area via Sweep
Explanation: The total area equals the sum over consecutive x-slices of covered y-length times the slice width. (i) is the union length of y-coverage at slice i.
Segment Intersections Enumeration
Explanation: When k intersections are output, the sweep performs O( n) updates per event and handles O(k) extra events, resulting in O((n + k) log n) time.
Closest Pair Distance
Explanation: The minimal Euclidean distance among all pairs can be found in O(n log n) via sweep line, often computed with squared distances to avoid square roots.
Sorting Lower Bound Context
Explanation: Since sweep line relies on sorting, the O(n log n) lower bound for comparison sorting is relevant; naive pairwise checks would be O() which equals this sum order.
Per-Event Recurrence
Explanation: Scanning each of n events and doing O( n) work leads to O(n n) total time, after an initial O(n n) sort.
Coverage Indicator
Explanation: Defines whether a y-position is currently covered. Segment trees aggregate this to yield total covered length along y.
Event Count for Orthogonal Problems
Explanation: For axis-aligned rectangles or segments, each object yields a constant number of events, so the number of events m scales linearly with n.
Complexity Analysis
Code Examples
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Compute union area of axis-aligned rectangles using sweep line over x 5 // Each rectangle: (x1, y1) lower-left, (x2, y2) upper-right, with x1 < x2, y1 < y2 6 // Time: O(n log n), Space: O(n) 7 8 struct Event { 9 long long x; // sweep position 10 int y1, y2; // indices in compressed y array [y1, y2) 11 int type; // +1: add interval, -1: remove interval 12 bool operator<(const Event& other) const { 13 if (x != other.x) return x < other.x; 14 // Tie-breaking: process adds before removes to ensure coverage is updated correctly 15 return type > other.type; // +1 before -1 16 } 17 }; 18 19 struct SegTree { 20 int n; // number of elementary intervals = ys.size() - 1 21 vector<int> cover; // coverage count 22 vector<long long> length; // covered length in this node's segment 23 vector<long long> ys; // original y coordinates (sorted) 24 25 SegTree(const vector<long long>& ys_): ys(ys_) { 26 n = (int)ys.size() - 1; 27 cover.assign(4*n, 0); 28 length.assign(4*n, 0); 29 } 30 31 void pull(int idx, int l, int r) { 32 if (cover[idx] > 0) { 33 // fully covered: take full length of this segment 34 length[idx] = ys[r] - ys[l]; 35 } else if (l + 1 == r) { 36 // leaf, no coverage 37 length[idx] = 0; 38 } else { 39 // sum children 40 length[idx] = length[idx*2] + length[idx*2+1]; 41 } 42 } 43 44 void update(int idx, int l, int r, int ql, int qr, int val) { 45 if (qr <= l || r <= ql) return; // no overlap 46 if (ql <= l && r <= qr) { 47 cover[idx] += val; 48 pull(idx, l, r); 49 return; 50 } 51 int mid = (l + r) / 2; 52 update(idx*2, l, mid, ql, qr, val); 53 update(idx*2+1, mid, r, ql, qr, val); 54 pull(idx, l, r); 55 } 56 57 // public wrapper: update interval [l, r) with +1 or -1 58 void update(int l, int r, int val) { update(1, 0, n, l, r, val); } 59 60 long long covered_length() const { return length[1]; } 61 }; 62 63 int main() { 64 ios::sync_with_stdio(false); 65 cin.tie(nullptr); 66 67 int n; 68 if (!(cin >> n)) return 0; 69 struct Rect { long long x1, y1, x2, y2; }; 70 vector<Rect> rects(n); 71 vector<long long> ys; 72 ys.reserve(2*n); 73 for (int i = 0; i < n; ++i) { 74 cin >> rects[i].x1 >> rects[i].y1 >> rects[i].x2 >> rects[i].y2; 75 if (rects[i].x1 > rects[i].x2) swap(rects[i].x1, rects[i].x2); 76 if (rects[i].y1 > rects[i].y2) swap(rects[i].y1, rects[i].y2); 77 ys.push_back(rects[i].y1); 78 ys.push_back(rects[i].y2); 79 } 80 81 // Coordinate compression of y 82 sort(ys.begin(), ys.end()); 83 ys.erase(unique(ys.begin(), ys.end()), ys.end()); 84 85 // Build events at rectangle vertical edges 86 vector<Event> events; 87 events.reserve(2*n); 88 for (auto &r : rects) { 89 int y1 = (int)(lower_bound(ys.begin(), ys.end(), r.y1) - ys.begin()); 90 int y2 = (int)(lower_bound(ys.begin(), ys.end(), r.y2) - ys.begin()); 91 if (y1 == y2) continue; // zero-height 92 events.push_back({r.x1, y1, y2, +1}); 93 events.push_back({r.x2, y1, y2, -1}); 94 } 95 96 if (events.empty()) { cout << 0 << "\n"; return 0; } 97 98 sort(events.begin(), events.end()); 99 100 // Segment tree needs the segment endpoints array; intervals are [ys[i], ys[i+1]) 101 SegTree st(ys); 102 103 long long area = 0; 104 long long prev_x = events[0].x; 105 106 size_t i = 0; 107 while (i < events.size()) { 108 long long x = events[i].x; 109 // Accumulate area for the span [prev_x, x) 110 long long dx = x - prev_x; 111 if (dx > 0) { 112 area += st.covered_length() * dx; 113 prev_x = x; 114 } 115 // Process all events at this x 116 while (i < events.size() && events[i].x == x) { 117 st.update(events[i].y1, events[i].y2, events[i].type); 118 ++i; 119 } 120 } 121 122 cout << area << "\n"; 123 return 0; 124 } 125
We sweep a vertical line over x. Each rectangle contributes two events: add its y-interval at x1 and remove it at x2. A segment tree over compressed y-coordinates maintains the total covered y-length at the current x. For each consecutive x-slice, multiply covered y-length by the x-gap and sum to get the union area. Using half-open y-intervals [y1, y2) avoids double counting along shared edges.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Find the minimum distance between any pair of points in O(n log n) 5 // Approach: sort by x, maintain active set ordered by y; keep only points within current best distance in x 6 7 struct Point { long long x, y; }; 8 9 // Squared distance to avoid sqrt until the end 10 static inline long long dist2(const Point& a, const Point& b) { 11 long long dx = a.x - b.x; 12 long long dy = a.y - b.y; 13 return dx*dx + dy*dy; 14 } 15 16 int main() { 17 ios::sync_with_stdio(false); 18 cin.tie(nullptr); 19 20 int n; if (!(cin >> n)) return 0; 21 vector<Point> pts(n); 22 for (int i = 0; i < n; ++i) cin >> pts[i].x >> pts[i].y; 23 24 sort(pts.begin(), pts.end(), [](const Point& a, const Point& b){ 25 if (a.x != b.x) return a.x < b.x; 26 return a.y < b.y; 27 }); 28 29 // Active set: store (y, x) to order by y and allow erasure by value 30 set<pair<long long,long long>> active; 31 deque<Point> window; // maintains points in increasing x order, synchronized with active set 32 33 const long long INF = (1LL<<62); 34 long long best2 = INF; // best squared distance so far 35 36 for (const auto& p : pts) { 37 // Remove points with x too far from current x (beyond sqrt(best2)) 38 while (!window.empty()) { 39 long long dx = p.x - window.front().x; 40 if (best2 != INF && dx*dx > best2) { 41 active.erase({window.front().y, window.front().x}); 42 window.pop_front(); 43 } else break; 44 } 45 46 // Candidates must have y within sqrt(best2) of p.y 47 long long dy = (best2 == INF) ? (1LL<<62) : (long long)floor(sqrt((long double)best2)); 48 auto it_low = active.lower_bound({p.y - dy, LLONG_MIN}); 49 auto it_up = active.upper_bound({p.y + dy, LLONG_MAX}); 50 51 for (auto it = it_low; it != it_up; ++it) { 52 Point q{it->second, it->first}; 53 best2 = min(best2, dist2(p, q)); 54 } 55 56 active.insert({p.y, p.x}); 57 window.push_back(p); 58 } 59 60 cout << fixed << setprecision(6) << sqrt((long double)best2) << "\n"; 61 return 0; 62 } 63
Sort points by x and sweep from left to right. Maintain an active set of prior points whose x is within the current best distance of the current point. Only points with y within ±sqrt(best) can improve the answer, so we query that y-range in a set ordered by y. The number of candidates is constant on average, yielding O(n log n) time. Distances are squared to avoid precision issues, with a final sqrt for output.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Count intersections where horizontal segment [x1,x2] at y intersects vertical segment x at [y1,y2] 5 // Sweep over x: add/remove horizontals in a BIT keyed by y, and at each vertical, query count in [y1,y2] 6 7 struct Horizontal { long long y, x1, x2; }; 8 struct Vertical { long long x, y1, y2; }; 9 10 struct BIT { 11 int n; vector<long long> t; 12 BIT(int n=0): n(n), t(n+1,0) {} 13 void add(int i, long long v){ for(++i;i<=n;i+=i&-i) t[i]+=v; } 14 long long sumPrefix(int i){ long long s=0; for(++i;i>0;i-=i&-i) s+=t[i]; return s; } 15 long long rangeSum(int l,int r){ if(r<l) return 0; return sumPrefix(r)-sumPrefix(l-1); } 16 }; 17 18 struct Event { 19 long long x; int type; // 0=addH, 1=queryV, 2=remH 20 long long y1, y2; // for add/rem: y1==y2==y; for query: y-range 21 bool operator<(const Event& o) const { 22 if (x != o.x) return x < o.x; 23 // tie-break: add before query before remove so endpoints are counted 24 return type < o.type; 25 } 26 }; 27 28 int main(){ 29 ios::sync_with_stdio(false); 30 cin.tie(nullptr); 31 32 int H, V; if(!(cin>>H>>V)) return 0; 33 vector<Horizontal> hs(H); 34 vector<Vertical> vs(V); 35 vector<long long> ys; ys.reserve(2*H + 2*V); 36 37 for(int i=0;i<H;++i){ 38 cin>>hs[i].x1>>hs[i].x2>>hs[i].y; 39 if(hs[i].x1>hs[i].x2) swap(hs[i].x1, hs[i].x2); 40 ys.push_back(hs[i].y); 41 } 42 for(int i=0;i<V;++i){ 43 cin>>vs[i].x>>vs[i].y1>>vs[i].y2; 44 if(vs[i].y1>vs[i].y2) swap(vs[i].y1, vs[i].y2); 45 ys.push_back(vs[i].y1); 46 ys.push_back(vs[i].y2); 47 } 48 49 // compress y 50 sort(ys.begin(), ys.end()); ys.erase(unique(ys.begin(), ys.end()), ys.end()); 51 auto getY = [&](long long y){ return (int)(lower_bound(ys.begin(), ys.end(), y) - ys.begin()); }; 52 53 vector<Event> ev; ev.reserve(H*2 + V); 54 for(auto &h: hs){ 55 int yi = getY(h.y); 56 ev.push_back({h.x1, 0, h.y, h.y}); // add 57 ev.push_back({h.x2, 2, h.y, h.y}); // remove 58 } 59 for(auto &v: vs){ 60 ev.push_back({v.x, 1, v.y1, v.y2}); // query 61 } 62 63 sort(ev.begin(), ev.end()); 64 65 BIT bit((int)ys.size()); 66 long long ans = 0; 67 68 for(const auto &e: ev){ 69 if(e.type==0){ // add horizontal at y 70 bit.add(getY(e.y1), +1); 71 } else if(e.type==2){ // remove horizontal at y 72 bit.add(getY(e.y1), -1); 73 } else { 74 // query vertical: count horizontals with y in [y1,y2] 75 int l = getY(e.y1), r = getY(e.y2); 76 ans += bit.rangeSum(l, r); 77 } 78 } 79 80 cout<<ans<<"\n"; 81 return 0; 82 } 83
We sweep by x. Horizontal segments become add/remove events at their x1 and x2, updating a BIT keyed by y. Each vertical segment becomes a query at x: we count how many active horizontals have y within its [y1, y2] range. Tie-breaking (add before query before remove) ensures intersections at shared endpoints are counted.