Ordered Set and Map
Key Points
- •std::set and std::map store elements in sorted order using a balanced binary search tree (typically a Red-Black Tree).
- •All core operations—search, insert, erase, lowe, and uppe—run in O(log n) time.
- •std::multiset allows duplicates, while std::set and std::map do not; use equa to handle all duplicates efficiently.
- •Custom comparators define the sorting order and must implement a consistent strict weak ordering.
- •Use lowe (first element >= key) and uppe (first ) for neighbor and range queries.
- •GNU pbds orde_tree adds fin_order (k-th element) and orde_key (rank) in O(log n).
- •For duplicates with orde_tree, store pairs like (value, uniqu) to simulate multiset behavior.
- •Choose ordered containers for dynamic sorted data and neighbor queries; use unordered containers when order is unnecessary and average O(1) lookup is preferable.
Prerequisites
- →Binary Search — Understanding boundary conditions and the idea behind lower_bound and upper_bound.
- →Asymptotic Analysis — To reason about O(log n), O(n log n), and range iteration costs.
- →Pointers and Iterators in C++ — To safely traverse, erase, and manage references/iterators in containers.
- →Comparator Functions and Functors — To define custom orderings that satisfy strict weak ordering.
- →Tree Data Structures — To grasp how balanced BSTs maintain order and enable logarithmic-time operations.
- →C++ Templates and STL Basics — Ordered containers are template-based and rely on STL conventions.
Detailed Explanation
Tap terms for definitions01Overview
Ordered sets and maps in C++ (std::set, std::map, and their multi- variants) are associative containers that automatically keep their elements sorted. Internally, most standard library implementations use a Red-Black Tree (a balanced binary search tree), ensuring that operations like search, insertion, deletion, and boundary queries such as lower_bound and upper_bound take O(log n) time. std::set stores unique keys, std::multiset stores keys allowing duplicates, std::map maps unique keys to values, and std::multimap allows multiple values for the same key. Their ordering is controlled by a comparator (default is std::less<...>), but you can supply a custom comparator to sort in descending order or by custom fields (e.g., sort points by x then y). Because elements are stored in a tree, iterating through the container yields elements in sorted order. This makes ordered containers ideal for problems requiring dynamic maintenance of a sorted set of numbers or keys, neighbor queries (predecessor/successor), and range queries. For index-based queries (k-th smallest or rank), the GNU policy-based data structure order_statistics_tree extends a Red-Black Tree with find_by_order and order_of_key, which are not available in the standard containers.
02Intuition & Analogies
Imagine a bookshelf where every time you add a book, it magically slides into its correct alphabetical spot without you rearranging the whole shelf. That’s std::set and std::map: they keep things sorted automatically. If you want to quickly find where a book would go, you can ask for the first spot where a title should appear (lower_bound) or the first spot strictly after it (upper_bound). If you allow multiple copies of the same title (duplicates), think of a section where identical books can sit together in consecutive slots—this is std::multiset. Now, if you also want to ask, “What is the 5th book on the shelf?” or “How many books come before this title?”, you need a special shelf with index markers built-in—this is order_statistics_tree from pbds. Custom comparators are like changing your sorting rules: instead of alphabetically by title, maybe you sort by author and then by year. The only rule is to be consistent—your sorting logic must never contradict itself, or the shelf system breaks. Because the shelf is balanced (thanks to Red-Black Tree rules), adding or finding any book still takes only around log n steps, even when the shelf grows large.
03Formal Definition
04When to Use
Use std::set or std::map when you need elements always kept in sorted order and you expect to query neighbors (predecessor/successor), do range queries via iterators, or need deterministic iteration order. Examples include maintaining the active set of segments in a sweep line algorithm, tracking current contestants’ scores to find the next higher/lower score, or keeping timestamps to find the next event after time t. Choose std::multiset or std::multimap when duplicates must be stored, such as counting multiple identical values while still needing them in sorted order (e.g., a sliding window median where values can repeat). If you need to answer 'k-th smallest' and 'rank' queries directly, switch to GNU pbds order_statistics_tree, commonly available in competitive programming environments like Codeforces with -std=gnu++17. When order is not required and the main goal is constant average-time lookup by key, prefer unordered_set or unordered_map for potential performance gains. However, in adversarial cases or when hashing is tricky, ordered containers’ guaranteed O(log n) might be safer. Also, ordered containers are ideal when you need stable references/iterators (not invalidated by insertions except erasures of the referred element).
⚠️Common Mistakes
- Misusing erase with multiset: ms.erase(value) removes all occurrences; to remove a single occurrence, find an iterator and erase it (ms.erase(ms.find(value))). 2) Writing a comparator that is not a strict weak ordering (e.g., returns false both ways for different keys, or depends on mutable external state). This leads to undefined behavior, including elements “disappearing” or endless loops. 3) Confusing lower_bound and upper_bound semantics: lower_bound(x) finds the first element >= x, while upper_bound(x) finds the first element > x. 4) Expecting random access: std::set and std::map have bidirectional iterators, not random access—advance is O(k), not O(1). 5) Assuming duplicates in std::set/std::map: they do not allow them; use std::multiset/std::multimap or encode counts in the value. 6) Using pbds for duplicates without a workaround: order_statistics_tree doesn’t support less_equal; encode duplicates as pairs (value, unique_id). 7) Forgetting that only iterators to erased elements are invalidated; other iterators and references stay valid after insertions/erases elsewhere (unlike vectors). 8) Overusing ordered containers when unordered containers suffice; if you don’t need order or neighbor queries, unordered_map may be faster on average. 9) Using a comparator inconsistent with key equivalence: equal_range relies on the comparator’s equivalence relation (neither a<b nor b<a); violations break range queries. 10) Assuming std::map::operator[] works like at for const access: operator[] inserts a default-constructed value if the key is absent; use at or find for read-only lookups.
Key Formulas
Red-Black Tree Height Bound
Explanation: In a Red-Black Tree with n nodes, the height h is at most twice the base-2 logarithm of n+1. This guarantees logarithmic-time operations.
Search Complexity
Explanation: Searching for a key in an ordered set/map follows a root-to-leaf path in the balanced tree, which is proportional to its height.
Update Complexities
Explanation: Insertions and deletions locate a position in O(log n) and then fix balances with a constant number of rotations and recolorings.
Range Query by Iteration
Explanation: Finding the start of a range takes O(log n) via lowe, and iterating k elements costs O(k) more.
Building by Insertions
Explanation: Inserting n elements one by one into an ordered container costs O(log n) each on average, totaling O(n log n).
Multiplicity via equal_range
Explanation: The size of the range returned by equa(x) equals the count of elements equivalent to x, useful in multisets.
Rank Definition
Explanation: In an order-statistics tree, orde_key(x) returns how many elements are strictly less than x, i.e., x's rank.
Complexity Analysis
Code Examples
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 int main() { 5 // std::set: unique, ordered integers 6 set<int> s; 7 s.insert(5); 8 s.insert(1); 9 s.insert(3); 10 s.insert(3); // duplicate ignored in set 11 12 cout << "Set contents (sorted): "; 13 for (int x : s) cout << x << ' '; // 1 3 5 14 cout << "\n"; 15 16 // lower_bound: first element >= key 17 auto it1 = s.lower_bound(3); // points to 3 18 if (it1 != s.end()) cout << "lower_bound(3) = " << *it1 << "\n"; 19 20 // upper_bound: first element > key 21 auto it2 = s.upper_bound(3); // points to 5 22 if (it2 != s.end()) cout << "upper_bound(3) = " << *it2 << "\n"; 23 24 // Erase a key (if present) 25 s.erase(3); // O(log n) 26 cout << "After erase(3): "; 27 for (int x : s) cout << x << ' '; // 1 5 28 cout << "\n"; 29 30 // Neighbor queries (predecessor/successor of x) 31 int x = 4; 32 auto it = s.lower_bound(x); // first >= 4, so 5 here 33 if (it != s.end()) cout << "successor(4) = " << *it << "\n"; // 5 34 if (it != s.begin()) { 35 auto pred = prev(it); 36 cout << "predecessor(4) = " << *pred << "\n"; // 1 37 } 38 39 // std::map: key->value in sorted key order 40 map<string, int> freq; 41 freq["alice"] = 2; // inserts if absent, then assigns 42 freq["bob"] = 5; 43 freq.insert({"carol", 3}); 44 45 cout << "Map iteration (sorted by key):\n"; 46 for (auto &kv : freq) { 47 cout << kv.first << " -> " << kv.second << "\n"; 48 } 49 50 // lower_bound on map finds first key >= given key 51 auto itM = freq.lower_bound("b"); // first key >= "b" (i.e., "bob") 52 if (itM != freq.end()) { 53 cout << "map.lower_bound(\"b\") = (" << itM->first << ", " << itM->second << ")\n"; 54 } 55 56 // equal_range on set: range of all elements equal to key 57 auto p = s.equal_range(5); // [first,last) where elements == 5 58 cout << "equal_range(5) count = " << distance(p.first, p.second) << "\n"; 59 60 return 0; 61 } 62
Demonstrates insertion, iteration, boundary queries, neighbor queries, and map key lookup semantics. Shows that std::set ignores duplicates and that map is ordered by keys. equal_range returns the subrange of equal keys (size 0 or 1 in a set).
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 int main() { 5 multiset<int> ms; 6 // Insert duplicates 7 ms.insert(3); 8 ms.insert(5); 9 ms.insert(3); 10 ms.insert(7); 11 ms.insert(3); 12 13 cout << "Multiset: "; 14 for (int v : ms) cout << v << ' '; // 3 3 3 5 7 15 cout << "\n"; 16 17 cout << "count(3) = " << ms.count(3) << "\n"; // 3 occurrences 18 19 // Erase a single occurrence (use iterator!) 20 auto it = ms.find(3); // O(log n) 21 if (it != ms.end()) ms.erase(it); // removes only one 3 22 23 cout << "After erasing one 3: "; 24 for (int v : ms) cout << v << ' '; // 3 3 5 7 25 cout << "\n"; 26 27 // Erase all copies of 3 efficiently using equal_range 28 auto range = ms.equal_range(3); // [first,last) of all 3s 29 ms.erase(range.first, range.second); // O(log n + c), c = number erased 30 31 cout << "After erasing all 3s: "; 32 for (int v : ms) cout << v << ' '; // 5 7 33 cout << "\n"; 34 35 return 0; 36 } 37
Shows how multiset keeps duplicates in sorted order, how to remove exactly one occurrence via iterator, and how equal_range can remove all duplicates in one shot.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 struct Point { 5 int x, y; 6 }; 7 8 // Sort by x ascending, then y descending 9 struct ByXThenYDesc { 10 bool operator()(const Point &a, const Point &b) const { 11 if (a.x != b.x) return a.x < b.x; 12 return a.y > b.y; // for equal x, larger y comes first 13 } 14 }; 15 16 // For printing 17 ostream& operator<<(ostream& os, const Point& p) { 18 return os << '(' << p.x << ',' << p.y << ')'; 19 } 20 21 int main() { 22 set<Point, ByXThenYDesc> st; 23 st.insert({2, 5}); 24 st.insert({1, 10}); 25 st.insert({2, 3}); 26 st.insert({1, 7}); 27 28 cout << "Custom-ordered set: "; 29 for (const auto &p : st) cout << p << ' '; 30 cout << "\n"; // (1,10) (1,7) (2,5) (2,3) 31 32 // lower_bound uses the same comparator semantics 33 Point probe{2, INT_MAX}; // first element >= (2, +inf) under comparator 34 auto it = st.lower_bound(probe); 35 if (it != st.end()) cout << "lower_bound((2,inf)) = " << *it << "\n"; // (2,5) 36 37 // map with custom key comparator 38 map<Point, string, ByXThenYDesc> mp; 39 mp[{2, 5}] = "A"; 40 mp[{1, 7}] = "B"; 41 mp[{1, 10}] = "C"; // replaces value for equivalent key? No: keys are distinct since comparator orders them strictly 42 43 cout << "Map iteration (custom order):\n"; 44 for (auto &kv : mp) cout << kv.first << " -> " << kv.second << "\n"; 45 46 return 0; 47 } 48
Defines a strict weak ordering that sorts by x ascending and, when x ties, by y descending. Demonstrates that lower_bound uses the same comparator and shows how to define a map keyed by a struct. The comparator is stateless and consistent, preventing undefined behavior.
1 #include <bits/stdc++.h> 2 #include <ext/pb_ds/assoc_container.hpp> 3 #include <ext/pb_ds/tree_policy.hpp> 4 using namespace std; 5 using namespace __gnu_pbds; 6 7 // Alias for order-statistics tree with unique keys 8 template <class T> 9 using ordered_set = tree<T, null_type, less<T>, rb_tree_tag, tree_order_statistics_node_update>; 10 11 int main() { 12 // Unique elements example 13 ordered_set<int> ost; 14 ost.insert(10); 15 ost.insert(30); 16 ost.insert(20); 17 18 cout << "0-th element (min) = " << *ost.find_by_order(0) << "\n"; // 10 19 cout << "order_of_key(25) = " << ost.order_of_key(25) << "\n"; // 2 elements < 25 20 21 // Duplicates workaround: store (value, unique_id) 22 using P = pair<int,int>; 23 ordered_set<P> ms_like; 24 int uid = 0; 25 auto add = [&](int v){ ms_like.insert({v, uid++}); }; 26 27 add(20); add(10); add(20); add(30); add(20); 28 29 // Count elements < 20: use (-inf) as second component 30 cout << "count(<20) = " << ms_like.order_of_key({20, INT_MIN}) << "\n"; // 1 (only 10) 31 // Count elements <= 20: use (+inf) as second component 32 cout << "count(<=20) = " << ms_like.order_of_key({20, INT_MAX}) << "\n"; // 1 + 3 = 4 33 34 // Get k-th smallest with duplicates 35 int k = 2; // 0-based 36 auto it = ms_like.find_by_order(k); 37 if (it != ms_like.end()) cout << "k-th element (k=2) = " << it->first << "\n"; // value part 38 39 // Erase a single occurrence of 20: find by order/rank or lower_bound pair 40 // Find first 20 via lower_bound on pair 41 auto it20 = ms_like.lower_bound({20, INT_MIN}); 42 if (it20 != ms_like.end() && it20->first == 20) ms_like.erase(it20); 43 44 cout << "After erasing one 20, count(<=20) = " 45 << ms_like.order_of_key({20, INT_MAX}) << "\n"; // decreased by 1 46 47 return 0; 48 } 49
Uses GNU pbds to get k-th element and rank queries in O(log n). Shows a standard trick to support duplicates by storing (value, unique_id) pairs and using INT_MIN/INT_MAX sentinels for < and <= counts.