Treap
Key Points
- •A treap is a binary search tree on keys combined with a heap on random priorities, which keeps the tree balanced in expectation.
- •Because priorities are random, the expected height is O(log n), so search, insert, erase, split, and merge all run in expected O(log n).
- •Split divides a treap into two treaps by a key threshold, and merge combines two treaps assuming all keys in the left treap are smaller than those in the right treap.
- •Treaps are simple to implement, often shorter and less error-prone than AVL or Red-Black Trees, while providing similar performance on average.
- •An implicit treap treats in-order positions as indices, enabling array-like operations such as insert at position, erase range, reverse range, and range sums.
- •Order statistics like kth smallest and rank are easy by maintaining subtree sizes.
- •Randomization is crucial: use a strong RNG (e.g., mt19937) to avoid adversarial cases that break balance.
- •Common pitfalls include incorrect split/merge invariants, forgetting to update subtree sizes or lazy flags, and mishandling duplicates.
Prerequisites
- →Binary search trees — Treaps enforce the BST property on keys; understanding BST traversal and search is essential.
- →Heaps and priority queues — Treaps also enforce a heap property on random priorities; this guides rotations or merges.
- →Big-O notation and expectations — Treap performance is analyzed in expected time; you need asymptotic and probabilistic reasoning.
- →Recursion — Split, merge, insert, and erase are implemented recursively and require careful base/recursive cases.
- →Augmented trees — Maintaining subtree sizes/sums enables order statistics and range queries.
- →Lazy propagation basics — Implicit treaps frequently use lazy flags (e.g., reverse) to apply range updates efficiently.
Detailed Explanation
Tap terms for definitions01Overview
A treap is a randomized balanced binary search tree that blends two structures: a binary search tree (BST) on keys and a heap on priorities. Each node stores a key (used to maintain in-order BST property) and a random priority (used to maintain heap property). The key insight is that if priorities are independent and uniformly random, then the resulting BST shape is distributed like a random BST, which has logarithmic expected height. That yields expected O(log n) time per operation for search, insertion, deletion, split, and merge. Treaps are attractive in competitive programming and systems work because they are simple to code (small constant factors, minimal rotation logic) and flexible to augment: you can maintain subtree sizes for order statistics, sums for range queries, and lazy propagation flags for range updates. Two core primitives make treaps especially powerful: split, which partitions the tree by key (or by index for implicit treaps), and merge, which joins two compatible treaps. Using these, you can implement complex operations like range deletions and concatenations in a few lines. In practice, treaps trade worst-case guarantees (they are probabilistic) for simplicity and excellent average-case performance, making them a go-to choice for dynamic ordered sets and sequence data structures.
02Intuition & Analogies
Imagine you’re building a seating chart (the BST) where everyone must sit in increasing order by their ticket number (the key). People also draw a random “priority” number from a hat when they arrive. Besides seating order, you impose a rule: no one can sit below someone with a lower random number (heap property). If two people compare seats, the person with better (smaller or larger depending on convention) priority sits higher in the hierarchy, but still in the right order by tickets. The random draw keeps the seating chart from becoming a long line; instead, it forms a nicely bushy hierarchy most of the time. Now think of split and merge as Lego moves. Split says: break the seating into those with ticket numbers less than K and those with ticket numbers at least K—like cutting a deck of cards at a given rank while preserving each half’s structure. Merge says: if everyone in group A has smaller tickets than everyone in group B, you can shuffle them together into one valid chart by letting the highest-priority person among the roots lead, and recursively merging the leftovers. For implicit treaps, forget explicit tickets: positions in the seating (1st, 2nd, 3rd) become the keys. This lets you do array tricks—insert at position 5, reverse positions 10 to 20, or sum a segment—by splitting around indices, tweaking the middle piece, and merging back. The magic is that randomness quietly maintains balance so all these splits and merges stay fast on average.
03Formal Definition
04When to Use
Use treaps when you need a dynamic ordered set or sequence with efficient average-case performance and simple code. Typical use cases include: (1) Ordered set/map with insert, erase, find, lower_bound, predecessor/successor, and order statistics (kth, rank), all in expected O(log n). (2) Batch operations via split/merge, such as deleting all keys in a key range [L, R] with two splits and one merge, concatenating sequences, or extracting subranges. (3) Augmented trees: maintain aggregates like subtree sum, min, max, or custom monoids to answer range queries; combine with lazy propagation flags (e.g., reverse segment, add-to-segment) for array-like updates. (4) Implicit treap as a flexible array: insert at position, erase a range, reverse a subarray, rotate segments, and compute range sums or minima—often replacing more complex structures like splay trees or ropes. Prefer treaps when coding time is tight and worst-case adversarial inputs are unlikely or can be mitigated with strong randomization. If you need strict worst-case guarantees, or the environment is adversarial (e.g., predictable RNG), balanced deterministic trees (AVL/Red-Black) may be more appropriate. For purely static arrays with only range queries, a segment tree may be simpler; for hash-based unordered operations without order statistics, an unordered_map/set might suffice.
⚠️Common Mistakes
• Inconsistent split definition: Decide whether split(T, x) puts keys < x on the left and (\ge x) on the right, or (\le x) on the left and > x on the right, and use that consistently everywhere (insert, erase, range operations). Mixing conventions leads to lost or duplicated keys. • Forgetting to update metadata: Always recompute size, sum, min/max, etc., after structural changes (rotations, merges/splits). Missing updates cause incorrect ranks, kth results, or aggregates. • Neglecting lazy propagation: In implicit treaps with reverse or add tags, always push() before descending and pull() after changing children; otherwise, indices and aggregates drift from reality. • Mishandling duplicates: Decide on a policy—either store a count in each node (multiset) or inject a tiebreaker (e.g., a unique id) into the key; naive handling can violate the BST invariant. • Weak RNG or repeated seeds: Using rand() or reseeding per operation can produce patterns that degrade performance; prefer a single long-lived mt19937 seeded from a good source. • Memory pitfalls: Not deleting detached subtrees in split-based erases causes leaks; dangling pointers can occur if you reuse node pointers after merges/splits incorrectly. • Off-by-one in implicit indices: Be explicit about 0-based vs 1-based splits; segment [l, r] typically uses split at l and r+1. • Deep recursion: Although expected depth is O(log n), pathological inputs can cause deep recursion; consider tail recursion elimination or iterative wrappers if stack limits are tight.
Key Formulas
BST Invariant
Explanation: Every node’s left subtree contains only smaller keys and the right subtree only larger keys. This ensures in-order traversal yields sorted keys.
Max-Heap Invariant
Explanation: Parent priorities dominate children’s priorities (or the reverse for min-heap). This, together with the BST property, uniquely shapes the treap.
Harmonic Numbers
Explanation: Harmonic numbers approximate the natural log up to Euler–Mascheroni constant \(\). They appear in expected depth and path length of random BSTs/treaps.
Expected Depth of Key x
Explanation: If x is the k-th smallest key among n, its expected depth is the sum of two harmonic numbers minus 2. In plain terms, nodes are expected to be O(log n) levels below the root.
Expected Height
Explanation: The expected height of a treap with n nodes is proportional to log n. This underpins the expected O(log n) time for operations.
Subtree Size
Explanation: Maintaining subtree sizes lets you find the k-th smallest element and ranks in O(log n) time by comparing k with left sizes.
Merge Complexity
Explanation: Merge descends along a path defined by priorities, whose expected length equals the tree height. Thus expected time is logarithmic.
Subtree Sum
Explanation: For implicit treaps with range sums, keep an aggregate sum per node to answer queries and maintain it after updates.
Distinct Priorities
Explanation: Assuming real-valued random priorities prevents ties with probability 1, which simplifies reasoning about structure and uniqueness.
Implicit Index
Explanation: In implicit treaps, the in-order position of a node equals one plus the size of its left subtree, enabling splits by index.
Complexity Analysis
Code Examples
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Rotation-based treap that supports duplicates (multiset), 5 // rank (count of keys < x), kth smallest, lower_bound, insert, erase. 6 // Expected O(log n) per operation. 7 8 struct Node { 9 int key; // BST key 10 uint32_t pri; // heap priority (max-heap) 11 int cnt; // multiplicity of this key 12 int sz; // size = cnt + size(left) + size(right) 13 Node *l, *r; 14 Node(int k, uint32_t p): key(k), pri(p), cnt(1), sz(1), l(nullptr), r(nullptr) {} 15 }; 16 17 static mt19937 rng((uint32_t)chrono::steady_clock::now().time_since_epoch().count()); 18 19 int getsz(Node* t){ return t ? t->sz : 0; } 20 void pull(Node* t){ if(t) t->sz = t->cnt + getsz(t->l) + getsz(t->r); } 21 22 void rotateRight(Node* &t){ 23 Node* x = t->l; // x becomes new root of this subtree 24 t->l = x->r; // move x's right subtree to t's left 25 x->r = t; // t becomes right child of x 26 pull(t); pull(x); // update sizes bottom-up 27 t = x; // set new root 28 } 29 30 void rotateLeft(Node* &t){ 31 Node* x = t->r; // x becomes new root of this subtree 32 t->r = x->l; // move x's left subtree to t's right 33 x->l = t; // t becomes left child of x 34 pull(t); pull(x); // update sizes bottom-up 35 t = x; // set new root 36 } 37 38 void insert(Node* &t, int key){ 39 if(!t){ t = new Node(key, rng()); return; } 40 if(key == t->key){ 41 t->cnt++; pull(t); return; 42 } else if(key < t->key){ 43 insert(t->l, key); 44 if(t->l && t->l->pri > t->pri) rotateRight(t); 45 } else { 46 insert(t->r, key); 47 if(t->r && t->r->pri > t->pri) rotateLeft(t); 48 } 49 pull(t); 50 } 51 52 void erase(Node* &t, int key){ 53 if(!t) return; 54 if(key == t->key){ 55 if(t->cnt > 1){ t->cnt--; pull(t); return; } 56 // remove node: rotate a higher-priority child up until node has <=1 child 57 if(!t->l){ Node* tmp = t; t = t->r; delete tmp; return; } 58 if(!t->r){ Node* tmp = t; t = t->l; delete tmp; return; } 59 if(t->l->pri > t->r->pri){ rotateRight(t); erase(t->r, key); } 60 else { rotateLeft(t); erase(t->l, key); } 61 } else if(key < t->key){ 62 erase(t->l, key); 63 } else { 64 erase(t->r, key); 65 } 66 pull(t); 67 } 68 69 // count of keys strictly less than x 70 int rankOf(Node* t, int x){ 71 if(!t) return 0; 72 if(x <= t->key) return rankOf(t->l, x); 73 return getsz(t->l) + t->cnt + rankOf(t->r, x); 74 } 75 76 // kth smallest (1-indexed); assumes 1 <= k <= size 77 int kth(Node* t, int k){ 78 if(!t) throw runtime_error("k out of bounds"); 79 int left = getsz(t->l); 80 if(k <= left) return kth(t->l, k); 81 if(k <= left + t->cnt) return t->key; 82 return kth(t->r, k - left - t->cnt); 83 } 84 85 // lower_bound: smallest key >= x; returns pair(found?, value) 86 pair<bool,int> lower_bound(Node* t, int x){ 87 int ans = 0; bool ok = false; 88 while(t){ 89 if(t->key >= x){ ans = t->key; ok = true; t = t->l; } 90 else t = t->r; 91 } 92 return {ok, ans}; 93 } 94 95 // Simple demo 96 int main(){ 97 ios::sync_with_stdio(false); 98 cin.tie(nullptr); 99 100 Node* root = nullptr; 101 vector<int> a = {5, 1, 7, 3, 5, 9, 2, 5}; 102 for(int x : a) insert(root, x); 103 104 cout << "size = " << getsz(root) << "\n"; // 8 105 cout << "rankOf(5) = " << rankOf(root, 5) << "\n"; // count of keys < 5 106 cout << "kth(4) = " << kth(root, 4) << "\n"; // 4th smallest key 107 108 auto lb = lower_bound(root, 6); 109 cout << "lower_bound(6): "; 110 if(lb.first) cout << lb.second << "\n"; else cout << "none\n"; 111 112 erase(root, 5); // removes one occurrence of 5 113 cout << "size after erasing one 5 = " << getsz(root) << "\n"; 114 115 // Clean-up omitted for brevity (would require a post-order delete) 116 return 0; 117 } 118
This rotation-based treap stores multiplicities (cnt) to act like a multiset. It maintains subtree sizes to support rank and kth queries. Rotations ensure the heap property by priority while preserving BST order. Expected time for each operation is logarithmic due to the randomized shape.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // FHQ treap: operations implemented via split/merge only (no explicit rotations). 5 // Demonstrates insert, erase by key, and delete-by-range [L, R] using two splits. 6 7 struct Node { 8 int key; // BST key (assumed unique in this demo) 9 uint32_t pri; // heap priority (max-heap) 10 int sz; // subtree size 11 Node *l, *r; 12 Node(int k, uint32_t p): key(k), pri(p), sz(1), l(nullptr), r(nullptr) {} 13 }; 14 15 static mt19937 rng((uint32_t)chrono::steady_clock::now().time_since_epoch().count()); 16 17 int getsz(Node* t){ return t ? t->sz : 0; } 18 void pull(Node* t){ if(t) t->sz = 1 + getsz(t->l) + getsz(t->r); } 19 20 // Split by key: L has keys < x, R has keys >= x 21 void split(Node* t, int x, Node* &L, Node* &R){ 22 if(!t){ L = R = nullptr; return; } 23 if(t->key < x){ 24 split(t->r, x, t->r, R); 25 L = t; pull(L); 26 } else { 27 split(t->l, x, L, t->l); 28 R = t; pull(R); 29 } 30 } 31 32 // Merge assumes all keys in L < all keys in R 33 Node* merge(Node* L, Node* R){ 34 if(!L || !R) return L ? L : R; 35 if(L->pri > R->pri){ 36 L->r = merge(L->r, R); 37 pull(L); return L; 38 } else { 39 R->l = merge(L, R->l); 40 pull(R); return R; 41 } 42 } 43 44 Node* insert(Node* root, int key){ 45 Node *A, *B; split(root, key, A, B); 46 Node* m = new Node(key, rng()); 47 return merge(merge(A, m), B); 48 } 49 50 Node* eraseKey(Node* root, int key){ 51 Node *A, *B, *C; // A: < key, B: [key, +inf) 52 split(root, key, A, B); 53 split(B, key+1, B, C); // B: == key, C: > key 54 // delete all nodes in B (here keys are unique, so B is at most one node) 55 // In general, iterate and delete B subtree 56 // Freeing memory: 57 function<void(Node*)> del = [&](Node* t){ if(!t) return; del(t->l); del(t->r); delete t; }; 58 del(B); 59 return merge(A, C); 60 } 61 62 // Delete all keys in [L, R] 63 Node* eraseRange(Node* root, int Lb, int Rb){ 64 Node *A, *B, *C; // A: < Lb, B: [Lb, +inf) 65 split(root, Lb, A, B); 66 Node *M, *C2; // M: [Lb, Rb], C2: > Rb 67 split(B, Rb+1, M, C2); 68 function<void(Node*)> del = [&](Node* t){ if(!t) return; del(t->l); del(t->r); delete t; }; 69 del(M); 70 return merge(A, C2); 71 } 72 73 int main(){ 74 ios::sync_with_stdio(false); 75 cin.tie(nullptr); 76 77 Node* root = nullptr; 78 vector<int> keys = {10, 5, 20, 7, 15, 30, 3, 8}; 79 for(int k : keys) root = insert(root, k); 80 81 cout << "size = " << getsz(root) << "\n"; // 8 82 root = eraseKey(root, 7); 83 cout << "size after erase 7 = " << getsz(root) << "\n"; // 7 84 85 root = eraseRange(root, 8, 18); // remove keys in [8,18]: 8,10,15 86 cout << "size after erase [8,18] = " << getsz(root) << "\n"; // 4 87 88 // Clean-up omitted for brevity 89 return 0; 90 } 91
This FHQ treap uses split and merge exclusively. Insertion splits by key and merges the new node in between. Erasing a single key isolates it via two splits, deletes it, and merges the remaining parts. Range deletion [L, R] takes two splits to isolate the middle and deletes it. All operations take expected logarithmic time.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Implicit treap: indices are implicit via in-order positions. 5 // Supports: insert at position, erase range, reverse range, and range sum. 6 7 struct Node { 8 long long val; // value stored at this position 9 uint32_t pri; // random priority (max-heap) 10 int sz; // subtree size (number of elements) 11 long long sum; // subtree sum aggregate 12 bool rev; // lazy reverse flag 13 Node *l, *r; 14 Node(long long v, uint32_t p): val(v), pri(p), sz(1), sum(v), rev(false), l(nullptr), r(nullptr) {} 15 }; 16 17 static mt19937 rng((uint32_t)chrono::steady_clock::now().time_since_epoch().count()); 18 19 int getsz(Node* t){ return t ? t->sz : 0; } 20 long long getsum(Node* t){ return t ? t->sum : 0LL; } 21 22 void push(Node* t){ 23 if(t && t->rev){ 24 t->rev = false; 25 swap(t->l, t->r); 26 if(t->l) t->l->rev ^= 1; 27 if(t->r) t->r->rev ^= 1; 28 } 29 } 30 31 void pull(Node* t){ 32 if(!t) return; 33 t->sz = 1 + getsz(t->l) + getsz(t->r); 34 t->sum = t->val + getsum(t->l) + getsum(t->r); 35 } 36 37 // Split by index k: first k elements go to A, rest to B (0-based k) 38 void split(Node* t, int k, Node* &A, Node* &B){ 39 if(!t){ A = B = nullptr; return; } 40 push(t); 41 int left = getsz(t->l); 42 if(k <= left){ 43 split(t->l, k, A, t->l); 44 B = t; pull(B); 45 } else { 46 split(t->r, k - left - 1, t->r, B); 47 A = t; pull(A); 48 } 49 } 50 51 Node* merge(Node* A, Node* B){ 52 if(!A || !B) return A ? A : B; 53 if(A->pri > B->pri){ 54 push(A); 55 A->r = merge(A->r, B); 56 pull(A); return A; 57 } else { 58 push(B); 59 B->l = merge(A, B->l); 60 pull(B); return B; 61 } 62 } 63 64 // Insert value v at position pos (0-based) 65 Node* insertAt(Node* root, int pos, long long v){ 66 Node *A, *B; split(root, pos, A, B); 67 Node* m = new Node(v, rng()); 68 return merge(merge(A, m), B); 69 } 70 71 // Erase range [l, r] inclusive (0-based) 72 Node* eraseRange(Node* root, int l, int r){ 73 Node *A, *B, *C; // A: [0, l-1], B: [l, n-1] 74 split(root, l, A, B); 75 Node *M, *C2; // M: [l, r], C2: [r+1, n-1] 76 split(B, r - l + 1, M, C2); 77 function<void(Node*)> del = [&](Node* t){ if(!t) return; del(t->l); del(t->r); delete t; }; 78 del(M); 79 return merge(A, C2); 80 } 81 82 // Reverse range [l, r] 83 Node* reverseRange(Node* root, int l, int r){ 84 Node *A, *B, *C; split(root, l, A, B); 85 Node *M, *C2; split(B, r - l + 1, M, C2); 86 if(M) M->rev ^= 1; // toggle reverse flag 87 return merge(A, merge(M, C2)); 88 } 89 90 // Range sum query on [l, r] 91 long long rangeSum(Node* &root, int l, int r){ 92 Node *A, *B, *C; split(root, l, A, B); 93 Node *M, *C2; split(B, r - l + 1, M, C2); 94 long long ans = getsum(M); 95 root = merge(A, merge(M, C2)); // restore 96 return ans; 97 } 98 99 int main(){ 100 ios::sync_with_stdio(false); 101 cin.tie(nullptr); 102 103 Node* root = nullptr; 104 // Build array [1, 2, 3, 4, 5] 105 for(int i = 0; i < 5; ++i) root = insertAt(root, i, i+1); 106 107 cout << "sum[0,4] = " << rangeSum(root, 0, 4) << "\n"; // 15 108 root = reverseRange(root, 1, 3); // array becomes [1,4,3,2,5] 109 cout << "sum[1,3] = " << rangeSum(root, 1, 3) << "\n"; // 4+3+2 = 9 110 111 root = eraseRange(root, 2, 3); // erase positions 2..3 -> [1,4,5] 112 cout << "sum[0,2] = " << rangeSum(root, 0, 2) << "\n"; // 10 113 114 // Clean-up omitted for brevity 115 return 0; 116 } 117
This implicit treap treats in-order positions as indices. Splitting by index partitions the sequence. A reverse is implemented by toggling a lazy rev flag and pushing it when needed. Range sums are aggregates maintained via pull(). Range updates/queries decompose into O(1) splits/merges, giving expected logarithmic performance.