Priority Queue (Heap)
Key Points
- •A priority queue returns the highest-priority element first and is efficiently implemented by a binary heap.
- •Binary heaps support O(1) top (peek) and O( n) push and pop by maintaining a complete binary tree with the heap property.
- •C++ std::priorit is a max-heap by default; use std:: or a custom comparator to make a min-heap.
- •Heaps are essential in Dijkstra’s shortest path, Prim’s MST, K-th largest/smallest problems, and merging K sorted lists.
- •Building a heap from an array can be done in O(n) time using heapify (build-heap).
- •A custom comparator defines complex orderings (e.g., by multiple fields or descending/ascending ties).
- •Standard C++ priorit lacks decrease-key; you can emulate it via lazy deletion or implement an indexed priority queue to support decreaseKey in O( n).
Prerequisites
- →Arrays and 0-based indexing — Heaps are stored in arrays and rely on child/parent index arithmetic.
- →Big-O notation — To understand logarithmic time and overall algorithmic complexity.
- →Comparators and strict weak ordering — Custom priority rules require correct comparator functions in C++.
- →C++ STL: priority_queue, vector, pair/tuple — Practical heap usage and syntax depend on these containers and utilities.
- →Graphs and shortest paths (basic) — Dijkstra/Prim require understanding of vertices, edges, and weights.
- →Sorting fundamentals — Heap sort and k-way merge build on comparison-based ordering.
- →Recursion vs iteration — Sifting can be implemented iteratively or recursively; understanding loops and invariants helps.
- →Template programming basics — Custom comparators and generic priority_queues use templates.
Detailed Explanation
Tap terms for definitions01Overview
A priority queue is a data structure that always serves the element with the highest priority first, rather than strictly following insertion order. The most common and practical implementation is the binary heap, which stores elements in an array laid out as a complete binary tree and maintains the heap property: each parent node’s key is not greater than (for min-heap) or not less than (for max-heap) its children’s keys. This structure lets us access the top element in constant time and insert or remove elements in logarithmic time by sifting them up or down the tree to restore the heap property. In C++, std::priority_queue is a standard container adaptor that uses a heap under the hood and is a max-heap by default, but can be switched to a min-heap via std::greater or a custom comparator. Heaps power many efficient algorithms: Dijkstra’s shortest path and Prim’s minimum spanning tree manage frontier edges by priority; streaming top-k queries maintain a small heap; and multiway merging uses a heap to repeatedly pick the smallest next item. Heaps can also be used to sort data (heap sort) by repeatedly extracting the maximum/minimum. While binary heaps are ubiquitous due to their simplicity and strong performance, variants such as d-ary heaps, pairing heaps, or Fibonacci heaps exist to optimize specific operations like decrease-key or to improve constant factors in practice.
02Intuition & Analogies
Imagine an emergency room with patients arriving at different times but treated by urgency, not order of arrival. The nurse always wants to see, at a glance, the most urgent case (priority), quickly admit them, and then quickly restore the waiting list so that the next most urgent case is on top. A binary heap is like keeping patients in a cleverly arranged waiting room: it’s laid out in rows so the room is always full from front to back (complete tree), and each seated patient never has a more urgent patient sitting behind them in the same narrow aisle (heap property). When a new urgent case arrives, we seat them at the back (end of the array) and let them “bubble up” by swapping seats with less urgent neighbors until they reach their rightful position. When we treat the most urgent patient (the one at the front/top), we move the last-seated patient to that empty seat and let them “sink down,” swapping with the more urgent of their children until order is restored. These local swaps are like quick seat adjustments that keep the whole room ordered with minimal fuss. The beauty is that you can always point to the most urgent patient instantly (top in O(1)), and fixing the seating after arrivals or departures takes time proportional to the number of rows (the tree height), which grows like the logarithm of the total number of patients. If you need to slightly adjust a patient’s urgency (decrease-key), the ideal is to move just that person up; standard C++ priority_queue doesn’t do this directly, but a specialized indexed priority queue tracks everyone’s seat and helps move them efficiently.
03Formal Definition
04When to Use
Use a heap-backed priority queue whenever you must repeatedly access and remove the highest (or lowest) priority element and you can tolerate O(\log n) insertion and removal. Classic applications include graph algorithms like Dijkstra’s single-source shortest paths and Prim’s minimum spanning tree, where the frontier of candidates is always queried for the minimum tentative distance or edge weight; multiway merging of K sorted lists or streams, where the next output element is the minimum head among K candidates; and top-k or kth-element problems in streaming data, where a bounded-size heap tracks the current best items with small memory. Heaps also serve in event-driven simulations (scheduled by timestamps), in scheduling tasks (prioritized by deadlines or weights), and to implement priority-based caches. If your application needs change-priority (decrease-key) frequently, either use an indexed priority queue (binary heap with an index map) or consider a different heap variant. If you need fast membership tests or arbitrary deletions by value, combine a heap with a hash map (to enable lazy deletion) or choose a balanced BST like std::multiset. For pure sorting of static data, heap sort is predictable O(n \log n) time and O(1) extra space in-place; however, in practice, std::sort (introsort) is typically faster due to lower constants and cache behavior.
⚠️Common Mistakes
• Using the wrong heap direction in C++: std::priority_queue is a max-heap by default. To get a min-heap of int, declare priority_queue<int, vector<int>, greater<int>> pq; forgetting greater<> flips your logic. • Confusing pair ordering: priority_queue<pair<int,int>, vector<pair<int,int>>, greater<pair<int,int>>> is a min-heap by first, then second. If you need a different tie-break, write a custom comparator. • Expecting decrease-key in std::priority_queue: It doesn’t exist. Either push a new pair (lazy decrease) and ignore stale entries when popped, or implement an indexed priority queue to support decreaseKey in O(\log n). • Misusing comparator semantics: The comparator should impose a strict weak ordering. Don’t capture mutable state that breaks transitivity; and remember that with std::greater<T>, top() becomes the minimum element. • Assuming pop() returns a value: In C++, pop() returns void. You must fetch top() first, then pop(). • Forgetting build-heap is O(n): Repeatedly pushing n elements individually yields O(n \log n); heapify the whole array when possible to build faster. • Off-by-one index errors in custom heaps: With 0-based arrays, children are at 2i+1 and 2i+2; parent is (i-1)/2. Mixing 1-based and 0-based formulas breaks sifting. • Using a heap for operations it’s not suited for: Arbitrary removal, membership checks, or ordered iteration are not efficient with a plain heap—use a tree (multiset) if you need those.
Key Formulas
Heap Height
Explanation: A binary heap with n elements has height proportional to the base-2 logarithm of n. This bounds the number of swaps during sifting operations.
Array Index Mapping (0-based)
Explanation: These formulas map the array index of a node to its children and parent in a binary heap stored in an array. They enable O(1) navigation without pointers.
Core Operation Costs
Explanation: Insert and remove operations traverse up or down the heap height, yielding logarithmic time, while accessing the top element is constant time.
Linear-Time Heapify
Explanation: Building a heap bottom-up is linear time because lower levels require less work per node and there are many more nodes near the leaves.
Heap Sort Complexity
Explanation: Heapsort performs n extract operations at O( n) each after a linear heapify, and it can be done in-place with constant extra memory.
Dijkstra with Binary Heap
Explanation: Using a binary heap, each edge relaxation may push into the heap and each vertex is popped once, leading to logarithmic factors times edges and vertices.
Merge K Sorted Lists
Explanation: Merging K sorted sequences with total N items uses a min-heap of size K to always pick the next smallest head, costing log K per output.
Top-k via Bounded Heap
Explanation: Maintaining a min-heap of size k while scanning n elements yields logarithmic-in-k costs per insertion, efficient for k n.
Build-Heap Work Summation
Explanation: A classic summation shows why sifting down nodes across levels adds up to linear total work when building a heap.
Complexity Analysis
Code Examples
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 struct Task { 5 int priority; // smaller number = higher priority 6 int id; // tie-breaker 7 }; 8 9 // Custom comparator for a min-heap of Task by (priority, id) 10 struct TaskCmp { 11 bool operator()(const Task &a, const Task &b) const { 12 // priority_queue puts the element considered "largest" at top. 13 // Returning true means a should come after b. 14 // To make a min-heap by (priority, id), we return true if a > b lexicographically. 15 if (a.priority != b.priority) return a.priority > b.priority; // smaller priority is better 16 return a.id > b.id; // tie-break by smaller id first 17 } 18 }; 19 20 int main() { 21 // 1) Default max-heap of ints 22 priority_queue<int> maxpq; // top() is the largest 23 for (int x : {5, 1, 7, 3}) maxpq.push(x); 24 cout << "Max-heap order: "; 25 while (!maxpq.empty()) { 26 cout << maxpq.top() << ' '; 27 maxpq.pop(); 28 } 29 cout << "\n"; 30 31 // 2) Min-heap of ints using greater<int> 32 priority_queue<int, vector<int>, greater<int>> minpq; // top() is the smallest 33 for (int x : {5, 1, 7, 3}) minpq.push(x); 34 cout << "Min-heap order: "; 35 while (!minpq.empty()) { 36 cout << minpq.top() << ' '; 37 minpq.pop(); 38 } 39 cout << "\n"; 40 41 // 3) Min-heap of custom tasks with tie-breaks 42 priority_queue<Task, vector<Task>, TaskCmp> tasks; 43 tasks.push({3, 42}); 44 tasks.push({1, 10}); 45 tasks.push({1, 5}); 46 tasks.push({2, 7}); 47 48 cout << "Tasks by (priority,id): "; 49 while (!tasks.empty()) { 50 auto t = tasks.top(); tasks.pop(); 51 cout << '(' << t.priority << ',' << t.id << ") "; 52 } 53 cout << "\n"; 54 55 return 0; 56 } 57
Shows the default max-heap behavior, converts it to a min-heap using std::greater<int>, and demonstrates a custom comparator that orders a struct by priority then id. The comparator returns true when the left element should come after the right, which is how priority_queue interprets ordering.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Dijkstra with non-negative edge weights. 5 // Uses a min-heap of pairs (dist, node). Lazy deletion ignores stale entries. 6 7 int main() { 8 ios::sync_with_stdio(false); 9 cin.tie(nullptr); 10 11 int n, m, s; // nodes [0..n-1], edges, source 12 cin >> n >> m >> s; 13 vector<vector<pair<int,int>>> g(n); // (neighbor, weight) 14 for (int i = 0; i < m; ++i) { 15 int u, v, w; cin >> u >> v >> w; 16 g[u].push_back({v, w}); 17 g[v].push_back({u, w}); // comment out if directed 18 } 19 20 const long long INF = (1LL<<60); 21 vector<long long> dist(n, INF); 22 dist[s] = 0; 23 24 using P = pair<long long,int>; 25 priority_queue<P, vector<P>, greater<P>> pq; // min-heap by distance 26 pq.push({0, s}); 27 28 while (!pq.empty()) { 29 auto [du, u] = pq.top(); pq.pop(); 30 if (du != dist[u]) continue; // stale entry (lazy deletion) 31 for (auto [v, w] : g[u]) { 32 if (dist[v] > dist[u] + w) { 33 dist[v] = dist[u] + w; 34 pq.push({dist[v], v}); // no decrease-key: push new better pair 35 } 36 } 37 } 38 39 for (int i = 0; i < n; ++i) { 40 cout << (dist[i] == INF ? -1 : dist[i]) << (i+1==n?'\n':' '); 41 } 42 return 0; 43 } 44
Maintains tentative distances in a min-heap prioritized by the smallest distance. Since std::priority_queue lacks decrease-key, we push an updated (dist, node) pair whenever we find a better distance and skip entries that do not match the current best dist (lazy deletion).
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 struct Node { 5 int val; // current value 6 int i; // which array 7 int j; // index within that array 8 }; 9 10 struct Cmp { 11 bool operator()(const Node &a, const Node &b) const { 12 return a.val > b.val; // min-heap by value 13 } 14 }; 15 16 int main() { 17 ios::sync_with_stdio(false); 18 cin.tie(nullptr); 19 20 int K; cin >> K; 21 vector<vector<int>> a(K); 22 for (int i = 0; i < K; ++i) { 23 int len; cin >> len; 24 a[i].resize(len); 25 for (int j = 0; j < len; ++j) cin >> a[i][j]; 26 } 27 28 priority_queue<Node, vector<Node>, Cmp> pq; 29 for (int i = 0; i < K; ++i) if (!a[i].empty()) pq.push({a[i][0], i, 0}); 30 31 vector<int> merged; 32 while (!pq.empty()) { 33 Node cur = pq.top(); pq.pop(); 34 merged.push_back(cur.val); 35 int ni = cur.i, nj = cur.j + 1; 36 if (nj < (int)a[ni].size()) { 37 pq.push({a[ni][nj], ni, nj}); 38 } 39 } 40 41 for (int x : merged) cout << x << ' '; 42 cout << '\n'; 43 return 0; 44 } 45
Push the first element of each sorted array into a min-heap, then repeatedly extract the smallest and push the next element from that array. The heap size stays at most K, giving an efficient O(N log K) total where N is the sum of lengths.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Indexed Min-Priority Queue for keys associated with fixed indices [0..N-1]. 5 // Supports: push(i, key), decreaseKey(i, newKey), contains(i), topIndex(), topKey(), pop(). 6 // All heap operations are O(log n). This is handy for Dijkstra/Prim without lazy deletion. 7 8 struct IndexedMinPQ { 9 int N; // maximum index 10 vector<long long> key; // key[i] is the current key for index i 11 vector<int> pq; // heap of indices: pq[pos] = index at heap position pos 12 vector<int> pos; // pos[i] = position of index i in heap, or -1 if not in heap 13 14 IndexedMinPQ(int n) : N(n), key(n, LLONG_MAX), pos(n, -1) {} 15 16 bool contains(int i) const { return pos[i] != -1; } 17 bool empty() const { return pq.empty(); } 18 19 bool lessIdx(int a, int b) const { return key[a] < key[b]; } 20 21 void swapPos(int i, int j) { 22 swap(pq[i], pq[j]); 23 pos[pq[i]] = i; pos[pq[j]] = j; 24 } 25 26 void siftUp(int i) { 27 while (i > 0) { 28 int p = (i - 1) / 2; 29 if (!lessIdx(pq[i], pq[p])) break; 30 swapPos(i, p); 31 i = p; 32 } 33 } 34 35 void siftDown(int i) { 36 int n = pq.size(); 37 while (true) { 38 int l = 2*i + 1, r = 2*i + 2, best = i; 39 if (l < n && lessIdx(pq[l], pq[best])) best = l; 40 if (r < n && lessIdx(pq[r], pq[best])) best = r; 41 if (best == i) break; 42 swapPos(i, best); 43 i = best; 44 } 45 } 46 47 void push(int i, long long k) { 48 if (contains(i)) { 49 if (k < key[i]) { key[i] = k; siftUp(pos[i]); } 50 else { key[i] = k; siftDown(pos[i]); } 51 return; 52 } 53 key[i] = k; 54 pq.push_back(i); 55 pos[i] = (int)pq.size() - 1; 56 siftUp(pos[i]); 57 } 58 59 void decreaseKey(int i, long long nk) { 60 if (!contains(i) || nk >= key[i]) return; // no-op if not smaller 61 key[i] = nk; 62 siftUp(pos[i]); 63 } 64 65 int topIndex() const { return pq.empty() ? -1 : pq[0]; } 66 long long topKey() const { return pq.empty() ? LLONG_MAX : key[pq[0]]; } 67 68 pair<int,long long> pop() { 69 int i = pq[0]; long long k = key[i]; 70 int last = pq.back(); pq.pop_back(); 71 pos[i] = -1; 72 if (!pq.empty()) { 73 pq[0] = last; pos[last] = 0; siftDown(0); 74 } 75 return {i, k}; 76 } 77 }; 78 79 int main() { 80 // Example: manage keys for indices 0..5 81 IndexedMinPQ ipq(6); 82 ipq.push(3, 50); 83 ipq.push(1, 30); 84 ipq.push(5, 40); 85 ipq.decreaseKey(3, 10); // index 3 becomes best 86 ipq.push(2, 20); 87 88 while (!ipq.empty()) { 89 auto [i, k] = ipq.pop(); 90 cout << "(idx=" << i << ", key=" << k << ") "; 91 } 92 cout << "\n"; 93 return 0; 94 } 95
Implements a classic indexed min-heap by storing a heap of indices and two maps: key[i] gives the priority, and pos[i] gives the heap position for index i. decreaseKey updates key[i] and sifts it upward in O(log n). This supports algorithms needing true decrease-key semantics.