🗂️Data StructureIntermediate

Priority Queue (Heap)

Key Points

  • A priority queue returns the highest-priority element first and is efficiently implemented by a binary heap.
  • Binary heaps support O(1) top (peek) and O( n) push and pop by maintaining a complete binary tree with the heap property.
  • C++ std::priorit is a max-heap by default; use std:: or a custom comparator to make a min-heap.
  • Heaps are essential in Dijkstra’s shortest path, Prim’s MST, K-th largest/smallest problems, and merging K sorted lists.
  • Building a heap from an array can be done in O(n) time using heapify (build-heap).
  • A custom comparator defines complex orderings (e.g., by multiple fields or descending/ascending ties).
  • Standard C++ priorit lacks decrease-key; you can emulate it via lazy deletion or implement an indexed priority queue to support decreaseKey in O( n).

Prerequisites

  • Arrays and 0-based indexingHeaps are stored in arrays and rely on child/parent index arithmetic.
  • Big-O notationTo understand logarithmic time and overall algorithmic complexity.
  • Comparators and strict weak orderingCustom priority rules require correct comparator functions in C++.
  • C++ STL: priority_queue, vector, pair/tuplePractical heap usage and syntax depend on these containers and utilities.
  • Graphs and shortest paths (basic)Dijkstra/Prim require understanding of vertices, edges, and weights.
  • Sorting fundamentalsHeap sort and k-way merge build on comparison-based ordering.
  • Recursion vs iterationSifting can be implemented iteratively or recursively; understanding loops and invariants helps.
  • Template programming basicsCustom comparators and generic priority_queues use templates.

Detailed Explanation

Tap terms for definitions

01Overview

A priority queue is a data structure that always serves the element with the highest priority first, rather than strictly following insertion order. The most common and practical implementation is the binary heap, which stores elements in an array laid out as a complete binary tree and maintains the heap property: each parent node’s key is not greater than (for min-heap) or not less than (for max-heap) its children’s keys. This structure lets us access the top element in constant time and insert or remove elements in logarithmic time by sifting them up or down the tree to restore the heap property. In C++, std::priority_queue is a standard container adaptor that uses a heap under the hood and is a max-heap by default, but can be switched to a min-heap via std::greater or a custom comparator. Heaps power many efficient algorithms: Dijkstra’s shortest path and Prim’s minimum spanning tree manage frontier edges by priority; streaming top-k queries maintain a small heap; and multiway merging uses a heap to repeatedly pick the smallest next item. Heaps can also be used to sort data (heap sort) by repeatedly extracting the maximum/minimum. While binary heaps are ubiquitous due to their simplicity and strong performance, variants such as d-ary heaps, pairing heaps, or Fibonacci heaps exist to optimize specific operations like decrease-key or to improve constant factors in practice.

02Intuition & Analogies

Imagine an emergency room with patients arriving at different times but treated by urgency, not order of arrival. The nurse always wants to see, at a glance, the most urgent case (priority), quickly admit them, and then quickly restore the waiting list so that the next most urgent case is on top. A binary heap is like keeping patients in a cleverly arranged waiting room: it’s laid out in rows so the room is always full from front to back (complete tree), and each seated patient never has a more urgent patient sitting behind them in the same narrow aisle (heap property). When a new urgent case arrives, we seat them at the back (end of the array) and let them “bubble up” by swapping seats with less urgent neighbors until they reach their rightful position. When we treat the most urgent patient (the one at the front/top), we move the last-seated patient to that empty seat and let them “sink down,” swapping with the more urgent of their children until order is restored. These local swaps are like quick seat adjustments that keep the whole room ordered with minimal fuss. The beauty is that you can always point to the most urgent patient instantly (top in O(1)), and fixing the seating after arrivals or departures takes time proportional to the number of rows (the tree height), which grows like the logarithm of the total number of patients. If you need to slightly adjust a patient’s urgency (decrease-key), the ideal is to move just that person up; standard C++ priority_queue doesn’t do this directly, but a specialized indexed priority queue tracks everyone’s seat and helps move them efficiently.

03Formal Definition

A binary heap is a complete binary tree along with a key for each node such that the heap property holds: for a min-heap, each node’s key is less than or equal to the keys of its children; for a max-heap, each node’s key is greater than or equal to the keys of its children. Because it is complete, a binary heap of n elements has height h = n . It is stored compactly in an array A where the node at index i has children at indices 2i+1 and 2i+2 and a parent at (i-1)/2 (0-based indexing). The primary operations are: top (return A[0]) in O(1) time; push (insert a new element by appending at the end and sifting up) in O( n); and pop (remove top by swapping A[0] with A[n-1], shrinking, and sifting down) in O( n). The build-heap (heapify) operation transforms an arbitrary array into a heap in O(n) time by sifting down internal nodes from bottom-up. In C++, std::priority_, Container, these operations where Compare defines ordering; by default makes a max-heap. While binary heaps do not support decrease-key efficiently without extra machinery, an indexed priority queue augments the heap with position maps so that decreaseKey(i, newKey) takes O( n) by sifting the affected element toward the root.

04When to Use

Use a heap-backed priority queue whenever you must repeatedly access and remove the highest (or lowest) priority element and you can tolerate O(\log n) insertion and removal. Classic applications include graph algorithms like Dijkstra’s single-source shortest paths and Prim’s minimum spanning tree, where the frontier of candidates is always queried for the minimum tentative distance or edge weight; multiway merging of K sorted lists or streams, where the next output element is the minimum head among K candidates; and top-k or kth-element problems in streaming data, where a bounded-size heap tracks the current best items with small memory. Heaps also serve in event-driven simulations (scheduled by timestamps), in scheduling tasks (prioritized by deadlines or weights), and to implement priority-based caches. If your application needs change-priority (decrease-key) frequently, either use an indexed priority queue (binary heap with an index map) or consider a different heap variant. If you need fast membership tests or arbitrary deletions by value, combine a heap with a hash map (to enable lazy deletion) or choose a balanced BST like std::multiset. For pure sorting of static data, heap sort is predictable O(n \log n) time and O(1) extra space in-place; however, in practice, std::sort (introsort) is typically faster due to lower constants and cache behavior.

⚠️Common Mistakes

• Using the wrong heap direction in C++: std::priority_queue is a max-heap by default. To get a min-heap of int, declare priority_queue<int, vector<int>, greater<int>> pq; forgetting greater<> flips your logic. • Confusing pair ordering: priority_queue<pair<int,int>, vector<pair<int,int>>, greater<pair<int,int>>> is a min-heap by first, then second. If you need a different tie-break, write a custom comparator. • Expecting decrease-key in std::priority_queue: It doesn’t exist. Either push a new pair (lazy decrease) and ignore stale entries when popped, or implement an indexed priority queue to support decreaseKey in O(\log n). • Misusing comparator semantics: The comparator should impose a strict weak ordering. Don’t capture mutable state that breaks transitivity; and remember that with std::greater<T>, top() becomes the minimum element. • Assuming pop() returns a value: In C++, pop() returns void. You must fetch top() first, then pop(). • Forgetting build-heap is O(n): Repeatedly pushing n elements individually yields O(n \log n); heapify the whole array when possible to build faster. • Off-by-one index errors in custom heaps: With 0-based arrays, children are at 2i+1 and 2i+2; parent is (i-1)/2. Mixing 1-based and 0-based formulas breaks sifting. • Using a heap for operations it’s not suited for: Arbitrary removal, membership checks, or ordered iteration are not efficient with a plain heap—use a tree (multiset) if you need those.

Key Formulas

Heap Height

Explanation: A binary heap with n elements has height proportional to the base-2 logarithm of n. This bounds the number of swaps during sifting operations.

Array Index Mapping (0-based)

Explanation: These formulas map the array index of a node to its children and parent in a binary heap stored in an array. They enable O(1) navigation without pointers.

Core Operation Costs

Explanation: Insert and remove operations traverse up or down the heap height, yielding logarithmic time, while accessing the top element is constant time.

Linear-Time Heapify

Explanation: Building a heap bottom-up is linear time because lower levels require less work per node and there are many more nodes near the leaves.

Heap Sort Complexity

Explanation: Heapsort performs n extract operations at O( n) each after a linear heapify, and it can be done in-place with constant extra memory.

Dijkstra with Binary Heap

Explanation: Using a binary heap, each edge relaxation may push into the heap and each vertex is popped once, leading to logarithmic factors times edges and vertices.

Merge K Sorted Lists

Explanation: Merging K sorted sequences with total N items uses a min-heap of size K to always pick the next smallest head, costing log K per output.

Top-k via Bounded Heap

Explanation: Maintaining a min-heap of size k while scanning n elements yields logarithmic-in-k costs per insertion, efficient for k n.

Build-Heap Work Summation

Explanation: A classic summation shows why sifting down nodes across levels adds up to linear total work when building a heap.

Complexity Analysis

In a binary heap, the array representation gives O(1) access to the root (top). The height h of a complete binary tree with n elements is h = ⌊log2 n⌋, so any path from a leaf to the root has logarithmic length. Insertion (push) appends the new key at the end and performs sift-up: at each step, we compare with the parent and swap if the heap property is violated. The number of swaps is at most the height, giving O(log n) time and O(1) extra space. Removal (pop) swaps the root with the last element, shrinks the array, then sifts the new root down by repeatedly picking the better child and swapping, again taking O(log n) time and O(1) extra space. Accessing the top is O(1) since it is the first array element. Building a heap from an unsorted array can be done in O(n) time by sifting down all internal nodes from the bottom up (heapify). Although an individual sift-down can cost O(log n), most nodes are near the leaves and move only a short distance; summing the work per level yields linear total time. Heapsort performs a build-heap in O(n) followed by n extract-max/min steps, each O(log n), for O(n log n) time and O(1) extra space when sorting in-place. In algorithmic applications, Dijkstra’s algorithm with a binary heap runs in O((n + m) log n) for n vertices and m edges. K-way merging and top-k selection benefit from heaps with complexities O(N log K) and O(n log k), respectively. Standard C++ priorit lacks decrease-key; using lazy deletion may incur extra memory and up to logarithmic overhead per stale item processed. An indexed priority queue augments the heap with position maps, supporting decreaseKey and delete in O(log n) while using O(n) additional memory for the maps.

Code Examples

Basics: Max-heap, Min-heap, and Custom Comparator
1#include <bits/stdc++.h>
2using namespace std;
3
4struct Task {
5 int priority; // smaller number = higher priority
6 int id; // tie-breaker
7};
8
9// Custom comparator for a min-heap of Task by (priority, id)
10struct TaskCmp {
11 bool operator()(const Task &a, const Task &b) const {
12 // priority_queue puts the element considered "largest" at top.
13 // Returning true means a should come after b.
14 // To make a min-heap by (priority, id), we return true if a > b lexicographically.
15 if (a.priority != b.priority) return a.priority > b.priority; // smaller priority is better
16 return a.id > b.id; // tie-break by smaller id first
17 }
18};
19
20int main() {
21 // 1) Default max-heap of ints
22 priority_queue<int> maxpq; // top() is the largest
23 for (int x : {5, 1, 7, 3}) maxpq.push(x);
24 cout << "Max-heap order: ";
25 while (!maxpq.empty()) {
26 cout << maxpq.top() << ' ';
27 maxpq.pop();
28 }
29 cout << "\n";
30
31 // 2) Min-heap of ints using greater<int>
32 priority_queue<int, vector<int>, greater<int>> minpq; // top() is the smallest
33 for (int x : {5, 1, 7, 3}) minpq.push(x);
34 cout << "Min-heap order: ";
35 while (!minpq.empty()) {
36 cout << minpq.top() << ' ';
37 minpq.pop();
38 }
39 cout << "\n";
40
41 // 3) Min-heap of custom tasks with tie-breaks
42 priority_queue<Task, vector<Task>, TaskCmp> tasks;
43 tasks.push({3, 42});
44 tasks.push({1, 10});
45 tasks.push({1, 5});
46 tasks.push({2, 7});
47
48 cout << "Tasks by (priority,id): ";
49 while (!tasks.empty()) {
50 auto t = tasks.top(); tasks.pop();
51 cout << '(' << t.priority << ',' << t.id << ") ";
52 }
53 cout << "\n";
54
55 return 0;
56}
57

Shows the default max-heap behavior, converts it to a min-heap using std::greater<int>, and demonstrates a custom comparator that orders a struct by priority then id. The comparator returns true when the left element should come after the right, which is how priority_queue interprets ordering.

Time: O(n log n) to pop all after pushing n itemsSpace: O(n)
Dijkstra’s Algorithm using a Min-heap (lazy decrease-key)
1#include <bits/stdc++.h>
2using namespace std;
3
4// Dijkstra with non-negative edge weights.
5// Uses a min-heap of pairs (dist, node). Lazy deletion ignores stale entries.
6
7int main() {
8 ios::sync_with_stdio(false);
9 cin.tie(nullptr);
10
11 int n, m, s; // nodes [0..n-1], edges, source
12 cin >> n >> m >> s;
13 vector<vector<pair<int,int>>> g(n); // (neighbor, weight)
14 for (int i = 0; i < m; ++i) {
15 int u, v, w; cin >> u >> v >> w;
16 g[u].push_back({v, w});
17 g[v].push_back({u, w}); // comment out if directed
18 }
19
20 const long long INF = (1LL<<60);
21 vector<long long> dist(n, INF);
22 dist[s] = 0;
23
24 using P = pair<long long,int>;
25 priority_queue<P, vector<P>, greater<P>> pq; // min-heap by distance
26 pq.push({0, s});
27
28 while (!pq.empty()) {
29 auto [du, u] = pq.top(); pq.pop();
30 if (du != dist[u]) continue; // stale entry (lazy deletion)
31 for (auto [v, w] : g[u]) {
32 if (dist[v] > dist[u] + w) {
33 dist[v] = dist[u] + w;
34 pq.push({dist[v], v}); // no decrease-key: push new better pair
35 }
36 }
37 }
38
39 for (int i = 0; i < n; ++i) {
40 cout << (dist[i] == INF ? -1 : dist[i]) << (i+1==n?'\n':' ');
41 }
42 return 0;
43}
44

Maintains tentative distances in a min-heap prioritized by the smallest distance. Since std::priority_queue lacks decrease-key, we push an updated (dist, node) pair whenever we find a better distance and skip entries that do not match the current best dist (lazy deletion).

Time: O((n + m) log n)Space: O(n + m)
Merge K Sorted Arrays with a Min-heap
1#include <bits/stdc++.h>
2using namespace std;
3
4struct Node {
5 int val; // current value
6 int i; // which array
7 int j; // index within that array
8};
9
10struct Cmp {
11 bool operator()(const Node &a, const Node &b) const {
12 return a.val > b.val; // min-heap by value
13 }
14};
15
16int main() {
17 ios::sync_with_stdio(false);
18 cin.tie(nullptr);
19
20 int K; cin >> K;
21 vector<vector<int>> a(K);
22 for (int i = 0; i < K; ++i) {
23 int len; cin >> len;
24 a[i].resize(len);
25 for (int j = 0; j < len; ++j) cin >> a[i][j];
26 }
27
28 priority_queue<Node, vector<Node>, Cmp> pq;
29 for (int i = 0; i < K; ++i) if (!a[i].empty()) pq.push({a[i][0], i, 0});
30
31 vector<int> merged;
32 while (!pq.empty()) {
33 Node cur = pq.top(); pq.pop();
34 merged.push_back(cur.val);
35 int ni = cur.i, nj = cur.j + 1;
36 if (nj < (int)a[ni].size()) {
37 pq.push({a[ni][nj], ni, nj});
38 }
39 }
40
41 for (int x : merged) cout << x << ' ';
42 cout << '\n';
43 return 0;
44}
45

Push the first element of each sorted array into a min-heap, then repeatedly extract the smallest and push the next element from that array. The heap size stays at most K, giving an efficient O(N log K) total where N is the sum of lengths.

Time: O(N log K)Space: O(K)
Indexed Priority Queue supporting decreaseKey in O(log n)
1#include <bits/stdc++.h>
2using namespace std;
3
4// Indexed Min-Priority Queue for keys associated with fixed indices [0..N-1].
5// Supports: push(i, key), decreaseKey(i, newKey), contains(i), topIndex(), topKey(), pop().
6// All heap operations are O(log n). This is handy for Dijkstra/Prim without lazy deletion.
7
8struct IndexedMinPQ {
9 int N; // maximum index
10 vector<long long> key; // key[i] is the current key for index i
11 vector<int> pq; // heap of indices: pq[pos] = index at heap position pos
12 vector<int> pos; // pos[i] = position of index i in heap, or -1 if not in heap
13
14 IndexedMinPQ(int n) : N(n), key(n, LLONG_MAX), pos(n, -1) {}
15
16 bool contains(int i) const { return pos[i] != -1; }
17 bool empty() const { return pq.empty(); }
18
19 bool lessIdx(int a, int b) const { return key[a] < key[b]; }
20
21 void swapPos(int i, int j) {
22 swap(pq[i], pq[j]);
23 pos[pq[i]] = i; pos[pq[j]] = j;
24 }
25
26 void siftUp(int i) {
27 while (i > 0) {
28 int p = (i - 1) / 2;
29 if (!lessIdx(pq[i], pq[p])) break;
30 swapPos(i, p);
31 i = p;
32 }
33 }
34
35 void siftDown(int i) {
36 int n = pq.size();
37 while (true) {
38 int l = 2*i + 1, r = 2*i + 2, best = i;
39 if (l < n && lessIdx(pq[l], pq[best])) best = l;
40 if (r < n && lessIdx(pq[r], pq[best])) best = r;
41 if (best == i) break;
42 swapPos(i, best);
43 i = best;
44 }
45 }
46
47 void push(int i, long long k) {
48 if (contains(i)) {
49 if (k < key[i]) { key[i] = k; siftUp(pos[i]); }
50 else { key[i] = k; siftDown(pos[i]); }
51 return;
52 }
53 key[i] = k;
54 pq.push_back(i);
55 pos[i] = (int)pq.size() - 1;
56 siftUp(pos[i]);
57 }
58
59 void decreaseKey(int i, long long nk) {
60 if (!contains(i) || nk >= key[i]) return; // no-op if not smaller
61 key[i] = nk;
62 siftUp(pos[i]);
63 }
64
65 int topIndex() const { return pq.empty() ? -1 : pq[0]; }
66 long long topKey() const { return pq.empty() ? LLONG_MAX : key[pq[0]]; }
67
68 pair<int,long long> pop() {
69 int i = pq[0]; long long k = key[i];
70 int last = pq.back(); pq.pop_back();
71 pos[i] = -1;
72 if (!pq.empty()) {
73 pq[0] = last; pos[last] = 0; siftDown(0);
74 }
75 return {i, k};
76 }
77};
78
79int main() {
80 // Example: manage keys for indices 0..5
81 IndexedMinPQ ipq(6);
82 ipq.push(3, 50);
83 ipq.push(1, 30);
84 ipq.push(5, 40);
85 ipq.decreaseKey(3, 10); // index 3 becomes best
86 ipq.push(2, 20);
87
88 while (!ipq.empty()) {
89 auto [i, k] = ipq.pop();
90 cout << "(idx=" << i << ", key=" << k << ") ";
91 }
92 cout << "\n";
93 return 0;
94}
95

Implements a classic indexed min-heap by storing a heap of indices and two maps: key[i] gives the priority, and pos[i] gives the heap position for index i. decreaseKey updates key[i] and sifts it upward in O(log n). This supports algorithms needing true decrease-key semantics.

Time: O(log n) per push/pop/decreaseKeySpace: O(n)