Binary Trie for XOR
Key Points
- •A binary trie (also called a bitwise trie) stores numbers by their binary bits, branching on at each level.
- •Insert, delete, and query operations run in O(B) time, where )) + 1; this is usually written as O(log MAX).
- •To maximize XOR with a query x, greedily follow the opposite bit at each level if it exists, starting from the most significant bit.
- •Store a count at each node so duplicates are supported and deletions are safe without losing shared paths.
- •A persistent binary trie keeps an immutable version for every prefix of an array, enabling range queries like range maximum XOR in O(B).
- •Use unsigned types and a correct bit-length (e.g., 31 for 32-bit, 60 for 64-bit) to avoid sign and off-by-one errors.
- •You can also count how many stored numbers satisfy (num XOR x) < K in O(B) using counts while traversing.
- •Memory usage is linear in the number of inserted elements times the bit length; persistent versions multiply this by the number of versions.
Prerequisites
- →Bitwise operations (AND, OR, XOR, shifts) — All logic in a binary trie depends on extracting and comparing bits efficiently.
- →Binary representation of integers — Understanding how integers map to sequences of bits is essential for trie traversal.
- →Trie (prefix tree) basics — Binary tries are a specialization of tries with a binary alphabet.
- →Asymptotic analysis — To select the right data structure and tune B, you need to reason about O(B) time and memory.
- →Persistence and path-copying — Range queries via persistent tries rely on creating and querying versions efficiently.
- →Prefix XOR technique — Many XOR problems (e.g., maximum subarray XOR) rely on prefix XOR combined with a trie.
Detailed Explanation
Tap terms for definitions01Overview
A binary trie is a tree-based data structure specialized for storing integers by their binary representations. Each level corresponds to a bit position (most significant to least), and edges are labeled 0 or 1. This structure makes bitwise queries efficient because you navigate the tree following bits of the query key. The most celebrated operation is maximizing XOR with a given number x: at each bit, you choose the branch that differs from x's bit if it exists, because differing bits contribute 2^i to the XOR. By storing a counter at each node, the trie supports multisets (duplicates) and safe deletions. Furthermore, a persistent version of the binary trie keeps multiple immutable versions as elements are added—each version corresponds to a prefix of an array—allowing range-restricted queries such as maximum XOR in a subarray. All core operations—insert, delete, max XOR query, and persistent range queries—run in O(B) time, where B is the number of bits you consider. In competitive programming, pick B = 31 for 32-bit nonnegative numbers or B = 60 for 64-bit values. The binary trie shines on problems involving XOR optimization, subarray XORs, and counting constraints that decompose bit-by-bit.
02Intuition & Analogies
Think of a phonebook that organizes numbers by their digits, but here the digits are bits (0 or 1). To store a number, you walk from the root, at the most significant bit: go left for 0 or right for 1, then continue until the least significant bit. Many numbers share long prefixes, so they share the same path for many levels. Now, what does XOR mean here? XOR at a bit is 1 if two bits differ, 0 if they are the same. If you want to make the XOR with x as large as possible, you want the highest-value bits of the result to be 1. That means: at the highest bit, prefer a number that differs from x's bit; at the next bit, do the same, and so on. This greedy choice works because higher bits (like 2^60) outweigh any combination of lower bits. The trie conveniently tells you if there is any number with the opposite bit at each step. If yes, you go there; if not, you have to settle for matching x's bit. The persistent version is like taking a snapshot of the phonebook after each insertion, but cleverly, it only copies the nodes on the path you change. When you need the best XOR in a range [l, r], you compare two snapshots: the one after r insertions and the one after l-1 insertions, effectively subtracting counts to ignore elements outside the range. The structure remains fast and memory-savvy because you only copy O(B) nodes per version.
03Formal Definition
04When to Use
Use a binary trie when your operations or goals are bitwise and local to each bit, especially with XOR. Classic uses include: finding the maximum XOR pair in an array; computing range maximum XOR with persistent tries; finding maximum subarray XOR by inserting prefix XORs; counting pairs with (a_i XOR a_j) < K in O(n B) using a running trie; supporting dynamic sets with insert/remove and queries like max XOR or count of elements less than a threshold. It is particularly effective when values are limited to a fixed-width integer domain and when O(B) per operation is acceptable—often far faster than O(log n) if B is small (e.g., 31 or 60) and n is large. If you need immutable versions to answer offline range queries, a persistent trie is a strong choice over segment trees for XOR-specific tasks. On the other hand, if your queries are not bit-structured (e.g., sum or min over ranges) or you need general order statistics over arbitrary keys, other structures like Fenwick trees, segment trees, or balanced BSTs may be more appropriate.
⚠️Common Mistakes
- Wrong bit-length B: If B is too small, high bits are ignored, leading to incorrect answers. Pick B = floor(log2(max_value)) + 1, often 31 for 32-bit positives or 60 for 64-bit. 2) Signed shifts: Right-shifting negative signed integers may keep sign bits, corrupting traversal. Prefer unsigned types (uint32_t/uint64_t) or carefully cast before shifting. 3) Missing counts: Without cnt at each node, deletions can break shared paths or queries may follow non-existent branches. Always maintain and check cnt > 0. 4) Off-by-one in bit indices: Consistently iterate from B-1 down to 0 and mask with (x >> i) & 1. 5) Forgetting neutral base in prefix tricks: For maximum subarray XOR with prefix XORs, you must insert 0 initially; otherwise, you miss subarrays starting at index 1. 6) Persistent memory blow-ups: Accidentally cloning entire subtrees instead of only the path increases memory to O(n * 2^B). Path-copy only O(B) nodes per update. 7) Returning arbitrary results on empty tries: When querying max XOR on an empty trie, return a sentinel or handle gracefully. 8) Using int where long long is needed: If values may reach 1e18, use uint64_t and set B = 60. 9) Not checking range counts in persistent queries: Use cnt_r(child) - cnt_{l-1}(child) to decide feasibility when descending.
Key Formulas
Bit-length
Explanation: B is the number of bits needed to represent values up to ma. All trie operations iterate over these B bits.
Per-operation time
Explanation: Each insert, delete, or query touches at most one node per bit position. If M is the maximum value domain size, then B = log2(M).
Greedy XOR choice
Explanation: To maximize the XOR, set the highest differing bit whenever feasible since higher bits contribute more (2^i) than lower ones.
Persistent memory
Explanation: Building prefix versions for N insertions clones only O(B) nodes per insertion, so total nodes are linear in N times B.
Range difference
Explanation: To restrict queries to A[l..r], subtract node counts between versions r and l-1 when deciding which branch is available.
Counting XOR-bounded items
Explanation: The number of elements that satisfy an XOR threshold can be computed by traversing the trie with K's bits controlling the allowed branches.
Bitwise contribution
Explanation: The XOR value is the sum of contributions per bit, so choosing a 1 at higher i dominates lower bits. This justifies the greedy traversal.
Complexity Analysis
Code Examples
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 struct BinaryTrie { 5 struct Node { 6 int child[2]; 7 int cnt; // number of elements passing through this node 8 Node() { child[0] = child[1] = -1; cnt = 0; } 9 }; 10 11 vector<Node> t; 12 int B; // number of bits (e.g., 31 for 32-bit positives, 60 for 64-bit) 13 14 BinaryTrie(int bits = 31) : B(bits) { t.reserve(1 << 12); t.push_back(Node()); } 15 16 void insert(uint64_t x) { 17 int v = 0; 18 t[v].cnt++; 19 for (int i = B - 1; i >= 0; --i) { 20 int b = (x >> i) & 1ULL; 21 if (t[v].child[b] == -1) { 22 t[v].child[b] = (int)t.size(); 23 t.push_back(Node()); 24 } 25 v = t[v].child[b]; 26 t[v].cnt++; 27 } 28 } 29 30 // Returns the maximum XOR value achievable with x; returns 0 if trie is empty. 31 uint64_t max_xor(uint64_t x) const { 32 if (t[0].cnt == 0) return 0ULL; // empty trie 33 int v = 0; 34 uint64_t ans = 0ULL; 35 for (int i = B - 1; i >= 0; --i) { 36 int b = (x >> i) & 1ULL; 37 int want = b ^ 1; // prefer opposite bit to maximize XOR 38 int go = -1; 39 if (t[v].child[want] != -1 && t[t[v].child[want]].cnt > 0) { 40 go = t[v].child[want]; 41 ans |= (1ULL << i); // this bit in XOR becomes 1 42 } else { 43 go = t[v].child[b]; // must match 44 } 45 if (go == -1) break; // safety: no further nodes 46 v = go; 47 } 48 return ans; 49 } 50 }; 51 52 int main() { 53 ios::sync_with_stdio(false); 54 cin.tie(nullptr); 55 56 // Example: insert numbers and query maximum XOR with x 57 BinaryTrie bt(31); // 31 bits are enough for numbers up to ~2e9 58 vector<uint64_t> nums = {5, 25, 10, 2}; 59 for (auto x : nums) bt.insert(x); 60 61 vector<uint64_t> queries = {0, 5, 7, 100}; 62 for (auto x : queries) { 63 uint64_t best = bt.max_xor(x); 64 cout << "max XOR with " << x << " is " << best << "\n"; 65 } 66 return 0; 67 } 68
We build a binary trie over 31 bits and insert a small set of numbers. For each query x, we greedily prefer the branch opposite to x's current bit to maximize the XOR. The return value is the maximum XOR value itself, not the element achieving it.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 struct MultisetBinaryTrie { 5 struct Node { 6 int child[2]; 7 int cnt; 8 Node() { child[0] = child[1] = -1; cnt = 0; } 9 }; 10 11 vector<Node> t; 12 int B; 13 14 MultisetBinaryTrie(int bits = 31) : B(bits) { t.push_back(Node()); } 15 16 void insert(uint64_t x) { 17 int v = 0; t[v].cnt++; 18 for (int i = B - 1; i >= 0; --i) { 19 int b = (x >> i) & 1ULL; 20 if (t[v].child[b] == -1) { t[v].child[b] = (int)t.size(); t.push_back(Node()); } 21 v = t[v].child[b]; 22 t[v].cnt++; 23 } 24 } 25 26 // Attempts to remove one occurrence of x; returns false if x not present 27 bool erase(uint64_t x) { 28 if (!contains(x)) return false; 29 int v = 0; t[v].cnt--; 30 for (int i = B - 1; i >= 0; --i) { 31 int b = (x >> i) & 1ULL; 32 v = t[v].child[b]; 33 t[v].cnt--; 34 } 35 return true; 36 } 37 38 bool contains(uint64_t x) const { 39 int v = 0; if (t[v].cnt == 0) return false; 40 for (int i = B - 1; i >= 0; --i) { 41 int b = (x >> i) & 1ULL; 42 int nx = t[v].child[b]; 43 if (nx == -1 || t[nx].cnt == 0) return false; 44 v = nx; 45 } 46 return true; 47 } 48 49 // Returns pair(found, bestXorValue). If empty, found=false. 50 pair<bool, uint64_t> max_xor(uint64_t x) const { 51 if (t[0].cnt == 0) return {false, 0ULL}; 52 int v = 0; uint64_t ans = 0ULL; 53 for (int i = B - 1; i >= 0; --i) { 54 int b = (x >> i) & 1ULL; 55 int want = b ^ 1; 56 int go = -1; 57 if (t[v].child[want] != -1 && t[t[v].child[want]].cnt > 0) { 58 go = t[v].child[want]; ans |= (1ULL << i); 59 } else { 60 go = t[v].child[b]; 61 } 62 if (go == -1) break; 63 v = go; 64 } 65 return {true, ans}; 66 } 67 }; 68 69 int main() { 70 ios::sync_with_stdio(false); 71 cin.tie(nullptr); 72 73 MultisetBinaryTrie mt(31); 74 mt.insert(7); mt.insert(7); mt.insert(3); mt.insert(10); 75 cout << boolalpha; 76 cout << "contains 7? " << mt.contains(7) << "\n"; 77 cout << "erase 7 -> " << mt.erase(7) << ", contains 7? " << mt.contains(7) << "\n"; 78 cout << "erase 7 -> " << mt.erase(7) << ", contains 7? " << mt.contains(7) << "\n"; 79 auto q = mt.max_xor(5); 80 if (q.first) cout << "max XOR with 5 is " << q.second << "\n"; 81 else cout << "trie empty\n"; 82 return 0; 83 } 84
This version supports duplicates via per-node counts and provides safe erasure. The contains check ensures that we don't decrement counts below zero. The max_xor returns a flag if the trie is empty.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 struct PersistentBinaryTrie { 5 struct Node { int ch[2]; int cnt; Node(){ ch[0]=ch[1]=-1; cnt=0; } }; 6 vector<Node> T; // global pool of nodes 7 int B; 8 9 PersistentBinaryTrie(int bits=31) : B(bits) { T.reserve(1<<20); T.push_back(Node()); } 10 11 int clone(int v) { T.push_back(T[v]); return (int)T.size()-1; } 12 int newnode() { T.push_back(Node()); return (int)T.size()-1; } 13 14 // Insert x starting from previous root 'root', return new root 15 int insert(int root, uint64_t x) { 16 int newRoot = clone(root); 17 int vPrev = root, vNew = newRoot; 18 T[vNew].cnt++; 19 for (int i = B-1; i>=0; --i) { 20 int b = (x>>i) & 1ULL; 21 int prevChild = (vPrev==-1? -1 : T[vPrev].ch[b]); 22 int nextNew; 23 if (prevChild == -1) { 24 nextNew = newnode(); 25 } else { 26 nextNew = clone(prevChild); 27 } 28 T[vNew].ch[b] = nextNew; 29 vNew = nextNew; 30 T[vNew].cnt++; 31 vPrev = prevChild; 32 } 33 return newRoot; 34 } 35 36 // Query maximum XOR with x using elements present in version 'rRoot' but not in 'lRoot'. 37 uint64_t range_max_xor(int lRoot, int rRoot, uint64_t x) const { 38 uint64_t ans = 0ULL; 39 int vl = lRoot, vr = rRoot; 40 for (int i = B-1; i>=0; --i) { 41 int b = (x>>i) & 1ULL; 42 int want = b ^ 1; 43 auto getChild = [&](int v, int c){ return v==-1? -1 : T[v].ch[c]; }; 44 int rWant = getChild(vr, want); 45 int lWant = getChild(vl, want); 46 auto getCnt = [&](int v){ return v==-1? 0 : T[v].cnt; }; 47 // Check if branch 'want' has any count in range 48 if (getCnt(rWant) - getCnt(lWant) > 0) { 49 ans |= (1ULL<<i); 50 vr = rWant; vl = lWant; 51 } else { 52 // must go to b-branch 53 vr = getChild(vr, b); 54 vl = getChild(vl, b); 55 } 56 } 57 return ans; 58 } 59 }; 60 61 int main(){ 62 ios::sync_with_stdio(false); 63 cin.tie(nullptr); 64 65 // Build persistent versions for array A: version i contains A[1..i] 66 vector<uint64_t> A = {5, 1, 7, 3, 9}; 67 int n = (int)A.size(); 68 int B = 31; 69 PersistentBinaryTrie P(B); 70 vector<int> root(n+1, 0); // root[0] is empty (index 0 in pool) 71 for (int i=1;i<=n;++i) root[i] = P.insert(root[i-1], A[i-1]); 72 73 // Example queries: for [l, r], maximize y XOR x with y in A[l..r] 74 struct Q{int l,r; uint64_t x;}; 75 vector<Q> qs = {{1,3,2},{2,5,5},{3,3,7}}; 76 for (auto q: qs){ 77 uint64_t best = P.range_max_xor(root[q.l-1], root[q.r], q.x); 78 cout << "["<<q.l<<","<<q.r<<"] max XOR with x="<<q.x<<" -> "<<best<<"\n"; 79 } 80 return 0; 81 } 82
We create prefix versions of the trie by path-copying on insert. For a range [l, r], we compare counts between roots r and l-1 to decide whether a branch contains any elements in the range. The query returns the best achievable XOR value within that subarray.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 struct CountingBinaryTrie { 5 struct Node { int ch[2]; int cnt; Node(){ ch[0]=ch[1]=-1; cnt=0; } }; 6 vector<Node> t; int B; 7 CountingBinaryTrie(int bits=31): B(bits){ t.push_back(Node()); } 8 9 void insert(uint64_t x){ int v=0; t[v].cnt++; for(int i=B-1;i>=0;--i){ int b=(x>>i)&1ULL; if(t[v].ch[b]==-1){ t[v].ch[b]=(int)t.size(); t.push_back(Node()); } v=t[v].ch[b]; t[v].cnt++; } } 10 11 // Count numbers y in the trie such that (y XOR x) < K 12 uint64_t count_xor_less(uint64_t x, uint64_t K) const { 13 uint64_t ans = 0; 14 int v = 0; 15 for (int i = B-1; i >= 0 && v!=-1; --i) { 16 int xb = (x >> i) & 1ULL; 17 int kb = (K >> i) & 1ULL; 18 int same = t[v].ch[xb]; 19 int diff = t[v].ch[xb ^ 1]; 20 auto cnt = [&](int u){ return u==-1? 0 : t[u].cnt; }; 21 if (kb == 1) { 22 // If K's bit is 1: choosing yb=xb makes this bit 0 and prefix strictly less. Add all from 'same'. 23 ans += cnt(same); 24 // Must continue with yb != xb (diff branch) to keep prefixes equal so far. 25 v = diff; 26 } else { 27 // K's bit is 0: we must choose yb=xb (same branch) to avoid exceeding K at this bit. 28 v = same; 29 } 30 } 31 // If we never broke due to v==-1 and K's lower bits are all > 0 allowance, remaining path contributes but is already accounted. 32 return ans; 33 } 34 }; 35 36 // Example: count pairs (i<j) with (a[i] XOR a[j]) < K in O(n * B) 37 int main(){ 38 ios::sync_with_stdio(false); 39 cin.tie(nullptr); 40 41 vector<uint64_t> A = {5,1,7,3,9}; 42 uint64_t K = 6; 43 CountingBinaryTrie T(31); 44 uint64_t pairs = 0; 45 for (auto x : A) { 46 // Count how many previous y satisfy (y XOR x) < K, then insert x 47 pairs += T.count_xor_less(x, K); 48 T.insert(x); 49 } 50 cout << "#pairs with XOR < " << K << " = " << pairs << "\n"; 51 return 0; 52 } 53
The traversal is controlled by K's bits. When K's current bit is 1, taking the branch equal to x's bit guarantees the XOR prefix becomes strictly smaller, so we add the entire subtree count; then we continue on the opposite branch. When K's bit is 0, we must continue on the equal branch to keep the prefix below K.