Group Convolution
Key Points
- •Group convolution combines two functions defined on a group by summing over products aligned by the group operation, generalizing the usual circular convolution on integers modulo n.
- •For a finite group G, the definition is (f *_G k)(g) = f(h)\,k(h), which uses the inverse of g to 'shift' k before multiplying by f and summing.
- •On cyclic groups , group convolution is exactly circular convolution and can be computed efficiently with the FFT in O(n n) time.
- •On general (non-abelian) groups, convolution is still associative and has an identity (the delta at the identity element), but it need not be commutative.
- •The group Fourier transform turns convolution into matrix (or scalar, in the abelian case) multiplication: () = \hat f() \hat k().
- •A naive implementation over a finite group of size |G| costs O(|G|^2) time and O(|G|) extra space; precomputing multiplication and inverses helps avoid mistakes and speeds lookups.
- •Group convolution models random walks on groups: convolving a step distribution with itself t times gives the t-step distribution.
- •In machine learning and signal processing, group convolution is the backbone of equivariant architectures (e.g., rotations, permutations), enabling weight sharing across symmetries.
Prerequisites
- →Basic group theory — Understanding identities, inverses, and group multiplication is essential to interpret the convolution formula.
- →Discrete convolution on integers — Provides the familiar sliding-and-summing intuition that generalizes to groups.
- →Fourier transform and FFT — Enables fast convolution on abelian groups and explains the convolution theorem.
- →Permutations and composition — Needed to implement non-abelian groups like S3 using permutation composition.
- →Probability basics — Interpreting convolution powers as t-step distributions in random walks requires normalization and expectation concepts.
- →C++ vectors, maps, and complex numbers — Required to implement efficient data structures, FFTs, and numerical operations in code.
Detailed Explanation
Tap terms for definitions01Overview
Group convolution is a way to blend two functions that live on a group so that the result respects the group’s symmetry structure. If you already know ordinary convolution on the integers (sliding, flipping, and summing), group convolution is the same idea but performed with the group’s multiplication instead of integer addition. Given a finite group G and two functions f, k: G → R (or C), their convolution produces a new function f *_G k: G → R defined by summing products of values of f and a shifted version of k. The shift is expressed using the group operation and an inverse, ensuring the operation behaves well under left translations. This operation is central in many fields: in signal processing on cyclic groups (circular convolution), in probability on groups (random walks and mixing), in representation theory (diagonalization by the group Fourier transform), and in machine learning (equivariant neural networks over symmetries like rotations or permutations). For abelian groups such as C_n, powerful FFT algorithms make convolution extremely fast, reducing complexity from quadratic to near-linear. For non-abelian groups, convolution remains associative and has an identity element (the delta at the group identity), but is generally non-commutative; its spectral analysis uses matrix-valued irreducible representations.
02Intuition & Analogies
Think of a group as a set of moves you can make—like rotating a shape, permuting cards, or adding hours on a clock. A function on a group assigns a number to each move: for example, how much we “like” that move, or the probability we take that move in a random walk. Convolution asks: if I first pick a move according to f, then follow it by a move according to k (but aligned relative to a target move g), what is the overall compatibility? To measure this at g, we consider all possible first moves h, and then choose the second move so that together they land at g. On a clock (the cyclic group C_n), this becomes the familiar circular convolution: to score position g, sum over all ways to reach g by first going to h then moving from h to g. On permutations (like shuffling three items), convolution tallies all two-step shuffles whose product equals a target shuffle g. This is why convolution naturally models multi-step processes: applying distributions step-by-step is equivalent to convolving them. The inverse in the formula is the bookkeeping that makes “shift by g” line up properly with left-multiplication. It also ensures beautiful properties: convolving with the delta at the identity does nothing; associativity matches doing three steps in any grouping; and with Fourier analysis on the group, this complicated sum turns into simple multiplication (scalar for abelian groups, matrix for non-abelian groups), the same magic that powers the FFT for ordinary signals.
03Formal Definition
04When to Use
- Signals on periodic domains: For data defined on a cycle (e.g., time on a clock, angles discretized into n bins), use group convolution on C_n to implement circular filtering without boundary artifacts. FFT-based methods give O(n \log n) performance.
- Random walks and Markov chains on groups: If a step distribution p on G defines one-step movement, then the t-step distribution is p^{(*t)} (convolved with itself t times). This is key for mixing time analysis and sampling.
- Symmetry-aware machine learning: Group convolutions encode equivariance—outputs transform predictably when inputs are transformed—crucial for rotation/permutation-equivariant CNNs, spherical CNNs, and graph models built on Cayley graphs.
- Counting and combinatorics with symmetry: Convolution tallies compositions of group-labeled structures and appears in Burnside/Polya contexts via class functions and characters.
- Spectral methods: When you can exploit the group Fourier transform (characters for abelian groups, irreps for general groups), convolution becomes pointwise (or matrix) multiplication in the frequency domain, enabling fast algorithms, denoising, and deconvolution.
- Physical systems with symmetry: Lattices with periodic boundary, molecular symmetries, or robotics orientation spaces often discretize to finite groups where group convolution is the natural filtering operation.
⚠️Common Mistakes
- Mixing left and right conventions: The definition here uses k(g^{-1} h). Using k(h g^{-1}) is a right-convolution variant. Be consistent throughout your code and proofs.
- Forgetting the inverse: Writing (f *_G k)(g) = \sum_h f(h) k(gh) breaks equivariance to left translations. The inverse is essential.
- Confusing linear vs circular convolution: On C_n, group convolution is circular. Padding and using a length-2n FFT computes linear convolution and will wrap incorrectly if you expect group convolution.
- Mismatched element orderings: Arrays representing f and k must share the same element ordering and group law. Precompute and reuse a single multiplication table and inverse map.
- Assuming commutativity: For non-abelian groups, f *_G k generally ≠ k *_G f. Do not reorder factors in code or algebra unless the group is abelian.
- Numerical issues in FFT: Not using complex numbers, forgetting 1/n scaling on the inverse FFT, or using non power-of-two sizes with a radix-2 FFT cause errors.
- Not normalizing probabilities: In random-walk applications, ensure \sum_g f(g) = 1 and preserve normalization after convolution.
- Inefficient lookups: Recomputing products or inverses inside double loops is costly. Precompute mul[g][h] and inv[g] to achieve tight O(|G|^2) performance.
Key Formulas
Group Convolution (Left)
Explanation: Defines convolution of two functions on a finite group. The inverse aligns k relative to g so the operation is left-equivariant.
Delta at Identity
Explanation: This function acts as the multiplicative identity for convolution: convolving with it leaves any function unchanged.
Associativity
Explanation: Group convolution can be grouped in any order, matching the idea of composing three steps regardless of parenthesization.
Commutativity Criterion
Explanation: Convolution commutes exactly when the underlying group’s multiplication commutes.
Convolution Theorem (Abelian)
Explanation: For abelian groups, the character transform converts convolution into pointwise multiplication, enabling FFT-based acceleration.
Convolution Theorem (Non-Abelian)
Explanation: In the non-abelian case, the Fourier transform is matrix-valued; convolution becomes matrix multiplication in each irrep block.
Plancherel (Finite Groups)
Explanation: Energy is preserved between spatial and spectral domains up to representation dimensions d_ρ. This underpins Parseval-like identities on groups.
Convolution Power
Explanation: Repeated application of the same step distribution models t-step processes such as random walks on groups.
FFT Complexity (Cyclic)
Explanation: Computing circular convolution on via the FFT takes near-linear time in n, much faster than the naive O() sum.
Complexity Analysis
Code Examples
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Iterative radix-2 FFT. Requires n to be a power of two. 5 void fft(vector<complex<double>>& a, bool invert) { 6 int n = (int)a.size(); 7 static vector<int> rev; 8 static vector<complex<double>> roots{0, 1}; 9 10 if ((int)rev.size() != n) { 11 int k = __builtin_ctz(n); 12 rev.assign(n, 0); 13 for (int i = 0; i < n; i++) { 14 rev[i] = (rev[i >> 1] >> 1) | ((i & 1) << (k - 1)); 15 } 16 } 17 if ((int)roots.size() < n) { 18 int k = __builtin_ctz(roots.size()); 19 roots.resize(n); 20 while ((1 << k) < n) { 21 double angle = 2 * M_PI / (1 << (k + 1)); 22 for (int i = 1 << (k - 1); i < (1 << k); i++) { 23 roots[2 * i] = roots[i]; 24 double ang = angle * (2 * i + 1 - (1 << k)); 25 roots[2 * i + 1] = complex<double>(cos(ang), sin(ang)); 26 } 27 k++; 28 } 29 } 30 31 for (int i = 0; i < n; i++) if (i < rev[i]) swap(a[i], a[rev[i]]); 32 33 for (int len = 1; len < n; len <<= 1) { 34 for (int i = 0; i < n; i += 2 * len) { 35 for (int j = 0; j < len; j++) { 36 complex<double> u = a[i + j]; 37 complex<double> v = a[i + j + len] * roots[len + j]; 38 a[i + j] = u + v; 39 a[i + j + len] = u - v; 40 } 41 } 42 } 43 if (invert) { 44 reverse(a.begin() + 1, a.end()); 45 for (auto &x : a) x /= n; 46 } 47 } 48 49 bool is_power_of_two(size_t n) { 50 return n && ((n & (n - 1)) == 0); 51 } 52 53 // Circular convolution on C_n: (f * k)[i] = sum_j f[j] * k[(i - j) mod n] 54 vector<double> circular_convolution(const vector<double>& f, const vector<double>& k) { 55 size_t n = f.size(); 56 if (k.size() != n) throw runtime_error("Inputs must have same size"); 57 if (!is_power_of_two(n)) throw runtime_error("n must be a power of two for this FFT implementation"); 58 59 vector<complex<double>> Fa(n), Ka(n); 60 for (size_t i = 0; i < n; i++) { 61 Fa[i] = complex<double>(f[i], 0.0); 62 Ka[i] = complex<double>(k[i], 0.0); 63 } 64 fft(Fa, false); 65 fft(Ka, false); 66 for (size_t i = 0; i < n; i++) Fa[i] *= Ka[i]; 67 fft(Fa, true); // inverse FFT (with 1/n scaling inside) 68 69 vector<double> out(n); 70 for (size_t i = 0; i < n; i++) out[i] = Fa[i].real(); 71 return out; 72 } 73 74 int main() { 75 // Example on C_8 (size 8, power of two): simple smoothing kernel 76 vector<double> f = {1,2,3,4,3,2,1,0}; 77 vector<double> k = {0.25, 0.5, 0.25, 0, 0, 0, 0, 0}; // local averaging (mod 8) 78 79 vector<double> conv = circular_convolution(f, k); 80 81 cout.setf(ios::fixed); cout << setprecision(6); 82 cout << "Circular convolution on C_8:\n"; 83 for (double x : conv) cout << x << ' '; 84 cout << "\n"; 85 return 0; 86 } 87
Implements circular convolution on the cyclic group C_n using an iterative radix-2 FFT. The group law is addition mod n, so (f * k)[i] = sum_j f[j] k[(i - j) mod n]. The FFT converts convolution into pointwise multiplication of spectra, then the inverse FFT returns the result. This matches the group-convolution definition with g^{-1}h becoming (i - j) mod n in the abelian additive notation.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Represent a permutation on {0,1,2} as a vector p where p[i] is the image of i. 5 using Perm = array<int,3>; 6 7 Perm compose(const Perm& a, const Perm& b) { 8 // Composition a ∘ b: first apply b, then a. (a∘b)(i) = a[b[i]] 9 return Perm{ a[b[0]], a[b[1]], a[b[2]] }; 10 } 11 12 Perm inverse(const Perm& p) { 13 Perm inv{}; 14 for (int i = 0; i < 3; i++) inv[p[i]] = i; 15 return inv; 16 } 17 18 struct FiniteGroup { 19 vector<Perm> elems; // list of elements 20 unordered_map<string,int> id_of; // map serialized perm -> index 21 vector<vector<int>> mul; // mul[i][j] = index of elems[i] * elems[j] 22 vector<int> inv; // inv[i] = index of inverse of elems[i] 23 24 static string key(const Perm& p){ return string{char('0'+p[0]), char('0'+p[1]), char('0'+p[2])}; } 25 26 static FiniteGroup S3() { 27 // Generators: s = (0 1), t = (1 2) 28 Perm e{0,1,2}; 29 Perm s{1,0,2}; 30 Perm t{0,2,1}; 31 32 FiniteGroup G; 33 queue<Perm> q; 34 auto add = [&](const Perm& p) { 35 string k = key(p); 36 if (!G.id_of.count(k)) { 37 int id = (int)G.elems.size(); 38 G.id_of[k] = id; G.elems.push_back(p); q.push(p); 39 } 40 }; 41 add(e); add(s); add(t); 42 // Close under composition by generators until no growth 43 while (!q.empty()) { 44 Perm a = q.front(); q.pop(); 45 for (const Perm& g : {s, t}) { 46 Perm ag = compose(a, g); 47 Perm ga = compose(g, a); 48 add(ag); add(ga); 49 } 50 } 51 int n = (int)G.elems.size(); 52 G.mul.assign(n, vector<int>(n)); 53 G.inv.assign(n, -1); 54 for (int i = 0; i < n; i++) { 55 // inverse index 56 Perm pinv = inverse(G.elems[i]); 57 G.inv[i] = G.id_of[key(pinv)]; 58 for (int j = 0; j < n; j++) { 59 Perm c = compose(G.elems[i], G.elems[j]); 60 G.mul[i][j] = G.id_of[key(c)]; 61 } 62 } 63 return G; 64 } 65 }; 66 67 // Group convolution: (f *_G k)(g) = sum_h f(h) * k(g^{-1} h) 68 vector<double> group_convolution(const FiniteGroup& G, const vector<double>& f, const vector<double>& k) { 69 int n = (int)G.elems.size(); 70 if ((int)f.size() != n || (int)k.size() != n) throw runtime_error("Size mismatch"); 71 vector<double> out(n, 0.0); 72 for (int g = 0; g < n; g++) { 73 int ginv = G.inv[g]; 74 double acc = 0.0; 75 for (int h = 0; h < n; h++) { 76 int idx = G.mul[ginv][h]; // index of g^{-1} * h 77 acc += f[h] * k[idx]; 78 } 79 out[g] = acc; 80 } 81 return out; 82 } 83 84 int main(){ 85 FiniteGroup G = FiniteGroup::S3(); 86 int n = (int)G.elems.size(); 87 cout << "|S3| = " << n << "\n"; // should be 6 88 89 // Define two functions on S3 by index; for demo pick simple values 90 vector<double> f(n, 0.0), k(n, 0.0); 91 f[0] = 1.0; // delta at identity 92 // Let k assign weights: k[0]=1, others small 93 for (int i = 0; i < n; i++) k[i] = (i==0 ? 1.0 : 0.1*i); 94 95 // delta_e * k == k 96 vector<double> conv1 = group_convolution(G, f, k); 97 cout << "delta_e * k: "; for (double x: conv1) cout << fixed << setprecision(3) << x << ' '; cout << "\n"; 98 99 // General convolution f2 * k 100 vector<double> f2(n); iota(f2.begin(), f2.end(), 1.0); // f2[i] = 1,2,3,4,5,6 101 vector<double> conv2 = group_convolution(G, f2, k); 102 cout << "f2 * k: "; for (double x: conv2) cout << fixed << setprecision(3) << x << ' '; cout << "\n"; 103 104 return 0; 105 } 106
Builds S3 from permutation generators without hard-coding a Cayley table, then computes (f *_G k)(g) = sum_h f(h) k(g^{-1} h) using precomputed multiplication and inverses. The example verifies that convolving with the delta at the identity leaves k unchanged and shows a generic convolution output.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 using Perm = array<int,3>; 5 Perm compose(const Perm& a, const Perm& b){ return Perm{ a[b[0]], a[b[1]], a[b[2]] }; } 6 Perm inverse(const Perm& p){ Perm inv{}; for(int i=0;i<3;i++) inv[p[i]] = i; return inv; } 7 8 struct FiniteGroup { 9 vector<Perm> elems; unordered_map<string,int> id_of; vector<vector<int>> mul; vector<int> inv; 10 static string key(const Perm& p){ return string{char('0'+p[0]), char('0'+p[1]), char('0'+p[2])}; } 11 static FiniteGroup S3(){ 12 Perm e{0,1,2}, s{1,0,2}, t{0,2,1}; 13 FiniteGroup G; queue<Perm> q; 14 auto add = [&](const Perm& p){ string k=key(p); if(!G.id_of.count(k)){ int id=G.elems.size(); G.id_of[k]=id; G.elems.push_back(p); q.push(p);} }; 15 add(e); add(s); add(t); 16 while(!q.empty()){ Perm a=q.front(); q.pop(); for(const Perm& g: {s,t}){ auto ag=compose(a,g); auto ga=compose(g,a); add(ag); add(ga);} } 17 int n=G.elems.size(); G.mul.assign(n, vector<int>(n)); G.inv.assign(n,-1); 18 for(int i=0;i<n;i++){ Perm pinv=inverse(G.elems[i]); G.inv[i]=G.id_of[key(pinv)]; for(int j=0;j<n;j++){ Perm c=compose(G.elems[i], G.elems[j]); G.mul[i][j]=G.id_of[key(c)]; }} 19 return G; 20 } 21 }; 22 23 vector<double> convolve(const FiniteGroup& G, const vector<double>& f, const vector<double>& k){ 24 int n = G.elems.size(); vector<double> out(n,0.0); 25 for(int g=0; g<n; g++){ 26 int ginv = G.inv[g]; double acc=0.0; for(int h=0; h<n; h++){ int idx = G.mul[ginv][h]; acc += f[h]*k[idx]; } out[g]=acc; } 27 return out; 28 } 29 30 vector<double> delta_identity(const FiniteGroup& G){ vector<double> d(G.elems.size(),0.0); d[0]=1.0; return d; } 31 32 // Fast exponentiation under convolution: returns p^{(*t)} 33 vector<double> conv_power(const FiniteGroup& G, vector<double> p, long long t){ 34 vector<double> res = delta_identity(G); // identity for convolution 35 while(t>0){ 36 if (t & 1LL) res = convolve(G, res, p); 37 t >>= 1LL; 38 if (t>0) p = convolve(G, p, p); 39 } 40 return res; 41 } 42 43 int main(){ 44 FiniteGroup G = FiniteGroup::S3(); 45 int n = G.elems.size(); 46 47 // Step distribution: with probability 1/2 apply s=(0 1), with 1/2 apply t=(1 2) 48 vector<double> step(n, 0.0); 49 // Identify indices of s and t 50 Perm s{1,0,2}, t{0,2,1}; 51 auto idx = [&](const Perm& p){ return G.id_of[FiniteGroup::key(p)]; }; 52 step[idx(s)] = 0.5; step[idx(t)] = 0.5; 53 54 long long T = 5; // number of steps 55 vector<double> dist = conv_power(G, step, T); 56 57 cout.setf(ios::fixed); cout<<setprecision(6); 58 cout << "Distribution after "<<T<<" steps on S3 (sum should be 1):\n"; 59 double sum=0.0; for(double x: dist){ cout<<x<<' '; sum+=x; } cout<<"\nTotal= "<<sum<<"\n"; 60 return 0; 61 } 62
Models a random walk on S3 where each step applies one of two transpositions with equal probability. Convolution powers p^{(*t)} give the t-step distribution. Exponentiation by squaring reduces the number of convolutions from O(t) to O(log t).