šŸ“š Theory Ā· Advanced

Gauge Equivariant Networks

Key Points

  • Gauge equivariant networks are neural networks that respect local symmetries (gauges) on manifolds, such as how vectors rotate when you change the local reference frame on a surface.
  • They work by representing features in fibers (like tangent spaces) and ensuring every operation (transport, convolution, nonlinearity) transforms correctly under the structure group (e.g., SO(2) on surfaces).
  • Discrete implementations on meshes use parallel transport angles on edges to align neighbor features before aggregation, and restrict weights to intertwiners that do not break equivariance.
  • A practical recipe is to expand features into circular-harmonic types (orders m = 0, 1, 2, ...), where rotations act by simple 2Ɨ2 blocks, and to share weights within each order.
  • Nonlinearities must be chosen carefully (e.g., norm-based or gated) so that higher-order vector features remain equivariant.
  • You can numerically verify equivariance by randomly changing gauges per vertex, updating edge transport angles accordingly, and checking that outputs rotate exactly as predicted.
  • The computational cost is similar to message-passing GNNs, typically O(E Ā· F) per layer, with small extra trigonometric costs for rotations.
  • Gauge equivariance is essential for learning on surfaces, vector fields, and physics problems where predictions should not depend on arbitrary local frame choices.

Prerequisites

  • Basic group theory — Understand group actions, representations, and the difference between invariance and equivariance.
  • Linear algebra — Features live in vector spaces and are transformed by matrices; weights are intertwiners between representations.
  • Multivariable calculus and manifolds (intro) — Frames, tangent spaces, and the exponential map underpin convolution on manifolds.
  • Graph neural networks (GNNs) — Gauge-equivariant layers on meshes are implemented as message passing with special alignment and weights.
  • Fourier analysis on the circle — Circular harmonics (cos/sin pairs) are the irreducible representations of SO(2) used to rotate features.
  • Differential geometry (connections, parallel transport) — Edge transport angles discretize connections; equivariance relies on correct parallel transport.
  • Representation theory (irreps, intertwiners) — Weight sharing per order and nonlinearity design come from representation-theoretic constraints.
  • Numerical methods on meshes/graphs — Discretization choices (angles, neighbors) affect stability and correctness.
  • C++ programming (vectors, performance) — Implementing efficient rotations, aggregations, and tests requires solid C++.
  • Trigonometry — Rotations use sin/cos and angle normalization; numerical care is needed.

Detailed Explanation


01 Overview

Gauge equivariant networks are neural architectures designed to process data that live on manifolds with local symmetries (gauges). Think of a 2D surface like a curved sheet: at every point you can draw a little x–y axis (a tangent frame). There is no globally consistent way to pick all these axes on a curved surface, and you are free to rotate each local frame without changing the underlying geometry. A network is gauge equivariant if its outputs transform predictably when you rotate these local frames. This is crucial when inputs and outputs are not just scalars but geometric objects like tangent vectors or higher-order features that live in fibers of a bundle over the manifold.

In practice, we discretize the manifold as a mesh or graph, store a frame (gauge) per node, and store how to align frames along edges (a discrete connection, or parallel transport angle). Features are decomposed into irreducible representations of the gauge group (e.g., circular harmonics for SO(2)), which makes rotations act by tiny block matrices. Layers align neighbor features via parallel transport, apply order-wise shared weights (intertwiners), aggregate, and optionally apply equivariant nonlinearities. With these ingredients, the network's predictions depend only on the underlying geometry and data—not on the arbitrary choice of frames.

02 Intuition & Analogies

Imagine you are hiking across a hilly terrain (a surface). At every point you carry a tiny compass (your local frame) showing ā€œeastā€ and ā€œnorth.ā€ When you walk to a neighbor point, the notion of east/north twists a little, depending on the terrain's curvature—this twist is parallel transport. Now suppose you want a rule to combine wind measurements (vectors) from your neighbors to predict the wind at your location. If your friend uses a different brand of compass (rotated by, say, 30° everywhere), both of you should still get predictions that agree up to that same 30° rotation. A network with this property is gauge equivariant.

A second analogy: playlists and headphones. You can wear your headphones at any rotation; the song you hear ā€œrotatesā€ with the headphones, but the underlying melody is unchanged. If two people wear their headphones differently, a device that analyzes the song should produce the same conclusions regardless of headphone angle. In geometry, the headphone angle is the gauge (local frame choice), and the song is the true geometric signal.

Operationally, we break features into types that we know how to rotate easily—like splitting a signal on a circle into sines and cosines. When we rotate the frame by an angle Īø, an m-th order sine/cosine pair just rotates by mĀ·Īø via a 2Ɨ2 matrix. That makes it cheap to ā€œline upā€ neighbor features by undoing their local rotation before summing. We then only allow weights that do not mix different types (orders), so the whole computation commutes with every local rotation. The result is a model that learns geometry-aware patterns without being confused by arbitrary frame choices.

03 Formal Definition

Let M be a smooth manifold with frame bundle F and structure group G (e.g., G = SO(2) for tangent frames on a 2D surface). A feature field is a section s of the associated vector bundle E = F ×_{ρ_in} V_in, where ρ_in : G → GL(V_in) is a representation. A gauge is a choice of local trivialization (frame); under a gauge change g : M → G, the section transforms pointwise as s^g(x) = ρ_in(g(x)) s(x). A layer F mapping sections of E_in to sections of E_out is gauge equivariant if for every gauge field g, F obeys F[s^g](x) = ρ_out(g(x)) F[s](x), where ρ_out is the output representation.

In continuous gauge-equivariant convolution on M, one constructs at each point x a kernel K_x : T_x M → Hom(V_in, V_out), expressed in a local frame. The kernel must satisfy the steerability constraint K_x(RĀ·p) = ρ_out(R) K_x(p) ρ_in(R)^{āˆ’1} for every R ∈ G acting on tangent vectors p. With this, the operator (F s)(x) = ∫_{T_x M} K_x(p) s(exp_x(p)) dp is well defined and independent of the frame chosen at x.

On a discrete mesh or graph, we store a frame at each vertex u and a parallel transport (connection) along each edge as a group element g_{u←v} ∈ G mapping v's frame to u's frame. A message-passing layer is equivariant if it aggregates as h^{(l+1)}(u) = Ī£_{v ∈ N(u)} ρ_out(g_{u←v}) Φ(d_{uv}) ρ_in(g_{u←v})^{āˆ’1} h^{(l)}(v), with weight sharing restricted so that Φ intertwines the two representations per order. Nonlinearities are required to be equivariant too (e.g., acting on norms or via gates from invariant channels).

04 When to Use

  • Surface or manifold data: Learning from signals on curved surfaces (e.g., textures on 3D meshes, climate fields on the sphere) where directions matter and rotate from point to point.
  • Vector/tensor fields: Tasks with inputs/outputs that are not scalars—wind velocity on Earth, fiber orientations in diffusion MRI, principal curvature directions, or electromagnetic fields—where predictions must rotate with local frames.
  • Physics and PDEs on meshes: When physical laws are covariant and you want models that respect local symmetry, avoiding spurious frame-dependent artifacts.
  • Low-data or generalization-critical regimes: Equivariance bakes in strong inductive bias, improving data efficiency and robustness when orientations vary or gauges are arbitrary.
  • Non-Euclidean sensor layouts: Graphs from geodesic neighborhoods, point clouds on surfaces, or non-orientable domains where a global orientation is impossible; local gauge handling keeps the model consistent.

Avoid gauge-equivariant machinery if: signals are purely scalar and orientation-agnostic (a standard GNN/CNN suffices), the domain is flat with a fixed global frame (ordinary SE(2)/SO(3) equivariance may be simpler), or obtaining a reliable discrete connection (parallel transport) is infeasible or noisy enough to dominate errors.

āš ļøCommon Mistakes

  • Confusing invariance with equivariance: Invariance forces outputs not to change; equivariance requires predictable change (e.g., vectors rotate). Using invariant pooling when you need vector outputs discards information.
  • Forgetting frame alignment: Aggregating neighbors without parallel transport (or with the wrong sign) breaks equivariance. Always rotate neighbor features into the receiver’s frame before combining.
  • Mixing representation orders: Weight matrices must not mix different harmonic orders (m); only mix channels within the same order. Cross-order mixing violates the intertwiner constraint and breaks equivariance.
  • Using standard nonlinearities on vector/tensor channels: ReLU on each component is not equivariant. Use norm-based, gated, or tensor product nonlinearities designed for the representation.
  • Ignoring reflections when the domain/group is O(2) rather than SO(2): If local frames can flip, include reflection parity and use the correct representation; otherwise, predictions will be inconsistent on non-orientable meshes.
  • Inconsistent angle conventions: Swapping source/target, using degrees vs. radians, or adding instead of subtracting angles in updates like Īø'_{u←v} = Īø_{u←v} + γ(u) āˆ’ γ(v) yields subtle bugs.
  • Numerical drift: Accumulating many transports around loops (holonomy) can cause angle-wrap issues; always normalize angles to (āˆ’Ļ€, Ļ€] and test loop consistency.
  • Overcomplicating kernels: Start with simple radial weights shared per order; only add complex steerable bases once the pipeline is correct and numerically verified.

Key Formulas

Gauge Equivariance

F[s^g](x) = ρ_out(g(x)) F[s](x)

Explanation: This states that if inputs are transformed by a gauge field g via the input representation, the layer’s outputs transform by the output representation at each point. The computation commutes with any choice of local frames.

Input Gauge Action

s^g(x) = ρ_in(g(x)) s(x)

Explanation: A feature section s transforms pointwise under a gauge change g by the input representation. This encodes how components rotate with the frame.

Steerability Constraint

K(RĀ·p) = ρ_out(R) K(p) ρ_in(R)^{āˆ’1},  R ∈ G

Explanation: The kernel defined in a local frame must transform consistently when we rotate the local coordinates. This ensures the resulting convolution does not depend on the arbitrary frame.

Manifold Convolution (Local Chart)

(F s)(x) = ∫_{T_x M} K_x(p) s(exp_x(p)) dp

Explanation: At each point x, integrate a kernel over the tangent space using the exponential map to sample neighbors. With a steerable kernel, this is gauge independent.

SO(2) Irrep (Order m>0)

R_m(Īø) = [ cos(mĪø)  āˆ’sin(mĪø)
           sin(mĪø)   cos(mĪø) ]

Explanation: Rotating the frame by Īø rotates a sine/cosine pair of order m by mĪø with a 2Ɨ2 rotation block. This is the basic building block for feature rotation.

Discrete Gauge-Equivariant Message Passing

h^{(l+1)}(u) = Ī£_{v ∈ N(u)} ρ_out(g_{u←v}) Φ(d_{uv}) ρ_in(g_{u←v})^{āˆ’1} h^{(l)}(v)

Explanation: Neighbor features are first aligned to u’s frame via the connection gu←v​, processed by order-wise weights Φ (intertwiners), and summed. This guarantees equivariance if Φ respects representation blocks.

Gauge Update of Edge Transport (SO(2))

Īø'_{u←v} = Īø_{u←v} + γ(u) āˆ’ γ(v)

Explanation: When you rotate the frame at each vertex by a gauge field γ, the edge transport angles must be updated accordingly. This preserves the meaning of parallel transport between frames.

Feature Dimension by Orders

F = Ī£_{m=0}^{L} d_m C_m,  with d_0 = 1 and d_m = 2 for m > 0

Explanation: The total number of scalar components per vertex equals the sum over orders: 1 per scalar channel (m=0) and 2 per vector-like channel (m>0). This is useful for complexity estimates.

Norm-based Equivariant Nonlinearity

φ_m(v) = σ(‖v‖) Ā· v / (‖v‖ + ε)

Explanation: A function of the norm (invariant under rotations) scales the vector without changing its direction relative to the frame. This preserves equivariance for m>0 channels.

Complexity Analysis

Let V be the number of vertices, E the number of (directed) edges, and let feature types be indexed by orders m = 0..L with C_m channels per order. Each order contributes d_m components per channel, where d_0 = 1 and d_m = 2 for m > 0. The total feature dimension per vertex is F = Ī£_m d_m C_m.

A gauge-equivariant message-passing layer performs, for each edge, (1) a rotation of neighbor features by the edge's transport angle and (2) an order-wise linear mixing by weights that do not couple different m. The rotation cost per edge is O(F) with very small constants (just computing cos(mĪø), sin(mĪø) and applying 2Ɨ2 blocks for m > 0). If mixing is done per edge, the linear map per order costs O(C_m^in Ā· C_m^out Ā· d_m), giving a total per-edge cost of O(Ī£_m d_m C_m^in C_m^out). Summed over edges, the layer time complexity is O(E Ā· (F + Ī£_m d_m C_m^in C_m^out)). When C_m^in = C_m^out = C across orders and L is small, this is effectively O(E Ā· (F + C²L)). If mixing is deferred until after summation (align first, sum, then apply the linear map once per node), the cost becomes O(E Ā· F + V Ā· Ī£_m d_m C_m^in C_m^out), which is preferable when degrees are large.

Space complexity is O(V Ā· F + E) for storing features and edge transports, plus O(Ī£_m C_m^in C_m^out) for the weights (independent of V and E). Caching sin/cos(mĪø) per edge and order costs O(E Ā· L) extra space but reduces recomputation. Overall, the asymptotics match standard GNNs with a small overhead for frame alignment, while delivering strong inductive bias from equivariance.

Code Examples

Utility: SO(2) circular-harmonic feature rotation (orders m=0..L)
#include <bits/stdc++.h>
using namespace std;

// Rotate a feature vector organized by SO(2) orders m=0..L.
// For each order m:
//   - m==0: C_m scalar channels (unchanged by rotation)
//   - m>0 : C_m (cos,sin) pairs per channel, rotated by angle m*theta
// Layout per m is contiguous: [m=0 scalars][m=1 (c0,s0),(c1,s1),...][m=2 ...]

struct HarmonicLayout {
    vector<int> C;    // C[m] = number of channels at order m
    vector<int> off;  // starting offset for order m
    vector<int> size; // total scalar components for order m
    int total = 0;    // total feature dimension

    explicit HarmonicLayout(const vector<int>& C_) : C(C_) {
        int L = (int)C.size() - 1;
        off.resize(L + 1); size.resize(L + 1);
        int cur = 0;
        for (int m = 0; m <= L; ++m) {
            int dm = (m == 0 ? 1 : 2);
            off[m] = cur;
            size[m] = dm * C[m];
            cur += size[m];
        }
        total = cur;
    }
};

vector<double> rotate_so2(const vector<double>& feat, double theta, const HarmonicLayout& H) {
    vector<double> out(H.total);
    for (int m = 0; m < (int)H.C.size(); ++m) {
        int C_m = H.C[m];
        int base = H.off[m];
        if (m == 0) {
            for (int c = 0; c < C_m; ++c) out[base + c] = feat[base + c];
        } else {
            double ang = m * theta;
            double cs = cos(ang), sn = sin(ang);
            for (int c = 0; c < C_m; ++c) {
                double a = feat[base + 2*c + 0]; // cos component
                double b = feat[base + 2*c + 1]; // sin component
                // [a'; b'] = R_m(theta) [a; b]
                out[base + 2*c + 0] = cs * a - sn * b;
                out[base + 2*c + 1] = sn * a + cs * b;
            }
        }
    }
    return out;
}

int main() {
    // Example: orders m=0..2 with C0=1 scalar, C1=2 vector channels, C2=1 second-order channel
    HarmonicLayout H({1, 2, 1});
    cout << "Total feature dim = " << H.total << "\n";

    // Build a feature: [m=0: s0] [m=1: (a1,b1),(a2,b2)] [m=2: (c1,d1)]
    vector<double> f(H.total);
    f[H.off[0] + 0] = 3.0;   // m=0 scalar
    f[H.off[1] + 0] = 1.0;   // m=1, channel 1: a1
    f[H.off[1] + 1] = 0.5;   //                 b1
    f[H.off[1] + 2] = -0.3;  // m=1, channel 2: a2
    f[H.off[1] + 3] = 2.0;   //                 b2
    f[H.off[2] + 0] = 0.7;   // m=2, channel 1: c1
    f[H.off[2] + 1] = -1.2;  //                 d1

    double theta = 0.4; // radians
    auto fr = rotate_so2(f, theta, H);

    // Rotate back by -theta; should recover the original
    auto f_rr = rotate_so2(fr, -theta, H);

    cout.setf(std::ios::fixed); cout << setprecision(6);
    cout << "Original vs. rotated-back (should match)\n";
    double max_err = 0.0;
    for (int i = 0; i < H.total; ++i) {
        max_err = max(max_err, fabs(f[i] - f_rr[i]));
        cout << i << ": " << f[i] << " vs " << f_rr[i] << "\n";
    }
    cout << "Max error: " << max_err << "\n";
    return 0;
}

This program defines a layout for SO(2) harmonic orders and rotates features accordingly. Scalar channels (m=0) stay fixed, while each (cos,sin) pair for m>0 is rotated by mĀ·Īø using a 2Ɨ2 block. The test rotates a feature by Īø and back by āˆ’Īø to verify correctness numerically.

Time: O(F) per rotation, where F = Ī£_m d_m C_m.
Space: O(F) auxiliary space for the output vector.
Gauge-equivariant message passing layer on a mesh (SO(2), order-wise intertwiners)
#include <bits/stdc++.h>
using namespace std;

struct HarmonicLayout {
    vector<int> C, off, size;
    int total = 0;
    explicit HarmonicLayout(const vector<int>& C_) : C(C_) {
        int L = (int)C.size() - 1;
        off.resize(L + 1); size.resize(L + 1);
        int cur = 0;
        for (int m = 0; m <= L; ++m) {
            int dm = (m == 0 ? 1 : 2);
            off[m] = cur; size[m] = dm * C[m]; cur += size[m];
        }
        total = cur;
    }
};

vector<double> rotate_so2(const vector<double>& feat, double theta, const HarmonicLayout& H) {
    vector<double> out(H.total);
    for (int m = 0; m < (int)H.C.size(); ++m) {
        int Cm = H.C[m], base = H.off[m];
        if (m == 0) {
            for (int c = 0; c < Cm; ++c) out[base + c] = feat[base + c];
        } else {
            double ang = m * theta, cs = cos(ang), sn = sin(ang);
            for (int c = 0; c < Cm; ++c) {
                double a = feat[base + 2*c + 0], b = feat[base + 2*c + 1];
                out[base + 2*c + 0] = cs * a - sn * b;
                out[base + 2*c + 1] = sn * a + cs * b;
            }
        }
    }
    return out;
}

// Directed edge with transport angle from v -> u (aligns v's frame to u's frame)
struct Edge { int v; double theta; double w; };

// Gauge-equivariant layer: per-order weight matrices (do not mix orders)
struct GaugeLayer {
    HarmonicLayout H_in, H_out;
    vector<vector<double>> W0;          // m=0: [C0_out x C0_in]
    vector<vector<vector<double>>> Wm;  // m>0: per m, [C_out x C_in]

    GaugeLayer(const HarmonicLayout& Hin, const HarmonicLayout& Hout)
        : H_in(Hin), H_out(Hout) {
        int L = (int)H_in.C.size() - 1; // assume same L in/out
        // Initialize weights with small random values
        std::mt19937 rng(42); std::normal_distribution<double> N(0.0, 0.1);
        W0.assign(H_out.C[0], vector<double>(H_in.C[0]));
        for (int i = 0; i < H_out.C[0]; ++i)
            for (int j = 0; j < H_in.C[0]; ++j) W0[i][j] = N(rng);
        Wm.resize(L + 1);
        for (int m = 1; m <= L; ++m) {
            Wm[m].assign(H_out.C[m], vector<double>(H_in.C[m]));
            for (int i = 0; i < H_out.C[m]; ++i)
                for (int j = 0; j < H_in.C[m]; ++j) Wm[m][i][j] = N(rng);
        }
    }

    // Forward on a graph: adj[u] lists incoming neighbors v with transport theta_{u<-v}
    vector<vector<double>> forward(const vector<vector<Edge>>& adj, const vector<vector<double>>& X) {
        int V = (int)adj.size();
        vector<vector<double>> Y(V, vector<double>(H_out.total, 0.0));
        for (int u = 0; u < V; ++u) {
            // Accumulators per order, expressed in u's frame
            vector<double> acc0(H_out.C[0], 0.0);
            vector<vector<double>> accm(H_out.C.size()); // m>0: (cos,sin) pairs per out channel
            for (int m = 1; m < (int)H_out.C.size(); ++m)
                accm[m] = vector<double>(2 * H_out.C[m], 0.0);

            for (const Edge& e : adj[u]) {
                // Align the neighbor feature to u's frame
                vector<double> x_align = rotate_so2(X[e.v], e.theta, H_in);
                // m=0: scalar mixing
                for (int i = 0; i < H_out.C[0]; ++i) {
                    double sum = 0.0;
                    for (int j = 0; j < H_in.C[0]; ++j)
                        sum += W0[i][j] * x_align[H_in.off[0] + j];
                    acc0[i] += e.w * sum;
                }
                // m>0: mix channels within the same m; apply weights identically to cos/sin components
                for (int m = 1; m < (int)H_in.C.size(); ++m) {
                    int Cin = H_in.C[m], Cout = H_out.C[m], inb = H_in.off[m];
                    for (int i = 0; i < Cout; ++i) {
                        double sum_c = 0.0, sum_s = 0.0;
                        for (int j = 0; j < Cin; ++j) {
                            double w_ij = Wm[m][i][j];
                            sum_c += w_ij * x_align[inb + 2*j + 0];
                            sum_s += w_ij * x_align[inb + 2*j + 1];
                        }
                        accm[m][2*i + 0] += e.w * sum_c;
                        accm[m][2*i + 1] += e.w * sum_s;
                    }
                }
            }
            // Write accumulators into the output feature vector Y[u]
            for (int i = 0; i < H_out.C[0]; ++i) Y[u][H_out.off[0] + i] = acc0[i];
            for (int m = 1; m < (int)H_out.C.size(); ++m)
                for (int i = 0; i < H_out.C[m]; ++i) {
                    Y[u][H_out.off[m] + 2*i + 0] = accm[m][2*i + 0];
                    Y[u][H_out.off[m] + 2*i + 1] = accm[m][2*i + 1];
                }
        }
        return Y;
    }
};

int main() {
    // Simple graph: 4 vertices in a chain 0-1-2-3
    int V = 4; vector<vector<Edge>> adj(V);
    auto add_dir = [&](int u, int v, double theta, double w) { adj[u].push_back({v, theta, w}); };

    // Example SO(2) orders: m=0: 1 channel, m=1: 1 channel (2 comps) => total dim = 3
    HarmonicLayout Hin({1, 1}), Hout({1, 1});

    // Edge transports (toy angles); each undirected edge becomes two directed edges with opposite signs
    add_dir(0, 1, +0.2, 1.0); add_dir(1, 0, -0.2, 1.0);
    add_dir(1, 2, -0.1, 1.0); add_dir(2, 1, +0.1, 1.0);
    add_dir(2, 3, +0.3, 1.0); add_dir(3, 2, -0.3, 1.0);

    // Random input features per vertex
    std::mt19937 rng(0); std::uniform_real_distribution<double> U(-1.0, 1.0);
    vector<vector<double>> X(V, vector<double>(Hin.total));
    for (int u = 0; u < V; ++u)
        for (int i = 0; i < Hin.total; ++i) X[u][i] = U(rng);

    GaugeLayer layer(Hin, Hout);
    auto Y = layer.forward(adj, X);

    cout.setf(std::ios::fixed); cout << setprecision(4);
    for (int u = 0; u < V; ++u) {
        cout << "Y[" << u << "] = ";
        for (double z : Y[u]) cout << z << ' ';
        cout << '\n';
    }
    return 0;
}

This program implements a single gauge-equivariant message passing layer on a small directed graph. Each edge stores a parallel transport angle that aligns neighbor features to the receiver’s frame. Features are organized by harmonic order; weights mix channels only within the same order (intertwiners), preserving equivariance. The output at each node is the sum of aligned, linearly mixed neighbor features.

Time: O(E Ā· (F + Ī£_m d_m C_m^in C_m^out)), typically dominated by O(E Ā· F).
Space: O(V Ā· F + E) for data and transports, plus O(Ī£_m C_m^in C_m^out) for weights.
Numerical verification of gauge equivariance under random per-vertex gauges
#include <bits/stdc++.h>
using namespace std;

struct HarmonicLayout {
    vector<int> C, off, size;
    int total = 0;
    explicit HarmonicLayout(const vector<int>& C_) : C(C_) {
        int L = (int)C.size() - 1;
        off.resize(L + 1); size.resize(L + 1);
        int cur = 0;
        for (int m = 0; m <= L; ++m) {
            int dm = (m == 0 ? 1 : 2);
            off[m] = cur; size[m] = dm * C[m]; cur += size[m];
        }
        total = cur;
    }
};

vector<double> rotate_so2(const vector<double>& x, double theta, const HarmonicLayout& H) {
    vector<double> y(H.total);
    for (int m = 0; m < (int)H.C.size(); ++m) {
        int Cm = H.C[m], base = H.off[m];
        if (m == 0) {
            for (int c = 0; c < Cm; ++c) y[base + c] = x[base + c];
        } else {
            double a = m * theta, cs = cos(a), sn = sin(a);
            for (int c = 0; c < Cm; ++c) {
                double xc = x[base + 2*c + 0], xs = x[base + 2*c + 1];
                y[base + 2*c + 0] = cs * xc - sn * xs;
                y[base + 2*c + 1] = sn * xc + cs * xs;
            }
        }
    }
    return y;
}

struct Edge { int v; double theta; double w; };

struct GaugeLayer {
    HarmonicLayout H;
    vector<double> W0, W1; // simplified: same in/out channels and only m=0,1
    GaugeLayer(const HarmonicLayout& H_) : H(H_) {
        std::mt19937 rng(123); std::normal_distribution<double> N(0.0, 0.1);
        W0.resize(H.C[0] * H.C[0]); for (double& z : W0) z = N(rng);
        W1.resize(H.C[1] * H.C[1]); for (double& z : W1) z = N(rng);
    }
    vector<vector<double>> forward(const vector<vector<Edge>>& adj, const vector<vector<double>>& X) {
        int V = (int)adj.size();
        vector<vector<double>> Y(V, vector<double>(H.total, 0.0));
        for (int u = 0; u < V; ++u) {
            vector<double> acc0(H.C[0], 0.0), acc1(2 * H.C[1], 0.0);
            for (const Edge& e : adj[u]) {
                auto xa = rotate_so2(X[e.v], e.theta, H); // align into u's frame
                // m=0 mixing: y0_i += w * sum_j W0[i,j] * x0_j
                for (int i = 0; i < H.C[0]; ++i) {
                    double s = 0.0;
                    for (int j = 0; j < H.C[0]; ++j) s += W0[i*H.C[0] + j] * xa[H.off[0] + j];
                    acc0[i] += e.w * s;
                }
                // m=1 mixing: apply the same weights to cos and sin components
                for (int i = 0; i < H.C[1]; ++i) {
                    double sc = 0.0, ss = 0.0;
                    for (int j = 0; j < H.C[1]; ++j) {
                        double w = W1[i*H.C[1] + j];
                        sc += w * xa[H.off[1] + 2*j + 0];
                        ss += w * xa[H.off[1] + 2*j + 1];
                    }
                    acc1[2*i + 0] += e.w * sc; acc1[2*i + 1] += e.w * ss;
                }
            }
            for (int i = 0; i < H.C[0]; ++i) Y[u][H.off[0] + i] = acc0[i];
            for (int i = 0; i < H.C[1]; ++i) {
                Y[u][H.off[1] + 2*i + 0] = acc1[2*i + 0];
                Y[u][H.off[1] + 2*i + 1] = acc1[2*i + 1];
            }
        }
        return Y;
    }
};

// Apply a random gauge gamma[u]: rotate features and update edge transports
// via theta' = theta + gamma(u) - gamma(v), then normalize angles to (-pi, pi]
void apply_gauge(const vector<double>& gamma, const HarmonicLayout& H,
                 vector<vector<double>>& X, vector<vector<Edge>>& adj) {
    int V = (int)X.size();
    for (int u = 0; u < V; ++u) X[u] = rotate_so2(X[u], gamma[u], H);
    auto wrap = [](double a) { while (a <= -M_PI) a += 2*M_PI; while (a > M_PI) a -= 2*M_PI; return a; };
    for (int u = 0; u < V; ++u)
        for (Edge& e : adj[u]) e.theta = wrap(e.theta + gamma[u] - gamma[e.v]);
}

int main() {
    // Graph: 5 nodes in a ring, so loops have nontrivial holonomy
    int V = 5; vector<vector<Edge>> adj(V);
    auto add_bidir = [&](int u, int v, double th) {
        adj[u].push_back({v, th, 1.0}); adj[v].push_back({u, -th, 1.0});
    };
    add_bidir(0, 1, 0.20); add_bidir(1, 2, -0.10); add_bidir(2, 3, 0.25);
    add_bidir(3, 4, -0.15); add_bidir(4, 0, 0.05);

    // Orders m=0: 1 scalar, m=1: 2 vector channels (dim = 1 + 2*2 = 5)
    HarmonicLayout H({1, 2});

    // Random features
    std::mt19937 rng(7); std::uniform_real_distribution<double> U(-1.0, 1.0);
    vector<vector<double>> X(V, vector<double>(H.total));
    for (int u = 0; u < V; ++u)
        for (int i = 0; i < H.total; ++i) X[u][i] = U(rng);

    GaugeLayer layer(H);
    auto Y1 = layer.forward(adj, X); // baseline output

    // Draw random per-vertex gauge angles and apply them to copies of X and adj
    vector<double> gamma(V); for (double& g : gamma) g = U(rng);
    auto Xg = X; auto adjg = adj;
    apply_gauge(gamma, H, Xg, adjg);

    // Forward under the new gauge
    auto Y2 = layer.forward(adjg, Xg);

    // Rotate Y2 back by -gamma[u] and compare with Y1
    double max_err = 0.0, sum_err = 0.0; int cnt = 0;
    for (int u = 0; u < V; ++u) {
        auto Y2b = rotate_so2(Y2[u], -gamma[u], H);
        for (int i = 0; i < H.total; ++i) {
            double err = fabs(Y1[u][i] - Y2b[i]);
            max_err = max(max_err, err); sum_err += err; ++cnt;
        }
    }
    cout.setf(std::ios::fixed); cout << setprecision(6);
    cout << "Equivariance check: max abs error = " << max_err
         << ", mean abs error = " << (sum_err / cnt) << "\n";
    return 0;
}

This program verifies equivariance numerically. It computes a layer output, then applies a random per-vertex gauge (rotating features and updating edge transport angles) and recomputes. Finally, it rotates the new outputs back by the inverse gauge and compares them to the original outputs. The tiny numerical error demonstrates correct gauge equivariance.

Time: O(E Ā· F) for each forward pass, plus O(V Ā· F + E Ā· L) to apply a gauge (updating features and angles).
Space: O(V Ā· F + E) for features and edges; weights are O(Ī£_m C_m²) in this simplified setting.
#gauge equivariant networks Ā· #geometric deep learning Ā· #manifold learning Ā· #parallel transport Ā· #so(2) Ā· #o(2) Ā· #steerable kernels Ā· #circular harmonics Ā· #message passing Ā· #tangent bundle Ā· #fiber bundle Ā· #connection Ā· #equivariance Ā· #nonlinearity Ā· #mesh convolution