🎓How I Study AIHISA
📖Read
📄Papers📰Blogs🎬Courses
💡Learn
🛤️Paths📚Topics💡Concepts🎴Shorts
🎯Practice
⏱️Coach🧩Problems🧠Thinking🎯Prompts🧠Review
SearchSettings
How I Study AI - Learn AI Papers & Lectures the Easy Way
∑MathAdvanced

Manifolds & Manifold Hypothesis

Key Points

  • •
    A manifold is a space that locally looks like Euclidean space, stitched together by coordinate charts and smooth transition maps.
  • •
    Topological manifolds require homeomorphisms to open sets in \(Rk\), while smooth manifolds require those transitions to be differentiable (\(Cr\) or \(C^∞\)).
  • •
    Tangent spaces capture all possible directions you can move from a point within the manifold and are computed via derivatives of charts.
  • •
    In practice, many surfaces (like a sphere or torus) are smooth manifolds, and their geometry can be computed using Jacobians and metrics.
  • •
    The manifold hypothesis in machine learning posits that high-dimensional data often lie near a much lower-dimensional smooth manifold.
  • •
    Local PCA on neighborhoods is a simple way to estimate the intrinsic dimension implied by the manifold hypothesis.
  • •
    Meshes representing 2D surfaces are manifold when every edge is shared by at most two faces and orientations are globally consistent.
  • •
    C++ implementations can illustrate charts on \(S2\), mesh manifold checks, and local PCA-based intrinsic dimension estimation.

Prerequisites

  • →Linear Algebra — Vectors, matrices, eigenvalues/eigenvectors, and orthonormalization are essential for tangent spaces, PCA, and metrics.
  • →Multivariable Calculus — Jacobians, differentials, and optimization on manifolds rely on derivatives and chain rule.
  • →Basic Topology — Understanding open sets, homeomorphisms, and continuity underpins the definition of topological manifolds.
  • →Numerical Methods — Eigen-decomposition, numerical stability, and iterative algorithms are needed for practical computations on manifolds.
  • →Graph Theory — Manifold learning methods like Isomap build k-NN graphs to approximate geodesic distances.
  • →Geometry and 3D Graphics Basics — Normals, meshes, and parameterizations are common practical contexts for manifolds.
  • →Probability and Statistics — PCA, covariance, and noise modeling are statistical tools used in manifold hypothesis testing.

Detailed Explanation

Tap terms for definitions

01Overview

Manifolds generalize curves and surfaces to higher dimensions. Intuitively, although a manifold may curve and twist globally, each small neighborhood around any point looks flat, like a patch of Euclidean space. For example, the Earth’s surface is globally curved (a sphere) but locally flat (what maps try to capture). Formally, a k-dimensional topological manifold is a space where each point has a neighborhood homeomorphic to an open set in (\mathbb{R}^k). A smooth manifold strengthens this by requiring smoothly compatible coordinate charts, allowing calculus to be done on the manifold. This compatibility ensures that switching between overlapping coordinate systems is differentiable, so derivatives, tangent vectors, and integrals make sense globally. In computer science and engineering, manifolds appear across 3D graphics (meshes approximating 2D manifolds in (\mathbb{R}^3)), robotics (configuration spaces), and machine learning (low-dimensional structures within high-dimensional data). The manifold hypothesis suggests that despite high ambient dimension, the data effectively live near a low-dimensional manifold; this justifies dimensionality reduction techniques. Understanding manifolds equips you to reason about local linear approximations (tangent spaces), curvature, geodesics, and coordinate changes—core tools in numerical geometry, simulation, and representation learning.

02Intuition & Analogies

Imagine exploring a large, oddly shaped park at night with only a small flashlight. Wherever you stand, the patch of ground you see looks roughly flat, even if the entire park has hills and valleys. That small lit patch represents a coordinate chart: a simple map from your local view to a flat coordinate grid. As you walk, you switch to new patches and new flat grids; the rules that translate between those grids are the transition maps. If those translation rules are smooth (no sudden jumps), you can differentiate paths, talk about directions, and measure speeds—this is a smooth manifold. Think about the globe: sailors use different nautical charts for different ocean regions. Each chart is accurate locally but distorts globally; where charts overlap, there are instructions (transition maps) to convert coordinates from one chart to another. The globe is a 2D smooth manifold embedded in 3D space. Similarly, a person’s pose (joint angles) lies in a curved configuration space; nearby poses differ smoothly along a few degrees of freedom—again, a manifold. In high-dimensional data (like images), the manifold hypothesis says that although each image might have thousands of pixels (dimensions), the set of all valid images of, say, handwritten digits lies near a much lower-dimensional sheet parameterized by factors like stroke width, slant, and digit identity. Locally, you can approximate this sheet with a flat plane (tangent space), much like approximating a hill with a tangent plane; PCA on a local neighborhood is precisely this flattening.

03Formal Definition

A topological space \(M\) is a k-dimensional topological manifold if it is Hausdorff, second countable, and for every \(x ∈ M\) there exists an open neighborhood \(U ⊂ M\) and a homeomorphism (chart) \(φ: U → V\) where \(V\) is an open set in \(Rk\). An atlas is a collection of charts whose domains cover \(M\). If for every pair of overlapping charts \((U,φ)\) and \((W,ψ)\), the transition maps \(ψ ∘ φ−1: φ(U ∩ W) → ψ(U ∩ W)\) are \(Cr\) (commonly \(C^∞\)), then \(M\) is a \(Cr\) (smooth) manifold. The dimension k is uniquely determined (invariance of domain) and equals the dimension of each chart’s Euclidean target. For a smooth manifold, the tangent space \(Tx​M\) is a k-dimensional vector space attached to each \(x\), defined as equivalence classes of curves \(γ\) with \(γ(0)=x\) where two curves are equivalent if their derivatives in coordinates coincide; equivalently, as derivations at \(x\) on smooth functions. If \(φ\) is a chart, the differential \(Dφx​: Tx​M → Rk\) is a linear isomorphism. If \(M\) is embedded in \(Rn\), we can compute tangent spaces as kernels of constraint Jacobians or as spans of chart Jacobian columns. A Riemannian metric equips each \(Tx​M\) with an inner product, often induced from the ambient Euclidean metric via \(g=JT J\), enabling definitions of angles, lengths, and geodesics.

04When to Use

Use the topological manifold notion when you need local Euclidean structure without calculus—e.g., reasoning about connectivity, compactness, or the existence of global coordinates. Upgrade to smooth manifolds when differentiation matters: computing velocities, tangent vectors, curvature, or integrating fields over the space. In graphics and geometry processing, surfaces are modeled as 2-manifolds embedded in (\mathbb{R}^3); manifoldness ensures well-defined normals, parameterizations, and PDEs. In robotics and control, configuration spaces (like rotations (SO(3)) or poses (SE(3))) are smooth manifolds, so filters and optimizers must respect manifold constraints. In data science, apply the manifold hypothesis as a modeling assumption: even if data live in a high-dimensional ambient space, their intrinsic degrees of freedom are small. Then, local linear methods (PCA), nonlinear dimensionality reduction (Isomap, LLE, t-SNE, UMAP), and manifold regularization are appropriate. You might estimate intrinsic dimension by looking at the decay of local covariance eigenvalues or by scaling laws of neighbor distances. Use atlases/parameterizations when designing texture mappings, remeshing, or numerical PDE solvers on surfaces. Always verify manifoldness of meshes in preprocessing to avoid downstream geometric or simulation failures.

⚠️Common Mistakes

• Confusing global and local properties: A manifold can be globally non-Euclidean (like a sphere) yet still be locally Euclidean. Expecting a single global chart causes issues; many manifolds require multiple overlapping charts. • Ignoring transition smoothness: Having local coordinates is not enough for a smooth manifold—transition maps must be differentiable. Discontinuous or non-differentiable transitions break calculus on the manifold. • Treating nonmanifold meshes as surfaces: Edges shared by more than two faces or inconsistent orientation lead to ambiguous normals and invalid simulations. Always run manifold checks before geometry processing. • Misusing ambient vs intrinsic dimension: High ambient dimension does not imply complexity if the intrinsic (manifold) dimension is small. Conversely, PCA on the whole dataset may miss curved structure; local PCA better approximates tangent spaces. • Numerical instability in tangent estimation: Finite differences or PCA can be sensitive to scale and noise. Always center data, normalize scales, and use thresholds to separate signal eigenvalues from noise. • Assuming the manifold hypothesis always holds: Real data can have branching structures, self-intersections (in embedding), or heavy noise, violating smooth manifold assumptions. Validate with diagnostics (eigenvalue gaps, neighborhood consistency, graph connectivity) before relying on manifold-based algorithms.

Key Formulas

Topological Manifold (Local Euclidean Structure)

M is a k-manifold ⟺∀x∈M,∃(U,φ), x∈U, φ:U→V⊂Rk a homeomorphism

Explanation: Every point has a neighborhood that looks like an open set in Euclidean space via a homeomorphism. This captures the idea of being locally flat.

Atlas Compatibility

Smoothness: (U,φ),(W,ψ)⟹ψ∘φ−1:φ(U∩W)→ψ(U∩W) is Cr

Explanation: A smooth manifold requires that switching coordinates between overlapping charts is differentiable. This ensures calculus works consistently across charts.

Tangent Space via Curves

Tx​M={γ′(0)∣γ:(−ϵ,ϵ)→M, γ(0)=x}

Explanation: The tangent space is the set of velocities of curves passing through the point. It is a k-dimensional vector space for a k-manifold.

Differential of a Chart

Dφx​:Tx​M→Rk,(Dφx​)ij​=∂uj​∂φi​​(x)

Explanation: The Jacobian of the chart gives a linear isomorphism between the tangent space and Euclidean space. Its columns are a basis for tangent vectors.

Induced Metric

g(x)=J(x)TJ(x)

Explanation: For an embedded manifold parameterized by a chart with Jacobian J, the inner product on tangent vectors is given by J transposed times J. This lets you measure lengths and angles on the manifold.

Geodesic Equation

γ geodesic⟺∇γ˙​​γ˙​=0

Explanation: A geodesic has zero covariant acceleration under the Levi-Civita connection. It generalizes straight lines to curved spaces.

Unit Sphere

Sn−1={x∈Rn∣∥x∥=1}

Explanation: The set of points at unit distance from the origin is a smooth (n−1)-dimensional manifold. It is a canonical example.

Local PCA Covariance

C=m1​i=1∑m​(xi​−μ)(xi​−μ)T,μ=m1​i=1∑m​xi​

Explanation: The covariance of m neighbor points (centered by their mean) summarizes variation. Eigenvectors with largest eigenvalues approximate tangent directions.

Explained Variance Ratio

ExplainedVar(r)=∑i=1d​λi​∑i=1r​λi​​

Explanation: The fraction of total variance captured by the top r eigenvalues helps pick intrinsic dimension by thresholding (e.g., 95%).

Manifold Hypothesis

k≤d, data lies near a k-manifold in Rd

Explanation: This expresses that only k degrees of freedom matter even if the ambient dimension d is large. It motivates dimensionality reduction.

Whitney Embedding Theorem

Whitney: any smooth k-manifold embeds in R2k

Explanation: A theoretical guarantee that low-dimensional Euclidean space suffices to host any smooth manifold without self-intersections.

Great-Circle Geodesic on the Sphere

pgeod​(s)=cos(s)p+sin(s)t,t∈Tp​S2, ∥t∥=1

Explanation: Starting at point p with unit tangent t, moving distance s along the geodesic stays on the sphere via this closed-form expression.

Complexity Analysis

The three C++ examples involve geometry on manifolds, mesh manifold checks, and local PCA. For the S2 chart/tangent example, operations are constant-time analytic formulas: stereographic projections, evaluating Jacobians, Gram–Schmidt orthonormalization, and the closed-form exponential map on the sphere. Each call runs in O(1) time and O(1) space. The mesh manifold checker processes F triangles and their edges. Building edge incidence with an unorderedm​ap uses O(F) expected insertions and lookups; overall expected time is O(F) with good hashing, and memory is O(E), where E is the number of unique edges (E ≈ 3F/2 for typical closed triangle meshes). Orientation consistency via BFS/DFS over the face adjacency graph visits each face and shared edge at most once, leading to O(F) time and O(F) space. Pathological hash collisions could degrade to O(F log F) or worse, but in practice the expected linear behavior holds. The local PCA intrinsic-dimension estimator for N points in ambient dimension d with k nearest neighbors uses a naive k-NN search that computes all pairwise distances: O(N2 d) time and O(N) space for distances on the fly. For each point, forming the d×d covariance from k neighbors costs O(k d2). Diagonalizing a d×d symmetric matrix via a Jacobi method costs O(d3) per point. Thus total time is O(N2 d + N(k d2 + d3)) and space is O(d2) per covariance plus O(d2) for eigenvectors; storing all data points is O(N d). For small d (common in practice, e.g., d≤10 or 50), the dominant term is the O(N2 d) neighbor search; using a k-d tree or ball tree can reduce expected search to O(N log N) per query in low ambient dimensions, but worst case remains higher in high d due to the curse of dimensionality.

Code Examples

Charts, transition maps, tangent basis, and geodesics on S^2 (sphere)
1#include <bits/stdc++.h>
2using namespace std;
3
4struct Vec3 {
5 double x, y, z;
6 Vec3(double x_=0, double y_=0, double z_=0): x(x_), y(y_), z(z_) {}
7 Vec3 operator+(const Vec3& o) const { return Vec3(x+o.x, y+o.y, z+o.z); }
8 Vec3 operator-(const Vec3& o) const { return Vec3(x-o.x, y-o.y, z-o.z); }
9 Vec3 operator*(double s) const { return Vec3(x*s, y*s, z*s); }
10};
11
12static inline double dot(const Vec3& a, const Vec3& b){ return a.x*b.x + a.y*b.y + a.z*b.z; }
13static inline Vec3 cross(const Vec3& a, const Vec3& b){
14 return Vec3(a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x);
15}
16static inline double norm(const Vec3& a){ return sqrt(dot(a,a)); }
17static inline Vec3 normalize(const Vec3& a){ double n = norm(a); return (n>0? a*(1.0/n): a); }
18
19// Stereographic projection from the NORTH pole: (u,v) in R^2 -> S^2 \ {north}
20Vec3 stereoNorth(double u, double v){
21 double r2 = u*u + v*v;
22 double d = r2 + 1.0;
23 double x = 2.0*u / d;
24 double y = 2.0*v / d;
25 double z = (r2 - 1.0) / d;
26 return Vec3(x,y,z);
27}
28
29// Inverse stereographic projection to the NORTH chart: S^2 \ {north} -> R^2
30pair<double,double> invStereoNorth(const Vec3& p){
31 // Undefined near north pole where 1 - z ~ 0; caller should avoid that region for this chart.
32 double denom = 1.0 - p.z;
33 return { p.x / denom, p.y / denom };
34}
35
36// Stereographic projection from the SOUTH pole: (u,v) in R^2 -> S^2 \ {south}
37Vec3 stereoSouth(double u, double v){
38 double r2 = u*u + v*v;
39 double d = r2 + 1.0;
40 double x = 2.0*u / d;
41 double y = 2.0*v / d;
42 double z = (1.0 - r2) / d; // sign flipped in z
43 return Vec3(x,y,z);
44}
45
46// Inverse stereographic projection to the SOUTH chart: S^2 \ {south} -> R^2
47pair<double,double> invStereoSouth(const Vec3& p){
48 // Undefined near south pole where 1 + z ~ 0; caller should avoid that region for this chart.
49 double denom = 1.0 + p.z;
50 return { p.x / denom, p.y / denom };
51}
52
53// Jacobian of the north chart at (u,v): columns are tangent vectors in R^3
54pair<Vec3,Vec3> jacobianStereoNorth(double u, double v){
55 double r2 = u*u + v*v;
56 double d = r2 + 1.0;
57 double d2 = d*d;
58 // x = 2u/d, y = 2v/d, z = (r2 - 1)/d
59 // dx/du = 2(-u*u + v*v + 1)/d^2; dx/dv = -4uv/d^2
60 // dy/du = -4uv/d^2; dy/dv = 2(u*u - v*v + 1)/d^2
61 // dz/du = 4u/d^2; dz/dv = 4v/d^2
62 Vec3 Ju( 2.0*(-u*u + v*v + 1.0)/d2, -4.0*u*v/d2, 4.0*u/d2 );
63 Vec3 Jv( -4.0*u*v/d2, 2.0*(u*u - v*v + 1.0)/d2, 4.0*v/d2 );
64 return {Ju, Jv};
65}
66
67// Exponential map on S^2 (exact great-circle formula)
68Vec3 expMapSphere(const Vec3& p_unit, const Vec3& tangent){
69 // p_unit must be unit length; tangent must be orthogonal to p_unit (tangent in TpS^2)
70 double s = norm(tangent);
71 if (s < 1e-12) return p_unit; // zero step
72 Vec3 t = tangent * (1.0/s);
73 Vec3 res = p_unit * cos(s) + t * sin(s);
74 // Numerical cleanup to unit length
75 return normalize(res);
76}
77
78int main(){
79 cout.setf(std::ios::fixed); cout<<setprecision(6);
80
81 // Pick a point in the north chart coordinates (u,v)
82 double u = 0.5, v = 0.4;
83 Vec3 p = stereoNorth(u,v); // point on S^2
84 cout << "Point on S^2 via north chart: ("<<p.x<<","<<p.y<<","<<p.z<<")\n";
85
86 // Transition map: north (u,v) -> sphere -> south (u',v')
87 auto us = invStereoSouth(p);
88 cout << "Transition to south chart: (u',v') = ("<<us.first<<","<<us.second<<")\n";
89
90 // Analytic transition map should be (u',v') = (u/(u^2+v^2), v/(u^2+v^2))
91 double r2 = u*u + v*v;
92 cout << "Analytic check: ("<< (u/(r2)) <<","<< (v/(r2)) <<")\n";
93
94 // Tangent basis from Jacobian columns, then Gram-Schmidt to orthonormalize
95 auto [Ju, Jv] = jacobianStereoNorth(u,v);
96 Vec3 e1 = normalize(Ju);
97 // remove component along e1 from Jv
98 Vec3 Jv_ortho = Jv - e1 * dot(e1, Jv);
99 Vec3 e2 = normalize(Jv_ortho);
100
101 // Verify tangent vectors are orthogonal to p
102 cout << "dot(p,e1) = "<< dot(p,e1) << ", dot(p,e2) = "<< dot(p,e2) <<" (both ~0)\n";
103
104 // Move along a tangent direction using the exponential map
105 double step = 0.3; // radians on the sphere
106 Vec3 p_next = expMapSphere(normalize(p), e1 * step);
107 cout << "Moved along geodesic: ("<<p_next.x<<","<<p_next.y<<","<<p_next.z<<") with |p_next|="<< norm(p_next) <<"\n";
108
109 return 0;
110}
111

This program demonstrates key smooth-manifold concepts on the unit sphere S^2. It constructs stereographic charts (north and south), computes a transition map by moving through the sphere, and verifies the analytic transition formula. It then evaluates the Jacobian of the north chart to obtain two tangent vectors, orthonormalizes them, and confirms they are orthogonal to the surface normal. Finally, it uses the exact exponential map (great-circle formula) to move along a geodesic within the tangent direction, ensuring the result remains on the sphere.

Time: O(1) per operationSpace: O(1)
Triangle mesh edge-manifold and orientation consistency checker
1#include <bits/stdc++.h>
2using namespace std;
3
4struct Edge { int a,b; };
5struct EdgeHash { size_t operator()(Edge const& e) const noexcept { return ( (uint64_t)e.a<<32 ) ^ (uint64_t)e.b; } };
6bool operator==(Edge const& e1, Edge const& e2){ return e1.a==e2.a && e1.b==e2.b; }
7
8// Ensure undirected canonical ordering (a<b)
9static inline Edge make_edge(int i, int j){ if(i<j) return {i,j}; else return {j,i}; }
10
11// For orientation: return +1 if face uses edge as (i->j), -1 if (j->i), 0 if not present
12int edge_direction_in_face(const array<int,3>& f, int i, int j){
13 for(int t=0;t<3;++t){
14 int u=f[t], v=f[(t+1)%3];
15 if(u==i && v==j) return +1;
16 if(u==j && v==i) return -1;
17 }
18 return 0;
19}
20
21struct ManifoldReport{
22 bool edge_manifold=true; // no edge has >2 incident faces
23 bool has_boundary=false; // at least one edge has exactly 1 incident face
24 bool orientable=true; // orientation can be made globally consistent
25 vector<pair<Edge,vector<int>>> nonmanifold_edges; // edges with >2 faces
26};
27
28ManifoldReport check_mesh_manifold(const vector<array<int,3>>& F){
29 ManifoldReport rep;
30 unordered_map<Edge, vector<int>, EdgeHash> edge2faces;
31 edge2faces.reserve(F.size()*2);
32
33 auto add_edge = [&](int a, int b, int fidx){ edge2faces[make_edge(a,b)].push_back(fidx); };
34
35 for(int fi=0; fi<(int)F.size(); ++fi){
36 auto f=F[fi];
37 add_edge(f[0],f[1],fi); add_edge(f[1],f[2],fi); add_edge(f[2],f[0],fi);
38 }
39
40 for(auto &kv : edge2faces){
41 if((int)kv.second.size() == 1) rep.has_boundary = true;
42 else if((int)kv.second.size() > 2){ rep.edge_manifold=false; rep.nonmanifold_edges.push_back({kv.first, kv.second}); }
43 }
44
45 // Orientation consistency check via BFS if edge-manifold (ignoring boundaries works too)
46 int nF = (int)F.size();
47 vector<int> orient(nF, 0); // 0=unvisited, +1 or -1 = orientation sign
48
49 // Build adjacency: for each edge with 2 incident faces, connect them
50 vector<vector<pair<int,int>>> adj(nF); // neighbor face, sign relation
51 for(auto &kv : edge2faces){
52 auto &faces = kv.second;
53 if(faces.size()==2){
54 int f0=faces[0], f1=faces[1];
55 // Determine if orientations should be same or opposite:
56 // Across a shared edge, consistent global orientation requires opposite traversal of the edge.
57 // If both faces traverse edge in the same direction, then their face signs must be opposite.
58 // Let s = dir0 * dir1. If s=+1 (same), then orient[f1] = -orient[f0]. If s=-1 (opposite), then orient[f1] = +orient[f0].
59 const auto &A=F[f0], &B=F[f1];
60 int i=kv.first.a, j=kv.first.b;
61 int d0 = edge_direction_in_face(A,i,j);
62 int d1 = edge_direction_in_face(B,i,j);
63 int s = d0 * d1; // should be +1 or -1
64 if(s==0){ /*degenerate face*/ }
65 int rel = (s==+1 ? -1 : +1);
66 adj[f0].push_back({f1, rel});
67 adj[f1].push_back({f0, rel});
68 }
69 }
70
71 for(int start=0; start<nF; ++start){
72 if(orient[start]!=0) continue;
73 orient[start]=+1;
74 queue<int>q; q.push(start);
75 while(!q.empty()){
76 int u=q.front(); q.pop();
77 for(auto [v,rel]: adj[u]){
78 int desired = orient[u]*rel;
79 if(orient[v]==0){ orient[v]=desired; q.push(v); }
80 else if(orient[v]!=desired){ rep.orientable=false; }
81 }
82 }
83 }
84
85 return rep;
86}
87
88int main(){
89 // Example: Tetrahedron (closed 2-manifold, orientable, no boundary)
90 // Vertices: 0,1,2,3; Faces: 4 triangles
91 vector<array<int,3>> F = {
92 {0,1,2}, {0,3,1}, {1,3,2}, {0,2,3}
93 };
94
95 ManifoldReport rep = check_mesh_manifold(F);
96 cout << boolalpha;
97 cout << "edge_manifold = "<<rep.edge_manifold<<"\n";
98 cout << "has_boundary = "<<rep.has_boundary<<"\n";
99 cout << "orientable = "<<rep.orientable<<"\n";
100 if(!rep.nonmanifold_edges.empty()){
101 cout << "Nonmanifold edges: \n";
102 for(auto &e: rep.nonmanifold_edges){ cout << "("<<e.first.a<<","<<e.first.b<<") incident faces="<<e.second.size()<<"\n"; }
103 }
104
105 return 0;
106}
107

This program checks a triangle mesh for edge-manifoldness (no edge has more than two incident faces) and attempts to assign a globally consistent orientation across faces. It builds an edge-to-faces map to detect nonmanifold and boundary edges, then constructs a face adjacency graph and uses BFS to propagate orientations using the rule that shared edges must be traversed in opposite directions in consistently oriented faces. The tetrahedron example is a closed, orientable 2-manifold; the program reports true for edge_manifold and orientable, and false for has_boundary.

Time: O(F) expected with hashing (O(F log F) worst-case with pathological hashing)Space: O(E) where E is the number of unique edges (typically O(F))
Local PCA intrinsic-dimension estimator (Swiss roll demo, manifold hypothesis)
1#include <bits/stdc++.h>
2using namespace std;
3
4// Generate Swiss roll in R^3: intrinsic dimension ~2
5vector<array<double,3>> swiss_roll(int N, double noise=0.05){
6 vector<array<double,3>> X; X.reserve(N);
7 std::mt19937 rng(42);
8 std::uniform_real_distribution<double> unif(0.0, 1.0);
9 std::normal_distribution<double> gauss(0.0, noise);
10 for(int i=0;i<N;++i){
11 double t = 1.5*M_PI + 3.0*M_PI*unif(rng); // [1.5pi, 4.5pi]
12 double h = 21.0*unif(rng); // height
13 double x = t * cos(t) + gauss(rng);
14 double y = h + gauss(rng);
15 double z = t * sin(t) + gauss(rng);
16 X.push_back({x,y,z});
17 }
18 return X;
19}
20
21// Compute squared Euclidean distance
22inline double dist2(const array<double,3>& a, const array<double,3>& b){
23 double dx=a[0]-b[0], dy=a[1]-b[1], dz=a[2]-b[2];
24 return dx*dx+dy*dy+dz*dz;
25}
26
27// Find k nearest neighbors (naive) indices for each point (excluding self)
28vector<vector<int>> knn(const vector<array<double,3>>& X, int k){
29 int N = (int)X.size();
30 vector<vector<int>> nbrs(N);
31 for(int i=0;i<N;++i){
32 vector<pair<double,int>> d; d.reserve(N-1);
33 for(int j=0;j<N;++j){ if(i==j) continue; d.push_back({dist2(X[i],X[j]), j}); }
34 nth_element(d.begin(), d.begin()+k, d.end());
35 d.resize(k);
36 vector<int> idx; idx.reserve(k);
37 for(auto &p: d) idx.push_back(p.second);
38 nbrs[i]=move(idx);
39 }
40 return nbrs;
41}
42
43// Jacobi eigen-decomposition for symmetric 3x3 matrix (sufficient for this demo)
44struct Eig3 { array<double,3> w; array<array<double,3>,3> V; };
45
46Eig3 jacobi_eig3(array<array<double,3>,3> A){
47 // Initialize V as identity
48 array<array<double,3>,3> V{}; for(int i=0;i<3;++i){ for(int j=0;j<3;++j) V[i][j]=(i==j); }
49 const int MAX_IT=50; const double EPS=1e-12;
50 for(int it=0; it<MAX_IT; ++it){
51 // Find largest off-diagonal element by magnitude
52 int p=0,q=1; double m=abs(A[0][1]);
53 auto upd=[&](int i,int j){ if(abs(A[i][j])>m){ m=abs(A[i][j]); p=i; q=j; } };
54 upd(0,2); upd(1,2);
55 if(m<EPS) break;
56 double app=A[p][p], aqq=A[q][q], apq=A[p][q];
57 double phi = 0.5*atan2(2.0*apq, (aqq - app));
58 double c = cos(phi), s = sin(phi);
59 // Rotate A = J^T A J in p-q plane
60 for(int k=0;k<3;++k){ if(k==p || k==q) continue; double aik=A[k][p], aiq=A[k][q]; A[k][p]=c*aik - s*aiq; A[p][k]=A[k][p]; A[k][q]=s*aik + c*aiq; A[q][k]=A[k][q]; }
61 double app2 = c*c*app - 2*s*c*apq + s*s*aqq;
62 double aqq2 = s*s*app + 2*s*c*apq + c*c*aqq;
63 A[p][p]=app2; A[q][q]=aqq2; A[p][q]=A[q][p]=0.0;
64 // Accumulate eigenvectors V = V J
65 for(int k=0;k<3;++k){ double vip=V[k][p], viq=V[k][q]; V[k][p]=c*vip - s*viq; V[k][q]=s*vip + c*viq; }
66 }
67 Eig3 R; R.w = {A[0][0], A[1][1], A[2][2]}; R.V = V;
68 // Sort by descending eigenvalues
69 array<int,3> idx={0,1,2};
70 sort(idx.begin(), idx.end(), [&](int i,int j){ return R.w[i]>R.w[j]; });
71 Eig3 S; for(int r=0;r<3;++r){ S.w[r]=R.w[idx[r]]; for(int c=0;c<3;++c) S.V[c][r]=R.V[c][idx[r]]; }
72 return S;
73}
74
75// Local PCA dimension estimate at each point via explained variance threshold
76int local_dim_at(const vector<array<double,3>>& X, const vector<int>& Nbr, double thr){
77 // Compute mean
78 array<double,3> mu{0,0,0};
79 for(int j: Nbr){ mu[0]+=X[j][0]; mu[1]+=X[j][1]; mu[2]+=X[j][2]; }
80 double m = (double)Nbr.size();
81 mu[0]/=m; mu[1]/=m; mu[2]/=m;
82 // Covariance
83 array<array<double,3>,3> C{}; // zero init
84 for(int j: Nbr){
85 array<double,3> d = { X[j][0]-mu[0], X[j][1]-mu[1], X[j][2]-mu[2] };
86 for(int a=0;a<3;++a) for(int b=0;b<3;++b) C[a][b] += d[a]*d[b];
87 }
88 for(int a=0;a<3;++a) for(int b=0;b<3;++b) C[a][b] /= m;
89 // Eigenvalues
90 Eig3 E = jacobi_eig3(C);
91 double sum = max(1e-18, E.w[0]+E.w[1]+E.w[2]);
92 double acc=0.0; for(int r=0;r<3;++r){ acc += max(0.0,E.w[r]); if(acc/sum >= thr) return r+1; }
93 return 3;
94}
95
96int main(){
97 ios::sync_with_stdio(false);
98 cin.tie(nullptr);
99
100 int N=400; int k=20; double thr=0.95;
101 auto X = swiss_roll(N, 0.05);
102 auto Nbrs = knn(X, k);
103
104 double avg_dim=0.0; int cnt=0;
105 for(int i=0;i<N;++i){
106 int dimi = local_dim_at(X, Nbrs[i], thr);
107 avg_dim += dimi; cnt++;
108 }
109 avg_dim /= max(1,cnt);
110
111 cout.setf(std::ios::fixed); cout<<setprecision(3);
112 cout << "Average local intrinsic dimension (thr="<<thr<<"): "<< avg_dim <<"\n";
113 cout << "(Expected ~2 for a swiss roll with modest noise)\n";
114 return 0;
115}
116

This program generates a 3D swiss roll dataset (a classic 2D manifold embedded in 3D), computes k-nearest neighbors via a naive all-pairs method, and performs local PCA at each point using a Jacobi eigen-decomposition for 3×3 covariance matrices. The local intrinsic dimension is the smallest r whose top-r eigenvalues explain a chosen fraction (e.g., 95%) of the local variance. Averaging across points yields an estimated intrinsic dimension close to 2, supporting the manifold hypothesis for this dataset.

Time: O(N^2 d + N d^3) with d=3 here (dominant term O(N^2))Space: O(N d) to store data plus O(d^2) workspace (here small constants)
#manifold#topological manifold#smooth manifold#tangent space#atlas#transition map#riemannian metric#geodesic#manifold hypothesis#pca#intrinsic dimension#stereographic projection#mesh manifold#whitney embedding#local linear approximation