Orthogonal & Unitary Matrices
Key Points
- Orthogonal (real) and unitary (complex) matrices are length- and angle-preserving transformations, like perfect rotations and reflections.
- They satisfy Q^T Q = I for real matrices and U^* U = I for complex matrices, which means their inverse equals their transpose or conjugate transpose.
- The columns (and rows) of an orthogonal or unitary matrix form an orthonormal basis: they are mutually perpendicular and each has length 1.
- Applying these matrices does not change Euclidean 2-norms or inner products, a key property for numerical stability.
- Eigenvalues of orthogonal/unitary matrices lie on the unit circle; for real orthogonal matrices, the determinant is ±1.
- They are central in QR and polar decompositions, PCA, graphics rotations, signal processing transforms, and quantum computing.
- In floating-point arithmetic, exact identities become approximate, so we check Q^T Q ≈ I or U^* U ≈ I within a tolerance.
- Modified Gram–Schmidt or Householder reflections can compute orthonormal bases and QR factorizations of an m×n matrix in O(mn^2) time.
Prerequisites
- Matrix multiplication and transpose — Definitions Q^T Q = I and U^* U = I rely on basic matrix operations.
- Inner products and norms — Orthogonality/unitarity preserve inner products and 2-norms, which is central to their meaning.
- Complex numbers and conjugation — Unitary matrices require understanding complex conjugation and the Hermitian transpose.
- Vector spaces and bases — Columns of orthogonal/unitary matrices form orthonormal bases.
- Numerical linear algebra basics — Practical checks and constructions use norms, tolerances, and stable algorithms like Gram–Schmidt or Householder.
Detailed Explanation
01Overview
Orthogonal and unitary matrices are special square matrices that preserve geometric structure. Think of them as the purest form of movement in space: rotate, reflect, or change coordinates without stretching or squashing anything. For real matrices, this class is called orthogonal; for complex matrices, the analogous class is unitary. Formally, a real matrix Q is orthogonal if Q^T Q = I, and a complex matrix U is unitary if U^* U = I, where ^T denotes the transpose and ^* the conjugate (Hermitian) transpose. These identities imply that Q^{-1} = Q^T and U^{-1} = U^*, so these matrices are always invertible and perfectly conditioned in the 2-norm. Because they preserve dot products and lengths, they play a fundamental role in numerical algorithms where stability matters: QR decomposition for solving least squares, eigenvalue algorithms, and Krylov methods rely on orthogonal/unitary transformations to avoid amplifying rounding errors. In applications, you see them everywhere: 2D/3D graphics rotations are orthogonal; the discrete Fourier transform is unitary (up to scaling); quantum gates are unitary by physical necessity. Understanding orthogonal/unitary matrices unlocks deep connections between geometry, linear algebra, and computation.
02Intuition & Analogies
Imagine standing on a rubber mat with a perfect printed grid. A general matrix can warp the mat—stretching, skewing, squashing—so squares become parallelograms and circles become ellipses. An orthogonal or unitary matrix is the opposite: it picks up the mat and only rotates or flips it before placing it back down. Squares remain squares, circles stay circles, and distances between any two points are unchanged. That is what “length- and angle-preserving” means. In the real case (orthogonal), think of familiar operations: rotating a shape by 30 degrees or reflecting it across a line. In the complex case (unitary), it’s like rotating in a higher-dimensional sense where each component can also have a complex phase; quantum states evolve via unitary transformations to preserve total probability (norm). The columns of these matrices are like a set of perfectly aligned compass directions: each direction has length 1, and any pair of directions is perpendicular. When you express a vector in this new set of directions, you haven’t changed the vector’s length—only the coordinates used to describe it. Numerically, this is why algorithms that rely on orthogonal/unitary steps tend to be stable: like carefully moving a glass of water by rotating it instead of sloshing it back and forth, you don’t amplify measurement noise or rounding error when you avoid stretching.
03Formal Definition
A real square matrix Q is orthogonal if Q^T Q = Q Q^T = I; equivalently, Q^{-1} = Q^T. A complex square matrix U is unitary if U^* U = U U^* = I; equivalently, U^{-1} = U^*, where U^* denotes the conjugate (Hermitian) transpose. Either condition is equivalent to the columns (equivalently, the rows) forming an orthonormal basis, and to preservation of the inner product: (Qx)^T (Qy) = x^T y for all vectors x, y (with conjugation in the complex case). The n×n orthogonal matrices form the group O(n), those with determinant +1 form the rotation group SO(n), and the n×n unitary matrices form the group U(n).
04When to Use
Use orthogonal/unitary matrices whenever you need to change coordinates or manipulate data without distorting lengths and angles. Examples: (1) QR decomposition: factor A = QR with Q orthogonal (or unitary) and R upper triangular to solve least squares stably. (2) Eigenvalue algorithms: iterative methods (QR algorithm, Arnoldi/Lanczos) repeatedly apply orthogonal/unitary similarity transforms to avoid growth of numerical errors. (3) Dimensionality reduction and data analysis: the principal components (eigenvectors) form an orthonormal basis; projections onto these bases use orthogonal matrices. (4) Computer graphics and robotics: representing rigid body rotations via orthogonal matrices ensures no unwanted scaling or shearing. (5) Signal processing: transforms like DFT, DCT, and wavelets are orthogonal/unitary (up to scaling), preserving energy (Parseval’s theorem). (6) Quantum computing: gates are unitary to conserve probability amplitudes. (7) Preconditioning and stability: whenever you orthonormalize vectors (e.g., Gram–Schmidt), you are building the columns of an orthogonal/unitary matrix to preserve norms and improve conditioning. In code, choose orthogonal/unitary transforms when you want numerical stability, especially in algorithms sensitive to roundoff, and prefer constructions like Householder reflectors or modified Gram–Schmidt over naïve methods.
⚠️Common Mistakes
- Forgetting conjugation in the complex case: U^T U = I is generally false; you must use U^* U = I. Always take the conjugate transpose for unitary checks.
- Confusing “orthonormal columns” with “square orthogonal matrix”: rectangular matrices with orthonormal columns (m×n, m ≥ n) satisfy Q^T Q = I_n but not Q Q^T = I_m. They are isometries, not square orthogonal matrices.
- Assuming the determinant is +1: orthogonal matrices can have determinant −1 (reflections). If you specifically need a rotation, enforce det(Q) = +1.
- Ignoring numerical tolerance: due to floating-point error, Q^T Q will not be exactly I. Check ||Q^T Q − I|| ≤ ε, not exact equality.
- Using classical Gram–Schmidt (CGS) on ill-conditioned sets: CGS can lose orthogonality in finite precision. Prefer modified Gram–Schmidt (MGS) or Householder reflections.
- Mixing inner-product conventions: in complex spaces, the standard inner product is conjugate-linear in one argument (here, the first). Implement dot products accordingly in code.
- Overlooking stability properties: replacing an orthogonal step with a general invertible transform can drastically increase rounding error.
- Mismanaging normalization: forgetting to normalize columns after orthogonalization breaks orthonormality.
- Assuming commuting properties: in general, orthogonal/unitary matrices do not commute; preserve multiplication order in derivations and code.
- Treating orthogonality as an entrywise property: orthogonality is about column/row inner products, not each entry being small or special.
Key Formulas
Orthogonal Definition
Q^T Q = Q Q^T = I, equivalently Q^{-1} = Q^T
Explanation: A real matrix Q is orthogonal if its transpose is its inverse. This means applying Q preserves dot products and lengths.
Unitary Definition
U^* U = U U^* = I, equivalently U^{-1} = U^*
Explanation: A complex matrix U is unitary if its conjugate transpose is its inverse. This is the complex analogue of orthogonality.
Inner-Product and Norm Preservation (Orthogonal)
(Qx)^T (Qy) = x^T y and ||Qx||_2 = ||x||_2
Explanation: Orthogonal matrices preserve the Euclidean inner product and hence the 2-norm. They do not stretch or skew vectors.
Inner-Product and Norm Preservation (Unitary)
Explanation: Unitary matrices preserve the complex inner product and thus vector norms. This underpins their role in quantum mechanics.
Column Orthonormality
q_i^* q_j = 1 if i = j, and 0 otherwise (q_i^T q_j in the real case)
Explanation: The columns of orthogonal/unitary matrices are mutually orthogonal and of unit length. This is equivalent to Q^T Q = I or U^* U = I.
Determinant Constraints
|det U| = 1 and det Q = ±1
Explanation: Unitary matrices have determinant with unit modulus, and orthogonal matrices have determinant ±1. A determinant of −1 indicates a reflection.
QR Decomposition
A = QR, with Q orthogonal/unitary and R upper triangular
Explanation: Any full-rank matrix can be factored into an orthogonal/unitary factor and an upper triangular factor. This is central to least squares and eigenvalue algorithms.
Polar Decomposition
A = QH, with Q unitary and H = (A^* A)^{1/2}
Explanation: Every matrix factors into a unitary part Q and a Hermitian positive semidefinite part H. Q is the closest unitary matrix to A in the Frobenius norm.
Householder Reflector
H = I − 2 v v^* / (v^* v)
Explanation: This matrix reflects across the hyperplane orthogonal to v and is unitary/orthogonal. It is used to introduce zeros below the diagonal in QR.
Perfect Conditioning
κ_2(Q) = ||Q||_2 ||Q^{-1}||_2 = 1
Explanation: Orthogonal/unitary matrices have the best possible 2-norm condition number. They do not amplify relative errors in the 2-norm.
Spectrum on Unit Circle
|λ| = 1 for every eigenvalue λ
Explanation: All eigenvalues of orthogonal or unitary matrices have magnitude 1. Real orthogonal matrices can have eigenvalues +1, −1, or complex-conjugate pairs e^{±iθ}.
Complexity Analysis
Verifying Q^T Q ≈ I by explicit multiplication costs O(n^3) for an n×n matrix. Computing a QR factorization of an m×n matrix via Modified Gram–Schmidt or Householder reflections takes O(mn^2) floating-point operations (roughly 2mn^2 flops). Applying an orthogonal/unitary matrix to a vector costs O(n^2) in general, though structured unitary transforms such as the FFT can be applied in O(n log n).
Code Examples
#include <bits/stdc++.h>
using namespace std;

// Generic conjugation: identity for real, std::conj for complex
template<typename T>
T my_conj(const T& x) { return x; }
template<typename T>
complex<T> my_conj(const complex<T>& z) { return conj(z); }

template<typename T>
using Matrix = vector<vector<T>>;

template<typename T>
Matrix<T> matmul(const Matrix<T>& A, const Matrix<T>& B) {
    size_t m = A.size(), k = A[0].size(), n = B[0].size();
    Matrix<T> C(m, vector<T>(n, T{}));
    for (size_t i = 0; i < m; ++i)
        for (size_t p = 0; p < k; ++p)
            for (size_t j = 0; j < n; ++j)
                C[i][j] += A[i][p] * B[p][j];
    return C;
}

template<typename T>
Matrix<T> conj_transpose(const Matrix<T>& A) {
    size_t m = A.size(), n = A[0].size();
    Matrix<T> AH(n, vector<T>(m));
    for (size_t i = 0; i < m; ++i)
        for (size_t j = 0; j < n; ++j)
            AH[j][i] = my_conj(A[i][j]);
    return AH;
}

template<typename T>
Matrix<T> identity(size_t n) {
    Matrix<T> I(n, vector<T>(n, T{}));
    for (size_t i = 0; i < n; ++i) I[i][i] = T{1};
    return I;
}

// Frobenius norm; abs() handles both real and complex entries
template<typename T>
double frobenius_norm(const Matrix<T>& A) {
    long double s = 0.0L;
    for (const auto& row : A)
        for (const auto& x : row) {
            long double a = (long double)abs(x);
            s += a * a;
        }
    return sqrt((double)s);
}

// Compute A - B
template<typename T>
Matrix<T> matdiff(const Matrix<T>& A, const Matrix<T>& B) {
    size_t m = A.size(), n = A[0].size();
    Matrix<T> C(m, vector<T>(n));
    for (size_t i = 0; i < m; ++i)
        for (size_t j = 0; j < n; ++j)
            C[i][j] = A[i][j] - B[i][j];
    return C;
}

// Check if M^* M ≈ I and M M^* ≈ I within tolerance eps.
// Works for real (orthogonal) and complex (unitary) matrices.
template<typename T>
bool is_unitary(const Matrix<T>& M, double eps = 1e-9) {
    size_t n = M.size();
    Matrix<T> MH = conj_transpose(M);
    Matrix<T> I = identity<T>(n);
    double e1 = frobenius_norm(matdiff(matmul(MH, M), I));
    double e2 = frobenius_norm(matdiff(matmul(M, MH), I));
    return (e1 <= eps) && (e2 <= eps);
}

int main() {
    cout.setf(ios::fixed); cout << setprecision(6);
    // Example 1: real orthogonal 2D rotation by theta
    double theta = M_PI / 6.0; // 30 degrees
    Matrix<double> Q = {
        { cos(theta), -sin(theta) },
        { sin(theta),  cos(theta) }
    };
    cout << "Q is orthogonal? " << (is_unitary(Q) ? "yes" : "no") << "\n";

    // Example 2: complex 2x2 unitary (Hadamard-like with phase)
    double s = 1.0 / sqrt(2.0);
    using cd = complex<double>;
    Matrix<cd> U = {
        { cd(s,0), cd(0,s) },
        { cd(0,s), cd(s,0) }
    };
    cout << "U is unitary? " << (is_unitary(U) ? "yes" : "no") << "\n";

    // A slight perturbation breaks orthogonality/unitarity
    Q[0][0] += 1e-6;
    cout << "Perturbed Q is orthogonal? " << (is_unitary(Q) ? "yes" : "no") << "\n";
    return 0;
}
This program implements generic utilities to multiply matrices, build identities, compute conjugate transposes, and measure Frobenius norms. The function is_unitary checks both M^* M ≈ I and M M^* ≈ I within a tolerance, so it correctly tests orthogonality for real matrices and unitarity for complex matrices. It verifies a 2D rotation matrix (orthogonal) and a small complex unitary, then shows how tiny perturbations can break the property numerically.
#include <bits/stdc++.h>
using namespace std;

template<typename T>
using Matrix = vector<vector<T>>;

Matrix<double> transpose(const Matrix<double>& A) {
    size_t m = A.size(), n = A[0].size();
    Matrix<double> AT(n, vector<double>(m));
    for (size_t i = 0; i < m; ++i)
        for (size_t j = 0; j < n; ++j)
            AT[j][i] = A[i][j];
    return AT;
}

// Dot product of two real vectors
double dot(const vector<double>& a, const vector<double>& b) {
    double s = 0.0;
    for (size_t i = 0; i < a.size(); ++i) s += a[i]*b[i];
    return s;
}

// 2-norm of a real vector
double norm2(const vector<double>& a) {
    return sqrt(dot(a, a));
}

// Modified Gram–Schmidt: A (m×n) -> Q (m×n) with orthonormal columns, R (n×n) upper triangular
pair<Matrix<double>, Matrix<double>> mgs_qr(const Matrix<double>& A) {
    size_t m = A.size(), n = A[0].size();
    Matrix<double> Q = A; // will store orthonormal columns
    Matrix<double> R(n, vector<double>(n, 0.0));

    for (size_t k = 0; k < n; ++k) {
        // r_kk = ||q_k||; then normalize column k
        vector<double> v(m);
        for (size_t i = 0; i < m; ++i) v[i] = Q[i][k];
        double r_kk = norm2(v);
        if (r_kk == 0.0) throw runtime_error("Linearly dependent columns");
        for (size_t i = 0; i < m; ++i) Q[i][k] /= r_kk;
        R[k][k] = r_kk;
        // Orthogonalize remaining columns against q_k
        for (size_t j = k+1; j < n; ++j) {
            double r_kj = 0.0;
            for (size_t i = 0; i < m; ++i) r_kj += Q[i][k] * Q[i][j];
            R[k][j] = r_kj;
            for (size_t i = 0; i < m; ++i) Q[i][j] -= r_kj * Q[i][k];
        }
    }
    return {Q, R};
}

Matrix<double> matmul(const Matrix<double>& A, const Matrix<double>& B) {
    size_t m = A.size(), k = A[0].size(), n = B[0].size();
    Matrix<double> C(m, vector<double>(n, 0.0));
    for (size_t i = 0; i < m; ++i)
        for (size_t p = 0; p < k; ++p)
            for (size_t j = 0; j < n; ++j)
                C[i][j] += A[i][p]*B[p][j];
    return C;
}

Matrix<double> identity(size_t n) {
    Matrix<double> I(n, vector<double>(n, 0.0));
    for (size_t i = 0; i < n; ++i) I[i][i] = 1.0;
    return I;
}

double frob(const Matrix<double>& A) {
    long double s = 0.0L;
    for (auto& r : A) for (auto x : r) s += (long double)x * (long double)x;
    return sqrt((double)s);
}

Matrix<double> diff(const Matrix<double>& A, const Matrix<double>& B) {
    size_t m = A.size(), n = A[0].size();
    Matrix<double> C(m, vector<double>(n));
    for (size_t i = 0; i < m; ++i)
        for (size_t j = 0; j < n; ++j)
            C[i][j] = A[i][j] - B[i][j];
    return C;
}

int main() {
    // Build a random tall matrix A (m >= n)
    size_t m = 5, n = 3;
    mt19937_64 rng(42);
    normal_distribution<double> N(0.0, 1.0);
    Matrix<double> A(m, vector<double>(n));
    for (size_t i = 0; i < m; ++i)
        for (size_t j = 0; j < n; ++j) A[i][j] = N(rng);

    auto [Q, R] = mgs_qr(A);

    // Check orthonormal columns: Q^T Q ≈ I
    double err = frob(diff(matmul(transpose(Q), Q), identity(n)));

    cout.setf(ios::fixed); cout << setprecision(6);
    cout << "Frobenius error ||Q^T Q - I|| = " << err << "\n";

    // Reconstruct A ≈ Q R and report error
    double rec_err = frob(diff(matmul(Q, R), A));
    cout << "Reconstruction error ||QR - A|| = " << rec_err << "\n";
    return 0;
}
This program implements Modified Gram–Schmidt to factor a real m×n matrix A as A = QR with Q having orthonormal columns and R upper triangular. It checks orthonormality via Q^T Q ≈ I and reconstruction via ||QR − A||. MGS is more stable than classical GS and is appropriate when Householder QR is not required.
#include <bits/stdc++.h>
using namespace std;

template<typename T>
using Matrix = vector<vector<T>>;

using cd = complex<double>;

Matrix<cd> conj_transpose(const Matrix<cd>& A) {
    size_t m = A.size(), n = A[0].size();
    Matrix<cd> AH(n, vector<cd>(m));
    for (size_t i = 0; i < m; ++i)
        for (size_t j = 0; j < n; ++j)
            AH[j][i] = conj(A[i][j]);
    return AH;
}

Matrix<cd> matmul(const Matrix<cd>& A, const Matrix<cd>& B) {
    size_t m = A.size(), k = A[0].size(), n = B[0].size();
    Matrix<cd> C(m, vector<cd>(n, cd(0,0)));
    for (size_t i = 0; i < m; ++i)
        for (size_t p = 0; p < k; ++p)
            for (size_t j = 0; j < n; ++j)
                C[i][j] += A[i][p]*B[p][j];
    return C;
}

Matrix<cd> identity(size_t n) {
    Matrix<cd> I(n, vector<cd>(n, cd(0,0)));
    for (size_t i = 0; i < n; ++i) I[i][i] = cd(1,0);
    return I;
}

double frob(const Matrix<cd>& A) {
    long double s = 0.0L;
    for (auto& r : A) for (auto x : r) s += (long double)norm(x); // norm(z) = |z|^2
    return sqrt((double)s);
}

Matrix<cd> diff(const Matrix<cd>& A, const Matrix<cd>& B) {
    size_t m = A.size(), n = A[0].size();
    Matrix<cd> C(m, vector<cd>(n));
    for (size_t i = 0; i < m; ++i)
        for (size_t j = 0; j < n; ++j)
            C[i][j] = A[i][j] - B[i][j];
    return C;
}

// Complex MGS: inner products use conj(Q[i][k]), i.e. the Hermitian inner product,
// which is conjugate-linear in the first argument
tuple<Matrix<cd>, Matrix<cd>> mgs_qr_complex(const Matrix<cd>& A) {
    size_t m = A.size(), n = A[0].size();
    Matrix<cd> Q = A;
    Matrix<cd> R(n, vector<cd>(n, cd(0,0)));
    for (size_t k = 0; k < n; ++k) {
        // r_kk = ||column k||
        double rkk = 0.0;
        for (size_t i = 0; i < m; ++i) rkk += norm(Q[i][k]);
        rkk = sqrt(rkk);
        if (rkk == 0.0) throw runtime_error("Linearly dependent columns");
        for (size_t i = 0; i < m; ++i) Q[i][k] /= rkk; // normalize
        R[k][k] = cd(rkk, 0);
        for (size_t j = k+1; j < n; ++j) {
            cd rkj(0,0);
            for (size_t i = 0; i < m; ++i) rkj += conj(Q[i][k]) * Q[i][j];
            R[k][j] = rkj;
            for (size_t i = 0; i < m; ++i) Q[i][j] -= rkj * Q[i][k];
        }
    }
    return {Q, R};
}

int main() {
    size_t n = 4; // build a random complex square matrix, then a unitary via MGS
    mt19937_64 rng(123);
    normal_distribution<double> N(0.0, 1.0);
    Matrix<cd> A(n, vector<cd>(n));
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j) A[i][j] = cd(N(rng), N(rng));

    auto [Q, R] = mgs_qr_complex(A); // Q should be unitary

    double err = frob(diff(matmul(conj_transpose(Q), Q), identity(n)));
    cout.setf(ios::fixed); cout << setprecision(6);
    cout << "||Q^*Q - I||_F = " << err << "\n";

    // Check norm preservation on a random vector x: compute y = Qx
    vector<cd> x(n);
    for (size_t i = 0; i < n; ++i) x[i] = cd(N(rng), N(rng));
    vector<cd> y(n, cd(0,0));
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j) y[i] += Q[i][j]*x[j];
    auto nrm = [](const vector<cd>& v) {
        long double s = 0.0L;
        for (auto z : v) s += (long double)norm(z);
        return sqrt((double)s);
    };
    cout << "||x||_2 = " << nrm(x) << ", ||Qx||_2 = " << nrm(y) << "\n";
    return 0;
}
This program constructs a unitary matrix Q by orthonormalizing the columns of a random complex matrix using Modified Gram–Schmidt adapted to the complex inner product. It verifies unitarity by computing ||Q^* Q − I||_F and demonstrates norm preservation by comparing ||x|| and ||Qx|| for a random vector x.