Sparse Matrices & Computation
Key Points
- A sparse matrix stores only its nonzero entries, saving huge amounts of memory when most entries are zero.
- The three core formats are COO (triplets), CSR (row-compressed), and CSC (column-compressed), each optimized for different operations.
- CSR is excellent for fast sparse matrix–vector multiplication (SpMV), while CSC is great for column-centric operations or column-oriented factorizations.
- Converting between COO, CSR, and CSC mostly involves counting nonzeros per row/column and computing prefix sums.
- Careful handling of indices, duplicates, and sorting within rows/columns is crucial for correctness and performance.
- SpMV with CSR runs in O(nnz) time and uses O(nnz + m) space, where m is the number of rows.
- Sparse iterative solvers like Conjugate Gradient require only SpMV, dot products, and axpy operations, making CSR ideal for large SPD systems.
- Common pitfalls include mixing 0-based/1-based indices, leaving explicit zeros, unsorted columns within rows, and integer overflow in index arrays.
Prerequisites
- Arrays and memory layouts — Understanding contiguous storage and indexing is essential to grasp CSR/CSC pointer and index arrays.
- Big-O notation — To analyze time and space complexity of sparse operations like SpMV and conversions.
- Basic C++ vectors and iterators — Implementing sparse formats relies on std::vector, iteration, and standard algorithms.
- Linear algebra fundamentals — Concepts like matrix–vector products, transpose, and norms are prerequisites for sparse computations.
- Prefix sums (scan) — Constructing CSR/CSC from counts requires computing cumulative sums.
- Sorting and stability — Merging duplicates and canonicalizing rows/columns depend on sorting subarrays.
- Floating-point arithmetic — Sparse computations involve summation order, cancellation, and tolerance handling in iterative solvers.
Detailed Explanation
01 Overview
A sparse matrix is a matrix in which most entries are zero. Instead of storing all m×n entries, sparse formats store only the nonzero values and their positions, dramatically reducing memory and often accelerating computations. Three widely used storage schemes are COO (Coordinate), CSR (Compressed Sparse Row), and CSC (Compressed Sparse Column). COO stores each nonzero as a triplet (row, column, value) and is simple to construct but inefficient for repeated arithmetic. CSR groups nonzeros by row with three arrays: row_ptr (size m+1), col_idx (size nnz), and val (size nnz). CSC is the column-analog: col_ptr (size n+1), row_idx, and val. With these compressed formats, we can perform key operations like matrix–vector products, transposes, and iterative solves much faster and using far less memory than dense matrices.
02 Intuition & Analogies
Imagine a huge seating chart for a stadium with a billion seats (cells). If only a few seats are reserved (nonzero entries), copying the entire chart is wasteful. Instead, you’d keep a short list of reservations: seat row, seat column, and the name (value). That list is the COO format. Now suppose you frequently need to find all reservations in a given row (e.g., print all people in section A). You’d store the list grouped by row, with a quick index to jump to each row’s block—this is CSR. Similarly, if most of your queries ask for reservations by column (e.g., aisle numbers), you’d group by column—this is CSC. The pointer arrays (row_ptr or col_ptr) are like a table-of-contents telling you where each row or column’s list starts and ends. When multiplying a sparse matrix by a vector, you only touch the reserved seats (nonzeros), skipping the empties, so the work scales with the number of reservations rather than the stadium size. This is the key advantage: avoid processing zeros you don’t care about. But to make this efficient, your reservations in each row/column should be sorted by seat number (column/row), duplicates should be merged, and indices must be consistent (all 0-based or all 1-based).
03 Formal Definition
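A minimal formal statement, consistent with the arrays defined in the Overview:

```latex
Let $A \in \mathbb{R}^{m \times n}$ have $\mathrm{nnz}$ nonzero entries. The COO
representation is the multiset of triplets
$$\{(i_k,\, j_k,\, v_k)\}_{k=1}^{\mathrm{nnz}}, \qquad a_{i_k j_k} = v_k.$$
CSR stores arrays $\mathrm{val}$ and $\mathrm{col\_idx}$ of length $\mathrm{nnz}$
and $\mathrm{row\_ptr}$ of length $m+1$ such that for each row $i$,
$$a_{i,\;\mathrm{col\_idx}[p]} = \mathrm{val}[p]
  \quad \text{for } \mathrm{row\_ptr}[i] \le p < \mathrm{row\_ptr}[i+1],$$
and all entries of $A$ not listed this way are zero. CSC is defined symmetrically
with $\mathrm{col\_ptr}$ of length $n+1$ and $\mathrm{row\_idx}$ of length $\mathrm{nnz}$.
```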
04 When to Use
Use sparse formats when your matrix has a small fraction of nonzeros (e.g., sparsity s = nnz/(mn) ≤ 0.1 or far less) and you repeatedly perform linear algebra like matrix–vector products, solves, or graph traversals. CSR is the go-to for iterative methods (Conjugate Gradient, GMRES) where fast SpMV dominates runtime. CSC is preferred when algorithms are column-oriented (e.g., some factorizations, column sampling, or when interacting with libraries expecting CSC). COO is ideal as a construction format: streaming in (i,j,v) data from files, assembling stiffness matrices in finite element methods, or graph edges, then converting to CSR/CSC for computation. In graph algorithms, CSR encodes the adjacency list compactly and supports fast neighbor iteration; in recommendation systems and NLP, sparse feature matrices benefit from CSR/CSC storage. If your matrix is moderately dense or you need frequent random access to arbitrary A[i,j], dense storage may be simpler and faster due to cache locality. Also consider the cost of conversions: if you only do one small operation, conversion overhead might dominate; if you do many SpMVs or iterations, converting once pays off.
⚠️ Common Mistakes
- Mixing 0-based and 1-based indexing leads to out-of-bounds errors or incorrect results. Pick one (C++ typically 0-based) and be consistent throughout.
- Leaving explicit zeros in COO or CSR wastes space and time; filter them out during assembly and merge duplicates (same (i,j)) by summing values.
- Not sorting columns within each CSR row (or rows within each CSC column) can degrade performance and break algorithms that assume sorted structure (e.g., binary search, merge-based ops).
- Forgetting that row_ptr has length m+1 (and col_ptr length n+1) causes off-by-one errors when iterating segments.
- Integer overflow in indices for very large matrices (nnz > 2^31 − 1). Use 64-bit indices (std::int64_t or size_t) when needed.
- Assuming SpMV or conversions are thread-safe without care; writing into shared arrays during parallel fill requires per-thread buffers or atomic operations.
- Using CSR for operations that inherently need fast column access (or CSC for row access) without conversion can lead to O(mn) scans; convert formats or transpose instead.
- Neglecting numerical issues: in iterative solvers, poor scaling or ill-conditioned matrices slow convergence; apply diagonal scaling or preconditioning.
Key Formulas
Sparsity
$$s = \frac{\mathrm{nnz}}{mn}$$
Explanation: This fraction measures how full the matrix is. Small s means the matrix is very sparse, which usually benefits from sparse formats.
CSR SpMV
$$y_i = \sum_{p=\mathrm{row\_ptr}[i]}^{\mathrm{row\_ptr}[i+1]-1} \mathrm{val}[p]\; x_{\mathrm{col\_idx}[p]}$$
Explanation: For each row i, iterate p from row_ptr[i] to row_ptr[i+1]-1 and accumulate products into y[i]. This touches each nonzero exactly once.
SpMV Time Complexity
$$O(\mathrm{nnz})$$
Explanation: The time scales linearly with the number of nonzeros. Each nonzero contributes one multiply and one add (plus memory access).
CSR Storage Count
$$(m+1) + \mathrm{nnz} + \mathrm{nnz} = m + 1 + 2\,\mathrm{nnz} \ \text{entries}$$
Explanation: CSR uses m+1 entries for row_ptr, nnz for col_idx, and nnz for values. In bytes, multiply by the size of each type.
Prefix Sum for CSR
$$\mathrm{row\_ptr}[i+1] = \mathrm{row\_ptr}[i] + \mathrm{count}[i]$$
Explanation: The next row’s start equals the current start plus the count of nonzeros in this row. This is how we build row_ptr from counts.
Frobenius Norm
$$\|A\|_F = \sqrt{\sum_{i,j} a_{ij}^2}$$
Explanation: A common way to measure matrix magnitude that’s easy to compute from sparse storage by summing squares of stored values, since zeros contribute nothing.
SpMV Operation Count
$$\mathrm{flops} \approx 2\,\mathrm{nnz} - m$$
Explanation: There is one multiplication per nonzero and one addition per nonzero minus one per row, since the first product in each row initializes the accumulator. This highlights SpMV’s low arithmetic intensity.
Condition Number (SPD)
$$\kappa(A) = \frac{\lambda_{\max}(A)}{\lambda_{\min}(A)}$$
Explanation: For SPD matrices, CG convergence depends on the condition number. A larger kappa implies slower convergence.
CG Convergence Bound
$$\|e_k\|_A \le 2\left(\frac{\sqrt{\kappa}-1}{\sqrt{\kappa}+1}\right)^{k} \|e_0\|_A$$
Explanation: The A-norm of the error after k iterations decays geometrically with a rate determined by the condition number. Preconditioning reduces kappa.
COO to CSR Complexity
$$O(m + \mathrm{nnz})$$
Explanation: Counting per-row nonzeros and computing a prefix sum is linear in rows plus nonzeros, followed by a single pass to place entries.
Complexity Analysis
- SpMV with CSR: O(nnz) time; the matrix occupies O(nnz + m) storage.
- COO → CSR assembly: O(m + nnz) for counting and prefix sums; sorting within rows for canonical form adds at most an extra log factor in the row lengths.
- CSR ↔ CSC conversion: O(m + n + nnz) via counting by column (or row) plus a prefix sum.
- Conjugate Gradient: O(nnz) per iteration, dominated by one SpMV plus a few O(n) vector operations.
Code Examples
```cpp
#include <bits/stdc++.h>
using namespace std;

struct CSR {
    int nrows = 0, ncols = 0;
    vector<int> row_ptr;   // size = nrows + 1
    vector<int> col_idx;   // size = nnz
    vector<double> val;    // size = nnz

    // Multiply y = A x
    vector<double> matvec(const vector<double>& x) const {
        if ((int)x.size() != ncols) throw runtime_error("Dimension mismatch in matvec");
        vector<double> y(nrows, 0.0);
        for (int i = 0; i < nrows; ++i) {
            double sum = 0.0;
            for (int p = row_ptr[i]; p < row_ptr[i+1]; ++p) {
                sum += val[p] * x[col_idx[p]];
            }
            y[i] = sum;
        }
        return y;
    }

    // Construct CSR from COO triplets (0-based). Duplicates are summed; rows are sorted by column.
    static CSR fromCOO(int m, int n, const vector<int>& I, const vector<int>& J, const vector<double>& V) {
        if (I.size() != J.size() || I.size() != V.size()) throw runtime_error("COO arrays must be same length");
        int nnz = (int)I.size();
        CSR A; A.nrows = m; A.ncols = n; A.row_ptr.assign(m+1, 0);
        // 1) Count nonzeros per row
        for (int k = 0; k < nnz; ++k) {
            int r = I[k];
            if (r < 0 || r >= m) throw runtime_error("Row index out of range");
            int c = J[k];
            if (c < 0 || c >= n) throw runtime_error("Column index out of range");
            ++A.row_ptr[r];
        }
        // 2) Prefix sum to get row_ptr
        int cumsum = 0;
        for (int i = 0; i < m; ++i) {
            int cnt = A.row_ptr[i];
            A.row_ptr[i] = cumsum;
            cumsum += cnt;
        }
        A.row_ptr[m] = cumsum; // total nnz (may include duplicates)
        A.col_idx.assign(nnz, 0);
        A.val.assign(nnz, 0.0);
        // 3) Temp copy of row_ptr to track current insertion positions
        vector<int> next(A.row_ptr.begin(), A.row_ptr.end());
        // 4) Scatter COO entries into CSR buckets (unsorted within each row)
        for (int k = 0; k < nnz; ++k) {
            int r = I[k];
            int pos = next[r]++;
            A.col_idx[pos] = J[k];
            A.val[pos] = V[k];
        }
        // 5) For each row: sort by column and sum duplicates; compact in-place
        vector<int> new_row_ptr(m+1, 0);
        vector<int> new_col;
        vector<double> new_val;
        new_row_ptr[0] = 0;
        for (int i = 0; i < m; ++i) {
            int start = A.row_ptr[i], end = A.row_ptr[i+1];
            int len = end - start;
            vector<pair<int,double>> row; row.reserve(len);
            for (int p = start; p < end; ++p) row.emplace_back(A.col_idx[p], A.val[p]);
            sort(row.begin(), row.end(), [](auto& a, auto& b){ return a.first < b.first; });
            // merge duplicates
            for (int t = 0; t < (int)row.size(); ) {
                int cj = row[t].first;
                double acc = 0.0;
                int u = t;
                while (u < (int)row.size() && row[u].first == cj) { acc += row[u].second; ++u; }
                if (acc != 0.0) { // drop explicit zeros
                    new_col.push_back(cj);
                    new_val.push_back(acc);
                }
                t = u;
            }
            new_row_ptr[i+1] = (int)new_col.size();
        }
        A.row_ptr.swap(new_row_ptr);
        A.col_idx.swap(new_col);
        A.val.swap(new_val);
        return A;
    }
};

int main() {
    // Example COO for a 4x5 matrix
    int m = 4, n = 5;
    // Triplets (I, J, V), 0-based
    vector<int> I = {0,0,1,2,2,2,3,3};
    vector<int> J = {0,3,1,0,2,4,1,4};
    vector<double> V = {10.0, 2.0, 3.0, 4.0, 5.0, -1.0, 7.0, 1.5};

    CSR A = CSR::fromCOO(m, n, I, J, V);

    // Vector x of length n
    vector<double> x = {1, 2, 3, 4, 5};
    vector<double> y = A.matvec(x);

    cout << fixed << setprecision(2);
    cout << "y = A x = [";
    for (int i = 0; i < (int)y.size(); ++i) {
        if (i) cout << ", ";
        cout << y[i];
    }
    cout << "]\n";

    // Print CSR internals for verification
    cout << "row_ptr: ";
    for (int v : A.row_ptr) cout << v << ' ';
    cout << "\n";
    cout << "col_idx: ";
    for (int v : A.col_idx) cout << v << ' ';
    cout << "\n";
    cout << "val: ";
    for (double v : A.val) cout << v << ' ';
    cout << "\n";
    return 0;
}
```
We assemble a sparse matrix from COO triplets into CSR by counting per-row nonzeros, prefix-summing to get row_ptr, scattering triplets, then sorting and merging duplicates within each row. We then perform SpMV by iterating each row’s contiguous block and accumulating val[p] * x[col_idx[p]]. This program demonstrates the core data structures and a key kernel used in iterative methods.
```cpp
#include <bits/stdc++.h>
using namespace std;

struct CSR { int m=0, n=0; vector<int> row_ptr, col_idx; vector<double> val; };
struct CSC { int m=0, n=0; vector<int> col_ptr, row_idx; vector<double> val; };

// CSR -> CSC using counting by column (stable). The CSC arrays of A are the CSR arrays of A^T.
CSC csr_to_csc(const CSR& A) {
    CSC B; B.m = A.m; B.n = A.n;
    int m = A.m, n = A.n;
    int nnz = (int)A.col_idx.size();
    B.col_ptr.assign(n+1, 0);
    // Count entries per column
    for (int p = 0; p < nnz; ++p) {
        int j = A.col_idx[p];
        if (j < 0 || j >= n) throw runtime_error("Bad column index");
        ++B.col_ptr[j];
    }
    // Prefix sum to starting offsets
    int cumsum = 0;
    for (int j = 0; j < n; ++j) {
        int cnt = B.col_ptr[j];
        B.col_ptr[j] = cumsum;
        cumsum += cnt;
    }
    B.col_ptr[n] = cumsum;
    B.row_idx.assign(nnz, 0);
    B.val.assign(nnz, 0.0);
    // Temp positions per column
    vector<int> next(B.col_ptr.begin(), B.col_ptr.end());
    // Fill CSC arrays by scanning CSR rows
    for (int i = 0; i < m; ++i) {
        for (int p = A.row_ptr[i]; p < A.row_ptr[i+1]; ++p) {
            int j = A.col_idx[p];
            int q = next[j]++;
            B.row_idx[q] = i;
            B.val[q] = A.val[p];
        }
    }
    return B;
}

// CSC -> CSR by the same logic (count by row)
CSR csc_to_csr(const CSC& B) {
    CSR A; A.m = B.m; A.n = B.n;
    int m = B.m, n = B.n;
    int nnz = (int)B.row_idx.size();
    A.row_ptr.assign(m+1, 0);
    for (int q = 0; q < nnz; ++q) {
        int i = B.row_idx[q];
        if (i < 0 || i >= m) throw runtime_error("Bad row index");
        ++A.row_ptr[i];
    }
    int cumsum = 0;
    for (int i = 0; i < m; ++i) {
        int cnt = A.row_ptr[i];
        A.row_ptr[i] = cumsum;
        cumsum += cnt;
    }
    A.row_ptr[m] = cumsum;
    A.col_idx.assign(nnz, 0);
    A.val.assign(nnz, 0.0);
    vector<int> next(A.row_ptr.begin(), A.row_ptr.end());
    for (int j = 0; j < n; ++j) {
        for (int q = B.col_ptr[j]; q < B.col_ptr[j+1]; ++q) {
            int i = B.row_idx[q];
            int p = next[i]++;
            A.col_idx[p] = j;
            A.val[p] = B.val[q];
        }
    }
    return A;
}

int main() {
    // A simple 3x4 CSR matrix
    CSR A; A.m = 3; A.n = 4;
    A.row_ptr = {0, 2, 3, 5};
    A.col_idx = {0, 2, 1, 1, 3};
    A.val = {5.0, 1.0, 2.0, 4.0, -3.0};

    CSC B = csr_to_csc(A);   // convert to CSC
    CSR A2 = csc_to_csr(B);  // back to CSR

    // Verify equality of A and A2 (same pattern and values; order within rows may change)
    auto canonicalize = [](CSR C){
        for (int i = 0; i < C.m; ++i) {
            int s = C.row_ptr[i], e = C.row_ptr[i+1];
            vector<pair<int,double>> row;
            for (int p = s; p < e; ++p) row.emplace_back(C.col_idx[p], C.val[p]);
            sort(row.begin(), row.end());
            for (int p = s, t = 0; p < e; ++p, ++t) {
                C.col_idx[p] = row[t].first;
                C.val[p] = row[t].second;
            }
        }
        return C;
    };

    CSR A_can = canonicalize(A), A2_can = canonicalize(A2);
    bool same = (A_can.m == A2_can.m) && (A_can.n == A2_can.n)
             && (A_can.row_ptr == A2_can.row_ptr)
             && (A_can.col_idx == A2_can.col_idx)
             && (A_can.val == A2_can.val);

    cout << (same ? "Conversion CSR->CSC->CSR preserved structure." : "Mismatch after round-trip conversion!") << "\n";

    // Print CSC pointers for inspection
    cout << "CSC col_ptr: "; for (int v : B.col_ptr) cout << v << ' '; cout << "\n";
    cout << "CSC row_idx: "; for (int v : B.row_idx) cout << v << ' '; cout << "\n";
    cout << "CSC val: "; for (double v : B.val) cout << v << ' '; cout << "\n";
    return 0;
}
```
This example converts CSR to CSC using a counting sort by column and then back from CSC to CSR by counting rows. Both conversions run in O(m + n + nnz) time and are the standard way to transpose or switch access orientation. We verify the round trip preserves the matrix (up to within-row order).
```cpp
#include <bits/stdc++.h>
using namespace std;

struct CSR {
    int nrows = 0, ncols = 0;
    vector<int> row_ptr, col_idx;
    vector<double> val;
    vector<double> matvec(const vector<double>& x) const {
        if ((int)x.size() != ncols) throw runtime_error("Dimension mismatch");
        vector<double> y(nrows, 0.0);
        for (int i = 0; i < nrows; ++i) {
            double s = 0.0;
            for (int p = row_ptr[i]; p < row_ptr[i+1]; ++p) s += val[p] * x[col_idx[p]];
            y[i] = s;
        }
        return y;
    }
};

// Conjugate Gradient for A x = b, A is SPD and stored in CSR. Returns x.
vector<double> cg_solve(const CSR& A, const vector<double>& b, int max_iter=1000, double tol=1e-8) {
    int n = A.nrows;
    if (A.nrows != A.ncols) throw runtime_error("CG requires square matrix");
    if ((int)b.size() != n) throw runtime_error("Dimension mismatch");

    vector<double> x(n, 0.0);  // initial guess x0 = 0
    vector<double> r = b;      // r0 = b - A x0 = b
    vector<double> p = r;      // p0 = r0

    auto dot = [](const vector<double>& u, const vector<double>& v){
        return inner_product(u.begin(), u.end(), v.begin(), 0.0);
    };

    double rsold = dot(r, r);
    if (sqrt(rsold) < tol) return x;

    for (int k = 0; k < max_iter; ++k) {
        vector<double> Ap = A.matvec(p);
        double alpha = rsold / max(1e-300, dot(p, Ap));
        // x_{k+1} = x_k + alpha p_k ; r_{k+1} = r_k - alpha A p_k
        for (int i = 0; i < n; ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i];
        }
        double rsnew = dot(r, r);
        double rnorm = sqrt(rsnew);
        if (rnorm < tol) {
            cerr << "CG converged in " << k+1 << " iterations. Residual = " << rnorm << "\n";
            break;
        }
        double beta = rsnew / max(1e-300, rsold);
        for (int i = 0; i < n; ++i) p[i] = r[i] + beta * p[i];
        rsold = rsnew;
    }
    return x;
}

int main() {
    // Example SPD matrix (tridiagonal) 3x3:
    // [  4 -1  0 ]
    // [ -1  4 -1 ]
    // [  0 -1  3 ]
    CSR A; A.nrows = 3; A.ncols = 3;
    A.row_ptr = {0, 2, 5, 7};
    A.col_idx = {0, 1, 0, 1, 2, 1, 2};
    A.val = {4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 3.0};

    vector<double> b = {1.0, 2.0, 3.0};
    vector<double> x = cg_solve(A, b, 100, 1e-10);

    cout << fixed << setprecision(8);
    cout << "Solution x = [";
    for (int i = 0; i < (int)x.size(); ++i) { if (i) cout << ", "; cout << x[i]; }
    cout << "]\n";

    // Check residual norm ||Ax - b||
    vector<double> r = A.matvec(x);
    for (int i = 0; i < (int)r.size(); ++i) r[i] -= b[i];
    double rn = sqrt(inner_product(r.begin(), r.end(), r.begin(), 0.0));
    cout << "Residual norm = " << rn << "\n";
    return 0;
}
```
Conjugate Gradient is an iterative solver for symmetric positive definite systems that uses only SpMV, dot products, and vector updates. CSR’s fast SpMV makes CG efficient for large sparse problems. We demonstrate solving a small tridiagonal SPD system and report the residual norm to verify accuracy.