🎓How I Study AIHISA
📖Read
📄Papers📰Blogs🎬Courses
💡Learn
🛤️Paths📚Topics💡Concepts🎴Shorts
🎯Practice
⏱️Coach🧩Problems🧠Thinking🎯Prompts🧠Review
SearchSettings
How I Study AI - Learn AI Papers & Lectures the Easy Way
∑MathBeginner

Graph Fundamentals

Key Points

  • •
    A graph models relationships between items using vertices (nodes) and edges (links).
  • •
    An adjacency matrix is an n-by-n table A where Aij​ indicates whether vertex i is connected to vertex j; it uses O(n2) space and is good for dense graphs.
  • •
    The degree of a vertex is the number of edges touching it; the degree matrix D is a diagonal matrix with degrees on the diagonal.
  • •
    For undirected simple graphs, the sum of all degrees equals twice the number of edges (Handshaking Lemma).
  • •
    Adjacency lists store only existing neighbors, using O(n + m) space and enabling O(n + m) traversals like BFS/DFS on sparse graphs.
  • •
    BFS on an adjacency matrix takes O(n2), while BFS on adjacency lists takes O(n + m).
  • •
    The graph Laplacian L = D - A connects graph structure to linear algebra and is widely used in spectral clustering and network analysis.
  • •
    Choose representations based on density and operations: matrices for quick edge queries and algebraic methods; lists for fast traversal on sparse graphs.

Prerequisites

  • →Big-O notation and basic algorithm analysis — To compare adjacency matrix vs list performance and reason about traversal complexities.
  • →Arrays, vectors, and 2D arrays in C++ — Graph matrices are stored as 2D arrays; adjacency lists use vectors of vectors.
  • →Queues and BFS idea — BFS relies on queue operations to explore layers in unweighted graphs.
  • →Basic linear algebra (matrices) — Adjacency and degree matrices, and the Laplacian L = D − A, are matrix concepts.
  • →Set vs multiset semantics — To decide whether to allow multiple edges and how to count degrees.

Detailed Explanation

Tap terms for definitions

01Overview

Hook: Think of a social network where people are nodes and friendships are links. If you can model who is connected to whom, you can answer questions like “Who can reach whom in two steps?” or “Who is the most connected?” Concept: Graphs give a clean language for such problems. A graph consists of vertices (also called nodes) and edges (connections). There are two common ways to store a graph on a computer: adjacency matrices and adjacency lists. In this lesson we focus on the matrix viewpoint and the closely related degree matrix. Example: In a class of 4 students, if Alice is friends with Bob and Carol, and Bob is also friends with Dave, we can write a 4×4 table where a 1 at row i, column j means student i is friends with student j. Counting 1s across a row tells you how many friends that student has—their degree—and those counts become the diagonal entries of the degree matrix. These basic ingredients power many algorithms, from simple traversals like BFS to advanced tools like spectral clustering via the Laplacian L = D − A. Understanding when to use each representation and how to compute degrees correctly will help you implement graph algorithms efficiently and accurately.

02Intuition & Analogies

Hook: Imagine a city map. Intersections are places you can stand (vertices), and streets are the paths you can take (edges). If you had a huge billboard-sized table listing all pairs of intersections and checked a box when a street connects them, that’s an adjacency matrix. Concept: The adjacency matrix is like a big seating chart for connections: each row is a “from” seat, each column a “to” seat, and a 1 (or a weight) says “these two are linked.” The degree of an intersection is simply how many streets flow out of it—the number of neighbors you can immediately reach. The degree matrix writes those counts on a diagonal, like labeling each row with the number of exits before reading the rest. Example: Consider 5 bus stops. Stop 0 connects to 1, 2; stop 1 connects to 2; stop 3 connects to 4. Draw a 5×5 table; put a 1 wherever two stops are directly connected. Row sums tell you how many direct rides leave each stop. If the roads are one-way, your row sums are out-degrees; column sums are in-degrees. If the roads are two-way, the table is symmetric, because if i connects to j, then j connects to i. Why two representations? If every stop connects to almost every other (a dense network), a full table is fine and makes “is there a road between i and j?” an instant lookup. But if connections are rare (a sparse network), keeping only neighbor lists saves space and speeds up traversals that only follow existing roads. The Laplacian L = D − A acts like a balance sheet: how much can leave a node (degree) minus what’s tied up in specific connections (adjacency), enabling powerful analyses like measuring how tightly knit different regions are.

03Formal Definition

A graph is an ordered pair G = (V, E), where V is a finite set of vertices, ∣V∣ = n, and E ⊆ V × V is a set of edges. In simple undirected graphs, edges are unordered pairs {u, v}, no self-loops (u=v), and no parallel edges. In directed graphs (digraphs), edges are ordered pairs (u, v). The adjacency matrix A ∈ Rn×n of a graph with vertex ordering v1​, …, vn​ is defined by Aij​=1 if (vi​, vj​) ∈ E and 0 otherwise for unweighted graphs; for weighted graphs, Aij​=wij​ where wij​≥0 is the edge weight. For undirected graphs, A is symmetric, i.e., A=A⊤. The degree of a vertex vi​ in an undirected graph is di​ = ∑j=1n​ Aij​. In directed graphs, the out-degree is di​^{out} = ∑j=1n​ Aij​, and the in-degree is di​^{in} = ∑j=1n​ Aji​. The degree matrix D ∈ Rn×n is diagonal with Dii​=di​ (or Dii​=diout​ for a chosen convention in digraphs). The combinatorial Laplacian is L=D − A. For weighted graphs, degrees generalize to Dii​ = ∑j=1n​ Aij​ (sum of weights). Key identities include the Handshaking Lemma for undirected graphs ∑i=1n​ di​=2∣E∣ and, for digraphs, ∑i​ di​^{out} = ∑i​ di​^{in} = ∣E∣. Matrix powers carry walk counts: (Ak)_{ij} equals the number of walks of length k from vi​ to vj​ in an unweighted graph.

04When to Use

Use an adjacency matrix when your graph is dense (m is close to n^{2}), when you need O(1) edge-existence queries A_{ij}, or when you plan to apply linear-algebraic techniques (e.g., computing L = D − A, eigenvalues for spectral clustering, counting k-length walks via A^{k}). It also simplifies some implementations when n is small to moderate, because memory is predictable and indexing is straightforward. Use an adjacency list when your graph is sparse (m ≪ n^{2}), or when your algorithms iterate over neighbors (BFS, DFS, Dijkstra on sparse graphs). Neighbor iteration from lists is proportional to the actual number of edges encountered, giving O(n + m) traversals. For streaming or incremental graph building, lists are usually preferable because you can append neighbors without touching O(n^{2}) memory. For weighted graphs with mostly absent edges, lists also avoid storing many zeros. For directed graphs, choose representations that support the queries you need: if you frequently need in-neighbors, maintain reverse adjacency lists or compute column sums of A. For educational or debugging purposes, matrices are nice to print and reason about; for performance in large, sparse real-world networks (social graphs, road networks, web graphs), lists are typically the practical choice.

⚠️Common Mistakes

  • Off-by-one indexing: Many inputs are 1-based, but C++ vectors are 0-based. Always convert consistently; otherwise you may write outside bounds or misplace edges. Use assertions to catch u, v in [0, n).
  • Forgetting symmetry in undirected graphs: If you set A[u][v] = 1 but forget A[v][u] = 1 (or push both ways in lists), degrees and traversals will be wrong.
  • Misinterpreting degrees in digraphs: Row sums give out-degrees; column sums give in-degrees. Mixing them leads to incorrect analyses.
  • Ignoring self-loops or multiedges: If your graph is simple, do not add u = v or duplicate edges. If multigraphs are allowed, degrees must count multiplicities; matrices may store counts or weights.
  • Using an adjacency matrix for very sparse large graphs: O(n^{2}) memory can exceed limits quickly (e.g., n = 10^5 implies 10^{10} entries). Prefer lists for sparsity.
  • Confusing 0/1 with weights: In weighted graphs, A_{ij} holds w_{ij}, not just 0/1. Degrees become weighted sums, not counts.
  • BFS/DFS pitfalls: Forgetting to mark visited before enqueuing can add duplicates; in a matrix-based BFS, scanning all n columns per row changes complexity to O(n^{2}).
  • Integer overflow in degree sums: With large m or weights, use 64-bit integers (long long) to store degrees and totals.
  • Not clearing or reinitializing structures between test cases in competitive programming, causing leftover edges to pollute results.

Key Formulas

Handshaking Lemma (Undirected)

i=1∑n​di​=2m

Explanation: In an undirected simple graph with n vertices and m edges, adding all vertex degrees counts each edge twice (once for each endpoint). This identity helps verify data structures and degree computations.

Directed Degree Sums

i=1∑n​diout​=mandi=1∑n​diin​=m

Explanation: In a directed graph, each edge contributes exactly one to an out-degree and one to an in-degree. Therefore, total out-degree equals total in-degree and equals the number of edges.

Edge Bound (Simple Undirected)

m≤2n(n−1)​

Explanation: A simple undirected graph has at most one edge per unordered pair, so the number of edges cannot exceed n choose 2. This sets the maximum density for such graphs.

Degree from Adjacency Matrix

di​=j=1∑n​Aij​

Explanation: For unweighted undirected graphs, the degree of vertex i is the sum across row i of the adjacency matrix. In weighted graphs, this becomes the weighted degree (sum of weights).

Symmetry of Undirected Graphs

A=A⊤

Explanation: The adjacency matrix of an undirected graph is symmetric because if i is connected to j, then j is connected to i. This property is frequently used to halve storage or checks.

Degree Matrix Definition

Dii​=j=1∑n​Aij​,Dij​=0 for i=j

Explanation: The degree matrix places each vertex’s degree on the diagonal and zeros elsewhere. It summarizes local connectivity per vertex.

Graph Laplacian

L=D−A

Explanation: The combinatorial Laplacian subtracts adjacency from degree, capturing flow-like differences. Its spectrum is central in clustering, partitioning, and measuring connectivity.

Walk Counting via Powers

(A^{k})_{ij} = \text{# of walks of length } k \text{ from } i \text{ to } j

Explanation: Raising the adjacency matrix to power k and reading entry (i, j) counts the number of k-step walks between i and j in unweighted graphs. This connects combinatorics with linear algebra.

Storage Complexity

Space(matrix)=O(n2),Space(list)=O(n+m)

Explanation: Adjacency matrices store an entry for every potential pair of vertices, while lists store only actual edges. This guides representation choice based on sparsity.

BFS Complexity by Representation

TBFS-list​(n,m)=O(n+m),TBFS-matrix​(n)=O(n2)

Explanation: BFS using adjacency lists visits each vertex and edge once, giving O(n + m). Using an adjacency matrix forces scanning all n entries per row, leading to O(n2).

Complexity Analysis

Let n = ∣V∣ and m = ∣E∣. An adjacency matrix requires O(n2) space, regardless of m, since it stores a value for every ordered pair of vertices. Initializing a zero matrix typically costs O(n2) time. Insertion of a single edge is O(1) by setting A[u][v] (and A[v][u] for undirected). Edge existence queries A[u][v] are O(1). However, iterating over neighbors of a vertex i requires scanning an entire row, taking O(n) time, even if i has only a few neighbors. A full traversal like BFS or DFS with a matrix thus costs O(n2) time. Adjacency lists require O(n + m) space, storing one list per vertex plus one entry per edge (two entries if undirected). Building from m edges takes O(n + m) time (assuming amortized O(1) pushb​ack). Checking edge existence (u, v) is O(deg(u)) by scanning u’s neighbor list (unless additional hash sets are used). Iterating neighbors of u is O(deg(u)), so BFS/DFS over the whole graph is O(n + m). For sparse graphs (m ≪ n2), this is typically far faster and more memory-efficient. Degree computation from an adjacency matrix is O(n2) by summing rows; from adjacency lists, it is O(n + m) by taking list sizes. Forming the degree matrix D is O(n). Computing the Laplacian L=D − A is O(n2) if stored as a dense matrix; for sparse storage (CSR/adjacency lists), it is O(n + m). Matrix-power methods (e.g., Ak) for walk counts use O(n3 log k) with fast exponentiation if dense; sparse algorithms can be faster but depend on fill-in. Therefore, choose data structures based on density, query patterns, and algorithmic needs: lists for sparse traversal, matrices for dense graphs and linear-algebraic analysis.

Code Examples

Build adjacency matrix, degree matrix, and Laplacian for an undirected graph
1#include <bits/stdc++.h>
2using namespace std;
3
4// This program reads an undirected, unweighted graph and builds:
5// - Adjacency matrix A (n x n)
6// - Degree matrix D (n x n, diagonal)
7// - Laplacian L = D - A
8// Input format:
9// n m
10// m lines of edges: u v (0-based indices)
11// If your input is 1-based, subtract 1 from u and v when reading.
12
13int main() {
14 ios::sync_with_stdio(false);
15 cin.tie(nullptr);
16
17 int n, m;
18 if (!(cin >> n >> m)) return 0;
19
20 // Initialize n x n matrix with zeros (O(n^2) space and time to zero it)
21 vector<vector<int>> A(n, vector<int>(n, 0));
22
23 for (int e = 0; e < m; ++e) {
24 int u, v; cin >> u >> v;
25 if (u < 0 || u >= n || v < 0 || v >= n) {
26 cerr << "Edge out of range: (" << u << "," << v << ")\n";
27 return 0;
28 }
29 if (u == v) {
30 // Simple graphs often forbid self-loops; skip or handle as needed
31 // Here we allow it: it adds 1 on the diagonal
32 A[u][v] = 1; // For multigraphs, consider incrementing instead of setting
33 } else {
34 A[u][v] = 1;
35 A[v][u] = 1; // ensure symmetry for undirected graphs
36 }
37 }
38
39 // Compute degrees as row sums
40 vector<long long> deg(n, 0);
41 for (int i = 0; i < n; ++i) {
42 long long s = 0;
43 for (int j = 0; j < n; ++j) s += A[i][j];
44 deg[i] = s;
45 }
46
47 // Build Degree matrix D and Laplacian L = D - A
48 vector<vector<long long>> D(n, vector<long long>(n, 0));
49 vector<vector<long long>> L(n, vector<long long>(n, 0));
50 for (int i = 0; i < n; ++i) {
51 D[i][i] = deg[i];
52 }
53 for (int i = 0; i < n; ++i) {
54 for (int j = 0; j < n; ++j) {
55 L[i][j] = (i == j ? D[i][i] : 0) - A[i][j];
56 }
57 }
58
59 // Optional: verify Handshaking Lemma (for undirected without self-loops)
60 long long sum_deg = 0;
61 for (int i = 0; i < n; ++i) sum_deg += deg[i];
62 cerr << "Sum of degrees = " << sum_deg << " (should be 2*m if no self-loops)\n";
63
64 // Print matrices
65 cout << "Adjacency matrix A:\n";
66 for (int i = 0; i < n; ++i) {
67 for (int j = 0; j < n; ++j) cout << A[i][j] << (j+1==n?'\n':' ');
68 }
69
70 cout << "Degree matrix D:\n";
71 for (int i = 0; i < n; ++i) {
72 for (int j = 0; j < n; ++j) cout << D[i][j] << (j+1==n?'\n':' ');
73 }
74
75 cout << "Laplacian L = D - A:\n";
76 for (int i = 0; i < n; ++i) {
77 for (int j = 0; j < n; ++j) cout << L[i][j] << (j+1==n?'\n':' ');
78 }
79
80 return 0;
81}
82

We read an undirected graph, build its adjacency matrix A, compute vertex degrees as row sums, assemble the diagonal degree matrix D, and derive the Laplacian L = D − A. We also print a quick diagnostic (sum of degrees) to check against 2m when the graph has no self-loops. This demonstrates how matrix representations summarize connectivity and how D and L are constructed mechanically from A.

Time: Building A: O(n^2) to zero + O(m) to add edges; computing degrees: O(n^2); forming L: O(n^2). Overall O(n^2 + m).Space: O(n^2) for A, D, and L.
BFS using adjacency lists (unweighted shortest paths)
1#include <bits/stdc++.h>
2using namespace std;
3
4// Input format:
5// n m\n
6// m lines: u v (0-based), undirected
7// s (source vertex)
8// Output: distance from s to every vertex (−1 if unreachable)
9
10int main() {
11 ios::sync_with_stdio(false);
12 cin.tie(nullptr);
13
14 int n, m; if (!(cin >> n >> m)) return 0;
15 vector<vector<int>> adj(n);
16 for (int e = 0; e < m; ++e) {
17 int u, v; cin >> u >> v;
18 if (u < 0 || u >= n || v < 0 || v >= n) return 0;
19 adj[u].push_back(v);
20 adj[v].push_back(u); // undirected
21 }
22
23 int s; cin >> s;
24 if (s < 0 || s >= n) return 0;
25
26 const int INF = -1;
27 vector<int> dist(n, INF);
28 queue<int> q;
29
30 // Initialize
31 dist[s] = 0;
32 q.push(s);
33
34 while (!q.empty()) {
35 int u = q.front(); q.pop();
36 for (int v : adj[u]) {
37 if (dist[v] == INF) { // not visited yet
38 dist[v] = dist[u] + 1; // one more edge
39 q.push(v);
40 }
41 }
42 }
43
44 // Output distances
45 for (int i = 0; i < n; ++i) {
46 cout << dist[i] << (i+1==n?'\n':' ');
47 }
48 return 0;
49}
50

This is a standard BFS using adjacency lists. It computes, for an unweighted undirected graph, the shortest path distance (in number of edges) from a source vertex s to every vertex. Using lists ensures we only traverse existing edges, yielding O(n + m) time on sparse graphs.

Time: O(n + m) because each vertex is visited once and each undirected edge is examined at most twice.Space: O(n + m) for the adjacency lists and O(n) for the queue and distance array.
BFS using an adjacency matrix (for comparison)
1#include <bits/stdc++.h>
2using namespace std;
3
4// Input format:
5// n m\n
6// m lines: u v (0-based), undirected
7// s (source)
8// Output: distance from s to every vertex (−1 if unreachable)
9
10int main() {
11 ios::sync_with_stdio(false);
12 cin.tie(nullptr);
13
14 int n, m; if (!(cin >> n >> m)) return 0;
15 vector<vector<int>> A(n, vector<int>(n, 0));
16 for (int e = 0; e < m; ++e) {
17 int u, v; cin >> u >> v;
18 if (u < 0 || u >= n || v < 0 || v >= n) return 0;
19 A[u][v] = 1;
20 A[v][u] = 1; // undirected
21 }
22 int s; cin >> s; if (s < 0 || s >= n) return 0;
23
24 const int INF = -1;
25 vector<int> dist(n, INF);
26 queue<int> q;
27
28 dist[s] = 0; q.push(s);
29
30 while (!q.empty()) {
31 int u = q.front(); q.pop();
32 // Scan entire row u to find neighbors (costly when sparse)
33 for (int v = 0; v < n; ++v) {
34 if (A[u][v] && dist[v] == INF) {
35 dist[v] = dist[u] + 1;
36 q.push(v);
37 }
38 }
39 }
40
41 for (int i = 0; i < n; ++i) {
42 cout << dist[i] << (i+1==n?'\n':' ');
43 }
44 return 0;
45}
46

This BFS uses the adjacency matrix. For each dequeued vertex u, it scans the whole row A[u][·] to find neighbors, which costs O(n) per vertex. This highlights the practical difference in performance between matrix and list representations on sparse graphs.

Time: O(n^2) due to scanning n entries for each of O(n) dequeues in the worst case.Space: O(n^2) for the adjacency matrix plus O(n) for BFS state.
#graph#vertex#edge#adjacency matrix#degree matrix#laplacian#bfs#adjacency list#sparse graph#dense graph#undirected#directed#degree#walks#handshaking lemma