⚙️ Algorithm · Advanced

DP with Probability

Key Points

  • DP with probability models how chance flows between states over time by repeatedly redistributing mass according to transition probabilities.
  • You can simulate T steps with the iterative DP update p_next[j] += p[i] * P[i][j], which is O(T · edges) and works well when T is small or the graph is sparse.
  • For huge T, use matrix exponentiation to compute p^{(T)} = p^{(0)} · P^T in O(n^3 log T), where P is the transition matrix and p is a row vector.
  • Absorbing states model “endings” that once reached keep you there; absorbing Markov chain theory uses the fundamental matrix N = (I − Q)^{-1} to get absorption probabilities and expected steps.
  • Steady-state (stationary) distributions π satisfy πP = π and describe long-run behavior of ergodic chains; power iteration often finds them efficiently.
  • Be careful with floating-point error, normalization (rows sum to 1), and whether your matrix is row-stochastic or column-stochastic.
  • DP with expectations leads to linear equations: E[s] = 1 + Σ_t P[s,t]·E[t] for non-absorbing s, which you can solve with Gaussian elimination.
  • Choose the tool by scale: iterative DP (small T), matrix power (huge T), or linear-system methods (expected values and absorption).

Prerequisites

  • Basic probability (conditional probability, total probability): transitions and updates p_next[j] = Σ p[i]·P[i][j] directly use these rules.
  • Linear algebra (vectors, matrices, inverses): matrix exponentiation, stationary distributions, and the fundamental matrix require matrix operations.
  • Dynamic programming fundamentals: the core update is a DP transition that accumulates values over states across steps.
  • Graph representations (adjacency lists/matrices): states and transitions form a directed graph used in implementations.
  • Gaussian elimination / solving linear systems: expected values and absorbing-chain analysis reduce to linear equations.
  • Floating-point numerics and stability: probability DPs are sensitive to rounding; careful normalization and pivoting are needed.
  • Asymptotic analysis (Big-O): choosing between iterative DP, matrix power, or linear solves depends on time/space complexity.
  • Markov chain basics (ergodicity, stationary distribution): understanding long-run behavior and convergence requires these concepts.

Detailed Explanation

01 Overview

Imagine you pour one unit of water across several cups (states). Each second, every cup leaks its water into other cups according to fixed percentages. Dynamic programming (DP) with probability tracks how that water (probability mass) moves over time. Instead of counting ways, we sum probabilities. On each step, the probability of being in a state is the total incoming probability from all predecessors, weighted by transition probabilities. This is exactly the Markov chain update rule and can be implemented as a simple loop. The practical appeal is broad: many problems—random walks on graphs, dice/coin processes, card draws, or simulations with absorbing outcomes—reduce to repeatedly applying the law of total probability.

Conceptually, we represent the current distribution as a vector p, and transitions as a row-stochastic matrix P (each row sums to 1). A single step is p ← pP. Repeating T steps means either iterating T times or computing P^T by fast exponentiation for large T. When some states are absorbing (once entered, you stay), the process eventually stops there; absorbing Markov chain formulas give both the probability of ending in each absorber and the expected number of steps until absorption. Together, these techniques let you solve a wide range of competitive programming tasks involving randomness with strong guarantees on correctness and performance.
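
To make the single step p ← pP concrete, here is a minimal sketch; the helper name step and the two-state chain are illustrative assumptions, not taken from any library:

#include <bits/stdc++.h>
using namespace std;

// One Markov step: p_next[j] = sum over i of p[i] * P[i][j].
// Assumes P is row-stochastic (each row sums to 1).
vector<double> step(const vector<double>& p, const vector<vector<double>>& P) {
    int n = (int)p.size();
    vector<double> nxt(n, 0.0);
    for (int i = 0; i < n; ++i)
        if (p[i] != 0.0)
            for (int j = 0; j < n; ++j)
                nxt[j] += p[i] * P[i][j];
    return nxt;
}

int main() {
    // State 0 stays with 0.6 / moves to 1 with 0.4; state 1 is absorbing.
    vector<vector<double>> P = {{0.6, 0.4}, {0.0, 1.0}};
    vector<double> p = {1.0, 0.0};
    p = step(p, P); // after one step: [0.6, 0.4]
    cout << p[0] << " " << p[1] << "\n";
    return 0;
}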

02 Intuition & Analogies

Picture a board game where your token moves between tiles based on a weighted spinner. Each spin you redistribute your belief about the token’s location: if there’s a 70% chance to move to tile A and 30% to tile B from the current tile, your belief shifts accordingly. Repeat this, and the belief “flows” along the game board’s arrows.

That flow is probability mass. DP with probability is just tracking the movement of mass. A state keeps the portion it’s supposed to keep and sends the rest along its outgoing edges, proportionally to their probabilities. If a tile is an absorbing tile (a sink), once your token lands there, all the mass that arrives stays forever—like a drain.

For example, suppose you start at tile 0 with probability 1. From 0, you go to 1 with probability 0.3 and to 2 with 0.7. Tiles 1 and 2 are absorbing (self-loops with probability 1). After one spin, p = [0, 0.3, 0.7]. After two spins, it’s still [0, 0.3, 0.7] because mass that reached 1 or 2 stays there. If instead some tiles bounce you around (non-absorbing), your mass can circulate until it drains into absorbers. For huge numbers of spins, it’s tedious to iterate; power tools like matrix exponentiation jump directly to the T-th step. For endless play without absorbers, the mass may settle into a steady pattern (stationary distribution), where another spin doesn’t change the beliefs at all.

03 Formal Definition

We model a system with n states S = {1, …, n} and a row-stochastic transition matrix P ∈ [0,1]^{n×n}, where P[i][j] ≥ 0 and Σ_j P[i][j] = 1 for all i. Let p^{(t)} be a row vector whose j-th entry is the probability of being in state j after t steps. The Markov update is p^{(t+1)} = p^{(t)} P, or component-wise p^{(t+1)}(j) = Σ_i p^{(t)}(i) · P[i][j]. If A ⊂ S is the set of absorbing states, then for any a ∈ A, P[a][a] = 1 and P[a][j] = 0 for j ≠ a. After reordering states so that transient states T come before absorbing states A, the matrix has block form P = [ Q R ; 0 I ], where Q is |T| × |T| among transients, R is |T| × |A| from transients to absorbers, and I is an identity matrix on absorbers. The fundamental matrix is N = (I − Q)^{-1}. The absorption probability matrix is B = N R, where B[i][a] is the probability that starting from transient i, absorption occurs at absorbing a. The expected number of steps to absorption from transient states is t = N · 1, where 1 is an all-ones column vector. A stationary distribution π is a row vector with π ≥ 0, Σ_i π_i = 1, and πP = π. For irreducible, aperiodic finite chains, p^{(t)} → π as t → ∞.
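
For concreteness, here is the block form worked out on a small instance (the same 4-state chain used in the absorbing-chain code example later in this section, with transients {0, 1} and absorbers {2, 3}); the arithmetic can be checked by hand:

Q = [ 0.2 0.5 ; 0.4 0 ],   R = [ 0.3 0 ; 0 0.6 ]

I − Q = [ 0.8 −0.5 ; −0.4 1 ],   N = (I − Q)^{-1} = (1/0.6) [ 1 0.5 ; 0.4 0.8 ]

B = N R = [ 0.5 0.5 ; 0.2 0.8 ],   N · 1 = [ 2.5 ; 2 ]

So from state 0 the process is absorbed at 2 or 3 with probability 0.5 each, after 2.5 steps in expectation; from state 1 the absorption probabilities are 0.2 and 0.8, with 2 expected steps.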

04 When to Use

  • Bounded steps, sparse transitions: Use iterative DP when T is modest (e.g., up to 1e6 if edges are few) and you only need the distribution after T steps or a probability of some event at/within T steps.
  • Huge time horizons: Use matrix exponentiation to compute p^{(T)} = p^{(0)} P^{T} when T is large (e.g., 1e12) and the number of states n is small to medium (typically n ≤ 100–300 in contests) and P is dense or manageable.
  • Absorbing outcomes: Use absorbing Markov chain formulas when you need (1) probability of ending in each absorbing state, or (2) expected time to absorption, regardless of T (i.e., the limit behavior). This reduces to linear algebra with (I − Q)^{-1}.
  • Expected value queries: When asked for expected steps or expected reward until an event, set up linear equations E[s] = reward(s) + Σ_t P[s,t]·E[t] and solve. This is robust even with cycles.
  • Steady-state/long-run behavior: For ergodic chains where T is “very large,” compute the stationary distribution via power iteration (see the sketch after this list) or solve πP = π with the constraint Σ_i π_i = 1. This is common in ranking, load balancing, and mixing time analyses.
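
As referenced in the last item, here is a minimal power-iteration sketch; the two-state chain, tolerance, and iteration cap are illustrative assumptions:

#include <bits/stdc++.h>
using namespace std;

// Power iteration: repeatedly apply p <- p * P until the L1 change is tiny.
// Assumes P is row-stochastic and the chain is ergodic (irreducible,
// aperiodic), so the iteration converges to the unique stationary distribution.
int main() {
    vector<vector<double>> P = {{0.9, 0.1},
                                {0.5, 0.5}}; // illustrative ergodic chain
    int n = (int)P.size();
    vector<double> p(n, 1.0 / n); // start from the uniform distribution

    for (int iter = 0; iter < 100000; ++iter) {
        vector<double> nxt(n, 0.0);
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j)
                nxt[j] += p[i] * P[i][j];
        double diff = 0.0; // L1 distance between successive iterates
        for (int j = 0; j < n; ++j) diff += fabs(nxt[j] - p[j]);
        p = move(nxt);
        if (diff < 1e-12) break; // norm-based stopping rule, not exact equality
    }
    // The exact stationary distribution here is pi = (5/6, 1/6).
    cout << fixed << setprecision(9) << p[0] << " " << p[1] << "\n";
    return 0;
}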

⚠️Common Mistakes

  • Mixing orientations: Row-stochastic vs. column-stochastic. If you treat p as a row vector (common in programming), use p' = pP, and P’s rows must sum to 1. If you use a column vector, the correct update is p' = P^T p (equivalently, store a column-stochastic matrix M and use p' = Mp). Be consistent.
  • Probabilities not summing to 1: Floating-point accumulation can slightly break normalization. Periodically renormalize rows of P and the vector p to keep Σ_i p_i ≈ 1.
  • Using DAG-style DP on cycles: Probability flows can cycle forever unless there are absorbers or mixing. For expected values in cyclic graphs, set up and solve linear equations instead of trying to topologically order states.
  • Precision loss: Subtracting nearly equal numbers in Gaussian elimination or doing many iterative steps can introduce error. Use partial pivoting, double (or long double), and check conditioning; avoid unnecessary iterations.
  • Inefficient for large T: Iterating T steps when T is huge will TLE. Use matrix exponentiation (O(n^3 log T)) or power iteration if you only need the steady state.
  • Forgetting absorbing self-loops: In code, make absorbers explicit with P[a][a] = 1 and no other outgoing mass; otherwise probability may “leak.”
  • Incorrect termination checks: For convergence, compare distributions with a tolerance (e.g., L1 norm < 1e-12), not exact equality. Also ensure loops won’t spin forever without a stopping condition; see the helper sketch after this list.
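
A small sketch of the normalization and tolerance checks mentioned above; the helper names renormalize and l1_distance are illustrative, not from any library:

#include <bits/stdc++.h>
using namespace std;

// Renormalize a distribution so it sums to 1 (guards against drift).
void renormalize(vector<double>& p) {
    double s = accumulate(p.begin(), p.end(), 0.0);
    if (s > 0.0) for (double& x : p) x /= s;
}

// L1 distance between successive distributions, for convergence checks.
double l1_distance(const vector<double>& a, const vector<double>& b) {
    double d = 0.0;
    for (size_t i = 0; i < a.size(); ++i) d += fabs(a[i] - b[i]);
    return d;
}

int main() {
    vector<double> prev = {0.5000000001, 0.4999999999};
    vector<double> cur  = {0.5, 0.5};
    renormalize(cur);
    // Tolerance-based check instead of exact equality.
    cout << boolalpha << (l1_distance(prev, cur) < 1e-9) << "\n"; // true
    return 0;
}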

Key Formulas

Markov Update

p^{(t+1)} = p^{(t)} P

Explanation: The distribution after one step equals the previous distribution times the transition matrix. Use this for iterative DP updates.

Component-wise Update

p^{(t+1)}(j) = Σ_i p^{(t)}(i) · P[i][j]

Explanation: The probability of being in state j next step is the sum of current probabilities of all states i times their chance to go to j. This is the law of total probability in matrix form.

T-step Distribution

p^{(T)} = p^{(0)} P^T

Explanation: The distribution after T steps is the initial distribution multiplied by the T-th power of the transition matrix. Compute with fast matrix exponentiation when T is large.

Normalization

Σ_j p^{(t)}(j) = 1 for all t

Explanation: Total probability must always be 1 at every step. It is a good invariant to check in implementations.

Absorbing Chain Block Form

P = [ Q R ; 0 I ]

Explanation: After ordering states so transients come first, the transition matrix splits into transient-to-transient (Q), transient-to-absorbing (R), and identity on absorbers.

Fundamental Matrix

N = (I − Q)^{-1}

Explanation: N captures expected numbers of visits to transient states. It is the inverse of I minus the transient submatrix.

Absorption Probabilities

B = N R

Explanation: Starting from transient states, the probability to end in each absorber is given by multiplying the fundamental matrix by R.

Expected Steps to Absorption

t = N · 1

Explanation: Multiply N by an all-ones vector to get expected time-to-absorption for each transient state.

Stationary Distribution

πP = π, with π ≥ 0 and Σ_i π_i = 1

Explanation: A vector π that is unchanged by P and sums to 1 represents the long-run fraction of time spent in each state for ergodic chains.

Expectation Recurrence

E[s] = reward(s) + Σ_t P[s,t] · E[t]   (with reward(s) = 1 for expected step counts)

Explanation: Set up a linear equation for the expected number of steps or cost from each state. Solve the resulting linear system.
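
As a sketch of this recurrence in action, one can solve (I − Q)E = 1 directly; this assumes the same 4-state absorbing chain used in the code examples below, where states 2 and 3 are absorbing so E[2] = E[3] = 0:

#include <bits/stdc++.h>
using namespace std;

// Solve E[s] = 1 + sum_t P[s][t]*E[t] for transient states {0, 1} by
// Gaussian elimination on (I - Q) E = 1.
int main() {
    vector<vector<double>> A = {{1.0 - 0.2, -0.5},
                                {-0.4,       1.0}}; // A = I - Q
    vector<double> b = {1.0, 1.0};                  // one step paid per move
    int n = 2;
    // Forward elimination with partial pivoting.
    for (int col = 0; col < n; ++col) {
        int piv = col;
        for (int r = col + 1; r < n; ++r)
            if (fabs(A[r][col]) > fabs(A[piv][col])) piv = r;
        swap(A[piv], A[col]); swap(b[piv], b[col]);
        for (int r = col + 1; r < n; ++r) {
            double f = A[r][col] / A[col][col];
            for (int j = col; j < n; ++j) A[r][j] -= f * A[col][j];
            b[r] -= f * b[col];
        }
    }
    // Back substitution.
    vector<double> E(n);
    for (int i = n - 1; i >= 0; --i) {
        double s = b[i];
        for (int j = i + 1; j < n; ++j) s -= A[i][j] * E[j];
        E[i] = s / A[i][i];
    }
    cout << "E[0] = " << E[0] << ", E[1] = " << E[1] << "\n"; // 2.5 and 2
    return 0;
}

The answers match t = N · 1 from the worked block-form instance above, as they must.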

Convergence Check

‖p^{(t+1)} − p^{(t)}‖_1 < ε

Explanation: Use a norm-based tolerance to decide when iterative methods (like power iteration) have converged sufficiently.

Complexity Analysis

Let n be the number of states and m the number of directed transitions (nonzero probabilities). The simplest DP update runs for T steps and, for each step, pushes probability along outgoing edges: p_next[j] += p[i]·P[i][j]. If the graph is stored sparsely, each step costs O(m), for a total of O(T·m) time and O(n + m) space. This is ideal when T is modest (e.g., up to 1e6 with small m) or when you can stop early (e.g., after convergence to absorbers). When T is huge, iterating T times is infeasible. Matrix exponentiation computes P^T in O(n^3 log T) time with O(n^2) space for dense matrices, after which multiplying the initial row vector costs O(n^2). This is practical when n is small to medium and T is very large (e.g., 1e12). If P is sparse and n is large, specialized sparse matrix exponentiation or repeated squaring with sparsity-aware multiplication can reduce constants, but asymptotics remain cubic per dense multiply. For absorbing chains, computing the fundamental matrix N = (I − Q)^{-1} requires solving a linear system or inverting a t × t matrix, where t is the number of transient states. Gaussian elimination with partial pivoting is O(t^3) time and O(t^2) space. Once N is available, absorption probabilities B = N R cost O(t^2·a) for a absorbers, and expected steps N·1 cost O(t^2). To find a stationary distribution via power iteration, each iteration costs O(n^2) for dense P (or O(m) sparse), and convergence typically occurs in O(1/γ) iterations, where γ is the spectral gap (1 − |λ₂|, with λ₂ the second-largest eigenvalue in modulus). Memory usage is O(n^2) for dense P or O(m) for sparse representations, plus O(n) for vectors. Floating-point stability may necessitate long double and normalization steps, but it does not change asymptotic bounds.

Code Examples

Iterative probability DP on a small graph with absorbing states
#include <bits/stdc++.h>
using namespace std;

// Demonstrates iterative DP: p_next[j] += p[i] * P[i][j]
// Example graph: state 0 -> state 1 with 0.3, -> state 2 with 0.7; states 1 and 2 are absorbing.

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n = 3; // states 0,1,2
    vector<vector<pair<int,double>>> out(n);

    // Transitions from state 0
    out[0].push_back({1, 0.3});
    out[0].push_back({2, 0.7});

    // Make states 1 and 2 absorbing explicitly
    out[1].push_back({1, 1.0});
    out[2].push_back({2, 1.0});

    auto check_row_sums = [&]() {
        for (int i = 0; i < n; ++i) {
            double s = 0.0;
            for (auto [j, p] : out[i]) s += p;
            if (fabs(s - 1.0) > 1e-12) {
                cerr << "Warning: row " << i << " sums to " << setprecision(17) << s << "\n";
            }
        }
    };
    check_row_sums();

    // Start at state 0 with probability 1
    vector<double> p(n, 0.0), nextp(n, 0.0);
    p[0] = 1.0;

    int T = 5; // simulate 5 steps
    for (int t = 0; t < T; ++t) {
        fill(nextp.begin(), nextp.end(), 0.0);
        for (int i = 0; i < n; ++i) {
            if (p[i] == 0.0) continue;
            for (auto [j, prob] : out[i]) {
                nextp[j] += p[i] * prob;
            }
        }
        // Optional renormalization to control drift
        double sum = 0.0;
        for (double x : nextp) sum += x;
        for (double &x : nextp) x /= sum;
        p.swap(nextp);
    }

    cout.setf(std::ios::fixed); cout << setprecision(9);
    cout << "Distribution after T steps:\n";
    for (int i = 0; i < n; ++i) cout << "state " << i << ": " << p[i] << "\n";

    // Probability of eventual absorption at 1 equals p[1] after large enough T here
    // (since mass that reaches 1 or 2 stays there in this example).
    return 0;
}

We model transitions as adjacency lists with probabilities. We start with p concentrated at state 0 and repeatedly ‘push’ mass along edges. Absorbing states have a self-loop of probability 1, so any mass that enters them remains. After enough steps, all probability resides in the absorbers. The code includes a sanity check for row sums and a light renormalization to control floating-point drift.

Time: O(T · m), where m is the number of nonzero transitions (edges). Space: O(n + m) for the graph and O(n) for the probability vectors.
Markov chain distribution after huge T using matrix exponentiation
#include <bits/stdc++.h>
using namespace std;

struct Mat {
    int n; vector<vector<long double>> a;
    Mat(int n = 0, bool ident = false) : n(n), a(n, vector<long double>(n, 0)) {
        if (ident) for (int i = 0; i < n; ++i) a[i][i] = 1.0L;
    }
};

Mat mul(const Mat &A, const Mat &B) {
    int n = A.n; Mat C(n);
    for (int i = 0; i < n; ++i) {
        for (int k = 0; k < n; ++k) if (A.a[i][k] != 0.0L) {
            long double aik = A.a[i][k];
            for (int j = 0; j < n; ++j) if (B.a[k][j] != 0.0L) {
                C.a[i][j] += aik * B.a[k][j];
            }
        }
    }
    return C;
}

Mat mpow(Mat base, long long e) {
    int n = base.n; Mat res(n, true);
    while (e > 0) {
        if (e & 1LL) res = mul(res, base);
        base = mul(base, base);
        e >>= 1LL;
    }
    return res;
}

// Multiply row vector p (size n) by matrix P (n x n): returns p * P
vector<long double> mul_vec_mat(const vector<long double> &p, const Mat &P) {
    int n = P.n; vector<long double> r(n, 0.0L);
    for (int j = 0; j < n; ++j) {
        long double s = 0.0L;
        for (int i = 0; i < n; ++i) if (p[i] != 0.0L && P.a[i][j] != 0.0L) s += p[i] * P.a[i][j];
        r[j] = s;
    }
    // Renormalize lightly to combat drift
    long double sum = 0.0L;
    for (auto x : r) sum += x;
    for (auto &x : r) x /= sum;
    return r;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    // Same 3-state example as before
    int n = 3; Mat P(n);
    // Row-stochastic P: from state i to j = P.a[i][j]
    P.a[0][1] = 0.3L; P.a[0][2] = 0.7L;
    P.a[1][1] = 1.0L; // absorbing
    P.a[2][2] = 1.0L; // absorbing

    vector<long double> p0(n, 0.0L); p0[0] = 1.0L; // start in state 0

    long long T = 50; // a large number of steps
    Mat PT = mpow(P, T);
    vector<long double> pT = mul_vec_mat(p0, PT);

    cout.setf(std::ios::fixed); cout << setprecision(12);
    cout << "Distribution after T steps via P^T:\n";
    for (int i = 0; i < n; ++i) cout << "state " << i << ": " << (double)pT[i] << "\n";

    return 0;
}

We construct the transition matrix P (row-stochastic) and compute P^T using binary exponentiation. Multiplying the initial row vector p0 by P^T yields the exact T-step distribution without iterating T times. Long doubles help reduce accumulated error. This approach is effective when T is huge and n is modest.

Time: O(n^3 log T) for exponentiation, plus O(n^2) for the final vector-matrix multiply. Space: O(n^2) to store matrices and O(n) for vectors.
Absorbing Markov chain: absorption probabilities and expected steps via (I−Q)^{-1}
#include <bits/stdc++.h>
using namespace std;

// Build an absorbing chain with 4 states: 0,1 transient; 2,3 absorbing.
// P (row-stochastic):
// 0 -> 0 (0.2), 1 (0.5), 2 (0.3)
// 1 -> 0 (0.4), 3 (0.6)
// 2 -> 2 (1.0), 3 -> 3 (1.0)

vector<vector<long double>> invert(const vector<vector<long double>>& A) {
    int n = (int)A.size();
    vector<vector<long double>> aug(n, vector<long double>(2*n, 0.0L));
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) aug[i][j] = A[i][j];
        aug[i][n+i] = 1.0L; // identity on the right
    }
    // Gaussian elimination with partial pivoting
    for (int col = 0, row = 0; col < n && row < n; ++col, ++row) {
        int piv = row;
        for (int r = row+1; r < n; ++r) if (fabsl(aug[r][col]) > fabsl(aug[piv][col])) piv = r;
        if (fabsl(aug[piv][col]) < 1e-18L) throw runtime_error("Singular matrix (near-singular)");
        swap(aug[piv], aug[row]);
        long double diag = aug[row][col];
        for (int j = 0; j < 2*n; ++j) aug[row][j] /= diag; // make pivot 1
        for (int r = 0; r < n; ++r) if (r != row) {
            long double f = aug[r][col];
            if (f == 0.0L) continue;
            for (int j = 0; j < 2*n; ++j) aug[r][j] -= f * aug[row][j];
        }
    }
    vector<vector<long double>> inv(n, vector<long double>(n, 0.0L));
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            inv[i][j] = aug[i][n+j];
    return inv;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    const int n = 4;
    vector<vector<long double>> P(n, vector<long double>(n, 0.0L));
    P[0][0]=0.2L; P[0][1]=0.5L; P[0][2]=0.3L;
    P[1][0]=0.4L; P[1][3]=0.6L;
    P[2][2]=1.0L; // absorbing
    P[3][3]=1.0L; // absorbing

    // Identify transient (T) and absorbing (A) sets; here T={0,1}, A={2,3}
    vector<int> T = {0,1};
    vector<int> A = {2,3};
    int t = (int)T.size();
    int a = (int)A.size();

    // Build Q (t x t) and R (t x a)
    vector<vector<long double>> Q(t, vector<long double>(t, 0.0L));
    vector<vector<long double>> R(t, vector<long double>(a, 0.0L));

    for (int i = 0; i < t; ++i) {
        for (int j = 0; j < t; ++j) Q[i][j] = P[T[i]][T[j]];
        for (int j = 0; j < a; ++j) R[i][j] = P[T[i]][A[j]];
    }

    // Compute N = (I - Q)^{-1}
    vector<vector<long double>> I(t, vector<long double>(t, 0.0L));
    for (int i = 0; i < t; ++i) I[i][i] = 1.0L;
    vector<vector<long double>> IminusQ = I;
    for (int i = 0; i < t; ++i)
        for (int j = 0; j < t; ++j)
            IminusQ[i][j] -= Q[i][j];

    vector<vector<long double>> N = invert(IminusQ);

    // Compute B = N * R (t x a)
    vector<vector<long double>> B(t, vector<long double>(a, 0.0L));
    for (int i = 0; i < t; ++i)
        for (int k = 0; k < t; ++k) if (N[i][k] != 0.0L)
            for (int j = 0; j < a; ++j)
                B[i][j] += N[i][k] * R[k][j];

    // Expected steps tvec = N * 1
    vector<long double> tvec(t, 0.0L);
    for (int i = 0; i < t; ++i)
        for (int k = 0; k < t; ++k)
            tvec[i] += N[i][k] * 1.0L;

    cout.setf(std::ios::fixed); cout << setprecision(9);
    cout << "Absorption probabilities (rows: start in 0 or 1; cols: absorb at 2 or 3):\n";
    for (int i = 0; i < t; ++i) {
        for (int j = 0; j < a; ++j) cout << (double)B[i][j] << (j+1==a?'\n':' ');
    }
    cout << "Expected steps to absorption from states 0 and 1:\n";
    for (int i = 0; i < t; ++i) cout << (double)tvec[i] << (i+1==t?'\n':' ');

    return 0;
}

We reorder states conceptually into transient (0,1) and absorbing (2,3), form Q and R, then compute the fundamental matrix N = (I−Q)^{-1} via Gaussian elimination with partial pivoting. Absorption probabilities B = N R give the chance to end in each absorbing state from a transient start. Expected time to absorption is t = N·1. This avoids simulating many steps and is exact up to floating-point error.

Time: O(t^3) to invert (I−Q), plus O(t^2·a) to compute B and O(t^2) for t = N·1. Space: O(t^2) for N and Q, plus O(t·a) for R and B.