DP with Probability
Key Points
- DP with probability models how chance flows between states over time by repeatedly redistributing probability mass according to transition probabilities.
- You can simulate T steps with the iterative DP update p_next[j] += p[i] * P[i][j], which is O(T · edges) and works well when T is small or the graph is sparse.
- For huge T, use matrix exponentiation to compute p^{(T)} = p^{(0)} · P^T in O(n^3 log T), where P is the transition matrix and p is a row vector.
- Absorbing states model “endings” that, once reached, keep you there; absorbing Markov chain theory uses the fundamental matrix N = (I − Q)^{-1} to get absorption probabilities and expected steps.
- Steady-state (stationary) distributions π satisfy πP = π and describe the long-run behavior of ergodic chains; power iteration often finds them efficiently.
- Be careful with floating-point error, normalization (rows sum to 1), and whether your matrix is row-stochastic or column-stochastic.
- DP with expectations leads to linear equations: E[s] = 1 + Σ_t P[s,t]·E[t] for non-absorbing s, which you can solve with Gaussian elimination.
- Choose the tool by scale: iterative DP (small T), matrix power (huge T), or linear-system methods (absorbing-chain analysis and expectation equations).
Prerequisites
- Basic probability (conditional probability, total probability) — Transitions and updates p_next[j] = Σ p[i]·P[i][j] directly use these rules.
- Linear algebra (vectors, matrices, inverses) — Matrix exponentiation, stationary distributions, and the fundamental matrix require matrix operations.
- Dynamic programming fundamentals — The core update is a DP transition that accumulates values over states across steps.
- Graph representations (adjacency lists/matrices) — States and transitions form a directed graph used in implementations.
- Gaussian elimination / solving linear systems — Expected values and absorbing-chain analysis reduce to linear equations.
- Floating-point numerics and stability — Probability DPs are sensitive to rounding; careful normalization and pivoting are needed.
- Asymptotic analysis (Big-O) — Choosing between iterative DP, matrix power, or linear solves depends on time/space complexity.
- Markov chain basics (ergodicity, stationary distribution) — Understanding long-run behavior and convergence requires these concepts.
Detailed Explanation
01 Overview
Imagine you pour one unit of water across several cups (states). Each second, every cup leaks its water into other cups according to fixed percentages. Dynamic programming (DP) with probability tracks how that water (probability mass) moves over time. Instead of counting ways, we sum probabilities: on each step, the probability of being in a state is the total incoming probability from all predecessors, weighted by transition probabilities. This is exactly the Markov chain update rule and can be implemented as a simple loop.
The appeal is practical: many problems—random walks on graphs, dice/coin processes, card draws, or simulations with absorbing outcomes—reduce to repeatedly applying the law of total probability. Conceptually, we represent the current distribution as a vector p and the transitions as a row-stochastic matrix P (each row sums to 1). A single step is p ← pP; repeating T steps means either iterating T times or computing P^T by fast exponentiation when T is large. When some states are absorbing (once entered, you stay), the process eventually stops there; absorbing Markov chain formulas give both the probability of ending in each absorber and the expected number of steps until absorption. Together, these techniques solve a wide range of competitive programming tasks involving randomness, with strong guarantees on correctness and performance.
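A minimal sketch of a single step p ← pP under the row-vector convention (the helper name step is ours; the full, checked implementations follow in Code Examples):

#include <bits/stdc++.h>
using namespace std;

// One Markov step under the row-vector convention: p' = p * P, where each
// row of P sums to 1. A single application of the law of total probability.
vector<double> step(const vector<double>& p, const vector<vector<double>>& P) {
    int n = (int)p.size();
    vector<double> nxt(n, 0.0);
    for (int i = 0; i < n; ++i)       // every source state i...
        for (int j = 0; j < n; ++j)   // ...sends p[i] * P[i][j] to state j
            nxt[j] += p[i] * P[i][j];
    return nxt;
}

int main() {
    // The cup/water picture made concrete: start in state 0, leak 0.3 to
    // state 1 and 0.7 to state 2; states 1 and 2 keep their mass.
    vector<vector<double>> P = {{0.0, 0.3, 0.7},
                                {0.0, 1.0, 0.0},
                                {0.0, 0.0, 1.0}};
    vector<double> p = {1.0, 0.0, 0.0};
    p = step(p, P);                    // p is now [0, 0.3, 0.7]
    for (double x : p) cout << x << ' ';
    cout << '\n';
}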
02 Intuition & Analogies
Hook: Picture a board game where your token moves between tiles based on a weighted spinner. Each spin, you redistribute your belief about the token’s location: if there’s a 70% chance to move to tile A and 30% to tile B from the current tile, your belief shifts accordingly. Repeat this, and the belief “flows” along the game board’s arrows.
Concept: That flow is probability mass. DP with probability is just tracking the movement of mass. A state keeps the portion it’s supposed to keep and sends the rest along its outgoing edges, proportionally to their probabilities. If a tile is an absorbing tile (a sink), once your token lands there, all the mass that arrives stays forever—like a drain.
Example: Suppose you start at tile 0 with probability 1. From 0, you go to 1 with probability 0.3 and to 2 with probability 0.7. Tiles 1 and 2 are absorbing (self-loops with probability 1). After one spin, p = [0, 0.3, 0.7]. After two spins, it is still [0, 0.3, 0.7] because mass that reached 1 or 2 stays there. If instead some tiles bounce you around (non-absorbing), your mass can circulate until it drains into the absorbers. For huge numbers of spins it is tedious to iterate; power tools like matrix exponentiation jump directly to the T-th step. For endless play without absorbers, the mass may settle into a steady pattern (stationary distribution), where another spin doesn’t change the beliefs at all.
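Written out with the update rule, the spinner example is a one-line computation:

p^{(1)} = p^{(0)} P = [1, 0, 0] · P = [P[0][0], P[0][1], P[0][2]] = [0, 0.3, 0.7]

and, because rows 1 and 2 of P are identity rows (the absorbing self-loops), p^{(2)} = p^{(1)} P = [0, 0.3, 0.7] again.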
03 Formal Definition
A probability DP is a Markov chain: a finite set of states {0, 1, …, n−1}, a row-stochastic transition matrix P (P[i][j] ≥ 0 and Σ_j P[i][j] = 1 for every i), and a row vector p^{(t)} whose entry p^{(t)}_i is the probability of being in state i after t steps. One step of the DP applies the law of total probability, p^{(t+1)}_j = Σ_i p^{(t)}_i · P[i][j], i.e., p^{(t+1)} = p^{(t)} P, so after T steps p^{(T)} = p^{(0)} P^T. A state a is absorbing if P[a][a] = 1, and a distribution π is stationary if πP = π with Σ_i π_i = 1.
04 When to Use
- Bounded steps, sparse transitions: Use iterative DP when T is modest (e.g., up to 1e6 if edges are few) and you only need the distribution after T steps or a probability of some event at/within T steps.
- Huge time horizons: Use matrix exponentiation to compute p^{(T)} = p^{(0)} P^{T} when T is large (e.g., 1e12) and the number of states n is small to medium (typically n ≤ 100–300 in contests) and P is dense or manageable.
- Absorbing outcomes: Use absorbing Markov chain formulas when you need (1) probability of ending in each absorbing state, or (2) expected time to absorption, regardless of T (i.e., the limit behavior). This reduces to linear algebra with (I − Q)^{-1}.
- Expected value queries: When asked for expected steps or expected reward until an event, set up linear equations E[s] = reward(s) + Σ_t P[s,t]·E[t] and solve. This is robust even with cycles.
- Steady-state/long-run behavior: For ergodic chains where T is “very large,” compute the stationary distribution via power iteration or solve πP = π with the constraint Σ_i π_i = 1 (see the sketch after this list). This is common in ranking, load balancing, and mixing time analyses.
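A minimal power-iteration sketch for the stationary distribution, assuming an ergodic chain; the 3-state matrix below is only an illustration:

#include <bits/stdc++.h>
using namespace std;

// Power iteration: repeatedly apply p <- p * P until the distribution stops
// changing (L1 distance below a tolerance). Assumes P is ergodic so that a
// unique stationary distribution exists.
int main() {
    int n = 3;
    // An illustrative ergodic chain (strongly connected, aperiodic).
    vector<vector<double>> P = {{0.5, 0.5, 0.0},
                                {0.2, 0.5, 0.3},
                                {0.3, 0.0, 0.7}};
    vector<double> p(n, 1.0 / n);           // start from the uniform distribution
    for (int iter = 0; iter < 100000; ++iter) {
        vector<double> q(n, 0.0);
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j)
                q[j] += p[i] * P[i][j];      // one step q = p * P
        double diff = 0.0;                   // L1 distance, not exact equality
        for (int i = 0; i < n; ++i) diff += fabs(q[i] - p[i]);
        p = q;
        if (diff < 1e-12) break;             // converged: p satisfies p = p * P
    }
    cout << setprecision(9) << fixed;
    for (double x : p) cout << x << ' ';     // approximate stationary pi
    cout << '\n';
}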
⚠️ Common Mistakes
- Mixing orientations: Row-stochastic vs. column-stochastic. If you treat p as a row vector (common in programming), use p' = pP, and P’s rows must sum to 1. If you use a column vector, the correct update is p' = P^T p, or make P column-stochastic and use p' = Pp. Be consistent (see the sketch after this list).
- Probabilities not summing to 1: Floating-point accumulation can slightly break normalization. Periodically renormalize the rows of P and the vector p to keep Σ p_i ≈ 1.
- Using DAG-style DP on cycles: Probability flows can cycle forever unless there are absorbers or mixing. For expected values in cyclic graphs, set up and solve linear equations instead of trying to topologically order states.
- Precision loss: Subtracting nearly equal numbers in Gaussian elimination or doing many iterative steps can introduce error. Use partial pivoting, double (or long double), and check conditioning; avoid unnecessary iterations.
- Inefficient for large T: Iterating T steps when T is huge will TLE. Use matrix exponentiation (O(n^3 log T)) or power iteration if you only need the steady state.
- Forgetting absorbing self-loops: In code, make absorbers explicit with P[a][a] = 1 and no other outgoing mass; otherwise probability may “leak.”
- Incorrect termination checks: For convergence, compare distributions with a tolerance (e.g., L1 norm < 1e-12), not exact equality. Also ensure loops won’t spin forever without a stopping condition.
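A minimal sketch of the orientation pitfall from the first item above, on the running three-state example: multiplying on the wrong side silently transposes the chain and breaks normalization.

#include <bits/stdc++.h>
using namespace std;

// With a row-stochastic P, the row-vector update p' = p P is correct;
// computing P p instead uses the transposed chain, and total probability
// is no longer preserved.
int main() {
    vector<vector<double>> P = {{0.0, 0.3, 0.7},   // rows sum to 1
                                {0.0, 1.0, 0.0},
                                {0.0, 0.0, 1.0}};
    int n = 3;
    vector<double> p = {1.0, 0.0, 0.0};
    vector<double> good(n, 0.0), bad(n, 0.0);
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            good[j] += p[i] * P[i][j];  // p' = p P      -> [0, 0.3, 0.7]
            bad[i]  += P[i][j] * p[j];  // p' = P p (!!) -> [0, 0, 0]: mass leaked
        }
    double gs = accumulate(good.begin(), good.end(), 0.0);
    double bs = accumulate(bad.begin(), bad.end(), 0.0);
    cout << "p P sums to " << gs << ", P p sums to " << bs << '\n';  // 1 vs 0
}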
Key Formulas
Markov Update
p^{(t+1)} = p^{(t)} P
Explanation: The distribution after one step equals the previous distribution times the transition matrix. Use this for iterative DP updates.
Component-wise Update
p^{(t+1)}_j = Σ_i p^{(t)}_i · P[i][j]
Explanation: The probability of being in state j next step is the sum of current probabilities of all states i times their chance to go to j. This is the law of total probability in matrix form.
T-step Distribution
p^{(T)} = p^{(0)} P^T
Explanation: The distribution after T steps is the initial distribution multiplied by the T-th power of the transition matrix. Compute with fast matrix exponentiation when T is large.
Normalization
Σ_i p^{(t)}_i = 1 for every t
Explanation: Total probability must always be 1 at every step. It is a good invariant to check in implementations.
Absorbing Chain Block Form
P = [[Q, R], [0, I]], with transient states ordered first
Explanation: After ordering states so transients come first, the transition matrix splits into transient-to-transient (Q), transient-to-absorbing (R), and identity on absorbers.
Fundamental Matrix
N = (I − Q)^{-1}
Explanation: N captures the expected numbers of visits to transient states. It is the inverse of I minus the transient submatrix.
Absorption Probabilities
B = N R
Explanation: Starting from transient states, the probability to end in each absorber is given by multiplying the fundamental matrix by R.
Expected Steps to Absorption
t = N · 1, where 1 is the all-ones vector
Explanation: Multiply N by an all-ones vector to get the expected time-to-absorption for each transient state.
Stationary Distribution
πP = π, Σ_i π_i = 1
Explanation: A vector π that is unchanged by P and sums to 1 represents the long-run fraction of time spent in each state for ergodic chains.
Expectation Recurrence
E[s] = 1 + Σ_t P[s,t] · E[t] for non-absorbing s, with E[s] = 0 on absorbers
Explanation: Set up a linear equation for the expected number of steps (or cost, replacing 1 with reward(s)) from each state. Solve the resulting linear system.
Convergence Check
||p^{(t+1)} − p^{(t)}||_1 < ε
Explanation: Use a norm-based tolerance to decide when iterative methods (like power iteration) have converged sufficiently.
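As a concrete instance of the Expectation Recurrence, the sketch below sets up and solves the linear system directly with Gaussian elimination, on the same 4-state absorbing chain as the third code example; the results (2.5 and 2 expected steps from states 0 and 1) agree with t = N · 1 computed there.

#include <bits/stdc++.h>
using namespace std;

// Solves E[s] = 1 + sum_t P[s][t] * E[t] for non-absorbing s (E = 0 on
// absorbers) by Gaussian elimination with partial pivoting.
// Chain: states 0,1 transient; 2,3 absorbing.
int main() {
    int n = 4;
    vector<vector<double>> P(n, vector<double>(n, 0.0));
    P[0][0] = 0.2; P[0][1] = 0.5; P[0][2] = 0.3;
    P[1][0] = 0.4; P[1][3] = 0.6;
    P[2][2] = 1.0; P[3][3] = 1.0;
    vector<bool> absorbing = {false, false, true, true};

    // Build A x = b: row s encodes E[s] - sum_t P[s][t] E[t] = 1 for
    // transient s, and E[s] = 0 for absorbing s.
    vector<vector<double>> A(n, vector<double>(n, 0.0));
    vector<double> b(n, 0.0);
    for (int s = 0; s < n; ++s) {
        if (absorbing[s]) { A[s][s] = 1.0; continue; }
        A[s][s] = 1.0; b[s] = 1.0;
        for (int t = 0; t < n; ++t) A[s][t] -= P[s][t];
    }

    // Gauss-Jordan elimination with partial pivoting.
    for (int col = 0; col < n; ++col) {
        int piv = col;
        for (int r = col + 1; r < n; ++r)
            if (fabs(A[r][col]) > fabs(A[piv][col])) piv = r;
        swap(A[piv], A[col]); swap(b[piv], b[col]);
        for (int r = 0; r < n; ++r) if (r != col && A[r][col] != 0.0) {
            double f = A[r][col] / A[col][col];
            for (int j = col; j < n; ++j) A[r][j] -= f * A[col][j];
            b[r] -= f * b[col];
        }
    }
    for (int s = 0; s < n; ++s) b[s] /= A[s][s];

    cout << setprecision(9) << fixed;
    // Prints E[0] = 2.5, E[1] = 2.0, E[2] = E[3] = 0.
    for (int s = 0; s < n; ++s) cout << "E[" << s << "] = " << b[s] << "\n";
}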
Complexity Analysis
- Iterative DP: O(T · edges) time for T steps (O(T · n^2) for a dense matrix), O(n) extra space for the two distribution vectors.
- Matrix exponentiation: O(n^3 log T) time and O(n^2) space to form P^T; the final vector-matrix multiply adds O(n^2).
- Absorbing-chain analysis and expectation systems: O(t^3) to invert I − Q or run Gaussian elimination over t transient states.
- Power iteration: O(edges) per iteration; the number of iterations depends on the chain’s mixing rate and the chosen tolerance.
Code Examples
#include <bits/stdc++.h>
using namespace std;

// Demonstrates iterative DP: p_next[j] += p[i] * P[i][j]
// Example graph: state 0 -> state 1 with 0.3, -> state 2 with 0.7;
// states 1 and 2 are absorbing.

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n = 3; // states 0,1,2
    vector<vector<pair<int,double>>> out(n);

    // Transitions from state 0
    out[0].push_back({1, 0.3});
    out[0].push_back({2, 0.7});

    // Make states 1 and 2 absorbing explicitly
    out[1].push_back({1, 1.0});
    out[2].push_back({2, 1.0});

    auto check_row_sums = [&]() {
        for (int i = 0; i < n; ++i) {
            double s = 0.0;
            for (auto [j, pr] : out[i]) s += pr;
            if (fabs(s - 1.0) > 1e-12) {
                cerr << "Warning: row " << i << " sums to " << setprecision(17) << s << "\n";
            }
        }
    };
    check_row_sums();

    // Start at state 0 with probability 1
    vector<double> p(n, 0.0), nextp(n, 0.0);
    p[0] = 1.0;

    int T = 5; // simulate 5 steps
    for (int t = 0; t < T; ++t) {
        fill(nextp.begin(), nextp.end(), 0.0);
        for (int i = 0; i < n; ++i) {
            if (p[i] == 0.0) continue;
            for (auto [j, prob] : out[i]) {
                nextp[j] += p[i] * prob;
            }
        }
        // Optional renormalization to control drift
        double sum = 0.0;
        for (double x : nextp) sum += x;
        for (double &x : nextp) x /= sum;
        p.swap(nextp);
    }

    cout.setf(std::ios::fixed);
    cout << setprecision(9);
    cout << "Distribution after T steps:\n";
    for (int i = 0; i < n; ++i) cout << "state " << i << ": " << p[i] << "\n";

    // Probability of eventual absorption at 1 equals p[1] after large enough T here
    // (since mass that reaches 1 or 2 stays there in this example).
    return 0;
}
We model transitions as adjacency lists with probabilities. We start with p concentrated at state 0 and repeatedly ‘push’ mass along edges. Absorbing states have a self-loop of probability 1, so any mass that enters them remains. After enough steps, all probability resides in the absorbers. The code includes a sanity check for row sums and a light renormalization to control floating-point drift.
#include <bits/stdc++.h>
using namespace std;

struct Mat {
    int n;
    vector<vector<long double>> a;
    Mat(int n = 0, bool ident = false) : n(n), a(n, vector<long double>(n, 0)) {
        if (ident) for (int i = 0; i < n; ++i) a[i][i] = 1.0L;
    }
};

Mat mul(const Mat &A, const Mat &B) {
    int n = A.n;
    Mat C(n);
    for (int i = 0; i < n; ++i) {
        for (int k = 0; k < n; ++k) if (A.a[i][k] != 0.0L) {
            long double aik = A.a[i][k];
            for (int j = 0; j < n; ++j) if (B.a[k][j] != 0.0L) {
                C.a[i][j] += aik * B.a[k][j];
            }
        }
    }
    return C;
}

Mat mpow(Mat base, long long e) {
    int n = base.n;
    Mat res(n, true);
    while (e > 0) {
        if (e & 1LL) res = mul(res, base);
        base = mul(base, base);
        e >>= 1LL;
    }
    return res;
}

// Multiply row vector p (size n) by matrix P (n x n): returns p * P
vector<long double> mul_vec_mat(const vector<long double> &p, const Mat &P) {
    int n = P.n;
    vector<long double> r(n, 0.0L);
    for (int j = 0; j < n; ++j) {
        long double s = 0.0L;
        for (int i = 0; i < n; ++i)
            if (p[i] != 0.0L && P.a[i][j] != 0.0L) s += p[i] * P.a[i][j];
        r[j] = s;
    }
    // Renormalize lightly to combat drift
    long double sum = 0.0L;
    for (auto x : r) sum += x;
    for (auto &x : r) x /= sum;
    return r;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    // Same 3-state example as before
    int n = 3;
    Mat P(n);
    // Row-stochastic P: from state i to j = P.a[i][j]
    P.a[0][1] = 0.3L; P.a[0][2] = 0.7L;
    P.a[1][1] = 1.0L; // absorbing
    P.a[2][2] = 1.0L; // absorbing

    vector<long double> p0(n, 0.0L);
    p0[0] = 1.0L; // start in state 0

    long long T = 50; // a large number of steps
    Mat PT = mpow(P, T);
    vector<long double> pT = mul_vec_mat(p0, PT);

    cout.setf(std::ios::fixed);
    cout << setprecision(12);
    cout << "Distribution after T steps via P^T:\n";
    for (int i = 0; i < n; ++i) cout << "state " << i << ": " << (double)pT[i] << "\n";

    return 0;
}
We construct the transition matrix P (row-stochastic) and compute P^T using binary exponentiation. Multiplying the initial row vector p0 by P^T yields the exact T-step distribution without iterating T times. Long doubles help reduce accumulated error. This approach is effective when T is huge and n is modest.
#include <bits/stdc++.h>
using namespace std;

// Build an absorbing chain with 4 states: 0,1 transient; 2,3 absorbing.
// P (row-stochastic):
//   0 -> 0 (0.2), 1 (0.5), 2 (0.3)
//   1 -> 0 (0.4), 3 (0.6)
//   2 -> 2 (1.0), 3 -> 3 (1.0)

vector<vector<long double>> invert(const vector<vector<long double>>& A) {
    int n = (int)A.size();
    vector<vector<long double>> aug(n, vector<long double>(2*n, 0.0L));
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) aug[i][j] = A[i][j];
        aug[i][n+i] = 1.0L; // identity on the right
    }
    // Gaussian elimination with partial pivoting
    for (int col = 0, row = 0; col < n && row < n; ++col, ++row) {
        int piv = row;
        for (int r = row+1; r < n; ++r)
            if (fabsl(aug[r][col]) > fabsl(aug[piv][col])) piv = r;
        if (fabsl(aug[piv][col]) < 1e-18L) throw runtime_error("Singular matrix (near-singular)");
        swap(aug[piv], aug[row]);
        long double diag = aug[row][col];
        for (int j = 0; j < 2*n; ++j) aug[row][j] /= diag; // make pivot 1
        for (int r = 0; r < n; ++r) if (r != row) {
            long double f = aug[r][col];
            if (f == 0.0L) continue;
            for (int j = 0; j < 2*n; ++j) aug[r][j] -= f * aug[row][j];
        }
    }
    vector<vector<long double>> inv(n, vector<long double>(n, 0.0L));
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            inv[i][j] = aug[i][n+j];
    return inv;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    const int n = 4;
    vector<vector<long double>> P(n, vector<long double>(n, 0.0L));
    P[0][0]=0.2L; P[0][1]=0.5L; P[0][2]=0.3L;
    P[1][0]=0.4L; P[1][3]=0.6L;
    P[2][2]=1.0L; // absorbing
    P[3][3]=1.0L; // absorbing

    // Identify transient (T) and absorbing (A) sets; here T={0,1}, A={2,3}
    vector<int> T = {0,1};
    vector<int> A = {2,3};
    int t = (int)T.size();
    int a = (int)A.size();

    // Build Q (t x t) and R (t x a)
    vector<vector<long double>> Q(t, vector<long double>(t, 0.0L));
    vector<vector<long double>> R(t, vector<long double>(a, 0.0L));

    for (int i = 0; i < t; ++i) {
        for (int j = 0; j < t; ++j) Q[i][j] = P[T[i]][T[j]];
        for (int j = 0; j < a; ++j) R[i][j] = P[T[i]][A[j]];
    }

    // Compute N = (I - Q)^{-1}
    vector<vector<long double>> I(t, vector<long double>(t, 0.0L));
    for (int i = 0; i < t; ++i) I[i][i] = 1.0L;
    vector<vector<long double>> IminusQ = I;
    for (int i = 0; i < t; ++i)
        for (int j = 0; j < t; ++j)
            IminusQ[i][j] -= Q[i][j];

    vector<vector<long double>> N = invert(IminusQ);

    // Compute B = N * R (t x a)
    vector<vector<long double>> B(t, vector<long double>(a, 0.0L));
    for (int i = 0; i < t; ++i)
        for (int k = 0; k < t; ++k) if (N[i][k] != 0.0L)
            for (int j = 0; j < a; ++j)
                B[i][j] += N[i][k] * R[k][j];

    // Expected steps tvec = N * 1
    vector<long double> tvec(t, 0.0L);
    for (int i = 0; i < t; ++i)
        for (int k = 0; k < t; ++k)
            tvec[i] += N[i][k] * 1.0L;

    cout.setf(std::ios::fixed);
    cout << setprecision(9);
    cout << "Absorption probabilities (rows: start in 0 or 1; cols: absorb at 2 or 3):\n";
    for (int i = 0; i < t; ++i) {
        for (int j = 0; j < a; ++j) cout << (double)B[i][j] << (j+1==a ? '\n' : ' ');
    }
    cout << "Expected steps to absorption from states 0 and 1:\n";
    for (int i = 0; i < t; ++i) cout << (double)tvec[i] << (i+1==t ? '\n' : ' ');

    return 0;
}
We reorder states conceptually into transient (0,1) and absorbing (2,3), form Q and R, then compute the fundamental matrix N = (I−Q)^{-1} via Gaussian elimination with partial pivoting. Absorption probabilities B = N R give the chance to end in each absorbing state from a transient start. Expected time to absorption is t = N·1. This avoids simulating many steps and is exact up to floating-point error.