Monday, January 28, 2013

Markov Chains: Stationary Distributions

In my first post on Markov chains, I introduced a particular type of stochastic process with the basic first-order Markov property (in discrete time).  We call this a Markov chain. Examples explaining what are 1-step and 2-step transition probabilities and how to compute a marginal or unconditional probability of being in a particular state at time $n$ were all provided. This post will be a bit more technical, but ultimately I hope to end with a concrete example (with R code) to help make it more clear.

In this post, I will introduce stationary distributions by answering the following questions:
  1. What is a stationary distribution? 
  2. What properties are needed to guarantee the unique existence of the stationary distribution? 
  3. How to compute the stationary distribution? 
Q1: A stationary distribution $\pi$ is a probability distribution such that when the Markov chain reaches the stationary distribution, then it remains in that probability distribution forever. This means whenever you are interested in asking questions like "In the long-run, what it the proportion of time the Markov chain will be in a particular state?" (such as the rain or no rain example), you're interested in computing the stationary distribution (if it's possible).

Let's start with a bit of notation.  Define $\pi_j$ as the limiting probability that the process will be in state $j$ at time $n$ or
\[ \pi_j = \lim_{n \rightarrow \infty} P_{ij}^n \]
\lim_{n \rightarrow \infty} P[X_{n+1} = j] & = & lim_{n \rightarrow \infty}  \sum_{i = 0}^{\infty} P[X_{n+1} = j | X_n = i] P[X_n = i] \\
& = & \sum_{i = 0}^{\infty}  lim_{n \rightarrow \infty}  P_{ij} P[X_n = i] \hspace{.2in} (Fubini's \hspace{.05in}Theorem)
which defines the stationary distribution:
\[ \pi_j = \sum_{i=0}^{\infty} P_{ij} \pi_i \]

Q2: This question is more difficult to answer.
Short answer: The Markov chain needs to be a (1) finite (2) aperiodic and (3) irreducible chain.
Longer answer: Assume $P$ is the transition probability matrix for a Markov chain with the following properties:

  1. finite = finite number of states
  2. aperiodic = state $i$ has period $d$ if $P_{ii}^n = 0$ whenever $n$ is not divisible by $d$ and $d$ is the largest integer with this property
  3. irreducible = two states that communicate are in the same class. classes are disjoint or identical. If there exists only 1 class, then the Markov chain is irreducible

Then, there exists a unique probability distribution satisfying
\[ \pi_j = \sum_{i = 0}^{\infty} \pi_i P_{ij} \hspace{1in} j \geq 0, \hspace{.5in} \sum_j \pi_j = 1 \]
In vector notation $\pi^{'} = \pi^{'} P$ and $\pi^{'} 1 = 1$ where $\pi^{'} = (\pi_0, \ldots, \pi_s)$.

This theorem also gives us the fact that if $P$ is finite, aperiodic and irreducible, then not only does the unique stationary distribution exist, but also $\pi_j$ = the limiting probability that the process will be in state $j$ at time $n$ equals the long-run proportion of time that the process will be in state $j$ [Sheldon Ross, 2007]. Or

\[ P^n \rightarrow \left[ \begin{array}{ccc}

\pi_0 & \ldots & \pi_s \\
\pi_0 & \ldots & \pi_s \\
\vdots & & \\
\pi_0 & \ldots & \pi_s
\end{array} \right]  \]

Q3: Back to our rain example. The transition probability matrix $P$ was given by

\[ P = \left[ \begin{array}{cc}

\alpha & 1-\alpha  \\
\beta & 1- \beta \\
\end{array} \right]  \]
where $\alpha$ = 0.7 and $\beta$= 0.4.  As we consider the 2-step transition probability matrix ($P_{ij}^2$), we can consider the 4-step ($P_{ij}^4$)or $n$-step transition probability matrix ($P_{ij}^n$) as $n \rightarrow \infty$.


trans.mat <- matrix(c(0.7, 0.3, 0.4, 0.6), 2,2, byrow = TRUE)
> trans.mat %*% trans.mat %*% trans.mat
      [,1]  [,2]
[1,] 0.583 0.417
[2,] 0.556 0.444

> trans.mat %*% trans.mat %*% trans.mat %*% trans.mat
       [,1]   [,2]
[1,] 0.5749 0.4251
[2,] 0.5668 0.4332
> trans.mat %*% trans.mat %*% trans.mat %*% trans.mat %*% trans.mat
        [,1]    [,2]
[1,] 0.57247 0.42753
[2,] 0.57004 0.42996
> trans.mat %*% trans.mat %*% trans.mat %*% trans.mat %*% trans.mat %*% trans.mat
         [,1]     [,2]
[1,] 0.571741 0.428259
[2,] 0.571012 0.428988

We see each row in starting to converge to what's called the 'stationary distribution' as $n \rightarrow \infty$.  Thus, let's find the stationary distribution for Markov chain.  Another way of asking this is in the long run, what is the proportion of time the process is in the rain or no rain state?

Because our Markov chain is finite, aperiodic, and irreducible, we can use the set of equations $\pi_j = \sum_{i = 0}^{\infty} \pi_i P_{ij}$ and $ \sum_j \pi_j = 1$.

\pi_0 & = & \alpha \pi_0 + \beta \pi_1 \\
\pi_1 & = & (1-\alpha) \pi_0 + (1-\beta) \pi_1 \\
\pi_0 + \pi_1 & = & 1 \\
Solving for $\pi_0$ and $\pi_1$ yields the solutions
\pi_0 & = &  \frac{\beta}{ 1 + \beta - \alpha} \\
\pi_1 & = & \frac{1-\alpha}{1 + \beta - \alpha}
Plugging in $\alpha = 0.7$ and $\beta = 0.4$, we see the long-run proportion of time it will rain is $\pi_0 = 0.571$ and the long-run proportion of time it won't rain is $\pi_1 = 0.428$. This matches the limiting probabilities we calculated.

Hopefully this was helpful in understanding what a stationary distribution is and how to compute one.  Stay tuned for second-order Markov chains and a great example using the RHmm package in R.


  1. Hey Stephanie!
    I have some questions following your 2 posts on markov chains. How would you do a second order markov chain say for DNA bases ATCG? Thanks in advance.

  2. Janyce: You would set up a second-order transition probability matrix where each transition depended on the previous two states (not just one). Hope this helps!