When Probability Meets Complex Variables: From Binomial Distribution to Poisson Distribution

By 苏剑林 | January 13, 2015

The Poisson distribution describes the number of random events occurring per unit of time, such as the number of service requests a facility receives within a given period, or the number of passengers waiting at a bus stop. [Wikipedia] It can also serve as an approximation to the Binomial distribution when the probability of success is small. Its derivation is covered in most general probability textbooks. However, the textbook proofs are not always elegant; one example is the proof on page 98 of "A Course in Probability and Mathematical Statistics" (2nd Edition, edited by Mao Shisong et al.). So which derivation deserves more praise? In my opinion, it is the one that uses the generating function.

The probability generating function of the Binomial distribution is:

\begin{equation}(q+px)^n,\quad q=1-p\end{equation}
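As a quick sanity check, here is a minimal Python sketch that expands $(q+px)^n$ by repeated polynomial multiplication and compares the coefficient of $x^k$ with the usual Binomial probability $\binom{n}{k}p^kq^{n-k}$. The function names `pgf_coefficients` and `binomial_pmf` and the values $n=10$, $p=0.3$ are illustrative assumptions, not taken from the original derivation.

```python
from math import comb


def pgf_coefficients(n, p):
    """Expand (q + p*x)**n, with q = 1 - p, by repeated multiplication.

    Returns a list c where c[k] is the coefficient of x**k.
    """
    q = 1.0 - p
    coeffs = [1.0]  # start from the constant polynomial 1
    for _ in range(n):
        nxt = [0.0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            nxt[k] += c * q      # contribution of the constant term q
            nxt[k + 1] += c * p  # contribution of p*x, raising the power by one
        coeffs = nxt
    return coeffs


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


n, p = 10, 0.3  # illustrative values only
for k, c in enumerate(pgf_coefficients(n, p)):
    print(k, round(c, 6), round(binomial_pmf(k, n, p), 6))
```

The two printed columns should agree up to floating-point rounding, which is what it means for $(q+px)^n$ to be the generating function of the Binomial distribution.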

When the number of trials $n$ is very large and the success probability $p$ is very small, we can look for an approximation. Suppose $\lambda = pn$ is a number of moderate size, so that $p = \frac{\lambda}{n}$ is very small. Substituting $q = 1-p$ gives $q+px = 1+p(x-1)$, and hence:

\begin{equation}(q+px)^n=\left(1+\frac{\lambda}{n}(x-1)\right)^n\end{equation}

Since

\begin{equation}\lim_{n\to\infty}\left(1+\frac{x}{n}\right)^n=e^x\end{equation}

the expression $\left(1+\frac{x}{n}\right)^n$ differs only slightly from $e^x$ when $n$ is sufficiently large but finite. Replacing $x$ with $\lambda(x-1)$ yields the approximation:

\begin{equation}\left(1+\frac{\lambda}{n}(x-1)\right)^n\approx e^{\lambda x-\lambda}\end{equation}

This is the generating function for the Poisson distribution.
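To get a feel for how good this approximation is, the following Python sketch evaluates both sides at a fixed point for increasing $n$. The helper names and the choices $\lambda = 3$, $x = 0.5$ are assumptions made purely for illustration.

```python
from math import exp


def binomial_pgf(x, n, lam):
    """Binomial generating function (q + p*x)**n with p = lam / n."""
    return (1.0 + lam * (x - 1.0) / n) ** n


def poisson_pgf(x, lam):
    """Limiting generating function exp(lam*x - lam)."""
    return exp(lam * (x - 1.0))


def pgf_gap(x, n, lam):
    """Absolute difference between the two generating functions at x."""
    return abs(binomial_pgf(x, n, lam) - poisson_pgf(x, lam))


lam, x = 3.0, 0.5  # illustrative values only
for n in (10, 100, 1000, 10000):
    print(n, pgf_gap(x, n, lam))
```

For these values the gap shrinks as $n$ grows, in line with the limit used above.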

It is worth noting that $e^{\lambda x-\lambda}|_{x=1}=1$, which is a necessary condition for any probability generating function. This is a very fortunate result: the function before the approximation was a probability generating function, and the function after the approximation turns out to be exactly another probability generating function, with no normalization required. It must be said that this is a very beautiful coincidence.

Expanding this generating function as a power series in $x$ gives the probability of each value $k$ as the coefficient of $x^k$:

\begin{equation}e^{\lambda x-\lambda}=e^{-\lambda}\sum_{k=0}^{\infty}\frac{\lambda^k}{k!}x^k\end{equation}

In other words,

\begin{equation}P(X=k)=e^{-\lambda}\frac{\lambda^k}{k!}\end{equation}

Thus, we have obtained the probability formula for the Poisson distribution.
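The approximation can also be checked at the level of individual probabilities. The sketch below compares the Binomial probabilities for large $n$ and $p=\lambda/n$ with the Poisson formula; the function names and the values $\lambda=2$, $n=1000$ are illustrative assumptions.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def poisson_pmf(k, lam):
    """Poisson probability exp(-lam) * lam**k / k!."""
    return exp(-lam) * lam**k / factorial(k)


lam, n = 2.0, 1000  # illustrative values only
p = lam / n         # small success probability, so that n * p = lam
for k in range(8):
    print(k, round(binomial_pmf(k, n, p), 6), round(poisson_pmf(k, lam), 6))
```

With these values the two columns should be close to each other for every $k$ shown, and the agreement should improve as $n$ is increased with $\lambda$ held fixed.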

Finally, here is a diagram I came across recently that illustrates the connections between various probability distributions, shared here for readers.

Connections between various probability distributions

Original image source: http://www.math.wm.edu/~leemis/2008amstat.pdf