A Probability Inequality: Staring at it Until it Becomes Obviously True!

By 苏剑林 | April 30, 2025

Two days ago, a member of a QQ group shared an inequality and asked for a proof:

[Image: a probability-related inequality, from "There is no fast single hashing algorithm"]

The concise problem statement, combined with the hint that it is "easily" checked, makes one feel that this is an obviously true result. However, the person who asked mentioned they had tried for a long time without success. So, what is the actual situation? Is it truly obviously true?

Preliminary Attempt

The problem is equivalent to proving \begin{equation}\sum_{i=0}^j p^i \leq \sum_{i=0}^j \left(\log\frac{1}{1-p}\right)^i/i!,\qquad p\in[0, 1)\label{eq:q}\end{equation} The given hint is \begin{equation}\log\frac{1}{1-p} = \sum_{i=1}^{\infty} \frac{p^i}{i} = p + \frac{p^2}{2} + \frac{p^3}{3} + \cdots \label{eq:log-1-p}\end{equation} This hint seems to guide us toward substituting the series for $\log\frac{1}{1-p}$ into the right side of Eq. $\eqref{eq:q}$, expanding it into a power series in $p$, and comparing term by term. Since all coefficients in the expansion of $\log\frac{1}{1-p}$ are positive, every coefficient in the power series expansion of the right side of Eq. $\eqref{eq:q}$ is also positive. Therefore, we only need to prove that the first $j+1$ coefficients (those of $p^0$ through $p^j$) are all greater than or equal to 1.
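Before attempting a proof, the inequality itself can be sanity-checked numerically. The sketch below (plain Python; the helper names `lhs` and `rhs` are mine, not part of the original problem) samples $p$ on a grid for several values of $j$:

```python
import math

def lhs(p, j):
    # left side of (1): 1 + p + ... + p^j
    return sum(p**i for i in range(j + 1))

def rhs(p, j):
    # right side of (1): sum_{i=0}^j x^i/i! with x = log(1/(1-p))
    x = math.log(1.0 / (1.0 - p))
    return sum(x**i / math.factorial(i) for i in range(j + 1))

for j in range(6):
    for k in range(100):
        p = k / 100.0
        # small tolerance guards against float rounding at p = 0
        assert lhs(p, j) <= rhs(p, j) + 1e-12, (p, j)
print("inequality (1) holds on the sampled grid")
```

Of course, a grid check is evidence, not a proof, which is what the rest of the post is about.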

The logic seems sound, but expanding the $i$-th power of a power series back into another power series is an extremely tedious task, making it difficult to carry out in practice. I tried several other approaches as well, such as setting $x=\log\frac{1}{1-p}$ to convert the problem into an inequality in $x$; this seemed promising at first, but I eventually got stuck and could not complete the proof.

Thus, the initial attempt was declared a failure.

Taylor Brute Force

After going in circles, I returned to the starting point: after half a day of trying, the most reliable path still seemed to be expanding the right side of Eq. $\eqref{eq:q}$ into a power series. If directly raising the series in Eq. $\eqref{eq:log-1-p}$ to powers doesn't work, then we must fall back on the original method behind Taylor expansion: taking derivatives.

For simplicity, we define \begin{equation}f_j(x) = \sum_{i=0}^j \frac{x^i}{i!}\end{equation} The original problem is then to prove \begin{equation}f_j(x) \geq \sum_{i=0}^j p^i,\qquad x = \log\frac{1}{1-p}\end{equation} which we convert into a statement about Taylor coefficients at $p=0$: \begin{equation}\left.\frac{d^k}{dp^k}f_j(x)\right|_{p=0} \,\,\geq\,\, \left\{\begin{aligned}&\, 1 \times k!,\,\, k \leq j \\[5pt] &\,0,\,\, k > j\end{aligned}\right.\end{equation} Note that $f_j(0)=1$ when $j \geq 0$, and we define $f_j(x)\equiv 0$ when $j < 0$. Since $f_j'(x)= f_{j-1}(x)$ and $\frac{dx}{dp} = \frac{1}{1-p}$, we have: \begin{gather}\frac{d}{dp}f_j(x) = f_{j-1}(x) \frac{1}{1-p} \\ \frac{d^2}{dp^2}f_j(x) = [f_{j-1}(x) + f_{j-2}(x)]\frac{1}{(1-p)^2} \\ \frac{d^3}{dp^3}f_j(x) = [2f_{j-1}(x) + 3f_{j-2}(x) + f_{j-3}(x)]\frac{1}{(1-p)^3} \\ \vdots \notag\\ \frac{d^k}{dp^k}f_j(x) = [\alpha_1 f_{j-1}(x) + \alpha_2 f_{j-2}(x) + \cdots + \alpha_k f_{j-k}(x)]\frac{1}{(1-p)^k} \\[8pt] \frac{d^{k+1}}{dp^{k+1}}f_j(x) = \left[\begin{aligned} \alpha_1 f_{j-2}(x) + \alpha_2 f_{j-3}(x) + \cdots + \alpha_k f_{j-k-1}(x) \\ + k(\alpha_1 f_{j-1}(x) + \alpha_2 f_{j-2}(x) + \cdots + \alpha_k f_{j-k}(x)) \end{aligned}\right]\frac{1}{(1-p)^{k+1}} \end{gather} The pattern: $\frac{d^k}{dp^k}f_j(x)$ is the product of a linear combination of $f_{j-1}(x), f_{j-2}(x), \dots, f_{j-k}(x)$ with positive coefficients $\alpha_1,\dots,\alpha_k$ and the factor $\frac{1}{(1-p)^k}$. We only care about the value of $\left.\frac{d^k}{dp^k}f_j(x)\right|_{p=0}$. At $p=0$ we have $x=0$, and each $f_m(0)$ is either 1 (if $m \geq 0$) or 0 (if $m < 0$), so the derivative equals the sum of those $\alpha_m$ for which $j - m \geq 0$. When $k \leq j$, all of $f_{j-1}(0), \dots, f_{j-k}(0)$ equal 1, so the derivative is simply $\alpha_1 + \alpha_2 + \cdots + \alpha_k$, and the recursion above shows that this sum of coefficients picks up a factor of $k$ at each step: \begin{equation}\left.\frac{d^k}{dp^k}f_j(x)\right|_{p=0} = k \times \left.\frac{d^{k-1}}{dp^{k-1}}f_j(x)\right|_{p=0}\end{equation} Thus $\left.\frac{d^k}{dp^k}f_j(x)\right|_{p=0} = k!$, and by Taylor's formula the coefficient of the $p^k$ term is $k!/k!=1$.
When $k > j$, the terms with $f_{j-m}(0)=0$ (i.e. $m > j$) drop out of the sum, so $\left.\frac{d^k}{dp^k}f_j(x)\right|_{p=0} < k!$; the coefficient of $p^k$ is therefore less than 1 yet still non-negative. Therefore, the inequality is proven.
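The key computation above can be cross-checked symbolically. The sketch below (using sympy; the ranges of $j$ and $k$ are arbitrary choices of mine) verifies that $\left.\frac{d^k}{dp^k}f_j(x)\right|_{p=0}$ equals $k!$ for $k \leq j$ and lies in $[0, k!)$ for $k > j$:

```python
import sympy as sp

p = sp.symbols('p')
x = sp.log(1 / (1 - p))  # so that dx/dp = 1/(1-p)
for j in range(5):
    # f_j(x) = sum_{i=0}^j x^i / i!
    f = sum(x**i / sp.factorial(i) for i in range(j + 1))
    for k in range(j + 4):
        d = sp.diff(f, p, k).subs(p, 0)
        if k <= j:
            assert d == sp.factorial(k)      # coefficient of p^k is exactly 1
        else:
            assert 0 <= d < sp.factorial(k)  # coefficient of p^k lies in [0, 1)
print("d^k/dp^k f_j at p = 0 equals k! for all k <= j")
```

This checks only finitely many $(j, k)$ pairs, but it matches the derivative recursion exactly.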

Obviously True

So the original problem wasn't as "obviously true" as the author claimed after all? I thought so too, until another group member offered an extremely simple perspective that made everything clear: this problem really is obviously true!

This perspective makes full use of reverse thinking: \begin{equation}\begin{aligned} \sum_{i=0}^j \left(\log\frac{1}{1-p}\right)^i/i! =&\, \sum_{i=0}^{\infty} \left(\log\frac{1}{1-p}\right)^i/i! - \sum_{i=j+1}^{\infty} \left(\log\frac{1}{1-p}\right)^i/i! \\ =&\, \exp\left(\log\frac{1}{1-p}\right) - \sum_{i=j+1}^{\infty} \left(\log\frac{1}{1-p}\right)^i/i! \\ =&\, \frac{1}{1-p} - \sum_{i=j+1}^{\infty} \left(\log\frac{1}{1-p}\right)^i/i! \\ =&\, \sum_{i=0}^{\infty} p^i - \sum_{i=j+1}^{\infty} \left(\log\frac{1}{1-p}\right)^i/i! \\ \end{aligned}\end{equation} Note that $\log\frac{1}{1-p} \sim p$ as $p \to 0$, so the trailing $\sum_{i=j+1}^{\infty}$ part only contributes terms of order $p^{j+1}$ and above. It thus becomes almost immediately obvious that the first $j+1$ coefficients of the expansion of the right side of Eq. $\eqref{eq:q}$ are all 1! The proof itself is not hard to follow; the truly difficult part is having the courage to extend the right-hand summation to infinity and to notice that the resulting error term only contributes from $p^{j+1}$ onward. This is classic reverse thinking.
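The reverse-thinking argument can likewise be checked with sympy. The sketch below (the truncation orders are my own choice) confirms that the tail $\sum_{i=j+1}^{\infty}$ contributes nothing below degree $j+1$, and hence that the first $j+1$ coefficients of the partial sum are all exactly 1:

```python
import sympy as sp

p = sp.symbols('p')
x = sp.log(1 / (1 - p))
for j in range(1, 5):
    # partial sum on the right side of (1)
    head = sum(x**i / sp.factorial(i) for i in range(j + 1))
    # exp(x) = 1/(1-p), so the infinite tail is 1/(1-p) minus the partial sum
    tail = sp.series(1 / (1 - p) - head, p, 0, j + 3).removeO()
    # the tail has no terms of degree <= j ...
    coeffs = sp.Poly(tail, p).all_coeffs()[::-1]
    assert all(c == 0 for c in coeffs[: j + 1])
    # ... hence the first j+1 coefficients of the partial sum are all 1
    head_coeffs = sp.Poly(sp.series(head, p, 0, j + 1).removeO(), p).all_coeffs()[::-1]
    assert head_coeffs[: j + 1] == [1] * (j + 1)
print("tail starts at order p^(j+1); first j+1 coefficients are all 1")
```

The one-line takeaway: subtracting the tail from $\frac{1}{1-p} = \sum_{i=0}^{\infty} p^i$ cannot disturb any coefficient below $p^{j+1}$.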

Mathematics is truly wonderful~