By 苏剑林 | Nov 13, 2019
Yesterday, there was a discussion in the group about some counter-intuitive phenomena regarding $n$-dimensional vectors. One topic was that "generally, in $n$-dimensional space, two random vectors are almost always perpendicular," which markedly differs from our perception of 2D and 3D space. To understand this conclusion theoretically, we can consider the distribution of the angle $\theta$ between two random vectors and calculate its mean and variance.
Probability Density
First, let's derive the probability density function of $\theta$. In fact, there is no need for a lengthy derivation, as it is a direct consequence of $n$-dimensional hyperspherical coordinates. To find the distribution of the angle between two random vectors, it is clear that due to isotropy, we only need to consider unit vectors. Similarly, due to isotropy, we can fix one vector and let the other vary randomly. Without loss of generality, let the random vector be
\begin{equation}\boldsymbol{x}=(x_1,x_2,\dots,x_n)\end{equation}
and the fixed vector be
\begin{equation}\boldsymbol{y}=(1,0,\dots,0)\end{equation}
Writing $\boldsymbol{x}$ in hyperspherical coordinates (see the Wikipedia article on the $n$-sphere for background) gives
\begin{equation}
\left\{\begin{aligned}
x_{1}&=\cos(\varphi_{1})\\
x_{2}&=\sin(\varphi_{1})\cos(\varphi_{2})\\
x_{3}&=\sin(\varphi_{1})\sin(\varphi_{2})\cos(\varphi_{3})\\
&\,\,\vdots \\
x_{n-1}&=\sin(\varphi_{1})\cdots \sin(\varphi_{n-2})\cos(\varphi_{n-1})\\
x_{n}&=\sin(\varphi_{1})\cdots \sin(\varphi_{n-2})\sin(\varphi_{n-1})
\end{aligned}\right.
\end{equation}
where $\varphi_{n-1}\in [0, 2\pi)$ and the remaining angles $\varphi_1,\dots,\varphi_{n-2}$ range over $[0, \pi]$. In this case, the angle between $\boldsymbol{x}$ and $\boldsymbol{y}$ is:
\begin{equation}\arccos \langle \boldsymbol{x},\boldsymbol{y}\rangle = \arccos \cos(\varphi_{1}) = \varphi_{1}
\end{equation}
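As a quick numerical check of the coordinates above, the following minimal sketch builds a unit vector from a set of angles and confirms that its angle to $\boldsymbol{y}=(1,0,\dots,0)$ is $\varphi_1$. The function name `spherical_to_cartesian` and the sample angles are illustrative choices, not part of the derivation.

```python
import math

def spherical_to_cartesian(phis):
    """Map angles (phi_1, ..., phi_{n-1}) to a unit vector (x_1, ..., x_n)."""
    coords = []
    sin_prod = 1.0  # running product sin(phi_1) * ... * sin(phi_{k-1})
    for phi in phis:
        coords.append(sin_prod * math.cos(phi))
        sin_prod *= math.sin(phi)
    coords.append(sin_prod)  # x_n contains only sines
    return coords

# Illustrative angles for n = 5; the last one may lie anywhere in [0, 2*pi).
phis = [0.7, 1.9, 2.5, 4.0]
x = spherical_to_cartesian(phis)
y = [1.0] + [0.0] * (len(x) - 1)  # fixed vector (1, 0, ..., 0)

norm = math.sqrt(sum(c * c for c in x))
angle = math.acos(sum(a * b for a, b in zip(x, y)))
print("norm of x:", norm)
print("angle to y:", angle, "phi_1:", phis[0])
```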
In other words, the angle between the two is exactly $\varphi_1$. Then, the probability that the angle between $\boldsymbol{x}$ and $\boldsymbol{y}$ does not exceed $\theta$ is:
\begin{equation}P_n(\varphi_1\leq\theta) = \frac{\text{Integral on } n\text{-dimensional hypersphere surface where } \varphi_1 \text{ does not exceed } \theta}{\text{Total integral on } n\text{-dimensional hypersphere surface}}
\end{equation}
The volume element on the $n$-dimensional hypersphere surface is $\sin^{n-2}(\varphi_{1})\sin^{n-3}(\varphi_{2})\cdots \sin(\varphi_{n-2})\,d\varphi_{1}\,d\varphi_{2}\cdots d\varphi_{n-1}$ (available on Wikipedia), so
\begin{equation}\begin{aligned}
P_n(\varphi_1\leq\theta) =& \frac{\int_0^{2\pi}\cdots\int_0^{\pi}\int_0^{\theta}\sin^{n-2}(\varphi_{1})\sin^{n-3}(\varphi_{2})\cdots \sin(\varphi_{n-2})\,d\varphi_{1}\,d\varphi_{2}\cdots d\varphi_{n-1}}{\int_0^{2\pi}\cdots\int_0^{\pi}\int_0^{\pi}\sin^{n-2}(\varphi_{1})\sin^{n-3}(\varphi_{2})\cdots \sin(\varphi_{n-2})\,d\varphi_{1}\,d\varphi_{2}\cdots d\varphi_{n-1}}\\
=&\frac{(n-1)\text{-dimensional unit hypersphere surface area}\times\int_0^{\theta}\sin^{n-2}\varphi_{1} d\varphi_1}{n\text{-dimensional unit hypersphere surface area}}\\
=&\frac{\Gamma\left(\frac{n}{2}\right)}{\Gamma\left(\frac{n-1}{2}\right)\sqrt{\pi}} \int_0^{\theta}\sin^{n-2}\varphi_1 d\varphi_1
\end{aligned}
\end{equation}
This shows that the probability density function (PDF) of $\theta$ is
\begin{equation}
p_n(\theta) = \frac{\Gamma\left(\frac{n}{2}\right)}{\Gamma\left(\frac{n-1}{2}\right)\sqrt{\pi}}\sin^{n-2} \theta
\label{eq:theta}\end{equation}
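The density $\eqref{eq:theta}$ can be sanity-checked by simulation. The sketch below assumes that normalizing a vector of independent standard Gaussians gives a direction uniformly distributed on the sphere (a standard fact), samples the angle between two such vectors, and compares the empirical probability of landing in an interval with the integral of $p_n(\theta)$ over the same interval. All function names, the interval, and the sample size are illustrative.

```python
import math
import random

def angle_pdf(theta, n):
    """p_n(theta) = Gamma(n/2) / (Gamma((n-1)/2) * sqrt(pi)) * sin^(n-2)(theta)."""
    log_c = math.lgamma(n / 2) - math.lgamma((n - 1) / 2) - 0.5 * math.log(math.pi)
    return math.exp(log_c) * math.sin(theta) ** (n - 2)

def interval_probability(n, lo, hi, steps=2000):
    """Midpoint-rule integral of p_n over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(angle_pdf(lo + (k + 0.5) * h, n) for k in range(steps))

def sample_angle(n, rng):
    """Angle between two independent standard-Gaussian vectors in R^n."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def empirical_probability(n, lo, hi, trials=20000, seed=0):
    """Fraction of sampled angles that fall in [lo, hi]."""
    rng = random.Random(seed)
    hits = sum(lo <= sample_angle(n, rng) <= hi for _ in range(trials))
    return hits / trials

n = 20
lo, hi = math.pi / 2 - 0.2, math.pi / 2 + 0.2
print("from density :", interval_probability(n, lo, hi))
print("from sampling:", empirical_probability(n, lo, hi))
```

The two printed numbers should agree up to Monte Carlo noise, which shrinks like the inverse square root of `trials`.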
Sometimes we are interested in the distribution of $\eta=\cos\theta$. In this case, we need to perform a change of variables for the probability density:
\begin{equation}\begin{aligned}
p_n(\eta)=&\frac{\Gamma\left(\frac{n}{2}\right)}{\Gamma\left(\frac{n-1}{2}\right)\sqrt{\pi}}\sin^{n-2} (\arccos\eta)\left|\frac{d\theta}{d\eta}\right|\\
=&\frac{\Gamma\left(\frac{n}{2}\right)}{\Gamma\left(\frac{n-1}{2}\right)\sqrt{\pi}}(1-\eta^2)^{(n-3)/2}
\end{aligned}\label{eq:cos}\end{equation}
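As a quick consistency check on $\eqref{eq:cos}$, note that $\eta=x_1$. For a uniformly distributed unit vector the coordinates are exchangeable and satisfy $x_1^2+\cdots+x_n^2=1$, so symmetry requires $\mathbb{E}[\eta^2]=\frac{1}{n}$. The density reproduces this through a Beta integral:
\begin{equation}\begin{aligned}
\mathbb{E}[\eta^2]=&\frac{\Gamma\left(\frac{n}{2}\right)}{\Gamma\left(\frac{n-1}{2}\right)\sqrt{\pi}}\int_{-1}^{1}\eta^2(1-\eta^2)^{(n-3)/2}\,d\eta\\
=&\frac{\Gamma\left(\frac{n}{2}\right)}{\Gamma\left(\frac{n-1}{2}\right)\sqrt{\pi}}\cdot\frac{\Gamma\left(\frac{3}{2}\right)\Gamma\left(\frac{n-1}{2}\right)}{\Gamma\left(\frac{n}{2}+1\right)}
=\frac{1}{2}\cdot\frac{2}{n}=\frac{1}{n}
\end{aligned}\end{equation}
Since the density is even in $\eta$, the mean of $\cos\theta$ is $0$ and its variance is $\frac{1}{n}$, which already hints that the cosine concentrates near $0$ as $n$ grows.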
Distribution Profiles
From $\eqref{eq:theta}$ and $\eqref{eq:cos}$, we can see that when $n=2$, the distribution of the angle $\theta$ is uniform, and when $n=3$, the distribution of the cosine of the angle $\cos\theta$ is uniform. These two results indicate that in the 2D and 3D spaces we can perceive, the distribution of angles is relatively uniform. But what happens when $n$ is large? For example, $n=20, 50$?
Since $p_n(\theta)\propto\sin^{n-2}\theta$, the density peaks at $\theta=\frac{\pi}{2}$ (i.e., 90 degrees) whenever $n\geq 3$. Moreover, $\sin^{n-2}\theta$ is symmetric about $\theta=\frac{\pi}{2}$, so the mean is also $\frac{\pi}{2}$. The mean alone does not describe the distribution, however; we also need the variance
\begin{equation}
\operatorname{Var}_n(\theta) = \frac{\Gamma\left(\frac{n}{2}\right)}{\Gamma\left(\frac{n-1}{2}\right)\sqrt{\pi}}\int_0^{\pi}\left(\theta-\frac{\pi}{2}\right)^2\sin^{n-2} \theta \,d\theta\end{equation}
This integral has an analytical solution, but the form is quite cumbersome (if you are interested, you can calculate it yourself using Mathematica). Let's just look at some numerical results:
| $n$ | Variance |
|---|---|
| 3 | 0.467401 |
| 10 | 0.110661 |
| 20 | 0.0525832 |
| 50 | 0.0204053 |
| 100 | 0.0101007 |
| 200 | 0.00502508 |
| 1000 | 0.001001 |
As can be seen, the variance shrinks steadily as $n$ increases, roughly like $1/n$. This means that in high-dimensional space the angle between two random vectors concentrates tightly around $\frac{\pi}{2}$. In other words, two random vectors in high-dimensional space are almost always nearly perpendicular.
Of course, this can also be seen from the plot:
[Plot of $p(\theta)$]
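The variances tabulated above can be reproduced without Mathematica. The following sketch integrates the variance formula with composite Simpson's rule using only the Python standard library; the function name `angle_variance` and the step count are illustrative choices, and the normalizing constant is evaluated through `math.lgamma` to avoid overflow for large $n$.

```python
import math

def angle_variance(n, steps=20000):
    """Var_n(theta) via composite Simpson's rule on [0, pi] (steps must be even)."""
    # log of Gamma(n/2) / (Gamma((n-1)/2) * sqrt(pi))
    log_c = math.lgamma(n / 2) - math.lgamma((n - 1) / 2) - 0.5 * math.log(math.pi)
    h = math.pi / steps
    total = 0.0
    for k in range(steps + 1):
        theta = k * h
        # Simpson weights: 1 at the endpoints, then 4, 2, 4, ... in between.
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * (theta - math.pi / 2) ** 2 * math.sin(theta) ** (n - 2)
    return math.exp(log_c) * total * h / 3

for n in (3, 10, 20, 50, 100, 200, 1000):
    print(n, f"{angle_variance(n):.6g}")
```

For $n=3$ the integral can also be done by hand and equals $\frac{\pi^2}{4}-2$, which gives an independent check on the first table entry.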
For readers who want an approximate analytical result, Laplace's method approximates $p_n(\theta)$ by a Gaussian distribution. Expanding $\ln \sin^{n-2}\theta$ around $\theta=\frac{\pi}{2}$, we have
\begin{equation}\ln \sin^{n-2}\theta=\frac{2-n}{2}\left(\theta - \frac{\pi}{2}\right)^2 + \mathcal{O}\left(\left(\theta - \frac{\pi}{2}\right)^4\right)\end{equation}
which gives
\begin{equation}\sin^{n-2}\theta\approx \exp\left[-\frac{n-2}{2}\left(\theta - \frac{\pi}{2}\right)^2\right]\end{equation}
From this approximate form, we can consider $\theta$ as approximately following a normal distribution with mean $\frac{\pi}{2}$ and variance $\frac{1}{n-2}$. That is, when $n$ is large, the variance is approximately $\frac{1}{n-2}$, which also shows that the larger $n$ is, the smaller the variance.
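To gauge how good the $\frac{1}{n-2}$ approximation is, the short sketch below compares it with the numerically integrated variances copied from the table earlier in this article. The names `TABLE_VARIANCE`, `laplace_variance` and `relative_error` are illustrative.

```python
# Variances copied from the table of numerical results in this article.
TABLE_VARIANCE = {
    3: 0.467401,
    10: 0.110661,
    20: 0.0525832,
    50: 0.0204053,
    100: 0.0101007,
    200: 0.00502508,
    1000: 0.001001,
}

def laplace_variance(n):
    """Variance of the Gaussian obtained from the Laplace approximation."""
    return 1.0 / (n - 2)

def relative_error(n):
    """Relative error of 1/(n-2) against the tabulated variance."""
    exact = TABLE_VARIANCE[n]
    return (laplace_variance(n) - exact) / exact

for n in TABLE_VARIANCE:
    print(f"n={n:5d}  table={TABLE_VARIANCE[n]:.6g}  "
          f"1/(n-2)={laplace_variance(n):.6g}  rel.err={relative_error(n):.2%}")
```

For every $n$ in the table the approximation slightly overestimates the variance, and the relative error shrinks as $n$ grows, so $\frac{1}{n-2}$ should be read as a leading-order estimate that becomes reliable only for large $n$.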
Summary
This article derives the distribution of the angle between two random vectors in high-dimensional space, both as a note for my own future reference and as a resource for interested readers.