Dirac Delta Function: Series Approximation

By 苏剑林 | January 11, 2017

Weierstrass Theorem

Understanding the Dirac delta function as the limit of a sequence of functions leads to very rich consequences, and these consequences are not far from rigorous proofs. For example, define $$\delta_n(x)=\left\{\begin{aligned}&\frac{(1-x^2)^n}{I_n},x\in[-1,1]\\ &0,\text{otherwise}\end{aligned}\right.$$ where $I_n = \int_{-1}^1 (1-x^2)^n dx$. It is then not difficult to prove that $$\delta(x)=\lim_{n\to\infty}\delta_n(x)$$ Thus, for a continuous function $f(x)$ on $[a,b]$, we obtain $$f(x)=\int_{-1}^1 f(y)\delta(x-y)dy = \lim_{n\to\infty}\int_{-1}^1 f(y)\delta_n(x-y) dy$$ Here $-1 < a < b < 1$, and we have "non-rigorously" swapped the integral and the limit, but this is not particularly critical. What matters is the result: since $\delta_n(x-y)$ contributes the factor $(1-(x-y)^2)^n$, which is a polynomial of degree $2n$ in $x$ with coefficients depending on $y$, $$P_n(x)=\int_{-1}^1 f(y)\delta_n(x-y) dy$$ is a polynomial of degree $2n$ in $x$. Therefore, the expression above says that $f(x)$ is the limit of a sequence of polynomials! This leads to the famous "Weierstrass Theorem":

Any continuous function on a closed interval can be uniformly approximated by polynomials.

Rigorous readers might shake their heads: what kind of proof is this? True, this is not a rigorous proof, but how far is it from one? For those interested in seeing a rigorous proof, you might consider reading Professor Qi Minyou's "Reviewing Calculus" (重温微积分). You will find that the rigorous proof is actually a more careful discussion and estimation (using inequalities) of the formulas above, and such a discussion is not difficult. Therefore, although the above is a non-rigorous guide, it is close to the rigorous proof. All of this can be credited to our non-rigorous understanding of $\delta(x)$ as the limit of $\delta_n(x)$, rather than understanding it strictly as a generalized function.
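To make the construction concrete, here is a minimal numerical sketch (my own addition, not part of the original argument): it evaluates $P_n(x)=\int_{-1}^1 f(y)\delta_n(x-y)dy$ by a Riemann sum for the test function $f(x)=|x|$, a choice made purely for illustration, and prints the maximum error over $[-0.5, 0.5]$.

```python
import numpy as np

# Test function: continuous but not smooth (chosen for illustration)
f = lambda x: np.abs(x)

y = np.linspace(-1, 1, 4001)   # quadrature grid for the y-integral
dy = y[1] - y[0]

def P(x, n):
    """Riemann-sum approximation of P_n(x) = integral of f(y) * delta_n(x - y) dy."""
    I_n = np.sum((1 - y**2) ** n) * dy                        # I_n = integral of (1 - t^2)^n dt
    t = x - y
    kernel = np.where(np.abs(t) <= 1, (1 - t**2) ** n, 0.0)   # delta_n before normalizing
    return np.sum(f(y) * kernel) * dy / I_n

xs = np.linspace(-0.5, 0.5, 101)   # stay well inside (-1, 1)
for n in (1, 10, 100, 1000):
    err = max(abs(P(x, n) - f(x)) for x in xs)
    print(f"n = {n:4d}   max |P_n - f| = {err:.4f}")   # shrinks as n grows
```

The error decays slowly (the kernel has width roughly of order $1/\sqrt{n}$), which is consistent with the theorem: it promises convergence, not speed.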

The above process actually demonstrates a framework that allows us to analyze which function bases can be used to approximate a function. Constructing a sequence of functions $\{\delta_n(x)\}$ whose limit is $\delta(x)$ is not difficult: each $\delta_n$ must integrate to 1, and away from $x=0$ the $\delta_n$ must tend to 0 as $n\to\infty$; that is all. For example, here is another case: $$\delta_n(x)=\left\{\begin{aligned}&\frac{\cos^n x}{I_n},x\in\left[-\frac{\pi}{2},\frac{\pi}{2}\right]\\ &0,\text{otherwise}\end{aligned}\right.$$ where $I_n = \int_{-\pi/2}^{\pi/2} \cos^n x dx$. Similarly, we have $$\delta(x)=\lim_{n\to\infty}\delta_n(x)$$ What results can this limit bring? Following the previous process for polynomials, suppose $-\frac{\pi}{2} < a < b < \frac{\pi}{2}$. Then for a continuous function $f(x)$ on $[a,b]$, we see that $$\int_{-\pi/2}^{\pi/2} f(y)\delta_n(x-y) dy = \frac{1}{I_n} \int_{-\pi/2}^{\pi/2} f(y)\cos^n (x-y) dy $$ By expanding $\cos^n(x-y)$, we find that the above expression is actually a linear combination of $\sin kx, \cos kx$ for $k=0,1,\dots,n$. Thus, we can also obtain:

Any continuous function on the closed interval $[-\pi/2, \pi/2]$ can be uniformly approximated by trigonometric series $\{\cos kx, \sin kx\}$.

This is also known as the Weierstrass Theorem. Similarly, the above discussion is not far from a rigorous proof.
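As a quick sanity check on the expansion claim, the following SymPy snippet (my own addition) verifies, for $n=3$, the two identities it relies on: a power-reduction identity that turns $\cos^n$ into multiple angles, and the addition formula that separates $x$ from $y$.

```python
import sympy as sp

t, x, y = sp.symbols('t x y', real=True)
is_zero = lambda e: sp.expand(e.rewrite(sp.exp)) == 0   # exact check via exponentials

# Power reduction: cos^3 t = (3/4) cos t + (1/4) cos 3t, so cos^n(x-y) is a
# combination of cos k(x-y) for k = 0, ..., n (shown here for n = 3).
assert is_zero(sp.cos(t)**3 - sp.Rational(3, 4)*sp.cos(t) - sp.Rational(1, 4)*sp.cos(3*t))

# Addition formula: cos k(x-y) = cos kx cos ky + sin kx sin ky, i.e. a linear
# combination of {cos kx, sin kx} with y-dependent coefficients.
assert is_zero(sp.cos(3*(x - y)) - sp.cos(3*x)*sp.cos(3*y) - sp.sin(3*x)*sp.sin(3*y))

print("identities verified")
```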

Does something feel a bit off? The period of $\cos x, \sin x$ is $2\pi$, yet we have only approximated on an interval of length $\pi$, namely $[-\pi/2, \pi/2]$. Isn't that a bit wasteful? In fact, by slightly modifying the process above, one can prove that

For any $\epsilon > 0$, any continuous function on the closed interval $[-\pi+\epsilon, \pi-\epsilon]$ can be uniformly approximated by trigonometric series $\{\cos kx, \sin kx\}$.

However, this no longer holds on $[-\pi, \pi]$ in the sense of uniform convergence; there the series can only converge in measure or in the $L^2$ norm. As is well known, Fourier series do not converge uniformly in all cases on that interval.

Orthonormal Basis

"Orthogonality" helps simplify problems. We prefer orthogonal coordinate systems for geometric problems, and we prefer orthogonal bases for function problems.

The two Weierstrass Theorems mentioned above tell us that continuous functions on closed intervals can be uniformly approximated by either polynomials or trigonometric series. Moreover, the process above actually provides a scheme for calculating the approximation coefficients of each term. From a practical standpoint, however, that scheme is inefficient, because going from an $n$-th order approximation to an $(n+1)$-th order approximation requires recalculating all of the coefficients. (By contrast, consider the Taylor series: going from order $n$ to order $n+1$ only requires computing one more derivative $f^{(n+1)}(x)$, which is very economical; on the other hand, Taylor expansion imposes much stronger conditions on the function.) Therefore, we need to seek a more efficient calculation scheme.

We discuss this only within a closed interval $[a,b]$. For two functions $f(x), g(x)$, define their inner product as $$\langle f, g\rangle=\int_a^b f(x)g(x)dx$$ If a sequence of functions $\{e_k (x)\},\, k=0,1,2,\dots$ forms a basis and satisfies $$\langle e_m, e_n\rangle=\delta_{mn}$$ (where $\delta_{mn}$ is the Kronecker delta), it is called an orthonormal basis. With this, we can describe the "optimal approximation" of a function.

Suppose there is a function $f(x)$ on the closed interval $[a,b]$, and we know that $f(x)$ can be uniformly approximated by the orthonormal basis $\{e_k (x)\},\, k=0,1,2,\dots$. Consider a sum of finitely many terms $$\sum_{k=0}^n \alpha_k e_k(x)$$ We want this finite series to approximate $f(x)$ optimally, which means we hope to minimize the error $$E=\int_a^b \left(f(x)-\sum_{k=0}^n \alpha_k e_k(x)\right)^2 dx$$ "Optimal" means adjusting $\alpha_0, \alpha_1, \dots, \alpha_n$ to minimize $E$. Since the series is finite, we can expand the square and use orthonormality: $$E = \langle f, f\rangle - 2\sum_{k=0}^n \alpha_k \langle f, e_k\rangle + \sum_{k=0}^n \alpha_k^2$$ Setting the partial derivative with respect to each $\alpha_k$ to zero, $\partial E/\partial\alpha_k = 2\alpha_k - 2\langle f, e_k\rangle = 0$, gives $$\alpha_k = \int_a^b f(x)e_k (x)dx$$ and since $E$ is an upward-opening quadratic in each $\alpha_k$, this is where $E$ is minimized. Thus, we obtain the optimal approximation. Note that in this case, moving from an $n$-th order approximation to an $(n+1)$-th order approximation only requires calculating the single additional term $\alpha_{n+1} = \int_a^b f(x)e_{n+1} (x)dx$, without recalculating all the previous coefficients.

Note that if $\{e_k (x)\}$ is an arbitrary sequence of functions (not a basis, or not even orthogonal), the optimization above can still be carried out, but there is no guarantee that the result truly approximates the original function; it may happen that no matter how much you increase $n$, the approximation error fails to decrease. However, if it has already been proven that $\{e_k (x)\}$ can uniformly approximate the original function, then the error of the optimization above is guaranteed to tend to 0: the computed result is already the optimal one, and if even the optimal error did not tend to 0, how could there be uniform convergence to the original function?
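Here is a small discrete sketch of this point (my own addition): if the basis is represented by the orthonormal columns of a matrix $Q$, the jointly optimal least-squares coefficients coincide with the one-at-a-time inner products, which is exactly why no recomputation is needed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Discrete analogue: the columns of Q form an orthonormal "basis" (Q^T Q = I),
# and the vector f plays the role of the function f(x).
Q, _ = np.linalg.qr(rng.standard_normal((100, 8)))
f = rng.standard_normal(100)

# Jointly optimal least-squares coefficients...
alpha_joint = np.linalg.lstsq(Q, f, rcond=None)[0]

# ...coincide with the independent inner products <f, e_k>; this is why going
# from order n to n + 1 only requires one new coefficient.
alpha_proj = Q.T @ f

print(np.allclose(alpha_joint, alpha_proj))   # True
```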

Fourier Series

The question now is: how do we obtain an orthonormal basis? From the Weierstrass Theorem, we already know that the power functions $1, x, x^2, \dots$ and the trigonometric functions $1, \sin x, \cos x, \sin 2x, \cos 2x, \dots$ can both serve as bases. Let's focus on the latter first. On the interval $[-\pi, \pi]$, one can verify that $1, \sin x, \cos x, \sin 2x, \cos 2x, \dots$ are orthogonal, i.e., for any natural numbers $m, n$: $$\left\{\begin{aligned}&\int_{-\pi}^{\pi} \cos mx \cos nx dx = 0\,( m\neq n)\\ &\int_{-\pi}^{\pi} \sin mx \sin nx dx = 0\,( m\neq n)\\ &\int_{-\pi}^{\pi} \sin mx \cos nx dx = 0 \end{aligned}\right.$$ And for $m \geq 1$ we have $$\int_{-\pi}^{\pi} \sin^2 mx dx = \int_{-\pi}^{\pi} \cos^2 mx dx = \pi$$ Therefore, $$\sqrt{\frac{1}{2\pi}}, \sqrt{\frac{1}{\pi}}\sin x, \sqrt{\frac{1}{\pi}}\cos x, \sqrt{\frac{1}{\pi}}\sin 2x, \sqrt{\frac{1}{\pi}}\cos 2x, \dots$$ is an orthonormal basis (note that the constant function normalizes to $1/\sqrt{2\pi}$ rather than $1/\sqrt{\pi}$). Then we have $$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos nx + \sum_{n=1}^{\infty} b_n \sin nx$$ where $$\begin{aligned}a_0=&\frac{1}{\pi}\int_{-\pi}^{\pi} f(x)dx,\\ a_n=&\frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos nx dx,\\ b_n=&\frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx dx \end{aligned}$$ This is the Fourier series. But because we are considering the full interval $[-\pi, \pi]$, the Fourier series in general only converges in measure or in the $L^2$ norm, rather than uniformly. On the interval $[-\pi/2, \pi/2]$, the set $1, \sin x, \cos x, \sin 2x, \cos 2x, \dots$ is not orthogonal and would first require an orthogonalization process.
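As an illustration (my own addition), the sketch below computes the partial sums for the test function $f(x)=x$, with the coefficients evaluated by Riemann-sum quadrature. The $L^2$ error decays, while the sup error stays stuck near the endpoints $\pm\pi$ (where the $2\pi$-periodic extension of $f$ jumps), matching the remark about non-uniform convergence.

```python
import numpy as np

xs = np.linspace(-np.pi, np.pi, 20001)   # quadrature / evaluation grid
dx = xs[1] - xs[0]
fx = xs                                  # test function f(x) = x, sampled on the grid

def partial_sum(N):
    """Fourier partial sum a_0/2 + sum_{n<=N} (a_n cos nx + b_n sin nx)."""
    s = np.full_like(xs, np.sum(fx) * dx / (2 * np.pi))      # the a_0/2 term
    for n in range(1, N + 1):
        a_n = np.sum(fx * np.cos(n * xs)) * dx / np.pi
        b_n = np.sum(fx * np.sin(n * xs)) * dx / np.pi
        s += a_n * np.cos(n * xs) + b_n * np.sin(n * xs)
    return s

for N in (1, 5, 25, 125):
    r = fx - partial_sum(N)
    print(f"N = {N:3d}   L2 error = {np.sqrt(np.sum(r**2) * dx):.4f}"
          f"   sup error = {np.max(np.abs(r)):.4f}")   # L2 decays; sup does not
```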

Orthogonal Polynomials

We have explored approximation by trigonometric series, which led us to Fourier series. How should we handle the other basis in the Weierstrass Theorem, the power function basis? The natural orthogonality of the trigonometric functions simplified the problem considerably, whereas the power functions are not orthogonal, so we must first carry out an orthogonalization process.

The most well-known orthogonalization procedure is the Gram-Schmidt process, which is described in basic linear algebra textbooks and will not be repeated here. Applying Gram-Schmidt orthogonalization to the basis $\{x^n\},\, n=0,1,2,\dots$ on the interval $[-1, 1]$, with each result conventionally rescaled so that $P_n(1)=1$, yields:

$$\begin{array}{c|l} n & P_n(x)\\ \hline 0 & 1\\ 1 & x\\ 2 & \frac{1}{2}(3x^2-1)\\ 3 & \frac{1}{2}(5x^3-3x)\\ 4 & \frac{1}{8}(35x^4-30x^2+3)\\ \vdots & \vdots \end{array}$$

These polynomials $P_n(x)$ are called "Legendre polynomials." They are orthogonal but not yet normalized: one can check that $\int_{-1}^1 P_n^2(x)dx = \frac{2}{2n+1}$, so the normalized versions are $\hat{P}_n(x)=\sqrt{\frac{2n+1}{2}}P_n(x)$. With these we can write a result similar to the Fourier series: $$f(x)=\sum_{n=0}^{\infty} \alpha_n \hat{P}_n (x)$$ where $$\alpha_n = \int_{-1}^1 f(x)\hat{P}_n (x) dx$$
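The table above can be reproduced mechanically; here is a short SymPy sketch (my own addition) that runs Gram-Schmidt on $1, x, x^2, \dots$ with the inner product on $[-1,1]$, rescaling each result so that $P_n(1)=1$ as in the table.

```python
import sympy as sp

x = sp.symbols('x')
inner = lambda u, v: sp.integrate(u * v, (x, -1, 1))   # the inner product <u, v>

basis = []
for n in range(5):
    p = x**n
    for q in basis:
        p -= inner(p, q) / inner(q, q) * q   # remove components along earlier P_k
    p = sp.expand(p / p.subs(x, 1))          # rescale to the convention P_n(1) = 1
    basis.append(p)
    print(f"P_{n}(x) =", p)
# Prints 1, x, 3*x**2/2 - 1/2, 5*x**3/2 - 3*x/2, 35*x**4/8 - 15*x**2/4 + 3/8
```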

Other Series Approximations

Trigonometric approximation can also be viewed as approximation with the complex exponentials $e^{ikx},\, k=0,\pm 1, \pm 2, \dots$ as a basis, since $e^{ikx}=\cos kx + i\sin kx$. A natural question is: can we approximate using the real exponentials $e^{kx}$ instead?

Clearly, based on the previous experience, this depends on whether we can use exponential functions to construct a Dirac delta sequence. Consider the hyperbolic cosine $\cosh x$: it is even and behaves like an upward-opening parabola near $x=0$ (indeed $\cosh x \approx 1 + x^2/2$), so $\cosh 1 - \cosh x$ plays the same role that $1-x^2$ did before. On the interval $[-1, 1]$ we may therefore take $$\delta_n (x) = \frac{(\cosh 1 - \cosh x)^n}{I_n},\quad I_n = \int_{-1}^1 (\cosh 1 - \cosh x)^n dx$$ with $\delta_n(x)=0$ outside $[-1,1]$. It is not difficult to obtain $\lim_{n\to\infty} \delta_n (x) = \delta (x)$. Thus, we can use the sequence of functions $$\frac{1}{I_n}\int_{-1}^1 f(y)[\cosh 1 - \cosh (x-y)]^n dy $$ to approximate $f(x)$. Expanding directly shows that this is a linear combination of $\cosh kx, \sinh kx$ for $k = 0,1,2,\dots,n$, i.e., a linear combination of $e^{kx},\, k=0,\pm 1, \pm 2, \dots, \pm n$. Therefore, real exponentials can indeed be used for approximation.
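As before, a small SymPy check (my own addition) verifies the two identities behind this expansion, shown here for the representative case $k = n = 2$: a power-reduction identity for $\cosh^2$ and the hyperbolic addition formula.

```python
import sympy as sp

t, x, y = sp.symbols('t x y', real=True)
is_zero = lambda e: sp.expand(e.rewrite(sp.exp)) == 0   # exact check via exponentials

# Power reduction: cosh^2 t = (1 + cosh 2t)/2, so (cosh 1 - cosh t)^n is a
# combination of cosh kt for k = 0, ..., n.
assert is_zero(sp.cosh(t)**2 - (1 + sp.cosh(2*t)) / 2)

# Addition formula: cosh k(x-y) = cosh kx cosh ky - sinh kx sinh ky, so after
# the y-integral only a combination of {cosh kx, sinh kx} remains, i.e. of
# e^{kx} with k = -n, ..., n.
assert is_zero(sp.cosh(2*(x - y)) - sp.cosh(2*x)*sp.cosh(2*y) + sp.sinh(2*x)*sp.sinh(2*y))

print("identities verified")
```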

A strange question then arises: since both complex and real exponentials can approximate, why is only complex exponential approximation (the Fourier series) widely studied? I suspect the main reason is that real exponential approximation is neither as elegant nor as practical. All of these approximations live on a finite interval, but a Fourier series is a periodic function, so studying one finite interval is equivalent to studying the whole real line. Real exponentials, in contrast, describe only the one interval and offer no help with global behavior; moreover, they contain both $e^x$ and $e^{-x}$ simultaneously, which diverge as $x\to+\infty$ and $x\to -\infty$ respectively, offering no advantage.