By 苏剑林 | November 16, 2016
Anyone who has studied real analysis knows about Lebesgue integration, which is billed as an improved version of Riemann integration. There is a saying that "study real analysis ten times over, and functional analysis will still chill your heart": while learning real analysis we are usually in a fog, but by the end, under our teachers' relentless drilling, we do become familiar with certain conclusions, such as "a function that is Riemann integrable (on a finite interval) is also Lebesgue integrable." In short, "Lebesgue integration is stronger than Riemann integration." So here comes the question: exactly where is it stronger, and why?
[Images: Riemann and Lebesgue]
I didn't fully understand this question while studying real analysis and left it aside for a long time. It wasn't until recently, after carefully reading Revisiting Calculus, that I began to get a sense of it. By the way, Professor Qi Minyou's Revisiting Calculus is truly excellent and well worth reading.
Readers who have studied real analysis know that, as two different theories of integration, the most visible difference between Lebesgue integration and Riemann integration is this: Riemann integration partitions the domain, while Lebesgue integration partitions the range. At first glance it looks as if Lebesgue is deliberately doing the opposite of Riemann: you say partition the domain, I insist on partitioning the range.
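To make the contrast concrete, here is the usual schematic form of the two approximating sums (the notation is mine, not from the original post):
\[ \text{Riemann:}\quad \int_a^b f(x)\,dx \approx \sum_{i} f(\xi_i)\,(x_{i+1}-x_i), \qquad \xi_i \in [x_i, x_{i+1}], \]
\[ \text{Lebesgue:}\quad \int_a^b f(x)\,dx \approx \sum_{j} y_j\, m\big(\{x \in [a,b] : y_j \le f(x) < y_{j+1}\}\big), \]
where the first sum chops the interval $[a,b]$ on the $x$-axis into pieces $[x_i, x_{i+1}]$, while the second chops the $y$-axis into levels $y_j$ and asks how "large" the set of points with values in each level is; here $m(\cdot)$ denotes the "total length" (measure) of a set, a notion made precise further below.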
So, what is the truth? Is it really just "born of the same root, why so eager to torment each other"? Not at all. Partitioning the range indeed helps improve upon the deficiencies of Riemann integration. Why is partitioning the range stronger than partitioning the domain? The popular explanation is this: Riemann integration partitions the domain; however, for functions that oscillate wildly, even if the partition is very fine, the oscillation remains severe within a tiny interval (a typical example is the Dirichlet function). In such cases, Riemann integration cannot be defined. In other words, Riemann integration is suitable for locally smooth functions. In contrast, Lebesgue integration partitions the range. Thus, within a small interval of the range, there won't be large oscillations because the range itself is restricted, giving the function no chance to oscillate wildly. Therefore, it can integrate functions that oscillate very severely.
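The Dirichlet function mentioned above makes this concrete. Written out (a standard textbook example, spelled out here for reference):
\[ D(x) = \begin{cases} 1, & x \in \mathbb{Q}, \\ 0, & x \notin \mathbb{Q}, \end{cases} \qquad x \in [0,1]. \]
On every subinterval, however small, $D$ takes both the value $1$ (at rational points) and $0$ (at irrational points), so every upper Riemann sum equals $1$ and every lower Riemann sum equals $0$: the Riemann integral is undefined. Partitioning the range instead, $D$ takes only the two values $0$ and $1$, and its Lebesgue integral is $1 \cdot m(\mathbb{Q}\cap[0,1]) + 0 \cdot m([0,1]\setminus\mathbb{Q}) = 0$, since the rationals have measure zero (as discussed below).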
Lebesgue spoke up: "What you’re saying is somewhat on track, but you haven't yet reached my original intention. First, let me declare that I'm not trying to go against the great Riemann..." (This is purely my own fiction ~ ^_^)
In fact, the shortcoming of Riemann integration can be traced back two thousand years to the "method of exhaustion" of Ancient Greece. To find the area of a circle, for example, one used circumscribed and inscribed regular $n$-sided polygons to obtain upper and lower bounds for the area, and then let $n$ grow to find that the two bounds converge to the same value, which pins down the area of the circle. In other words, to find the area of an irregular shape, you cut it up, approximate each piece with a familiar shape (a rectangle or a triangle), compute the approximate sum, and finally take the limit as the partition is refined. The whole process is: finite partition, approximate sum, then a limit.
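For a circle of radius $r$, the two bounds can be written down explicitly (a standard computation; the radius $r$ and the formulas are my additions for concreteness):
\[ \underbrace{\tfrac{n}{2}\, r^2 \sin\tfrac{2\pi}{n}}_{\text{inscribed } n\text{-gon}} \;\le\; \text{area of the circle} \;\le\; \underbrace{n\, r^2 \tan\tfrac{\pi}{n}}_{\text{circumscribed } n\text{-gon}}, \]
and as $n \to \infty$ both sides tend to $\pi r^2$, which is how the exhaustion argument determines the area.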
The problem lies in the "finite partition"! To get an upper bound on a shape's area, we cover it with finitely many simple shapes; to get a lower bound, we place finitely many simple shapes inside it. Either way, everything stays "finite." This finite approach works well for intervals but breaks down for general point sets. Consider, for example, the set of rational numbers in $[0,1]$: picture all the rational points of the interval $[0,1]$ on the number line and ask what its "length" should be. Readers who have studied set theory know that the set of real numbers in $[0,1]$ is uncountable while the rationals are countable; in that sense there are far more reals than rationals. So if we take the length of the set of all real points in $[0,1]$ to be $1$, it is natural to expect the length of the set of all rational points in $[0,1]$ to be $0$. The finite partitions used in Riemann-style reasoning, however, cannot reach this conclusion.
If we cover all the rationals in $[0,1]$ with finitely many intervals (covering gives an upper estimate), then, because the rationals are dense, the total length of those finitely many intervals cannot be less than $1$; that is, the upper estimate for the "total length" of the rationals in $[0,1]$ is at least $1$. On the other hand, every interval, no matter how short, contains irrational numbers, so the rationals contain no interval at all; from this side, the lower estimate cannot exceed $0$. The two estimates never meet, so no length can be assigned. This is the trouble caused by finite partitioning.
Thus Lebesgue made a very clever move: from the very start he abandoned finite partitions and defined measure using countable ones. I won't use the rigorous language of the textbooks here, only the general idea, in the one-dimensional case: for a subset of the real line, if it can be covered by countably many intervals, the total length of those intervals gives an upper estimate, and the best such estimate is its outer measure; if countably many disjoint intervals can be placed inside it, their total length gives a lower estimate, and the best such estimate is its inner measure (strictly speaking, the textbook defines the inner measure through the complement). If the outer and inner measures coincide, the set is measurable, and this common value is its measure.
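In symbols, and still informally (this paraphrase and the notation $m^*$ are mine): for a set $E \subseteq \mathbb{R}$,
\[ m^*(E) = \inf\Big\{ \sum_{k=1}^{\infty} |I_k| \;:\; E \subseteq \bigcup_{k=1}^{\infty} I_k,\ \ I_k \text{ intervals} \Big\} \]
is the outer measure, the cheapest countable cover of $E$; the inner estimate is built symmetrically, and $E$ is measurable when the two agree.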
Actually, once we have countable partitioning, we can attempt to improve Riemann integration. Because with countable partitioning, we can divide a function into several parts to handle, such as normal points and abnormal points. The abnormal parts can further be divided into first-class abnormal, second-class abnormal, etc., and processed one by one. Countable partitioning is their foundation. For example, if we want to pick out all rational points and keep the total length of the intervals arbitrarily small, we must use countably many intervals; a finite number of intervals cannot do it.
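Here is the classic computation (a standard argument, spelled out for concreteness): enumerate the rationals in $[0,1]$ as $q_1, q_2, q_3, \dots$ and, for any $\varepsilon > 0$, cover the $k$-th one with an interval of length $\varepsilon/2^k$:
\[ \mathbb{Q}\cap[0,1] \;\subseteq\; \bigcup_{k=1}^{\infty} \Big(q_k - \frac{\varepsilon}{2^{k+1}},\ q_k + \frac{\varepsilon}{2^{k+1}}\Big), \qquad \sum_{k=1}^{\infty} \frac{\varepsilon}{2^k} = \varepsilon. \]
Since $\varepsilon$ is arbitrary, the outer measure of the rationals in $[0,1]$ is $0$; with only finitely many intervals, as we saw, the total length could never drop below $1$.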
The question arises: how do we distinguish whether a point is normal or abnormal? For Riemann integration, abnormal points are those where oscillation is very strong. In other words, we must consider the range! This is where Lebesgue's other clever insight manifests: instead of distinguishing the abnormal points in Riemann integration one by one, simply partition the range directly.
At first glance, partitioning the range looks unintuitive and even clumsy, because it makes the corresponding subsets of the domain complicated. In fact this is precisely the clever move: once measure via countable covers is in place, even very complicated subsets of the domain can be measured. Combining "countable partitioning" with "partitioning the range," the theory of Lebesgue integration takes shape; the rest is technical detail. Incidentally, because it is built on countable partitions, Lebesgue integration naturally enjoys countable additivity, whereas Riemann integration only has finite additivity.
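In symbols, for pairwise disjoint measurable sets $E_1, E_2, E_3, \dots$ (a standard statement, added here for reference):
\[ m\Big(\bigcup_{k=1}^{\infty} E_k\Big) = \sum_{k=1}^{\infty} m(E_k), \]
whereas the Riemann/Jordan notion of "content" guarantees the corresponding identity only for finitely many pieces.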
From the discussion in this section, we can see that Lebesgue measure and Lebesgue integration are designed to cope with countably many "abnormal" points (such as discontinuities). It is therefore natural to expect that if the abnormality is pushed up to an uncountable level, even Lebesgue measure and Lebesgue integration can break down, and indeed, to construct a set that is not Lebesgue measurable, one has to exploit uncountability. The Vitali sets, which are not Lebesgue measurable, are built exactly this way: starting from the fact that the reals are uncountable while the rationals are countable, one uses the rationals to split the reals of $[0,1]$ into countably many pieces, each of them uncountable and all of them "congruent" (translates of one another). But then what is the measure of each piece? If it is $0$, how can countably many zeros add up to $1$? If it is positive, countably many copies of the same positive number cannot add up to $1$ either. Either way, countable additivity is violated, so such a piece cannot be assigned a measure at all.
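For readers who want the skeleton in symbols (a compressed form of the standard argument; the set $V$ and the symbol $\oplus$ for addition modulo 1 are my notation): call two numbers in $[0,1)$ equivalent when their difference is rational, choose one representative from each equivalence class to form $V$, and translate $V$ modulo 1 by every rational $q \in [0,1)$. These countably many translates are pairwise disjoint, are each obtained from $V$ by a measure-preserving shift, and together they exhaust $[0,1)$:
\[ [0,1) = \bigcup_{q \in \mathbb{Q}\cap[0,1)} (V \oplus q), \qquad\text{so countable additivity would force}\qquad 1 = \sum_{q \in \mathbb{Q}\cap[0,1)} m(V). \]
The right-hand side is $0$ if $m(V)=0$ and $+\infty$ if $m(V)>0$; either way it cannot equal $1$, so $V$ has no Lebesgue measure.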
From this perspective, Lebesgue integration is indeed stronger than Riemann integration, because Lebesgue integration was originally designed specifically to target the "Achilles' heel" of Riemann integration.
In life, we often encounter examples where if the purpose is too strong, it results in being too clever for one's own good. Since Lebesgue integration is so pointedly designed to "counter" Riemann integration, will it conversely have some weaknesses that Riemann integration does not have?
First, an obvious point: the linearity of the integral is no longer apparent. In treatments of Lebesgue integration, considerable space is spent proving that \[ \int_a^b f(x)dx + \int_a^b g(x)dx=\int_a^b [f(x)+g(x)]dx, \] essentially because the level sets of $f+g$ bear no simple relation to those of $f$ and $g$. In Riemann integration this identity is almost immediate from the definition. This is probably one reason why introductory calculus always defines the integral the Riemann way: it is the intuitive one.
Furthermore, there are some insurmountable shortcomings. Using the Dirichlet function as inspiration, it is easy to construct examples that are Riemann non-integrable but Lebesgue integrable. Is there a reverse case? The answer is yes!
The problem can be traced back to the "countable partitioning" at the heart of Lebesgue integration. Lebesgue integration approximates with countably many pieces at once and then adds up the contributions of all of them, and these countably many pieces come in no particular order, so the sum must not depend on the order of summation. But a series whose sum is independent of the order of its terms is exactly an absolutely convergent series. So one can already sense that a Lebesgue integral must be absolutely convergent (in the sense of Riemann integration).
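This intuition matches the formal setup (the standard definition, stated here for reference). For a measurable function $f$, write $f^+=\max(f,0)$ and $f^-=\max(-f,0)$; the Lebesgue integral is defined as
\[ \int f\,dm = \int f^+\,dm - \int f^-\,dm, \]
and $f$ counts as Lebesgue integrable only when both terms are finite, that is, only when $\int |f|\,dm = \int f^+\,dm + \int f^-\,dm < \infty$. Lebesgue integrability of $f$ is thus the same thing as Lebesgue integrability of $|f|$.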
Yet many integrals that arise in practice are not absolutely convergent. For example, \[ \int_{-\infty}^{+\infty} \frac{\sin x}{x}dx \] converges only conditionally, not absolutely, and is therefore not integrable in the Lebesgue sense! As an improper Riemann integral, interpreted as \[ \lim_{N\to\infty}\int_{-N}^{+N} \frac{\sin x}{x}dx, \] it has a perfectly meaningful value (namely $\pi$). There is some irony here: "countable partitioning" was introduced to fix the weaknesses of Riemann integration, and here it becomes a weakness of its own. Extremes breed their opposites, it seems, and there is no cure-all. The probability example below makes the same point from another angle.
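A quick numerical check makes the contrast visible. The sketch below (my own illustration in Python; the function name, the midpoint rule, and the cutoffs $N$ are arbitrary choices, not anything from the original post) computes the truncated integrals of $\sin x / x$ and $|\sin x / x|$ over $[-N, N]$: the signed value settles near $\pi$, while the absolute value keeps growing without bound.

```python
import numpy as np

def truncated_integrals(N, n=400_000):
    """Midpoint-rule estimates of the integrals of sin(x)/x and |sin(x)/x| over [-N, N]."""
    h = N / n
    x = (np.arange(n) + 0.5) * h          # midpoints in (0, N); never hits x = 0
    y = np.sin(x) / x
    # Both integrands are even, so integrate over [0, N] and double.
    return 2 * h * y.sum(), 2 * h * np.abs(y).sum()

for N in (10, 100, 1_000, 10_000):
    signed, absolute = truncated_integrals(N)
    print(f"N = {N:6d}   signed ~ {signed:.4f}   absolute ~ {absolute:.1f}")

# Typical behaviour: the signed column approaches pi ~ 3.1416, while the
# absolute column grows without bound (roughly logarithmically in N), so the
# integral converges only conditionally and has no Lebesgue integral.
```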
Let's talk about probability theory and consider the following question:
If you randomly pick a number from the natural numbers, what is the probability of picking 1?
Obviously 0? Our intuition says 0, but under the modern axiomatic definition this probability cannot even be defined: there is no such thing as "picking a natural number uniformly at random"!
Let's look at the axiomatic definition of probability:
A function P, defined on events A, is called a probability if it satisfies the following conditions:
1. Non-negativity: $P(A) \ge 0$ for every event A;
2. Normalization: $P(S) = 1$ for the certain event S (the whole sample space);
3. Countable additivity: for countably many pairwise mutually exclusive events, the probability of their union equals the sum of their individual probabilities.
The first two points aren't the issue; the key is the third point: countable additivity! If we believe the probability of picking 1 is 0, then the probability of picking 2 is also 0, and the probability of picking any specific natural number is 0. Since the sum of countably many 0s is still 0, the probability of picking a natural number from the set of natural numbers would be 0 rather than 1! In other words, countable additivity is not satisfied.
From this point of view, it seems one cannot talk about probability on a countable set. So, those mathematicians in the branch of number theory called "probabilistic number theory" are just talking nonsense... is that really the case?
Where does the axiomatic definition of probability, and countable additivity in particular, come from? Compare it with the definition of Lebesgue measure and you will find that, terminology aside, the two are nearly identical: countable additivity is a requirement on measures too. In fact, a probability defined this way is simply a (normalized) measure; one might say the definition of probability was copied from the definition of measure. And the example above shows that countable additivity is something we love and hate at once.
However, we still feel it is necessary to discuss probability within the set of natural numbers. What should we do? Probabilistic number theory says it this way: we first discuss it within the range of natural numbers not exceeding $N$, and then let $N$ go to positive infinity. Well, doesn't this just return to the Riemann integration method of finite partitioning and then taking the limit...
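This "truncate at $N$, then let $N$ tend to infinity" recipe is what number theorists call natural density (a standard definition, quoted here for concreteness):
\[ d(A) = \lim_{N\to\infty} \frac{\#\{n \in A : n \le N\}}{N}. \]
The even numbers have density $1/2$, and any single number, indeed any finite set, has density $0$; but density is only finitely additive, not countably additive, which is exactly the Riemann-flavoured compromise described above.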
It seems that it's really impossible to "overthrow Riemann"...