By 苏剑林 | December 07, 2017
This article introduces the method of characteristics for first-order partial differential equations as clearly and concisely as possible. In my view, this is one of the simpler parts of partial differential equation theory, yet it often causes confusion, so I attempt to present it here in my own words. More accurately, this serves as a personal memo.
Consider the partial differential equation \begin{equation}\boldsymbol{\alpha}(\boldsymbol{x},u) \cdot \frac{\partial}{\partial \boldsymbol{x}} u = \beta(\boldsymbol{x},u)\end{equation} where $\boldsymbol{\alpha}$ is an $n$-dimensional vector function, $\beta$ is a scalar function, $\cdot$ denotes the dot product of vectors, $u \equiv u(\boldsymbol{x})$ is a function of $n$ variables, and $\boldsymbol{x}$ represents the independent variables.
The idea of the method of characteristics is to imagine $\boldsymbol{x}$ as a function of some parameter $s$. In this case, $\boldsymbol{x}(s)$ actually represents the parametric equation of a high-dimensional curve, which is the so-called characteristic line. Thus, $u$ also becomes a function of the parameter $s$. We then have \begin{equation}\frac{du}{ds}=\frac{\partial u}{\partial \boldsymbol{x}}\cdot\frac{d\boldsymbol{x}}{ds}\end{equation} Comparing this to the original partial differential equation $(1)$, we find that we can let \begin{equation}\frac{d\boldsymbol{x}}{ds}=\boldsymbol{\alpha}(\boldsymbol{x},u)\end{equation} Then we have \begin{equation}\frac{du}{ds}=\beta(\boldsymbol{x},u)\end{equation} Combining $(3)$ and $(4)$, we obtain a system of $n+1$ ordinary differential equations. \begin{equation}\left\{\begin{aligned}&\frac{d\boldsymbol{x}}{ds}=\boldsymbol{\alpha}(\boldsymbol{x},u)\\ &\frac{du}{ds}=\beta(\boldsymbol{x},u)\end{aligned}\right.\end{equation} Since $s$ is merely an extra variable introduced for the method, in principle, we can solve for results independent of $s$: \begin{equation}\boldsymbol{c}=\boldsymbol{f}(\boldsymbol{x},u)\end{equation} where $\boldsymbol{c}$ is an $n$-dimensional vector representing the integration constants of the system of ODEs, and $\boldsymbol{f}$ is an $n$-dimensional vector function. The remaining task is to determine the relationship between these integration constants based on the initial conditions. Of course, if a general solution expression is required, it takes the form: \begin{equation}G(\boldsymbol{f}(\boldsymbol{x},u))=0\end{equation} where $G$ is an arbitrary function of $n$ variables. In principle, we can solve for the expression of $u$ with respect to $\boldsymbol{x}$ from this.
The steps above are somewhat theoretical. In practical problem-solving, one can be more flexible. Let us solve: \begin{equation}\frac{\partial u}{\partial t} + x \frac{\partial u}{\partial x} = u^2,\quad u(x,0)=f(x)\end{equation} We obtain the characteristic equations: \begin{equation}dt = \frac{dx}{x}=\frac{du}{u^2}\end{equation} Solving these yields: \begin{equation}x=C_1 e^t, u = \frac{1}{C_2-t}\end{equation} When $t=0$, we have $x=C_1$ and $u=\frac{1}{C_2}=f(C_1)$, which leads to $C_2 = \frac{1}{f(C_1)}$. Since we have $C_1 = xe^{-t}$ and $C_2 = u^{-1} + t$, substituting these in gives: \begin{equation}u^{-1} + t = \frac{1}{f(xe^{-t})}\end{equation} That is: \begin{equation}u = \frac{f(xe^{-t})}{1-t\times f(xe^{-t})} \end{equation}
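The result above can be checked numerically. The following is a minimal sketch, assuming a hypothetical initial profile $f(x)=\frac{1}{2}\sin x$ (chosen only for illustration) and using $t$ itself as the parameter along the characteristic. It integrates $\frac{dx}{dt}=x$, $\frac{du}{dt}=u^2$ with a hand-written Runge-Kutta step and compares the value of $u$ at the end of the characteristic with the closed-form expression.

```python
import math

def rk4_step(rhs, state, h):
    """One classical Runge-Kutta step for d(state)/dt = rhs(state)."""
    k1 = rhs(state)
    k2 = rhs([y + 0.5 * h * k for y, k in zip(state, k1)])
    k3 = rhs([y + 0.5 * h * k for y, k in zip(state, k2)])
    k4 = rhs([y + h * k for y, k in zip(state, k3)])
    return [y + h / 6.0 * (a + 2 * b + 2 * c + d)
            for y, a, b, c, d in zip(state, k1, k2, k3, k4)]

def f(x):
    # Hypothetical initial profile u(x, 0) = f(x); an assumption for this sketch.
    return 0.5 * math.sin(x)

def characteristic_rhs(state):
    # With t as the parameter: dx/dt = x, du/dt = u^2.
    x, u = state
    return [x, u * u]

def integrate_characteristic(x0, t_end, steps=1000):
    """Follow the characteristic starting at (x0, f(x0)) from t = 0 to t_end."""
    state = [x0, f(x0)]
    h = t_end / steps
    for _ in range(steps):
        state = rk4_step(characteristic_rhs, state, h)
    return state

def closed_form(x, t):
    """Explicit solution u = f(x e^{-t}) / (1 - t f(x e^{-t}))."""
    f0 = f(x * math.exp(-t))
    return f0 / (1.0 - t * f0)

x_end, u_end = integrate_characteristic(x0=1.0, t_end=1.0)
error = abs(u_end - closed_form(x_end, 1.0))
print(f"x = {x_end:.6f}, u = {u_end:.6f}, |error| = {error:.2e}")
```

The starting point and end time are chosen so that $1-t f$ stays positive along the path; for other choices the solution can blow up in finite time, and the integration would need to stop before that.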
What is the actual meaning of the characteristic line? For beginners, the process described above might seem like a magic trick (first solving for constants, then eliminating them), with no clear grasp of the underlying logic. This was my own confusion when I first learned the method of characteristics.
In fact, the characteristic line can be regarded as already being a "solution" to the partial differential equation, but only a solution along a single line, whereas the complete solution should be a high-dimensional surface. As the saying goes, "points move to form lines, and lines move to form surfaces." By making these lines "move," we obtain the equation of the surface. In other words, the integration constants $\boldsymbol{c}$ must "move."
Of course, they cannot move without constraints; if they moved randomly, they might cover the entire space. How they move is determined by the initial conditions. Therefore, we determine the constraint relationships between the integration constants based on the initial conditions. Once determined, we have already obtained the parametric equation of the surface. From a solving perspective, it isn't strictly necessary to eliminate the constants, but since we usually prefer explicit solutions, we find ways to eliminate them.
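For the example of equation $(8)$, this parametric form can be written down directly. Taking $C_1$ and $t$ as the two parameters, with $C_2 = \frac{1}{f(C_1)}$ fixed by the initial condition, we have $$x = C_1 e^t,\quad u = \frac{f(C_1)}{1 - t\,f(C_1)}$$ Each fixed $C_1$ gives one characteristic line, and letting $C_1$ vary sweeps out the solution surface in $(x,t,u)$ space. Eliminating $C_1 = xe^{-t}$ recovers the explicit solution $(12)$.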
This is roughly the logic of the entire process.
Most textbooks limit the introduction of the method of characteristics to quasi-linear partial differential equations. In fact, for a general first-order partial differential equation: \begin{equation}F\left(\boldsymbol{x}, u, \frac{\partial u}{\partial \boldsymbol{x}}\right)=0\end{equation} the method of characteristics is also applicable, where $F$ is an arbitrary function of multiple variables.
This part mainly follows the English Wikipedia article: https://en.wikipedia.org/wiki/Method_of_characteristics
For this purpose, we first denote: \begin{equation}\boldsymbol{p} = \frac{\partial u}{\partial\boldsymbol{x}}\end{equation} Then we differentiate $F\left(\boldsymbol{x}, u, \boldsymbol{p}\right)=0$ with respect to $s$, yielding: \begin{equation}\begin{aligned}0 =& \frac{\partial F}{\partial\boldsymbol{x}}\cdot\frac{d\boldsymbol{x}}{ds}+\frac{\partial F}{\partial u}\frac{\partial u}{\partial \boldsymbol{x}}\cdot\frac{d\boldsymbol{x}}{ds}+\frac{\partial F}{\partial\boldsymbol{p}}\cdot\frac{d\boldsymbol{p}}{ds}\\ &=\left(\frac{\partial F}{\partial\boldsymbol{x}}+\frac{\partial F}{\partial u}\boldsymbol{p}\right)\cdot\frac{d\boldsymbol{x}}{ds}+\frac{\partial F}{\partial\boldsymbol{p}}\cdot\frac{d\boldsymbol{p}}{ds} \end{aligned}\end{equation} We can see that the above consists of the dot product of two pairs of vectors summing to 0. An interesting solution is to let: \begin{equation}\frac{d\boldsymbol{x}}{ds}=\frac{\partial F}{\partial\boldsymbol{p}},\quad \frac{d\boldsymbol{p}}{ds}=-\frac{\partial F}{\partial\boldsymbol{x}}-\frac{\partial F}{\partial u}\boldsymbol{p}\end{equation} Additionally, we have: \begin{equation}\frac{du}{ds}=\frac{\partial u}{\partial\boldsymbol{x}}\cdot\frac{d\boldsymbol{x}}{ds}=\boldsymbol{p}\cdot\frac{\partial F}{\partial\boldsymbol{p}}\end{equation} Combining these, we obtain the system of ordinary differential equations: \begin{equation}\left\{\begin{aligned}&\frac{d\boldsymbol{x}}{ds}=\frac{\partial F}{\partial\boldsymbol{p}}\\ &\frac{d\boldsymbol{p}}{ds}=-\frac{\partial F}{\partial\boldsymbol{x}}-\frac{\partial F}{\partial u}\boldsymbol{p}\\ &\frac{du}{ds}=\boldsymbol{p}\cdot\frac{\partial F}{\partial\boldsymbol{p}}\\ &F\left(\boldsymbol{x}, u, \boldsymbol{p}\right)=0\end{aligned}\right.\end{equation} The subsequent steps are essentially the same as in the quasi-linear case, except that $n$ additional variables $\boldsymbol{p}$ are introduced. 
After solving this system, we obtain $2n$ integration constants (with respect to $s$), and then find the relationships between these constants based on the initial conditions. The difference is that, because there are $n$ extra variables $\boldsymbol{p}$, the partial derivatives of the initial conditions also need to be considered, which makes the solving process more complex. Please see the following example.
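As an illustration of these steps (an example chosen here, together with its notation, and checked only numerically), consider $u_{x_1}u_{x_2}=u$ with $u=x_2^2$ on $x_1=0$, so that $F=p_1p_2-u$. The characteristic system is $\frac{dx_1}{ds}=p_2$, $\frac{dx_2}{ds}=p_1$, $\frac{dp_1}{ds}=p_1$, $\frac{dp_2}{ds}=p_2$, $\frac{du}{ds}=2p_1p_2$. At the initial point $(0,a)$ we have $u=a^2$, the initial condition gives $p_2=2a$, and $F=0$ then forces $p_1=\frac{a}{2}$. Solving and eliminating $a$ and $s$ gives $u=\frac{(x_1+4x_2)^2}{16}$, which can be verified by direct substitution. The sketch below integrates the system numerically and compares against this expression.

```python
import math

def rk4_step(rhs, state, h):
    """One classical Runge-Kutta step for d(state)/ds = rhs(state)."""
    k1 = rhs(state)
    k2 = rhs([y + 0.5 * h * k for y, k in zip(state, k1)])
    k3 = rhs([y + 0.5 * h * k for y, k in zip(state, k2)])
    k4 = rhs([y + h * k for y, k in zip(state, k3)])
    return [y + h / 6.0 * (a + 2 * b + 2 * c + d)
            for y, a, b, c, d in zip(state, k1, k2, k3, k4)]

def characteristic_rhs(state):
    # Characteristic system for F(x, u, p) = p1 * p2 - u.
    x1, x2, p1, p2, u = state
    return [p2,             # dx1/ds = dF/dp1
            p1,             # dx2/ds = dF/dp2
            p1,             # dp1/ds = -dF/dx1 - (dF/du) * p1 = p1
            p2,             # dp2/ds = -dF/dx2 - (dF/du) * p2 = p2
            2.0 * p1 * p2]  # du/ds  = p . dF/dp

def initial_state(a):
    # On x1 = 0: u = a^2 and p2 = d(a^2)/da = 2a; F = 0 then gives p1 = a / 2.
    return [0.0, a, 0.5 * a, 2.0 * a, a * a]

def integrate(a, s_end, steps=1000):
    """Follow the characteristic through (0, a) from s = 0 to s_end."""
    state = initial_state(a)
    h = s_end / steps
    for _ in range(steps):
        state = rk4_step(characteristic_rhs, state, h)
    return state

def closed_form(x1, x2):
    """Candidate explicit solution u = (x1 + 4 x2)^2 / 16."""
    return (x1 + 4.0 * x2) ** 2 / 16.0

x1, x2, p1, p2, u = integrate(a=1.0, s_end=0.5)
print(f"u along characteristic = {u:.8f}, closed form = {closed_form(x1, x2):.8f}")
print(f"constraint F = {p1 * p2 - u:.2e}")
```

Note that the constraint $F=0$ is only imposed at the starting point; the characteristic equations then preserve it along the whole curve, which the second printed value illustrates.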
In terms of form, there is a large difference between the method of characteristics for the general case and the quasi-linear case. Can the general method be reduced to the quasi-linear case?
In fact, by substituting $F=\boldsymbol{\alpha}\cdot\boldsymbol{p}-\beta$ into equation $(18)$, one can obtain: \begin{equation}\left\{\begin{aligned}&\frac{d\boldsymbol{x}}{ds}=\boldsymbol{\alpha}\\ &\frac{du}{ds}=\boldsymbol{p}\cdot\boldsymbol{\alpha}\end{aligned}\right.\end{equation} The equation regarding $\boldsymbol{p}$ does not need to be written out because we have $\boldsymbol{p}\cdot\boldsymbol{\alpha}=\beta$. Substituting this into the equation above yields a closed system of equations. This effectively reduces to equation $(5)$.
Additionally, a particularly simple case is when $F$ is a function only of $\boldsymbol{p}$. In this situation, both $\frac{\partial F}{\partial\boldsymbol{x}}$ and $\frac{\partial F}{\partial u}$ are 0. Thus, in the characteristic equations, $\boldsymbol{p}$ is constant, which means $\boldsymbol{x}$ and $u$ are both linear functions of $s$, making the entire system fully solvable. After that, it becomes a purely algebraic problem.
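To spell this out, write $\boldsymbol{p}_0$ for the constant value of $\boldsymbol{p}$, and $\boldsymbol{x}_0$, $u_0$ for the values at $s=0$ (notation introduced here only for illustration). The characteristic equations then integrate to $$\boldsymbol{x} = \boldsymbol{x}_0 + s\left.\frac{\partial F}{\partial\boldsymbol{p}}\right|_{\boldsymbol{p}_0},\quad u = u_0 + s\,\boldsymbol{p}_0\cdot\left.\frac{\partial F}{\partial\boldsymbol{p}}\right|_{\boldsymbol{p}_0},\quad F(\boldsymbol{p}_0)=0$$ The characteristics are straight lines, and what remains is to relate $\boldsymbol{x}_0$, $u_0$ and $\boldsymbol{p}_0$ through the initial conditions and eliminate them.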
Can the above techniques be extended to systems of first-order partial differential equations? In general, no: any higher-order PDE can be rewritten as a system of first-order PDEs, so a general method for such systems would amount to a general method for arbitrary higher-order PDEs. No such method is known (if it were possible, someone would have done it long ago).
However, if the partial derivative operators in the system are identical, the method of characteristics is still applicable. Specifically, consider the system: \begin{equation}\left(\boldsymbol{\alpha}(\boldsymbol{x},\boldsymbol{u}) \cdot \frac{\partial}{\partial \boldsymbol{x}}\right) \boldsymbol{u} = \boldsymbol{\beta}(\boldsymbol{x},\boldsymbol{u})\end{equation} In this case, $\boldsymbol{u}$ is also a vector, but the partial derivative operator on the left is shared by all components, while the components of $\boldsymbol{\beta}$ on the right can differ. Here, the method of characteristics can also be used to obtain: \begin{equation}\left\{\begin{aligned}&\frac{d\boldsymbol{x}}{ds}=\boldsymbol{\alpha}(\boldsymbol{x},\boldsymbol{u})\\ &\frac{d\boldsymbol{u}}{ds}=\boldsymbol{\beta}(\boldsymbol{x},\boldsymbol{u})\end{aligned}\right.\end{equation} Of course, this is merely a trivial generalization of the original method.
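As a small illustration (an example constructed here, not taken from any reference), take two unknowns $u$ and $v$ of $(x,t)$ sharing the operator $\frac{\partial}{\partial t}+x\frac{\partial}{\partial x}$: $$\frac{\partial u}{\partial t}+x\frac{\partial u}{\partial x}=v,\quad \frac{\partial v}{\partial t}+x\frac{\partial v}{\partial x}=-u,\quad u(x,0)=f(x),\quad v(x,0)=g(x)$$ Using $t$ as the parameter, the characteristic system is $\frac{dx}{dt}=x$, $\frac{du}{dt}=v$, $\frac{dv}{dt}=-u$, with solutions $x=Ce^t$, $u=A\cos t+B\sin t$, $v=-A\sin t+B\cos t$. The initial conditions give $A=f(C)$ and $B=g(C)$, and eliminating $C=xe^{-t}$ yields $$u=f(xe^{-t})\cos t+g(xe^{-t})\sin t,\quad v=-f(xe^{-t})\sin t+g(xe^{-t})\cos t$$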