By 苏剑林 | June 05, 2021
In this article, we discuss a practical linear algebra problem:
Given two $d$-dimensional unit (column) vectors $\boldsymbol{a}, \boldsymbol{b}$, find an orthogonal matrix $\boldsymbol{T}$ such that $\boldsymbol{b} = \boldsymbol{T}\boldsymbol{a}$.
Since the two vectors have the same magnitude, it is clear that such an orthogonal matrix must exist. So, how do we find it?
Two Dimensions
It is not hard to see that this is essentially a problem of transforming a vector (by a rotation or a reflection) within the two-dimensional plane spanned by $\boldsymbol{a}$ and $\boldsymbol{b}$. Therefore, let us first consider the case $d=2$.

[Figure: schematic of the orthogonal decomposition]
As shown in the figure above, through orthogonal decomposition we obtain a vector $\boldsymbol{b} - \boldsymbol{a}\cos\theta$ that is perpendicular to $\boldsymbol{a}$. Normalizing it gives an orthonormal basis:
\begin{equation}\boldsymbol{Q} = \begin{pmatrix}\boldsymbol{a} & \frac{\boldsymbol{b} - \boldsymbol{a}\cos\theta}{\Vert \boldsymbol{b} - \boldsymbol{a}\cos\theta\Vert}\end{pmatrix}\end{equation}
where $\theta$ is the angle between $\boldsymbol{a}$ and $\boldsymbol{b}$. In this coordinate basis, the coordinates of $\boldsymbol{a}$ are $(1,0)$ and the coordinates of $\boldsymbol{b}$ are $(\cos\theta, \sin\theta)$, i.e.,
\begin{equation}\boldsymbol{a}=\boldsymbol{Q}\begin{pmatrix}1 \\ 0\end{pmatrix},\quad \boldsymbol{b}=\boldsymbol{Q}\begin{pmatrix}\cos\theta \\ \sin\theta\end{pmatrix}\end{equation}
Therefore,
\begin{equation}\boldsymbol{b}=\boldsymbol{Q}\boldsymbol{R}\begin{pmatrix}1 \\ 0\end{pmatrix}=\boldsymbol{Q}\boldsymbol{R}\boldsymbol{Q}^{\top}\boldsymbol{a}\label{eq:ba}\end{equation}
Here, there are two choices for $\boldsymbol{R}$:
\begin{equation}\boldsymbol{R}=\begin{pmatrix}\cos\theta & -\sin\theta \\ \sin\theta & \cos\theta\end{pmatrix}\quad \text{or} \quad\boldsymbol{R}=\begin{pmatrix}\cos\theta & \sin\theta \\ \sin\theta & -\cos\theta\end{pmatrix}\end{equation}
From a geometric perspective, the former corresponds to a rotation of the vector, while the latter corresponds to a reflection. The two choices lead to slightly different final results, but from a purely mathematical standpoint, both are orthogonal matrices that satisfy the requirements. Equation $\eqref{eq:ba}$ then implies that the sought orthogonal matrix is:
\begin{equation}\boldsymbol{T}=\boldsymbol{Q}\boldsymbol{R}\boldsymbol{Q}^{\top}\end{equation}
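As a quick numerical check (with illustrative vectors of our own choosing), the following sketch builds $\boldsymbol{Q}$ and both choices of $\boldsymbol{R}$ and confirms that $\boldsymbol{T}=\boldsymbol{Q}\boldsymbol{R}\boldsymbol{Q}^{\top}$ is orthogonal and maps $\boldsymbol{a}$ to $\boldsymbol{b}$:

import numpy as np

a = np.array([1.0, 0.0])  # illustrative unit vectors
b = np.array([0.6, 0.8])

cos_t = a.dot(b)
sin_t = np.sqrt(1 - cos_t**2)

# Q = [a, (b - a*cos(theta)) / ||b - a*cos(theta)||]
u = b - a * cos_t
u /= np.linalg.norm(u)
Q = np.stack([a, u], axis=1)

R_rotation = np.array([[cos_t, -sin_t], [sin_t, cos_t]])
R_reflection = np.array([[cos_t, sin_t], [sin_t, -cos_t]])

for R in (R_rotation, R_reflection):
    T = Q.dot(R).dot(Q.T)
    assert np.allclose(T.dot(T.T), np.eye(2))  # T is orthogonal
    assert np.allclose(T.dot(a), b)            # T maps a to b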
Higher Dimensions
For readers who have followed the above process, the logic for the higher-dimensional case is already clear: we similarly choose an orthonormal basis first and then reduce the problem to a simpler one. When $d > 2$, $\boldsymbol{a}$ and $\frac{\boldsymbol{b} - \boldsymbol{a}\cos\theta}{\Vert \boldsymbol{b} - \boldsymbol{a}\cos\theta\Vert}$ alone are not enough to form a complete basis. However, in principle we can find $\boldsymbol{e}_3, \cdots, \boldsymbol{e}_d$ such that
\begin{equation}\tilde{\boldsymbol{Q}} = \begin{pmatrix}\boldsymbol{a} & \frac{\boldsymbol{b} - \boldsymbol{a}\cos\theta}{\Vert \boldsymbol{b} - \boldsymbol{a}\cos\theta\Vert} & \boldsymbol{e}_3 & \cdots & \boldsymbol{e}_d\end{pmatrix} = \begin{pmatrix}\boldsymbol{Q} & \boldsymbol{E}\end{pmatrix}\end{equation}
forms an orthonormal basis, where
\begin{equation}\boldsymbol{Q}=\begin{pmatrix}\boldsymbol{a} & \frac{\boldsymbol{b} - \boldsymbol{a}\cos\theta}{\Vert \boldsymbol{b} - \boldsymbol{a}\cos\theta\Vert} \end{pmatrix}\in\mathbb{R}^{d\times 2},\quad\boldsymbol{E}=\begin{pmatrix}\boldsymbol{e}_3 & \cdots & \boldsymbol{e}_d\end{pmatrix}\in\mathbb{R}^{d\times (d-2)}\end{equation}
In this case,
\begin{equation}\boldsymbol{a}=\tilde{\boldsymbol{Q}}\begin{pmatrix}1 \\ 0 \\ 0 \\ \vdots \\ 0\end{pmatrix},\quad \boldsymbol{b}=\tilde{\boldsymbol{Q}}\begin{pmatrix}\cos\theta \\ \sin\theta \\ 0 \\ \vdots \\ 0\end{pmatrix}=\tilde{\boldsymbol{Q}}\begin{pmatrix} \boldsymbol{R} & \boldsymbol{0}_{2\times(d-2)} \\ \boldsymbol{0}_{(d-2)\times 2} & \boldsymbol{I}_{(d-2)\times(d-2)}\end{pmatrix}\begin{pmatrix}1 \\ 0 \\ 0 \\ \vdots \\ 0\end{pmatrix}\end{equation}
The definition of $\boldsymbol{R}$ remains the same as before. Thus, the matrix we are looking for is:
\begin{equation}\boldsymbol{T}=\tilde{\boldsymbol{Q}}\begin{pmatrix} \boldsymbol{R} & \boldsymbol{0}_{2\times(d-2)} \\ \boldsymbol{0}_{(d-2)\times 2} & \boldsymbol{I}_{(d-2)\times(d-2)}\end{pmatrix}\tilde{\boldsymbol{Q}}^{\top}\label{eq:final-1}\end{equation}
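This construction can also be checked numerically. The following sketch (with illustrative random unit vectors, and assuming $\boldsymbol{a}$ and $\boldsymbol{b}$ are not parallel, so $\sin\theta \neq 0$) completes $\boldsymbol{Q}$ to a full orthonormal basis $\tilde{\boldsymbol{Q}}$ using np.linalg.qr and forms $\boldsymbol{T}$ as in $\eqref{eq:final-1}$:

import numpy as np

rng = np.random.default_rng(0)  # illustrative random vectors
d = 5
a = rng.standard_normal(d)
a /= np.linalg.norm(a)
b = rng.standard_normal(d)
b /= np.linalg.norm(b)

cos_t = a.dot(b)
sin_t = np.sqrt(1 - cos_t**2)  # assumes a and b are not (anti)parallel

# Q = [a, (b - a*cos(theta)) / ||b - a*cos(theta)||], a d x 2 matrix
u = b - a * cos_t
u /= np.linalg.norm(u)
Q = np.stack([a, u], axis=1)

# Complete Q to a full orthonormal basis via QR on [Q | random columns]
M = np.concatenate([Q, rng.standard_normal((d, d - 2))], axis=1)
Q_full, _ = np.linalg.qr(M)
Q_full[:, :2] = Q  # qr may flip signs of the first two columns; restore Q exactly

# Middle factor: R on the (a, u)-plane, identity on the complement
mid = np.eye(d)
mid[:2, :2] = np.array([[cos_t, -sin_t], [sin_t, cos_t]])

T = Q_full.dot(mid).dot(Q_full.T)
assert np.allclose(T.dot(T.T), np.eye(d))  # T is orthogonal
assert np.allclose(T.dot(a), b)            # T maps a to b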
Simplification
Expanding the matrix $\eqref{eq:final-1}$ in terms of the blocks $\boldsymbol{Q}, \boldsymbol{R}, \boldsymbol{E}$ gives $\boldsymbol{Q}\boldsymbol{R}\boldsymbol{Q}^{\top}+\boldsymbol{E}\boldsymbol{E}^{\top}$. Since $\tilde{\boldsymbol{Q}}\tilde{\boldsymbol{Q}}^{\top}=\boldsymbol{I}_{d\times d}$, we have $\boldsymbol{E}\boldsymbol{E}^{\top}=\boldsymbol{I}_{d\times d} - \boldsymbol{Q}\boldsymbol{Q}^{\top}$. Therefore, the transformation $\eqref{eq:final-1}$ can finally be written as:
\begin{equation}\boldsymbol{T}=\boldsymbol{Q}\boldsymbol{R}\boldsymbol{Q}^{\top}+\boldsymbol{I}_{d\times d} - \boldsymbol{Q}\boldsymbol{Q}^{\top}\end{equation}
A pleasant surprise in this result is that $\boldsymbol{E}$, whose choice is somewhat arbitrary, is eliminated, leaving a fully deterministic outcome. Substituting the concrete forms of $\boldsymbol{Q}$ and $\boldsymbol{R}$ simplifies the result considerably:
\begin{equation}\boldsymbol{T} = \left\{\begin{aligned}\boldsymbol{I}_{d\times d} + 2\boldsymbol{b}\boldsymbol{a}^{\top}-
\frac{(\boldsymbol{a} + \boldsymbol{b})(\boldsymbol{a} + \boldsymbol{b})^{\top}}{1+\cos\theta},\quad &\text{when}\,\boldsymbol{R}=\begin{pmatrix}\cos\theta & -\sin\theta \\ \sin\theta & \cos\theta\end{pmatrix} \\
\boldsymbol{I}_{d\times d} -
\frac{(\boldsymbol{a} - \boldsymbol{b})(\boldsymbol{a} - \boldsymbol{b})^{\top}}{1-\cos\theta},\quad &\text{when}\,\boldsymbol{R}=\begin{pmatrix}\cos\theta & \sin\theta \\ \sin\theta & -\cos\theta\end{pmatrix}
\end{aligned}\right.\label{eq:final-2}\end{equation}
It is worth noting that the second matrix is a symmetric orthogonal matrix (orthogonality is required by the problem; symmetry is a bonus)! This means the same orthogonal matrix $\boldsymbol{T}$ transforms $\boldsymbol{a}$ into $\boldsymbol{b}$ and also transforms $\boldsymbol{b}$ into $\boldsymbol{a}$:
\begin{equation}\boldsymbol{b}=\boldsymbol{T}\boldsymbol{a},\quad\boldsymbol{a}=\boldsymbol{T}\boldsymbol{b}\end{equation}
This is quite an interesting result, so we select it as our final answer. Noting that $2(1-\cos\theta)=\Vert\boldsymbol{a} - \boldsymbol{b}\Vert^2$ for unit vectors $\boldsymbol{a}, \boldsymbol{b}$, this result can also be written as:
\begin{equation}\boldsymbol{T} = \boldsymbol{I}_{d\times d} - 2\left(\frac{\boldsymbol{a} - \boldsymbol{b}}{\Vert\boldsymbol{a} - \boldsymbol{b}\Vert}\right)\left(\frac{\boldsymbol{a} - \boldsymbol{b}}{\Vert\boldsymbol{a} - \boldsymbol{b}\Vert}\right)^{\top}\end{equation}
This is the Householder transformation that reflects across the hyperplane orthogonal to $\boldsymbol{a} - \boldsymbol{b}$. Therefore, if one is already familiar with the Householder transformation, this result can be written down directly.
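As a sanity check, the sketch below (with illustrative random unit vectors) confirms that this Householder matrix is symmetric, orthogonal, and swaps $\boldsymbol{a}$ and $\boldsymbol{b}$:

import numpy as np

rng = np.random.default_rng(1)  # illustrative random unit vectors
d = 6
a = rng.standard_normal(d)
a /= np.linalg.norm(a)
b = rng.standard_normal(d)
b /= np.linalg.norm(b)

# Householder reflection whose mirror hyperplane is orthogonal to a - b
v = (a - b) / np.linalg.norm(a - b)
T = np.eye(d) - 2 * np.outer(v, v)

assert np.allclose(T, T.T)                 # symmetric
assert np.allclose(T.dot(T.T), np.eye(d))  # orthogonal
assert np.allclose(T.dot(a), b)            # maps a to b
assert np.allclose(T.dot(b), a)            # and b back to a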
Following the logic below, we can also obtain a $\boldsymbol{T}$ whose form is more symmetric:
To obtain $\boldsymbol{T}$ such that $\boldsymbol{b}=\boldsymbol{T}\boldsymbol{a}$, one can first find $\tilde{\boldsymbol{T}}$ such that $-\boldsymbol{b}=\tilde{\boldsymbol{T}}\boldsymbol{a}$, and then let $\boldsymbol{T}=-\tilde{\boldsymbol{T}}$.
In other words, by substituting $\boldsymbol{b} \to -\boldsymbol{b}$ in result $\eqref{eq:final-2}$ and then negating the entire expression, we can also obtain a transformation that meets the requirements. Applying this logic to the second solution in $\eqref{eq:final-2}$, we get:
\begin{equation}\boldsymbol{T} = \frac{(\boldsymbol{a} + \boldsymbol{b})(\boldsymbol{a} + \boldsymbol{b})^{\top}}{1+\cos\theta} - \boldsymbol{I}_{d\times d}=\frac{(\boldsymbol{a} + \boldsymbol{b})(\boldsymbol{a} + \boldsymbol{b})^{\top}}{1+\boldsymbol{a}^{\top}\boldsymbol{b}} - \boldsymbol{I}_{d\times d}\label{eq:final-3}\end{equation}
This is what the author considers to be the simplest form of the solution. Note that this is a new solution, generally not equal to either of the two solutions in $\eqref{eq:final-2}$. This means we have provided three feasible solutions so far.
Code
Code verification gives us more confidence in the correctness of the theoretical results. Below is a reference verification script for result $\eqref{eq:final-3}$:
#! -*- coding: utf-8 -*-

import numpy as np


def orthonormal_matrix_for_a_to_b(a, b):
    """Find an orthogonal matrix T such that Ta points in the same direction as b."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    ab = (a + b).reshape((-1, 1))
    return ab.dot(ab.T) / (1 + a.dot(b)) - np.eye(a.shape[0])


a = np.array([1, 2, 3, 4, 5])
b = np.array([9, 8, 7, 6, 5])
T = orthonormal_matrix_for_a_to_b(a, b)

assert np.allclose(T.dot(T.T), np.eye(a.shape[0]))  # verify orthogonality
r = T.dot(a) / b
assert np.allclose(r, r[0])  # verify T.dot(a) is parallel to b
assert r[0] > 0  # verify same direction
r = T.dot(b) / a
assert np.allclose(r, r[0])  # verify T.dot(b) is parallel to a
assert r[0] > 0  # verify same direction
The script runs with all assertions passing, showing that result $\eqref{eq:final-3}$ is indeed correct. Of course, one can also start from $\eqref{eq:final-3}$ and verify algebraically that $\boldsymbol{T}\boldsymbol{T}^{\top}=\boldsymbol{I}_{d\times d}$ and that $\boldsymbol{b}=\boldsymbol{T}\boldsymbol{a}, \boldsymbol{a}=\boldsymbol{T}\boldsymbol{b}$, to confirm the result.
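The other two closed forms in $\eqref{eq:final-2}$ can be checked in the same way. The sketch below (the helper names are ours, purely for illustration) verifies that both the rotation form and the reflection form are orthogonal and map $\boldsymbol{a}$ to $\boldsymbol{b}$, and that the reflection form additionally maps $\boldsymbol{b}$ back to $\boldsymbol{a}$:

import numpy as np


def rotation_solution(a, b):
    """First closed form: I + 2*b*a^T - (a+b)(a+b)^T / (1 + cos(theta))."""
    ab = (a + b).reshape((-1, 1))
    return np.eye(len(a)) + 2 * np.outer(b, a) - ab.dot(ab.T) / (1 + a.dot(b))


def reflection_solution(a, b):
    """Second closed form (Householder): I - (a-b)(a-b)^T / (1 - cos(theta))."""
    ab = (a - b).reshape((-1, 1))
    return np.eye(len(a)) - ab.dot(ab.T) / (1 - a.dot(b))


a = np.array([1., 2., 3., 4., 5.])
a /= np.linalg.norm(a)
b = np.array([9., 8., 7., 6., 5.])
b /= np.linalg.norm(b)

for T in (rotation_solution(a, b), reflection_solution(a, b)):
    assert np.allclose(T.dot(T.T), np.eye(len(a)))  # orthogonal
    assert np.allclose(T.dot(a), b)                 # maps a to b

# Only the reflection form is symmetric, hence it also maps b back to a
assert np.allclose(reflection_solution(a, b).dot(b), a)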
Summary
In this article, we worked through a linear algebra exercise: finding an orthogonal matrix that transforms one unit vector into another. We arrived at a rather simple and interesting closed form. Such a transformation often lets us reduce a coordinate-independent problem to a convenient special case, which makes it quite useful in practice.