Consider a linear equation

$$ \begin{align} Ax=b,\label{le} \end{align} $$

where $A\in \mathbb{R}^{n\times m}$, $x\in\mathbb{R}^{m}$, and $b\in\mathbb{R}^{n}$. If $m < n$, we call the equation over-determined; if $m>n$, we call it under-determined.

An over-determined system usually has no solution, because $b$ generally lies outside the space spanned by the column vectors $a_i$ of $A$. To make the problem solvable, we first project both sides of the equation onto the space spanned by the $a_i$. So, we do

$$ \begin{align} A^{\textsf{T}}Ax=A^{\textsf{T}}b.\label{pls} \end{align} $$

If $A^{\textsf{T}}A$ is invertible, we have

$$ \begin{align*} x=(A^{\textsf{T}}A)^{-1}A^{\textsf{T}}b. \end{align*} $$

We call $(A^{\textsf{T}}A)^{-1}A^{\textsf{T}}$ the Moore-Penrose pseudoinverse of $A$.
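As a quick numerical check (the matrix below is an illustrative example, not from the text), the normal-equation solution coincides with the pseudoinverse solution when $A$ has full column rank:

```python
import numpy as np

# A tall (over-determined) system: 4 equations, 2 unknowns.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 2.0, 4.0])

# Normal-equation solution: x = (A^T A)^{-1} A^T b
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# The same point via the Moore-Penrose pseudoinverse.
x_pinv = np.linalg.pinv(A) @ b

print(np.allclose(x_normal, x_pinv))  # True
```

Note that `np.linalg.solve` on the normal equations is used here only to mirror the formula; in practice `np.linalg.lstsq` is better conditioned.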

From the least-squares perspective, this is equivalent to the following problem:

$$ \begin{align*} \min_x f(x)=\|Ax-b\|^2. \end{align*} $$

Setting $\nabla f=0$ again yields equation \eqref{pls}. If $A^{\textsf{T}}A$ is not invertible, we treat it in the same way as the under-determined case.
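Expanding $f$ and differentiating makes this step explicit:

$$ \begin{align*} f(x)&=(Ax-b)^{\textsf{T}}(Ax-b)=x^{\textsf{T}}A^{\textsf{T}}Ax-2b^{\textsf{T}}Ax+b^{\textsf{T}}b,\\ \nabla f(x)&=2A^{\textsf{T}}Ax-2A^{\textsf{T}}b=0\;\Longrightarrow\;A^{\textsf{T}}Ax=A^{\textsf{T}}b. \end{align*} $$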

An under-determined system \eqref{le} usually has infinitely many solutions. How can we single out a pragmatic, special one? A promising way is to introduce regularization:

$$ \begin{align} \min_{Ax=b}\|x\|.\label{reg} \end{align} $$

However, we cannot solve this problem directly. Consider instead the following optimization problem:

$$ \begin{align*} \min_{x}g(x)=\|Ax-b\|^2+\lambda\|x\|^2. \end{align*} $$

Setting $\nabla g=0$, we have

$$ \begin{align*} x=(A^{\textsf{T}}A+\lambda I)^{-1}A^{\textsf{T}}b. \end{align*} $$
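The intermediate step, spelled out:

$$ \begin{align*} \nabla g(x)=2A^{\textsf{T}}(Ax-b)+2\lambda x=0\;\Longrightarrow\;(A^{\textsf{T}}A+\lambda I)x=A^{\textsf{T}}b. \end{align*} $$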

Since $A^{\textsf{T}}A+\lambda I$ is invertible for every $\lambda>0$, we obtain the solution of problem \eqref{reg} in the limit:

$$ \begin{align*} x=\lim_{\lambda\to 0}(A^{\textsf{T}}A+\lambda I)^{-1}A^{\textsf{T}}b. \end{align*} $$
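This limit can be observed numerically: as $\lambda$ shrinks, the regularized solution approaches the minimum-norm solution given by the pseudoinverse. The wide matrix below is an illustrative example, not from the text:

```python
import numpy as np

# A wide (under-determined) system: 2 equations, 3 unknowns.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([3.0, 2.0])

def ridge(lmbda):
    """Regularized solution x = (A^T A + lambda I)^{-1} A^T b."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lmbda * np.eye(n), A.T @ b)

# Minimum-norm solution via the pseudoinverse.
x_min_norm = np.linalg.pinv(A) @ b

# The distance to the minimum-norm solution shrinks with lambda.
for lmbda in (1e-2, 1e-6, 1e-10):
    print(lmbda, np.linalg.norm(ridge(lmbda) - x_min_norm))
```

Note that $A^{\textsf{T}}A$ here is a singular $3\times 3$ matrix, so `np.linalg.solve` would fail at $\lambda=0$; the $\lambda I$ term is what makes the system solvable.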

In order to write down the solution explicitly, we resort to the singular value decomposition (SVD). Writing $A=U\Sigma V^{\textsf{T}}$, we have

$$ \begin{align*} x=V\Sigma^{+}U^{\textsf{T}}b=\sum_{\sigma_i\neq 0}\sigma_i^{-1}v_iu_i^{\textsf{T}}b, \end{align*} $$

where $\Sigma^{+}$ is obtained from $\Sigma^{\textsf{T}}$ by replacing each non-zero singular value $\sigma_i$ with $\sigma_i^{-1}$, and $v_i, u_i$ are the column vectors of $V, U$, respectively.
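The SVD formula above can be sketched directly (reusing the illustrative wide matrix): invert only the non-zero singular values and assemble the sum, and the result matches the pseudoinverse solution.

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([3.0, 2.0])

# Thin SVD: A = U @ diag(s) @ Vt, singular values in s.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Invert only the non-zero singular values (tolerance guards round-off).
tol = 1e-12
s_inv = np.where(s > tol, 1.0 / s, 0.0)

# x = sum_i sigma_i^{-1} v_i (u_i^T b), computed vectorized.
x = Vt.T @ (s_inv * (U.T @ b))

print(np.allclose(x, np.linalg.pinv(A) @ b))  # True
```

This is essentially what `np.linalg.pinv` does internally, with a relative tolerance for deciding which singular values count as zero.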

Citation

Cited as:

Zhao, Tong. (Nov 2024). “Connections among Over-determination, Under-Determination, Least-Squares, Pseudo-Inverse, and SVD Decomposition.” Tong’Log.
https://zhaotong94.github.io/blog/codudlspisvd/.

Or

@article{zhao2024connections,
    title   = "Connections among Over-determination, Under-Determination, Least-Squares, Pseudo-Inverse, and SVD Decomposition.",
    author  = "Zhao, Tong",
    journal = "zhaotong94.github.io",
    year    = "2024",
    month   = "Nov",
    url     = "https://zhaotong94.github.io/blog/codudlspisvd/"}