Recursive Least Squares
Recursive least squares (RLS) is an adaptive filtering algorithm that recursively finds the coefficients to minimize a weighted linear least squares cost function relating to the input signals. Let $x_i \in \mathbb{R}^n$ and $y_i \in \mathbb{R}$ be the $i$-th input and output signals, which satisfy the following linear relationship:

$$y_i = x_i^\top w + e_i,$$

where $e_i$ is a random noise and $w \in \mathbb{R}^n$ is the parameter that we want to find. Assume we have $t$ signal pairs. We denote $X_t = [x_1, x_2, \ldots, x_t]^\top \in \mathbb{R}^{t \times n}$ and $Y_t = [y_1, y_2, \ldots, y_t]^\top \in \mathbb{R}^t$. Then, we can write a compact form

$$Y_t = X_t w + E_t,$$

where $E_t = [e_1, \ldots, e_t]^\top$.
Note: some references denote $X_t = [x_1, x_2, \ldots, x_t] \in \mathbb{R}^{n \times t}$, so that the compact form reads $Y_t = X_t^\top w + E_t$. We use our notations.
Our objective is to find an estimated parameter $w_t$ that minimizes the least square error of the $t$ signal pairs. Define the error $e(w) = Y_t - X_t w$. Our objective is

$$w_t = \arg\min_w \|Y_t - X_t w\|^2.$$
The solution to the above optimization problem is

$$w_t = (X_t^\top X_t)^{-1} X_t^\top Y_t.$$
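As a quick sanity check, here is a minimal NumPy sketch of the batch solution; the data generation and names such as `w_true` are illustrative assumptions, not from the original.

```python
import numpy as np

rng = np.random.default_rng(0)
n, t = 3, 50
w_true = rng.standard_normal(n)                  # hypothetical ground-truth parameter
X = rng.standard_normal((t, n))                  # rows are the inputs x_i^T
Y = X @ w_true + 0.01 * rng.standard_normal(t)   # y_i = x_i^T w + noise

# w_t = (X^T X)^{-1} X^T Y; solve() avoids forming the inverse explicitly
w_batch = np.linalg.solve(X.T @ X, X.T @ Y)
print(np.allclose(w_batch, w_true, atol=0.1))    # True: close to the ground truth
```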
Now assume a new signal pair $(x_{t+1}, y_{t+1})$ comes. It augments the existing data set to $X_{t+1} = [X_t^\top, x_{t+1}]^\top$ and $Y_{t+1} = [Y_t^\top, y_{t+1}]^\top$. We want to update the estimate to $w_{t+1}$, which minimizes the least square loss of $X_{t+1}$ and $Y_{t+1}$. We could completely forget $w_t$ and solve a new problem to find $w_{t+1} = (X_{t+1}^\top X_{t+1})^{-1} X_{t+1}^\top Y_{t+1}$, but recomputing the inverse from scratch is expensive. We want to do it quickly and reduce the computational burden. Can we leverage $w_t$ to update the solution? The answer is yes. We know that

$$X_{t+1}^\top X_{t+1} = X_t^\top X_t + x_{t+1} x_{t+1}^\top \quad \text{and} \quad X_{t+1}^\top Y_{t+1} = X_t^\top Y_t + x_{t+1} y_{t+1}.$$

Therefore, we have

$$w_{t+1} = \left(X_t^\top X_t + x_{t+1} x_{t+1}^\top\right)^{-1} \left(X_t^\top Y_t + x_{t+1} y_{t+1}\right).$$
Using the Matrix Inversion Lemma (the Sherman–Morrison–Woodbury formula)

$$(A + UCV)^{-1} = A^{-1} - A^{-1} U \left(C^{-1} + V A^{-1} U\right)^{-1} V A^{-1},$$
we let $A = X_t^\top X_t$, $U = x_{t+1}$, $C = 1$, and $V = x_{t+1}^\top$. We have

$$\left(X_{t+1}^\top X_{t+1}\right)^{-1} = \left(X_t^\top X_t\right)^{-1} - \frac{\left(X_t^\top X_t\right)^{-1} x_{t+1} x_{t+1}^\top \left(X_t^\top X_t\right)^{-1}}{1 + x_{t+1}^\top \left(X_t^\top X_t\right)^{-1} x_{t+1}}.$$
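A quick numerical check of this rank-one special case (a sketch; `A` plays the role of the symmetric positive definite matrix $X_t^\top X_t$, and `x` plays the role of $x_{t+1}$):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)        # symmetric positive definite, like X_t^T X_t
x = rng.standard_normal(n)         # plays the role of x_{t+1}

A_inv = np.linalg.inv(A)
Ax = A_inv @ x                     # A^{-1} x (note A_inv is symmetric)
lhs = np.linalg.inv(A + np.outer(x, x))
rhs = A_inv - np.outer(Ax, Ax) / (1.0 + x @ Ax)
print(np.allclose(lhs, rhs))       # True
```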
Let $P_t = \left(X_t^\top X_t\right)^{-1}$; then we have the update

$$P_{t+1} = P_t - \frac{P_t x_{t+1} x_{t+1}^\top P_t}{1 + x_{t+1}^\top P_t x_{t+1}}.$$
Then, we can write

$$w_{t+1} = P_{t+1} \left(X_t^\top Y_t + x_{t+1} y_{t+1}\right).$$
Note from the definition that $X_t^\top Y_t = P_t^{-1} w_t$ and $P_{t+1}^{-1} = P_t^{-1} + x_{t+1} x_{t+1}^\top$; we can further simplify as

$$w_{t+1} = P_{t+1}\left(\left(P_{t+1}^{-1} - x_{t+1} x_{t+1}^\top\right) w_t + x_{t+1} y_{t+1}\right) = w_t + P_{t+1} x_{t+1} \left(y_{t+1} - x_{t+1}^\top w_t\right).$$
We can initialize $P_0 = \delta I$ for some large $\delta > 0$ and $w_0 = 0$. For $t = 0, 1, 2, \ldots$, we have

$$\begin{aligned}
P_{t+1} &= P_t - \frac{P_t x_{t+1} x_{t+1}^\top P_t}{1 + x_{t+1}^\top P_t x_{t+1}}, \\
w_{t+1} &= w_t + P_{t+1} x_{t+1} \left(y_{t+1} - x_{t+1}^\top w_t\right).
\end{aligned}$$
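Putting the recursion together, here is a minimal NumPy sketch of the full loop; the function name `rls_fit` and the choice $\delta = 10^6$ are my own, while the recursion itself is the one above.

```python
import numpy as np

def rls_fit(X, Y, delta=1e6):
    """Run the RLS recursion over the rows of X and return the final estimate.

    P_0 = delta * I with large delta stands in for the unavailable
    (X^T X)^{-1} of an empty data set; w_0 = 0.
    """
    n = X.shape[1]
    P = delta * np.eye(n)
    w = np.zeros(n)
    for x, y in zip(X, Y):
        Px = P @ x
        P = P - np.outer(Px, Px) / (1.0 + x @ Px)   # P_{t+1}
        w = w + P @ x * (y - x @ w)                 # uses the new P_{t+1}
    return w
```

On the data from the batch example above, `rls_fit(X, Y)` matches `w_batch` up to an $O(1/\delta)$ bias introduced by the initialization.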
Initialization
We should note that RLS can start from an empty data set; we only need to set $P_0$ and $w_0$. It is easy to get confused when solving linear systems. Suppose we want to solve $Xw = Y$, where $X \in \mathbb{R}^{t \times n}$ and $w \in \mathbb{R}^n$. When $t < n$, the system is under-determined, i.e., there are more variables than equations. Therefore, infinitely many solutions can exist if $Y \in \operatorname{range}(X)$.
In RLS, when there is only one data point, the system is under-determined. So can we solve for $w_1$ from only one data point via the optimization? The answer is no, because $\left(X_1^\top X_1\right)^{-1}$ does not exist for a single point: $x_1 x_1^\top$ has rank one, which is singular whenever $n > 1$. More generally, at the beginning, when the number of data points $t$ is less than the number of decision variables $n$, we cannot use the formula $w_t = (X_t^\top X_t)^{-1} X_t^\top Y_t$ to find the optimal $w_t$; this formula only holds when $\operatorname{rank}(X_t) = n$. Therefore, we need to assume an invertible $P_0$ and follow the rules above to update.
In fact, when $t < n$, the system $X_t w = Y_t$ has only two possibilities: if $Y_t \in \operatorname{range}(X_t)$, we have infinitely many solutions; if $Y_t \notin \operatorname{range}(X_t)$, there is no solution.
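To see the failure concretely, here is a minimal illustration that a one-point Gram matrix is rank one and hence singular for $n > 1$ (the particular vector is arbitrary):

```python
import numpy as np

x1 = np.array([1.0, 2.0, 3.0])      # a single input with n = 3
G = np.outer(x1, x1)                # X_1^T X_1 = x_1 x_1^T
print(np.linalg.matrix_rank(G))     # 1 < n, so G is singular
# np.linalg.inv(G) would raise LinAlgError: Singular matrix
```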
Decayed Error
We can add decay factors to the objective so that past data contribute less to the loss than the current data. The objective becomes

$$w_t = \arg\min_w \sum_{i=1}^{t} \lambda^{t-i} \left(y_i - x_i^\top w\right)^2$$

for $0 < \lambda \le 1$. The update rule has the same form, but the evolution of $P_t$ changes:

$$P_{t+1} = \frac{1}{\lambda}\left(P_t - \frac{P_t x_{t+1} x_{t+1}^\top P_t}{\lambda + x_{t+1}^\top P_t x_{t+1}}\right), \qquad w_{t+1} = w_t + P_{t+1} x_{t+1} \left(y_{t+1} - x_{t+1}^\top w_t\right).$$
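A sketch of one step of this decayed (forgetting-factor) variant; `lam` is the factor $\lambda$, and the default value 0.98 is an arbitrary illustrative choice:

```python
import numpy as np

def rls_forgetting_step(P, w, x, y, lam=0.98):
    """One RLS step with forgetting factor lam in (0, 1]; lam = 1 is plain RLS."""
    Px = P @ x
    P = (P - np.outer(Px, Px) / (lam + x @ Px)) / lam   # P_{t+1}
    w = w + P @ x * (y - x @ w)                         # gain P_{t+1} x equals P_t x / (lam + x^T P_t x)
    return P, w
```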
Multi-dimensional Signals
The idea naturally extends to $y_i \in \mathbb{R}^m$ for $m > 1$. In this case, the input signal is generally a matrix $X_i \in \mathbb{R}^{m \times n}$, and we have $y_i = X_i w + e_i$. The formulation does not change; each new pair simply contributes a rank-$m$ block update instead of a rank-one update.
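A sketch of one block update for this multi-output case, using the Woodbury identity with $C = I_m$ (the function and variable names are my own):

```python
import numpy as np

def rls_step_block(P, w, Xi, yi):
    """One RLS step for a block observation yi = Xi @ w + noise,
    where Xi has shape (m, n) and yi has shape (m,)."""
    m = Xi.shape[0]
    PXt = P @ Xi.T                                       # P_t Xi^T, shape (n, m)
    # Woodbury: P_{t+1} = P_t - P_t Xi^T (I_m + Xi P_t Xi^T)^{-1} Xi P_t
    P = P - PXt @ np.linalg.solve(np.eye(m) + Xi @ PXt, PXt.T)
    w = w + P @ Xi.T @ (yi - Xi @ w)                     # block innovation
    return P, w
```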
Nonlinear RLS
Previously, we used a linear filter to achieve the least square estimation. The input-output signals can also follow some nonlinear rule

$$y_i = h(x_i; w) + e_i.$$

In this case, we use a nonlinear filter $h$ parameterized by $w$ to minimize the square loss

$$\min_w \sum_{i=1}^{t} \left(y_i - h(x_i; w)\right)^2.$$
However, general nonlinear functions can be hard to optimize, so in fact we first linearize $h$ at some point and then minimize the linearized loss. We want to perform this incrementally (or recursively) to reduce the computational burden. Assume we already have some estimate $w_t$. The optimal $w_{t+1}$ on the data set $X_{t+1}$ and $Y_{t+1}$ solves the following problem:

$$w_{t+1} = \arg\min_w \sum_{i=1}^{t+1} \left(y_i - h(x_i; w_t) - \nabla_w h(x_i; w_t)^\top (w - w_t)\right)^2.$$
Now we let $\tilde{x}_i = \nabla_w h(x_i; w_t)$ and $\tilde{y}_i = y_i - h(x_i; w_t) + \tilde{x}_i^\top w_t$, so that each term above becomes $(\tilde{y}_i - \tilde{x}_i^\top w)^2$. We can formulate a new data matrix $\tilde{X}_{t+1}$ and target $\tilde{Y}_{t+1}$. At step $t+1$, we receive $(x_{t+1}, y_{t+1})$ and can compute $\tilde{x}_{t+1}$ and $\tilde{y}_{t+1}$. Then, we can use the same method as in linear RLS to update $w_{t+1}$. The difference is that we need to compute $\tilde{x}_i$ and $\tilde{y}_i$ at the current estimate $w_t$ in every step.
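A sketch of this linearized recursion for a hypothetical scalar model $h(x; w) = w_1\left(1 - e^{-w_2 x}\right)$; the model and all names are illustrative assumptions, and only the construction of the surrogate pair $(\tilde{x}, \tilde{y})$ follows the text.

```python
import numpy as np

def h(x, w):
    """Hypothetical nonlinear model h(x; w) = w[0] * (1 - exp(-w[1] * x))."""
    return w[0] * (1.0 - np.exp(-w[1] * x))

def grad_h(x, w):
    """Gradient of h with respect to w."""
    e = np.exp(-w[1] * x)
    return np.array([1.0 - e, w[0] * x * e])

def nonlinear_rls_step(P, w, x, y):
    """Linearize h at the current estimate, then take one linear RLS step."""
    xt = grad_h(x, w)                  # x_tilde = grad_w h(x; w_t)
    yt = y - h(x, w) + xt @ w          # y_tilde
    Px = P @ xt
    P = P - np.outer(Px, Px) / (1.0 + xt @ Px)
    w = w + P @ xt * (yt - xt @ w)     # yt - xt @ w equals y - h(x; w_t)
    return P, w
```

Note that, unlike the linear case, $w_0 = 0$ can be a degenerate starting point (for this particular model the gradient vanishes at the origin), so a nonzero initial guess is advisable.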
A discussion of the convergence of nonlinear RLS can be found in Bertsekas' book Nonlinear Programming.