
Open Access
Data-Reuse Extended NLMS Algorithm Based on Optimized Time-Varying Step-Size for System Identification

Hakan BERCAG, Osman KUKRER, Aykut HOCANIN


Summary:

A new extended normalized least-mean-square (ENLMS) algorithm is proposed. A novel non-linear time-varying step-size (NLTVSS) formula is derived. The convergence rate of ENLMS increases due to the NLTVSS as the number of data reuses \(L\) is increased. ENLMS does not involve matrix inversion and thus avoids numerical instability issues.

Publication
IEICE TRANSACTIONS on Fundamentals Vol.E107-A No.8 pp.1369-1373
Publication Date
2024/08/01
Publicized
2024/01/11
Online ISSN
1745-1337
DOI
10.1587/transfun.2023EAL2072
Type of Manuscript
LETTER
Category
Analog Signal Processing

1.  Introduction

System identification (SI) based on adaptive filtering is an important application that has motivated researchers to seek better methods and solutions to issues such as convergence rate, misalignment, computational complexity (CC), ease of implementation, and numerical instability.

The least-mean-square (LMS) algorithm is one of the main algorithms developed and used in SI owing to its low computational complexity and ease of implementation [1]-[3], but its convergence rate is low. Many variants of LMS [4]-[7], such as the LMS-Newton algorithm [8] and the NLMS algorithm [2], [5], [9], were later developed to improve the convergence rate of LMS. Other algorithms with high convergence rates, among them the recursive least-squares (RLS) algorithm [1], [2], [10] and its variants, and the affine projection algorithm (APA) and its variants [9]-[11], were developed at the expense of numerical instability problems and implementation difficulties. This letter introduces the ENLMS algorithm with the following contributions: \((i)\) the novel NLTVSS method enhances the convergence rate of the proposed ENLMS algorithm as the number of past data \(L\) used in the coefficient update is increased; \((ii)\) ENLMS does not use matrix inversion and thus eliminates numerical instability problems; \((iii)\) ENLMS is easy to implement, with only one parameter \(L\), which is simple to tune.

The problem considered in this letter is SI, where the unknown system is modeled as an \(N\)-tap finite impulse response (FIR) filter. The adaptive filter estimates the unknown impulse response (IR) coefficients \(\textbf{h}=[h_1,h_2,\ldots,h_N]^T\) via the tap-weights \(\textbf{w}(k)=[w_0(k),w_1(k),\ldots,w_{N-1}(k)]^T\), where \(N\) is the filter length, \([\cdot]^T\) denotes the transposition operator, and \(k\) is the discrete-time index. The plant output, i.e., the desired response, is \(d(k)=\textbf{h}^T\textbf{x}(k)+v(k)\), where \(\textbf{x}(k)=[x(k),x(k-1),\ldots,x(k-N+1)]^T\) is the tap-input vector and \(v(k)\) is the additive measurement noise. The filter output is \(y(k)=\textbf{x}^T(k)\textbf{w}(k-1)\), and the estimation error is \(e(k)=d(k)-y(k)\).
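
As a concrete illustration of this signal model, the following NumPy sketch generates the quantities defined above; the function name, the zero-padded tap-input helper, the random stand-in impulse response, and the noise level are our own illustrative choices, not taken from the letter.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 65                                  # filter length (value used in the letter's simulations)
h = rng.standard_normal(N)              # stand-in for the unknown FIR impulse response h
w = np.zeros(N)                         # adaptive tap-weight vector w(k-1)

def tap_input(x_signal, k, N):
    """Tap-input vector x(k) = [x(k), x(k-1), ..., x(k-N+1)]^T (zero-padded for k < N-1)."""
    padded = np.concatenate([np.zeros(N - 1), x_signal[:k + 1]])
    return padded[-N:][::-1]

x_signal = rng.standard_normal(1000)    # illustrative input sequence
k = 500
x_k = tap_input(x_signal, k, N)
d_k = h @ x_k + np.sqrt(1e-3) * rng.standard_normal()   # d(k) = h^T x(k) + v(k)
y_k = x_k @ w                                           # y(k) = x^T(k) w(k-1)
e_k = d_k - y_k                                         # e(k) = d(k) - y(k)
```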

2.  Derivation of ENLMS

The Wiener-Hopf equation [1] leads to the optimum tap-weight vector \(\textbf{w}_o=\textbf{R}^{-1}\textbf{p}\) for the FIR coefficients. The recursive inverse (RI) algorithm [12], [13] was developed to obtain the Wiener solution through iterative convergence of the weight update equation:

\[\begin{equation*} \textbf{w}(k)=\textbf{w}(k-1)+\mu(k)\left[\textbf{p}(k)-\textbf{R}(k)\textbf{w}(k-1)\right],\hspace{9pt} \tag{1} \end{equation*}\]

where \(\textbf{R}(k)\) and \(\textbf{p}(k)\) are the autocorrelation matrix and the cross-correlation vector, respectively. The RI algorithm attains the Wiener solution \(\textbf{w}_o\) by recursive iteration with simultaneous recursive estimation of the autocorrelation, and thus does not require inversion of the autocorrelation matrix. In this way, numerical instability problems due to loss of Hermitian symmetry and loss of positive definiteness of the inverse autocorrelation are eliminated. Using (1), the a-priori and a-posteriori residuals of the equation \(\textbf{R}(k)\textbf{w}(k)=\textbf{p}(k)\) are defined, respectively, as:

\[\begin{eqnarray*} &&\!\!\!\!\! \boldsymbol{\xi}(k,k-1)=\textbf{p}(k)-\textbf{R}(k)\textbf{w}(k-1),\hspace{12pt} \tag{2} \\ &&\!\!\!\!\! \boldsymbol{\xi}(k,k)=\textbf{p}(k)\hspace{-1pt}-\hspace{-1pt}\textbf{R}(k)\textbf{w}(k) =[\textbf{I}\hspace{-1pt}-\hspace{-1pt}\mu(k)\textbf{R}(k)]\boldsymbol{\xi}(k,k\hspace{-1pt}-\hspace{-1pt}1).\!\!\!\!\!\nonumber\\ &&\!\!\!\!\! \tag{3} \end{eqnarray*}\]

In order to minimize the norm of (3), the step-size is chosen as:

\[\begin{equation*} \mu(k)=\arg\min_\mu\|\boldsymbol{\xi}(k,k,\mu)\|^{\text{}^{\scriptscriptstyle 2}}\hspace{-4.25pt}_{\text{}_{\scriptscriptstyle 2}}, \tag{4} \end{equation*}\]

where the squared Euclidean norm \(\|\boldsymbol{\xi}(k,k,\mu)\|^{\text{}^{\scriptscriptstyle 2}}_{\text{}_{\scriptscriptstyle 2}}\) is the cost function. Minimizing the norm of (3) corresponds to increasing the convergence rate. Expanding \(\|\boldsymbol{\xi}(k,k,\mu)\|^{\text{}^{\scriptscriptstyle 2}}_{\text{}_{\scriptscriptstyle 2}}\), evaluating its gradient w.r.t. \(\mu\), setting it to zero, and solving for the scalar \(\mu\), the optimal NLTVSS formula minimizing \(\|\boldsymbol{\xi}(k,k,\mu)\|^{\text{}^{\scriptscriptstyle 2}}_{\text{}_{\scriptscriptstyle 2}}\) is obtained as:

\[\begin{equation*} \mu_{\text{}_{\textit{NL}}}(k)=\frac{\boldsymbol{\xi}^{T}(k,k-1)\textbf{R}(k)\boldsymbol{\xi}(k,k-1)}{\boldsymbol{\xi}^{T}(k,k-1)\textbf{R}^2(k)\boldsymbol{\xi}(k,k-1)}.\hspace{17pt} \tag{5} \end{equation*}\]
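
For reference, the expansion referred to above can be written out explicitly (using the symmetry of \(\textbf{R}(k)\), so that \([\textbf{I}-\mu\textbf{R}(k)]^T[\textbf{I}-\mu\textbf{R}(k)]=[\textbf{I}-\mu\textbf{R}(k)]^2\)):

\[\begin{aligned} \|\boldsymbol{\xi}(k,k,\mu)\|^{\text{}^{\scriptscriptstyle 2}}_{\text{}_{\scriptscriptstyle 2}} &=\boldsymbol{\xi}^T(k,k-1)\boldsymbol{\xi}(k,k-1)-2\mu\,\boldsymbol{\xi}^T(k,k-1)\textbf{R}(k)\boldsymbol{\xi}(k,k-1)+\mu^2\boldsymbol{\xi}^T(k,k-1)\textbf{R}^2(k)\boldsymbol{\xi}(k,k-1),\\ \frac{d}{d\mu}\|\boldsymbol{\xi}(k,k,\mu)\|^{\text{}^{\scriptscriptstyle 2}}_{\text{}_{\scriptscriptstyle 2}} &=-2\,\boldsymbol{\xi}^T(k,k-1)\textbf{R}(k)\boldsymbol{\xi}(k,k-1)+2\mu\,\boldsymbol{\xi}^T(k,k-1)\textbf{R}^2(k)\boldsymbol{\xi}(k,k-1)=0, \end{aligned}\]

whose unique solution is (5); the second derivative \(2\,\boldsymbol{\xi}^T(k,k-1)\textbf{R}^2(k)\boldsymbol{\xi}(k,k-1)\geq 0\) confirms that this stationary point is a minimum.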

In the RI algorithm, the autocorrelation matrix and the cross-correlation vector are updated as:

\[\begin{eqnarray*} &&\!\!\!\!\! \textbf{R}(k)=\beta\textbf{R}(k-1)+\textbf{x}(k)\textbf{x}^T(k),\hspace{12pt} \tag{6} \\ &&\!\!\!\!\! \textbf{p}(k)=\beta\textbf{p}(k-1)+d(k)\textbf{x}(k),\hspace{12pt} \tag{7} \end{eqnarray*}\]

where \(\beta\) is the forgetting factor. The last \(L\) input signal vectors are collected in the \(N\times L\) matrix \(\textbf{X}(k)= [\textbf{x}(k)\ \textbf{x}(k-1)\cdots \textbf{x}(k-L+1)].\) The solutions of (6) and (7) are:

\[\begin{eqnarray*} &&\!\!\!\!\! \textbf{R}(k)=\sum_{i=0}^{k}\beta^{k-i}\textbf{x}(i)\textbf{x}^T(i),\quad \hspace{7pt} \tag{8} \\ &&\!\!\!\!\! \textbf{p}(k)=\sum_{i=0}^{k}\beta^{k-i}d(i)\textbf{x}(i).\hspace{12pt} \tag{9} \end{eqnarray*}\]

Substitution of (8) and (9) into (2) yields:

\[\begin{equation*} \boldsymbol{\xi}(k,k-1)=\sum_{i=0}^{k}\beta^{k-i}\textbf{x}(i)[d(i)-\textbf{x}^T(i)\textbf{w}(k-1)].\hspace{12pt} \tag{10} \end{equation*}\]

Instead of (8) and (9), \(\textbf{R}(k)\) and \(\textbf{p}(k)\) are estimated as follows:

\[\begin{eqnarray*} &&\!\!\!\!\! {\textbf{R}}(k)=\frac{1}{L}\sum_{i=k-L+1}^{k}\textbf{x}(i)\textbf{x}^T(i),\quad\hspace{12pt} \tag{11} \\ &&\!\!\!\!\! {\textbf{p}}(k)=\frac{1}{L}\sum_{i=k-L+1}^{k}d(i)\textbf{x}(i).\hspace{17pt} \tag{12} \end{eqnarray*}\]
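
A minimal NumPy sketch of the sliding-window estimates (11)-(12), assuming the last \(L\) tap-input vectors are stacked as rows of an array (our own layout; the letter's \(\textbf{X}(k)\) stores them as columns). Note that ENLMS never forms \(\textbf{R}(k)\) explicitly (see (16) below); the explicit matrix here is for illustration only.

```python
import numpy as np

def correlation_estimates(X_hist, d_hist):
    """
    Sample estimates (11)-(12) over the last L samples.
    X_hist : (L, N) array, rows x(k-L+1), ..., x(k)
    d_hist : (L,)  array, matching desired samples d(i)
    """
    L = X_hist.shape[0]
    R_hat = X_hist.T @ X_hist / L   # (1/L) * sum_i x(i) x(i)^T
    p_hat = X_hist.T @ d_hist / L   # (1/L) * sum_i d(i) x(i)
    return R_hat, p_hat
```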

Define the a-priori errors as:

\[\begin{equation*} e(i,k-1)=d(i)-\textbf{x}^T(i)\textbf{w}(k-1),\quad \forall i=k-L+1:k\hspace{5pt} \tag{13} \end{equation*}\]

Then, (10) as a function of (13) becomes:

\[\begin{equation*} \boldsymbol{\xi}(k,k-1)=\sum_{i=0}^{k}\beta^{k-i}e(i,k-1)\textbf{x}(i).\hspace{25pt} \tag{14} \end{equation*}\]

Substituting (11) and (12) into (2), the a-priori residual that is used in place of (14), averaged over the last \(L\) errors in (13), becomes:

\[\begin{equation*} \boldsymbol{\xi}(k,k-1)=\frac{1}{L}\sum_{i=k-L+1}^{k}e(i,k-1)\textbf{x}(i).\hspace{28pt} \tag{15} \end{equation*}\]

To reduce the CC of computing \(\mu_{\text{}_{\textit{NL}}}(k)\) in (5), a new vector is defined as:

\[\begin{equation*} {\textbf{R}}(k)\boldsymbol{\xi}(k,k-1)=\textbf{z}(k)= \frac{1}{L}\sum_{i=k-L+1}^{k}(\textbf{x}^T(i)\boldsymbol{\xi}(k,k-1))\textbf{x}(i). \tag{16} \end{equation*}\]

The number of multiplications required by the RHS of (16) is \(2LN\), which is less than the \(N^2\) multiplications required by the LHS of (16) when \(2L<N\). Substituting (16) into (5), the NLTVSS with reduced CC is expressed as:

\[\begin{equation*} \mu_{\text{}_{\textit{NL}}}(k)=\frac{\boldsymbol{\xi}^T(k,k-1)\textbf{z}(k)}{\|\boldsymbol{\textbf{z}}(k)\|^{\text{}^{\scriptscriptstyle 2}}_{\text{}_{\scriptscriptstyle 2}}}.\hspace{19pt} \tag{17} \end{equation*}\]

Substituting (15) for the RHS of (2), and (17) into (1), yields the ENLMS weight update equation:

\[\begin{equation*} \textbf{w}(k)=\textbf{w}(k-1)+\mu_{\text{}_{\textit{NL}}}(k)\boldsymbol{\xi}(k,k-1).\hspace{9pt} \tag{18} \end{equation*}\]
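
Putting (13) and (15)-(18) together, one iteration of ENLMS can be sketched as follows; this is our reading of the update, with hypothetical function and variable names, and it forms \(\textbf{z}(k)\) via the right-hand side of (16) so that no \(N\times N\) matrix is ever built.

```python
import numpy as np

def enlms_step(w, X_hist, d_hist, eps=0.0):
    """
    One ENLMS weight update, following (13), (15)-(18).
    w      : (N,)   current weights w(k-1)
    X_hist : (L, N) rows x(k-L+1), ..., x(k)
    d_hist : (L,)   matching desired samples
    eps    : optional small constant guarding the denominator (not part of the letter)
    """
    L = X_hist.shape[0]
    e = d_hist - X_hist @ w                # a-priori errors e(i, k-1), eq. (13)
    xi = X_hist.T @ e / L                  # residual xi(k, k-1), eq. (15)
    z = X_hist.T @ (X_hist @ xi) / L       # z(k) via the RHS of (16): about 2LN multiplications
    mu = (xi @ z) / (z @ z + eps)          # NLTVSS step-size, eq. (17)
    return w + mu * xi                     # weight update, eq. (18)
```

Counting the four matrix-vector products of size \(L\times N\) plus the three length-\(N\) operations gives \((4L+3)N\) multiplications per iteration, consistent with the count quoted in Sect. 2.1.1.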

It can be deduced that for the special case \(L=1\):

\[\begin{equation*} \textbf{z}(k)=\|\boldsymbol{\textbf{x}}(k)\|^{\text{}^{\scriptscriptstyle 2}}_{\text{}_{\scriptscriptstyle 2}}e(k,k-1)\textbf{x}(k).\hspace{10pt} \tag{19} \end{equation*}\]

Using (19), the NLTVSS is obtained as \(\mu_{\text{}_{\textit{NL}}}(k)=1/\|\boldsymbol{\textbf{x}}(k)\|^{\text{}^{\scriptscriptstyle 2}}_{\text{}_{\scriptscriptstyle 2}}\), which, when used in (18), yields the ENLMS weight update for \(L=1\); this is also the NLMS weight update:

\[\begin{equation*} \textbf{w}(k)=\textbf{w}(k-1)+\mu_{o}\frac{e(k,k-1)\textbf{x}(k)}{\|\boldsymbol{\textbf{x}}(k)\|^{\text{}^{\scriptscriptstyle 2}}_{\text{}_{\scriptscriptstyle 2}}},\hspace{14pt} \tag{20} \end{equation*}\]

where \(\mu_{o}\) is included to control the convergence rate and misalignment.
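
As a quick consistency check of the \(L=1\) case (reusing the definitions from the sketches above), the ENLMS step reduces numerically to the NLMS update (20) with \(\mu_o=1\):

```python
# Hypothetical check of the L = 1 special case, eqs. (19)-(20)
x = tap_input(x_signal, k, N)                       # current tap-input vector x(k)
d = h @ x + np.sqrt(1e-3) * rng.standard_normal()   # desired response d(k)
w0 = np.zeros(N)

e = d - x @ w0                                      # e(k, k-1)
w_enlms = enlms_step(w0, x[None, :], np.array([d])) # ENLMS with L = 1
w_nlms  = w0 + (e / (x @ x)) * x                    # NLMS update (20) with mu_o = 1

assert np.allclose(w_enlms, w_nlms)                 # identical, as predicted by (19)
```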

2.1  Stability Analysis of ENLMS

Define the optimal error as \(e_o(i)=d(i)-\textbf{x}^T(i)\textbf{w}_o\), and \(\boldsymbol{\xi}(k,0)\) as:

\[\begin{equation*} \boldsymbol{\xi}(k,0)=\textbf{p}(k)-\textbf{R}(k)\textbf{w}_o=\frac{1}{L}\sum_{i=k-L+1}^{k}e_o(i)\textbf{x}(i).\hspace{7pt} \tag{21} \end{equation*}\]

Using the residual (21) and substituting (1) into the weight-error vector, defined as \(\boldsymbol{\varepsilon}(k)=\textbf{w}(k)-\textbf{w}_o\) [1], gives:

\[\begin{equation*} \boldsymbol{\varepsilon}(k)=[\textbf{I}-\mu(k)\textbf{R}(k)]\boldsymbol{\varepsilon}(k-1)+\mu(k)\boldsymbol{\xi}(k,0).\hspace{12pt} \tag{22} \end{equation*}\]

The autocorrelation estimate can be expressed in terms of its eigenvalue decomposition as \(\textbf{R}(k)=\textbf{E}(k)\boldsymbol{\Lambda}(k)\textbf{E}^T(k)\), where \(\boldsymbol{\Lambda}(k)=\text{diag}\{\lambda_i\}, \forall i=1,2,\ldots,N,\) is the diagonal matrix of eigenvalues, and \(\textbf{E}(k)\) is the matrix whose columns are the eigenvectors of \(\textbf{R}(k)\). Using this decomposition, (22) can be transformed into:

\[\begin{equation*} \boldsymbol{\tilde\varepsilon}(k)=[\textbf{I}-\mu(k)\boldsymbol{\Lambda}(k)]\boldsymbol{\tilde\varepsilon}(k-1)+\mu(k)\boldsymbol{\tilde\xi}(k,0),\hspace{13pt} \tag{23} \end{equation*}\]

where \(\boldsymbol{\tilde\varepsilon}(k)=\textbf{E}^T(k)\boldsymbol{\varepsilon}(k)\) and \(\boldsymbol{\tilde\xi}(k,0)=\textbf{E}^T(k)\boldsymbol{\xi}(k,0)\). The squared norm of \(\boldsymbol{\tilde\xi}(k,0)\) is given by:

\[\begin{equation*} \hspace{-5pt}\|\boldsymbol{\tilde\xi}(k,0)\|^{\text{}^{\scriptscriptstyle 2}}_{\text{}_{\scriptscriptstyle 2}}=\frac{1}{L^2}\hspace{-2pt}\sum_{i=k\hspace{-1pt}-\hspace{-1pt}L\hspace{-1pt}+\hspace{-1pt}1}^{k}\sum_{j=k\hspace{-1pt}-\hspace{-1pt}L\hspace{-1pt}+\hspace{-1pt}1}^{k}\hspace{-2pt}e_o(i)e_o(j)\textbf{x}^T(i)\textbf{x}(j),\hspace{0pt} \tag{24} \end{equation*}\]

which, based on the orthogonality of \(e_o(i)\) to the input and the whiteness of \(e_o(i)\) across time (so that the cross terms \(i\neq j\) vanish in the mean), yields:

\[ \begin{eqnarray*} \hspace{-10pt}E\big\{\|\boldsymbol{\tilde\xi}(k,0)\|^{\text{}^{\scriptscriptstyle 2}}_{\text{}_{\scriptscriptstyle 2}}\big\}=\frac{1}{L^2}\hspace{-2pt}\sum_{i=k\hspace{-1pt}-\hspace{-1pt}L\hspace{-1pt}+\hspace{-1pt}1}^{k}\hspace{-2pt}E\big\{e^2_o(i)\|\textbf{x}(i)\|^{\text{}^{\scriptscriptstyle 2}}_{\text{}_{\scriptscriptstyle 2}}\big\} \tag{25a} \\ =\frac{1}{L}\sigma_o^2E\big\{\|\textbf{x}(i)\|^{\text{}^{\scriptscriptstyle 2}}_{\text{}_{\scriptscriptstyle 2}}\big\}=\frac{N}{L}\sigma_o^2\sigma_x^2,\hspace{-10pt} \tag{25b} \end{eqnarray*}\]

where \(\sigma_o^2\) is the noise variance, \(\sigma_x^2\) is the input signal variance, and \(E\hspace{-1pt}\big\{\hspace{-2pt}\cdot\hspace{-2pt}\big\}\) is mathematical expectation. The squared norm of (23) is:

\[\begin{equation*} \begin{aligned} \hspace{-0pt}\|\boldsymbol{\tilde\varepsilon}(k)\|^{\text{}^{\scriptscriptstyle 2}}_{\text{}_{\scriptscriptstyle 2}}\hspace{-2pt}=\hspace{-2pt}\sum_{n=1}^{N}(1\hspace{-2pt}-\hspace{-2pt}\mu(k)\lambda_n(k))^2\boldsymbol{\tilde\varepsilon}_n^2(k\hspace{-2pt}-\hspace{-2pt}1)\hspace{-1pt}+\hspace{-1pt}\mu^2(k)\|\boldsymbol{\tilde\xi}(k,0)\|^{\text{}^{\scriptscriptstyle 2}}_{\text{}_{\scriptscriptstyle 2}}\\ +2\mu(k)[\boldsymbol{\tilde\varepsilon}^T(k\hspace{-2pt}-\hspace{-2pt}1)\big\{\textbf{I}\hspace{-2pt}-\hspace{-2pt}\mu(k)\boldsymbol{\Lambda}(k)\big\}\boldsymbol{\tilde\xi}(k,0)].\hspace{23pt} \end{aligned} \tag{26} \end{equation*}\]

Denote \(\text{$\tilde{\textbf{z}}$}(k)=\textbf{E}^T(k)\boldsymbol{\xi}(k,k-1)\), and \(\vartheta_i(k)=\frac{\lambda_i(k)}{\lambda_{min}(k)}, \forall i=1,\ldots,N\). Noticing \(1\leq\vartheta_i(k)\leq\vartheta_{max}(k)\), where \(\vartheta_{max}(k)=\frac{\lambda_{max}(k)}{\lambda_{min}(k)}\) is the eigenvalue spread, the NLTVSS \(\mu_{\text{}_{\textit{NL}}}(k)\) in (5) is reevaluated as:

\[\begin{equation*} \mu_{\text{}_{\textit{NL}}}(k)=\frac{\sum\limits_{i=1}^{N}\lambda_i(k)\text{$\tilde{\text{z}}$}_i^2(k)}{\sum\limits_{i=1}^{N}\lambda_i^2(k)\text{$\tilde{\text{z}}$}_i^2(k)}=\frac{\lambda_{min}(k)\sum\limits_{i=1}^{N}\vartheta_i(k)\text{$\tilde{\text{z}}$}_i^2(k)}{\lambda_{min}^2(k)\sum\limits_{i=1}^{N}\vartheta_i^2(k)\text{$\tilde{\text{z}}$}_i^2(k)},\hspace{0pt} \tag{27} \end{equation*}\]

from which, the following is deduced:

\[\begin{equation*} \mu_{\text{}_{\textit{NL}}}(k)\lambda_{min}(k)=\frac{\sum\limits_{i=1}^{N}\vartheta_i(k)\text{$\tilde{\text{z}}$}_i^2(k)}{\sum\limits_{i=1}^{N}\vartheta_i^2(k)\text{$\tilde{\text{z}}$}_i^2(k)}\geq\frac{1}{\vartheta_{max}(k)}. \tag{28} \end{equation*}\]
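
The inequality in (28) holds because \(\vartheta_i(k)\leq\vartheta_{max}(k)\) for every \(i\), so that

\[\sum_{i=1}^{N}\vartheta_i^2(k)\text{$\tilde{\text{z}}$}_i^2(k)\leq\vartheta_{max}(k)\sum_{i=1}^{N}\vartheta_i(k)\text{$\tilde{\text{z}}$}_i^2(k),\]

and hence the ratio in (28) is bounded below by \(1/\vartheta_{max}(k)\).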

Defining \(\lambda_{a}(k)\) as the average of the eigenvalues of \(\textbf{R}(k)\), and assuming \(\lambda_{i}(k)\approx\lambda_{a}(k)\) when the eigenvalue spread \(\vartheta_{max}(k)\) is not very large, the NLTVSS in (27) can be approximated as \(\mu_{\text{}_{\textit{NL}}}(k)\simeq\frac{1}{\lambda_{a}(k)}\). In (26), replacing \(\lambda_n(k)\) by \(\lambda_{min}(k)\) and substituting (28) for \(\mu_{\text{}_{\textit{NL}}}(k)\lambda_{min}(k)\) results in the following:

\[\begin{equation*} \begin{aligned} \hspace{-0pt}\|\boldsymbol{\tilde\varepsilon}(k)\|^{\text{}^{\scriptscriptstyle 2}}_{\text{}_{\scriptscriptstyle 2}}\leq\left(1\hspace{-1pt}-\hspace{-1pt}\frac{1}{\vartheta_{max}(k)}\right)^{\mkern -3.5mu 2}\hspace{-2pt}\|\boldsymbol{\tilde\varepsilon}(k-1)\|^{\text{}^{\scriptscriptstyle 2}}_{\text{}_{\scriptscriptstyle 2}}\hspace{0pt}+\hspace{0pt}\mu^2_{\text{}_{\textit{NL}}}(k)\|\boldsymbol{\tilde\xi}(k,0)\|^{\text{}^{\scriptscriptstyle 2}}_{\text{}_{\scriptscriptstyle 2}}\\ +2\mu_{\text{}_{\textit{NL}}}(k)[\boldsymbol{\tilde\varepsilon}^T(k\hspace{-2pt}-\hspace{-2pt}1)\big\{\textbf{I}\hspace{-2pt}-\hspace{-2pt}\mu_{\text{}_{\textit{NL}}}(k)\boldsymbol{\Lambda}(k)\big\}\boldsymbol{\tilde\xi}(k,0)].\hspace{13pt} \end{aligned} \tag{29} \end{equation*}\]

The time-varying eigenvalue spread \(\vartheta_{max}(k)\) fluctuates around the theoretical eigenvalue spread \(\rho\) of the covariance matrix and approaches it as time increases. Thus, \(\vartheta_{max}(k)\) is replaced by \(\rho\) while the expectation of (29) is taken to determine the mean steady-state behaviour of the error norm. Note that \(E[\boldsymbol{\tilde\varepsilon}^T(k-1)]\simeq 0\), and since \(\boldsymbol{\tilde\varepsilon}^T(k-1)\) and \([\textbf{I}-\mu_{\text{}_{\textit{NL}}}(k)\boldsymbol{\Lambda}(k)]\boldsymbol{\tilde\xi}(k,0)\) are independent, \(E\big\{\boldsymbol{\tilde\varepsilon}^T(k-1)[\textbf{I}-\mu_{\text{}_{\textit{NL}}}(k)\boldsymbol{\Lambda}(k)]\boldsymbol{\tilde\xi}(k,0)\big\}\simeq 0\). Denoting \(v(k)=E\big[\|\boldsymbol{\tilde\varepsilon}(k)\|^{\text{}^{\scriptscriptstyle 2}}_{\text{}_{\scriptscriptstyle 2}}\big]\) and taking the expected value of (29) yields:

\[\begin{equation*} v(k)\leq\left(1\hspace{0pt}-\hspace{0pt}\frac{1}{\rho}\right)^{\mkern -3.5mu 2}\hspace{-2pt}v(k\hspace{-0.5pt}-\hspace{-0.5pt}1)+E\big\{\mu^2_{\text{}_{\textit{NL}}}(k)\|\boldsymbol{\tilde\xi}(k,0)\|^{\text{}^{\scriptscriptstyle 2}}_{\text{}_{\scriptscriptstyle 2}}\big\}.\hspace{7pt} \tag{30} \end{equation*}\]

The expected value of (11) is \(E\big[\textbf{R}(k)\big]=E\big\{\textbf{x}(i)\textbf{x}^T(i)\big\}=\textbf{R}_x\), the theoretical autocorrelation. The average eigenvalue of \(\textbf{R}(k)\) can be expressed as \(\lambda_{a}(k)\simeq \lambda_{x,a}\), where \(\lambda_{x,a}\) is the average eigenvalue of \(\textbf{R}_x\). Equating \(\text{tr}(\textbf{R}_x)=N\lambda_{x,a}\) and \(\text{tr}(\textbf{R}_x)=E\big[\|\textbf{x}(i)\|^{\text{}^{\scriptscriptstyle 2}}_{\text{}_{\scriptscriptstyle 2}}\big]=N\sigma_x^2\) leads to \(\lambda_{x,a}=\sigma_x^2\), which, together with \(\lambda_{a}(k)\simeq\lambda_{x,a}\), \(\mu_{\text{}_{\textit{NL}}}(k)\simeq\frac{1}{\lambda_{a}(k)}\), and (25b), is substituted into (30) to obtain:

\[\begin{equation*} v(k)\leq\left(1\hspace{0pt}-\hspace{0pt}\frac{1}{\rho}\right)^{\mkern -3.5mu 2}\hspace{-2pt}v(k\hspace{-0.5pt}-\hspace{-0.5pt}1)+\frac{N\sigma_o^2}{L\sigma_x^2}. \tag{31} \end{equation*}\]

It can be deduced from (31) that the steady-state weight error vector norm is upper bounded as follows:

\[\begin{equation*} v(\infty)\leq\left(\frac{\rho^2}{2\rho-1}\right)\frac{N\sigma_o^2}{L\sigma_x^2}. \hspace{20pt} \tag{32} \end{equation*}\]

The upper bound of \(v(\infty)\) increases as \(\rho\) increases. However, increasing \(L\) decreases the upper bound of \(v(\infty)\) and compensates for the increase due to \(\rho\), which reveals the capability of the NLTVSS to improve the performance of ENLMS. The inequalities (31) and (32) show that, for all finite values of \(\rho\), the weight-error vector norm is stable.
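
A small numeric illustration of the bound (32): the eigenvalue spreads and \(N\) below are taken from the letter's simulations, while \(\sigma_o^2\) and \(\sigma_x^2\) are placeholder values of our own (the input variance is not stated explicitly); the point is only that the bound grows with \(\rho\) and shrinks as \(L\) grows.

```python
def msd_upper_bound(rho, N, L, sigma_o2, sigma_x2):
    """Upper bound on the steady-state weight-error norm v(inf), eq. (32)."""
    return (rho**2 / (2.0 * rho - 1.0)) * N * sigma_o2 / (L * sigma_x2)

# rho values from Fig. 5, N = 65; sigma_o2 = 1e-3 and sigma_x2 = 1.0 are placeholders.
for L in (3, 15, 33):
    bounds = [msd_upper_bound(rho, N=65, L=L, sigma_o2=1e-3, sigma_x2=1.0)
              for rho in (264.9, 1025.2, 2159.5)]
    print(L, [round(b, 3) for b in bounds])
```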

2.1.1  Experimental Results

The performance of the proposed ENLMS algorithm is evaluated and compared with that of the APA, RLS, and NLMS algorithms in five simulations, whose parameters are given in the captions of Figs. 1-5. The total multiplications per iteration of the algorithms are as follows: ENLMS: \((4L+3)N\); APA: \((L^2+2L)N+L^3+L\) [9]; RLS: \(2N^2+2N\) [14]; NLMS: \(2N+3\) [9]. The performance metric used in the SI application is the mean-square deviation (MSD), in dB, defined as \(10\log_{10}(E[\|\textbf{w}_o-\textbf{w}(k)\|^{\text{}^{\scriptscriptstyle 2}}_{\text{}_{\scriptscriptstyle 2}}])\). Each simulation is averaged over \(500\) Monte Carlo runs. The input signal used in simulations \(1\) through \(4\) is generated from the AR(4) process \(x(k)=1.79x(k-1)-1.85x(k-2)+1.27x(k-3)-0.41x(k-4)+v_o(k)\), where \(v_o(k)\) is white Gaussian noise (WGN) with \(\mathcal{N}(0,0.1481)\). The measurement noise is zero-mean additive WGN with variance \(\sigma_m^2=10^{-3}\). The signal-to-noise ratio (SNR), defined as \(10\log_{10}(E[x^2(k)]/E[m^2(k)])=10\log_{10}(\sigma_x^2/\sigma_m^2)\), is \(30\) dB in all simulations. The symmetric IR used in all simulations has length \(N=65\). In simulations \(1\) through \(4\), the computational complexities (numbers of multiplications) of ENLMS and APA are set to almost the same values for a fair comparison, and all compared algorithms are set to initially converge to nearly the same misalignment in each simulation.
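
The per-iteration multiplication counts quoted above can be checked directly; for example, ENLMS with \(L=21\) and \(N=65\) gives \(5655\), versus \(8580\) for RLS, matching the figures cited in the discussion of the third simulation below. The APA figure is our own evaluation of the quoted formula with the parameters of Fig. 3.

```python
def mults_enlms(N, L): return (4 * L + 3) * N        # (4L + 3)N
def mults_apa(N, L):   return (L**2 + 2 * L) * N + L**3 + L
def mults_rls(N):      return 2 * N**2 + 2 * N
def mults_nlms(N):     return 2 * N + 3

# Parameters of the third simulation (Fig. 3), N = 65:
print(mults_enlms(65, 21))  # 5655  (ENLMS, L = 21)
print(mults_apa(65, 8))     # 5720  (APA, L = 8, nearly matched to ENLMS)
print(mults_rls(65))        # 8580  (RLS)
print(mults_nlms(65))       # 133   (NLMS)
```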

Fig. 1  MSD convergence rate: ENLMS with L=3; APA with L=3, \(\mu_o=0.125\); NLMS with \(\mu_o=0.9\); RLS with \(\delta=0.001\), \(\lambda=0.9995\).

The first simulation (Fig. 1) shows the convergence rate of ENLMS at \(L=3\) compared with the other algorithms. ENLMS converges about \(2500\) steps faster than NLMS, but slower than RLS and APA. In the second simulation (Fig. 2), ENLMS converges faster than its counterparts APA and NLMS by about \(1000\) and \(5500\) steps, respectively, although RLS still converges faster than ENLMS. This performance enhancement occurs when the number of data reuses of ENLMS is increased to \(L=12\).

In the third simulation (Fig. 3), the convergence rate of ENLMS is approximately the same as that of RLS. Although RLS initially converges fast in the transient state, it slows down close to steady-state convergence, whereas ENLMS initially converges slowly in the transient state but converges fast as it approaches steady-state. Consequently, ENLMS and RLS converge at nearly the same speed at the onset of steady-state conditions. Moreover, with the number of data reuses set to \(L=21\), ENLMS is superior to RLS in terms of computational complexity, since the number of multiplications of ENLMS is \(5655\), significantly less than the \(8580\) multiplications of RLS. In general, for \(L<N/2\), the number of multiplications of ENLMS is less than that of RLS. This simulation shows that when \(L\) is sufficiently increased and \(L<N/2\), ENLMS exhibits convergence performance close to that of RLS at lower computational complexity. ENLMS also converges faster than its counterparts APA and NLMS in this setting.

In the fourth simulation (Fig. 4), when the number of data reuses of ENLMS is drastically increased to \(L=33\), ENLMS converges to steady-state at \(k\simeq1500\), which is about \(1200\) and \(6700\) steps faster than APA and NLMS, respectively. The performance of ENLMS at \(L=33\) is close to that of the benchmark RLS algorithm.

In the fifth simulation (Fig. 5), the ENLMS and RLS algorithms are compared for varying eigenvalue spreads. For the AR(4) input \(x(k)=1.352x(k-1)-1.338x(k-2)+0.662x(k-3)-0.24x(k-4)+v_o(k)\), the eigenvalue spread is \(\rho=264.9\); for \(x(k)=1.79x(k-1)-1.85x(k-2)+1.27x(k-3)-0.41x(k-4)+v_o(k)\), it is \(\rho=1025.2\); and for \(x(k)=1.70x(k-1)-1.95x(k-2)+1.27x(k-3)-0.41x(k-4)+v_o(k)\), it is \(\rho=2159.5\). In all cases, \(v_o(k)\) is WGN with \(\mathcal{N}(0,0.1481)\), the measurement noise is zero-mean additive WGN with variance \(\sigma_m^2=10^{-3}\), and the SNR is \(30\) dB. At the relatively low eigenvalue spread \(\rho=264.9\), RLS converges slightly faster than ENLMS. However, the convergence rate of ENLMS approaches that of RLS as the eigenvalue spread \(\rho\) is increased: at \(\rho=1025.2\) it becomes equivalent to that of RLS, and at \(\rho=2159.5\) it is slightly higher than that of RLS (Fig. 5). For a fair comparison of convergence rates, both methods are made to converge to almost the same misalignment as the eigenvalue spread is changed.

Fig. 2  MSD convergence rate: ENLMS with L=12; APA with L=6, \(\mu_o=0.0212\); NLMS with \(\mu_o=1.24\); RLS with \(\delta=4.7\), \(\lambda=0.999\).

Fig. 3  MSD convergence rate: ENLMS with L=21; APA with L=8, \(\mu_o=0.016\); NLMS with \(\mu_o=1.38\); RLS with \(\delta=3.9\), \(\lambda=0.9987\).

Fig. 4  MSD convergence rate: ENLMS with L=33; APA with L=10, \(\mu_o=0.136\); NLMS with \(\mu_o=1.45\); RLS with \(\delta=3.2\), \(\lambda=0.9984\).

Fig. 5  MSD convergence rate: ENLMS with L=15; \(\rho=264.9\), \(\rho=1025.2\), \(\rho=2159.5\); RLS with \(\delta=2.6\), \(\lambda=0.999\) for \(\rho=2159.5\), \(\delta=3.9\), \(\lambda=0.9989\) for \(\rho=1025.2\), \(\delta=3.0\), \(\lambda=0.9965\) for \(\rho=264.9\).

Overall, it can be observed that, as \(L\) is increased, the convergence rate of ENLMS improves, at the cost of increased misalignment, which also occurs with the other compared algorithms. In addition, the simulations show that, as \(L\) is increased, the relative convergence-rate difference between ENLMS and the compared algorithms shifts in favour of ENLMS.

3.  Conclusion

A new extended NLMS algorithm based on data reuse of the tap-input and the desired response is proposed. A novel non-linear time-varying step-size formula is derived. The new ENLMS algorithm is analyzed theoretically and shown to be stable. Simulations illustrate that, as the number of data reuses is increased, the convergence speed of ENLMS becomes faster than that of its counterpart algorithms and comparable to that of RLS, when the impulse response is symmetric, as in channel equalization applications.

References

[1] S. Haykin, Adaptive Filter Theory, 4th ed., Prentice Hall, Upper Saddle River, NJ, U.S.A., 2002.

[2] M.H. Hayes, Statistical Digital Signal Processing and Modeling, 1st ed., John Wiley & Sons, Hoboken, NJ, U.S.A., 1996.

[3] E. Bjarnason, “Analysis of the filtered-X LMS algorithm,” IEEE Trans. Speech Audio Process., vol.3, no.6, pp.504-514, 1995.

[4] D. Feng, Z. Bao, and L. Jiao, “Total least mean squares algorithm,” IEEE Trans. Signal Process., vol.46, no.8, pp.2122-2130, 1998.

[5] D. Feng, Z. Bao, and L.C. Jiao, “Interference normalized least mean square algorithm,” IEEE Signal Process. Lett., vol.14, no.12, pp.988-991, 2007.

[6] T.K. Wao, Z. Bao, and L.C. Jiao, “Fast hierarchical least mean square algorithm,” IEEE Signal Process. Lett., vol.8, no.11, pp.289-291, 2001.

[7] S.S. Bhattacharjee, D. Ray, and N.V. George, “Adaptive modified versoria zero attraction least mean square algorithms,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol.67, no.12, pp.3602-3606, 2020.

[8] B. Widrow and S. Stearns, Adaptive Signal Processing, Prentice Hall, Englewood Cliffs, NJ, 1985.

[9] F. Huang and J. Zhang, “A family of gain-combined proportionate adaptive filtering algorithms for sparse system identification,” Digital Signal Processing, vol.70, no.6, pp.49-58, 2017.

[10] P.S.R. Diniz, Adaptive Filtering: Algorithms and Practical Implementation, 4th ed., Springer, New York, U.S.A., 2013.

[11] H. Zhao, B. Liu, and P. Song, “Variable step-size affine projection maximum correntropy criterion adaptive filter with correntropy induced metric for sparse system identification,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol.67, no.11, pp.2782-2786, 2020.

[12] M.S. Ahmad, O. Kukrer, and A. Hocanin, “Recursive inverse adaptive filtering algorithm,” Digital Signal Processing, vol.21, no.4, pp.491-496, 2011.

[13] M.S. Salman, O. Kukrer, and A. Hocanin, “Recursive inverse algorithm: Mean-square-error analysis,” Digital Signal Processing, vol.66, no.7, pp.10-17, 2017.

[14] C.E. Iliescu, C. Paleologu, J. Benesty, C. Stanciu, C. Anghel, and S. Ciochina, “Recursive least-squares algorithms for the identification of low-rank systems,” IEEE/ACM Trans. Audio, Speech, Language Process., vol.27, no.5, pp.903-917, 2019.

Authors

Hakan BERCAG
  Eastern Mediterranean University
Osman KUKRER
  Eastern Mediterranean University
Aykut HOCANIN
  Eastern Mediterranean University
