1. Introduction
In wireless sensor networks, a crucial task is the estimation of parameters from noise-corrupted measurements collected at sensor nodes. The measurements can be formulated linearly through a known observation matrix \(\mathbf{H}\) and the parameters to be estimated. The selection of sensors has a critical impact on the estimation accuracy since the information measured by the sensors depends heavily on their locations. Sensor selection has been studied in [1]-[4]: selection methods based on convex relaxation [1] and cross-entropy optimization [2] were developed, but at a high computational cost, especially for large sensor networks. Since a greedy approach keeps the complexity feasible, much effort has been devoted to solving the selection problem in a greedy manner despite its suboptimality. To achieve a further complexity reduction, the log-determinant of the inverse estimation error covariance matrix was greedily maximized, and its monotone submodularity was exploited to derive a simple analytic rule [3]. A greedy method was also presented to directly minimize the mean squared estimation error (MSE) by applying the QR factorization [4].
Notice that most of the previous research assumed uncorrelated measurement noise (equivalently, a noise covariance matrix with zero or negligible off-diagonal elements). The presence of correlated noise complicates the selection problem: specifically, the inverse estimation error covariance matrix is no longer expressed as a linear function of the selected sensors [5], [6]. The selection problem under correlated noise was addressed for maximum likelihood estimation (MLE), and centralized and distributed algorithms for sparsity-aware sensor selection were presented using convex relaxation [5]. To minimize the MSE in the estimation of random parameters, two sensor selection methods were developed using convex relaxation and a greedy approach [6]. Recently, efficient greedy methods were proposed via different derivations to maximize the log-determinant of the inverse error covariance matrix, yielding improved estimation accuracy when measurements are corrupted by correlated noise [7], [8].
In this paper, we consider a situation where sensors collect linear measurements affected by correlated noise. We conduct linear Bayesian estimation of the parameters by using the measurements at a subset of sensors, and we pursue a direct minimization of the estimation error to find the best subset \(S\). Note that the estimation error covariance matrix is a function of the matrix \(\mathbf{H}_S\), whose rows are selected from \(\mathbf{H}\), and the covariance matrix \(\mathbf{K}_S\), whose entries are the noise covariances between the selected nodes. We simplify this metric by factoring \(\mathbf{H}_S\) and \(\mathbf{K}_S\) via the QR and LU factorizations, respectively, and derive an analytic rule that facilitates the iterative greedy selection of the minimizing node. We examine the performance of various selection methods through numerical experiments, which show that the proposed method achieves a competitive estimation accuracy in comparison with previous methods.
This paper is organized as follows. The problem is formulated in Sect. 2. In Sect. 3, the QR and LU factorizations are employed to simplify the estimation error and an analytic selection rule is derived. Numerical experiments are presented in Sect. 4 and conclusions are drawn in Sect. 5.
2. Problem Formulation
In wireless sensor networks, the measurement vector \(\mathbf{y}\) with entries \(y_i, i\in \mathcal{V}=\{1,\dots,N\}\), collected by \(N\) sensor nodes deployed in a sensor field, is assumed to be corrupted by additive correlated noise \(\mathbf{w}=[w_1 \cdots w_N]^\top\). In generating the measurements, we also assume a linear observation model with the known \(N\times p\) full column-rank observation matrix \(\mathbf{H}\) and the parameter vector \(\mathbf{\theta}\in \mathbb{R}^p\) to be estimated:
\[\begin{equation*} \mathbf{y}=\mathbf{H}\mathbf{\theta}+\mathbf{w} \tag{1} \end{equation*}\]
where \(\mathbf{H}\) consists of \(N\) row vectors \(\mathbf{h}_i^\top, i\in \mathcal{V}\), and the measurement noise \(\mathbf{w}\), independent of \(\mathbf{\theta}\), is assumed to follow the normal distribution \(\mathcal{N}(\mathbf{0},\mathbf{K})\) with a covariance matrix \(\mathbf{K}\) of rank \(N\). We consider the estimation of the random parameter \(\mathbf{\theta}\) drawn from \(\mathcal{N}(\mathbf{0},\mathbf{\Sigma_{\theta}})\); the method presented in this work can also be applied to the case of a deterministic parameter by letting \(\mathbf{\Sigma}_{\theta}^{-1}=\mathbf{0}\). We aim to select the \(p\) nodes in the set \(S\) that minimize the estimation error computed from the selected measurement vector \(\mathbf{y}_S\) with the \(p\) entries \(y_i, i\in S\).
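For concreteness, the model (1) can be simulated as in the following minimal numpy sketch; the sizes, the prior scale, and the construction of \(\mathbf{K}\) are illustrative assumptions, not the experimental setup of Sect. 4.

```python
import numpy as np

rng = np.random.default_rng(0)
N, p = 500, 20                 # number of sensors / parameter dimension
sigma_theta = 0.3              # prior standard deviation of theta (illustrative)

H = 0.1 * rng.standard_normal((N, p))     # known full column-rank observation matrix
A = rng.standard_normal((N, N))
K = A @ A.T / N + 1e-6 * np.eye(N)        # an illustrative full-rank noise covariance

theta = sigma_theta * rng.standard_normal(p)   # theta ~ N(0, sigma_theta^2 I_p)
w = rng.multivariate_normal(np.zeros(N), K)    # correlated noise w ~ N(0, K)
y = H @ theta + w                              # measurement model (1)
```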
In this work, we perform linear Bayesian estimation; the estimation error covariance matrix \(\mathbf{\Sigma}(S)\) and the optimal estimator \(\hat{\mathbf{\theta}}\) (e.g., the maximum a posteriori (MAP) estimator or the minimum mean squared error (MMSE) estimator) are given as follows [3], [9]:
\[\begin{align} & \mathbf{\Sigma}(S)=\left(\mathbf{\Sigma}_\mathbf{\theta}^{-1}+\mathbf{H}_S^\top \mathbf{K}_S^{-1}\mathbf{H}_S\right)^{-1} \tag{2} \\ & \hat{\mathbf{\theta}}=\left(\mathbf{\Sigma}_\mathbf{\theta}^{-1}+\mathbf{H}_S^\top \mathbf{K}_S^{-1}\mathbf{H}_S\right)^{-1}\mathbf{H}_S^\top \mathbf{K}_S^{-1}\mathbf{y}_S \tag{3} \end{align}\]
where \(\mathbf{H}_S\) is the \(|S|\times p\) matrix with rows \(\mathbf{h}_i^\top, i\in S\) and \(\mathbf{K}_S\) the \(|S|\times|S|\) covariance matrix with entries \(k_{ij}=E[w_iw_j], i,j\in S\).
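A direct transcription of (2) and (3) into numpy may read as follows; it assumes the prior \(\mathbf{\Sigma_{\theta}}=\sigma_{\theta}^2\mathbf{I}_p\) (the special case adopted in Sect. 3), and the helper name bayes_estimate is ours.

```python
def bayes_estimate(H, K, y, S, sigma_theta):
    """Evaluate (2) and (3) for a subset S of sensor indices,
    assuming the prior covariance Sigma_theta = sigma_theta^2 I."""
    S = list(S)
    H_S = H[S, :]                      # |S| x p matrix with rows h_i^T, i in S
    K_S = K[np.ix_(S, S)]              # |S| x |S| noise covariance of the subset
    info = np.eye(H.shape[1]) / sigma_theta**2 + H_S.T @ np.linalg.solve(K_S, H_S)
    Sigma_S = np.linalg.inv(info)                               # error covariance (2)
    theta_hat = Sigma_S @ (H_S.T @ np.linalg.solve(K_S, y[S]))  # MMSE estimate (3)
    return Sigma_S, theta_hat
```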
Now, we formulate the sensor selection problem in which the best set \(S^*\) of sensor nodes is constructed so as to minimize the mean squared estimation error \(\textrm{MSE}(S)\):
\[\begin{equation*} S^*=\arg\min_{S, |S|=p} \textrm{MSE}(S)=\arg\min_{S, |S|=p} tr\left[\mathbf{\Sigma}(S)\right] \tag{4} \end{equation*}\]
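Problem (4) is combinatorial: a brute-force solver such as the sketch below, which reuses the hypothetical bayes_estimate helper above, must enumerate \(\binom{N}{p}\) subsets and is tractable only for toy sizes; this motivates the greedy approach adopted next.

```python
from itertools import combinations

def exhaustive_selection(H, K, sigma_theta, p):
    """Solve (4) by enumerating all C(N, p) subsets; tractable only for toy N."""
    N = H.shape[0]
    best_S, best_mse = None, np.inf
    for S in combinations(range(N), p):
        Sigma_S, _ = bayes_estimate(H, K, np.zeros(N), S, sigma_theta)
        mse = np.trace(Sigma_S)        # tr[Sigma(S)] in (4)
        if mse < best_mse:
            best_S, best_mse = S, mse
    return best_S, best_mse
```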
We tackle the problem by adopting a greedy approach in which one sensor node is iteratively selected so as to minimize the intermediate MSE denoted by \(\textrm{MSE}(S_i)=tr\left[\mathbf{\Sigma}(S_i)\right]\), where \(S_i\) is the set of \(i\) sensors selected up to the \(i\)-th iteration. Note that the node at the \((i+1)\)-th iteration is selected from the set \(S_i^C\equiv\mathcal{V}\setminus S_i\) consisting of the remaining sensors. Specifically,
\[\begin{align} & j^*\,{=}\,\arg\min_{S_{i+1}=S_i+\{j\},j\in S_i^C} tr\left[\mathbf{\Sigma}(S_{i+1})\right] \nonumber\\ &\hphantom{j^*}\,{=}\,\arg\min_{S_{i+1}=S_i+\{j\},j\in S_i^C} tr {\left[ \left(\mathbf{\Sigma}_\mathbf{\theta}^{-1} \,{+}\, \mathbf{H}_{S_{i+1}}^\top \mathbf{K}_{S_{i+1}}^{-1}\mathbf{H}_{S_{i+1}}\right)^{-1}\right]} \tag{5} \\ & S_{i+1}=S_i+\{j^*\} \tag{6} \end{align}\]
where \(\mathbf{H}_{S_{i+1}}\) and \(\mathbf{K}_{S_{i+1}}\) are constructed as follows:
\[\begin{equation*} \mathbf{H}_{S_{i+1}}^\top= \begin{bmatrix} \mathbf{H}_{S_i}^\top & \mathbf{h}_j \end{bmatrix},\quad \mathbf{K}_{S_{i+1}}= \begin{bmatrix} \mathbf{K}_{S_i} & \mathbf{k}\\ \mathbf{k}^\top & k \end{bmatrix} \tag{7} \end{equation*}\] |
where \(\mathbf{k}_{i+1}^\top\equiv\left[\mathbf{k}^\top\quad k\right]\) with \(\mathbf{k}^\top=[k_{(1)j}\quad\cdots\quad k_{(i)j}]\) and \(k=k_{jj}\). Here, \(k_{(i)j}=E[w_{(i)}w_j]\), where the subscript \((i)\) denotes the index of the node selected at the \(i\)-th iteration. The selection process in (5) and (6) continues until \(|S_i|\) reaches \(p\).
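The greedy recursion (5)-(7) can be prototyped naively as in the sketch below, recomputing \(tr[\mathbf{\Sigma}(S_{i+1})]\) from scratch for every candidate; the factorized updates derived in Sect. 3 remove exactly this per-candidate inversion cost. This is a baseline sketch, not the proposed algorithm.

```python
def greedy_selection_naive(H, K, sigma_theta, p):
    """Greedy rule (5)-(6) evaluated by brute force: tr[Sigma(S_{i+1})] is
    recomputed in full for every candidate j at every iteration."""
    N = H.shape[0]
    S = []
    for _ in range(p):
        best_j, best_mse = None, np.inf
        for j in set(range(N)) - set(S):           # candidates in S_i^C
            Sigma, _ = bayes_estimate(H, K, np.zeros(N), S + [j], sigma_theta)
            mse = np.trace(Sigma)                  # intermediate MSE(S_{i+1})
            if mse < best_mse:
                best_j, best_mse = j, mse
        S.append(best_j)                           # update (6)
    return S
```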
3. Proposed Sensor Selection Algorithm
We first manipulate the metric in (5) by performing the QR factorization \(\mathbf{H}_{S_{i+1}}^\top=\mathbf{Q}\bar{\mathbf{R}}_{i+1}\) and the LU factorization of the symmetric matrix \(\mathbf{K}_{S_{i+1}}=\mathbf{M}_{S_{i+1}}\mathbf{M}_{S_{i+1}}^\top\) (i.e., its Cholesky factorization), where \(\mathbf{Q}\) is the \(p\times p\) orthogonal matrix with \(p\) column vectors \(\mathbf{q}_j\), \(\bar{\mathbf{R}}_{i+1}\) the \(p\times (i+1)\) matrix, and \(\mathbf{M}_{S_{i+1}}\) the \((i+1)\times(i+1)\) lower triangular matrix. Then, assuming \(\mathbf{\Sigma}_\mathbf{\theta}=\sigma_{\theta}^2\mathbf{I}_p\) where \(\mathbf{I}_p\) is the \(p\times p\) identity matrix, we have
\[\begin{align} &\textrm{MSE}(S_{i+1})&\nonumber\\ &=tr\left[\left(\frac{1}{\sigma_{\theta}^2}\mathbf{I}_p+\mathbf{Q}\bar{\mathbf{R}}_{i+1}(\mathbf{M}_{S_{i+1}}\mathbf{M}_{S_{i+1}}^\top)^{-1}\bar{\mathbf{R}}_{i+1}^\top\mathbf{Q}^\top\right)^{-1}\right]&\nonumber\\ &=tr\left[\left(\frac{1}{\sigma_{\theta}^2}\mathbf{I}_p+\bar{\mathbf{R}}_{i+1}(\mathbf{M}_{S_{i+1}}\mathbf{M}_{S_{i+1}}^\top)^{-1}\bar{\mathbf{R}}_{i+1}^\top\right)^{-1}\right]& \tag{8} \end{align}\]
where (8) follows from the cyclic property of the trace operation and \(\mathbf{Q}^\top=\mathbf{Q}^{-1}\). Notably, \(\bar{\mathbf{R}}_{i+1}\) can be expressed by using the \((i+1)\times(i+1)\) upper triangular matrix \(\mathbf{R}_{i+1}\), which is constructed from the \(\mathbf{R}_i\) obtained at the previous iteration as follows:
\[\begin{equation*} \bar{\mathbf{R}}_{i+1}= \begin{bmatrix} \mathbf{R}_{i+1}\\ \mathbf{0}_{(p-i-1)\times (i+1)} \end{bmatrix},\quad \mathbf{R}_{i+1}=\begin{bmatrix} \mathbf{R}_i & \mathbf{r}_i\\ \mathbf{0} & r_{i+1} \end{bmatrix} \tag{9} \end{equation*}\]
where \(\mathbf{0}_{a\times b}\) denotes the \(a\times b\) zero matrix and \(\mathbf{r}_{i+1}^\top=[\mathbf{r}_i^\top\quad r_{i+1}]\) can be obtained from the QR factorization of \(\mathbf{H}_{S_{i+1}}^\top\). Furthermore, \(\mathbf{M}_{S_{i+1}}\) and \(\mathbf{M}_{S_{i+1}}^{-1}\) can be computed from the \(\mathbf{M}_{S_i}\) and \(\mathbf{M}_{S_i}^{-1}\) of the previous iteration, respectively [8]:
\[\begin{align} &\mathbf{M}_{S_{i+1}}=\begin{bmatrix} \mathbf{M}_{S_i} & \mathbf{0}\\ \mathbf{m}_i^\top& m_{i+1} \end{bmatrix}& \tag{10} \\ &\mathbf{M}_{S_{i+1}}^{-1}=\begin{bmatrix} \mathbf{M}_{S_i}^{-1} & \mathbf{0}\\ (\mathbf{m}_i^{inv})^\top& m_{i+1}^{-1} \end{bmatrix},\quad (\mathbf{m}_i^{inv})^\top=-\frac{\mathbf{m}_i^\top\mathbf{M}_{S_i}^{-1}}{m_{i+1}}& \tag{11} \end{align}\]
where \(\mathbf{M}_{S_i}\mathbf{m}_i=\mathbf{k}, \quad m_{i+1}^2=k-\parallel\mathbf{m}_i\parallel^2\). Denoting \(\mathbf{R}_{i+1}\left(\mathbf{M}_{S_{i+1}}^{-1}\right)^\top\) by the upper triangular matrix \(\mathbf{T}_{i+1}\) and plugging (9), (10) and (11) into (8), we obtain
\[\begin{align} &\textrm{MSE}(S_{i+1})&\nonumber\\ &=tr\begin{bmatrix} \left(\mathbf{T}_{i+1}\mathbf{T}_{i+1}^\top+\frac{1}{\sigma_{\theta}^2}\mathbf{I}_{i+1}\right)^{-1} & \mathbf{0}_{(p-i-1)\times (i+1)}^\top\\ \mathbf{0}_{(p-i-1)\times (i+1)} & \frac{1}{\sigma_{\theta}^2}\mathbf{I}_{p-i-1} \end{bmatrix}& \tag{12} \\ &\propto tr\left[\left(\mathbf{T}_{i+1}\mathbf{T}_{i+1}^\top+\frac{1}{\sigma_{\theta}^2}\mathbf{I}_{i+1}\right)^{-1}\right]& \tag{13} \end{align}\]
where the proportionality in (13) holds up to the additive constant \(tr(\frac{1}{\sigma_{\theta}^2}\mathbf{I}_{p-i-1})\), which is irrelevant to finding the minimizing sensor at the \((i+1)\)-th iteration. Note that \(\mathbf{T}_{i+1}\equiv\mathbf{R}_{i+1}\left(\mathbf{M}_{S_{i+1}}^{-1}\right)^\top\) can also be written in terms of the matrices of the previous iteration:
\[\begin{equation*} \mathbf{T}_{i+1} =\begin{bmatrix} \mathbf{R}_i(\mathbf{M}_{S_i}^{-1})^\top & \frac{-\mathbf{R}_i(\mathbf{M}_{S_i}^{-1})^\top\mathbf{m}_i+\mathbf{r}_i}{m_{i+1}}\\ \mathbf{0} & r_{i+1}/m_{i+1} \end{bmatrix} \equiv\begin{bmatrix} \mathbf{T}_i & \mathbf{t}_i\\ \mathbf{0} & t_{i+1} \end{bmatrix} \tag{14} \end{equation*}\]
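The recursions (9)-(11) and (14) can be summarized in the following sketch (hypothetical names; it assumes the factors for \(S_i\) with \(i\ge 1\) are already available, and, for brevity, it solves the triangular system \(\mathbf{M}_{S_i}\mathbf{m}_i=\mathbf{k}\) directly instead of carrying \(\mathbf{M}_{S_i}^{-1}\) via (11)).

```python
import numpy as np
from scipy.linalg import solve_triangular

def extend_factors(Q, T, M, h_j, k_vec, k_jj):
    """One step of (9)-(11) and (14) when sensor j joins S_i (i >= 1).
    Q : p x i matrix holding the first i orthonormal columns of the QR
        factorization of H_{S_i}^T;   T : i x i upper triangular T_i;
    M : i x i lower-triangular factor of K_{S_i};
    h_j : candidate row of H;  k_vec : [k_(1)j, ..., k_(i)j];  k_jj : noise variance."""
    # QR update (9): r_i collects the projections of h_j on the current basis
    r_i = Q.T @ h_j
    resid = h_j - Q @ r_i
    r_new = np.linalg.norm(resid)
    Q_next = np.hstack([Q, (resid / r_new)[:, None]])

    # LU (Cholesky) update (10): solve M m_i = k, then m_{i+1}^2 = k_jj - ||m_i||^2
    m_i = solve_triangular(M, k_vec, lower=True)
    m_new = np.sqrt(k_jj - m_i @ m_i)
    M_next = np.block([[M, np.zeros((len(m_i), 1))],
                       [m_i[None, :], np.array([[m_new]])]])

    # T update (14): t_i = (r_i - T_i m_i)/m_{i+1},  t_{i+1} = r_{i+1}/m_{i+1}
    t_i = (r_i - T @ m_i) / m_new
    t_new = r_new / m_new
    T_next = np.block([[T, t_i[:, None]],
                       [np.zeros((1, len(t_i))), np.array([[t_new]])]])
    return Q_next, T_next, M_next, t_i, t_new
```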
Now, we present a theorem which provides a simple criterion to find the sensor node minimizing (13).
Theorem 1. The sensor node at the \((i+1)\)-th iteration that minimizes the MSE in (13) is determined as follows:
\[\begin{equation*} j^*=\arg\min_{j\in S_i^C} \frac{\sigma_{\theta}^2(1+\kappa)-\parallel\left(\mathbf{T}_i\mathbf{T}_i^\top+\frac{1}{\sigma_{\theta}^2}\mathbf{I}_{i}\right)^{-1}\mathbf{t}_i\parallel^2}{t_{i+1}^2\sigma_{\theta}^2+1+\kappa} \tag{15} \end{equation*}\]
where \(\kappa=\mathbf{t}_i^\top\left(\mathbf{T}_i\mathbf{T}_i^\top+\frac{1}{\sigma_{\theta}^2}\mathbf{I}_{i}\right)^{-1}\mathbf{t}_i\) (not to be confused with \(k=k_{jj}\) in (7)).
Proof. We first denote \(\mathbf{T}_i\mathbf{T}_i^\top+\frac{1}{\sigma_{\theta}^2}\mathbf{I}_{i}\) by \(\mathbf{P}_i\) for notational simplicity. Then, we employ the derivation process in [4] to obtain
\[\begin{align} &\left(\mathbf{T}_{i+1}\mathbf{T}_{i+1}^\top+\frac{1}{\sigma_{\theta}^2}\mathbf{I}_{i+1}\right)^{-1}&\nonumber\\ &=\begin{bmatrix} \mathbf{P}_i^{-1}-\frac{\mathbf{P}_i^{-1}\mathbf{t}_i\mathbf{t}_i^\top\mathbf{P}_i^{-1}}{t_{i+1}^2\sigma_{\theta}^2+1+\kappa} & -\frac{t_{i+1}\sigma_{\theta}^2\mathbf{P}_i^{-1}\mathbf{t}_i}{t_{i+1}^2\sigma_{\theta}^2+1+\kappa}\\ -\frac{t_{i+1}\sigma_{\theta}^2\mathbf{t}_i^\top\mathbf{P}_i^{-1}}{t_{i+1}^2\sigma_{\theta}^2+1+\kappa} & \frac{\sigma_{\theta}^2(1+\kappa)}{t_{i+1}^2\sigma_{\theta}^2+1+\kappa} \end{bmatrix}& \tag{16} \end{align}\]
Thus, the MSE is given by
\[\begin{align} \textrm{MSE}(S_{i+1})&\propto tr\left[\mathbf{P}_i^{-1}\right]-\frac{tr\left[\mathbf{P}_i^{-1}\mathbf{t}_i\mathbf{t}_i^\top\mathbf{P}_i^{-1}\right]}{t_{i+1}^2\sigma_{\theta}^2+1+\kappa}+\frac{\sigma_{\theta}^2(1+\kappa)}{t_{i+1}^2\sigma_{\theta}^2+1+\kappa} \tag{17} \\ &\propto \frac{\sigma_{\theta}^2(1+\kappa)-\parallel\mathbf{P}_i^{-1}\mathbf{t}_i\parallel^2}{t_{i+1}^2\sigma_{\theta}^2+1+\kappa} \tag{18} \end{align}\]
where the first term in (17) is irrelevant in finding the minimizing node at the \((i+1)\)-th iteration and \(tr\left[\mathbf{P}_i^{-1}\mathbf{t}_i\mathbf{t}_i^\top\mathbf{P}_i^{-1}\right]=\parallel\mathbf{P}_i^{-1}\mathbf{t}_i\parallel^2\). \(\tag*{◻}\)
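Putting the pieces together, a sketch of the full selection procedure driven by the rule (15) may look as follows; it reuses the extend_factors sketch above (and its imports), and the handling of the first iteration is our assumption since it is not spelled out in the text.

```python
def greedy_selection_proposed(H, K, sigma_theta, p):
    """Greedy selection by the analytic rule (15); a sketch, not the
    authors' reference implementation."""
    N = H.shape[0]
    Ip = np.eye(H.shape[1])
    # first iteration (assumed): evaluate tr[Sigma({j})] in (2) for each j
    mse1 = [np.trace(np.linalg.inv(Ip / sigma_theta**2
                                   + np.outer(H[j], H[j]) / K[j, j]))
            for j in range(N)]
    j0 = int(np.argmin(mse1))
    S = [j0]
    Q = (H[j0] / np.linalg.norm(H[j0]))[:, None]           # first QR column
    M = np.array([[np.sqrt(K[j0, j0])]])                   # 1 x 1 factor of K_{S_1}
    T = np.array([[np.linalg.norm(H[j0]) / np.sqrt(K[j0, j0])]])

    while len(S) < p:
        i = len(S)
        P_inv = np.linalg.inv(T @ T.T + np.eye(i) / sigma_theta**2)  # P_i^{-1}
        best_j, best_score = None, np.inf
        for j in set(range(N)) - set(S):
            # per-candidate quantities t_i, t_{i+1} via (9), (10), (14)
            r_i = Q.T @ H[j]
            r_new = np.linalg.norm(H[j] - Q @ r_i)
            m_i = solve_triangular(M, K[S, j], lower=True)
            m_new = np.sqrt(K[j, j] - m_i @ m_i)
            t_i = (r_i - T @ m_i) / m_new
            t_new = r_new / m_new
            kappa = t_i @ P_inv @ t_i
            score = ((sigma_theta**2 * (1 + kappa) - np.sum((P_inv @ t_i)**2))
                     / (t_new**2 * sigma_theta**2 + 1 + kappa))      # rule (15)
            if score < best_score:
                best_j, best_score = j, score
        Q, T, M, _, _ = extend_factors(Q, T, M, H[best_j],
                                       K[S, best_j], K[best_j, best_j])
        S.append(best_j)
    return S
```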
For the performance comparison, we first consider two selection methods previously developed under the assumption of uncorrelated noise (equivalently, \(\mathbf{K}=\sigma^2\mathbf{I}_N\)), denoted by Greedy Sensor Selection (GSS) [3] and Greedy Sensor Selection based on QR factorization (GSS-QR) [4], respectively. The former maximizes the log-determinant of \(\mathbf{\Sigma}(S)^{-1}\) while the latter directly minimizes \(\textrm{MSE}(S)\). We also evaluate the method denoted by Greedy Sensor Selection under Correlated Noise (GSS-CN) [7], which maximizes the log-determinant of \(\mathbf{\Sigma}(S)^{-1}\) when correlated noise is considered. In Table 1, the complexities of the selection methods are provided for comparison, showing that the four methods have the same complexity order for the case of \(|S|=p\). In what follows, the selection methods are experimentally evaluated in terms of estimation performance and execution time.
4. Experimental Results
We conduct numerical experiments to evaluate the proposed selection algorithm in comparison with previous methods in two cases as follows:
- Case 1: random matrix \(\mathbf{H}\) with Gaussian i.i.d. entries \(h_{ij} \sim \mathcal{N}(0,0.1^2)\) and covariance matrix \(\mathbf{K}\) constructed from random uniform data distributed over \([0, 1]\)
- Case 2: random matrix \(\mathbf{H}\) with Gaussian i.i.d. entries \(h_{ij} \sim \mathcal{N}(0,0.1^2)\) and covariance matrix \(\mathbf{K}\) constructed from random normal data \(\sim \mathcal{N}(0,1)\)
We generate 200 measurement vectors \(\mathbf{y}\in \mathbb{R}^{N}, N=500\), for each of 50 realizations of \(\mathbf{H}\in\mathbb{R}^{N\times p}\), the covariance matrix \(\mathbf{K}\), and the parameter \(\mathbf{\theta} \sim \mathcal{N}(\mathbf{0},\sigma_{\theta}^2\mathbf{I}_p)\). We construct the two types of covariance matrix \(\mathbf{K}\) from random data generated from the uniform distribution (Case 1) or the normal distribution (Case 2), respectively. The random data are averaged over 5 neighboring nodes to induce correlation between nodes, which yields non-negligible off-diagonal elements of \(\mathbf{K}\). Then, we obtain the set \(S\) with \(|S|=p\) sensor nodes by applying the selection methods GSS, GSS-QR, GSS-CN, and the proposed algorithm. Note that since GSS and GSS-QR assume uncorrelated noise, they are given the diagonal matrix with the diagonal entries of \(\mathbf{K}\) as the noise covariance matrix. We generate the parameter vector \(\mathbf{\theta}\) for \(\sigma_{\theta}\) ranging from 0.1 to 0.5 and compute the normalized MSE \(\mathbb{E}\frac{\parallel\mathbf{\theta}-\hat{\mathbf{\theta}}\parallel^2}{\parallel\mathbf{\theta}\parallel^2}\) averaged over this range to validate the proposed algorithm.
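Since the text only outlines how \(\mathbf{K}\) is built, the following sketch records one plausible reading of the construction: draw i.i.d. data per node, average each node with its neighbors to induce spatial correlation, and take the sample covariance. The exact construction behind the reported figures may differ.

```python
import numpy as np

def make_correlated_cov(N, dist, n_samples=1000, window=5, rng=None):
    """One plausible construction of K: iid data per node, smoothed over
    `window` neighboring nodes, then the N x N sample covariance
    (full rank for n_samples > N)."""
    rng = rng or np.random.default_rng()
    X = (rng.random((n_samples, N)) if dist == "uniform"     # Case 1
         else rng.standard_normal((n_samples, N)))           # Case 2
    kern = np.ones(window) / window
    Xs = np.apply_along_axis(lambda row: np.convolve(row, kern, mode="same"), 1, X)
    return np.cov(Xs, rowvar=False)
```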
We first investigate the estimation performance by varying the parameter dimension \(p=|S|=10,15,\dots,40\). The MSEs are plotted in Figs. 1 and 2. As expected, the proposed method outperforms all of the previous methods because it directly optimizes the estimation error while accounting for the impact of correlated noise. It can also be noticed that the estimation accuracy improves as the parameter dimension increases, because linear Bayesian estimation with more measurements tends to suppress correlated noise more efficiently. To evaluate the complexity, we compare the execution times in seconds. Figure 3 shows the execution times in Case 1 for various parameter dimensions \(p=|S|=10,\ldots,40\). The proposed method incurs a reasonable increase in execution time compared with GSS-CN while achieving a superior estimation accuracy to the other methods.
5. Conclusions
We proposed a greedy sensor selection method which directly minimizes the estimation error under correlated noise. We manipulated the estimation error by applying the QR and LU factorizations and presented an iterative analytic selection rule. We validated the proposed algorithm through numerical experiments in comparison with previous methods, demonstrating a competitive estimation performance with a reasonable increase in complexity.
Acknowledgments
This study was supported by a research fund from Chosun University, 2023.
References
[1] S. Joshi and S. Boyd, “Sensor selection via convex optimization,” IEEE Trans. on Signal Processing, vol.57, no.2, pp.451-462, 2009.
[2] M. Naeem, S. Xue, and D.C. Lee, “Cross-entropy optimization for sensor selection problems,” International Symposium on Communications and Information Technologies (ISCIT), pp.396-401, 2009.
[3] M. Shamaiah, S. Banerjee, and H. Vikalo, “Greedy sensor selection: Leveraging submodularity,” 49th IEEE Conference on Decision and Control (CDC), pp.2572-2577, Dec. 2010.
[4] Y.H. Kim, “Greedy sensor selection based on QR factorization,” EURASIP Journal on Advances in Signal Processing, vol.2021, no.117, pp.1-13, Dec. 2021.
[5] H. Jamali-Rad, A. Simonetto, G. Leus, and X. Ma, “Sparsity-aware sensor selection for correlated noise,” 17th International Conference on Information Fusion (FUSION), pp.1-7, July 2014.
[6] S. Liu, S.P. Chepuri, M. Fardad, E. Masazade, G. Leus, and P.K. Varshney, “Sensor selection for estimation with correlated measurement noise,” IEEE Trans. on Signal Processing, vol.64, no.13, pp.3509-3522, July 2016.
[7] K. Yamada, Y. Saito, K. Nankai, T. Nonomura, K. Asai, and D. Tsubakino, “Fast greedy optimization of sensor selection in measurement with correlated noise,” Mechanical Systems and Signal Processing, vol.158, 107619, Sept. 2021.
[8] Y.H. Kim, “Greedy selection of sensors with measurements under correlated noise,” EURASIP Journal on Advances in Signal Processing, vol.2024, no.31, pp.1-14, March 2024.
[9] S.M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory, Prentice Hall, New Jersey, 1993.