Jiansheng BAI, Jinjie YAO, Yating HOU, Zhiliang YANG, Liming WANG
Modulated signal detection has been advancing rapidly in various wireless communication systems, as it is a core technology of spectrum sensing. To address the non-Gaussian statistics of noise in radio channels, especially its impulsive characteristics in the time/frequency domain, this paper proposes a method based on Information Geometric Difference Mapping (IGDM) to solve the signal detection problem under Alpha-stable distribution (α-stable) noise and to improve performance at low Generalized Signal-to-Noise Ratio (GSNR). Scale Mixtures of Gaussians are used to approximate the probability density function (PDF) of signals and to model the statistical moments of the observed data. Drawing on the principles of information geometry, we map the PDFs of different types of data into a manifold space. Through the statistical moment models, each signal is projected as a coordinate point within the manifold structure. We then design a dual-threshold mechanism based on the geometric mean and use the Kullback-Leibler divergence (KLD) to measure the information distance between coordinates. Numerical simulations and experiments were conducted to demonstrate the superiority of IGDM for detecting multiple modulated signals in non-Gaussian noise; the results show that IGDM remains adaptable and effective under extremely low GSNR.
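As a rough illustration of the KLD-based detection step only (not the full IGDM pipeline with its manifold mapping and geometric-mean dual thresholds), the sketch below compares the empirical PDF of an observation against a noise-only reference under α-stable noise; the function names, bin count and threshold are illustrative assumptions.

```python
import numpy as np
from scipy.stats import levy_stable

def kld(p, q, eps=1e-12):
    """Kullback-Leibler divergence between two discrete PDFs."""
    p = p + eps
    q = q + eps
    return float(np.sum(p * np.log(p / q)))

def kld_detect(observation, noise_ref, bins=64, threshold=0.05):
    """Declare 'signal present' when the KLD between the observation's
    empirical PDF and a noise-only reference exceeds a threshold."""
    lo, hi = np.percentile(np.concatenate([observation, noise_ref]), [1, 99])
    p, _ = np.histogram(observation, bins=bins, range=(lo, hi))
    q, _ = np.histogram(noise_ref, bins=bins, range=(lo, hi))
    p = p / p.sum()
    q = q / q.sum()
    return kld(p, q) > threshold

# impulsive alpha-stable noise (alpha=1.5) and a toy carrier
noise_ref = levy_stable.rvs(1.5, 0.0, size=4096, random_state=1)
noise = levy_stable.rvs(1.5, 0.0, size=4096, random_state=2)
carrier = 3.0 * np.cos(2 * np.pi * 0.05 * np.arange(4096))
print(kld_detect(noise, noise_ref))            # noise only: expect False
print(kld_detect(carrier + noise, noise_ref))  # signal present: expect True
```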
Learning for Boltzmann machines deals with each state individually. If the given data are categorized, the probabilities have to be distributed over the states, not over the categories. We propose Boltzmann machines that identify the states within the same category. Boltzmann machines with hidden units are special cases of the proposed model. Boltzmann learning and the EM algorithm are effective learning methods for Boltzmann machines. We derive both Boltzmann learning and the EM algorithm for the proposed models.
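A minimal sketch of the idea of pooling state probabilities into categories, for a small fully visible Boltzmann machine whose distribution can be enumerated exactly; the parity-based categorization and all names are illustrative assumptions, not the paper's construction.

```python
import numpy as np
from itertools import product

def state_probs(W, b):
    """Exact Boltzmann distribution over all binary states (small n only)."""
    states = np.array(list(product([0, 1], repeat=len(b))))
    energy = -0.5 * np.einsum('si,ij,sj->s', states, W, states) - states @ b
    p = np.exp(-energy)
    return states, p / p.sum()

def category_probs(W, b, category):
    """Probability of each category: the sum of its member-state probabilities."""
    states, p = state_probs(W, b)
    cats = np.array([category(s) for s in states])
    return {c: float(p[cats == c].sum()) for c in np.unique(cats)}

# toy 3-unit machine; grouping states by parity is an assumed categorization
rng = np.random.default_rng(1)
n = 3
W = rng.normal(size=(n, n)); W = (W + W.T) / 2; np.fill_diagonal(W, 0.0)
b = rng.normal(size=n)
print(category_probs(W, b, category=lambda s: int(s.sum() % 2)))
```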
The family of Quasi-Additive (QA) algorithms is a natural generalization of perceptron learning, a kind of on-line learning with two parameter vectors: one is an accumulation of input vectors, and the other is a weight vector for prediction, associated with the former by a nonlinear function. We show that these vectors have a dually flat structure from the information-geometric point of view, a representation that makes the convergence properties easier to discuss.
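The two-vector structure can be sketched directly: a mistake-driven update accumulates inputs in θ, while prediction uses w = f(θ) for a componentwise link f. The identity link recovers the ordinary perceptron; tanh below is just one illustrative choice, and the code is a toy, not the paper's analysis.

```python
import numpy as np

def qa_perceptron(X, y, link=np.tanh, eta=0.5, epochs=10):
    """Quasi-Additive on-line learning: accumulate inputs in theta,
    predict with the weight vector w = link(theta), applied componentwise."""
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, t in zip(X, y):
            w = link(theta)                 # weight vector paired with theta
            if np.sign(w @ x) != t:         # mistake-driven update
                theta += eta * t * x        # accumulate the input vector
    return link(theta)

# toy linearly separable data
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 5))
w_true = np.array([1.0, -2.0, 0.5, 0.0, 1.5])
y = np.sign(X @ w_true)
w = qa_perceptron(X, y)
print(np.mean(np.sign(X @ w) == y))   # training accuracy
```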
Independent component analysis (ICA) is a new method of extracting independent components from multivariate data. It can be applied to various fields such as vision and auditory signal analysis, communication systems, and biomedical and brain engineering. A number of algorithms have been proposed. The present article shows that most of them use estimating functions from the statistical point of view, and gives a unified theory, based on information geometry, to elucidate the efficiency and stability of these algorithms. This yields new, efficient adaptive algorithms useful for various problems.
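One of the best-known algorithms in this family is the natural-gradient ICA update; a compact batch version, with tanh as the nonlinearity (a choice suited to super-Gaussian sources), might look as follows.

```python
import numpy as np

def natural_gradient_ica(X, eta=0.01, iters=2000, seed=0):
    """Natural-gradient ICA: W += eta * (I - g(y) y^T) W with y = W x,
    where g is a componentwise nonlinearity (here tanh)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    W = np.eye(n) + 0.01 * rng.normal(size=(n, n))
    for _ in range(iters):
        Y = W @ X
        # the expectation E[g(y) y^T] is taken over the sample batch
        W += eta * (np.eye(n) - np.tanh(Y) @ Y.T / X.shape[1]) @ W
    return W

# two independent super-Gaussian sources, linearly mixed
rng = np.random.default_rng(3)
S = rng.laplace(size=(2, 5000))
A = np.array([[1.0, 0.6], [0.4, 1.0]])
W = natural_gradient_ica(A @ S)
print(np.round(W @ A, 2))
```

If separation succeeds, W @ A approaches a scaled permutation matrix, which is the best one can hope for given ICA's inherent scale and order ambiguities.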
We study a class of nonlinear dynamical systems in order to develop efficient algorithms. As an efficient algorithm, the interior point method based on Newton's method is well known for solving convex programming problems, which include linear, quadratic, semidefinite and ℓp-programming problems. On the other hand, the geodesic of information geometry is represented by a continuous Newton's method for minimizing a convex function called the divergence. Thus, we discuss the relation between information geometry and convex programming within a related family of continuous Newton's methods. In particular, we consider the α-projection problem from given data onto an information-geometric submanifold spanned by power functions. In general, an information-geometric structure can be induced from a standard convex programming problem. In contrast, the correspondence from information geometry to convex programming is slightly more complicated. We first show that the α-projection and semidefinite programming problems share the same structure, based on the linearities or autoparallelisms in the function space and the space of matrices, respectively. However, the α-projection problem is not itself a form of convex programming, so we reformulate it as an ℓp-programming problem and related problems. For the reformulated problems, we derive self-concordant barrier functions according to the values of α. The existence of a polynomial-time algorithm for the problem is thereby confirmed theoretically. Furthermore, we show that the gradient vectors of the divergence and of a modified barrier function coincide. These results connect parts of nonlinear theory and algorithm theory through the discreteness of variables.
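The interior-point ingredient referred to here, a Newton iteration on a self-concordant (logarithmic) barrier, can be sketched on the simplest case: minimizing c·x − μ Σ log xᵢ over x > 0. The α-dependent barriers and the ℓp reformulation of the α-projection problem are beyond this toy.

```python
import numpy as np

def newton_barrier(c, x0, mu=1.0, steps=50, tol=1e-10):
    """Damped Newton iteration on f(x) = c.x - mu * sum(log x),
    the log-barrier subproblem at the heart of interior point methods."""
    x = x0.astype(float)
    for _ in range(steps):
        grad = c - mu / x          # gradient of the barrier objective
        hess_inv = x**2 / mu       # Hessian is diag(mu / x^2), inverted in closed form
        dx = -hess_inv * grad      # Newton direction
        t = 1.0
        while np.any(x + t * dx <= 0):
            t *= 0.5               # damping keeps the iterate strictly feasible
        x = x + t * dx
        if np.linalg.norm(grad) < tol:
            break
    return x

c = np.array([1.0, 2.0, 0.5])
x = newton_barrier(c, x0=np.ones(3), mu=0.1)
print(x, c - 0.1 / x)   # at the optimum x_i = mu / c_i, so the gradient vanishes
```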
Information geometry is applied to the manifold of neural networks called multilayer perceptrons. It is important to study the total family of networks as a geometrical manifold, because learning is represented by a trajectory in such a space. The manifold of perceptrons has a rich differential-geometrical structure, represented by a Riemannian metric and by singularities. An efficient learning method is proposed that exploits this structure. The parameter space of perceptrons includes many algebraic singularities, which affect the trajectories of learning. Such singularities are studied using simple models. This poses an interesting problem of statistical inference and learning in hierarchical models that include singularities.
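The efficient learning method in question is natural-gradient learning, in which the ordinary gradient is preconditioned by the inverse of the Riemannian (Fisher) metric. The sketch below uses logistic regression as a stand-in for a multilayer perceptron, with a damping term as a common guard against the near-singular metric discussed here; the model and constants are illustrative.

```python
import numpy as np

def natural_gradient_step(w, X, y, eta=0.1, damping=1e-3):
    """One natural-gradient step for logistic regression: the ordinary
    gradient is preconditioned by the inverse Fisher information matrix."""
    p = 1.0 / (1.0 + np.exp(-X @ w))
    grad = X.T @ (p - y) / len(y)                  # gradient of the negative log-likelihood
    # Fisher information of the Bernoulli model: E[p(1-p) x x^T]
    F = (X * (p * (1 - p))[:, None]).T @ X / len(y)
    F += damping * np.eye(len(w))                  # regularize near singular regions
    return w - eta * np.linalg.solve(F, grad)

rng = np.random.default_rng(4)
X = rng.normal(size=(500, 3))
w_true = np.array([2.0, -1.0, 0.5])
y = (rng.random(500) < 1 / (1 + np.exp(-X @ w_true))).astype(float)
w = np.zeros(3)
for _ in range(100):
    w = natural_gradient_step(w, X, y)
print(np.round(w, 2))   # should approach w_true
```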
This paper surveys recent progress in the investigation of the discrete proximity structures underlying geometric clustering with respect to the divergence of information geometry. Geometric clustering with respect to the divergence provides powerful unsupervised learning algorithms and can be applied to classifying complex objects represented in a feature space and to obtaining generalizations of them. The proximity relation, defined by the Voronoi diagram with respect to the divergence, plays an important role in the design and analysis of such algorithms.
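A standard concrete instance of such clustering is Bregman k-means: points are assigned by the divergence, inducing a divergence Voronoi partition, while the centroid of each cell is the ordinary mean. A small sketch with the KL divergence on discrete distributions; all names and constants are illustrative.

```python
import numpy as np

def kl_div(p, q, eps=1e-12):
    """KL divergence of each row of p against a single distribution q."""
    return np.sum(p * np.log((p + eps) / (q + eps)), axis=-1)

def bregman_kmeans(P, k, iters=50, seed=0):
    """k-means under the KL divergence: assignments follow the divergence
    Voronoi partition; the centroid of each cell is the plain mean."""
    rng = np.random.default_rng(seed)
    C = P[rng.choice(len(P), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmin([kl_div(P, c) for c in C], axis=0)
        C = np.array([P[labels == j].mean(axis=0) if np.any(labels == j)
                      else C[j] for j in range(k)])
    return labels, C

# toy data: rows are discrete distributions drawn near two prototypes
rng = np.random.default_rng(5)
A = rng.dirichlet([8, 1, 1], size=100)
B = rng.dirichlet([1, 1, 8], size=100)
labels, C = bregman_kmeans(np.vstack([A, B]), k=2)
print(np.round(C, 2))
```

That the arithmetic mean is the exact centroid for any Bregman divergence is what makes this scheme as cheap as ordinary k-means.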
From an information-geometric viewpoint, we investigate a characteristic of the submanifold formed by a mixture or exponential family within the manifold of finite discrete distributions. Using this characteristic, we derive a direct calculation method for an em-geodesic in the submanifold. In this method, the value of the primal parameter on the geodesic can be obtained without iterating a gradient system that represents the geodesic. We also derive similar algorithms for both the problem of parameter estimation and that of functionally extending the submanifold to cover a data point in the ambient manifold. These theoretical approaches based on geometric analysis should contribute to the development of computationally efficient algorithms.
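The two geodesic families involved have closed forms in the manifold of finite discrete distributions, which is what makes direct, iteration-free evaluation possible: the m-geodesic is linear in the mixture coordinates, and the e-geodesic is linear in the logarithmic coordinates. A minimal illustration (not the paper's submanifold construction):

```python
import numpy as np

def m_geodesic(p, q, t):
    """m-geodesic: linear interpolation in mixture coordinates."""
    return (1 - t) * p + t * q

def e_geodesic(p, q, t):
    """e-geodesic: linear interpolation of log-probabilities, renormalized."""
    r = np.exp((1 - t) * np.log(p) + t * np.log(q))
    return r / r.sum()

p = np.array([0.7, 0.2, 0.1])
q = np.array([0.1, 0.3, 0.6])
for t in (0.0, 0.5, 1.0):
    print(t, np.round(m_geodesic(p, q, t), 3), np.round(e_geodesic(p, q, t), 3))
```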
The importance sampling simulation technique has been exploited to obtain accurate estimates of very small probabilities that are not tractable by ordinary Monte Carlo simulation. In this paper, we investigate the simulation of a sample average of an output sequence from a Markov chain. The optimal simulation distribution is characterized by the Kullback-Leibler divergence of Markov chains, and geometric properties of the importance sampling simulation are presented. As a result, an effective computation method for the optimal simulation distribution is obtained.
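The classical form of such an optimal simulation distribution for a sample average is an exponentially twisted Markov kernel, built from the Perron eigenvalue and eigenvector of the tilted transition matrix. The sketch below estimates a small probability for a two-state chain this way; the chain, twist parameter and thresholds are illustrative assumptions, not the paper's computation method.

```python
import numpy as np

def twisted_kernel(P, f, theta):
    """Exponentially twisted kernel P_theta(i,j) ~ P(i,j) e^{theta f(j)},
    normalized via the Perron eigenpair of the tilted matrix."""
    M = P * np.exp(theta * f)[None, :]
    vals, vecs = np.linalg.eig(M)
    k = np.argmax(vals.real)
    lam = vals[k].real
    r = np.abs(vecs[:, k].real)
    return M * r[None, :] / (lam * r[:, None]), lam, r

def is_estimate(P, f, theta, a, n=100, trials=5000, seed=0):
    """Importance-sampling estimate of Pr(S_n / n >= a), S_n = sum_t f(x_t),
    simulating the twisted chain and reweighting by the path likelihood ratio."""
    rng = np.random.default_rng(seed)
    Pt, lam, r = twisted_kernel(P, f, theta)
    total = 0.0
    for _ in range(trials):
        x0 = x = 0
        s = 0.0
        for _ in range(n):
            x = rng.choice(len(f), p=Pt[x])
            s += f[x]
        if s / n >= a:
            # the per-step ratios telescope: lambda^n * r(x0)/r(xn) * exp(-theta * S_n)
            total += lam**n * r[x0] / r[x] * np.exp(-theta * s)
    return total / trials

P = np.array([[0.9, 0.1], [0.5, 0.5]])  # two-state chain, mostly in state 0
f = np.array([0.0, 1.0])                # S_n counts visits to state 1
print(is_estimate(P, f, theta=0.6, a=0.5))   # a rare event under P
```

The twist parameter is chosen so that the event of interest becomes typical under the simulation distribution; here θ = 0.6 makes the tilted chain spend roughly half its time in state 1.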
The mean field theory has been recognized as offering an efficient computational framework for solving discrete optimization problems with neural networks. This paper gives a formulation of the mean field theory based on information geometry, and clarifies, from the information-theoretic point of view, its meaning as a method of approximating a given probability distribution. The geometrical interpretation of the phase transition observed in mean field annealing is shown on the basis of this formulation. The discussion of the standard mean field theory is then extended to introduce a more general computational framework, which we call the generalized mean field theory.
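At its core, the standard mean field method replaces the Boltzmann distribution by the closest fully factorized distribution in KL divergence, which leads to the self-consistency equations mᵢ = tanh(β(Σⱼ Wᵢⱼ mⱼ + bᵢ)); annealing tracks the fixed point as the temperature is lowered. A minimal sketch, with an illustrative random network and cooling schedule:

```python
import numpy as np

def mean_field(W, b, beta, m0=None, iters=500):
    """Damped fixed-point iteration for the mean-field magnetizations
    m_i = tanh(beta * (sum_j W_ij m_j + b_i)) of +/-1 units."""
    m = np.zeros(len(b)) if m0 is None else m0
    for _ in range(iters):
        m = 0.5 * m + 0.5 * np.tanh(beta * (W @ m + b))
    return m

# mean field annealing: reuse the previous fixed point while cooling
rng = np.random.default_rng(6)
n = 8
W = rng.normal(size=(n, n)); W = (W + W.T) / 2; np.fill_diagonal(W, 0.0)
b = rng.normal(size=n)
m = None
for beta in (0.1, 0.5, 1.0, 3.0):
    m = mean_field(W, b, beta, m0=m)
    print(beta, np.round(m, 2))   # magnetizations harden as beta grows
```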
Information geometry is a new and powerful method of the information sciences, and it is applied here to manifolds of neural networks of various architectures. We propose a new theoretical approach to the manifold of feedforward neural networks, the manifold of Boltzmann machines, and the manifold of recurrently connected neural networks. This opens a new direction of study: a family of neural networks as a whole, rather than the behavior of single neural networks.