Natural gradient descent is an optimization method for real-valued neural networks that was proposed from the viewpoint of information geometry. Here, we present an extension of natural gradient descent to complex-valued neural networks. Our idea is to use the Hermitian extension of the Fisher information matrix. Moreover, we generalize the projected natural gradient (PRONG), a fast natural gradient descent algorithm, to complex-valued neural networks. We also consider the advantage of complex-valued neural networks over real-valued neural networks. A useful property of complex numbers is that a rotation in the complex plane is expressed simply as a multiplication. Focusing on this property, we construct an output function for complex-valued neural networks that is invariant under rotation of the input. As a result, our complex-valued neural network can learn rotated data without data augmentation. Finally, through simulation of online character recognition, we demonstrate the effectiveness of the proposed approach.
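The rotation property mentioned in this abstract can be illustrated with a small sketch. The abstract does not give the paper's actual output function, so the modulus of a complex weighted sum used below is only an assumed stand-in, and the names `rotate`, `modulus_output`, `points`, and `weights` are hypothetical.

```python
import cmath
import math


def rotate(points, theta):
    """Rotate complex points about the origin by theta radians."""
    factor = cmath.exp(1j * theta)  # unit complex number e^{i*theta}
    return [factor * z for z in points]


def modulus_output(weights, points):
    """Assumed rotation-invariant output: |sum_k conj(w_k) * z_k|.

    Rotating every input by theta multiplies the sum by e^{i*theta},
    which leaves its modulus unchanged.
    """
    return abs(sum(w.conjugate() * z for w, z in zip(weights, points)))


# Hypothetical 2-D stroke points encoded as complex numbers x + iy.
points = [1 + 0j, 0.5 + 2j, -1.5 + 0.25j]
weights = [0.3 - 0.1j, -0.7 + 0.2j, 0.05 + 0.9j]

original = modulus_output(weights, points)
rotated = modulus_output(weights, rotate(points, math.pi / 3))
print(original, rotated)  # the two values agree up to rounding error
```

Because the invariance holds for any `weights`, a network built on such an output would not need rotated copies of the training data, which is the point the abstract makes about data augmentation.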
Seungjin CHOI Andrzej CICHOCKI Liqing ZHANG Shun-ichi AMARI
This paper addresses a maximum likelihood method for source separation in the case of overdetermined mixtures corrupted by additive white Gaussian noise. We consider an approximate likelihood which is based on the Laplace approximation and develop a natural gradient adaptation algorithm to find a local maximum of the corresponding approximate likelihood. We present a detailed mathematical derivation of the algorithm using the Lie group invariance. Useful behavior of the algorithm is verified by numerical experiments.
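For orientation, the natural gradient step can be written out in the simpler square, noise-free setting. This setting is an assumption made here for illustration and is not the overdetermined noisy model of the paper. The symbols are likewise introduced only for this sketch: a demixing matrix $W$, outputs $y = Wx$, assumed source densities $p_i$, score functions $\varphi_i(y_i) = -\mathrm{d}\log p_i(y_i)/\mathrm{d}y_i$, and a step size $\eta$.

```latex
\begin{aligned}
L(W) &= \log\lvert\det W\rvert + \sum_i \log p_i(y_i), \qquad y = Wx,\\
\nabla_W L &= W^{-\top} - \varphi(y)\,x^{\top},\\
\tilde{\nabla}_W L &= (\nabla_W L)\,W^{\top}W = \bigl(I - \varphi(y)\,y^{\top}\bigr)W,\\
W &\leftarrow W + \eta\,\bigl(I - \varphi(y)\,y^{\top}\bigr)W .
\end{aligned}
```

The right multiplication by $W^{\top}W$ is the step that the Lie group invariance supplies in this simplified case: the resulting update depends on the data only through $y$ and requires no matrix inverse.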
Information geometry is applied to the manifold of neural networks called multilayer perceptrons. It is important to study the whole family of networks as a geometrical manifold, because learning is represented by a trajectory in such a space. The manifold of perceptrons has a rich differential-geometrical structure represented by a Riemannian metric and singularities. An efficient learning method is proposed that uses this Riemannian structure. The parameter space of perceptrons contains many algebraic singularities, which affect the trajectories of learning. Such singularities are studied by using simple models. This poses an interesting problem of statistical inference and learning in hierarchical models that include singularities.
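A learning rule that uses a Riemannian metric rescales the ordinary gradient by the inverse of the metric. As a minimal sketch, and not the perceptron setting of the abstract, the example below applies this to a one-dimensional Gaussian model, whose Fisher information in the coordinates (mu, sigma) is diag(1/sigma^2, 2/sigma^2). The function names, the data, and the learning rate are all assumptions chosen for illustration.

```python
def nll_gradient(mu, sigma, data):
    """Gradient of the average negative log-likelihood of N(mu, sigma^2)."""
    n = len(data)
    mean_dev = sum(x - mu for x in data) / n
    mean_sq = sum((x - mu) ** 2 for x in data) / n
    d_mu = -mean_dev / sigma ** 2
    d_sigma = 1.0 / sigma - mean_sq / sigma ** 3
    return d_mu, d_sigma


def natural_gradient_step(mu, sigma, data, lr=0.5):
    """One step of theta <- theta - lr * G^{-1} * grad, with G the Fisher matrix."""
    d_mu, d_sigma = nll_gradient(mu, sigma, data)
    # G = diag(1/sigma^2, 2/sigma^2), so G^{-1} = diag(sigma^2, sigma^2 / 2).
    nat_mu = sigma ** 2 * d_mu
    nat_sigma = (sigma ** 2 / 2.0) * d_sigma
    return mu - lr * nat_mu, sigma - lr * nat_sigma


# Hypothetical data and starting point.
data = [1.0, 2.0, 3.0, 4.0]
mu, sigma = 0.0, 1.0
for _ in range(100):
    mu, sigma = natural_gradient_step(mu, sigma, data)
print(mu, sigma)  # approaches the sample mean and the (biased) sample standard deviation
```

In this toy model the rescaled step for `mu` no longer depends on `sigma`, which is one way to see why the metric-aware update can be less sensitive to how the model is parameterized.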
Seungjin CHOI Shunichi AMARI Andrzej CICHOCKI
Spatio-temporal decorrelation is the task of eliminating correlations between associated signals in the spatial domain as well as in the time domain. In this paper, we present a simple but efficient adaptive algorithm for spatio-temporal decorrelation. For this task, we consider a dynamic recurrent network and calculate the associated natural gradient for the minimization of an appropriate optimization function. The natural-gradient-based spatio-temporal decorrelation algorithm is applied to the blind deconvolution of a linear single-input multiple-output (SIMO) system, and its performance is compared with that of the spatio-temporal anti-Hebbian learning rule.
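The abstract does not state the network equations, so the following is only a reduced sketch of the decorrelation idea: a purely spatial, feedforward, batch version that uses the well-known natural-gradient whitening rule W <- W + lr * (I - W C W^T) W, where C is the input covariance. The temporal part and the recurrent structure of the paper are not modeled, and the helper names, the covariance matrix, and the step size are assumptions.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]


def whitening_step(w, cov, lr=0.1):
    """One batch step of W <- W + lr * (I - W C W^T) W."""
    n = len(w)
    out_cov = matmul(matmul(w, cov), transpose(w))  # covariance of y = W x
    err = [[(1.0 if i == j else 0.0) - out_cov[i][j] for j in range(n)]
           for i in range(n)]
    delta = matmul(err, w)
    return [[w[i][j] + lr * delta[i][j] for j in range(n)] for i in range(n)]


# Hypothetical covariance of two correlated input signals.
cov = [[2.0, 0.8], [0.8, 1.0]]
w = [[1.0, 0.0], [0.0, 1.0]]
for _ in range(500):
    w = whitening_step(w, cov)

out_cov = matmul(matmul(w, cov), transpose(w))
print(out_cov)  # close to the 2x2 identity: outputs are decorrelated with unit variance
```

The update needs no matrix inversion, and at convergence the output covariance is the identity, which is the spatial half of the decorrelation goal described above.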