
Keyword Search Results

[Keyword] first order optimization (5 hits)

Hits 1-5
  • Novel Superlinear First Order Algorithms

    Peter GECZY, Shiro USUI

    PAPER-Neural Networks and Bioengineering
    Vol: E87-A No:6, Page(s): 1620-1631

    Applying the formerly proposed classification framework for first order line search optimization techniques, we introduce novel superlinear first order line search methods. The novelty of the methods lies in the line search subproblem: it features automatic step length and momentum adjustments at every iteration of the algorithm, realizable in a single-step calculation. This keeps the computational complexity of the algorithms linear and does not harm the stability or convergence of the methods. The algorithms require no or at most linear memory, and are shown to be convergent and capable of reaching superlinear convergence rates. They were applied in practice to artificial neural network training and compared to the relevant training methods within the same class. The simulation results show satisfactory performance of the introduced algorithms over the standard and previously proposed methods.
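
    The abstract does not specify the exact adjustment rule, so the minimal Python sketch below only illustrates the general shape of such a method: a Barzilai-Borwein-style step length stands in (as an assumption) for the authors' single-step rule, combined with a fixed momentum coefficient beta. Memory use is linear (one momentum buffer), matching the abstract.

      import numpy as np

      def momentum_descent(grad, x0, iters=1000, tol=1e-8,
                           alpha0=1e-3, beta=0.5):
          """First-order momentum descent with a per-iteration step length.

          Illustrative sketch only: the Barzilai-Borwein update below is an
          assumed stand-in for the paper's step-length/momentum adjustment.
          """
          x = np.asarray(x0, dtype=float).copy()
          g = grad(x)
          v = np.zeros_like(x)              # momentum buffer (linear memory)
          alpha = alpha0
          for _ in range(iters):
              if np.linalg.norm(g) < tol:   # converged
                  break
              v = beta * v - alpha * g      # momentum step
              x_new = x + v
              g_new = grad(x_new)
              s, dg = x_new - x, g_new - g
              if float(s @ dg) > 0:         # BB1 step length: one extra dot product
                  alpha = float(s @ s) / float(s @ dg)
              x, g = x_new, g_new
          return x

      # Example: minimize f(x) = ||x||^2, whose gradient is 2x.
      x_min = momentum_descent(lambda x: 2 * x, np.ones(10))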

  • Novel First Order Optimization Classification Framework

    Peter GECZY, Shiro USUI

    PAPER-Numerical Analysis and Optimization
    Vol: E83-A No:11, Page(s): 2312-2319

    Numerous scientific and engineering fields extensively utilize optimization techniques for finding appropriate parameter values of models. Various optimization methods are available for practical use, and they are classified primarily by their rates of convergence. Unfortunately, in practice a particular optimization method with specified convergence rates often performs substantially differently on diverse optimization tasks. The theoretical classification by convergence rates then loses its relevance in the context of practical optimization. It is therefore desirable to formulate a novel classification framework relevant to the theoretical concept of convergence rates as well as to practical optimization. This article introduces such a classification framework. The proposed framework enables specification of optimization techniques and optimization tasks, and it also underlines its inherent relationship to the convergence rates (the standard rate definitions are recalled below). The framework is applied to categorizing the tasks of optimizing polynomials and the problem of training multilayer perceptron neural networks.
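
    For reference, the convergence-rate classes the framework is contrasted with are the standard textbook ones (general background, not the paper's own definitions): a sequence x_k converging to x^* with

      \lim_{k\to\infty} \frac{\|x_{k+1}-x^*\|}{\|x_k-x^*\|} = \mu

    converges linearly if 0 < \mu < 1, superlinearly if \mu = 0, and quadratically if moreover \|x_{k+1}-x^*\| \le C\,\|x_k-x^*\|^2 for some C > 0.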

  • Superlinear Conjugate Gradient Method with Adaptable Step Length and Constant Momentum Term

    Peter GECZY, Shiro USUI

    PAPER-Numerical Analysis and Optimization
    Vol: E83-A No:11, Page(s): 2320-2328

    First order line search optimization techniques have gained essential practical importance over second order techniques due to their computational simplicity and low memory requirements. The computational cost of second order methods becomes prohibitive for large optimization tasks; the only applicable optimization techniques in such cases are variations of first order approaches. This article presents one such variation of a first order line search optimization technique. The presented algorithm substantially simplifies the line search subproblem into a single-step calculation of the appropriate step length. This remarkably simplifies the implementation and reduces the computational complexity of the line search subproblem, yet does not harm the stability of the method. The algorithm is theoretically proven convergent, with superlinear convergence rates, and is exactly classified within the formerly proposed classification framework for first order optimization. Performance of the proposed algorithm is evaluated on five data sets and compared to the relevant standard first order optimization technique. The results indicate superior performance of the presented algorithm over the standard first order method.
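
    As an illustration of a single-step step length combined with a constant momentum term, the sketch below applies the idea to a quadratic objective, where the minimizing step length along a direction has a closed form; the constant beta replacing the usual conjugate gradient coefficient is an assumption for illustration, not the paper's exact rule.

      import numpy as np

      def cg_constant_momentum(A, b, x0, beta=0.1, iters=200, tol=1e-10):
          """Conjugate-gradient-style iteration with a constant momentum term.

          On the quadratic f(x) = 0.5 x^T A x - b^T x (A symmetric positive
          definite) the exact step length along d is the single-step formula
          alpha = -(g.d) / (d.A d). With exact steps, the new gradient is
          orthogonal to the previous direction, so d = -g + beta*d remains a
          descent direction for any constant beta.
          """
          x = np.asarray(x0, dtype=float).copy()
          g = A @ x - b                      # gradient of the quadratic
          d = -g                             # initial search direction
          for _ in range(iters):
              if np.linalg.norm(g) < tol:
                  break
              Ad = A @ d
              alpha = -float(g @ d) / float(d @ Ad)  # single-step step length
              x = x + alpha * d
              g = A @ x - b
              d = -g + beta * d              # constant momentum term
          return x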

  • Dynamic Sample Selection: Theory

    Peter GECZY, Shiro USUI

    PAPER-Neural Networks
    Vol: E81-A No:9, Page(s): 1931-1939

    Conventional approaches to neural network training do not consider the possibility of selecting training samples dynamically during the learning phase. The neural network is simply presented with the complete training set at each iteration, so learning can become very costly for large data sets. Moreover, heavy redundancy among data samples may lead to an ill-conditioned training problem. Ill-conditioning during training causes rank deficiencies of the error and Jacobian matrices, which results in slower convergence or, in the worst case, failure of the algorithm to progress. Rank deficiencies of these essential matrices can be avoided by an appropriate selection of training exemplars at each iteration. This article presents the underlying theoretical grounds for dynamic sample selection (DSS), that is, a mechanism enabling selection of a subset of the training set at each iteration. The theoretical material is first presented for general objective functions, and then for objective functions satisfying the Lipschitz continuity condition (recalled below). Furthermore, implementation specifics of DSS for first order line search techniques are theoretically described.
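
    For reference, the Lipschitz continuity condition in its standard form (whether the paper imposes it on the objective itself or on its gradient is not stated in the abstract) reads

      |f(x) - f(y)| \le L\,\|x - y\| \quad \text{for all } x, y, \text{ with some } L > 0,

    with the common gradient variant \|\nabla f(x) - \nabla f(y)\| \le L\,\|x - y\|.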

  • Dynamic Sample Selection: Implementation

    Peter GECZY, Shiro USUI

    PAPER-Neural Networks
    Vol: E81-A No:9, Page(s): 1940-1947

    Computational expensiveness of training techniques, due to the extensiveness of the data set, is among the most important factors in machine learning and neural networks. An oversized data set may cause rank deficiencies of the Jacobian matrix, which plays an essential role in training techniques; training then becomes not only computationally expensive but also ineffective. In [1] the authors introduced the theoretical grounds for dynamic sample selection, which has the potential to eliminate such rank deficiencies. This study addresses the implementation issues of dynamic sample selection based on the theoretical material presented in [1]. The authors propose a sample selection algorithm that can be implemented in an arbitrary optimization technique. The algorithm's ability to select a proper set of samples at each iteration of training has been observed to be very beneficial, as indicated by several experiments. Recently proposed approaches to sample selection work reasonably well if the pattern-weight ratio is close to 1, and small improvements can still be detected at pattern-weight ratios of 2 or 3. The dynamic sample selection approach presented in this article can increase the convergence speed of first order optimization techniques used for training MLP networks even at pattern-weight ratios (E-FP) as high as 15, and possibly higher.
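
    The selection rule itself is given in [1] and not reproduced in the abstract; the sketch below therefore shows only the general loop structure, with a worst-fit (largest per-sample loss) criterion as an assumed stand-in for the authors' rule.

      import numpy as np

      def dss_train(per_sample_loss, per_sample_grad, X, y, w0,
                    subset_size=32, lr=0.01, epochs=100):
          """Gradient training loop with dynamic sample selection.

          Illustrative sketch: each iteration re-selects a subset of the
          training set and takes a first-order step on it. The worst-fit
          selection criterion is an assumption, not the rule of [1].
          """
          w = np.asarray(w0, dtype=float).copy()
          for _ in range(epochs):
              losses = per_sample_loss(w, X, y)          # error per sample
              idx = np.argsort(losses)[-subset_size:]    # pick worst-fit samples
              g = per_sample_grad(w, X[idx], y[idx]).mean(axis=0)
              w -= lr * g                                # first-order update
          return w

      # Example usage on least squares: loss_i = (x_i . w - y_i)^2.
      loss = lambda w, X, y: (X @ w - y) ** 2
      gradf = lambda w, X, y: 2 * (X @ w - y)[:, None] * X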