
Keyword Search Result

[Keyword] perceptron (35 hits)

Results 21-35 of 35

  • Differential and Algebraic Geometry of Multilayer Perceptrons

    Shun-ichi AMARI  Tomoko OZEKI  

     
    INVITED PAPER

    Vol: E84-A No:1, Page(s): 31-38

    Information geometry is applied to the manifold of neural networks called multilayer perceptrons. It is important to study the total family of networks as a geometrical manifold, because learning is represented by a trajectory in such a space. The manifold of perceptrons has a rich differential-geometrical structure represented by a Riemannian metric and singularities. An efficient learning method that exploits this structure is proposed. The parameter space of perceptrons includes many algebraic singularities, which affect the trajectories of learning. Such singularities are studied using simple models. This poses an interesting problem of statistical inference and learning in hierarchical models that include singularities.
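
    The "efficient learning method" alluded to here is natural-gradient descent, which preconditions the ordinary gradient by the inverse Riemannian (Fisher) metric. The following numpy sketch is only a hedged toy illustration on a one-neuron model with an empirical Fisher estimate; the data, constants, and model are invented for the example and are not the paper's.

        import numpy as np

        rng = np.random.default_rng(0)

        # Toy data from a "teacher" neuron: y = tanh(w_true . x) + noise
        X = rng.normal(size=(200, 3))
        w_true = np.array([1.0, -2.0, 0.5])
        y = np.tanh(X @ w_true) + 0.1 * rng.normal(size=200)

        w = rng.normal(size=3)            # student parameters
        lr = 0.1
        for _ in range(200):
            h = np.tanh(X @ w)
            # Per-example gradients of the squared loss with respect to w
            G = ((h - y) * (1.0 - h ** 2))[:, None] * X
            g = G.mean(axis=0)
            # Empirical Fisher-like metric from gradient outer products
            F = G.T @ G / len(X)
            # Natural-gradient step: precondition by the inverse metric
            w -= lr * np.linalg.solve(F + 1e-4 * np.eye(3), g)
        print(w, w_true)

    The metric-aware step is what distinguishes this from plain gradient descent, which is exactly the structure the information-geometric view motivates.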

  • An Active Learning Algorithm Based on Existing Training Data

    Hiroyuki TAKIZAWA  Taira NAKAJIMA  Hiroaki KOBAYASHI  Tadao NAKAMURA  

     
    PAPER-Biocybernetics, Neurocomputing

    Vol: E83-D No:1, Page(s): 90-99

    A multilayer perceptron is usually considered a passive learner that only receives given training data. However, if a multilayer perceptron actively gathers training data that resolve its uncertainty about the problem being learned, sufficiently accurate classification can be attained with fewer training data. Recently, such active learning has been receiving increasing interest. In this paper, we propose a novel active learning strategy. The strategy attempts to produce only training data that are useful for multilayer perceptrons to achieve accurate classification, and avoids generating redundant training data. Furthermore, the strategy attempts to avoid generating temporarily useful training data that will become redundant in the future. As a result, the strategy allows multilayer perceptrons to achieve accurate classification with fewer training data. To demonstrate the performance of the strategy in comparison with other active learning strategies, we also propose an empirical active learning algorithm that implements the strategy without requiring expensive computations. Experimental results show that the proposed algorithm improves the classification accuracy of a multilayer perceptron with fewer training data than a conventional random selection algorithm that constructs a training data set without any explicit strategy. Moreover, the algorithm outperforms typical active learning algorithms in the experiments. Because training data generation is usually costly, these results show that the algorithm can construct an appropriate training data set at lower computational cost, and they demonstrate the effectiveness of the strategy. We also discuss some drawbacks of the algorithm.
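
    As a hedged illustration of uncertainty-driven active learning in general (not the authors' specific strategy, which also avoids temporarily useful data), the sketch below repeatedly queries the pool point a logistic learner is least certain about; the data and the labeling oracle are invented.

        import numpy as np

        rng = np.random.default_rng(1)

        def predict(w, X):
            # Logistic output of a single-layer learner (stand-in for an MLP)
            return 1.0 / (1.0 + np.exp(-np.clip(X @ w, -30, 30)))

        pool = rng.uniform(-1, 1, size=(500, 2))        # unlabeled candidates
        oracle = lambda X: (X[:, 0] + X[:, 1] > 0).astype(float)

        w = np.zeros(2)
        avail = np.ones(len(pool), bool)
        X_tr, y_tr = pool[:5], oracle(pool[:5])          # small seed set
        avail[:5] = False
        for _ in range(20):
            for _ in range(200):                         # brief (re)training
                p = predict(w, X_tr)
                w -= 0.1 * X_tr.T @ (p - y_tr) / len(y_tr)
            # Query the still-unlabeled point the learner is least certain about
            unc = np.abs(predict(w, pool) - 0.5)
            unc[~avail] = np.inf
            i = int(np.argmin(unc))
            avail[i] = False
            X_tr = np.vstack([X_tr, pool[i]])
            y_tr = np.append(y_tr, oracle(pool[i:i + 1]))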

  • Neural Network Model Switching for Efficient Feature Extraction

    Keisuke KAMEYAMA  Yukio KOSUGI  

     
    PAPER-Image Processing, Computer Graphics and Pattern Recognition

    Vol: E82-D No:10, Page(s): 1372-1383

    In order to improve the efficiency of feature extraction in backpropagation (BP) learning in layered neural networks, model switching, which changes the function model without altering the map, is proposed. Model switching involves map-preserving reduction of units by channel fusion, or addition of units by channel installation. For reducing the model size by channel fusion, two criteria for detecting redundant channels are addressed, and the local link-weight compensations for map preservation are formulated. The upper limits of the discrepancies between the maps of the switched models are derived for use as a unified criterion in selecting the switching model candidate. In the experiments, model switching is used during BP training of a layered network model for image texture classification, to remedy its inefficient feature extraction. The results show that fusion and re-installation of redundant channels, weight compensation on channel fusion for map preservation, and use of the unified criterion for model selection are all effective for improved generalization ability and quick learning. Further, the possibility of using model switching for concurrent optimization of the model and the map is discussed.
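
    A minimal sketch of the channel-fusion idea under one simplifying assumption: two hidden channels are treated as redundant when their incoming weight vectors (nearly) coincide, in which case their outgoing weights can be summed without altering the map. The paper's actual detection criteria and compensations are more general.

        import numpy as np

        def fuse_redundant_channels(W_in, w_out, tol=1e-3):
            # W_in: (H, D) incoming weights; w_out: (H,) outgoing weights.
            # Channels with (near-)identical incoming weights have identical
            # activations, so summing their outgoing weights preserves the map.
            keep, out = [], []
            for i in range(len(W_in)):
                for k, j in enumerate(keep):
                    if np.linalg.norm(W_in[i] - W_in[j]) < tol:
                        out[k] += w_out[i]   # compensate the surviving channel
                        break
                else:
                    keep.append(i)
                    out.append(w_out[i])
            return W_in[keep], np.array(out)

        # Demo: channels 0 and 2 are duplicates, so the map is unchanged
        W_in = np.array([[1.0, 2.0], [0.0, 1.0], [1.0, 2.0]])
        w_out = np.array([0.5, 1.0, 0.25])
        W_f, w_f = fuse_redundant_channels(W_in, w_out)
        x = np.array([0.3, -0.7])
        assert np.isclose(np.tanh(W_in @ x) @ w_out, np.tanh(W_f @ x) @ w_f)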

  • Acceleration Techniques for the Network Inversion Algorithm

    Hiroyuki TAKIZAWA  Taira NAKAJIMA  Masaaki NISHI  Hiroaki KOBAYASHI  Tadao NAKAMURA  

     
    LETTER-Bio-Cybernetics and Neurocomputing

    Vol: E82-D No:2, Page(s): 508-511

    We apply two acceleration techniques for the backpropagation algorithm to an iterative gradient descent algorithm called the network inversion algorithm. Experimental results show that these techniques are also quite effective in decreasing the number of iterations required to detect input vectors on the classification boundary of a multilayer perceptron.
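
    The abstract does not name the two techniques; as an assumed example, the sketch below applies one standard accelerator, momentum, to network inversion, i.e. gradient descent in input space with the weights held fixed, seeking a boundary point (output 0.5). The tiny network and constants are invented.

        import numpy as np

        rng = np.random.default_rng(2)
        W1, b1 = rng.normal(size=(4, 2)), np.zeros(4)   # fixed "trained" weights
        w2, b2 = rng.normal(size=4), 0.0

        def forward(x):
            h = np.tanh(W1 @ x + b1)
            return 1.0 / (1.0 + np.exp(-(w2 @ h + b2))), h

        # Invert the network: adjust the *input* so the output hits 0.5,
        # i.e. find a point on the classification boundary.
        x, v = rng.normal(size=2), np.zeros(2)
        target, lr, mom = 0.5, 0.2, 0.9
        for _ in range(500):
            y, h = forward(x)
            # Backpropagate the output error to the inputs (weights stay fixed)
            grad = (y - target) * y * (1 - y) * (W1.T @ (w2 * (1 - h ** 2)))
            v = mom * v - lr * grad                     # momentum acceleration
            x += v
        print(forward(x)[0])                            # close to the target 0.5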

  • A Flexible Learning Algorithm for Binary Neural Networks

    Atsushi YAMAMOTO  Toshimichi SAITO  

     
    PAPER-Neural Networks

    Vol: E81-A No:9, Page(s): 1925-1930

    This paper proposes a simple learning algorithm that can realize any Boolean function using a three-layer binary neural network. The algorithm has two flexible learning functions: 1) moving the "core" used to separate the inputs, and 2) "don't care" settings for the separated inputs. The "don't care" inputs do not affect the subsequent separations. Performing numerical simulations on some typical examples, we have verified that our algorithm requires fewer hidden-layer neurons than conventional algorithms.
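
    For context, the representational claim (any Boolean function is realizable by a three-layer binary network) can be shown with a naive construction: one threshold unit per true input vector, OR-ed at the output. This hedged sketch shows only that baseline; the paper's algorithm learns far more compact hidden layers.

        import numpy as np

        def build_three_layer_net(truth_table):
            # One binary threshold unit per true input vector; the unit for
            # pattern p has weights 2p-1 and threshold sum(p), so it fires
            # iff the input equals p exactly.
            hidden = [np.array(p) for p, v in truth_table.items() if v == 1]
            def net(x):
                x = np.asarray(x)
                h = [int((2 * p - 1) @ x >= p.sum()) for p in hidden]
                return int(sum(h) >= 1)   # output unit: OR of the hidden units
            return net

        tt = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}   # XOR
        net = build_three_layer_net(tt)
        assert all(net(x) == v for x, v in tt.items())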

  • Broadband Active Noise Control Using a Neural Network

    Casper K. CHEN  Tzi-Dar CHIUEH  Jyh-Horng CHEN  

     
    PAPER-Speech Processing and Acoustics

    Vol: E81-D No:8, Page(s): 855-861

    This paper presents a neural network-based control system for active noise control (ANC). The control system derives a secondary signal that destructively interferes with the original noise to cut down the noise power. The paper begins with an introduction to feedback ANC systems and then describes our adaptive algorithm in detail. Three types of noise signals, recorded in a destroyer, an F-16 airplane, and an MR imaging room, respectively, were then applied to our noise control system, which was implemented in software. We obtained an average noise power attenuation of about 20 dB. Our system performed as well as traditional DSP controllers for narrow-band noise and achieved better results for nonlinear broadband noise problems. We also present a hardware implementation method for the proposed algorithm. This hardware architecture allows fast and efficient field training in new environments and makes real-time, real-life applications possible.
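
    As a hedged stand-in for the neural controller (which this sketch does not reproduce), a linear LMS predictor illustrates the core feedback-ANC loop: predict the next noise sample from past samples, emit its negative as the secondary signal, and adapt on the residual. The toy noise and step size are invented.

        import numpy as np

        rng = np.random.default_rng(3)
        white = rng.normal(size=2000)
        noise = np.convolve(white, [0.5, 0.3, 0.2], mode="same")  # toy noise

        L, w = 8, np.zeros(8)
        residual = np.zeros_like(noise)
        for t in range(L, len(noise)):
            x = noise[t - L:t][::-1]
            anti = -(w @ x)                    # secondary cancelling signal
            residual[t] = noise[t] + anti      # what the error microphone hears
            w += 0.01 * residual[t] * x        # LMS update driven by the residual
        print("residual/noise power:",
              np.mean(residual[L:] ** 2) / np.mean(noise[L:] ** 2))

    Replacing the linear predictor with an MLP is what extends the idea to the nonlinear broadband case discussed above.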

  • Analysis of the Effects of Offset Errors in Neural LSIs

    Fuyuki OKAMOTO  Hachiro YAMADA  

     
    PAPER-Analog Signal Processing

    Vol: E80-A No:9, Page(s): 1640-1646

    It is well known that offset errors in the multipliers of neural LSIs can have fatal effects on performance. The aim of this study is to understand theoretically how offset errors affect the performance of neural LSIs. We use a single-layer perceptron as an example and compare our theoretically derived results with computer simulations. We have found that offset errors in the multipliers for the forward process can be canceled out through learning, but those for the updating process cannot. We have examined the asymptotic behavior of learning under updating-process offsets and derived a mathematical expression for dL, the excess of the averaged loss function L. The derived expression gives us a basis for estimating robustness with respect to offset errors. Our analysis indicates that dL can be expressed as a quadratic form of the offset errors and the inverse of the Hessian matrix of L. We have found that increasing the number of synapses degrades the performance. We have also found that enlarging the input signal level and reducing the signal level of the desired response can be effective techniques for reducing the effects of offset errors in the updating process.
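
    A hedged toy simulation of the offset-error setting, using an LMS-trained linear unit with a bias weight; per-multiplier offsets are injected into either the forward products or the weight updates. All constants are invented, but the setup mirrors the forward/updating distinction studied in the paper.

        import numpy as np

        rng = np.random.default_rng(4)
        X = rng.normal(size=(5000, 8))
        w_true = rng.normal(size=8)
        y = X @ w_true

        def train(fwd_offset, upd_offset, lr=0.01):
            w = np.zeros(9)                         # last entry is a bias weight
            o_f = fwd_offset * rng.normal(size=9)   # forward-multiplier offsets
            o_u = upd_offset * rng.normal(size=9)   # update-multiplier offsets
            for x, t in zip(X, y):
                xb = np.append(x, 1.0)
                out = np.sum(w * xb + o_f)          # offsets corrupt the products
                err = t - out
                w += lr * (err * xb + o_u)          # offsets corrupt the updates
            Xb = np.hstack([X, np.ones((len(X), 1))])
            return np.mean((Xb @ w + o_f.sum() - y) ** 2)

        # The bias weight can absorb forward offsets through learning; update
        # offsets instead bias the fixed point of the learning rule.
        print("forward offsets only:", train(0.1, 0.0))
        print("update offsets only :", train(0.0, 0.1))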

  • Combining Local Representative Networks to Improve Learning in Complex Nonlinear Learning Systems

    Goutam CHAKRABORTY  Masayuki SAWADA  Shoichi NOGUCHI  

     
    LETTER

    Vol: E80-A No:9, Page(s): 1630-1633

    In a fully connected multilayer perceptron (MLP), all the hidden units are activated by samples from the whole input space. For complex problems, due to interference and cross-coupling of the hidden units' activations, the network needs many hidden units to represent the problem, and the error surface becomes highly non-linear. Searching for the minimum is then complex and computationally expensive, and simple gradient descent algorithms usually fail. We propose a network in which the input space is partitioned into local sub-regions, and a number of smaller networks are simultaneously trained on overlapping subsets of the input samples. Remarkable improvements in both the training efficiency and the generalization performance of this combined network are observed through various simulations.
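
    A minimal sketch of the partition-and-combine idea, assuming a k-means partition, overlapping local subsets, and small logistic "local networks" routed by nearest center; the authors' partitioning and sub-network details may differ, and the data here are synthetic.

        import numpy as np

        rng = np.random.default_rng(5)
        X = rng.uniform(-1, 1, size=(600, 2))
        y = (np.sin(3 * X[:, 0]) > X[:, 1]).astype(float)  # nonlinear rule

        # Partition the input space (k-means), then train one small logistic
        # "local network" per overlapping sub-region.
        K = 4
        centers = X[rng.choice(len(X), K, replace=False)]
        for _ in range(20):
            lab = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
            centers = np.array([X[lab == k].mean(0) if np.any(lab == k)
                                else centers[k] for k in range(K)])

        models = []
        for k in range(K):
            d = ((X - centers[k]) ** 2).sum(1)
            idx = d < np.quantile(d, 0.35)          # overlapping local subset
            Xb = np.hstack([X[idx], np.ones((idx.sum(), 1))])
            w = np.zeros(3)
            for _ in range(500):
                p = 1 / (1 + np.exp(-np.clip(Xb @ w, -30, 30)))
                w -= 0.5 * Xb.T @ (p - y[idx]) / len(Xb)
            models.append(w)

        def predict(x):
            k = int(np.argmin(((centers - x) ** 2).sum(1)))  # route to a local net
            return 1 / (1 + np.exp(-(np.append(x, 1.0) @ models[k])))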

  • A Sparse Memory Access Architecture for Digital Neural Network LSIs

    Kimihisa AIHARA  Osamu FUJITA  Kuniharu UCHIMURA  

     
    PAPER-Neural Networks and Chips

    Vol: E80-C No:7, Page(s): 996-1002

    A sparse memory access architecture proposed to achieve a high-computational-speed neural-network LSI is described in detail. This architecture uses two key techniques, compressible synapse-weight neuron calculation and differential neuron operation, to reduce the number of accesses to synapse weight memories and the number of neuron calculations without incurring an accuracy penalty. The test chip based on this architecture has 96 parallel data-driven processing units and enough memory for 12,288 synapse weights. In a pattern recognition example, the number of memory accesses and neuron calculations was reduced to 0.87% of that needed by the conventional method, and the practical performance was 18 GCPS. The sparse memory access architecture is also effective when the synapse weights are stored in off-chip memory.
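
    A hedged software sketch of the two techniques, under the simplifying assumption that "compressible" weights means mostly-zero weights: store only the nonzero synapses, and update neuron sums differentially when only a few inputs change between patterns.

        import numpy as np

        rng = np.random.default_rng(6)
        W = rng.normal(size=(64, 256))
        W[rng.random(W.shape) < 0.9] = 0.0   # mostly-zero ("compressible") weights

        # Compressed weight memory: store only the nonzero synapses per neuron
        nz = [(np.flatnonzero(W[i]), W[i][np.flatnonzero(W[i])]) for i in range(64)]

        def neuron_sums(x):
            # Sparse evaluation: only nonzero synapse weights are fetched
            return np.array([wv @ x[idx] for idx, wv in nz])

        def differential_update(sums, x_old, x_new):
            # Differential neuron operation: when few inputs change between
            # patterns, adjust the previous sums instead of recomputing
            for j in np.flatnonzero(x_new != x_old):
                sums = sums + W[:, j] * (x_new[j] - x_old[j])
            return sums

        x0 = rng.normal(size=256)
        x1 = x0.copy()
        x1[:3] += 1.0                        # next pattern differs in 3 inputs
        assert np.allclose(differential_update(neuron_sums(x0), x0, x1),
                           neuron_sums(x1))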

  • Combining Multiple Classifiers in a Hybrid System for High Performance Chinese Syllable Recognition

    Liang ZHOU  Satoshi IMAI  

     
    PAPER-Speech Processing and Acoustics

    Vol: E79-D No:11, Page(s): 1570-1578

    A multiple classifier system can be a powerful solution for robust pattern recognition, since the appropriate combination of multiple classifiers may reduce errors, provide robustness, and achieve higher performance. In this paper, high-performance Chinese syllable recognition is presented using combinations of multiple classifiers. Chinese syllable recognition is divided into base syllable recognition (disregarding the tones) and recognition of the 4 tones. For base syllable recognition, we used a combination of two multisegment vector quantization (MSVQ) classifiers based on different features (instantaneous and transitional features of speech). For tone recognition, a vector quantization (VQ) classifier was used first and proved comparable to a multilayer perceptron (MLP) classifier. To obtain more robust and better performance, a combination of a distortion-based classifier (VQ) and a discriminant-based classifier (MLP) is proposed. The evaluations were carried out on CRDB, a standard Chinese syllable database, and the experimental results show that combining multiple classifiers with different features or different methodologies can improve recognition performance. Recognition accuracy for base syllables, tones, and tonal syllables is 96.79%, 99.82%, and 96.24%, respectively. Since these results were evaluated on a standard database, they can serve as a benchmark for direct comparison with other approaches.
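
    A minimal sketch of combining a distortion-based classifier with a discriminant-based one, assuming a nearest-centroid "VQ" and a logistic unit standing in for the MLP, fused by a simple product rule on class scores; the paper's features and fusion details are not reproduced.

        import numpy as np

        rng = np.random.default_rng(7)
        X = np.vstack([rng.normal(-1, 1, size=(100, 2)),    # class 0 features
                       rng.normal(+1, 1, size=(100, 2))])   # class 1 features
        y = np.array([0] * 100 + [1] * 100)

        # Distortion-based classifier: nearest class centroid (a 1-codeword VQ)
        centroids = np.array([X[y == c].mean(0) for c in (0, 1)])
        def vq_scores(x):
            d = ((centroids - x) ** 2).sum(1)
            return np.exp(-d) / np.exp(-d).sum()   # distortions -> soft scores

        # Discriminant-based classifier: a logistic unit standing in for the MLP
        Xb = np.hstack([X, np.ones((len(X), 1))])
        w = np.zeros(3)
        for _ in range(1000):
            p = 1 / (1 + np.exp(-Xb @ w))
            w -= 0.1 * Xb.T @ (p - y) / len(X)
        def mlp_scores(x):
            p = 1 / (1 + np.exp(-(np.append(x, 1.0) @ w)))
            return np.array([1 - p, p])

        def combined(x):
            # Product-rule fusion of the two classifiers' class scores
            return int(np.argmax(vq_scores(x) * mlp_scores(x)))

        print(combined(np.array([0.5, 0.2])))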

  • Fractal Connection Structure: A Simple Way to Improve Generalization in Nonlinear Learning Systems

    Basabi CHAKRABORTY  Yasuji SAWADA  

     
    PAPER-Neural Nets and Human Being

    Vol: E79-A No:10, Page(s): 1618-1623

    The capability of generalization is the most desirable property of a learning system. It is well known that to achieve good generalization, the complexity of the system should match the intrinsic complexity of the problem to be learned. In this work, the introduction of a fractal connection structure in nonlinear learning systems such as multilayer perceptrons, as a means of improving their generalization capability in classification problems, has been investigated via simulation on a sonar data set for an underwater target classification problem. It has been found that, for a proper choice of the fractal dimension, which controls the average connectivity of the net, the fractally connected net has better generalization capability than both the fully connected net and a randomly connected net of the same average connectivity.
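
    A hedged guess at what a fractal connection mask can look like: a middle-thirds Cantor pattern whose recursion depth controls the average connectivity, shifted per hidden unit. The paper's exact construction may differ; this only illustrates the idea of fractally patterned sparsity.

        import numpy as np

        def cantor_mask(n, depth):
            # Binary mask from the middle-thirds Cantor construction; deeper
            # recursion removes more connections (lower average connectivity)
            mask = np.ones(n, dtype=bool)
            def cut(lo, hi, d):
                if d == 0 or hi - lo < 3:
                    return
                third = (hi - lo) // 3
                mask[lo + third:lo + 2 * third] = False
                cut(lo, lo + third, d - 1)
                cut(lo + 2 * third, hi, d - 1)
            cut(0, n, depth)
            return mask

        # Fractally connected layer: each hidden unit sees a (shifted)
        # Cantor-patterned subset of the inputs instead of all of them
        rng = np.random.default_rng(8)
        n_in, n_hid = 81, 16
        m = cantor_mask(n_in, depth=3)
        W = np.zeros((n_hid, n_in))
        for i in range(n_hid):
            mi = np.roll(m, int(rng.integers(0, n_in)))
            W[i, mi] = rng.normal(size=int(mi.sum()))
        print("average connectivity:", m.mean())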

  • Geometry of Admissible Parameter Region in Neural Learning

    Kazushi IKEDA  Shun-Ichi AMARI  

     
    PAPER-Neural Networks

    Vol: E79-A No:6, Page(s): 938-943

    In general, a learning machine behaves better as the number of training examples increases, and it is important to know how fast and how well the behavior improves. The average prediction error, the probability that the trained machine mispredicts the output signal, is one of the most popular criteria for assessing this behavior. However, it is not easy to evaluate the average prediction error even in the simplest case, the linear dichotomy (perceptron). When a continuous deterministic dichotomy machine is trained by t examples of input-output pairs produced by a realizable teacher, these examples limit the region of the parameter space that includes the true parameter. Any parameter in the region can explain the input-output behavior of the examples. Such a region, called the admissible region, forms in general a (curved) polyhedron in the parameter space, and it becomes smaller and smaller as the number of examples increases. The present paper studies the shape and volume of the admissible region using a stochastic geometrical approach, exploiting the fact that the admissible region is dual to the convex hull that the examples form in the example space. Since the admissible region is related to the average prediction error of the linear dichotomy, we derive new upper and lower bounds on the average prediction error.
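
    A small Monte Carlo sketch of the admissible region for the linear dichotomy: the fraction of random unit-norm parameter vectors consistent with all t teacher-labeled examples, which shrinks as t grows. The sample counts and dimension are arbitrary choices for illustration.

        import numpy as np

        rng = np.random.default_rng(9)
        d = 3
        w_star = rng.normal(size=d)
        w_star /= np.linalg.norm(w_star)               # teacher parameters

        for t in (4, 16, 64):
            X = rng.normal(size=(t, d))
            y = np.sign(X @ w_star)                    # realizable labels
            # Fraction of random unit parameter vectors that reproduce the
            # teacher's outputs on all t examples (relative region size)
            W = rng.normal(size=(50000, d))
            W /= np.linalg.norm(W, axis=1, keepdims=True)
            frac = np.all(np.sign(W @ X.T) == y, axis=1).mean()
            print(t, "examples -> admissible fraction", frac)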

  • Automatic Language Identification Using Sequential Information of Phonemes

    Takayuki ARAI  

     
    PAPER

    Vol: E78-D No:6, Page(s): 705-711

    In this paper, approaches to language identification based on the sequential information of phonemes are described. These approaches assume that each language can be identified from its own phoneme structure, or phonotactics. To extract this phoneme structure, we use phoneme classifiers and grammars for each language. The phoneme classifier for each language is implemented as a multi-layer perceptron trained on quasi-phonetic hand-labeled transcriptions. After training the phoneme classifiers, the grammars for each language are calculated as a set of transition probabilities for each phoneme pair. Because of the interest in automatic language identification for worldwide voice communication, we decided to use telephone speech for this study. The data were drawn from the OGI (Oregon Graduate Institute)-TS (telephone speech) corpus, a standard corpus for this type of research. To investigate the basic issues of this approach, two languages, Japanese and English, were selected. The language classification algorithms are based on a Viterbi search constrained by a bigram grammar and by minimum and maximum durations. Using a phoneme classifier trained only on English phonemes, we achieved 81.1% accuracy, and 79.3% accuracy using a phoneme classifier trained on Japanese phonemes. Using both the English and the Japanese phoneme classifiers together, we obtained our best result: 83.3%. These results are comparable to those obtained by other methods, such as those based on hidden Markov models.
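
    A hedged simplification of the grammar step (bigram scoring only, without the Viterbi search and duration constraints used in the paper): score a decoded phoneme string under each language's transition probabilities and pick the larger. The inventories and probabilities below are invented.

        import numpy as np

        phones = ["a", "i", "k", "s"]
        idx = {p: i for i, p in enumerate(phones)}

        def normalize(M):
            return M / M.sum(axis=1, keepdims=True)

        # Invented bigram (phoneme-pair transition) grammars for two languages
        bigram = {
            "JP": normalize(np.array([[1, 2, 4, 1], [2, 1, 3, 2],
                                      [4, 3, 1, 1], [2, 2, 1, 1]], float)),
            "EN": normalize(np.array([[1, 1, 2, 4], [1, 1, 2, 3],
                                      [2, 2, 1, 2], [4, 3, 2, 1]], float)),
        }

        def log_likelihood(seq, lang):
            # Score a decoded phoneme string under one language's grammar
            M = bigram[lang]
            return sum(np.log(M[idx[a], idx[b]]) for a, b in zip(seq, seq[1:]))

        seq = ["k", "a", "k", "i", "s", "a"]   # e.g. phoneme classifier output
        print(max(bigram, key=lambda lang: log_likelihood(seq, lang)))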

  • Characteristics of Multi-Layer Perceptron Models in Enhancing Degraded Speech

    Thanh Tung LE  John MASON  Tadashi KITAMURA  

     
    PAPER

    Vol: E78-D No:6, Page(s): 744-750

    A multi-layer perceptron (MLP) acting directly in the time domain is applied as a speech signal enhancer, and its performance is examined for three common classes of degradation: non-linear system degradation from a low bit-rate CELP coder, additive noise, and convolution by a linear system. The investigation focuses on two topics: (i) the influence of non-linearities within the network, and (ii) network topology, comparing single- and multiple-output structures. The objective is to examine how these characteristics influence network performance and whether this depends on the class of degradation. Experimental results show the importance of matching the enhancer to the class of degradation. In the case of the CELP coder, the standard MLP with its inherently non-linear characteristics is consistently better than any equivalent linear structure (up to 3.2 dB SNR improvement compared with 1.6 dB). In contrast, when the degradation is additive noise, a linear enhancer is always superior.
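
    A minimal sketch of a time-domain MLP enhancer for the additive-noise class: a window of noisy samples in, one clean sample out, trained by backpropagation. The network sizes, signals, and learning rate are invented stand-ins, not the paper's configuration.

        import numpy as np

        rng = np.random.default_rng(10)
        n = np.arange(4000)
        clean = np.sin(2 * np.pi * n / 40) + 0.5 * np.sin(2 * np.pi * n / 13)
        noisy = clean + 0.3 * rng.normal(size=n.size)   # additive-noise class

        # Single-output topology: a window of noisy samples in, one clean
        # sample out, trained sample by sample in the time domain
        L, H, lr = 16, 12, 0.005
        W1 = 0.1 * rng.normal(size=(H, L)); b1 = np.zeros(H)
        w2 = 0.1 * rng.normal(size=H); b2 = 0.0
        for _ in range(5):                              # epochs
            for i in range(L, n.size):
                x = noisy[i - L:i]
                h = np.tanh(W1 @ x + b1)
                e = (w2 @ h + b2) - clean[i]            # linear output unit
                gh = e * w2 * (1 - h ** 2)              # backpropagated error
                w2 -= lr * e * h; b2 -= lr * e
                W1 -= lr * np.outer(gh, x); b1 -= lr * gh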

  • A Trial Multilayer Perceptron Neural Network for ATM Connection Admission Control

    Sang Hyuk KANG  Dan Keun SUNG  

     
    PAPER

    Vol: E76-B No:3, Page(s): 258-262

    Future broadband ATM networks are expected to accommodate various kinds of multimedia services with different traffic characteristics and quality of service (QoS) requirements. However, it is very difficult to control traffic by conventional mechanisms in this complex traffic environment. As an alternative approach, a multilayer perceptron neural network model is proposed as an intelligent control mechanism, like "a traffic control policeman," to perform ATM connection admission control. The proposed neural control model is analyzed by computer simulations in homogeneous and heterogeneous traffic environments, and the results show the effectiveness of this intelligent control mechanism compared with that of an analytical method.
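
    A hedged toy of the idea: train a small MLP on examples of (traffic state, QoS outcome) and use its output as the accept/reject decision. The features, labels, and sizes below are invented for illustration only.

        import numpy as np

        rng = np.random.default_rng(11)
        # Invented admission data: features of an arriving request (current
        # load, peak rate, mean rate); label 1 = QoS would still be met
        X = rng.uniform(0, 1, size=(2000, 3))
        y = ((X[:, 0] + 0.5 * X[:, 1] + 0.3 * X[:, 2]) < 1.0).astype(float)

        H = 8                                           # small MLP controller
        W1 = 0.5 * rng.normal(size=(H, 3)); b1 = np.zeros(H)
        w2 = 0.5 * rng.normal(size=H); b2 = 0.0
        for _ in range(1000):
            Z = np.tanh(X @ W1.T + b1)
            p = 1 / (1 + np.exp(-(Z @ w2 + b2)))
            e = (p - y) / len(X)                        # cross-entropy gradient
            G = np.outer(e, w2) * (1 - Z ** 2)
            w2 -= Z.T @ e; b2 -= e.sum()
            W1 -= G.T @ X; b1 -= G.sum(0)

        def admit(load, peak, mean):
            z = np.tanh(W1 @ np.array([load, peak, mean]) + b1)
            return 1 / (1 + np.exp(-(w2 @ z + b2))) > 0.5

        print(admit(0.2, 0.3, 0.1), admit(0.9, 0.9, 0.9))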
