IEICE global.ieice.org Site

Author Search Result

[Author] Susumu KUROYANAGI(8hit)

1-8hit

Auditory Pulse Neural Network Model to Extract the Inter-Aural Time and Level Difference for Sound Localization
Susumu KUROYANAGI Akira IWATA

PAPER-Audition

Vol: E77-D No: 4
Page(s): 466-474
A novel pulse neural network model for sound localization is proposed. The model is based on the physiological auditory nervous system. Human beings can perceive the direction of a sound using the inter-aural time difference (ITD) and inter-aural level difference (ILD) between the signals at the two ears. The model extracts these features using only pulse train information. The model is divided roughly into three sections: preprocessing of the input signals; transformation of continuous signals into pulse trains; and feature extraction. The last section consists of two parts: an ITD extractor and an ILD extractor. Both extractors are implemented using a pulse neuron model. They have the same network structure, differing only in the parameters and arrangement of the pulse neurons. The pulse neuron model receives pulse trains and outputs a pulse train. Because the pulses carry only simple information, their data structures are very simple and clear. Thus, a strict design is not required for the implementation of the model. These properties are advantageous for realizing the model in hardware. A computer simulation has demonstrated that the model successfully extracts time and level differences between two signals.
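The coincidence-counting idea behind ITD extraction from pulse trains can be sketched in a few lines of Python. This is a minimal illustration, not the paper's pulse neuron model: the function names, the binary-list representation of a pulse train, and the use of pulse counts as a stand-in for signal level are all assumptions made here.

```python
def estimate_itd(left, right, max_delay):
    """Return the delay (in time steps) of `right` relative to `left`
    that yields the most coincident pulses.

    `left` and `right` are hypothetical binary pulse trains (lists of 0/1).
    Ties are resolved in favor of the first (most negative) delay tested.
    """
    best_delay, best_count = 0, -1
    for delay in range(-max_delay, max_delay + 1):
        count = 0
        for t, pulse in enumerate(left):
            u = t + delay
            # A coincidence: a left pulse at t and a right pulse at t + delay.
            if pulse and 0 <= u < len(right) and right[u]:
                count += 1
        if count > best_count:
            best_delay, best_count = delay, count
    return best_delay


def estimate_ild(left, right):
    """Return the pulse-count difference between the two trains.

    Assumes (for illustration only) that a louder input produces more pulses,
    so a positive value means the left side is louder.
    """
    return sum(left) - sum(right)


# Example: the right train is the left train delayed by 3 time steps.
left = [0] * 40
right = [0] * 40
for t in (2, 9, 17, 30):
    left[t] = 1
    right[t + 3] = 1

itd_steps = estimate_itd(left, right, max_delay=5)
ild_pulses = estimate_ild(left, right)
```

Each candidate delay plays the role of one coincidence detector; the detector that fires most often indicates the time difference.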
A Solution for Imbalanced Training Sets Problem by CombNET-II and Its Application on Fog Forecasting
Anto Satriyo NUGROHO Susumu KUROYANAGI Akira IWATA

PAPER-Biocybernetics, Neurocomputing

Vol: E85-D No: 7
Page(s): 1165-1174
Studies on artificial neural networks have been conducted for a long time, and their contribution has been shown in many fields. However, the application of neural networks in the real world domain is still a challenge, since nature does not always provide the required satisfactory conditions. One example is the class size imbalanced condition, in which one class is heavily under-represented compared to another class. This condition is often found in the real world domain and presents several difficulties for algorithms that assume a balanced condition of the classes. In this paper, we propose a method for solving problems posed by imbalanced training sets by applying the modified large-scale neural network "CombNET-II." CombNET-II consists of two types of neural networks. The first type is a one-layer vector quantization neural network that turns the problem into a more balanced condition. The second type consists of several modules of three-layered multilayer perceptrons trained by backpropagation (BP) for finer classification. CombNET-II combines the two types of neural networks to solve the problem effectively within a reasonable time. The performance is then evaluated by applying the model to a practical fog forecasting problem. Fog forecasting is an imbalanced training sets problem, since the probability of fog appearance at the observation location is very low. Fog events should be predicted every 30 minutes based on the observation of meteorological conditions. Our experiments showed that CombNET-II could achieve a high prediction rate compared to the k-nearest neighbor classifier and the three-layered multilayer perceptron trained with BP. Part of this research was presented in the 1999 Fog Forecasting Contest sponsored by the Neurocomputing Technical Group of IEICE, Japan, and CombNET-II achieved the highest accuracy among the participants.
Design of a Compact Sound Localization Device on a Stand-Alone FPGA-Based Platform
Mauricio KUGLER Teemu TOSSAVAINEN Susumu KUROYANAGI Akira IWATA

PAPER-Computer System

Publicized: 2016/07/26
Vol: E99-D No: 11
Page(s): 2682-2693
Sound localization systems are widely studied and have several potential applications, including hearing aid devices, surveillance and robotics. However, few proposed solutions target portable systems, such as wearable devices, which require a small unnoticeable platform, or unmanned aerial vehicles, in which weight and low power consumption are critical aspects. The main objective of this research is to achieve real-time sound localization capability in a small, self-contained device, without having to rely on large shaped platforms or complex microphone arrays. The proposed device has two surface-mount microphones spaced only 20 mm apart. Such reduced dimensions present challenges for the implementation, as differences in level and spectra become negligible, and only the time-difference of arrival (TDoA) can be used as a localization cue. Three main issues have to be addressed in order to accomplish these objectives. To achieve real-time processing, the TDoA is calculated using zero-crossing spikes applied to the hardware-friendly Jeffress model. In order to make up for the reduction in resolution due to the small dimensions, the signal is upsampled several-fold within the system. Finally, a coherence-based spectral masking is used to select only frequency components with relevant TDoA information. The proposed system was implemented on a field-programmable gate array (FPGA) based platform, due to the large number of concurrent and independent tasks, which can be efficiently parallelized in reconfigurable hardware devices. Experimental results with white-noise and environmental sounds show high accuracy for both anechoic and reverberant conditions.
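A quick calculation shows why such a small microphone spacing calls for upsampling. Only the 20 mm spacing comes from the abstract; the speed of sound (343 m/s), the 48 kHz sampling rate, and the 8x upsampling factor are illustrative assumptions, and the function names are hypothetical.

```python
import math

MIC_SPACING_M = 0.020        # from the abstract: microphones 20 mm apart
SPEED_OF_SOUND_M_S = 343.0   # assumed: air at roughly 20 degrees Celsius
SAMPLE_RATE_HZ = 48_000      # assumed sampling rate, for illustration only


def max_tdoa_seconds(spacing=MIC_SPACING_M, speed=SPEED_OF_SOUND_M_S):
    """Largest possible TDoA: sound travelling along the microphone axis."""
    return spacing / speed


def distinct_delays(sample_rate=SAMPLE_RATE_HZ, upsampling=1):
    """Number of whole-sample delays available between -max and +max TDoA."""
    steps = math.floor(max_tdoa_seconds() * sample_rate * upsampling)
    return 2 * steps + 1


tdoa_us = max_tdoa_seconds() * 1e6          # about 58.3 microseconds
raw_bins = distinct_delays()                # 5 distinct delays at 48 kHz
upsampled_bins = distinct_delays(upsampling=8)  # 45 distinct delays at 8x
```

Under these assumptions the maximum TDoA spans fewer than three samples, so without upsampling only a handful of directions could be told apart.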
An Approach for Sound Source Localization by Complex-Valued Neural Network
Hirofumi TSUZUKI Mauricio KUGLER Susumu KUROYANAGI Akira IWATA

PAPER-Biocybernetics, Neurocomputing

Vol: E96-D No: 10
Page(s): 2257-2265
This paper presents a Complex-Valued Neural Network-based sound localization method. The proposed approach uses two microphones to localize sound sources in the whole horizontal plane. The method uses time delay and amplitude difference to generate a set of features which are then classified by a Complex-Valued Multi-Layer Perceptron. The advantage of using complex values is that the amplitude information can naturally mask the phase information. The proposed method is analyzed experimentally with regard to the spectral characteristics of the target sounds and its tolerance to noise. The obtained results emphasize and confirm the advantages of using Complex-Valued Neural Networks for the sound localization problem in comparison to the traditional Real-Valued Neural Network model.
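The masking effect can be illustrated with a small Python sketch. It assumes, purely for illustration, that each frequency bin is encoded as a complex number whose magnitude is the amplitude and whose angle is the inter-microphone phase difference; the paper's actual feature construction may differ, and the function names are hypothetical.

```python
import cmath


def complex_feature(amplitude, phase_difference):
    """Encode one frequency bin as amplitude * exp(j * phase_difference)."""
    return cmath.rect(amplitude, phase_difference)


def complex_features(amplitudes, phase_differences):
    """Encode a whole spectrum, one complex value per frequency bin."""
    return [complex_feature(a, p) for a, p in zip(amplitudes, phase_differences)]


# A bin with no energy contributes nothing, whatever its (meaningless) phase.
silent_bin = complex_feature(0.0, 1.2)
# A bin with energy keeps its phase as the angle of the complex value.
active_bin = complex_feature(1.0, cmath.pi / 2)
```

Because a low-amplitude bin maps to a value near zero, its unreliable phase has little influence on the network input, with no separate masking step needed.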
Real-Time Hardware Implementation of a Sound Recognition System with In-Field Learning
Mauricio KUGLER Teemu TOSSAVAINEN Miku NAKATSU Susumu KUROYANAGI Akira IWATA

PAPER-Speech and Hearing

Publicized: 2016/03/30
Vol: E99-D No: 7
Page(s): 1885-1894
The development of assistive devices for automated sound recognition is an important field of research and has been receiving increased attention. However, there are still very few methods specifically developed for identifying environmental sounds. The majority of the existing approaches try to adapt speech recognition techniques for the task, usually incurring high computational complexity. This paper proposes a sound recognition method dedicated to environmental sounds, designed with its main focus on embedded applications. The pre-processing stage is loosely based on the human hearing system, while a robust set of binary features permits a simple k-NN classifier to be used. This gives the system the capability of in-field learning, by which new sounds can be simply added to the reference set in real-time, greatly improving its usability. The system was implemented on an FPGA-based platform, developed in-house specifically for this application. The design of the proposed method took into consideration several restrictions imposed by the hardware, such as limited computing power and memory, and supports up to 12 reference sounds of around 5.3 s each. Experiments were performed on a database of 29 sounds. Sensitivity and specificity were evaluated over several random subsets of these signals. The obtained values for sensitivity and specificity, without additional noise, were, respectively, 0.957 and 0.918. With the addition of +6 dB of pink noise, sensitivity and specificity were 0.822 and 0.942, respectively. The in-field learning strategy presented no significant change in sensitivity and a total decrease of 5.4% in specificity when progressively increasing the number of reference sounds from 1 to 9 under noisy conditions. The minimal signal-to-noise ratio required by the prototype to correctly recognize sounds was between -8 dB and 3 dB. These results show that the proposed method and implementation have great potential for several real-life applications.
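The link between binary features, a nearest-neighbor classifier, and in-field learning can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: it uses k = 1, Hamming distance, and 8-bit feature vectors, and the class name, sound labels, and feature values are all hypothetical. Only the 12-reference limit is taken from the abstract.

```python
class BinaryNearestNeighbor:
    """1-nearest-neighbor classifier over binary feature vectors."""

    def __init__(self, max_references=12):
        self.max_references = max_references  # limit stated in the abstract
        self.references = []                  # list of (label, feature_bits)

    def learn(self, label, bits):
        """In-field learning: store a new reference, no retraining needed."""
        if len(self.references) >= self.max_references:
            raise ValueError("reference memory is full")
        self.references.append((label, tuple(bits)))

    def classify(self, bits):
        """Return the label of the reference with the smallest Hamming distance."""
        if not self.references:
            raise ValueError("no reference sounds have been learned")

        def hamming(reference_bits):
            return sum(a != b for a, b in zip(reference_bits, bits))

        label, _ = min(self.references, key=lambda ref: hamming(ref[1]))
        return label


classifier = BinaryNearestNeighbor()
classifier.learn("doorbell", [1, 1, 0, 0, 1, 0, 1, 0])
classifier.learn("alarm", [0, 0, 1, 1, 0, 1, 0, 1])
first_guess = classifier.classify([1, 1, 0, 0, 1, 0, 0, 0])

# Adding a sound in the field is just one more stored reference.
classifier.learn("kettle", [1, 0, 1, 0, 1, 0, 1, 0])
second_guess = classifier.classify([1, 0, 1, 0, 1, 0, 1, 1])
```

Since learning only appends to the reference set, its cost does not depend on how many sounds are already stored, which is what makes real-time addition of sounds plausible on constrained hardware.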
A Character-Based Postprocessing System for Handwritten Japanese Address Recognition
Keiji YAMANAKA Susumu KUROYANAGI Akira IWATA

PAPER-Image Processing,Computer Graphics and Pattern Recognition

Vol: E82-D No: 2
Page(s): 468-474
Based on previous work on handwritten Japanese kanji character recognition, a postprocessing system for handwritten Japanese address recognition is proposed. The recognition system is composed of CombNET-II, a general-purpose large-scale character recognizer, and MMVA, a modified majority voting system. Beginning with a lexicon and a set of character candidates, produced by a character recognizer for each character that composes the input word, an interpretation of the input word is generated. MMVA is used in the postprocessing stage to select the interpretation that accumulates the highest score. In the case of more than one possible interpretation, the Conflict Analyzing System calls the character recognizer again to generate scores for each character that composes each interpretation to determine the final output word. The proposed word recognition system was tested with 2 sets of handwritten Japanese city names, and recognition rates higher than 99% were achieved, demonstrating the effectiveness of the method.
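The voting step can be sketched with a plain majority vote over a lexicon. This is a simplification and not the MMVA of the paper: each lexicon word scores one point per position where its character appears among the recognizer's candidates, and ties are left to the first word found rather than to a conflict analysis stage. The function name, candidate lists, and lexicon are hypothetical.

```python
def best_interpretation(candidates, lexicon):
    """Pick the lexicon word that collects the most votes.

    `candidates[i]` is the list of characters proposed by the recognizer
    for position i of the input word.
    """
    best_word, best_score = None, -1
    for word in lexicon:
        if len(word) != len(candidates):
            continue  # only words of the same length can match
        # One vote for each position whose candidates contain the character.
        score = sum(1 for ch, cands in zip(word, candidates) if ch in cands)
        if score > best_score:
            best_word, best_score = word, score
    return best_word, best_score


# Hypothetical recognizer output for a three-character city name.
candidates = [["各", "名"], ["古", "吉"], ["屋", "尾"]]
lexicon = ["大阪市", "名寄市", "名古屋"]
word, score = best_interpretation(candidates, lexicon)
```

Restricting the search to lexicon words lets a correct word win even when the top-ranked candidate for an individual character is wrong.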
CombNET-III with Nonlinear Gating Network and Its Application in Large-Scale Classification Problems
Mauricio KUGLER Susumu KUROYANAGI Anto Satriyo NUGROHO Akira IWATA

PAPER-Pattern Recognition

Vol: E91-D No: 2
Page(s): 286-295
Modern applications of pattern recognition generate very large amounts of data, which require large computational effort to process. However, the majority of the methods intended for large-scale problems merely adapt standard classification methods without considering whether those algorithms are appropriate for large-scale problems. CombNET-II was one of the first methods specifically proposed for this kind of task. Recently, an extension of this model, named CombNET-III, was proposed. The main modifications over the previous model were the substitution of the expert networks by Support Vector Machines (SVM) and the development of a general probabilistic framework. Although the previous model's performance and flexibility were improved, the low accuracy of the gating network was still compromising CombNET-III's classification results. In addition, due to the use of SVM-based experts, the computational complexity is higher than that of CombNET-II. This paper proposes a new two-layered gating network structure that reduces the compromise between number of clusters and accuracy, increasing the model's performance with only a small complexity increase. This high-accuracy gating network also enables the removal of the low-confidence expert networks from the decoding procedure. This, in addition to a new, faster strategy for calculating multiclass SVM outputs, significantly reduced the computational complexity. Experimental results on problems with a large number of categories show that the proposed model outperforms the original CombNET-III, while presenting a computational complexity more than one order of magnitude smaller. Moreover, when applied to a database with a large number of samples, it outperformed all compared methods, confirming the proposed model's flexibility.
CombNET-III: A Support Vector Machine Based Large Scale Classifier with Probabilistic Framework
Mauricio KUGLER Susumu KUROYANAGI Anto Satriyo NUGROHO Akira IWATA

PAPER-Pattern Recognition

Vol: E89-D No: 9
Page(s): 2533-2541
Several research fields have to deal with very large classification problems, e.g. handwritten character recognition and speech recognition. Many works have proposed methods to address problems with a large number of samples, but few have addressed problems with a large number of classes. CombNET-II was one of the first methods proposed for this kind of task. It consists of a sequential clustering VQ-based gating network (stem network) and several Multilayer Perceptron (MLP) based expert classifiers (branch networks). With the objectives of increasing the classification accuracy and providing a more flexible model, this paper proposes a new model based on the CombNET-II structure, CombNET-III. The new model, intended for, but not limited to, problems with a large number of classes, replaces the MLP branch networks with multiclass Support Vector Machines (SVM). It also introduces a new probabilistic framework that outputs posterior class probabilities, enabling the model to be applied in different scenarios (e.g. together with Hidden Markov Models). These changes permit the use of a larger number of smaller clusters, which reduces the complexity of the final classifiers. Moreover, the use of binary SVM with probabilistic outputs and a probabilistic decoding scheme permits the use of a pairwise output encoding on the branch networks, which reduces the computational complexity of the training stage. The experimental results show that the proposed model outperforms both the previous model CombNET-II and a single multiclass SVM, while presenting considerably smaller complexity than the latter. It is also confirmed that CombNET-III classification accuracy scales better with the increasing number of clusters, in comparison with CombNET-II.
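One common way to obtain posterior class probabilities from a gating network and a set of expert classifiers is to weight each expert's posterior by the gate's probability for its cluster and sum the results. The sketch below shows that generic combination only; the exact formulation used by CombNET-III may differ, and the function names and numbers are hypothetical.

```python
def combine_posteriors(gate_probs, expert_posteriors):
    """Mix expert posteriors, weighting each by its gate probability.

    gate_probs[j]        : probability that the input belongs to cluster j
    expert_posteriors[j] : dict mapping class label to the probability
                           assigned by the expert of cluster j
    """
    combined = {}
    for gate, posterior in zip(gate_probs, expert_posteriors):
        for label, prob in posterior.items():
            combined[label] = combined.get(label, 0.0) + gate * prob
    return combined


def predict(gate_probs, expert_posteriors):
    """Return the class with the highest combined posterior."""
    combined = combine_posteriors(gate_probs, expert_posteriors)
    return max(combined, key=combined.get)


# Two clusters; each expert only knows the classes of its own cluster.
gate_probs = [0.7, 0.3]
expert_posteriors = [
    {"a": 0.9, "b": 0.1},
    {"b": 0.5, "c": 0.5},
]
posterior = combine_posteriors(gate_probs, expert_posteriors)
winner = predict(gate_probs, expert_posteriors)
```

If the gate probabilities and each expert's outputs sum to one, the combined values also sum to one, so they can be passed on to a downstream probabilistic model.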
