A training data selection method is proposed for multilayer neural networks (MLNNs). The method selects a small number of training samples that guarantee both good generalization and fast training of MLNNs applied to pattern classification. Generalization can be achieved by using the data located close to the boundary between the pattern classes. However, if only these data are used in training, convergence is slow; this phenomenon is analyzed in the paper. Therefore, in the proposed method, the MLNN is first trained on a number of randomly selected samples (Step 1). The samples for which the output error remains relatively large are then selected and paired with the nearest samples belonging to a different class. Each newly selected sample is paired with its nearest neighbor in turn, so that pairs of samples located close to the class boundary are found. The MLNN is further trained using these pairs (Step 2). Since there are several ways to combine Steps 1 and 2, the proposed method can be applied to both off-line and on-line training. The method reduces the number of training samples and, at the same time, speeds up training. Its usefulness is confirmed through computer simulation.
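To make the two-step procedure concrete, the following is a minimal NumPy sketch of the Step-2 selection logic, under assumptions not stated in the abstract: Euclidean distance as the nearness measure, a fixed fraction of high-error samples as seeds, and the function name select_boundary_pairs are all illustrative. Step 1 is assumed to have trained an MLNN on a random subset and produced a per-sample output error.

    # Sketch (assumed, not the paper's exact algorithm) of boundary-pair
    # selection. Step 1 supplies `errors`, the output error of each sample
    # under the roughly trained network.
    import numpy as np

    def select_boundary_pairs(X, y, errors, keep_frac=0.1):
        # X:      (n, d) training patterns
        # y:      (n,)   class labels (at least two classes assumed)
        # errors: (n,)   per-sample output error from the Step-1 network
        n = len(X)
        n_seed = max(1, int(keep_frac * n))  # seed fraction is illustrative
        # Seeds: samples the roughly trained network still gets most wrong.
        frontier = list(np.argsort(errors)[-n_seed:])
        selected = set()
        while frontier:
            i = frontier.pop()
            if i in selected:
                continue
            selected.add(i)
            # Pair sample i with its nearest neighbor from a different
            # class; such pairs straddle the decision boundary.
            other = np.where(y != y[i])[0]
            dists = np.linalg.norm(X[other] - X[i], axis=1)
            j = other[np.argmin(dists)]
            if j not in selected:
                frontier.append(j)  # newly selected data are paired in turn
        return np.array(sorted(selected))

The MLNN would then be retrained on X[idx], y[idx] for idx = select_boundary_pairs(X, y, errors), and, as the abstract notes, Steps 1 and 2 can be interleaved in several ways to support either off-line or on-line training.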
Kazuyuki HARA, Kenji NAKAYAMA, "Training Data Selection Method for Generalization by Multilayer Neural Networks" in IEICE TRANSACTIONS on Fundamentals,
vol. E81-A, no. 3, pp. 374-381, March 1998.
Abstract: A training data selection method is proposed for multilayer neural networks (MLNNs). The method selects a small number of training samples that guarantee both good generalization and fast training of MLNNs applied to pattern classification. Generalization can be achieved by using the data located close to the boundary between the pattern classes. However, if only these data are used in training, convergence is slow; this phenomenon is analyzed in the paper. Therefore, in the proposed method, the MLNN is first trained on a number of randomly selected samples (Step 1). The samples for which the output error remains relatively large are then selected and paired with the nearest samples belonging to a different class. Each newly selected sample is paired with its nearest neighbor in turn, so that pairs of samples located close to the class boundary are found. The MLNN is further trained using these pairs (Step 2). Since there are several ways to combine Steps 1 and 2, the proposed method can be applied to both off-line and on-line training. The method reduces the number of training samples and, at the same time, speeds up training. Its usefulness is confirmed through computer simulation.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/e81-a_3_374/_p
@ARTICLE{e81-a_3_374,
author={Kazuyuki HARA and Kenji NAKAYAMA},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Training Data Selection Method for Generalization by Multilayer Neural Networks},
year={1998},
volume={E81-A},
number={3},
pages={374-381},
abstract={A training data selection method is proposed for multilayer neural networks (MLNNs). The method selects a small number of training samples that guarantee both good generalization and fast training of MLNNs applied to pattern classification. Generalization can be achieved by using the data located close to the boundary between the pattern classes. However, if only these data are used in training, convergence is slow; this phenomenon is analyzed in the paper. Therefore, in the proposed method, the MLNN is first trained on a number of randomly selected samples (Step 1). The samples for which the output error remains relatively large are then selected and paired with the nearest samples belonging to a different class. Each newly selected sample is paired with its nearest neighbor in turn, so that pairs of samples located close to the class boundary are found. The MLNN is further trained using these pairs (Step 2). Since there are several ways to combine Steps 1 and 2, the proposed method can be applied to both off-line and on-line training. The method reduces the number of training samples and, at the same time, speeds up training. Its usefulness is confirmed through computer simulation.},
keywords={},
doi={},
ISSN={},
month={March},}
TY - JOUR
TI - Training Data Selection Method for Generalization by Multilayer Neural Networks
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 374
EP - 381
AU - Kazuyuki HARA
AU - Kenji NAKAYAMA
PY - 1998
DO -
JO - IEICE TRANSACTIONS on Fundamentals
SN -
VL - E81-A
IS - 3
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - 1998/03//
AB - A training data selection method is proposed for multilayer neural networks (MLNNs). The method selects a small number of training samples that guarantee both good generalization and fast training of MLNNs applied to pattern classification. Generalization can be achieved by using the data located close to the boundary between the pattern classes. However, if only these data are used in training, convergence is slow; this phenomenon is analyzed in the paper. Therefore, in the proposed method, the MLNN is first trained on a number of randomly selected samples (Step 1). The samples for which the output error remains relatively large are then selected and paired with the nearest samples belonging to a different class. Each newly selected sample is paired with its nearest neighbor in turn, so that pairs of samples located close to the class boundary are found. The MLNN is further trained using these pairs (Step 2). Since there are several ways to combine Steps 1 and 2, the proposed method can be applied to both off-line and on-line training. The method reduces the number of training samples and, at the same time, speeds up training. Its usefulness is confirmed through computer simulation.
ER -