A training data selection method is proposed for multilayer neural networks (MLNNs). The method selects a small number of training samples that guarantee both good generalization and fast training of MLNNs applied to pattern classification. Generalization can be achieved by using the data located close to the boundary between the pattern classes. However, if only these data are used in training, convergence is slow; this phenomenon is analyzed in the paper. Therefore, in the proposed method, the MLNN is first trained on a number of randomly selected samples (Step 1). The samples for which the output error remains relatively large are then selected and paired with the nearest samples belonging to a different class. Each newly selected sample is paired with its nearest neighbor in turn, so that pairs of samples located close to the class boundary are found. The MLNN is further trained using these pairs (Step 2). Since there are several ways to combine Steps 1 and 2, the proposed method can be applied to both off-line and on-line training. The method reduces the number of training samples and, at the same time, speeds up training. Its usefulness is confirmed through computer simulation.
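To make the two-step procedure concrete, the following is a minimal NumPy sketch of the Step-2 selection logic, under assumptions not stated in the abstract: Euclidean distance as the nearness measure, a fixed fraction of high-error samples as seeds, and the function name select_boundary_pairs are all illustrative. Step 1 is assumed to have trained an MLNN on a random subset and produced a per-sample output error.

    # Sketch (assumed, not the paper's exact algorithm) of boundary-pair
    # selection. Step 1 supplies `errors`, the output error of each sample
    # under the roughly trained network.
    import numpy as np

    def select_boundary_pairs(X, y, errors, keep_frac=0.1):
        # X:      (n, d) training patterns
        # y:      (n,)   class labels (at least two classes assumed)
        # errors: (n,)   per-sample output error from the Step-1 network
        n = len(X)
        n_seed = max(1, int(keep_frac * n))  # seed fraction is illustrative
        # Seeds: samples the roughly trained network still gets most wrong.
        frontier = list(np.argsort(errors)[-n_seed:])
        selected = set()
        while frontier:
            i = frontier.pop()
            if i in selected:
                continue
            selected.add(i)
            # Pair sample i with its nearest neighbor from a different
            # class; such pairs straddle the decision boundary.
            other = np.where(y != y[i])[0]
            dists = np.linalg.norm(X[other] - X[i], axis=1)
            j = other[np.argmin(dists)]
            if j not in selected:
                frontier.append(j)  # newly selected data are paired in turn
        return np.array(sorted(selected))

The MLNN would then be retrained on X[idx], y[idx] for idx = select_boundary_pairs(X, y, errors), and, as the abstract notes, Steps 1 and 2 can be interleaved in several ways to support either off-line or on-line training.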
Kazuyuki HARA, Kenji NAKAYAMA, "Training Data Selection Method for Generalization by Multilayer Neural Networks" in IEICE TRANSACTIONS on Fundamentals,
vol. E81-A, no. 3, pp. 374-381, March 1998.
Abstract: A training data selection method is proposed for multilayer neural networks (MLNNs). The method selects a small number of training samples that guarantee both good generalization and fast training of MLNNs applied to pattern classification. Generalization can be achieved by using the data located close to the boundary between the pattern classes. However, if only these data are used in training, convergence is slow; this phenomenon is analyzed in the paper. Therefore, in the proposed method, the MLNN is first trained on a number of randomly selected samples (Step 1). The samples for which the output error remains relatively large are then selected and paired with the nearest samples belonging to a different class. Each newly selected sample is paired with its nearest neighbor in turn, so that pairs of samples located close to the class boundary are found. The MLNN is further trained using these pairs (Step 2). Since there are several ways to combine Steps 1 and 2, the proposed method can be applied to both off-line and on-line training. The method reduces the number of training samples and, at the same time, speeds up training. Its usefulness is confirmed through computer simulation.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/e81-a_3_374/_p
@ARTICLE{e81-a_3_374,
author={Kazuyuki HARA and Kenji NAKAYAMA},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Training Data Selection Method for Generalization by Multilayer Neural Networks},
year={1998},
volume={E81-A},
number={3},
pages={374-381},
abstract={A training data selection method is proposed for multilayer neural networks (MLNNs). The method selects a small number of training samples that guarantee both good generalization and fast training of MLNNs applied to pattern classification. Generalization can be achieved by using the data located close to the boundary between the pattern classes. However, if only these data are used in training, convergence is slow; this phenomenon is analyzed in the paper. Therefore, in the proposed method, the MLNN is first trained on a number of randomly selected samples (Step 1). The samples for which the output error remains relatively large are then selected and paired with the nearest samples belonging to a different class. Each newly selected sample is paired with its nearest neighbor in turn, so that pairs of samples located close to the class boundary are found. The MLNN is further trained using these pairs (Step 2). Since there are several ways to combine Steps 1 and 2, the proposed method can be applied to both off-line and on-line training. The method reduces the number of training samples and, at the same time, speeds up training. Its usefulness is confirmed through computer simulation.},
keywords={},
doi={},
ISSN={},
month={March},}
TY - JOUR
TI - Training Data Selection Method for Generalization by Multilayer Neural Networks
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 374
EP - 381
AU - Kazuyuki HARA
AU - Kenji NAKAYAMA
PY - 1998
DO -
JO - IEICE TRANSACTIONS on Fundamentals
SN -
VL - E81-A
IS - 3
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - 1998/03//
AB - A training data selection method is proposed for multilayer neural networks (MLNNs). The method selects a small number of training samples that guarantee both good generalization and fast training of MLNNs applied to pattern classification. Generalization can be achieved by using the data located close to the boundary between the pattern classes. However, if only these data are used in training, convergence is slow; this phenomenon is analyzed in the paper. Therefore, in the proposed method, the MLNN is first trained on a number of randomly selected samples (Step 1). The samples for which the output error remains relatively large are then selected and paired with the nearest samples belonging to a different class. Each newly selected sample is paired with its nearest neighbor in turn, so that pairs of samples located close to the class boundary are found. The MLNN is further trained using these pairs (Step 2). Since there are several ways to combine Steps 1 and 2, the proposed method can be applied to both off-line and on-line training. The method reduces the number of training samples and, at the same time, speeds up training. Its usefulness is confirmed through computer simulation.
ER -