On Gradient Descent Training Under Data Augmentation with On-Line Noisy Copies

Katsuyuki HAGIWARA

Summary:

In machine learning, data augmentation (DA) is a technique for improving the generalization performance of models. In this paper, we mainly consider gradient descent training of linear regression under DA using noisy copies of datasets, in which noise is injected into the inputs. We analyze the situation where noisy copies are newly generated and injected at each epoch, i.e., the case of using on-line noisy copies. This article can therefore also be viewed as an analysis of a method that injects noise into the training process through DA. We considered three training situations: full-batch training under the sum of squared errors, and full-batch and mini-batch training under the mean squared error. We showed that, in all cases, training under DA with on-line copies is approximately equivalent to l2 regularization training, for which the variance of the injected noise is important whereas the number of copies is not. Moreover, we showed that DA with on-line copies leads to an apparent increase of the learning rate in the full-batch condition under the sum of squared errors and in the mini-batch condition under the mean squared error. The apparent increase in learning rate and the regularization effect can be attributed to the original inputs and to the additive noise in the noisy copies, respectively. These results were confirmed in a numerical experiment, in which we found that our result applies to usual off-line DA in an under-parameterized scenario but not in an over-parameterized scenario. Moreover, we experimentally investigated the training process of neural networks under DA with off-line noisy copies and found that our analysis of linear regression applies qualitatively to neural networks.
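
The mechanism summarized above is easy to illustrate. Below is a minimal sketch (not the authors' code; all names, sizes, and hyperparameters are illustrative assumptions) of on-line noisy-copy DA for linear regression: a fresh noisy copy of the inputs is drawn at every epoch, and the resulting weights are compared with gradient descent on a ridge (l2-regularized) objective whose penalty strength equals the injected-noise variance, the strength suggested by the classical input-noise/ridge correspondence. In expectation the two gradients coincide, since the noise matrix E contributes E[E^T E] w = n sigma^2 w to the noisy-copy gradient.

    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 100, 5
    X = rng.standard_normal((n, d))
    w_true = rng.standard_normal(d)
    y = X @ w_true + 0.1 * rng.standard_normal(n)

    sigma = 0.3   # std. dev. of the injected input noise (assumed value)
    eta = 0.01    # learning rate (assumed value)
    epochs = 2000
    c = 3         # number of noisy copies generated per epoch

    # Full-batch gradient descent under DA with on-line noisy copies:
    # at each epoch, draw c fresh noisy copies of the inputs.
    w_da = np.zeros(d)
    for _ in range(epochs):
        Xc = np.concatenate(
            [X + sigma * rng.standard_normal((n, d)) for _ in range(c)]
        )
        yc = np.tile(y, c)
        grad = Xc.T @ (Xc @ w_da - yc) / len(yc)  # mean-squared-error gradient
        w_da -= eta * grad

    # For comparison: gradient descent on the ridge objective with
    # penalty strength sigma**2.
    w_ridge = np.zeros(d)
    for _ in range(epochs):
        grad = X.T @ (X @ w_ridge - y) / n + sigma**2 * w_ridge
        w_ridge -= eta * grad

    print(np.linalg.norm(w_da - w_ridge))  # small if the two trainings agree

Under these assumptions, the two weight vectors should agree up to the stochastic fluctuation induced by the finite number of copies per epoch; increasing c reduces that fluctuation but, consistent with the abstract, does not change the effective regularization strength.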

Publication
IEICE TRANSACTIONS on Information, Vol.E106-D, No.9, pp.1537-1545
Publication Date
2023/09/01
Publicized
2023/06/12
Online ISSN
1745-1361
DOI
10.1587/transinf.2023EDP7008
Type of Manuscript
PAPER
Category
Artificial Intelligence, Data Mining

Authors

Katsuyuki HAGIWARA
  Mie University

Keyword