
Keyword Search Result

[Keyword] generalization capability (8 hits)

1-8 of 8 hits
  • A New Meta-Criterion for Regularized Subspace Information Criterion

    Yasushi HIDAKA  Masashi SUGIYAMA  

     
    PAPER-Pattern Recognition

    Vol: E90-D No:11    Page(s): 1779-1786

    In order to obtain better generalization performance in supervised learning, model parameters should be determined appropriately, i.e., so that the generalization error is minimized. However, since the generalization error is inaccessible in practice, the model parameters are usually determined so that an estimator of the generalization error is minimized. The regularized subspace information criterion (RSIC) is such a generalization error estimator for model selection. RSIC includes an additional regularization parameter, which itself should be determined appropriately for good model selection. A meta-criterion for determining this regularization parameter has also been proposed and shown to be useful in practice. In this paper, we point out several drawbacks of the existing meta-criterion and propose an alternative meta-criterion that resolves them. Through simulations, we show that the new meta-criterion further improves model selection performance.
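
    A minimal sketch in Python of the selection loop described above: choose the model parameter whose estimated generalization error is smallest. A simple hold-out error stands in for RSIC, whose actual form is not reproduced here, and the Gaussian basis model, data, and all names are illustrative assumptions.

        import numpy as np

        rng = np.random.default_rng(0)
        X = rng.uniform(-3, 3, size=(60, 1))
        y = np.sin(X[:, 0]) + 0.2 * rng.standard_normal(60)

        # Gaussian basis function model (an assumed model family)
        centers = np.linspace(-3, 3, 20)
        Phi = np.exp(-(X - centers) ** 2)

        train, val = slice(0, 40), slice(40, 60)
        candidates = np.logspace(-4, 1, 30)   # candidate regularization values

        def estimated_gen_error(lam):
            # ridge solution on the training part, hold-out error on the rest
            A = Phi[train].T @ Phi[train] + lam * np.eye(Phi.shape[1])
            w = np.linalg.solve(A, Phi[train].T @ y[train])
            return np.mean((Phi[val] @ w - y[val]) ** 2)

        best = min(candidates, key=estimated_gen_error)
        print("selected regularization parameter:", best)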

  • Generalization Error Estimation for Non-linear Learning Methods

    Masashi SUGIYAMA  

     
    LETTER-Neural Networks and Bioengineering

    Vol: E90-A No:7    Page(s): 1496-1499

    Estimating the generalization error is one of the key ingredients of supervised learning, since a good generalization error estimator can be used for model selection. An unbiased generalization error estimator called the subspace information criterion (SIC) has been shown to be useful for model selection, but its range of application is limited to linear learning methods. In this paper, we extend SIC so that it is applicable to non-linear learning methods.

  • Analytic Optimization of Shrinkage Parameters Based on Regularized Subspace Information Criterion

    Masashi SUGIYAMA  Keisuke SAKURAI  

     
    PAPER-Neural Networks and Bioengineering

    Vol: E89-A No:8    Page(s): 2216-2225

    For obtaining a higher level of generalization capability in supervised learning, model parameters should be optimized, i.e., determined in such a way that the generalization error is minimized. However, since the generalization error is inaccessible in practice, model parameters are usually determined so that an estimate of the generalization error is minimized. A standard procedure for model parameter optimization is to first prepare a finite set of candidate model parameter values, estimate the generalization error for each candidate, and then choose the best one from the candidates. Increasing the number of candidates can improve the optimization quality, but it also increases the computational cost. In this paper, we give methods for analytically finding the optimal model parameter value from a set of infinitely many candidates. This maximally enhances the optimization quality while keeping the computational cost reasonable.
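
    The contrast drawn above, searching a finite grid of candidate values versus finding the optimum analytically, can be illustrated in Python with a toy criterion that is quadratic in a scalar shrinkage factor c; the coefficients are made up and the paper's actual RSIC-based criterion is not reproduced here.

        import numpy as np

        a, b = 3.7, 1.2   # assumed coefficients of the (toy) error estimate

        def J(c):
            # toy error estimate, quadratic in c (constant term omitted)
            return a * c ** 2 - 2 * b * c

        # finite-candidate procedure: quality is limited by the grid resolution
        grid = np.linspace(0.0, 1.0, 11)
        c_grid = grid[np.argmin(J(grid))]

        # analytic procedure: the optimum over all values in one step
        c_exact = b / a

        print(c_grid, c_exact)   # 0.3 versus about 0.324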

  • Active Learning with Model Selection -- Simultaneous Optimization of Sample Points and Models for Trigonometric Polynomial Models

    Masashi SUGIYAMA  Hidemitsu OGAWA  

     
    PAPER-Pattern Recognition

    Vol: E86-D No:12    Page(s): 2753-2763

    In supervised learning, the selection of sample points and models is crucial for achieving a higher level of generalization capability. So far, the problems of active learning and model selection have been studied independently. If sample points and models are optimized simultaneously, a higher level of generalization capability can be expected. We call this problem active learning with model selection. However, it cannot generally be solved by simply combining existing active learning and model selection techniques because of the active learning/model selection dilemma: the model should be fixed for selecting sample points, and conversely the sample points should be fixed for selecting models. In this paper, we show that the dilemma can be dissolved if there is a set of sample points that is optimal for all models under consideration. Based on this idea, we give a practical procedure for active learning with model selection in trigonometric polynomial models. The effectiveness of the proposed procedure is demonstrated through computer simulations.
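
    A minimal Python sketch of the idea: if one sample design is good for every candidate model, the sample points can be fixed first and model selection carried out afterwards, which dissolves the dilemma. Equispaced points over one period and a leave-one-out error are used here purely as illustrative assumptions; the paper derives the actual optimal design and uses its own selection criterion.

        import numpy as np

        rng = np.random.default_rng(1)
        n = 50
        x = np.linspace(0.0, 2 * np.pi, n, endpoint=False)  # common design for all models
        y = np.sin(x) + 0.5 * np.sin(3 * x) + 0.1 * rng.standard_normal(n)

        def design_matrix(t, order):
            cols = [np.ones_like(t)]
            for k in range(1, order + 1):
                cols += [np.cos(k * t), np.sin(k * t)]
            return np.column_stack(cols)

        def loo_error(order):
            # leave-one-out error as a stand-in for a generalization error estimator
            errs = []
            for i in range(n):
                keep = np.arange(n) != i
                w, *_ = np.linalg.lstsq(design_matrix(x[keep], order), y[keep], rcond=None)
                pred = design_matrix(x[i:i + 1], order) @ w
                errs.append(float((pred[0] - y[i]) ** 2))
            return np.mean(errs)

        best_order = min(range(1, 8), key=loo_error)   # model selection on the fixed design
        print("selected trigonometric order:", best_order)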

  • Improving Precision of the Subspace Information Criterion

    Masashi SUGIYAMA  

     
    PAPER-Neural Networks and Bioengineering

    Vol: E86-A No:7    Page(s): 1885-1895

    Evaluating the generalization performance of learning machines without using additional test samples is one of the most important issues in the machine learning community. The subspace information criterion (SIC) is one method for this purpose and has been shown to be an unbiased estimator of the generalization error with finite samples. Although the mean of SIC agrees with the true generalization error even in small sample cases, the scatter of SIC can be large under some severe conditions. In this paper, we therefore investigate what degrades the precision of SIC and discuss how its precision can be improved.

  • Incremental Construction of Projection Generalizing Neural Networks

    Masashi SUGIYAMA  Hidemitsu OGAWA  

     
    PAPER-Biocybernetics, Neurocomputing

    Vol: E85-D No:9    Page(s): 1433-1442

    In many practical situations in neural network learning, training examples tend to be supplied one by one. In such situations, incremental learning seems more natural than batch learning in view of how humans learn. In this paper, we propose an incremental learning method for neural networks under the projection learning criterion. Although projection learning is a linear learning method, achieving this goal is not straightforward, since it involves redundant expressions of functions with over-complete bases, a problem essentially related to pseudo biorthogonal bases (or frames). The proposed method provides exactly the same learning result as that obtained by batch learning. It is theoretically shown that the proposed method is computationally more efficient than batch learning.
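
    The central claim, that the incremental update reproduces the batch result exactly, can be illustrated in Python with ordinary regularized least squares and the standard recursive (Sherman-Morrison) update. This is only an analogy under that simpler criterion, not the projection-learning derivation with pseudo biorthogonal bases used in the paper.

        import numpy as np

        rng = np.random.default_rng(2)
        d, n = 5, 30
        X = rng.standard_normal((n, d))
        y = X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

        lam = 1e-3
        P = np.eye(d) / lam          # running inverse of (lam*I + sum of x x^T)
        w = np.zeros(d)
        for x_t, y_t in zip(X, y):   # training examples arrive one by one
            k = P @ x_t / (1.0 + x_t @ P @ x_t)   # gain vector
            w = w + k * (y_t - x_t @ w)           # correct the current estimate
            P = P - np.outer(k, x_t @ P)          # update the running inverse

        w_batch = np.linalg.solve(lam * np.eye(d) + X.T @ X, X.T @ y)
        print(np.allclose(w, w_batch))   # True: incremental result equals batch result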

  • Active Learning for Optimal Generalization in Trigonometric Polynomial Models

    Masashi SUGIYAMA  Hidemitsu OGAWA  

     
    PAPER-Algorithms and Data Structures

    Vol: E84-A No:9    Page(s): 2319-2329

    In this paper, we consider the problem of active learning and give a necessary and sufficient condition on sample points for optimal generalization capability. By utilizing the properties of pseudo orthogonal bases, we clarify the mechanism by which optimal generalization capability is achieved. We also show that the condition not only provides optimal generalization capability but also reduces the computational complexity and memory required to calculate the learning result functions. Based on the optimality condition, we give methods for designing optimal sample points for trigonometric polynomial models. Finally, the effectiveness of the proposed active learning method is demonstrated through computer simulations.
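
    One benefit mentioned above, reduced computation, can be illustrated in Python for a trigonometric polynomial model: with equispaced sample points over one period the basis functions are discretely orthogonal, so the Gram matrix is diagonal and the learned coefficients follow from simple inner products. This is an illustration under that specific sampling assumption, not the paper's general optimality condition.

        import numpy as np

        n, order = 16, 3
        x = np.linspace(0.0, 2 * np.pi, n, endpoint=False)   # equispaced sample points
        cols = [np.ones(n)]
        for k in range(1, order + 1):
            cols += [np.cos(k * x), np.sin(k * x)]
        Phi = np.column_stack(cols)

        G = Phi.T @ Phi
        print(np.allclose(G, np.diag(np.diag(G))))   # True: the Gram matrix is diagonal
        # hence the coefficients are (Phi.T @ y) / np.diag(G), with no matrix inversion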

  • Self-Tuning of Fuzzy Reasoning by the Steepest Descent Method and Its Application to a Parallel Parking

    Hitoshi MIYATA  Makoto OHKI  Masaaki OHKITA  

     
    PAPER-Algorithm and Computational Complexity

    Vol: E79-D No:5    Page(s): 561-569

    For fuzzy control of a manipulated variable so that it matches the required output of a plant, the fuzzy rules must be tuned. Various methods for tuning such rules automatically have been proposed, but some of them require much time for tuning and others lack generalization capability. For fuzzy control tuned by the steepest descent method, the use of piecewise linear membership functions (MSFs) has been proposed. In this algorithm, the premise MSFs of each fuzzy rule are tuned independently of the other rules, and only the MSFs corresponding to the given input-output training data are tuned, which makes the tuning efficient. Compared with the conventional triangular and Gaussian MSFs, the expressiveness is expanded. As a result, fewer training cycles are needed to construct the inference rules, and the generalization capability for expressing the behavior of the plant is improved. The effectiveness of this algorithm is illustrated with an example of parallel parking of an autonomous mobile robot.
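
    A minimal Python sketch of tuning fuzzy reasoning by steepest descent, assuming a zero-order Sugeno-type system with fixed triangular premise MSFs and tunable rule consequents; the paper additionally tunes piecewise linear premise MSFs, which is omitted here to keep the sketch short, and the target, rule layout, and learning rate are illustrative assumptions.

        import numpy as np

        x = np.linspace(0.0, 1.0, 50)
        y = np.sin(2 * np.pi * x)            # plant output to be matched (toy target)

        centers = np.linspace(0.0, 1.0, 7)   # premise MSF centers, one per rule
        width = centers[1] - centers[0]
        mu = np.maximum(0.0, 1.0 - np.abs(x[:, None] - centers) / width)  # triangular MSFs
        W = mu / mu.sum(axis=1, keepdims=True)                            # normalized firing strengths

        c = np.zeros(len(centers))           # tunable rule consequents
        eta = 0.5                            # learning rate
        for _ in range(2000):                # steepest descent on the squared error
            err = W @ c - y
            c -= eta * (W.T @ err) / len(x)

        print("final mean squared error:", np.mean((W @ c - y) ** 2))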