Author Search Result

[Author] Takafumi KANAMORI (6 hits)

  • Constrained Least-Squares Density-Difference Estimation

    Tuan Duong NGUYEN  Marthinus Christoffel DU PLESSIS  Takafumi KANAMORI  Masashi SUGIYAMA  

     
    PAPER-Artificial Intelligence, Data Mining

    Vol: E97-D No: 7
    Page(s): 1822-1829

    We address the problem of estimating the difference between two probability densities. A naive approach is a two-step procedure that first estimates two densities separately and then computes their difference. However, such a two-step procedure does not necessarily work well because the first step is performed without regard to the second step and thus a small error in the first stage can cause a big error in the second stage. Recently, a single-shot method called the least-squares density-difference (LSDD) estimator has been proposed. LSDD directly estimates the density difference without separately estimating two densities, and it was demonstrated to outperform the two-step approach. In this paper, we propose a variation of LSDD called the constrained least-squares density-difference (CLSDD) estimator, and theoretically prove that CLSDD improves the accuracy of density difference estimation for correctly specified parametric models. The usefulness of the proposed method is also demonstrated experimentally.
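
The single-shot idea in the abstract above can be sketched as follows. This is a minimal illustration of a plain LSDD-style estimator, not of the constrained variant (CLSDD) that the paper proposes. It assumes a Gaussian-kernel model fitted by regularized least squares; the function name `lsdd_fit`, the kernel width `sigma`, the regularization `lam`, and the grid of kernel centers are all choices made for this example.

```python
import numpy as np

def lsdd_fit(x1, x2, centers, sigma=0.5, lam=1e-2):
    """Fit a Gaussian-kernel model of p1(x) - p2(x) by regularized least squares."""
    d = centers.shape[1]

    def design(x):
        # Gaussian basis functions evaluated at each sample, one column per center.
        sq = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-sq / (2 * sigma ** 2))

    # H[l, l'] is the closed-form integral of the product of two basis functions.
    cc = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    H = (np.pi * sigma ** 2) ** (d / 2) * np.exp(-cc / (4 * sigma ** 2))
    # h is the difference of the empirical basis-function means of the two samples.
    h = design(x1).mean(axis=0) - design(x2).mean(axis=0)
    theta = np.linalg.solve(H + lam * np.eye(len(centers)), h)
    return lambda x: design(x) @ theta

# Toy data (assumed for illustration): p1 = N(0, 1), p2 = N(1, 1).
rng = np.random.default_rng(0)
x1 = rng.normal(0.0, 1.0, size=(500, 1))
x2 = rng.normal(1.0, 1.0, size=(500, 1))
centers = np.linspace(-3.0, 4.0, 15)[:, None]
diff = lsdd_fit(x1, x2, centers)
print(diff(np.array([[-1.0], [2.0]])))  # estimated p1 - p2 at x = -1 and x = 2
```

Neither density is estimated on its own here: the coefficients `theta` parameterize the difference directly, which is the property the abstract contrasts with the two-step procedure.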

  • Theoretical Analysis of Density Ratio Estimation

    Takafumi KANAMORI  Taiji SUZUKI  Masashi SUGIYAMA  

     
    PAPER-Algorithms and Data Structures

    Vol: E93-A No: 4
    Page(s): 787-798

    Density ratio estimation has gathered a great deal of attention recently since it can be used for various data processing tasks. In this paper, we consider three methods of density ratio estimation: (A) the numerator and denominator densities are separately estimated and then the ratio of the estimated densities is computed, (B) a logistic regression classifier discriminating denominator samples from numerator samples is learned and then the ratio of the posterior probabilities is computed, and (C) the density ratio function is directly modeled and learned by minimizing the empirical Kullback-Leibler divergence. We first prove that when the numerator and denominator densities are known to be members of the exponential family, (A) is better than (B) and (B) is better than (C). Then we show that once the model assumption is violated, (C) is better than (A) and (B). Thus in practical situations where no exact model is available, (C) would be the most promising approach to density ratio estimation.
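
As a rough illustration of method (A) above, the sketch below fits a one-dimensional Gaussian to each sample separately and takes the ratio of the fitted densities. The Gaussian model, the function names, and the toy data are assumptions made for this example. The model is correctly specified here, which is the setting in which the abstract reports (A) to be the best of the three methods.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2 * np.pi))

def ratio_by_separate_estimation(x_nu, x_de):
    """Method (A): estimate numerator and denominator densities, then divide."""
    mu_nu, sd_nu = x_nu.mean(), x_nu.std()
    mu_de, sd_de = x_de.mean(), x_de.std()
    return lambda x: gaussian_pdf(x, mu_nu, sd_nu) / gaussian_pdf(x, mu_de, sd_de)

# Toy data (assumed): numerator N(0, 1), denominator N(0.5, 1).
rng = np.random.default_rng(0)
x_nu = rng.normal(0.0, 1.0, size=2000)
x_de = rng.normal(0.5, 1.0, size=2000)
ratio = ratio_by_separate_estimation(x_nu, x_de)

# For these two Gaussians the true ratio at x = 0 is exp(0.125).
print(ratio(0.0), np.exp(0.125))
```

If the samples were not Gaussian, both fitted densities would be biased and the bias would carry over into the ratio, which is the model-misspecification case in which the abstract favors the direct approach (C).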

  • Robust Label Prediction via Label Propagation and Geodesic k-Nearest Neighbor in Online Semi-Supervised Learning

    Yuichiro WADA  Siqiang SU  Wataru KUMAGAI  Takafumi KANAMORI  

     
    PAPER-Artificial Intelligence, Data Mining

    Publicized: 2019/04/26
    Vol: E102-D No: 8
    Page(s): 1537-1545

    This paper proposes a computationally efficient offline semi-supervised algorithm that yields a more accurate prediction than the label propagation algorithm, which is commonly used in online graph-based semi-supervised learning (SSL). Our proposed method is an offline method that is intended to assist online graph-based SSL algorithms. The efficacy of the tool in creating new learning algorithms of this type is demonstrated in numerical experiments.

  • Multiscale Bagging and Its Applications

    Hidetoshi SHIMODAIRA  Takafumi KANAMORI  Masayoshi AOKI  Kouta MINE  

     
    PAPER

    Vol: E94-D No: 10
    Page(s): 1924-1932

    We propose multiscale bagging as a modification of the bagging procedure. In ordinary bagging, bootstrap resampling is used to generate bootstrap samples. We replace it with the multiscale bootstrap algorithm. In multiscale bagging, the sample size m of the bootstrap samples may differ from the sample size n of the learning dataset. To assess the output of a classifier, we compute the bootstrap probability of a class label, i.e., the frequency of observing a specified class label in the outputs of classifiers learned from bootstrap samples. A scaling law of the bootstrap probability with respect to σ² = n/m has been developed in connection with the geometrical theory. We consider two different ways of using multiscale bagging of classifiers. The first is to construct a confidence set of class labels instead of a single label. The second is to find inputs close to decision boundaries in the context of query by bagging for active learning. Interestingly, it turned out that an appropriate choice of m is m = -n, i.e., σ² = -1, for the first usage, and m = ∞, i.e., σ² = 0, for the second usage.
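
The bootstrap probability described above can be sketched for a few positive, finite values of m. The nearest-centroid classifier, the function names, and the toy data are assumptions made for this example, and the scaling-law extrapolation is not implemented. Values such as m = -n cannot be reached by resampling, so they are not covered here.

```python
import numpy as np

def nearest_centroid_predict(X, y, x_query):
    """Predict the label whose class centroid is closest to x_query."""
    labels = np.unique(y)
    centroids = np.array([X[y == c].mean(axis=0) for c in labels])
    return labels[np.argmin(((centroids - x_query) ** 2).sum(axis=1))]

def bootstrap_probability(X, y, x_query, label, m, n_boot=200, seed=0):
    """Frequency with which classifiers trained on size-m resamples output `label`."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_boot):
        idx = rng.integers(0, len(X), size=m)  # draw m points with replacement
        hits += nearest_centroid_predict(X[idx], y[idx], x_query) == label
    return hits / n_boot

# Toy two-class data (assumed): class 0 around (-1, 0), class 1 around (1, 0).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal([-1.0, 0.0], 1.0, size=(100, 2)),
               rng.normal([1.0, 0.0], 1.0, size=(100, 2))])
y = np.repeat([0, 1], 100)
n = len(X)

# A query near the decision boundary, evaluated at several scales sigma^2 = n/m.
for m in (n // 4, n, 4 * n):
    bp = bootstrap_probability(X, y, np.array([-0.1, 0.0]), label=0, m=m)
    print(f"sigma^2 = n/m = {n / m:.2f}, bootstrap probability = {bp:.2f}")
```

Setting m = n recovers ordinary bagging; the other values of m only change how many points each resample contains.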

  • Multiclass Boosting Algorithms for Shrinkage Estimators of Class Probability

    Takafumi KANAMORI  

     
    PAPER-Artificial Intelligence and Cognitive Science

    Vol: E90-D No: 12
    Page(s): 2033-2042

    Our purpose is to estimate conditional probabilities of output labels in multiclass classification problems. AdaBoost provides highly accurate classifiers and has the potential to estimate conditional probabilities. However, the conditional probability estimated by AdaBoost tends to overfit the training samples. We propose loss functions for boosting that yield shrinkage estimators. The effect of regularization is realized by shrinking the probabilities toward the uniform distribution. Numerical experiments indicate that boosting algorithms based on the proposed loss functions show significantly better results than existing boosting algorithms for the estimation of conditional probabilities.
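
To make "shrinkage toward the uniform distribution" concrete, the sketch below mixes a class-probability vector with the uniform distribution over K classes. This convex combination and the parameter name `eps` are assumptions chosen for illustration. They show only the effect of shrinkage, not the loss functions proposed in the paper.

```python
import numpy as np

def shrink_toward_uniform(probs, eps):
    """Mix class probabilities with the uniform distribution, using weight eps in [0, 1]."""
    probs = np.asarray(probs, dtype=float)
    k = probs.shape[-1]  # number of classes K
    return (1.0 - eps) * probs + eps / k

# An overconfident three-class estimate, as boosting may produce on training data.
overconfident = [0.98, 0.01, 0.01]
print(shrink_toward_uniform(overconfident, eps=0.3))
```

With eps = 0 the estimate is unchanged, and with eps = 1 it becomes exactly uniform. Intermediate values pull extreme probabilities toward 1/K while keeping the vector normalized.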

  • Least-Squares Conditional Density Estimation

    Masashi SUGIYAMA  Ichiro TAKEUCHI  Taiji SUZUKI  Takafumi KANAMORI  Hirotaka HACHIYA  Daisuke OKANOHARA  

     
    PAPER-Pattern Recognition

    Vol: E93-D No: 3
    Page(s): 583-594

    Estimating the conditional mean of an input-output relation is the goal of regression. However, regression analysis is not sufficiently informative if the conditional distribution has multi-modality, is highly asymmetric, or contains heteroscedastic noise. In such scenarios, estimating the conditional distribution itself would be more useful. In this paper, we propose a novel method of conditional density estimation that is suitable for multi-dimensional continuous variables. The basic idea of the proposed method is to express the conditional density in terms of the density ratio and the ratio is directly estimated without going through density estimation. Experiments using benchmark and robot transition datasets illustrate the usefulness of the proposed approach.