
Keyword Search Results

[Keyword] squared-loss mutual information (3 hits)

Results 1-3 of 3
  • Computationally Efficient Estimation of Squared-Loss Mutual Information with Multiplicative Kernel Models

    Tomoya SAKAI  Masashi SUGIYAMA  

     
    LETTER-Fundamentals of Information Systems

      Vol: E97-D No:4  Page(s): 968-971

    Squared-loss mutual information (SMI) is a robust measure of the statistical dependence between random variables. The sample-based SMI approximator called least-squares mutual information (LSMI) has been demonstrated to be useful in various machine learning tasks such as dimension reduction, clustering, and causal inference. The original LSMI approximates the pointwise mutual information by a kernel model, i.e., a linear combination of kernel basis functions located on paired data samples. Although LSMI was proved to achieve the optimal approximation accuracy asymptotically, its approximation capability is limited when the sample size is small, because the number of kernel basis functions is then insufficient. Increasing the number of kernel basis functions can mitigate this weakness, but a naive implementation of this idea significantly increases the computation cost. In this article, we show that LSMI with the multiplicative kernel model, which locates kernel basis functions on unpaired data samples so that the number of basis functions grows as the square of the sample size, has the same computational complexity as LSMI with the plain kernel model. We experimentally demonstrate that LSMI with the multiplicative kernel model is more accurate than LSMI with the plain kernel model in small-sample cases, with only a mild increase in computation time.
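
    To make the construction concrete, below is a minimal NumPy sketch of LSMI with the multiplicative kernel model for one-dimensional variables. The kernel width sigma and the regularizer lam are fixed here for brevity (the actual method would select them, e.g. by cross-validation), the plug-in SMI estimate on the last line is one common variant, and all function and variable names are ours. The point of the sketch is the Kronecker structure: the regularized normal equation over the n^2 coefficients reduces to two n-by-n eigendecompositions, matching the O(n^3) cost of the plain kernel model.

        import numpy as np

        def gauss_kernel(a, b, sigma):
            # Pairwise Gaussian kernel matrix for 1-D sample vectors a and b.
            d = a[:, None] - b[None, :]
            return np.exp(-d ** 2 / (2 * sigma ** 2))

        def lsmi_multiplicative(x, y, sigma=1.0, lam=1e-2):
            # Multiplicative kernel model: the density ratio p(x,y)/(p(x)p(y))
            # is modeled as sum_{l,m} Theta[l, m] * K(x, x_l) * L(y, y_m),
            # i.e. n^2 basis functions located on unpaired sample combinations.
            n = len(x)
            Kx = gauss_kernel(x, x, sigma)   # Kx[i, l] = K(x_i, x_l)
            Ky = gauss_kernel(y, y, sigma)   # Ky[j, m] = L(y_j, y_m)
            h = Kx.T @ Ky / n                # basis means over paired samples
            A = Kx.T @ Kx / n                # x-side second moments
            B = Ky.T @ Ky / n                # y-side second moments
            # Regularized normal equation: A @ Theta @ B + lam * Theta = h.
            # Its Kronecker structure is solved with two n x n
            # eigendecompositions instead of one (n^2) x (n^2) linear system.
            eva, U = np.linalg.eigh(A)
            evb, V = np.linalg.eigh(B)
            h_rot = U.T @ h @ V
            Theta = U @ (h_rot / (np.outer(eva, evb) + lam)) @ V.T
            return 0.5 * np.sum(h * Theta) - 0.5   # plug-in SMI estimate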

  • Feature Selection via ℓ1-Penalized Squared-Loss Mutual Information

    Wittawat JITKRITTUM  Hirotaka HACHIYA  Masashi SUGIYAMA  

     
    PAPER-Pattern Recognition

      Vol: E96-D No:7  Page(s): 1513-1524

    Feature selection is a technique for screening out less important features. Many existing supervised feature selection algorithms use redundancy and relevance as the main criteria for selecting features. However, feature interaction, potentially a key characteristic of real-world problems, has not received much attention. As an attempt to take feature interaction into account, we propose ℓ1-LSMI, an ℓ1-regularization-based algorithm that maximizes a squared-loss variant of mutual information between selected features and outputs. Numerical results show that ℓ1-LSMI performs well in handling redundancy, detecting non-linear dependency, and accounting for feature interaction.
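
    In our reading of the abstract, the optimization behind ℓ1-LSMI can be summarized as follows, where the feature-weight vector w, the element-wise product, and the radius R are our notation rather than the paper's:

        \max_{\mathbf{w} \ge \mathbf{0}} \; \widehat{\mathrm{SMI}}(\mathbf{w} \circ \mathbf{x},\, y)
        \quad \text{subject to} \quad \lVert \mathbf{w} \rVert_1 \le R

    Features whose weights are driven to zero by the ℓ1 constraint are screened out, and shrinking R selects fewer features; since the least-squares SMI estimate has a closed form, the objective and its gradient are cheap to evaluate at each candidate w.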

  • Least-Squares Independence Test

    Masashi SUGIYAMA  Taiji SUZUKI  

     
    LETTER-Artificial Intelligence, Data Mining

      Vol: E94-D No:6  Page(s): 1333-1336

    Identifying statistical independence between random variables is an important task in statistical data analysis. In this paper, we propose a novel non-parametric independence test based on a least-squares density-ratio estimator. Our method, called the least-squares independence test (LSIT), is distribution-free and thus more flexible than parametric approaches. Furthermore, it is equipped with a model selection procedure based on cross-validation, which is a significant advantage over existing non-parametric approaches that often require manual parameter tuning. The usefulness of the proposed method is demonstrated through numerical experiments.
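
    As an end-to-end illustration, below is a minimal NumPy sketch for one-dimensional variables: the test statistic is a plain-kernel least-squares SMI estimate (as in the first entry above), and the null distribution is obtained by permuting y, a standard calibration for non-parametric independence tests. Fixed sigma and lam stand in for the paper's cross-validation-based model selection, and all names are ours.

        import numpy as np

        def smi_hat(x, y, sigma=1.0, lam=1e-2):
            # Plain-kernel least-squares SMI estimate: Gaussian basis
            # functions centred on the paired samples (x_i, y_i).
            n = len(x)
            Kx = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2 * sigma ** 2))
            Ky = np.exp(-(y[:, None] - y[None, :]) ** 2 / (2 * sigma ** 2))
            h = (Kx * Ky).mean(axis=0)              # means over paired samples
            H = (Kx.T @ Kx) * (Ky.T @ Ky) / n ** 2  # means over all pairs (i, j)
            theta = np.linalg.solve(H + lam * np.eye(n), h)
            return 0.5 * h @ theta - 0.5

        def lsit(x, y, n_perm=200, seed=0):
            # Permutation test: shuffling y preserves both marginals but
            # destroys any dependence, so shuffled statistics sample the null.
            rng = np.random.default_rng(seed)
            stat = smi_hat(x, y)
            null = np.array([smi_hat(x, rng.permutation(y))
                             for _ in range(n_perm)])
            p_value = (1 + np.sum(null >= stat)) / (1 + n_perm)
            return stat, p_value   # small p-value => reject independence

    For example, with rng = np.random.default_rng(0), lsit(rng.normal(size=200), rng.normal(size=200)) should yield a large p-value, whereas a quadratic relation such as y = x**2 + 0.1 * rng.normal(size=200) should yield a small one.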