1-4hit |
Jaak SIMM Masashi SUGIYAMA Hirotaka HACHIYA
Reinforcement learning (RL) is a flexible framework for learning a decision rule in an unknown environment. However, a large number of samples are often required for finding a useful decision rule. To mitigate this problem, the concept of transfer learning has been employed to utilize knowledge obtained from similar RL tasks. However, most approaches developed so far are useful only in low-dimensional settings. In this paper, we propose a novel transfer learning idea that targets problems with high-dimensional states. Our idea is to transfer knowledge between state factors (e.g., interacting objects) within a single RL task. This allows the agent to learn the system dynamics of the target RL task with fewer data samples. The effectiveness of the proposed method is demonstrated through experiments.
Makoto YAMADA Masashi SUGIYAMA Gordon WICHERN Jaak SIMM
Estimating the ratio of two probability density functions (a.k.a. the importance) has recently gathered a great deal of attention since importance estimators can be used for solving various machine learning and data mining problems. In this paper, we propose a new importance estimation method using a mixture of probabilistic principal component analyzers. The proposed method is more flexible than existing approaches, and is expected to work well when the target importance function is correlated and rank-deficient. Through experiments, we illustrate the validity of the proposed approach.
Jaak SIMM Ildefons MAGRANS DE ABRIL Masashi SUGIYAMA
Multi-task learning is an important area of machine learning that tries to learn multiple tasks simultaneously to improve the accuracy of each individual task. We propose a new tree-based ensemble multi-task learning method for classification and regression (MT-ExtraTrees), based on Extremely Randomized Trees. MT-ExtraTrees is able to share data between tasks minimizing negative transfer while keeping the ability to learn non-linear solutions and to scale well to large datasets.
Makoto YAMADA Masashi SUGIYAMA Gordon WICHERN Jaak SIMM
The least-squares probabilistic classifier (LSPC) is a computationally-efficient alternative to kernel logistic regression. However, to assure its learned probabilities to be non-negative, LSPC involves a post-processing step of rounding up negative parameters to zero, which can unexpectedly influence classification performance. In order to mitigate this problem, we propose a simple alternative scheme that directly rounds up the classifier's negative outputs, not negative parameters. Through extensive experiments including real-world image classification and audio tagging tasks, we demonstrate that the proposed modification significantly improves classification accuracy, while the computational advantage of the original LSPC remains unchanged.