We consider the problem of learning a classifier using only positive and unlabeled samples. In this setting, it is known that a classifier can be successfully learned if the class prior is available. However, in practice, the class prior is unknown and thus must be estimated from data. In this paper, we propose a new method to estimate the class prior by partially matching the class-conditional density of the positive class to the input density. By performing this partial matching in terms of the Pearson divergence, which we estimate directly without density estimation via lower-bound maximization, we can obtain an analytical estimator of the class prior. We further show that an existing class prior estimation method can also be interpreted as performing partial matching under the Pearson divergence, but in an indirect manner. The superiority of our direct class prior estimation method is illustrated on several benchmark datasets.
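The abstract names the ingredients of the estimator but not its form, so the following is only a minimal sketch of one way partial matching under the Pearson divergence can lead to an analytical class-prior estimate. It assumes a linear-in-parameters model r(x) = alpha^T phi(x) with Gaussian basis functions and a ridge penalty; the basis, the parameters `sigma`, `lam`, `n_centers`, and the function names are illustrative assumptions, not details taken from the paper. Under those assumptions, maximizing the lower bound theta * alpha^T h - (1/2) alpha^T H alpha over alpha and then minimizing the resulting divergence estimate over theta gives a closed-form value.

```python
import numpy as np


def gaussian_basis(x, centers, sigma):
    """Gaussian kernel features phi(x), one column per center (assumed basis)."""
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def estimate_class_prior(x_pos, x_unl, sigma=1.0, lam=1e-3, n_centers=50, seed=0):
    """Sketch of a closed-form class-prior estimate from positive and unlabeled data.

    x_pos: array of shape (n_pos, d), samples from the positive class.
    x_unl: array of shape (n_unl, d), unlabeled samples from the input density.
    sigma, lam, n_centers, seed: hypothetical tuning parameters for this sketch.
    """
    rng = np.random.default_rng(seed)
    # Kernel centers are drawn from the unlabeled samples (an assumption).
    idx = rng.choice(len(x_unl), size=min(n_centers, len(x_unl)), replace=False)
    centers = x_unl[idx]

    phi_pos = gaussian_basis(x_pos, centers, sigma)
    phi_unl = gaussian_basis(x_unl, centers, sigma)

    # h: mean feature vector over positives; H: second moment over unlabeled data.
    h = phi_pos.mean(axis=0)
    H = phi_unl.T @ phi_unl / len(x_unl)
    G = H + lam * np.eye(len(centers))  # ridge-regularized version of H

    # Maximizing the lower bound over alpha gives alpha = theta * G^{-1} h.
    g_inv_h = np.linalg.solve(G, h)

    # Plugging alpha back in, the divergence estimate is a * theta^2 - theta + 1/2,
    # which is minimized at theta = 1 / (2 * a).
    a = h @ g_inv_h - 0.5 * (g_inv_h @ H @ g_inv_h)
    return float(1.0 / (2.0 * a))
```

Calling `estimate_class_prior(x_pos, x_unl)` with two 2-D NumPy arrays returns a single positive number. In practice `sigma` and `lam` would need to be chosen by some model-selection procedure, which this sketch leaves out.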
Marthinus Christoffel DU PLESSIS
Tokyo Institute of Technology
Masashi SUGIYAMA
Tokyo Institute of Technology
Marthinus Christoffel DU PLESSIS, Masashi SUGIYAMA, "Class Prior Estimation from Positive and Unlabeled Data" in IEICE TRANSACTIONS on Information, vol. E97-D, no. 5, pp. 1358-1362, May 2014, doi: 10.1587/transinf.E97.D.1358.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E97.D.1358/_p
@ARTICLE{e97-d_5_1358,
author={DU PLESSIS, Marthinus Christoffel and SUGIYAMA, Masashi},
journal={IEICE TRANSACTIONS on Information},
title={Class Prior Estimation from Positive and Unlabeled Data},
year={2014},
volume={E97-D},
number={5},
pages={1358--1362},
abstract={We consider the problem of learning a classifier using only positive and unlabeled samples. In this setting, it is known that a classifier can be successfully learned if the class prior is available. However, in practice, the class prior is unknown and thus must be estimated from data. In this paper, we propose a new method to estimate the class prior by partially matching the class-conditional density of the positive class to the input density. By performing this partial matching in terms of the Pearson divergence, which we estimate directly without density estimation via lower-bound maximization, we can obtain an analytical estimator of the class prior. We further show that an existing class prior estimation method can also be interpreted as performing partial matching under the Pearson divergence, but in an indirect manner. The superiority of our direct class prior estimation method is illustrated on several benchmark datasets.},
keywords={},
doi={10.1587/transinf.E97.D.1358},
ISSN={1745-1361},
month={May},
}
TY - JOUR
TI - Class Prior Estimation from Positive and Unlabeled Data
T2 - IEICE TRANSACTIONS on Information
SP - 1358
EP - 1362
AU - Marthinus Christoffel DU PLESSIS
AU - Masashi SUGIYAMA
PY - 2014
DO - 10.1587/transinf.E97.D.1358
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E97-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 2014
AB - We consider the problem of learning a classifier using only positive and unlabeled samples. In this setting, it is known that a classifier can be successfully learned if the class prior is available. However, in practice, the class prior is unknown and thus must be estimated from data. In this paper, we propose a new method to estimate the class prior by partially matching the class-conditional density of the positive class to the input density. By performing this partial matching in terms of the Pearson divergence, which we estimate directly without density estimation via lower-bound maximization, we can obtain an analytical estimator of the class prior. We further show that an existing class prior estimation method can also be interpreted as performing partial matching under the Pearson divergence, but in an indirect manner. The superiority of our direct class prior estimation method is illustrated on several benchmark datasets.
ER -