The objective of pool-based incremental active learning is to choose a sample to label from a pool of unlabeled samples in an incremental manner so that the generalization error is minimized. In this scenario, the generalization error often hits a minimum in the middle of the incremental active learning procedure and then it starts to increase. In this paper, we address the problem of early labeling stopping in probabilistic classification for minimizing the generalization error and the labeling cost. Among several possible strategies, we propose to stop labeling when the empirical class-posterior approximation error is maximized. Experiments on benchmark datasets demonstrate the usefulness of the proposed strategy.
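Since this page gives only the abstract, the following is a minimal illustrative sketch (Python/NumPy) of pool-based incremental active learning with a peak-detecting early-stopping heuristic in the spirit described above. The simplified kernel least-squares posterior model, the uncertainty-sampling query rule, the `approx_error` stopping score, and all names and parameters below are assumptions made for illustration; they are not the authors' exact LSPC formulation or stopping criterion.

```python
# Illustrative sketch only: a simplified kernel least-squares posterior model with
# uncertainty sampling, stopped when a proxy "class-posterior approximation error"
# score peaks. NOT the paper's exact method; all names/values are assumptions.
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    # Pairwise Gaussian kernel matrix between rows of A and rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def fit_ls_posterior(X, y, centers, n_classes, lam=0.1, sigma=1.0):
    # One regularized least-squares fit per class: q(c|x) = K(x, centers) @ alpha[:, c].
    Phi = gaussian_kernel(X, centers, sigma)              # (n, b) design matrix
    G = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    Y = np.eye(n_classes)[y]                              # one-hot labels, (n, c)
    return np.linalg.solve(G, Phi.T @ Y)                  # alpha, (b, c)

def predict_posterior(X, centers, alpha, sigma=1.0):
    # Clip negative outputs and renormalize rows into probability estimates.
    Q = np.maximum(gaussian_kernel(X, centers, sigma) @ alpha, 0.0)
    return Q / np.maximum(Q.sum(1, keepdims=True), 1e-12)

def approx_error(X, y, centers, alpha, n_classes, sigma=1.0):
    # Proxy stopping score: empirical squared error between the fitted posterior
    # and one-hot labels (an assumption; the paper defines its own criterion).
    Q = gaussian_kernel(X, centers, sigma) @ alpha
    return float(((Q - np.eye(n_classes)[y]) ** 2).mean())

def active_learn(X_pool, y_oracle, n_init=5, budget=50, seed=0):
    rng = np.random.default_rng(seed)
    n_classes = int(y_oracle.max()) + 1
    labeled = list(rng.choice(len(X_pool), n_init, replace=False))
    scores = []
    for _ in range(budget):
        Xl, yl = X_pool[labeled], y_oracle[labeled]
        alpha = fit_ls_posterior(Xl, yl, Xl, n_classes)
        scores.append(approx_error(Xl, yl, Xl, alpha, n_classes))
        # Heuristic: stop once the stopping score has peaked and begun to fall.
        if len(scores) >= 3 and scores[-2] > scores[-1] and scores[-2] > scores[-3]:
            break
        rest = [i for i in range(len(X_pool)) if i not in labeled]
        if not rest:
            break
        # Uncertainty sampling: query the pool point with the smallest posterior margin.
        P = predict_posterior(X_pool[rest], Xl, alpha)
        top2 = np.sort(P, axis=1)[:, -2:]
        labeled.append(rest[int(np.argmin(top2[:, 1] - top2[:, 0]))])
    return labeled, scores

# Toy demo on two Gaussian blobs (synthetic data, for illustration only).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(3, 1, (100, 2))])
y = np.repeat([0, 1], 100)
labeled, scores = active_learn(X, y, budget=30)
print(len(labeled), "labels queried; stopping-score trace:", np.round(scores, 3))
```

The peak-detection rule above (stop when the score drops after rising) is one simple way to operationalize "stop when the score is maximized" online; in practice a smoothed score or a patience window would make the detection more robust.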
Tsubasa KOBAYASHI and Masashi SUGIYAMA, "Early Stopping Heuristics in Pool-Based Incremental Active Learning for Least-Squares Probabilistic Classifier," in IEICE TRANSACTIONS on Information, vol. E95-D, no. 8, pp. 2065-2073, August 2012, doi: 10.1587/transinf.E95.D.2065.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E95.D.2065/_p
@ARTICLE{e95-d_8_2065,
author={Tsubasa KOBAYASHI and Masashi SUGIYAMA},
journal={IEICE TRANSACTIONS on Information},
title={Early Stopping Heuristics in Pool-Based Incremental Active Learning for Least-Squares Probabilistic Classifier},
year={2012},
volume={E95-D},
number={8},
pages={2065-2073},
abstract={The objective of pool-based incremental active learning is to choose a sample to label from a pool of unlabeled samples in an incremental manner so that the generalization error is minimized. In this scenario, the generalization error often hits a minimum in the middle of the incremental active learning procedure and then it starts to increase. In this paper, we address the problem of early labeling stopping in probabilistic classification for minimizing the generalization error and the labeling cost. Among several possible strategies, we propose to stop labeling when the empirical class-posterior approximation error is maximized. Experiments on benchmark datasets demonstrate the usefulness of the proposed strategy.},
doi={10.1587/transinf.E95.D.2065},
ISSN={1745-1361},
month={August},
}
TY - JOUR
TI - Early Stopping Heuristics in Pool-Based Incremental Active Learning for Least-Squares Probabilistic Classifier
T2 - IEICE TRANSACTIONS on Information
SP - 2065
EP - 2073
AU - Tsubasa KOBAYASHI
AU - Masashi SUGIYAMA
PY - 2012
DO - 10.1587/transinf.E95.D.2065
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E95-D
IS - 8
JA - IEICE TRANSACTIONS on Information
Y1 - August 2012
AB - The objective of pool-based incremental active learning is to choose a sample to label from a pool of unlabeled samples in an incremental manner so that the generalization error is minimized. In this scenario, the generalization error often hits a minimum in the middle of the incremental active learning procedure and then it starts to increase. In this paper, we address the problem of early labeling stopping in probabilistic classification for minimizing the generalization error and the labeling cost. Among several possible strategies, we propose to stop labeling when the empirical class-posterior approximation error is maximized. Experiments on benchmark datasets demonstrate the usefulness of the proposed strategy.
ER -