The construction of annotated corpora requires considerable manual effort. This paper presents a pragmatic method to minimize human intervention in the construction of a Korean part-of-speech (POS) tagged corpus. Instead of focusing on improving the performance of conventional automatic POS taggers, we devise a discriminative POS tagger that selectively produces either a single analysis or multiple analyses, depending on the tagging reliability. The proposed approach uses two decision rules to judge the tagging reliability. Experimental results show that the proposed approach can effectively control both the quality of the corpus and the amount of manual annotation through the threshold value of the rule.
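The selective-output idea in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not specify its two decision rules, so the single rule used here (compare the top candidate's score with a threshold) is an assumption, as are the function name `select_analyses`, the candidate analyses, and their scores.

```python
from typing import List, Tuple

# (tagged analysis, score); the scores below are made-up illustrative values.
Analysis = Tuple[str, float]


def select_analyses(candidates: List[Analysis], threshold: float) -> List[str]:
    """Return one analysis when the top score is reliable, otherwise all of them.

    Assumed rule (not taken from the paper): the best candidate is trusted
    if its score is at least `threshold`; otherwise every candidate is kept
    so that a human annotator can pick the correct one.
    """
    if not candidates:
        return []
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    best_text, best_score = ranked[0]
    if best_score >= threshold:
        return [best_text]                 # single analysis, no manual work
    return [text for text, _ in ranked]    # multiple analyses for review


# Hypothetical candidates for the ambiguous Korean word "나는".
candidates = [
    ("나/NP+는/JX", 0.55),
    ("날/VV+는/ETM", 0.40),
    ("나/VV+는/ETM", 0.05),
]

print(select_analyses(candidates, threshold=0.5))
print(select_analyses(candidates, threshold=0.9))
```

In this sketch, raising the threshold sends more words to manual review, and lowering it accepts more automatic decisions. This is the kind of trade-off between corpus quality and annotation effort that the abstract describes.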
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Do-Gil LEE, Gumwon HONG, Seok Kee LEE, Hae-Chang RIM, "Minimizing Human Intervention for Constructing Korean Part-of-Speech Tagged Corpus" in IEICE TRANSACTIONS on Information,
vol. E93-D, no. 8, pp. 2336-2338, August 2010, doi: 10.1587/transinf.E93.D.2336.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E93.D.2336/_p
@ARTICLE{e93-d_8_2336,
author={Do-Gil LEE and Gumwon HONG and Seok Kee LEE and Hae-Chang RIM},
journal={IEICE TRANSACTIONS on Information},
title={Minimizing Human Intervention for Constructing Korean Part-of-Speech Tagged Corpus},
year={2010},
volume={E93-D},
number={8},
pages={2336--2338},
abstract={The construction of annotated corpora requires considerable manual effort. This paper presents a pragmatic method to minimize human intervention for the construction of Korean part-of-speech (POS) tagged corpus. Instead of focusing on improving the performance of conventional automatic POS taggers, we devise a discriminative POS tagger which can selectively produce either a single analysis or multiple analyses based on the tagging reliability. The proposed approach uses two decision rules to judge the tagging reliability. Experimental results show that the proposed approach can effectively control the quality of corpus and the amount of manual annotation by the threshold value of the rule.},
keywords={},
doi={10.1587/transinf.E93.D.2336},
ISSN={1745-1361},
month={August},}
TY  - JOUR
TI  - Minimizing Human Intervention for Constructing Korean Part-of-Speech Tagged Corpus
T2  - IEICE TRANSACTIONS on Information
SP  - 2336
EP  - 2338
AU  - Do-Gil LEE
AU  - Gumwon HONG
AU  - Seok Kee LEE
AU  - Hae-Chang RIM
PY  - 2010
DO  - 10.1587/transinf.E93.D.2336
JO  - IEICE TRANSACTIONS on Information
SN  - 1745-1361
VL  - E93-D
IS  - 8
JA  - IEICE TRANSACTIONS on Information
Y1  - August 2010
AB  - The construction of annotated corpora requires considerable manual effort. This paper presents a pragmatic method to minimize human intervention for the construction of Korean part-of-speech (POS) tagged corpus. Instead of focusing on improving the performance of conventional automatic POS taggers, we devise a discriminative POS tagger which can selectively produce either a single analysis or multiple analyses based on the tagging reliability. The proposed approach uses two decision rules to judge the tagging reliability. Experimental results show that the proposed approach can effectively control the quality of corpus and the amount of manual annotation by the threshold value of the rule.
ER  - 