Cross-document personal name resolution is the process of identifying whether or not a common personal name mentioned in different documents refers to the same individual. Most previous approaches rely on lexical matching, such as the occurrence of common words surrounding the entity name, to measure the similarity between documents, and then cluster the documents according to their referents. Despite some success, measuring similarity by lexical comparison can miss important linguistic phenomena at the semantic level, such as synonymy or paraphrase. This paper presents a semantics-based approach to cross-document personal name resolution that makes the most of both lexical evidence and semantic clues. In our method, the similarity values between documents are determined by estimating the semantic relatedness between words. Further, the semantic labels attached to sentences allow us to highlight the common personal facts that are potentially available among documents. An evaluation on three web datasets demonstrates that our method achieves better performance than previous work.
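The contrast between lexical matching and relatedness-based similarity can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how semantic relatedness is estimated, so the hand-written `RELATEDNESS` table, the best-match averaging scheme, and all function names below are assumptions made for illustration only.

```python
# Toy illustration only: the relatedness scores are hypothetical values in
# [0, 1]. A real system would derive them from a lexical resource or from
# corpus statistics; the source abstract does not specify which.
RELATEDNESS = {
    frozenset(("professor", "lecturer")): 0.8,
    frozenset(("university", "college")): 0.7,
    frozenset(("guitarist", "musician")): 0.8,
}


def word_relatedness(a, b):
    """Return 1.0 for identical words, a table score if known, else 0.0."""
    if a == b:
        return 1.0
    return RELATEDNESS.get(frozenset((a, b)), 0.0)


def directed_similarity(src, dst):
    """Average, over the words in src, of the best relatedness to any word in dst."""
    if not src or not dst:
        return 0.0
    return sum(max(word_relatedness(w, v) for v in dst) for w in src) / len(src)


def document_similarity(doc_a, doc_b):
    """Symmetric similarity: mean of the two directed scores (an assumed scheme)."""
    return 0.5 * (directed_similarity(doc_a, doc_b) + directed_similarity(doc_b, doc_a))


def lexical_overlap(doc_a, doc_b):
    """Jaccard overlap of the word sets, standing in for purely lexical matching."""
    a, b = set(doc_a), set(doc_b)
    return len(a & b) / len(a | b) if a | b else 0.0


# Context words around the same personal name in three hypothetical documents.
doc1 = ["professor", "university"]
doc2 = ["lecturer", "college"]
doc3 = ["guitarist", "band"]

print(lexical_overlap(doc1, doc2))      # no shared words
print(document_similarity(doc1, doc2))  # related words still contribute
print(document_similarity(doc1, doc3))  # unrelated contexts
```

In this toy setting, `doc1` and `doc2` share no words, so a purely lexical measure treats them as unrelated, while the relatedness-based score still links them through near-synonyms. Clustering the documents by referent would then operate on these pairwise scores.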
Xuan-Hieu PHAN, Le-Minh NGUYEN, Susumu HORIGUCHI, "Personal Name Resolution Crossover Documents by a Semantics-Based Approach" in IEICE TRANSACTIONS on Information, vol. E89-D, no. 2, pp. 825-836, February 2006, doi: 10.1093/ietisy/e89-d.2.825.
URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e89-d.2.825/_p
@ARTICLE{e89-d_2_825,
author={Xuan-Hieu PHAN and Le-Minh NGUYEN and Susumu HORIGUCHI},
journal={IEICE TRANSACTIONS on Information},
title={Personal Name Resolution Crossover Documents by a Semantics-Based Approach},
year={2006},
volume={E89-D},
number={2},
pages={825-836},
abstract={Cross-document personal name resolution is the process of identifying whether or not a common personal name mentioned in different documents refers to the same individual. Most previous approaches rely on lexical matching, such as the occurrence of common words surrounding the entity name, to measure the similarity between documents, and then cluster the documents according to their referents. Despite some success, measuring similarity by lexical comparison can miss important linguistic phenomena at the semantic level, such as synonymy or paraphrase. This paper presents a semantics-based approach to cross-document personal name resolution that makes the most of both lexical evidence and semantic clues. In our method, the similarity values between documents are determined by estimating the semantic relatedness between words. Further, the semantic labels attached to sentences allow us to highlight the common personal facts that are potentially available among documents. An evaluation on three web datasets demonstrates that our method achieves better performance than previous work.},
keywords={},
doi={10.1093/ietisy/e89-d.2.825},
ISSN={1745-1361},
month={February},}
TY - JOUR
TI - Personal Name Resolution Crossover Documents by a Semantics-Based Approach
T2 - IEICE TRANSACTIONS on Information
SP - 825
EP - 836
AU - Xuan-Hieu PHAN
AU - Le-Minh NGUYEN
AU - Susumu HORIGUCHI
PY - 2006
DO - 10.1093/ietisy/e89-d.2.825
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E89-D
IS - 2
JA - IEICE TRANSACTIONS on Information
Y1 - February 2006
AB - Cross-document personal name resolution is the process of identifying whether or not a common personal name mentioned in different documents refers to the same individual. Most previous approaches rely on lexical matching, such as the occurrence of common words surrounding the entity name, to measure the similarity between documents, and then cluster the documents according to their referents. Despite some success, measuring similarity by lexical comparison can miss important linguistic phenomena at the semantic level, such as synonymy or paraphrase. This paper presents a semantics-based approach to cross-document personal name resolution that makes the most of both lexical evidence and semantic clues. In our method, the similarity values between documents are determined by estimating the semantic relatedness between words. Further, the semantic labels attached to sentences allow us to highlight the common personal facts that are potentially available among documents. An evaluation on three web datasets demonstrates that our method achieves better performance than previous work.
ER -