Two types of similarities between words have been studied in the natural language processing community: synonymy and relational similarity. A high degree of similarity exist between synonymous words. On the other hand, a high degree of relational similarity exists between analogous word pairs. We present and empirically test a hypothesis that links these two types of similarities. Specifically, we propose a method to measure the degree of synonymy between two words using relational similarity between word pairs as a proxy. Given two words, first, we represent the semantic relations that hold between those words using lexical patterns. We use a sequential pattern clustering algorithm to identify different lexical patterns that represent the same semantic relation. Second, we compute the degree of synonymy between two words using an inter-cluster covariance matrix. We compare the proposed method for measuring the degree of synonymy against previously proposed methods on the Miller-Charles dataset and the WordSimilarity-353 dataset. Our proposed method outperforms all existing Web-based similarity measures, achieving a statistically significant Pearson correlation coefficient of 0.867 on the Miller-Charles dataset.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Danushka BOLLEGALA, Yutaka MATSUO, Mitsuru ISHIZUKA, "Measuring the Degree of Synonymy between Words Using Relational Similarity between Word Pairs as a Proxy" in IEICE TRANSACTIONS on Information,
vol. E95-D, no. 8, pp. 2116-2123, August 2012, doi: 10.1587/transinf.E95.D.2116.
Abstract: Two types of similarities between words have been studied in the natural language processing community: synonymy and relational similarity. A high degree of similarity exist between synonymous words. On the other hand, a high degree of relational similarity exists between analogous word pairs. We present and empirically test a hypothesis that links these two types of similarities. Specifically, we propose a method to measure the degree of synonymy between two words using relational similarity between word pairs as a proxy. Given two words, first, we represent the semantic relations that hold between those words using lexical patterns. We use a sequential pattern clustering algorithm to identify different lexical patterns that represent the same semantic relation. Second, we compute the degree of synonymy between two words using an inter-cluster covariance matrix. We compare the proposed method for measuring the degree of synonymy against previously proposed methods on the Miller-Charles dataset and the WordSimilarity-353 dataset. Our proposed method outperforms all existing Web-based similarity measures, achieving a statistically significant Pearson correlation coefficient of 0.867 on the Miller-Charles dataset.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E95.D.2116/_p
Copy
@ARTICLE{e95-d_8_2116,
author={Danushka BOLLEGALA, Yutaka MATSUO, Mitsuru ISHIZUKA, },
journal={IEICE TRANSACTIONS on Information},
title={Measuring the Degree of Synonymy between Words Using Relational Similarity between Word Pairs as a Proxy},
year={2012},
volume={E95-D},
number={8},
pages={2116-2123},
abstract={Two types of similarities between words have been studied in the natural language processing community: synonymy and relational similarity. A high degree of similarity exist between synonymous words. On the other hand, a high degree of relational similarity exists between analogous word pairs. We present and empirically test a hypothesis that links these two types of similarities. Specifically, we propose a method to measure the degree of synonymy between two words using relational similarity between word pairs as a proxy. Given two words, first, we represent the semantic relations that hold between those words using lexical patterns. We use a sequential pattern clustering algorithm to identify different lexical patterns that represent the same semantic relation. Second, we compute the degree of synonymy between two words using an inter-cluster covariance matrix. We compare the proposed method for measuring the degree of synonymy against previously proposed methods on the Miller-Charles dataset and the WordSimilarity-353 dataset. Our proposed method outperforms all existing Web-based similarity measures, achieving a statistically significant Pearson correlation coefficient of 0.867 on the Miller-Charles dataset.},
keywords={},
doi={10.1587/transinf.E95.D.2116},
ISSN={1745-1361},
month={August},}
Copy
TY - JOUR
TI - Measuring the Degree of Synonymy between Words Using Relational Similarity between Word Pairs as a Proxy
T2 - IEICE TRANSACTIONS on Information
SP - 2116
EP - 2123
AU - Danushka BOLLEGALA
AU - Yutaka MATSUO
AU - Mitsuru ISHIZUKA
PY - 2012
DO - 10.1587/transinf.E95.D.2116
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E95-D
IS - 8
JA - IEICE TRANSACTIONS on Information
Y1 - August 2012
AB - Two types of similarities between words have been studied in the natural language processing community: synonymy and relational similarity. A high degree of similarity exist between synonymous words. On the other hand, a high degree of relational similarity exists between analogous word pairs. We present and empirically test a hypothesis that links these two types of similarities. Specifically, we propose a method to measure the degree of synonymy between two words using relational similarity between word pairs as a proxy. Given two words, first, we represent the semantic relations that hold between those words using lexical patterns. We use a sequential pattern clustering algorithm to identify different lexical patterns that represent the same semantic relation. Second, we compute the degree of synonymy between two words using an inter-cluster covariance matrix. We compare the proposed method for measuring the degree of synonymy against previously proposed methods on the Miller-Charles dataset and the WordSimilarity-353 dataset. Our proposed method outperforms all existing Web-based similarity measures, achieving a statistically significant Pearson correlation coefficient of 0.867 on the Miller-Charles dataset.
ER -