In this paper, we propose a machine learning-based method of multi-document summarization that integrates sentence extraction with bunsetsu elimination. We employ Support Vector Machines (SVMs) for both modules. To evaluate the effect of bunsetsu elimination, we participated in the multi-document summarization task at TSC-2 with two approaches: (1) sentence extraction only, and (2) sentence extraction + bunsetsu elimination. The results of the subjective evaluation at TSC-2 show that both approaches are superior to the Lead-based method in terms of information coverage. In addition, we made extracts from the given abstracts to quantitatively examine the effectiveness of bunsetsu elimination. The experimental results show that our bunsetsu elimination makes summaries more informative. Moreover, we found that extraction based on SVMs trained on short extracts is better than the Lead-based method, but that extraction based on SVMs trained on long extracts is not.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Tsutomu HIRAO, Kazuhiro TAKEUCHI, Hideki ISOZAKI, Yutaka SASAKI, Eisaku MAEDA, "SVM-Based Multi-Document Summarization Integrating Sentence Extraction with Bunsetsu Elimination" in IEICE TRANSACTIONS on Information,
vol. E86-D, no. 9, pp. 1702-1709, September 2003.
URL: https://global.ieice.org/en_transactions/information/10.1587/e86-d_9_1702/_p
@ARTICLE{e86-d_9_1702,
author={Tsutomu HIRAO and Kazuhiro TAKEUCHI and Hideki ISOZAKI and Yutaka SASAKI and Eisaku MAEDA},
journal={IEICE TRANSACTIONS on Information},
title={SVM-Based Multi-Document Summarization Integrating Sentence Extraction with Bunsetsu Elimination},
year={2003},
volume={E86-D},
number={9},
pages={1702-1709},
abstract={In this paper, we propose a machine learning-based method of multi-document summarization that integrates sentence extraction with bunsetsu elimination. We employ Support Vector Machines (SVMs) for both modules. To evaluate the effect of bunsetsu elimination, we participated in the multi-document summarization task at TSC-2 with two approaches: (1) sentence extraction only, and (2) sentence extraction + bunsetsu elimination. The results of the subjective evaluation at TSC-2 show that both approaches are superior to the Lead-based method in terms of information coverage. In addition, we made extracts from the given abstracts to quantitatively examine the effectiveness of bunsetsu elimination. The experimental results show that our bunsetsu elimination makes summaries more informative. Moreover, we found that extraction based on SVMs trained on short extracts is better than the Lead-based method, but that extraction based on SVMs trained on long extracts is not.},
month={September}
}
TY - JOUR
TI - SVM-Based Multi-Document Summarization Integrating Sentence Extraction with Bunsetsu Elimination
T2 - IEICE TRANSACTIONS on Information
SP - 1702
EP - 1709
AU - Tsutomu HIRAO
AU - Kazuhiro TAKEUCHI
AU - Hideki ISOZAKI
AU - Yutaka SASAKI
AU - Eisaku MAEDA
PY - 2003
JO - IEICE TRANSACTIONS on Information
VL - E86-D
IS - 9
JA - IEICE TRANSACTIONS on Information
Y1 - September 2003
AB - In this paper, we propose a machine learning-based method of multi-document summarization that integrates sentence extraction with bunsetsu elimination. We employ Support Vector Machines (SVMs) for both modules. To evaluate the effect of bunsetsu elimination, we participated in the multi-document summarization task at TSC-2 with two approaches: (1) sentence extraction only, and (2) sentence extraction + bunsetsu elimination. The results of the subjective evaluation at TSC-2 show that both approaches are superior to the Lead-based method in terms of information coverage. In addition, we made extracts from the given abstracts to quantitatively examine the effectiveness of bunsetsu elimination. The experimental results show that our bunsetsu elimination makes summaries more informative. Moreover, we found that extraction based on SVMs trained on short extracts is better than the Lead-based method, but that extraction based on SVMs trained on long extracts is not.
ER -