Naoaki OKAZAKI Yutaka MATSUO Naohiro MATSUMURA Mitsuru ISHIZUKA
Although there has been a great deal of research on automatic summarization, most approaches rely on statistical techniques and disregard the relationships between extracted textual segments. We propose a novel method for extracting a set of comprehensible sentences centered on several key points, which ensures sentence connectivity. The method builds a similarity network from the documents using a lexical dictionary and ranks sentences by spreading activation. We report evaluation results for a multi-document summarization system based on this method, which participated in the TSC (Text Summarization Challenge) task, a summarization competition organized by the third NTCIR project.
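The ranking idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' system: the abstract does not specify how the similarity network is built or how activation propagates, so this sketch assumes word-overlap (Jaccard) similarity in place of the lexical dictionary, and a simple damped propagation rule. The names `jaccard`, `rank_by_spreading_activation`, `seeds`, `decay`, and `steps` are hypothetical.

```python
from typing import List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Word-overlap similarity between two token sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_by_spreading_activation(
    sentences: List[str], seeds: List[int], decay: float = 0.5, steps: int = 20
) -> List[int]:
    """Return sentence indices ordered from most to least activated.

    Assumption: `seeds` are indices of sentences taken as key points; the
    abstract does not say how key points are chosen.
    """
    tokens = [set(s.lower().split()) for s in sentences]
    n = len(sentences)

    # Similarity network: edge weight = word overlap (stand-in for a lexical dictionary).
    sim = [
        [jaccard(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # Normalise each row so a node splits its activation among its neighbours.
    for row in sim:
        total = sum(row)
        if total > 0:
            row[:] = [w / total for w in row]

    source = [1.0 if i in seeds else 0.0 for i in range(n)]
    act = source[:]
    for _ in range(steps):
        # Each node keeps part of its source activation and receives the rest
        # from its neighbours in proportion to the edge weights.
        act = [
            (1 - decay) * source[j]
            + decay * sum(act[i] * sim[i][j] for i in range(n))
            for j in range(n)
        ]
    return sorted(range(n), key=lambda i: act[i], reverse=True)


ranking = rank_by_spreading_activation(
    ["cats chase mice", "mice fear cats", "stocks fell sharply today"], seeds=[0]
)
```

In this toy input the second sentence shares words with the seed and so receives activation, while the unrelated third sentence receives none and is ranked last.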
Youngjoong KO Kono KIM Jungyun SEO
Automatic text summarization aims to reduce the size of a document while preserving its content. Extractive summaries are generally produced by including only the sentences most closely related to the topic. DOCUSUM is our summarization system based on a new topic keyword identification method. The process of DOCUSUM is as follows. First, DOCUSUM converts the content words of a document into elements of a context vector space. It then constructs lexical clusters from the context vector space and identifies core clusters. Next, it selects topic keywords from the core clusters. Finally, it generates a summary of the document using the topic keywords. In experiments at various compression settings (30% compression, 10% compression, and extraction of a fixed number of sentences, 4 or 8), DOCUSUM performed better than other methods.
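The four steps described above can be sketched roughly as follows. This is not DOCUSUM itself: the abstract does not define the context vectors, the clustering procedure, or the criterion for core clusters, so every such detail here is an assumption (sentence-level co-occurrence vectors, greedy clustering against each cluster's first word, and the cluster with the highest total word frequency as the single core cluster). All names, the stopword list, and the `threshold` parameter are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict
from typing import Dict, List

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "on"}


def content_words(sentence: str) -> List[str]:
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def summarize_by_topic_keywords(
    sentences: List[str], k: int, threshold: float = 0.3
) -> List[str]:
    tokenized = [content_words(s) for s in sentences]
    freq = Counter(w for words in tokenized for w in words)

    # Step 1: context vector of a word = counts of the words sharing a sentence with it.
    context: Dict[str, Counter] = defaultdict(Counter)
    for words in tokenized:
        for w in words:
            for c in words:
                if c != w:
                    context[w][c] += 1

    # Step 2: greedy lexical clustering; a word joins the first cluster whose
    # first word is similar enough, otherwise it starts a new cluster.
    clusters: List[List[str]] = []
    for word, _ in freq.most_common():
        for cluster in clusters:
            if cosine(context[word], context[cluster[0]]) >= threshold:
                cluster.append(word)
                break
        else:
            clusters.append([word])
    if not clusters:
        return []

    # Step 3: core cluster = highest total word frequency; its words are the topic keywords.
    core = max(clusters, key=lambda c: sum(freq[w] for w in c))
    topic_keywords = set(core)

    # Step 4: keep the k sentences with the most topic keywords, in document order.
    scores = [sum(1 for w in words if w in topic_keywords) for words in tokenized]
    top = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))[:k]
    return [sentences[i] for i in sorted(top)]


docs = ["solar panels produce power", "solar panels need sunlight", "the cat sleeps"]
summary = summarize_by_topic_keywords(docs, k=2)
```

Returning the selected sentences in document order is a design choice for readability of the extract; the abstract does not state how DOCUSUM orders its output.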
Hiroyuki SAKAI Shigeru MASUYAMA
This paper proposes a statistical method for acquiring knowledge about whether some of multiple adnominal phrases can be abbreviated. The method weights each adnominal phrase by mutual information, based on the strength of the relation between the phrase and the noun it modifies. Adnominal phrases with relatively low weights are deleted. Experimental evaluation shows that the method attains a precision of about 84.1% and a recall of about 59.2%.
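A mutual-information weighting of this kind can be sketched as below. This is only an illustration under stated assumptions, not the authors' procedure: it uses pointwise mutual information over (phrase, noun) co-occurrence counts, treats "relatively low" as "below the mean weight of the candidate phrases", and gives unseen pairs a weight of 0.0. The function names and the English toy data are hypothetical; the paper deals with Japanese adnominal phrases.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (adnominal phrase, modified noun)


def pmi_table(pairs: List[Pair]) -> Dict[Pair, float]:
    """Pointwise mutual information log2(P(p, n) / (P(p) * P(n))) for each observed pair."""
    pair_counts = Counter(pairs)
    phrase_counts = Counter(p for p, _ in pairs)
    noun_counts = Counter(n for _, n in pairs)
    total = len(pairs)
    return {
        (p, n): math.log2(c * total / (phrase_counts[p] * noun_counts[n]))
        for (p, n), c in pair_counts.items()
    }


def prune_phrases(phrases: List[str], noun: str, weights: Dict[Pair, float]) -> List[str]:
    """Keep the phrases whose weight is at least the mean weight of all candidates.

    Assumptions: the mean is the cut-off for "relatively low", and an unseen
    (phrase, noun) pair gets weight 0.0, i.e. is treated as independent.
    """
    if not phrases:
        return []
    scored = [(p, weights.get((p, noun), 0.0)) for p in phrases]
    mean = sum(w for _, w in scored) / len(scored)
    return [p for p, w in scored if w >= mean]


# Toy co-occurrence data: "red" mostly modifies "apple", "big" mostly modifies "car".
corpus = (
    [("red", "apple")] * 3
    + [("red", "car")]
    + [("big", "apple")]
    + [("big", "car")] * 3
)
weights = pmi_table(corpus)
kept = prune_phrases(["red", "big"], "apple", weights)
```

With these counts, "big" is weakly associated with "apple" and falls below the mean, so only "red" is kept.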
Tatsumi YOSHIDA Shigeru MASUYAMA
We developed GOLD, a multiple-document summarization system. It generates a single summary from relevant newspaper articles at any summarization rate specified by the user. GOLD incorporates a number of summarization methods. In particular, several sentence-reduction methods are useful for shortening each sentence, which increases the number of output sentences that contain important information. We participated in Task B of NTCIR3 TSC2 to evaluate the system. GOLD performed well in the content-based evaluation, which suggests that the summarization methods it employs are promising for practical use.
Teiji FURUGORI Rihua LIN Takeshi ITO Dongli HAN
Described here is an automatic text summarization system for Japanese newspaper articles on sassho-jiken (cases of murder and bodily harm). We extract pieces of information from a text, interconnect them to represent the scenes and participants involved in the sassho-jiken, and finally produce a summary by generating sentences from the extracted information. An experiment and its evaluation show that, although the method is limited to this domain, it works well in capturing important information from the newspaper articles, and the summaries produced are better in adequacy and readability than those obtained by sentence extraction.