Keyword Search Result

[Keyword] discourse (13 hits)

Hits 1-13 of 13
  • Exploring Hypotactic Structure for Chinese-English Machine Translation with a Structure-Aware Encoder-Decoder Neural Model

    Guoyi MIAO  Yufeng CHEN  Mingtong LIU  Jinan XU  Yujie ZHANG  Wenhe FENG  

     
    PAPER-Natural Language Processing

      Publicized:
    2022/01/11
      Vol:
    E105-D No:4
      Page(s):
    797-806

    Translation of long and complex sentences has always been a challenge for machine translation. In recent years, neural machine translation (NMT) has achieved substantial progress in modeling the semantic connection between words in a sentence, but it is still insufficient in capturing discourse structure information between clauses within complex sentences, which often leads to poor discourse coherence when translating long and complex sentences. On the other hand, the hypotactic structure, a main component of the discourse structure, plays an important role in the coherence of discourse translation, but it has not been specifically studied. To tackle this problem, we propose a novel Chinese-English NMT approach that incorporates the hypotactic structure knowledge of complex sentences. Specifically, we first annotate and build a hypotactic-structure-aligned parallel corpus to provide explicit hypotactic structure knowledge of complex sentences for NMT. Then we propose three hypotactic structure-aware NMT models with three different fusion strategies, including source-side fusion, target-side fusion, and both-side fusion, to integrate the annotated structure knowledge into NMT. Experimental results on the WMT17, WMT18 and WMT19 Chinese-English translation tasks demonstrate that the proposed method can significantly improve translation performance and enhance the discourse coherence of machine translation.
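
    As a rough illustration of the source-side fusion strategy mentioned above, the sketch below gates token-level hypotactic-structure tags into encoder states. It is a minimal, hypothetical example assuming PyTorch; the names (SourceSideStructureFusion, struct_ids) are not from the paper, and the authors' actual architecture may differ.

# Hypothetical sketch of source-side fusion: hypotactic-structure tags are
# embedded and gated into the encoder states of an NMT model (assumes PyTorch).
import torch
import torch.nn as nn

class SourceSideStructureFusion(nn.Module):
    def __init__(self, d_model: int, num_structure_tags: int):
        super().__init__()
        self.struct_embed = nn.Embedding(num_structure_tags, d_model)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, encoder_states, struct_ids):
        # encoder_states: (batch, src_len, d_model); struct_ids: (batch, src_len)
        s = self.struct_embed(struct_ids)
        g = torch.sigmoid(self.gate(torch.cat([encoder_states, s], dim=-1)))
        # Gated mixture of lexical and structural information.
        return g * encoder_states + (1.0 - g) * s

# Example: fuse structure tags into dummy encoder states.
fusion = SourceSideStructureFusion(d_model=8, num_structure_tags=5)
states = torch.randn(2, 6, 8)
tags = torch.randint(0, 5, (2, 6))
print(fusion(states, tags).shape)  # torch.Size([2, 6, 8])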

  • Predictors of Pause Duration in Read-Aloud Discourse

    Xiaohong YANG  Mingxing XU  Yufang YANG  

     
    PAPER-Speech Synthesis and Related Topics

      Vol:
    E97-D No:6
      Page(s):
    1461-1467

    The research reported in this paper is an attempt to elucidate the predictors of pause duration in read-aloud discourse. Through simple linear regression analysis and stepwise multiple linear regression, we examined how different factors (namely, syntactic structure, discourse hierarchy, topic structure, preboundary length, and postboundary length) influenced pause duration both separately and jointly. Results from simple regression analysis showed that discourse hierarchy, syntactic structure, topic structure, and postboundary length had significant impacts on boundary pause duration. However, when these factors were tested in a stepwise regression analysis, only discourse hierarchy, syntactic structure, and postboundary length were found to have significant impacts on boundary pause duration. The regression model that best predicted boundary pause duration in discourse context was the one that first included syntactic structure, and then included discourse hierarchy and postboundary length. This model could account for about 80% of the variance of pause duration. Tests of mediation models showed that the effects of topic structure and discourse hierarchy were significantly mediated by syntactic structure, which was most closely correlated with pause duration. These results support an integrated model combining the influence of several factors and can be applied to text-to-speech systems.
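
    The sketch below illustrates the kind of regression analysis described in the abstract, fitting boundary pause duration from syntactic structure, discourse hierarchy, and postboundary length with scikit-learn. The data and feature coding are synthetic assumptions, not the paper's material.

# Hypothetical sketch: predicting boundary pause duration from syntactic
# structure, discourse hierarchy, and postboundary length (synthetic data).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 200
syntactic = rng.integers(1, 5, n)   # syntactic boundary strength rank
hierarchy = rng.integers(1, 4, n)   # discourse hierarchy level
postlen = rng.integers(2, 15, n)    # length of following unit (words)
pause = 50 * syntactic + 30 * hierarchy + 5 * postlen + rng.normal(0, 40, n)

X_full = np.column_stack([syntactic, hierarchy, postlen])
full_model = LinearRegression().fit(X_full, pause)
print("R^2 (syntax + hierarchy + postboundary length):",
      round(full_model.score(X_full, pause), 3))

# Stepwise-style comparison: syntactic structure alone vs. the full model.
X_syn = syntactic.reshape(-1, 1)
print("R^2 (syntax only):",
      round(LinearRegression().fit(X_syn, pause).score(X_syn, pause), 3))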

  • Ranking Multiple Dialogue States by Corpus Statistics to Improve Discourse Understanding in Spoken Dialogue Systems

    Ryuichiro HIGASHINAKA  Mikio NAKANO  

     
    PAPER-Natural Language Processing

      Vol:
    E92-D No:9
      Page(s):
    1771-1782

    This paper discusses the discourse understanding process in spoken dialogue systems. This process enables a system to understand user utterances from the context of a dialogue. Ambiguity in user utterances caused by multiple speech recognition hypotheses and parsing results sometimes makes it difficult for a system to decide on a single interpretation of a user intention. As a solution, the idea of retaining possible interpretations as multiple dialogue states and resolving the ambiguity using succeeding user utterances has been proposed. Although this approach has proven to improve discourse understanding accuracy, carefully created hand-crafted rules are necessary in order to accurately rank the dialogue states. This paper proposes automatically ranking multiple dialogue states using statistical information obtained from dialogue corpora. The experimental results in the train ticket reservation and weather information service domains show that the statistical information can significantly improve the ranking accuracy of dialogue states as well as the slot accuracy and the concept error rate of the top-ranked dialogue states.
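
    A minimal sketch of the general idea of ranking candidate dialogue states by corpus statistics is given below: each candidate interpretation is scored by smoothed log relative frequencies of its slot-value pairs. The scoring function and toy counts are assumptions, not the authors' model.

# Hypothetical sketch: rank competing dialogue states by corpus-derived
# statistics over their slot-value pairs.
import math
from collections import Counter

# Toy corpus counts of observed slot-value pairs.
corpus_counts = Counter({
    ("destination", "kyoto"): 120,
    ("destination", "kobe"): 15,
    ("date", "tomorrow"): 90,
    ("date", "today"): 60,
})
total = sum(corpus_counts.values())

def score_state(state):
    """Sum of add-one-smoothed log relative frequencies of the state's slot-value pairs."""
    return sum(math.log((corpus_counts[(slot, val)] + 1) / (total + 1))
               for slot, val in state.items())

# Two competing interpretations of an ambiguous recognition result.
candidates = [
    {"destination": "kyoto", "date": "tomorrow"},
    {"destination": "kobe", "date": "tomorrow"},
]
ranked = sorted(candidates, key=score_state, reverse=True)
print(ranked[0])  # {'destination': 'kyoto', 'date': 'tomorrow'}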

  • A Method for Reinforcing Noun Countability Prediction

    Ryo NAGATA  Atsuo KAWAI  Koichiro MORIHIRO  Naoki ISU  

     
    PAPER-Natural Language Processing

      Vol:
    E90-D No:12
      Page(s):
    2077-2086

    This paper proposes a method for reinforcing noun countability prediction, which plays a crucial role in demarcating correct determiners in machine translation and error detection. The proposed method reinforces countability prediction by introducing a novel heuristic called one countability per discourse. It claims that when a noun appears more than once in a discourse, all instances will share identical countability. The basic idea of the proposed method is that mispredictions can be corrected by efficiently using the one countability per discourse heuristic. Experiments show that the proposed method successfully reinforces countability prediction and outperforms other methods used for comparison. In addition to its performance, it has two advantages over earlier methods: (i) it is applicable to any countability prediction method, and (ii) it requires no human intervention to reinforce countability prediction.
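
    The heuristic lends itself to a very small sketch: collect the predictions for each noun in a discourse and overwrite low-confidence predictions with that noun's majority label. The function name and confidence threshold below are illustrative assumptions.

# Sketch of the "one countability per discourse" heuristic: low-confidence
# predictions are corrected to the discourse-level majority label of the noun.
from collections import Counter

def reinforce_countability(predictions, threshold=0.6):
    """predictions: list of (noun, label, confidence), label in {'count', 'mass'}."""
    majority = {}
    for noun in {p[0] for p in predictions}:
        labels = [lab for n, lab, _ in predictions if n == noun]
        majority[noun] = Counter(labels).most_common(1)[0][0]
    return [(n, majority[n] if conf < threshold else lab, conf)
            for n, lab, conf in predictions]

preds = [("paper", "count", 0.9), ("paper", "mass", 0.4),
         ("paper", "count", 0.95), ("water", "mass", 0.8)]
# The low-confidence 'mass' prediction for 'paper' is corrected to 'count'.
print(reinforce_countability(preds))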

  • Mining Causality from Texts for Question Answering System

    Chaveevan PECHSIRI  Asanee KAWTRAKUL  

     
    PAPER

      Vol:
    E90-D No:10
      Page(s):
    1523-1533

    This research aims to develop automatic knowledge mining of causality from texts for supporting an automatic question answering system (QA) in answering 'why' questions, which are among the most crucial forms of questions. The outcome of this research will assist people in diagnosing problems, such as in plant diseases, health, and industry. While previous works have extracted causality knowledge within only one or two adjacent EDUs (Elementary Discourse Units), this research focuses on mining causality knowledge existing within multiple EDUs, taking multiple causes and multiple effects into consideration, where adjacency between cause and effect is unnecessary. There are two main problems: how to identify the interesting causality events from documents, and how to identify the boundaries of the causative unit and the effective unit in terms of multiple EDUs. In addition, at least three problems are involved in boundary identification: the implicit boundary delimiter, the nonadjacent cause-consequence, and the effect surrounded by causes. This research proposes using verb-pair rules, learnt by comparing the Naïve Bayes classifier (NB) and Support Vector Machine (SVM), to identify causality EDUs in Thai agricultural and health news domains. The boundary identification problems are solved by utilizing verb-pair rules, Centering Theory, and a cue phrase set. The reason for emphasizing verbs in extracting causality is that they explicitly express the consequent events of a cause-effect relation, e.g. 'Aphids suck the sap from rice leaves. Then leaves will shrink. Later, they will become yellow and dry.' The results of the proposed methodology show that the verb-pair rules extracted from NB outperform those extracted from SVM when the corpus contains a high occurrence of each verb, while the results from SVM are better than those from NB when the corpus contains fewer occurrences of each verb. The verb-pair rules extracted from NB for causality extraction achieve the highest precision (0.88) with a recall of 0.75 on the plant disease corpus, whereas those extracted from SVM achieve the highest precision (0.89) with a recall of 0.76 on bird flu news. For boundary determination, our methodology performs very well, with approximately 96% accuracy. In addition, the extracted causality results can be generalized as laws in the Inductive-Statistical sense of Hempel's explanation theory, which will be useful for QA and reasoning.
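
    The following sketch shows, in a hypothetical form, how verb-pair features could be fed to the two classifiers the abstract compares (Naive Bayes and a linear SVM) using scikit-learn. The toy verb pairs, labels, and feature encoding are assumptions, not the Thai corpus or rules used in the paper.

# Hypothetical sketch: classify (cause-verb, effect-verb) pairs as causal or
# not with Naive Bayes and a linear SVM.
from sklearn.feature_extraction import DictVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

pairs = [{"v1": "suck", "v2": "shrink"}, {"v1": "infect", "v2": "wilt"},
         {"v1": "plant", "v2": "grow"}, {"v1": "walk", "v2": "talk"}]
labels = [1, 1, 0, 0]  # 1 = causal verb pair, 0 = non-causal

X = DictVectorizer().fit_transform(pairs)
nb = MultinomialNB().fit(X, labels)
svm = LinearSVC().fit(X, labels)
print("NB :", nb.predict(X))
print("SVM:", svm.predict(X))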

  • A Model of Discourse Segmentation and Segment Title Assignment for Lecture Speech Indexing

    Kazuhiro TAKEUCHI  Yukie NAKAO  Hitoshi ISAHARA  

     
    PAPER

      Vol:
    E90-D No:10
      Page(s):
    1601-1610

    Dividing a lecture speech into segments and providing those segments as learning objects is a quite general and convenient way to construct e-learning resources. However, it is difficult to assign to each object an appropriate title that reflects its content. Since there are various aspects of analyzing discourse segments, researchers inevitably face this diversity when describing the "meanings" of discourse segments. In this paper, we propose assigning discourse segment titles from the representation of their "meanings." In this assignment procedure, we focus on the speaker's evaluation of the event or the object of the speech. To verify the effectiveness of our idea, we examined identification of segment boundaries from the titles described by our procedure. We confirmed that the result of this identification was more accurate than that of intuitive identification.

  • Verification of Speech Recognition Results Incorporating In-domain Confidence and Discourse Coherence Measures

    Ian R. LANE  Tatsuya KAWAHARA  

     
    PAPER-Speech Recognition

      Vol:
    E89-D No:3
      Page(s):
    931-938

    Conventional confidence measures for assessing the reliability of ASR (automatic speech recognition) output are typically derived from "low-level" information obtained during speech recognition decoding. In contrast to these approaches, we propose a novel utterance verification framework which incorporates "high-level" knowledge sources. Specifically, we investigate two application-independent measures: in-domain confidence, the degree of match between the input utterance and the application domain of the back-end system, and discourse coherence, the consistency between consecutive utterances in a dialogue session. A joint confidence score is generated by combining these two measures with an orthodox measure based on GPP (generalized posterior probability). The proposed framework was evaluated on an utterance verification task for spontaneous dialogue performed via an English/Japanese speech-to-speech translation system. Incorporating the two proposed measures significantly improved utterance verification accuracy compared to using GPP alone, realizing reductions in CER (confidence error rate) of 11.4% and 8.1% for the English and Japanese sides, respectively. When negligible ASR errors (that do not affect translation) were ignored, further improvement was achieved for the English side, realizing a reduction in CER of up to 14.6% compared to the GPP case.
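
    A minimal sketch of the kind of score combination described above is shown below: GPP, in-domain confidence, and discourse coherence are merged into a joint score and thresholded for verification. The weights and threshold are illustrative assumptions, not the trained values from the paper.

# Sketch: combine GPP, in-domain confidence, and discourse coherence into a
# joint verification score (weights and threshold are illustrative).
def joint_confidence(gpp, in_domain, coherence, weights=(0.6, 0.25, 0.15)):
    """Weighted combination of confidence measures, each assumed in [0, 1]."""
    w_gpp, w_dom, w_coh = weights
    return w_gpp * gpp + w_dom * in_domain + w_coh * coherence

def verify_utterance(gpp, in_domain, coherence, threshold=0.5):
    """Accept the ASR hypothesis if the joint score exceeds the threshold."""
    return joint_confidence(gpp, in_domain, coherence) >= threshold

print(verify_utterance(0.72, 0.40, 0.55))  # True: GPP is high enough
print(verify_utterance(0.55, 0.10, 0.20))  # False: off-domain and incoherent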

  • An Integrated Dialogue Analysis Model for Determining Speech Acts and Discourse Structures

    Won Seug CHOI  Harksoo KIM  Jungyun SEO  

     
    PAPER-Natural Language Processing

      Vol:
    E88-D No:1
      Page(s):
    150-157

    Analysis of speech acts and discourse structures is essential to a dialogue understanding system because speech acts and discourse structures are closely tied with the speaker's intention. However, it has been difficult to infer a speech act and a discourse structure from a surface utterance because they highly depend on the context of the utterance. We propose a statistical dialogue analysis model to determine discourse structures as well as speech acts using a maximum entropy model. The model can automatically acquire probabilistic discourse knowledge from an annotated dialogue corpus. Moreover, the model can analyze speech acts and discourse structures in one framework. In the experiment, the model showed better performance than other previous works.
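
    Since a maximum entropy model over discrete features is equivalent to multinomial logistic regression, a rough sketch of the speech-act side of such a classifier can be written with scikit-learn as below. The toy utterances, labels, and bag-of-words features are assumptions, not the annotated corpus used in the paper.

# Hypothetical sketch: a maximum entropy (multinomial logistic regression)
# classifier for speech acts over simple bag-of-words features.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

utterances = ["could you book a room", "what time does it open",
              "yes please", "no thanks", "I want two tickets"]
speech_acts = ["request", "ask-ref", "accept", "reject", "request"]

model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
model.fit(utterances, speech_acts)
print(model.predict(["could you open the window"]))  # likely 'request'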

  • "Man-Computer Symbiosis" Revisited: Achieving Natural Communication and Collaboration with Computers

    Neal LESH  Joe MARKS  Charles RICH  Candace L. SIDNER  

     
    INVITED PAPER

      Vol:
    E87-D No:6
      Page(s):
    1290-1298

    In 1960, the famous computer pioneer J.C.R. Licklider described a vision for human-computer interaction that he called "man-computer symbiosis." Licklider predicted the development of computer software that would allow people "to think in interaction with a computer in the same way that you think with a colleague whose competence supplements your own." More than 40 years later, one rarely encounters any computer application that comes close to capturing Licklider's notion of human-like communication and collaboration. We echo Licklider by arguing that true symbiotic interaction requires at least the following three elements: a complementary and effective division of labor between human and machine; an explicit representation in the computer of the user's abilities, intentions, and beliefs; and the utilization of nonverbal communication modalities. We illustrate this argument with various research prototypes currently under development at Mitsubishi Electric Research Laboratories (USA).

  • Extracting Primary Information Requests from Query Messages by Partial Discourse Processing

    Yoshihiko HAYASHI  

     
    PAPER-Artificial Intelligence and Cognitive Science

      Vol:
    E79-D No:9
      Page(s):
    1344-1352

    This paper develops an efficient mechanism for extracting primary information requests from 'Seek-Object' type query messages. The mechanism consists of three steps. The first step extracts sentences which signal that the query is of the 'Seek-Object' type by recognizing distinctive surface expressions. The second step, biased by the expression patterns, analyzes the internal structures of these sentences. The third step integrates the resulting fragments by partial discourse processing and represents the writer's goal-directed information request; this step is needed because these sentences often include referential expressions whose referents appear in the background goal descriptions. We show through evaluation results that the mechanism can extract information requests fairly accurately.
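
    The three-step idea can be caricatured as below, with simple English surface patterns standing in for the original ones: spot 'Seek-Object' sentences, extract the requested object, and resolve a referential expression against the background goal description. All patterns and the example message are illustrative assumptions.

# Rough sketch of the three steps with toy surface patterns.
import re

SEEK_PATTERNS = [r"please (?:tell|send) me (?P<obj>.+)",
                 r"I am looking for (?P<obj>.+)"]

def extract_request(message):
    sentences = [s.strip() for s in message.split(".") if s.strip()]
    background_nouns = []
    for sent in sentences:
        for pat in SEEK_PATTERNS:
            m = re.search(pat, sent, re.IGNORECASE)
            if m:
                obj = m.group("obj")
                # Step 3: crude reference resolution against earlier sentences.
                if obj.lower() in ("it", "that", "one") and background_nouns:
                    return background_nouns[-1]
                return obj
        # Collect candidate antecedents from background goal descriptions.
        background_nouns += re.findall(r"\b[a-z]+ manual\b", sent, re.IGNORECASE)
    return None

msg = "I bought your XR-100 printer. The driver manual seems outdated. Please send me it."
print(extract_request(msg))  # 'driver manual'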

  • An Extended Centering Mechanism for Interpreting Pronouns and Zero-Pronouns

    Shingo TAKADA  Norihisa DOI  

     
    PAPER-Artificial Intelligence and Cognitive Science

      Vol:
    E78-D No:1
      Page(s):
    58-67

    Zero-pronouns and overt pronouns occur frequently in Japanese text. These must be interpreted by recognizing their antecedents to properly understand a piece of discourse. The notion of "centering" has been used to help in the interpretation process for intersentential anaphors. This is based on the premise that in a piece of discourse, some members receive a greater amount of attention than others. In Japanese, the zero-pronoun is said to receive the greatest amount of attention. But when there is more than one zero-pronoun in a sentence, only one of them can be accounted for using centering. Overt pronouns and any other zero-pronouns might as well have appeared as ordinary noun phrases. In this paper, the notion of centering is extended so that these can also be interpreted. Basically, zero-pronouns and overt pronouns are treated as being more "centered" in the discourse than other ordinary noun phrases. They are put in an ordered list called the Center List. Any other noun phrases appearing in a sentence are put in another list called the Possible Center List. Noun phrases within both lists are ordered according to their degree of salience. To see the effect of our approach, it was implemented in a simple system with minimal constraints and evaluated. The result showed that when the antecedent is in either the Center List or the Possible Center List, 80% of all zero-pronouns and overt pronouns were properly interpreted.
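
    A minimal sketch of the extended mechanism's data structures is given below: pronouns and zero-pronouns go into a Center List, other noun phrases into a Possible Center List, and an anaphor is resolved to the most salient compatible entry, searching the Center List first. The salience ordering and compatibility test are simplified assumptions.

# Sketch of the Center List / Possible Center List idea for anaphor resolution.
class CenteringState:
    def __init__(self):
        self.center_list = []           # entities realized as (zero-)pronouns
        self.possible_center_list = []  # entities realized as ordinary NPs

    def update(self, pronoun_entities, np_entities):
        """Record the entities of the current sentence, most salient first."""
        self.center_list = list(pronoun_entities)
        self.possible_center_list = list(np_entities)

    def resolve(self, anaphor_features):
        """Return the first compatible entity, searching the Center List first."""
        for entity in self.center_list + self.possible_center_list:
            if entity["animate"] == anaphor_features["animate"]:
                return entity["name"]
        return None

state = CenteringState()
# Sentence 1: "Taro read a book."  (Taro and the book as ordinary NPs)
state.update([], [{"name": "Taro", "animate": True},
                  {"name": "book", "animate": False}])
# Sentence 2 contains a zero-pronoun subject and an overt pronoun object.
print(state.resolve({"animate": True}))   # 'Taro'
print(state.resolve({"animate": False}))  # 'book'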

  • Manifestation of Linguistic Information in the Voice Fundamental Frequency Contours of Spoken Japanese

    Hiroya FUJISAKI  Keikichi HIROSE  Noboru TAKAHASHI  

     
    PAPER

      Vol:
    E76-A No:11
      Page(s):
    1919-1926

    Prosodic features of spoken Japanese play an important role in the transmission of linguistic information concerning lexical word accent, sentence structure, and discourse structure. In order to construct prosodic rules for synthesizing high-quality speech, therefore, prosodic features of speech should be quantitatively analyzed with respect to the linguistic information. With a special focus on the fundamental frequency contour, we first define four prosodic units for spoken Japanese, viz. prosodic word, prosodic phrase, prosodic clause, and prosodic sentence, based on a decomposition of the fundamental frequency contour using a functional model of the generation process. Syntactic units that correspond roughly to these prosodic units are also introduced. The relationships between the linguistic information and the characteristics of the components of the fundamental frequency contour are then described on the basis of results obtained by analyzing two sets of speech material. Analysis of weathercast and newscast sentences showed that prosodic boundaries, given by the manner of continuation/termination of phrase components, fall into three categories and are primarily related to syntactic boundaries. On the other hand, analysis of noun phrases with various combinations of word accent types, syntactic structures, and focal conditions indicated that the magnitude and shape of the accent components, which of course reflect the information concerning the lexical accent types of the constituent words, are largely influenced by the focal structure. The results also indicated that there are cases where prosody fails to meet all the requirements presented by word accent, syntax, and discourse.
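
    The decomposition mentioned above rests on a command-response model of F0 generation in which the log F0 contour is the sum of a baseline value, phrase components, and accent components. The sketch below computes such a contour numerically; the command timings, amplitudes, and time constants are illustrative values, not ones from the paper.

# Numerical sketch of a command-response F0 model: baseline + phrase
# components (impulse responses) + accent components (step responses),
# summed in the log F0 domain.
import numpy as np

FB, ALPHA, BETA, GAMMA = 120.0, 3.0, 20.0, 0.9

def phrase_component(t):   # response to a phrase command
    return np.where(t >= 0, ALPHA**2 * t * np.exp(-ALPHA * t), 0.0)

def accent_component(t):   # response to the onset/offset of an accent command
    return np.where(t >= 0,
                    np.minimum(1 - (1 + BETA * t) * np.exp(-BETA * t), GAMMA),
                    0.0)

t = np.linspace(0, 3, 300)
ln_f0 = (np.log(FB)
         + 0.5 * phrase_component(t - 0.0)                                  # one phrase command
         + 0.4 * (accent_component(t - 0.3) - accent_component(t - 0.9)))   # one accent command
f0 = np.exp(ln_f0)
print("peak F0:", round(f0.max(), 1), "Hz at t =", round(t[f0.argmax()], 2), "s")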

  • A Linguistic Procedure for an Extension Number Guidance System

    Naomi INOUE  Izuru NOGAITO  Masahiko TAKAHASHI  

     
    PAPER

      Vol:
    E76-D No:1
      Page(s):
    106-111

    This paper describes the linguistic procedure of our speech dialogue system. The procedure is composed of two processes: syntactic analysis using a finite state network, and discourse analysis using a plan recognition model. The finite state network is compiled from a regular grammar. The regular grammar is written so as to accept sentences in various styles, including ellipsis and inversion, and is automatically generated from a skeleton grammar. The discourse analysis module understands the utterance, generates the next question for the user, and also predicts words that are likely to appear in the next utterance. For an extension number guidance task, we obtained correct recognition results for 93% of input sentences without word prediction, and for 98% when the prediction results include the proper words.
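
    A toy sketch of the two components is given below: a small finite state network that accepts extension-number requests (including an elliptical style) and a prediction step that lists the words expected next from a given state. The grammar and vocabulary are illustrative assumptions, not the system's actual network.

# Toy finite state network for extension-number requests, with word prediction.
NETWORK = {
    ("START", "extension"): "EXT",
    ("START", "number"): "NUM",     # elliptical style: "number of Tanaka please"
    ("EXT", "number"): "NUM",
    ("NUM", "of"): "OF",
    ("OF", "tanaka"): "NAME",
    ("OF", "suzuki"): "NAME",
    ("NAME", "please"): "ACCEPT",
}

def accepts(words):
    state = "START"
    for w in words:
        state = NETWORK.get((state, w.lower()))
        if state is None:
            return False
    return state == "ACCEPT"

def predict_next(state):
    """Words the discourse module would predict as possible continuations."""
    return sorted(w for (s, w) in NETWORK if s == state)

print(accepts("extension number of Tanaka please".split()))  # True
print(accepts("number of Suzuki please".split()))            # True (elliptical style)
print(predict_next("OF"))                                     # ['suzuki', 'tanaka']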