Naohiro TAWARA Atsunori OGAWA Tomoharu IWATA Hiroto ASHIKAWA Tetsunori KOBAYASHI Tetsuji OGAWA
Most conventional multi-source domain adaptation techniques for recurrent neural network language models (RNNLMs) are domain-centric: each domain is considered independently, which makes it difficult to apply the models to target domains that are completely unseen during training. Instead, our study exploits domain attributes, which represent knowledge shared among different domains such as dialects, types of wording, styles, and topics, to achieve domain generalization that can robustly represent unseen target domains by combining those attributes. To build an attribute-based domain generalization system for language modeling, we introduce domain attribute-based experts, instead of domain-based experts, into a multi-stream RNNLM called the recurrent adaptive mixture model (RADMM). In the proposed system, a long short-term memory (LSTM) network is trained independently on each domain attribute as an expert model. The outputs of all the experts are then integrated using context-dependent weights over the domain attributes of the current input, which lets the model predict subsequent words in an unseen target domain while exploiting the specific knowledge of each domain attribute. To demonstrate the effectiveness of the proposed domain attribute-centric language model, we experimentally compared it with a conventional domain-centric language model on texts drawn from multiple domains with different writing styles, topics, dialects, and types of wording. The experimental results demonstrated that using domain attributes achieves lower perplexity.
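The architecture described above can be sketched as one LSTM expert per domain attribute, with a context-dependent gate that weights the experts' outputs before word prediction. This is a minimal illustrative sketch in PyTorch, assuming gating from the input embedding; the class name, layer sizes, and gating network are our assumptions, not the authors' implementation.

```python
# Minimal sketch of an attribute-based mixture of LSTM experts,
# loosely following the RADMM idea in the abstract. All names
# (AttributeMixtureLM, n_attributes, ...) are illustrative.
import torch
import torch.nn as nn


class AttributeMixtureLM(nn.Module):
    def __init__(self, vocab_size, embed_dim, hidden_dim, n_attributes):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # One LSTM expert per domain attribute (e.g. dialect, style, topic).
        self.experts = nn.ModuleList(
            [nn.LSTM(embed_dim, hidden_dim, batch_first=True)
             for _ in range(n_attributes)]
        )
        # Context-dependent gating: one weight per attribute,
        # predicted from the current input embedding.
        self.gate = nn.Linear(embed_dim, n_attributes)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):  # tokens: (batch, seq_len)
        x = self.embed(tokens)
        # Run every expert over the sequence and stack their outputs.
        expert_states = torch.stack(
            [expert(x)[0] for expert in self.experts], dim=-1
        )  # (batch, seq_len, hidden_dim, n_attributes)
        # Softmax over attributes gives context-dependent mixture weights.
        weights = torch.softmax(self.gate(x), dim=-1)  # (batch, seq, n_attr)
        mixed = (expert_states * weights.unsqueeze(2)).sum(dim=-1)
        return self.out(mixed)  # next-word logits
```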
Akisato KIMURA Kevin DUH Tsutomu HIRAO Katsuhiko ISHIGURO Tomoharu IWATA Albert AU YEUNG
Social media such as microblogs have become so pervasive that it is now possible to use them as sensors for real-world events and memes. While much recent research has focused on developing automatic methods for filtering and summarizing these data streams, we explore a different trend called social curation. In contrast to automatic methods, social curation is a human-in-the-loop, sometimes crowd-sourced, mechanism for exploiting social media as sensors. Although social curation web services such as Togetter, Naver Matome, and Storify are gaining popularity, little academic research has studied the phenomenon. In this paper, our goal is to investigate the phenomenon and potential of this new field of social curation. First, we perform an in-depth analysis of a large corpus of curated microblog data, seeking to understand why and how people participate in this laborious curation process. We then explore new ways in which information retrieval and machine learning technologies can assist curators. In particular, we propose a novel method based on a learning-to-rank framework that increases a curator's productivity and breadth of perspective by suggesting which novel microblogs should be added to the curated content.
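As a rough illustration of the final point, a pairwise learning-to-rank suggester can be trained so that posts the curator actually selected score above posts that were skipped. The linear scorer and feature set below are assumptions made for the sketch, not the paper's exact features or ranker.

```python
# Minimal pairwise learning-to-rank sketch for suggesting candidate
# microblogs to a curator. Assumption: curated posts should outrank
# skipped posts; PostScorer and its features are illustrative.
import torch
import torch.nn as nn


class PostScorer(nn.Module):
    """Scores a candidate post from a feature vector
    (e.g. similarity to the curated set, author signals)."""
    def __init__(self, n_features):
        super().__init__()
        self.linear = nn.Linear(n_features, 1)

    def forward(self, feats):          # feats: (batch, n_features)
        return self.linear(feats).squeeze(-1)


scorer = PostScorer(n_features=16)
loss_fn = nn.MarginRankingLoss(margin=1.0)
opt = torch.optim.SGD(scorer.parameters(), lr=0.1)

# selected / skipped: feature vectors of curated vs. non-curated posts
# (random placeholders here).
selected = torch.randn(32, 16)
skipped = torch.randn(32, 16)
# target = 1 means the first argument (selected) should score higher.
loss = loss_fn(scorer(selected), scorer(skipped), torch.ones(32))
opt.zero_grad()
loss.backward()
opt.step()
```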
Michael HENTSCHEL Marc DELCROIX Atsunori OGAWA Tomoharu IWATA Tomohiro NAKATANI
Language models are a key technology in various tasks such as speech recognition and machine translation. Because they are usually applied to texts covering a variety of domains, domain adaptation has been a long-standing challenge in language model research. With the rising popularity of neural network based language models, many adaptation methods have been proposed in recent years. These methods can be separated into two categories: model-based and feature-based adaptation. Compared with model-based domain adaptation, feature-based domain adaptation has the advantage that it does not require domain labels in the corpus. Most existing feature-based adaptation methods rely on bias adaptation. We propose a novel feature-based domain adaptation technique using hidden layer factorisation. This method is fundamentally different from existing methods because we use the domain features to calculate a linear combination of linear layers; these linear layers can capture both domain-specific information and information common to different domains. In the experiments, we compare our proposed method with existing adaptation techniques based on two different ideas, namely bias-based adaptation and gating of hidden units. All language models in our comparison use state-of-the-art long short-term memory (LSTM) based recurrent neural networks. We demonstrate the effectiveness of the proposed method with perplexity results on the well-known Penn Treebank and speech recognition results on a corpus of TED talks.
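Read literally, the factorised hidden layer computes a domain-dependent weight matrix as a mixture of basis layers, W(d) = sum_k alpha_k(d) * W_k, where the mixing coefficients alpha_k come from the domain features. Below is a minimal sketch under that reading; the class name, softmax mixing, and use of a topic-posterior-style domain feature vector are our assumptions.

```python
# Minimal sketch of a factorised hidden layer: domain features select
# a linear combination of basis linear layers. All names illustrative.
import torch
import torch.nn as nn


class FactorisedLinear(nn.Module):
    def __init__(self, in_dim, out_dim, n_bases, domain_dim):
        super().__init__()
        # Basis weight matrices: some can capture domain-specific
        # information, others information shared across domains.
        self.bases = nn.Parameter(torch.randn(n_bases, out_dim, in_dim) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_dim))
        # Map a domain feature vector (e.g. a topic posterior)
        # to mixing coefficients over the bases.
        self.mixer = nn.Linear(domain_dim, n_bases)

    def forward(self, h, domain_feats):
        # alpha: (batch, n_bases) mixing coefficients.
        alpha = torch.softmax(self.mixer(domain_feats), dim=-1)
        # Combine the basis layers: W(d) = sum_k alpha_k(d) * W_k.
        W = torch.einsum('bk,koi->boi', alpha, self.bases)  # (batch, out, in)
        return torch.einsum('boi,bi->bo', W, h) + self.bias
```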