The search functionality is under construction.

Keyword Search Result

[Keyword] Dirichlet process (6 hits)

Hits 1-6
  • Sequential Bayesian Nonparametric Multimodal Topic Models for Video Data Analysis

    Jianfei XUE  Koji EGUCHI  

     
    PAPER

      Publicized:
    2018/01/18
      Vol:
    E101-D No:4
      Page(s):
    1079-1087

    Topic modeling is a well-known method that is widely applied not only to text data mining but also to multimedia data analysis, such as video data analysis. However, existing models cannot adequately handle time dependency and multimodal data modeling for video data, which generally contain both image and speech information. In this paper, we therefore propose a novel topic model, sequential symmetric correspondence hierarchical Dirichlet processes (Seq-Sym-cHDP), extended from sequential conditionally independent hierarchical Dirichlet processes (Seq-CI-HDP) and sequential correspondence hierarchical Dirichlet processes (Seq-cHDP), which improves the multimodal data modeling mechanism by controlling the pivot assignments with a latent variable. We also develop an inference scheme for Seq-Sym-cHDP based on a posterior representation sampler. Finally, we demonstrate through experiments that our model outperforms the baseline models.

  • Video Data Modeling Using Sequential Correspondence Hierarchical Dirichlet Processes

    Jianfei XUE  Koji EGUCHI  

     
    PAPER

      Publicized:
    2016/10/07
      Vol:
    E100-D No:1
      Page(s):
    33-41

    Video data mining based on topic models is an emerging technique that has recently become a very popular research topic. In this paper, we present a novel topic model named sequential correspondence hierarchical Dirichlet processes (Seq-cHDP) to learn the hidden structure within video data. The Seq-cHDP model can be regarded as an extended hierarchical Dirichlet process (HDP) model with two important features: one is the time-dependency mechanism, which connects neighboring video frames on the basis of a time-dependent Markovian assumption, and the other is the correspondence mechanism, which handles multimodal data such as the mixture of visual words and speech words extracted from video files. A cascaded Gibbs sampling method is applied to perform inference for Seq-cHDP. We present a comprehensive experimental evaluation of Seq-cHDP and demonstrate that it outperforms the baseline models.

  • Improvement of Auctioneer's Revenue under Incomplete Information in Cognitive Radio Networks

    Jun MA  Yonghong ZHANG  Shengheng LIU  

     
    LETTER-Artificial Intelligence, Data Mining

      Publicized:
    2015/11/17
      Vol:
    E99-D No:2
      Page(s):
    533-536

    In this letter, we investigate how to set reserve prices so as to improve the primary user's revenue in a sealed-bid second-price auction under incomplete information about the secondary users' private value functions. A Dirichlet process is used to predict the next highest bid based on historical data of the highest bids. Before the next auction round begins, the primary user obtains a reserve price by maximizing the additional expected reward. Simulation results show that the proposed scheme improves the primary user's average revenue compared with several counterparts.
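
    The prediction step described in this abstract can be illustrated with a minimal sketch of the Dirichlet process posterior predictive. Everything specific here is an assumption made for illustration and is not taken from the letter: the base measure is uniform on [0, b_max], the concentration parameter is alpha, the function names are hypothetical, and the objective is the simplified quantity r * P(next highest bid >= r) rather than the letter's "additional expected reward".

```python
def dp_predictive_tail(history, r, alpha=1.0, b_max=10.0):
    """P(next highest bid >= r) under a DP posterior predictive.

    Assumed base measure: Uniform(0, b_max). The predictive mixes the base
    measure (weight alpha) with the empirical distribution of past bids
    (weight 1 per observation).
    """
    n = len(history)
    base_tail = max(0.0, min(1.0, 1.0 - r / b_max))  # uniform tail mass above r
    empirical = sum(1 for b in history if b >= r)     # past bids at or above r
    return (alpha * base_tail + empirical) / (alpha + n)


def choose_reserve(history, candidates, alpha=1.0, b_max=10.0):
    """Pick the candidate reserve price maximizing r * P(bid >= r).

    This is a simplified stand-in objective, not the one used in the letter.
    """
    return max(
        candidates,
        key=lambda r: r * dp_predictive_tail(history, r, alpha, b_max),
    )


if __name__ == "__main__":
    past_highest_bids = [4.0, 5.0, 6.0]
    print(choose_reserve(past_highest_bids, [1, 2, 3, 4, 5, 6, 7]))
```

    With few observations the base measure dominates the prediction; as the history grows, the predictive approaches the empirical distribution of past highest bids.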

  • Hybrid Parallel Inference for Hierarchical Dirichlet Processes Open Access

    Tsukasa OMOTO  Koji EGUCHI  Shotaro TORA  

     
    LETTER

      Vol:
    E97-D No:4
      Page(s):
    815-820

    The hierarchical Dirichlet process (HDP) can provide a nonparametric prior for a mixture model with grouped data, where mixture components are shared across groups. However, the computational cost is generally very high in terms of both time and space complexity. Therefore, developing a method for fast HDP inference remains a challenge. In this paper, we assume a symmetric multiprocessing (SMP) cluster, which has been widely used in recent years. To speed up the inference on an SMP cluster, we explore hybrid two-level parallelization of the Chinese restaurant franchise sampling scheme for HDP, focusing especially on the application to topic modeling. The methods we developed, Hybrid-AD-HDP and Hybrid-Diff-AD-HDP, make better use of SMP clusters, resulting in faster HDP inference. While the conventional parallel algorithms with a full message-passing interface do not benefit from using SMP clusters due to higher communication costs, the proposed hybrid parallel algorithms have lower communication costs and make better use of the computational resources.
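
    As background for the Chinese restaurant franchise mentioned in this abstract, the sketch below shows only the prior seating rule of a single Chinese restaurant process: a new customer joins an existing table with probability proportional to its occupancy and opens a new table with probability proportional to a concentration parameter alpha. It is not the paper's Hybrid-AD-HDP algorithm; it omits the dish level, the data likelihood, and all parallelization, and the function names are hypothetical.

```python
import random


def crp_seating_probs(table_counts, alpha):
    """Prior seating probabilities for the next customer.

    Returns one probability per existing table, followed by the
    probability of opening a new table.
    """
    total = sum(table_counts) + alpha
    return [count / total for count in table_counts] + [alpha / total]


def simulate_crp(num_customers, alpha, seed=0):
    """Seat customers one at a time and return the final table occupancies."""
    rng = random.Random(seed)
    table_counts = []
    for _ in range(num_customers):
        probs = crp_seating_probs(table_counts, alpha)
        choice = rng.choices(range(len(probs)), weights=probs)[0]
        if choice == len(table_counts):
            table_counts.append(1)      # open a new table
        else:
            table_counts[choice] += 1   # join an existing table
    return table_counts


if __name__ == "__main__":
    print(crp_seating_probs([2, 1], alpha=1.0))
    print(simulate_crp(20, alpha=1.0))
```

    In a full HDP sampler, these prior weights would be multiplied by the likelihood of the observation under each table's component before normalizing.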

  • Bayesian Nonparametric Approach to Blind Separation of Infinitely Many Sparse Sources

    Hirokazu KAMEOKA  Misa SATO  Takuma ONO  Nobutaka ONO  Shigeki SAGAYAMA  

     
    PAPER

      Vol:
    E96-A No:10
      Page(s):
    1928-1937

    This paper deals with the problem of underdetermined blind source separation (BSS) where the number of sources is unknown. We propose a BSS approach that simultaneously estimates the number of sources, separates the sources based on the sparseness of speech, estimates the direction of arrival of each source, and performs permutation alignment. We confirmed experimentally that reasonably good separation was obtained with the present method without specifying the number of sources.

  • A Bayesian Model of Transliteration and Its Human Evaluation When Integrated into a Machine Translation System

    Andrew FINCH  Keiji YASUDA  Hideo OKUMA  Eiichiro SUMITA  Satoshi NAKAMURA  

     
    PAPER

      Vol:
    E94-D No:10
      Page(s):
    1889-1900

    The contribution of this paper is two-fold. Firstly, we conduct a large-scale real-world evaluation of the effectiveness of integrating an automatic transliteration system with a machine translation system. A human evaluation is usually preferable to an automatic evaluation, and especially so in this case, since the common machine translation evaluation methods are affected by the length of the translations they are evaluating and are often biased by that length rather than by the information the translations convey. We evaluate our transliteration system on data collected in field experiments conducted all over Japan. Our results conclusively show that using a transliteration system can improve machine translation quality when translating unknown words. Our second contribution is to propose a novel Bayesian model for unsupervised bilingual character sequence segmentation of corpora for transliteration. The system is based on a Dirichlet process model trained using Bayesian inference through blocked Gibbs sampling, implemented using an efficient forward filtering/backward sampling dynamic programming algorithm. The Bayesian approach is able to overcome the overfitting problem inherent in maximum likelihood training. We demonstrate the effectiveness of our Bayesian segmentation by using it to build a translation model for a phrase-based statistical machine translation (SMT) system trained to perform transliteration by monotonic transduction from character sequence to character sequence. The Bayesian segmentation was used to construct a phrase-table, and we compared the quality of this phrase-table with one generated in the usual manner by the state-of-the-art GIZA++ word alignment process used in combination with phrase extraction heuristics from the MOSES statistical machine translation system, using both to perform transliteration generation within an identical framework.
In our experiments on English-Japanese data from the NEWS2010 transliteration generation shared task, we used our technique to bilingually co-segment the training corpus. We then derived a phrase-table from the segmentation given by the sample at the final iteration of the training procedure, and the resulting phrase-table was used as a direct substitute for the phrase-table extracted by using GIZA++/MOSES. The phrase-table resulting from our Bayesian segmentation model was approximately 30% smaller than that produced by the SMT system's training procedure, and gave an increase in transliteration quality measured in terms of both word accuracy and F-score.
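
The two metrics named at the end of this abstract can be sketched as follows. The abstract does not define them, so this assumes one common formulation: word accuracy is the fraction of exact matches against a single reference, and the F-score is computed per word from the longest common subsequence (LCS) between candidate and reference, then averaged. The function names and toy data are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two strings."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def word_accuracy(candidates, references):
    """Fraction of candidates that exactly match their reference."""
    matches = sum(c == r for c, r in zip(candidates, references))
    return matches / len(references)


def mean_f_score(candidates, references):
    """Average per-word F-score based on LCS precision and recall."""
    total = 0.0
    for c, r in zip(candidates, references):
        lcs = lcs_length(c, r)
        if lcs == 0:
            continue  # no overlap contributes an F-score of 0
        precision = lcs / len(c)
        recall = lcs / len(r)
        total += 2 * precision * recall / (precision + recall)
    return total / len(references)


if __name__ == "__main__":
    system_output = ["tokyo", "osaka"]
    gold = ["tokyo", "osako"]
    print(word_accuracy(system_output, gold))
    print(mean_f_score(system_output, gold))
```

Word accuracy rewards only exact matches, while the LCS-based F-score gives partial credit to near misses, which is why the two are often reported together.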