IEICE global.ieice.org Site

Author Search Result

[Author] Koji Eguchi(10hit)


Multimedia Topic Models Considering Burstiness of Local Features Open Access
Yang XIE Koji EGUCHI

PAPER

Vol: E97-D No:4
Page(s): 714-720
A number of studies have been conducted on topic modeling for various types of data, including text and image data. In this paper, we focus on the burstiness of local features when modeling topics in video data. Burstiness is a phenomenon that is often discussed for text data: if a word is used once in a document, it is more likely to be used again within that document. It is also observed in video data; for example, an object or visual word is more likely to appear repeatedly within the same video. Based on this idea, we propose a new topic model, the Correspondence Dirichlet Compound Multinomial LDA (Corr-DCMLDA), which takes into account the burstiness of the local features in video data. The unknown parameters and latent variables in the model are estimated by collapsed Gibbs sampling, and the hyperparameters are estimated by fixed-point iterations. We demonstrate through experiments on the genre classification of social video data that our model works more effectively than several baselines.
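The burstiness idea in this abstract can be illustrated with a Polya urn, the sampling scheme behind a Dirichlet compound multinomial. The sketch below is not the Corr-DCMLDA model itself; the function names, the four-word vocabulary, and the pseudo-count values are assumptions chosen only for illustration.

```python
import random


def repeat_probability(alpha, word, count_so_far, total_so_far):
    """Predictive probability of `word` under a Dirichlet compound multinomial.

    alpha: list of pseudo-counts, one per vocabulary word (hypothetical values).
    count_so_far: how often `word` already occurred in this document.
    total_so_far: how many words the document contains so far.
    """
    return (alpha[word] + count_so_far) / (sum(alpha) + total_so_far)


def sample_polya_urn(alpha, n_draws, rng):
    """Draw word indices from a Polya urn that starts with weights `alpha`."""
    weights = list(alpha)
    draws = []
    for _ in range(n_draws):
        word = rng.choices(range(len(weights)), weights=weights)[0]
        draws.append(word)
        # Reinforcement: a word that was drawn becomes more likely to recur.
        weights[word] += 1.0
    return draws


if __name__ == "__main__":
    alpha = [0.5, 0.5, 0.5, 0.5]  # assumed small pseudo-counts
    print(repeat_probability(alpha, 0, 0, 0))  # before word 0 has appeared
    print(repeat_probability(alpha, 0, 1, 1))  # after word 0 appeared once
    print(sample_polya_urn(alpha, 20, random.Random(0)))
```

With small pseudo-counts, one occurrence of a word raises its predictive probability noticeably, which is the "used once, more likely used again" behavior described above.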
Sequential Bayesian Nonparametric Multimodal Topic Models for Video Data Analysis
Jianfei XUE Koji EGUCHI

PAPER

Publicized: 2018/01/18
Vol: E101-D No:4
Page(s): 1079-1087
Topic modeling is a well-known method that is widely applied not only to text data mining but also to multimedia data analysis, such as video data analysis. However, existing models cannot adequately handle time dependency and multimodal data modeling for video data, which generally contain image information and speech information. In this paper, we therefore propose a novel topic model, sequential symmetric correspondence hierarchical Dirichlet processes (Seq-Sym-cHDP), extended from sequential conditionally independent hierarchical Dirichlet processes (Seq-CI-HDP) and sequential correspondence hierarchical Dirichlet processes (Seq-cHDP), to improve the multimodal data modeling mechanism by controlling the pivot assignments with a latent variable. An inference scheme for Seq-Sym-cHDP based on a posterior representation sampler is also developed in this work. Finally, we demonstrate through experiments that our model outperforms other baseline models.
Hybrid Parallel Inference for Hierarchical Dirichlet Processes Open Access
Tsukasa OMOTO Koji EGUCHI Shotaro TORA

LETTER

Vol: E97-D No:4
Page(s): 815-820
The hierarchical Dirichlet process (HDP) can provide a nonparametric prior for a mixture model with grouped data, where mixture components are shared across groups. However, the computational cost is generally very high in terms of both time and space complexity. Therefore, developing a method for fast inference of HDP remains a challenge. In this paper, we assume a symmetric multiprocessing (SMP) cluster, which has been widely used in recent years. To speed up the inference on an SMP cluster, we explore hybrid two-level parallelization of the Chinese restaurant franchise sampling scheme for HDP, especially focusing on the application to topic modeling. The methods we developed, Hybrid-AD-HDP and Hybrid-Diff-AD-HDP, make better use of SMP clusters, resulting in faster HDP inference. While the conventional parallel algorithms with a full message-passing interface do not benefit from using SMP clusters due to higher communication costs, the proposed hybrid parallel algorithms have lower communication costs and make better use of the computational resources.
MPI/OpenMP Hybrid Parallel Inference Methods for Latent Dirichlet Allocation – Approximation and Evaluation
Shotaro TORA Koji EGUCHI

PAPER-Advanced Search

Vol: E96-D No:5
Page(s): 1006-1015
Recently, probabilistic topic models have been applied to various types of data, including text, and their effectiveness has been demonstrated. Latent Dirichlet allocation (LDA) is a well-known topic model. Variational Bayesian inference or collapsed Gibbs sampling is often used to estimate parameters in LDA; however, these inference methods incur high computational cost for large-scale data. Therefore, highly efficient technology is needed for this purpose. We use parallel computation technology for efficient collapsed Gibbs sampling inference for LDA. We assume a symmetric multiprocessing (SMP) cluster, which has been widely used in recent years. In prior work on parallel inference for LDA, either MPI or OpenMP has often been used alone. For an SMP cluster, however, it is more suitable to adopt hybrid parallelization that uses message passing for communication between SMP nodes and loop directives for parallelization within each SMP node. We developed an MPI/OpenMP hybrid parallel inference method for LDA, and evaluated the performance of the inference under various settings of an SMP cluster. We further investigated the approximation that controls the inter-node communications, and found that it achieved a noticeable increase in inference speed while maintaining inference accuracy.
Evaluation Methods for Web Retrieval Tasks Considering Hyperlink Structure
Koji EGUCHI Keizo OYAMA Emi ISHIDA Noriko KANDO Kazuko KURIYAMA

PAPER

Vol: E86-D No:9
Page(s): 1804-1813
This paper proposes evaluation methods for measuring the retrieval effectiveness of Web search engine systems, attempting to make them suitable for the real Web environment. With this objective, we constructed 100-gigabyte and 10-gigabyte document sets that were mainly gathered from the '.jp' domain, and conducted an evaluation workshop at the third NTCIR Workshop from 2001 to 2002, where we assessed the retrieval effectiveness of a certain number of Web search engine systems using the common data set. Conventional evaluation workshops assessed the relevance of the retrieved documents, which were submitted by the workshop participants, by considering the contents of individual pages. In contrast, we assessed the relevance of the retrieved pages by considering the relationship between the pages referenced by hyperlinks.
Video Data Modeling Using Sequential Correspondence Hierarchical Dirichlet Processes
Jianfei XUE Koji EGUCHI

PAPER

Publicized: 2016/10/07
Vol: E100-D No:1
Page(s): 33-41
Video data mining based on topic models has recently become a very popular research topic. In this paper, we present a novel topic model named sequential correspondence hierarchical Dirichlet processes (Seq-cHDP) to learn the hidden structure within video data. The Seq-cHDP model can be regarded as an extended hierarchical Dirichlet processes (HDP) model with two important features: one is the time-dependency mechanism that connects neighboring video frames on the basis of a time-dependent Markovian assumption, and the other is the correspondence mechanism that provides a solution for dealing with multimodal data such as the mixture of visual words and speech words extracted from video files. A cascaded Gibbs sampling method is applied to implement the inference task of Seq-cHDP. We present a comprehensive evaluation of Seq-cHDP through experiments and demonstrate that Seq-cHDP outperforms other baseline models.
Entity Network Prediction Using Multitype Topic Models
Hitohiro SHIOZAKI Koji EGUCHI Takenao OHKAWA

PAPER-Knowledge Discovery and Data Mining

Vol: E91-D No:11
Page(s): 2589-2598
- Errata[Uploaded on December 1,2008]
Conveying information about who, what, when and where is a primary purpose of some genres of documents, typically news articles. Statistical models that capture dependencies between named entities and topics can play an important role in handling such information. Although some relationships between who and where should be mentioned in such a document, no statistical topic models explicitly address the textual interactions between a who-entity and a where-entity. This paper presents a statistical model that directly captures the dependencies between an arbitrary number of word types, such as who-entities, where-entities and topics, mentioned in each document. Through experiments on predicting who-entities and the links between them, we show that this multitype topic model performs better at making predictions on entity networks, in which each vertex represents an entity and each edge weight represents how closely the pair of entities at its incident vertices is related. We also demonstrate the scale-free property in the weighted networks of entities extracted from written mentions.
FOREWORD Open Access
Koji Eguchi

FOREWORD

Vol: E102-D No:4
Page(s): 724-724
Online Inference of Mixed Membership Stochastic Blockmodels for Network Data Streams Open Access
Tomoki KOBAYASHI Koji EGUCHI

PAPER

Vol: E97-D No:4
Page(s): 752-761
Many kinds of data can be represented as a network or graph. It is crucial to infer the latent structure underlying such a network and to predict unobserved links in the network. Mixed Membership Stochastic Blockmodel (MMSB) is a promising model for network data. Latent variables and unknown parameters in MMSB have been estimated through Bayesian inference with the entire network; however, it is important to estimate them online for evolving networks. In this paper, we first develop online inference methods for MMSB through sequential Monte Carlo methods, also known as particle filters. We then extend them for time-evolving networks, taking into account the temporal dependency of the network structure. We demonstrate through experiments that the time-dependent particle filter outperformed several baselines in terms of prediction performance in an online condition.
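As a rough illustration of the sequential Monte Carlo machinery mentioned in this abstract, the sketch below shows the generic weight-normalization and multinomial resampling steps of a particle filter. It is not the paper's MMSB inference procedure; the function names and the toy particles are assumptions made for illustration only.

```python
import random


def normalize(weights):
    """Scale non-negative particle weights so that they sum to one."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return [w / total for w in weights]


def effective_sample_size(weights):
    """Effective sample size of normalized weights; low values suggest resampling."""
    return 1.0 / sum(w * w for w in weights)


def resample(particles, weights, rng):
    """Multinomial resampling: draw particles in proportion to their weights."""
    n = len(particles)
    chosen = rng.choices(particles, weights=weights, k=n)
    # After resampling, every surviving particle carries equal weight.
    return chosen, [1.0 / n] * n


if __name__ == "__main__":
    rng = random.Random(0)
    particles = ["a", "b", "c"]        # hypothetical latent-state hypotheses
    weights = normalize([1.0, 1.0, 2.0])
    print(effective_sample_size(weights))
    print(resample(particles, weights, rng))
```

In an online setting, each newly observed link would update the weights before these steps are applied; how that update is computed depends on the model and is not shown here.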
Relation Prediction in Multilingual Data Based on Multimodal Relational Topic Models
Yosuke SAKATA Koji EGUCHI

PAPER

Publicized: 2017/01/17
Vol: E100-D No:4
Page(s): 741-749
There are increasing demands for improved analysis of multimodal data that consist of multiple representations, such as multilingual documents and text-annotated images. One promising approach for analyzing such multimodal data is latent topic models. In this paper, we propose conditionally independent generalized relational topic models (CI-gRTM) for predicting unknown relations across different multiple representations of multimodal data. We developed CI-gRTM as a multimodal extension of discriminative relational topic models called generalized relational topic models (gRTM). We demonstrated through experiments with multilingual documents that CI-gRTM can more effectively predict both multilingual representations and relations between two different language representations compared with several state-of-the-art baseline models that can predict either multilingual representations or unimodal relations.
