Go IRIE Hiroyuki ARAI Yukinobu TANIGUCHI
This paper presents an unsupervised approach to binary feature coding for efficient semantic image retrieval. Although most existing methods aim to preserve the neighborhood structure of the feature space, semantically similar images do not always lie in the same neighborhood; rather, they are distributed on non-linear low-dimensional manifolds. Moreover, images on the Internet rarely appear alone and are often surrounded by text data such as tags, attributes, and captions, which tend to carry rich semantic information about the images. On the basis of these observations, the approach presented in this paper learns binary codes for semantic image retrieval from multimodal information sources while preserving the essential low-dimensional structures of the data distributions in the Hamming space. Specifically, after finding the low-dimensional structures of the data with an unsupervised sparse coding technique, our approach learns a set of linear projections for binary coding by solving an optimization problem designed to jointly preserve, as far as possible, the extracted data structures and the multimodal correlations between images and texts in the Hamming space. We show that the joint optimization problem can readily be transformed into a generalized eigenproblem that can be solved efficiently. Extensive experiments demonstrate that our method yields significant performance gains over several existing methods.
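To make the retrieval step concrete, the following is a minimal sketch of how linear projections can be turned into binary codes and compared in the Hamming space. It is not the authors' method: the projections here are random placeholders rather than the solution of the generalized eigenproblem, and the names `binary_code`, `hamming`, `search`, `projections`, and `thresholds` are hypothetical, introduced only for illustration.

```python
import random

def binary_code(x, projections, thresholds):
    """Pack one bit per projection: bit is 1 when w . x exceeds its threshold."""
    code = 0
    for w, t in zip(projections, thresholds):
        activation = sum(wi * xi for wi, xi in zip(w, x))
        code = (code << 1) | (1 if activation > t else 0)
    return code

def hamming(a, b):
    """Number of differing bits between two integer-packed codes."""
    return bin(a ^ b).count("1")

def search(query_code, database_codes, top_k=3):
    """Indices of the top_k database codes closest to the query in Hamming distance."""
    ranked = sorted(range(len(database_codes)),
                    key=lambda i: hamming(query_code, database_codes[i]))
    return ranked[:top_k]

# Assumption: random Gaussian projections stand in for the learned ones.
rng = random.Random(0)
dim, n_bits = 8, 16
projections = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
thresholds = [0.0] * n_bits

# Toy "image features"; a real system would use extracted descriptors.
database = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(100)]
codes = [binary_code(x, projections, thresholds) for x in database]

nearest = search(codes[5], codes, top_k=3)
```

Because each code fits in a machine integer, comparing two images costs one XOR and a bit count, which is what makes Hamming-space retrieval efficient regardless of how the projections were learned.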
Kazuki EGASHIRA Atsuyuki MIYAI Qing YU Go IRIE Kiyoharu AIZAWA
We propose a novel classification problem setting in which Undesirable Classes (UCs) are defined for each class. A UC is a class into which samples of a given class should specifically not be misclassified. To address this setting, we propose a framework that reduces the probabilities assigned to UCs while increasing the probability of the correct class.
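One way to picture such an objective is a cross-entropy term plus a penalty on the probability mass assigned to the UCs of the true class. The sketch below is an assumption made for illustration, not the loss used in the paper: the function `uc_aware_loss`, the `weight` parameter, and the penalty form `-log(1 - p_u)` are all hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def uc_aware_loss(logits, label, undesirable, weight=1.0, eps=1e-12):
    """Cross-entropy on the true class plus a penalty on its undesirable classes.

    undesirable maps each class index to the class indices that samples of
    that class should not be misclassified as (hypothetical structure).
    """
    probs = softmax(logits)
    # Raises the probability of the correct class.
    cross_entropy = -math.log(probs[label] + eps)
    # Lowers the probability of each UC: the term grows as probs[u] approaches 1.
    penalty = -sum(math.log(1.0 - probs[u] + eps)
                   for u in undesirable.get(label, ()))
    return cross_entropy + weight * penalty

# Toy example: for class 0, class 1 is declared undesirable.
undesirable = {0: [1]}
loss_on_uc = uc_aware_loss([2.0, 1.0, 0.0], 0, undesirable)
loss_off_uc = uc_aware_loss([2.0, 0.0, 1.0], 0, undesirable)
```

In this toy example the two predictions assign the same probability to the correct class, so plain cross-entropy cannot tell them apart; the penalty term makes the one that leaks more probability to the UC more costly.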
Go IRIE Yukito WATANABE Takayuki KUROZUMI Tetsuya KINEBUCHI
Encoding multiple SIFT descriptors into a single vector is a key technique for efficient object image retrieval. In this paper, we propose an extension of the local coordinate system (LCS) for image representation. Previous LCS approaches encode each SIFT descriptor with a single local coordinate, which is not adequate for localizing its position in the descriptor space. Instead, we use multiple local coordinates, together with PCA-based decorrelation, to represent each descriptor. Experiments show that this simple modification can improve retrieval performance significantly.