Keyword Search Result

[Keyword] image retrieval (51 hits)

Hits 1-20 of 51

  • Learning Local Similarity with Spatial Interrelations on Content-Based Image Retrieval

    Longjiao ZHAO  Yu WANG  Jien KATO  Yoshiharu ISHIKAWA  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2023/02/14
      Vol:
    E106-D No:5
      Page(s):
    1069-1080

    Convolutional Neural Networks (CNNs) have recently demonstrated outstanding performance in image retrieval tasks. Local convolutional features extracted by CNNs, in particular, show exceptional discriminative capability. Recent research in this field has concentrated on pooling methods that aggregate local features into global features and assess the global similarity of two images. However, pooling sacrifices the image's local region information and spatial relationships, which are known to be key to robustness against occlusion and viewpoint changes. In this paper, instead of pooling methods, we propose an alternative method based on local similarity, determined by directly using local convolutional features. Specifically, we first define three forms of local similarity tensors (LSTs), which take into account information about local regions as well as spatial relationships between them. We then construct a similarity CNN model (SCNN) based on LSTs to assess the similarity between the query and gallery images. The ideal configuration of our method is sought through thorough experiments from three perspectives: local region size, local region content, and spatial relationships between local regions. The experimental results on a modified open dataset (where query images are limited to occluded ones) confirm that the proposed method outperforms the pooling methods owing to its improved robustness. Furthermore, testing on three public retrieval datasets shows that combining LSTs with conventional pooling methods achieves the best results.

  • Searching and Learning Discriminative Regions for Fine-Grained Image Retrieval and Classification

    Kangbo SUN  Jie ZHU  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2021/10/18
      Vol:
    E105-D No:1
      Page(s):
    141-149

    Local discriminative regions play important roles in fine-grained image analysis tasks. How to locate local discriminative regions with only category labels and how to learn discriminative representations from these regions have become active research topics. In our work, we propose the Searching Discriminative Regions (SDR) and Learning Discriminative Regions (LDR) methods to search for and learn local discriminative regions in images. The SDR method adopts an attention mechanism to iteratively search for high-response regions in images and uses them as clues to locate local discriminative regions. Moreover, the LDR method is proposed to learn representations that are compact within a category and sparse between categories from the raw image and local images. Experimental results show that our proposed approach achieves excellent performance in both fine-grained image retrieval and classification tasks, which demonstrates its effectiveness.

  • AdaLSH: Adaptive LSH for Solving c-Approximate Maximum Inner Product Search Problem

    Kejing LU  Mineichi KUDO  

     
    PAPER-Data Engineering, Web Information Systems

      Publicized:
    2020/10/13
      Vol:
    E104-D No:1
      Page(s):
    138-145

    The maximum inner product search (MIPS) problem has gained much attention in a wide range of applications. In order to overcome the curse of dimensionality in high-dimensional spaces, most existing methods first transform the MIPS problem into an approximate nearest neighbor search (ANNS) problem and then solve it by Locality Sensitive Hashing (LSH). However, due to the error incurred by the transformation and incomplete search strategies, these methods suffer from low precision and have loose probability guarantees. In this paper, we propose a novel search method named Adaptive-LSH (AdaLSH) to solve the MIPS problem more efficiently and more precisely. AdaLSH examines objects in descending order of both norms and (probably correctly estimated) cosine angles with a query object, supported by LSH with extendable windows. Such extendable windows bring not only efficiency in searching but also a probability guarantee of finding exact or approximate MIP objects. AdaLSH gives a better probability guarantee of success than conventional algorithms and achieves shorter running times on various datasets. In addition, AdaLSH can even support exact MIPS with a probability guarantee.
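
    As a rough illustration of the MIPS-to-nearest-neighbor transformation mentioned in this abstract, the sketch below uses one well-known reduction: append sqrt(M^2 - ||x||^2) to every item (M being the largest item norm) and a zero to the query, so that the nearest augmented item in Euclidean distance is the item with the largest inner product. This is not AdaLSH itself, and the function names and toy data are assumptions made for the example.

```python
import math

def augment_items(items):
    """Append sqrt(M^2 - ||x||^2) to each item, where M is the largest item norm."""
    max_sq = max(sum(v * v for v in x) for x in items)
    return [list(x) + [math.sqrt(max(0.0, max_sq - sum(v * v for v in x)))]
            for x in items]

def augment_query(q):
    """Append a zero so the query has the same dimension as the augmented items."""
    return list(q) + [0.0]

def argmax_inner_product(items, q):
    """Exact MIPS by brute force."""
    return max(range(len(items)),
               key=lambda i: sum(a * b for a, b in zip(items[i], q)))

def argmin_euclidean(items, q):
    """Exact nearest-neighbor search by brute force."""
    return min(range(len(items)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(items[i], q)))

# Toy data (hypothetical): the item with the largest norm is not the best match.
items = [[3.0, 0.0], [1.0, 1.0], [0.5, 2.0]]
query = [0.2, 1.0]

mips_index = argmax_inner_product(items, query)
nn_index = argmin_euclidean(augment_items(items), augment_query(query))
```

    After augmentation every item has the same norm M, so the squared distance to the query equals M^2 + ||q||^2 - 2<x, q>, and minimizing it maximizes the inner product. An LSH index would replace the brute-force nearest-neighbor step in practice.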

  • Rethinking the Rotation Invariance of Local Convolutional Features for Content-Based Image Retrieval

    Longjiao ZHAO  Yu WANG  Jien KATO  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2020/10/14
      Vol:
    E104-D No:1
      Page(s):
    174-182

    Recently, local features computed using convolutional neural networks (CNNs) have shown good performance in image retrieval. The local convolutional features obtained by the CNNs (LC features) are designed to be translation invariant; however, they are inherently sensitive to rotation perturbations. This leads to misjudgments in retrieval tasks. In this work, our objective is to enhance the robustness of LC features against image rotation. To do this, we conduct a thorough experimental evaluation of three candidate anti-rotation strategies (in-model data augmentation, in-model feature augmentation, and post-model feature augmentation) over two kinds of rotation attack (dataset attack and query attack). In the training procedure, we implement a data augmentation protocol and a network augmentation method. In the test procedure, we develop a local transformed convolutional (LTC) feature extraction method and evaluate it over different network configurations. We end up with a series of good practices with steady quantitative support, which lead to the best strategy for computing LC features with high rotation invariance in image retrieval.

  • A Reversible Data Hiding Method in Compressible Encrypted Images

    Shoko IMAIZUMI  Yusuke IZAWA  Ryoichi HIRASAWA  Hitoshi KIYA  

     
    PAPER-Cryptography and Information Security

      Vol:
    E103-A No:12
      Page(s):
    1579-1588

    We propose a reversible data hiding (RDH) method in compressible encrypted images called the encryption-then-compression (EtC) images. The proposed method allows us to not only embed a payload in encrypted images but also compress the encrypted images containing the payload. In addition, the proposed RDH method can be applied to both plain images and encrypted ones, and the payload can be extracted flexibly in the encrypted domain or from the decrypted images. Various RDH methods have been studied in the encrypted domain, but they are not considered to be two-domain data hiding, and the resultant images cannot be compressed by using image coding standards, such as JPEG-LS and JPEG 2000. In our experiment, the proposed method shows high performance in terms of lossless compression efficiency by using JPEG-LS and JPEG 2000, data hiding capacity, and marked image quality.

  • Deep Attention Residual Hashing

    Yang LI  Zhuang MIAO  Ming HE  Yafei ZHANG  Hang LI  

     
    LETTER-Image

      Vol:
    E101-A No:3
      Page(s):
    654-657

    How to represent images as highly compact binary codes is a critical issue in many computer vision tasks. Existing deep hashing methods typically focus on designing loss functions using pairwise or triplet labels. However, these methods ignore the attention mechanism in the human visual system. In this letter, we propose a novel Deep Attention Residual Hashing (DARH) method, which directly learns hash codes based on a simple pointwise classification loss function. Compared to previous methods, our method does not need to generate all possible pairwise or triplet labels from the training dataset. Specifically, we develop a new type of attention layer which can learn human eye fixation and significantly improves the representation ability of hash codes. In addition, we embed the attention layer into the residual network to simultaneously learn discriminative image features and hash codes in an end-to-end manner. Extensive experiments on standard benchmarks demonstrate that our method preserves the instance-level similarity and outperforms state-of-the-art deep hashing methods in the image retrieval application.

  • Deep Discriminative Supervised Hashing via Siamese Network

    Yang LI  Zhuang MIAO  Jiabao WANG  Yafei ZHANG  Hang LI  

     
    LETTER-Artificial Intelligence, Data Mining

      Publicized:
    2017/09/12
      Vol:
    E100-D No:12
      Page(s):
    3036-3040

    The latest deep hashing methods perform hash code learning and image feature learning simultaneously by using pairwise or triplet labels. However, generating all possible pairwise or triplet labels from the training dataset can quickly become intractable, and the majority of those samples may produce small costs, resulting in slow convergence. In this letter, we propose a novel deep discriminative supervised hashing method, called DDSH, which directly learns hash codes based on a new combined loss function. Compared to previous methods, our method can take full advantage of the annotated data in terms of pairwise similarity and image identities. Extensive experiments on standard benchmarks demonstrate that our method preserves the instance-level similarity and outperforms state-of-the-art deep hashing methods in the image retrieval application. Remarkably, our 16-bit binary representation can surpass the performance of existing 48-bit binary representations, which demonstrates that our method can effectively improve the speed and precision of large-scale image retrieval systems.

  • Image Retrieval Framework Based on Dual Representation Descriptor

    Yuichi YOSHIDA  Tsuyoshi TOYOFUKU  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2017/07/06
      Vol:
    E100-D No:10
      Page(s):
    2605-2613

    Descriptor aggregation techniques such as the Fisher vector and vector of locally aggregated descriptors (VLAD) are used in most image retrieval frameworks. It takes some time to extract local descriptors, and the geometric verification requires storage if a real-valued descriptor such as SIFT is used. Moreover, if we apply binary descriptors to such a framework, the performance of image retrieval is not better than if we use a real-valued descriptor. Our approach tackles these issues by using a dual representation descriptor that has advantages of being both a real-valued and a binary descriptor. The real value of the dual representation descriptor is aggregated into a VLAD in order to achieve high accuracy in the image retrieval, and the binary one is used to find correspondences in the geometric verification stage in order to reduce the amount of storage needed. We implemented a dual representation descriptor extracted in semi-real time by using the CARD descriptor. We evaluated the accuracy of our image retrieval framework including the geometric verification on three datasets (holidays, ukbench and Stanford mobile visual search). The results indicate that our framework is as accurate as the framework that uses SIFT. In addition, the experiments show that the image retrieval speed and storage requirements of our framework are as efficient as those of a framework that uses ORB.
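
    The VLAD aggregation step referred to in this abstract can be sketched as follows: each local descriptor is assigned to its nearest codebook centroid, the residuals (descriptor minus centroid) are summed per centroid, and the concatenated result is L2-normalized. This is a minimal, generic VLAD sketch rather than the paper's dual representation descriptor; the codebook, the 2-D toy descriptors, and the function names are assumptions.

```python
import math

def nearest_centroid(descriptor, centroids):
    """Index of the centroid closest to the descriptor (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda k: sum((d - c) ** 2
                                 for d, c in zip(descriptor, centroids[k])))

def vlad(descriptors, centroids):
    """Sum residuals per centroid, concatenate them, and L2-normalize."""
    dim = len(centroids[0])
    residuals = [[0.0] * dim for _ in centroids]
    for x in descriptors:
        k = nearest_centroid(x, centroids)
        for j in range(dim):
            residuals[k][j] += x[j] - centroids[k][j]
    flat = [v for row in residuals for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat

# Hypothetical codebook with two centroids and three local descriptors.
centroids = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [2.0, 0.0], [10.0, 14.0]]
image_vector = vlad(descriptors, centroids)
```

    Here the first two descriptors contribute a summed residual of (3, 0) to the first centroid and the third contributes (0, 4) to the second, so the normalized vector is (0.6, 0, 0, 0.8). Two images are then compared through their VLAD vectors, for example by inner product.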

  • Biomimetics Image Retrieval Platform (Open Access)

    Miki HASEYAMA  Takahiro OGAWA  Sho TAKAHASHI  Shuhei NOMURA  Masatsugu SHIMOMURA  

     
    INVITED PAPER

      Publicized:
    2017/05/19
      Vol:
    E100-D No:8
      Page(s):
    1563-1573

    Biomimetics is a new research field that creates innovation through the collaboration of different existing research fields. However, the collaboration, i.e., the exchange of deep knowledge between different research fields, is difficult for several reasons such as differences in technical terms used in different fields. In order to overcome this problem, we have developed a new retrieval platform, “Biomimetics image retrieval platform,” using a visualization-based image retrieval technique. A biological database contains a large volume of image data, and by taking advantage of these image data, we are able to overcome limitations of text-only information retrieval. By realizing such a retrieval platform that does not depend on technical terms, individual biological databases of various species can be integrated. This will allow not only the use of data for the study of various species by researchers in different biological fields but also access for a wide range of researchers in fields ranging from materials science to mechanical engineering and manufacturing. Therefore, our platform provides a new path bridging different fields and will contribute to the development of biomimetics since it can overcome the limitations of traditional retrieval platforms.

  • Multimodal Learning of Geometry-Preserving Binary Codes for Semantic Image Retrieval (Open Access)

    Go IRIE  Hiroyuki ARAI  Yukinobu TANIGUCHI  

     
    INVITED PAPER

      Publicized:
    2017/01/06
      Vol:
    E100-D No:4
      Page(s):
    600-609

    This paper presents an unsupervised approach to feature binary coding for efficient semantic image retrieval. Although the majority of the existing methods aim to preserve neighborhood structures of the feature space, semantically similar images are not always in such neighbors but are rather distributed in non-linear low-dimensional manifolds. Moreover, images are rarely alone on the Internet and are often surrounded by text data such as tags, attributes, and captions, which tend to carry rich semantic information about the images. On the basis of these observations, the approach presented in this paper aims at learning binary codes for semantic image retrieval using multimodal information sources while preserving the essential low-dimensional structures of the data distributions in the Hamming space. Specifically, after finding the low-dimensional structures of the data by using an unsupervised sparse coding technique, our approach learns a set of linear projections for binary coding by solving an optimization problem which is designed to jointly preserve the extracted data structures and multimodal data correlations between images and texts in the Hamming space as much as possible. We show that the joint optimization problem can readily be transformed into a generalized eigenproblem that can be efficiently solved. Extensive experiments demonstrate that our method yields significant performance gains over several existing methods.

  • Privacy-Enhanced Similarity Search Scheme for Cloud Image Databases

    Hao LIU  Hideaki GOTO  

     
    LETTER-Information Network

      Publicized:
    2016/09/12
      Vol:
    E99-D No:12
      Page(s):
    3188-3191

    The privacy of users' data has become a major issue for cloud services. This research focuses on cloud image databases and the similarity search function. To enhance the security of such databases, we propose a framework for a privacy-enhanced search scheme in which all the images in the database are encrypted while similarity image search is still supported.

  • Local Multi-Coordinate System for Object Retrieval

    Go IRIE  Yukito WATANABE  Takayuki KUROZUMI  Tetsuya KINEBUCHI  

     
    LETTER-Image Processing and Video Processing

      Publicized:
    2016/07/06
      Vol:
    E99-D No:10
      Page(s):
    2656-2660

    Encoding multiple SIFT descriptors into a single vector is a key technique for efficient object image retrieval. In this paper, we propose an extension of local coordinate system (LCS) for image representation. The previous LCS approaches encode each SIFT descriptor by a single local coordinate, which is not adequate for localizing its position in the descriptor space. Instead, we use multiple local coordinates to represent each descriptor with PCA-based decorrelation. Experiments show that this simple modification can improve retrieval performance significantly.

  • A Similarity Study of Interactive Content-Based Image Retrieval Scheme for Classification of Breast Lesions

    Hyun-chong CHO  Lubomir HADJIISKI  Berkman SAHINER  Heang-Ping CHAN  Chintana PARAMAGUL  Mark HELVIE  Alexis V. NEES  Hyun Chin CHO  

     
    PAPER-Biological Engineering

      Publicized:
    2016/02/29
      Vol:
    E99-D No:6
      Page(s):
    1663-1670

    To study the similarity between queries and retrieved masses, we design an interactive CBIR (Content-based Image Retrieval) CADx (Computer-aided Diagnosis) system using relevance feedback for the characterization of breast masses in ultrasound (US) images based on radiologists' visual similarity assessment. The CADx system retrieves masses that are similar to query masses from a reference library based on six computer-extracted features that describe the texture, width-to-height, and posterior shadowing of the mass. The k-NN retrieval with Euclidean distance similarity measure and the Rocchio relevance feedback algorithm (RRF) are used. To train the RRF parameters, the similarities of 1891 image pairs from 62 (31 malignant and 31 benign) masses are rated by 3 MQSA (Mammography Quality Standards Act) radiologists using a 9-point scale (9=most similar). The best RRF parameters are chosen based on 3 observer experiments. For testing, 100 independent query masses (49 malignant and 51 benign) and 121 reference masses on 230 (79 malignant and 151 benign) images were collected. Three radiologists rated the similarity between the query masses and the computer-retrieved masses. Average similarity ratings without and with RRF were 5.39 and 5.64 for the training set and 5.78 and 6.02 for the test set, respectively. Average AUC values without and with RRF were, respectively, 0.86±0.03 and 0.87±0.03 for the training set and 0.91±0.03 and 0.90±0.03 for the test set. On average, masses retrieved using the CBIR system were moderately similar to the query masses based on radiologists' similarity assessments. RRF improved the similarity of the retrieved masses.
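
    The combination of k-NN retrieval with Euclidean distance and Rocchio relevance feedback described in this abstract can be sketched as below. The Rocchio update moves the query toward the mean of items judged relevant and away from the mean of items judged non-relevant. The 2-D toy vectors stand in for the six computer-extracted features, and the weights alpha, beta, and gamma are illustrative assumptions, not the trained RRF parameters from the paper.

```python
import math

def mean_vector(vectors, dim):
    """Component-wise mean; a zero vector when the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def rocchio_update(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio: q' = alpha*q + beta*mean(relevant) - gamma*mean(non_relevant)."""
    dim = len(query)
    rel = mean_vector(relevant, dim)
    non = mean_vector(non_relevant, dim)
    return [alpha * q + beta * r - gamma * n for q, r, n in zip(query, rel, non)]

def knn(query, library, k):
    """Indices of the k library items closest to the query (Euclidean distance)."""
    def dist(v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(query, v)))
    return sorted(range(len(library)), key=lambda i: dist(library[i]))[:k]

# Hypothetical reference library and feedback from one round of ratings.
library = [[0.0, 0.0], [2.0, 1.0], [3.5, 1.0], [0.0, 4.0]]
query = [1.0, 1.0]
before = knn(query, library, 2)

updated = rocchio_update(query,
                         relevant=[[2.0, 2.0], [4.0, 2.0]],
                         non_relevant=[[0.0, 4.0]],
                         alpha=1.0, beta=0.5, gamma=0.25)
after = knn(updated, library, 2)
```

    In this toy run the updated query is (2.5, 1.0), and the second retrieved item changes from index 0 to index 2, which shows how feedback can reshape the retrieved set.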

  • Integrating Multiple Global and Local Features by Product Sparse Coding for Image Retrieval

    Li TIAN  Qi JIA  Sei-ichiro KAMATA  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2015/12/21
      Vol:
    E99-D No:3
      Page(s):
    731-738

    In this study, we propose a simple yet general and powerful framework for integrating multiple global and local features by Product Sparse Coding (PSC) for image retrieval. In our framework, multiple global and local features are extracted from images and then transformed to Trimmed-Root (TR) features. After that, the features are encoded into compact codes by PSC. Finally, a two-stage ranking strategy is proposed for indexing in retrieval. We make three major contributions in this study. First, we propose a TR representation of multiple image features and show that the TR representation offers better performance than the original features. Second, the features integrated by PSC are very compact and effective, with lower complexity than standard sparse coding. Finally, the two-stage ranking strategy can balance efficiency against memory usage in storage. Experiments demonstrate that our compact image representation is superior to the state-of-the-art alternatives for large-scale image retrieval.

  • Query Bootstrapping: A Visual Mining Based Query Expansion

    Siriwat KASAMWATTANAROTE  Yusuke UCHIDA  Shin'ichi SATOH  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2015/11/10
      Vol:
    E99-D No:2
      Page(s):
    454-466

    Bag of Visual Words (BoVW) is an effective framework for image retrieval. Query expansion (QE) further boosts retrieval performance by refining a query with relevant visual words found from the geometric consistency check between the query image and highly ranked retrieved images obtained from the first round of retrieval. Since QE checks the pairwise consistency between query and highly ranked images, its performance may deteriorate when there are slight degradations in the query image. We propose Query Bootstrapping as a variant of QE to circumvent this problem by using the consistency of highly ranked images instead of pairwise consistency. In so doing, we regard frequently co-occurring visual words in highly ranked images as relevant visual words. Frequent itemset mining (FIM) is used to find such visual words efficiently. However, the FIM-based approach requires sensitive parameters to be fine-tuned, namely, support (min/max-support) and the number of top ranked images (top-k). Here, we propose an adaptive support algorithm that adaptively determines both the minimum support and maximum support by referring to the first round's retrieval list. Selecting relevant images by using a geometric consistency check further boosts retrieval performance by reducing outlier images from a mining process. An important parameter for the LO-RANSAC algorithm that is used for the geometric consistency check, namely, inlier threshold, is automatically determined by our algorithm. We further introduce tf-fi-idf on top of tf-idf in order to take into account the frequency of inliers (fi) in the retrieved images. We evaluated the performance of QB in terms of mean average precision (mAP) on three benchmark datasets and found that it gave significant performance boosts of 5.37%, 9.65%, and 8.52% over that of state-of-the-art QE on Oxford 5k, Oxford 105k, and Paris 6k, respectively.

  • An Efficient Filtering Method for Scalable Face Image Retrieval

    Deokmin HAAM  Hyeon-Gyu KIM  Myoung-Ho KIM  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2014/12/11
      Vol:
    E98-D No:3
      Page(s):
    733-736

    This paper presents a filtering method for efficient face image retrieval over large-volume face databases. The proposed method employs a new face image descriptor, called a cell-orientation vector (COV). It has a simple form: a 72-dimensional vector of integers from 0 to 8. Despite its simplicity, it achieves high accuracy and efficiency. Our experimental results show that the proposed method based on COVs provides better performance than a recent approach based on identity-based quantization in terms of both accuracy and efficiency.

  • Multiple Binary Codes for Fast Approximate Similarity Search

    Shinichi SHIRAKAWA  

     
    PAPER-Pattern Recognition

      Publicized:
    2014/12/11
      Vol:
    E98-D No:3
      Page(s):
    671-680

    One of the fast approximate similarity search techniques is a binary hashing method that transforms a real-valued vector into a binary code. The similarity between two binary codes is measured by their Hamming distance. In this method, a hash table is often used when undertaking a constant-time similarity search. The number of accesses to the hash table, however, increases as the number of bits grows. In this paper, we consider a method that does not access data with a long Hamming radius by using multiple binary codes. Further, we attempt to integrate the proposed approach and the existing multi-index hashing (MIH) method to accelerate similarity search in the Hamming space. Then, we propose a learning method of the binary hash functions for multiple binary codes. We conduct an experiment on similarity search utilizing a dataset of up to 50 million items and show that our proposed method achieves a faster similarity search than the conventional linear scan and hash table search.
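
    The basic ingredients named in this abstract, turning a real-valued vector into a binary code and comparing codes by Hamming distance, can be sketched as follows. The example uses sign-of-projection hashing with hand-picked projection vectors and a plain linear scan, which corresponds to the conventional baseline rather than the proposed multiple-code method; the projections, data, and function names are assumptions.

```python
def sign_hash(vector, projections):
    """Pack one bit per projection: 1 when the dot product is non-negative."""
    code = 0
    for w in projections:
        bit = 1 if sum(a * b for a, b in zip(vector, w)) >= 0 else 0
        code = (code << 1) | bit
    return code

def hamming_distance(a, b):
    """Number of differing bits between two integer-packed binary codes."""
    return bin(a ^ b).count("1")

def linear_scan(query_code, codes, radius):
    """Indices of all codes within the given Hamming radius of the query."""
    return [i for i, c in enumerate(codes)
            if hamming_distance(query_code, c) <= radius]

# Hypothetical 4-bit hash: four fixed projection directions in 2-D.
projections = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
database = [[1.0, 2.0], [2.0, 1.0], [-1.0, -2.0]]

codes = [sign_hash(v, projections) for v in database]
query_code = sign_hash([1.5, 1.0], projections)
neighbors = linear_scan(query_code, codes, radius=1)
```

    A hash-table search would instead enumerate every code within the radius and look each one up, and the number of such codes grows quickly with the code length, which is the cost the abstract refers to.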

  • Object Extraction Using an Edge-Based Feature for Query-by-Sketch Image Retrieval

    Takuya TAKASU  Yoshiki KUMAGAI  Gosuke OHASHI  

     
    LETTER-Image Processing and Video Processing

      Publicized:
    2014/10/15
      Vol:
    E98-D No:1
      Page(s):
    214-217

    We previously proposed a query-by-sketch image retrieval system that uses an edge relation histogram (ERH). However, it is difficult for this method to retrieve partial objects from an image, because the ERH is a feature of the entire image, not of each object. Therefore, we propose an object-extraction method that uses edge-based features in order to enable the query-by-sketch system to retrieve partial images. This method is applied to 20,000 images from the Corel Photo Gallery. We confirm that retrieval accuracy is improved by using the edge-based features for extracting objects, enabling the query-by-sketch system to retrieve partial images.

  • Erasable Photograph Tagging: A Mobile Application Framework Employing Owner's Voice

    Zhenfei ZHAO  Hao LUO  Hua ZHONG  Bian YANG  Zhe-Ming LU  

     
    LETTER-Speech and Hearing

      Vol:
    E97-D No:2
      Page(s):
    370-372

    This letter proposes a mobile application framework named erasable photograph tagging (EPT) for photograph annotation and fast retrieval. The smartphone owner's voice is employed as tags and hidden in the host photograph, so no extra feature database is needed for retrieval. These digitized tags can be erased at any time with no distortion remaining in the recovered photograph.

  • Mining Knowledge on Relationships between Objects from the Web

    Xinpeng ZHANG  Yasuhito ASANO  Masatoshi YOSHIKAWA  

     
    PAPER-Artificial Intelligence, Data Mining

      Vol:
    E97-D No:1
      Page(s):
    77-88

    How do global warming and agriculture influence each other? It is possible to answer the question by searching for knowledge about the relationship between global warming and agriculture. As exemplified by this question, strong demand exists for searching relationships between objects. Mining knowledge about relationships on Wikipedia has been studied. However, more diverse knowledge about relationships is available on the Web, and searching it is desirable. By utilizing the objects constituting relationships mined from Wikipedia, we propose a new method to search the Web for images with surrounding text that include knowledge about relationships. Experimental results show that our method is effective and applicable in searching knowledge about relationships. We also construct a relationship search system named “Enishi” based on the proposed method. Enishi supplies a wealth of diverse knowledge, including images with surrounding text, to help users understand relationships deeply by complementarily utilizing knowledge from Wikipedia and the Web.
