Keyword Search Result

[Keyword] SIFT (23 hits)

1-20 of 23 hits

  • Co-Propagation with Distributed Seeds for Salient Object Detection

    Yo UMEKI  Taichi YOSHIDA  Masahiro IWAHASHI  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2018/03/09
      Vol:
    E101-D No:6
      Page(s):
    1640-1647

    In this paper, we propose a method of salient object detection based on distributed seeds and co-propagation of seed information. Salient object detection is a technique that estimates the objects important to human viewers by calculating saliency values of pixels. Previous salient object detection methods often produce incorrect saliency values near salient objects in images that contain several objects, a problem called the leakage of saliencies. Therefore, a method based on co-propagation, the scale-invariant feature transform, the high-dimensional color transform, and machine learning is proposed to reduce the leakage. First, the proposed method estimates regions clearly located in salient objects and in the background, which are called seeds, and the resultant seeds are distributed over the image. Next, the saliency information of the seeds is propagated simultaneously, which is referred to as co-propagation. The proposed method can reduce the leakage caused by the above methods when the co-propagated information from different seeds collides near the boundary. Experiments show that the proposed method significantly outperforms state-of-the-art methods in mean absolute error and F-measure, and perceptually reduces the leakage.

  • Codebook Learning for Image Recognition Based on Parallel Key SIFT Analysis

    Feng YANG  Zheng MA  Mei XIE  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2017/01/10
      Vol:
    E100-D No:4
      Page(s):
    927-930

    The quality of the codebook is very important in visual image classification. To boost classification performance, this paper presents a codebook generation scheme for scene image recognition based on parallel key SIFT analysis (PKSA). The method iteratively applies the classical k-means clustering algorithm and similarity analysis to evaluate key SIFT descriptors (KSDs) from the input images, and generates the codebook from the set of KSDs with a relaxed k-means algorithm. To evaluate the performance of the PKSA scheme, the image feature vector is calculated by sparse coding with Spatial Pyramid Matching (ScSPM) after the codebook is constructed. The PKSA-based ScSPM method is tested and compared on three public scene image datasets. The experimental results show that the proposed PKSA scheme can significantly reduce computation time and improve the categorization rate.

  • XY-Separable Scale-Space Filtering by Polynomial Representations and Its Applications Open Access

    Gou KOUTAKI  Keiichi UCHIMURA  

     
    INVITED PAPER

      Publicized:
    2017/01/11
      Vol:
    E100-D No:4
      Page(s):
    645-654

    In this paper, we propose the application of principal component analysis (PCA) to scale-spaces. PCA is a standard method used in computer vision. Because the translation of an input image into scale-space is a continuous operation, it requires the extension of conventional finite matrix-based PCA to an infinite number of dimensions. Here, we use spectral theory to resolve this infinite eigenvalue problem through the use of integration, and we propose an approximate solution based on polynomial equations. In order to clarify its eigensolutions, we apply spectral decomposition to Gaussian scale-space and scale-normalized Laplacian of Gaussian (sLoG) space. As an application of this proposed method, we introduce a method for generating Gaussian blur images and sLoG images, demonstrating that the accuracy of such an image can be made very high by using an arbitrary scale calculated through simple linear combination. Furthermore, to make the scale-space filtering efficient, we approximate the basis filter set using Gaussian lobes approximation and we can obtain XY-Separable filters. As a more practical example, we propose a new Scale Invariant Feature Transform (SIFT) detector.

  • Hardware-Efficient Local Extrema Detection for Scale-Space Extrema Detection in SIFT Algorithm

    Kazuhito ITO  Hiroki HAYASHI  

     
    LETTER

      Vol:
    E99-A No:12
      Page(s):
    2507-2510

    In this paper, a hardware-efficient local extrema detection (LED) method used for scale-space extrema detection in the SIFT algorithm is proposed. By reformulating the reuse of the intermediate results in taking the local maximum and minimum, the necessary operations in LED are reduced without degrading the detection accuracy. When implemented in an FPGA, the proposed method requires 25% to 35% fewer logic resources than the conventional method, with a slight increase in latency.
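
    For orientation, the baseline operation that such a method optimizes can be sketched as follows. This is a minimal illustration of the direct 26-neighbour comparison commonly used for scale-space extrema detection in SIFT, not the reformulated method of the paper above; the function names and the `[scale][row][col]` nested-list layout are assumptions made for this sketch.

```python
def is_local_extremum(dog, s, y, x):
    """True if dog[s][y][x] is strictly above or strictly below all 26
    neighbours in the 3x3x3 block spanning the adjacent scales."""
    center = dog[s][y][x]
    neighbours = [
        dog[s + ds][y + dy][x + dx]
        for ds in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (ds, dy, dx) != (0, 0, 0)
    ]
    return center > max(neighbours) or center < min(neighbours)


def find_extrema(dog):
    """Scan every interior sample of a [scale][row][col] stack
    (hypothetical layout) and return the (scale, row, col) extrema."""
    found = []
    for s in range(1, len(dog) - 1):
        for y in range(1, len(dog[s]) - 1):
            for x in range(1, len(dog[s][y]) - 1):
                if is_local_extremum(dog, s, y, x):
                    found.append((s, y, x))
    return found
```

    Each candidate costs one maximum and one minimum over 26 values, and neighbouring candidates share most of those values. That overlap is the kind of intermediate result whose reuse the abstract refers to.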

  • Full-HD 60fps FPGA Implementation of Spatio-Temporal Keypoint Extraction Based on Gradient Histogram and Parallelization of Keypoint Connectivity

    Takahiro SUZUKI  Takeshi IKENAGA  

     
    PAPER-Vision

      Vol:
    E99-A No:11
      Page(s):
    1937-1946

    Recently, cloud systems have started to be utilized for services that analyze users' data in the field of computer vision. In these services, keypoints are extracted from images or videos, and the data is identified by machine learning with a large database in the cloud. To reduce the number of keypoints sent to the cloud, Keypoints of Interest (KOI) extraction has been proposed. However, since its computational complexity is high, a hardware implementation is required for real-time processing. Moreover, hardware resource usage must be low because the hardware is embedded in users' devices. This paper proposes a hardware-friendly KOI algorithm with a low amount of computation and its real-time hardware implementation, based on dual-threshold keypoint detection by gradient histogram and parallelization of the connectivity of adjacent keypoints utilizing register counters. The algorithm utilizes dual-histogram-based detection, keypoint-matching-based calculation of motion information, and dense-clustering-based keypoint smoothing. The hardware architecture is composed of a detection module utilizing the descriptor and grid-region-parallelization-based density clustering. Finally, the evaluation results show that the implemented hardware achieves Full-HD (1920x1080), 60 fps spatio-temporal keypoint extraction. Further, it is 47 times faster than low-complexity keypoint extraction in software and 12 times faster than spatio-temporal keypoint extraction in software, and its hardware resources are almost the same as those of a SIFT hardware implementation, while maintaining accuracy.

  • Improvements of Local Descriptor in HOG/SIFT by BOF Approach

    Zhouxin YANG  Takio KURITA  

     
    PAPER-Image Recognition, Computer Vision

      Vol:
    E97-D No:5
      Page(s):
    1293-1303

    Numerous studies have focused on improving bag of features (BOF), histogram of oriented gradients (HOG) and scale invariant feature transform (SIFT). However, few works have attempted to learn the connection between them, even though the latter two are widely used as local feature descriptors for the former. Motivated by the resemblance between BOF and HOG/SIFT in descriptor construction, we improve the performance of HOG/SIFT by a) interpreting HOG/SIFT as a variant of BOF in descriptor construction, and then b) introducing recently proposed BOF approaches such as locality preservation, data-driven vocabulary, and spatial information preservation into the descriptor construction of HOG/SIFT, which yields the BOF-driven HOG/SIFT. Experimental results show that the BOF-driven HOG/SIFT outperform the original ones in pedestrian detection (for HOG) and in scene matching and image classification (for SIFT). Our proposed BOF-driven HOG/SIFT can easily replace the original HOG/SIFT in current systems, since they are generalized versions of the original ones.

  • Low Complexity Keypoint Extraction Based on SIFT Descriptor and Its Hardware Implementation for Full-HD 60 fps Video

    Takahiro SUZUKI  Takeshi IKENAGA  

     
    PAPER

      Vol:
    E96-A No:6
      Page(s):
    1376-1383

    Scale-Invariant Feature Transform (SIFT) has lately attracted attention in computer vision as a robust keypoint detection algorithm that is invariant to scale, rotation and illumination changes. However, its computational complexity is too high for practical real-time applications. This paper proposes a low-complexity keypoint extraction algorithm based on the SIFT descriptor and utilization of the database, and its real-time hardware implementation for Full-HD resolution video. The proposed algorithm computes the SIFT descriptor at keypoints obtained by corner detection and selects a scale from the database. The keypoint detection and descriptor computation modules can be parallelized in hardware: in the proposed algorithm these modules do not depend on each other, in contrast with SIFT, which computes a scale. The processing time of descriptor computation in this hardware is independent of the number of keypoints because descriptor generation has a pixel-wise pipelined structure. Evaluation results show that the proposed algorithm in software is 12 times faster than SIFT. Moreover, the proposed hardware on an FPGA is 427 times faster than SIFT and 61 times faster than the proposed algorithm in software. The proposed hardware performs keypoint extraction and matching at 60 fps for Full-HD video.

  • Concurrent Detection and Recognition of Individual Object Based on Colour and p-SIFT Features

    Jienan ZHANG  Shouyi YIN  Peng OUYANG  Leibo LIU  Shaojun WEI  

     
    PAPER

      Vol:
    E96-A No:6
      Page(s):
    1357-1365

    In this paper we propose a method that uses the features of an individual object to locate and recognize that object concurrently in a static image, with multi-feature fusion based on a sample library of multiple objects. The method is motivated by the observation that many previous works focus on category recognition and exploit the common characteristics of a particular category to detect its presence. However, these algorithms cease to be effective when searching for individual objects rather than categories against a complex background. To solve this problem, we abandon the concept of category and propose an effective way to use the features of an individual object directly as clues for detection and recognition. In our system, we introduce a multi-feature fusion method based on the colour histogram and the prominent SIFT (p-SIFT) feature to improve the detection and recognition accuracy. The p-SIFT feature, which we propose, is an improved SIFT feature obtained by further extracting correlation information based on a Feature Matrix, aiming at low computational complexity with a good matching rate. In the object detection process, we abandon conventional methods and instead make full use of the multiple features, starting with a simple but effective step: using the colour feature to reduce the number of patches of interest (POI). Our method is evaluated on several publicly available datasets, including the Pascal VOC 2005 dataset, Objects101, and datasets provided by Achanta et al.

  • Parallelization of Computing-Intensive Tasks of SIFT Algorithm on a Reconfigurable Architecture System

    Peng OUYANG  Shouyi YIN  Hui GAO  Leibo LIU  Shaojun WEI  

     
    PAPER

      Vol:
    E96-A No:6
      Page(s):
    1393-1402

    The Scale Invariant Feature Transform (SIFT) algorithm is an excellent approach to feature detection. It is characterized by data-intensive computation. Current studies on accelerating the SIFT algorithm fall mainly into three categories: optimizing the parallel parts of the algorithm on general-purpose multi-core processors, designing customized multi-core processors dedicated to SIFT, and implementing it on an FPGA platform. The real-time performance of SIFT has been greatly improved. However, in some solutions factors such as the input image size, the number of octaves, and the scale factors of the SIFT algorithm are restricted, so the flexibility needed to ensure high execution performance under variable factors should be improved. This paper proposes a reconfigurable solution to this problem. We fully exploit the algorithm and adopt several techniques, such as fully parallel execution, block computation, and CORDIC transformation, to improve the execution efficiency on a REconfigurable MUltimedia System called REMUS. Experimental results show that the execution performance of SIFT is improved by 33%, 50% and 8 times compared with execution on the multi-core platform, FPGA and ASIC, respectively. The dynamic reconfiguration scheme in this work can configure the circuits to meet the computation requirements under different input image sizes and different numbers of octaves and scale factors during computation.

  • SIFT-Based Non-blind Watermarking Robust to Non-linear Geometrical Distortions

    Toshihiko YAMASAKI  Kiyoharu AIZAWA  

     
    PAPER-Image Processing and Video Processing

      Vol:
    E96-D No:6
      Page(s):
    1368-1375

    This paper presents a non-blind watermarking technique that is robust to non-linear geometric distortion attacks. This is one of the most challenging problems for copyright protection of digital content because it is difficult to estimate the distortion parameters for the embedded blocks. In our proposed scheme, the locations of the blocks are recorded by the translation parameters from multiple Scale Invariant Feature Transform (SIFT) feature points. This method is based on two assumptions: SIFT features are robust to non-linear geometric distortion, and even such non-linear distortion can be regarded as “linear” distortion in local regions. We conducted experiments using 149,800 images (7 standard images and 100 images downloaded from Flickr, 10 different messages, 10 different embedding block patterns, and 14 attacks). The results show that the watermark detection performance is drastically improved, while the baseline method can achieve only chance-level accuracy.

  • An Improved Face Clustering Method Using Weighted Graph for Matched SIFT Keypoints in Face Region

    Ji-Soo KEUM  Hyon-Soo LEE  

     
    LETTER-Pattern Recognition

      Vol:
    E96-D No:4
      Page(s):
    967-971

    In this paper, we propose an improved face clustering method using a weighted graph-based approach. We combine two parameters as the weight of a graph to improve clustering performance. One is average similarity, which is calculated with two constraints of geometric and symmetric properties, and the other is a newly proposed parameter called the orientation matching ratio, which is calculated from orientation analysis for matched keypoints in the face region. According to the results of face clustering for several datasets, the proposed method shows improved results compared to the previous method.

  • A Composite Illumination Invariant Color Feature and Its Application to Partial Image Matching

    Masaki KOBAYASHI  Keisuke KAMEYAMA  

     
    PAPER-Image Recognition, Computer Vision

      Vol:
    E95-D No:10
      Page(s):
    2522-2532

    In camera-based object recognition and classification, surface color is one of the most important characteristics. However, apparent object color may differ significantly according to the illumination and surface conditions. Such variation can be an obstacle to utilizing color features. Geusebroek et al.'s color invariants can be a powerful tool for characterizing the object color regardless of illumination and surface conditions. In this work, we analyze the estimation process of the color invariants from RGB images, and propose a novel invariant feature of color based on the elementary invariants to meet the circular continuity residing in the mapping between colors and their invariants. Experiments show that using the proposed invariant in combination with luminance helps improve the retrieval performance of partial object image matching under varying illumination conditions.

  • Global Selection vs Local Ordering of Color SIFT Independent Components for Object/Scene Classification

    Dan-ni AI  Xian-hua HAN  Guifang DUAN  Xiang RUAN  Yen-wei CHEN  

     
    PAPER-Pattern Recognition

      Vol:
    E94-D No:9
      Page(s):
    1800-1808

    This paper addresses the problem of ordering the color SIFT descriptors in independent component analysis for image classification. Component ordering is of great importance for image classification, since it is the foundation of feature selection. To select distinctive and compact independent components (ICs) of the color SIFT descriptors, we propose two ordering approaches based on local variation, named localization-based IC ordering and sparseness-based IC ordering. We evaluate the performance of the proposed methods, the conventional IC selection method (global-variation-based component selection) and the original color SIFT descriptors on object and scene databases, and obtain the following two main results. First, the proposed methods are able to obtain acceptable classification results in comparison with the original color SIFT descriptors. Second, the highest classification rate is obtained by the global selection method on the scene database, while the local ordering methods give the best performance on the object database.

  • A Low-Power Real-Time SIFT Descriptor Generation Engine for Full-HDTV Video Recognition

    Kosuke MIZUNO  Hiroki NOGUCHI  Guangji HE  Yosuke TERACHI  Tetsuya KAMINO  Tsuyoshi FUJINAGA  Shintaro IZUMI  Yasuo ARIKI  Hiroshi KAWAGUCHI  Masahiko YOSHIMOTO  

     
    PAPER

      Vol:
    E94-C No:4
      Page(s):
    448-457

    This paper describes a SIFT (Scale Invariant Feature Transform) descriptor generation engine which features a VLSI-oriented SIFT algorithm, a three-stage pipelined architecture and novel systolic array architectures for Gaussian filtering and key-point extraction. An ROI-based scheme has been employed for the VLSI-oriented algorithm. The novel systolic array architecture drastically reduces the number of operation cycles and memory accesses. The cycle count of the Gaussian filtering module is reduced by 82% compared with the SIMD architecture. The numbers of memory accesses of the Gaussian filtering module and the key-point extraction module are reduced by 99.8% and 66% respectively, compared with the results obtained assuming the SIMD architecture. The proposed schemes provide processing capability for HDTV resolution video (1920 × 1080 pixels) at 30 frames per second (fps). The test chip has been fabricated in 65 nm CMOS technology and occupies 4.2 × 4.2 mm², containing 1.1 M gates and 1.38 Mbit of on-chip memory. The measured data demonstrate 38.2 mW power consumption at 78 MHz and 1.2 V.

  • Multilinear Supervised Neighborhood Embedding with Local Descriptor Tensor for Face Recognition

    Xian-Hua HAN  Xu QIAO  Yen-Wei CHEN  

     
    LETTER-Pattern Recognition

      Vol:
    E94-D No:1
      Page(s):
    158-161

    Subspace-learning-based face recognition methods have attracted considerable interest in recent years, including Principal Component Analysis (PCA), Independent Component Analysis (ICA), Linear Discriminant Analysis (LDA), and some extensions for 2D analysis. However, a disadvantage of all these approaches is that they perform subspace analysis directly on the reshaped vector or matrix of pixel-level intensity, which is usually unstable under illumination or pose variation. In this paper, we propose to represent a face image as a local descriptor tensor, which is a combination of the descriptors of local regions (K*K-pixel patches) in the image and is more efficient than the popular Bag-Of-Feature (BOF) model for local descriptor combination. Furthermore, we propose to use a multilinear subspace learning algorithm (Supervised Neighborhood Embedding, SNE) for discriminant feature extraction from the local descriptor tensor of face images, which can preserve local sample structure in feature space. We validate our proposed algorithm on the benchmark databases Yale and PIE, and experimental results show that the recognition rate of our method is greatly improved compared with conventional subspace analysis methods, especially for small numbers of training samples.

  • Unsupervised Feature Selection and Category Classification for a Vision-Based Mobile Robot

    Masahiro TSUKADA  Yuya UTSUMI  Hirokazu MADOKORO  Kazuhito SATO  

     
    PAPER-Image Recognition, Computer Vision

      Vol:
    E94-D No:1
      Page(s):
    127-136

    This paper presents an unsupervised learning-based method for selection of feature points and object category classification without previous setting of the number of categories. Our method consists of the following procedures: 1) detection of feature points and description of features using a Scale-Invariant Feature Transform (SIFT), 2) selection of target feature points using One Class-Support Vector Machines (OC-SVMs), 3) generation of visual words of all SIFT descriptors and histograms in each image of selected feature points using Self-Organizing Maps (SOMs), 4) formation of labels using Adaptive Resonance Theory-2 (ART-2), and 5) creation and classification of categories on a category map of Counter Propagation Networks (CPNs) for visualizing spatial relations between categories. Classification results for static images from the Caltech-256 object category dataset and for dynamic time-series images captured by a moving robot demonstrate that our method can visualize spatial relations of categories while maintaining time-series characteristics. Moreover, we emphasize the effectiveness of our method for category classification under appearance changes of objects.

  • Position-Invariant Robust Features for Long-Term Recognition of Dynamic Outdoor Scenes

    Aram KAWEWONG  Sirinart TANGRUAMSUB  Osamu HASEGAWA  

     
    PAPER-Image Recognition, Computer Vision

      Vol:
    E93-D No:9
      Page(s):
    2587-2601

    A novel Position-Invariant Robust Feature, designated as PIRF, is presented to address the problem of highly dynamic scene recognition. The PIRF is obtained by identifying existing local features (i.e. SIFT) that have a wide baseline visibility within a place (one place contains more than one sequential image). These wide-baseline visible features are then represented as a single PIRF, which is computed as an average of all descriptors associated with the PIRF. In particular, PIRFs are robust against highly dynamic changes in a scene: a single PIRF can be matched correctly against many features from many dynamic images. This paper also describes an approach to using these features for scene recognition. Recognition proceeds by matching an individual PIRF to a set of features from test images, with subsequent majority voting to identify the place with the highest number of matched PIRFs. The PIRF system is trained and tested on 2000+ outdoor omnidirectional images and on COLD datasets. Despite its simplicity, PIRF offers a markedly better rate of recognition for dynamic outdoor scenes (ca. 90%) than the use of other features. Additionally, a robot navigation system based on PIRF (PIRF-Nav) can outperform other incremental topological mapping methods in terms of time (70% less) and memory. The number of PIRFs can be reduced further to reduce the time while retaining high accuracy, which makes it suitable for long-term recognition and localization.
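
    The two steps named in this abstract, averaging the associated descriptors into one feature and voting over matched features, can be sketched roughly as below. This is an illustrative reading under stated assumptions, not the authors' implementation: the Euclidean distance, the `max_dist` matching threshold, and all function and parameter names are hypothetical choices made for this sketch.

```python
import math
from collections import Counter


def average_descriptor(descriptors):
    """Element-wise mean of equal-length descriptor vectors."""
    n = len(descriptors)
    return [sum(column) / n for column in zip(*descriptors)]


def recognize(test_features, pirfs_by_place, max_dist=0.5):
    """Return the place with the most matched features, or None.

    A stored feature counts as matched when at least one test feature
    lies within max_dist of it (Euclidean distance; the threshold is
    an assumption of this sketch).
    """
    votes = Counter()
    for place, pirfs in pirfs_by_place.items():
        for pirf in pirfs:
            if any(math.dist(pirf, f) <= max_dist for f in test_features):
                votes[place] += 1
    return votes.most_common(1)[0][0] if votes else None
```

    Calling `average_descriptor` on the descriptors tracked across the images of one place yields a single vector per feature, so each place is summarized by a short list of vectors that `recognize` compares against the features of a test image.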

  • Color Independent Components Based SIFT Descriptors for Object/Scene Classification

    Dan-ni AI  Xian-hua HAN  Xiang RUAN  Yen-wei CHEN  

     
    PAPER-Pattern Recognition

      Vol:
    E93-D No:9
      Page(s):
    2577-2586

    In this paper, we present a novel color independent components based SIFT descriptor (termed CIC-SIFT) for object/scene classification. We first learn an efficient color transformation matrix based on independent component analysis (ICA), which is adaptive to each category in a database. The ICA-based color transformation can enhance contrast between the objects and the background in an image. Then we compute CIC-SIFT descriptors over all three transformed color independent components. Since the ICA-based color transformation can boost the objects and suppress the background, the proposed CIC-SIFT can extract more effective and discriminative local features for object/scene classification. The comparison is performed among seven SIFT descriptors, and the experimental classification results show that our proposed CIC-SIFT is superior to other conventional SIFT descriptors.

  • How the Number of Interest Points Affect Scene Classification

    Wenjie XIE  De XU  Shuoyan LIU  Yingjun TANG  

     
    LETTER-Image Recognition, Computer Vision

      Vol:
    E93-D No:4
      Page(s):
    930-933

    This paper focuses on the relationship between the number of interest points and the accuracy rate in scene classification. It is commonly believed that more interest points yield higher accuracy, but little effort has been made to verify this. To validate this belief, we carry out extensive experiments based on the bag-of-words method. In particular, three different SIFT descriptors and five feature selection methods are adopted to change the number of interest points. As our contributions, we propose a novel dense SIFT descriptor named Octave Dense SIFT, which can generate more interest points and higher accuracy, and a new feature selection method called number mutual information (NMI), which is more robust than other feature selection methods. Experimental results show that the number of interest points strongly affects classification accuracy.

  • Face Alignment Based on Statistical Models Using SIFT Descriptors

    Zisheng LI  Jun-ichi IMAI  Masahide KANEKO  

     
    PAPER-Processing

      Vol:
    E92-A No:12
      Page(s):
    3336-3343

    Active Shape Model (ASM) is a powerful statistical tool for image interpretation, especially in face alignment. In the standard ASM, local appearances are described by intensity profiles, and the model parameter estimation is based on the assumption that the profiles follow a Gaussian distribution. The standard ASM suffers from variations in pose, illumination, and expression, and from obstacles. In this paper, an improved ASM framework, GentleBoost-based SIFT-ASM, is proposed. In this framework, local appearances of landmarks are represented by SIFT (Scale-Invariant Feature Transform) descriptors, which are gradient-orientation-histogram-based representations of an image neighborhood. They can provide more robust and accurate guidance for search than grey-level profiles. Moreover, GentleBoost classifiers are applied to model and search the SIFT features, removing the unnecessary assumption of a Gaussian distribution. Experimental results show that SIFT-ASM significantly outperforms the original ASM in aligning and localizing facial features.
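
    The phrase "gradient orientation histograms" can be made concrete with a small sketch. The code below builds one magnitude-weighted orientation histogram over a grey-level patch; it illustrates the building block only and is not the descriptor pipeline of the paper above. The central-difference gradient, the default of 8 bins, and the function name are assumptions made for this sketch. A full SIFT descriptor concatenates such histograms over a grid of cells and adds weighting and normalization steps that are omitted here.

```python
import math


def orientation_histogram(patch, bins=8):
    """Magnitude-weighted histogram of gradient orientations.

    patch is a list of rows of grey levels. Gradients use central
    differences, so only interior pixels contribute.
    """
    hist = [0.0] * bins
    for y in range(1, len(patch) - 1):
        for x in range(1, len(patch[0]) - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue  # flat region: no orientation to record
            angle = math.atan2(gy, gx) % (2 * math.pi)
            index = int(angle / (2 * math.pi) * bins) % bins
            hist[index] += magnitude
    return hist
```

    For a patch whose grey level increases only from left to right, every interior gradient points along the positive x axis, so all of the weight falls into the first bin.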
