The search functionality is under construction.

Keyword Search Result

[Keyword] tract(469hit)

141-160hit(469hit)

  • Face Retrieval in Large-Scale News Video Datasets

    Thanh Duc NGO  Hung Thanh VU  Duy-Dinh LE  Shin'ichi SATOH  

     
    PAPER-Image Recognition, Computer Vision

      Vol:
    E96-D No:8
      Page(s):
    1811-1825

    Face retrieval in news video has been identified as a challenging task due to the huge variations in the visual appearance of the human face. Although several approaches have been proposed to deal with this problem, their extremely high computational cost limits their scalability to large-scale video datasets that may contain millions of faces of hundreds of characters. In this paper, we introduce approaches for face retrieval that are scalable to such datasets while maintaining competitive performances with state-of-the-art approaches. To utilize the variability of face appearances in video, we use a set of face images called face-track to represent the appearance of a character in a video shot. Our first proposal is an approach for extracting face-tracks. We use a point tracker to explore the connections between detected faces belonging to the same character and then group them into one face-track. We present techniques to make the approach robust against common problems caused by flash lights, partial occlusions, and scattered appearances of characters in news videos. In the second proposal, we introduce an efficient approach to match face-tracks for retrieval. Instead of using all the faces in the face-tracks to compute their similarity, our approach obtains a representative face for each face-track. The representative face is computed from faces that are sampled from the original face-track. As a result, we significantly reduce the computational cost of face-track matching while taking into account the variability of faces in face-tracks to achieve high matching accuracy. Experiments are conducted on two face-track datasets extracted from real-world news videos, of such scales that have never been considered in the literature. One dataset contains 1,497 face-tracks of 41 characters extracted from 370 hours of TRECVID videos. The other dataset provides 5,567 face-tracks of 111 characters observed from a television news program (NHK News 7) over 11 years. 
We make both datasets publicly accessible to the research community. The experimental results show that our proposed approaches achieved a remarkable balance between accuracy and efficiency.

  • Bayesian Word Alignment and Phrase Table Training for Statistical Machine Translation

    Zezhong LI  Hideto IKEDA  Junichi FUKUMOTO  

     
    PAPER-Natural Language Processing

      Vol:
    E96-D No:7
      Page(s):
    1536-1543

    In most phrase-based statistical machine translation (SMT) systems, the translation model relies on word alignment, which serves as a constraint for the subsequent building of a phrase table. Word alignment is usually inferred by GIZA++, which implements all the IBM models and the HMM model in the framework of Expectation-Maximization (EM). In this paper, we present a fully Bayesian inference for word alignment. Unlike the EM approach, the Bayesian inference makes use of all possible parameter values rather than estimating a single parameter value, from which we expect a more robust inference. After inferring the word alignment, current SMT systems usually train the phrase table from the Viterbi word alignment, which is prone to learning incorrect phrases because of word alignment mistakes. To overcome this drawback, a new phrase extraction method is proposed based on multiple Gibbs samples from Bayesian inference for word alignment. Empirical results show promising improvements over baselines in alignment quality as well as translation performance.

  • Extracting Events from Web Documents for Social Media Monitoring Using Structured SVM

    Yoonjae CHOI  Pum-Mo RYU  Hyunki KIM  Changki LEE  

     
    LETTER-Natural Language Processing

      Vol:
    E96-D No:6
      Page(s):
    1410-1414

    Event extraction is vital to social media monitoring and social event prediction. In this paper, we propose a method for social event extraction from web documents by identifying binary relations between named entities. There have been many studies on relation extraction, but their aims were mostly academic. For practical application, we try to identify 130 relation types that comprise 31 predefined event types, which address business and public issues. We use a structured Support Vector Machine, a state-of-the-art classifier, to capture relations. We apply our method to news, blogs and tweets collected from the Internet and discuss the results.

  • Low Complexity Keypoint Extraction Based on SIFT Descriptor and Its Hardware Implementation for Full-HD 60 fps Video

    Takahiro SUZUKI  Takeshi IKENAGA  

     
    PAPER

      Vol:
    E96-A No:6
      Page(s):
    1376-1383

    Scale-Invariant Feature Transform (SIFT) has lately attracted attention in computer vision as a robust keypoint detection algorithm that is invariant to scale, rotation and illumination changes. However, its computational complexity is too high for practical real-time applications. This paper proposes a low complexity keypoint extraction algorithm based on the SIFT descriptor and the use of a database, and its real-time hardware implementation for Full-HD resolution video. The proposed algorithm computes the SIFT descriptor on keypoints obtained by corner detection and selects a scale from the database. The keypoint detection and descriptor computation modules can be parallelized in the hardware. In contrast with SIFT, which computes a scale, these modules do not depend on each other in the proposed algorithm. The processing time of descriptor computation in this hardware is independent of the number of keypoints because descriptor generation uses a pixel-level pipelined structure. Evaluation results show that the proposed algorithm in software is 12 times faster than SIFT. Moreover, the proposed hardware on FPGA is 427 times faster than SIFT and 61 times faster than the proposed algorithm in software. The proposed hardware performs keypoint extraction and matching at 60 fps for Full-HD video.

  • A Method of Data Embedding and Extracting for Information Retrieval Considering Mobile Devices

    Mitsuji MUNEYASU  Hiroshi KUDO  Takafumi SHONO  Yoshiko HANADA  

     
    PAPER

      Vol:
    E96-A No:6
      Page(s):
    1214-1221

    In this paper, we propose an improved data embedding and extraction method for information retrieval considering the use of mobile devices. Although the conventional method has demonstrated good results for images captured by cellular phones, some problems remain with this method. One problem is the lack of consideration of the construction of the code grouping in the code grouping method. In this paper, a new construction method for code grouping is proposed, and it is shown that a suitable grouping of the codes can be found. Another problem is the correction method of lens distortion, which is time-consuming. Therefore, to improve the processing speed, the golden section search method is adopted to estimate the distortion coefficients. In addition, a new tuning algorithm for the gain coefficient in the embedding process is also proposed. Experimental results show an increase in the detection rate for embedding data and a reduction of the processing time.
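    The abstract above names golden section search as the tool for estimating distortion coefficients. As a rough illustration only, the following is a generic one-dimensional golden section search in Python. The objective function, the search interval, and the tolerance are all assumptions made for this sketch; the abstract does not state the actual cost function the authors minimize.

```python
import math


def golden_section_search(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function f on [lo, hi] by golden section search.

    Generic textbook form; f is a hypothetical stand-in for whatever
    distortion-correction cost is being minimized.
    """
    inv_phi = (math.sqrt(5) - 1) / 2  # 1/phi, about 0.618
    a, b = lo, hi
    c = b - inv_phi * (b - a)  # left interior point
    d = a + inv_phi * (b - a)  # right interior point
    fc, fd = f(c), f(d)
    while (b - a) > tol:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2
```

    Each iteration reuses one of the two previous function evaluations, so only one new evaluation of the cost is needed per step, which is presumably what makes the method attractive when each evaluation is expensive.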

  • Bidirectional Local Template Patterns: An Effective and Discriminative Feature for Pedestrian Detection

    Jiu XU  Ning JIANG  Satoshi GOTO  

     
    PAPER

      Vol:
    E96-A No:6
      Page(s):
    1204-1213

    In this paper, a novel feature named bidirectional local template patterns (B-LTP) is proposed for use in pedestrian detection in still images. B-LTP is a combination and modification of two features, histogram of templates (HOT) and center-symmetric local binary patterns (CS-LBP). For each pixel, B-LTP defines four templates, each of which contains the pixel itself and two neighboring center-symmetric pixels. For each template, it then calculates information from the relationships among these three pixels and from the two directional transitions across these pixels. Moreover, because the feature length of B-LTP is small, it consumes less memory and computational power. Experimental results on an INRIA dataset show that the speed and detection rate of our proposed B-LTP feature outperform those of other features such as histogram of orientated gradient (HOG), HOT, and covariance matrix (COV).

  • Homomorphic Filtered Spectral Peaks Energy for Automatic Detection of Vowel Onset Point in Continuous Speech

    Xian ZANG  Kil To CHONG  

     
    PAPER-Speech and Hearing

      Vol:
    E96-D No:4
      Page(s):
    949-956

    During the production of speech signals, the vowel onset point is an important event containing important information for many speech processing tasks, such as consonant-vowel unit recognition and speech end-points detection. In order to realize accurate automatic detection of vowel onset points, this paper proposes a reliable method using the energy characteristics of homomorphic filtered spectral peaks. The homomorphic filtering helps to separate the slowly varying vocal tract system characteristics from the rapidly fluctuating excitation characteristics in the cepstral domain. The distinct vocal tract shape related to vowels is obtained and the peaks in the estimated vocal tract spectrum provide accurate and stable information for VOP detection. Performance of the proposed method is compared with the existing method which uses the combination of evidence from the excitation source, spectral peaks, and modulation spectrum energies. The detection rate with different time resolutions, together with the missing rate and spurious rate, are used for comprehensive evaluation of the performance on continuous speech taken from the TIMIT database. The detection accuracy of the proposed method is 74.14% for ±10 ms resolution and it increases to 96.33% for ±40 ms resolution with 3.67% missing error and 4.14% spurious error, much better than the results obtained by the combined approach at each specified time resolution, especially the higher resolutions of ±10 to ±30 ms. In the cases of speech corrupted by white noise, pink noise and f-16 noise, the proposed method also shows significant improvement in the performance compared with the existing method.

  • Query-by-Sketch Image Retrieval Using Edge Relation Histogram

    Yoshiki KUMAGAI  Gosuke OHASHI  

     
    PAPER-Image Processing and Video Processing

      Vol:
    E96-D No:2
      Page(s):
    340-348

    There has recently been much research on content-based image retrieval (CBIR) that uses image features including color, shape, and texture. In CBIR, feature extraction is important because the retrieval result depends on the image feature. Query-by-sketch image retrieval is one type of CBIR, and it is efficient because users simply have to draw a sketch to retrieve the desired images. In this type of retrieval, selecting the optimum feature extraction method is important because the retrieval result depends on the image feature. We have developed a query-by-sketch image retrieval method that uses an edge relation histogram (ERH) as a global and local feature intended for binary line images. This histogram is based on the patterns of distribution of other line pixels centered on each line pixel that have been obtained by global and local processing. ERH, which is a shift- and scale-invariant feature, focuses on the relation among the edge pixels. It is fairly simple to describe rotation- and symmetry-invariant features, and query-by-sketch image retrieval using ERH makes it possible to perform retrievals that are not affected by position, size, rotation, or mirroring. We applied the proposed method to 20,000 images in the Corel Photo Gallery. Experimental results showed that it was an effective means of retrieving images.

  • New POI Construction with Street-Level Imagery

    Chillo GA  Jeongho LEE  Won Hee LEE  Kiyun YU  

     
    LETTER-Data Engineering, Web Information Systems

      Vol:
    E96-D No:1
      Page(s):
    129-133

    We present a novel point of interest (POI) construction approach based on street-level imagery (SLI) such as Google StreetView. Our method consists of: (1) the creation of a conflation map between an SLI trace and a vector map; (2) the detection of the corresponding buildings between the SLI scene and the conflation map; and (3) POI name extraction from a signboard in the SLI scene by user-interactive text recognition. Finally, a POI is generated through a combination of the POI name and attributes of the building object on a vector map. The proposed method showed recall of 92.99% and precision of 97.10% for real-world POIs.

  • GREAT-CEO: larGe scale distRibuted dEcision mAking Techniques for Wireless Chief Executive Officer Problems Open Access

    Xiaobo ZHOU  Xin HE  Khoirul ANWAR  Tad MATSUMOTO  

     
    INVITED PAPER

      Vol:
    E95-B No:12
      Page(s):
    3654-3662

    In this paper, we reformulate the issue related to wireless mesh networks (WMNs) from the Chief Executive Officer (CEO) problem viewpoint, and provide a practical solution to a simple case of the problem. It is well known that the CEO problem is a theoretical basis for sensor networks. The problem investigated in this paper is described as follows: an originator broadcasts its binary information sequence to several forwarding nodes (relays) over Binary Symmetric Channels (BSC); the originator's information sequence suffers from independent random binary errors; at the forwarding nodes, they just further interleave, encode the received bit sequence, and then forward it, without making heavy efforts for correcting errors that may occur in the originator-relay links, to the final destination (FD) over Additive White Gaussian Noise (AWGN) channels. Hence, this strategy reduces the complexity of the relay significantly. A joint iterative decoding technique at the FD is proposed by utilizing the knowledge of the correlation due to the errors occurring in the link between the originator and forwarding nodes (referred to as intra-link). The bit-error-rate (BER) performances show that the originator's information can be reconstructed at the FD even by using a very simple coding scheme. We provide BER performance comparison between joint decoding and separate decoding strategies. The simulation results show that excellent performance can be achieved by the proposed system. Furthermore, extrinsic information transfer (EXIT) chart analysis is performed to investigate convergence property of the proposed technique, with the aim of, in part, optimizing the code rate at the originator.

  • Integer Programming-Based Approach to Attractor Detection and Control of Boolean Networks

    Tatsuya AKUTSU  Yang ZHAO  Morihiro HAYASHIDA  Takeyuki TAMURA  

     
    PAPER-Fundamentals of Information Systems

      Vol:
    E95-D No:12
      Page(s):
    2960-2970

    The Boolean network (BN) can be used to create discrete mathematical models of gene regulatory networks. In this paper, we consider three problems on BNs that are known to be NP-hard: detection of a singleton attractor, finding a control strategy that shifts a BN from a given initial state to the desired state, and control of attractors. We propose integer programming-based methods which solve these problems in a unified manner. Then, we present results of computational experiments which suggest that the proposed methods are useful for solving moderate size instances of these problems. We also show that control of attractors is -hard, which suggests that control of attractors is harder than the other two problems.

  • On Improving JPEG Entropy Coding by means of Sub-Stream Extraction

    Youngjin KIM  Hyun Joon SHIN  Jung-Ju CHOI  Youngcheul WEE  

     
    LETTER-Image Processing and Video Processing

      Vol:
    E95-D No:11
      Page(s):
    2737-2740

    We introduce an entropy coding method to enhance the compression efficiency of JPEG. Because run-length coding and early-termination work more effectively for longer zero sequences, we extract ones and negative ones from the coefficients and reduce the magnitude of all coefficients by one. The extracted coefficients are encoded with a designated entropy coding method. The proposed method can transmit images in two parts progressively, where the first contains JPEG-compatible image with a small amount of degradation and the second is used to add fine details. Our method improves the compression ratio by more than 5% without sacrificing the efficiency of JPEG.
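    The extraction step described above can be sketched in Python as follows. This is only an illustration of the idea of pulling ±1 coefficients into a separate stream and shrinking every nonzero magnitude by one. The function names and the layout of the side stream (one symbol per zero in the reduced stream) are assumptions made for this sketch, not the encoding used in the paper.

```python
def split_substream(coeffs):
    """Split quantized coefficients into a reduced stream and a side stream.

    Every nonzero magnitude is reduced by one, so +1/-1 become 0 and
    lengthen the zero runs. The side stream (hypothetical layout) holds one
    symbol per zero of the reduced stream: 0 for a true zero, +1 or -1 for
    an extracted coefficient.
    """
    reduced, side = [], []
    for c in coeffs:
        if c == 0:
            reduced.append(0)
            side.append(0)
        elif abs(c) == 1:
            reduced.append(0)
            side.append(c)
        else:
            reduced.append(c - 1 if c > 0 else c + 1)
    return reduced, side


def merge_substream(reduced, side):
    """Invert split_substream: restore the original coefficients."""
    side_iter = iter(side)
    restored = []
    for r in reduced:
        if r == 0:
            restored.append(next(side_iter))
        else:
            restored.append(r + 1 if r > 0 else r - 1)
    return restored
```

    Decoding only the reduced stream yields coefficients whose magnitudes are each off by one, which corresponds to the slightly degraded first part mentioned in the abstract; the side stream then restores the exact values.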

  • Topic Extraction for Documents Based on Compressibility Vector

    Nuo ZHANG  Toshinori WATANABE  

     
    PAPER-Artificial Intelligence, Data Mining

      Vol:
    E95-D No:10
      Page(s):
    2438-2446

    Nowadays, a great many e-documents are accessed on the Internet. It would be helpful if those documents could be analyzed automatically and their significant contents extracted. Similarity analysis and topic extraction are widely used as document relation analysis techniques. Most of the proposed methods need preprocessing such as stemming, stop-word removal, etc. Those methods require natural language processing (NLP) technology and hence depend on the language features and the dataset. In this study, we propose novel document relation analysis and topic extraction methods based on text compression. Our proposed approaches do not require NLP and can also evaluate documents automatically. We test our proposal on model documents, URCS, and the Reuters-21578 dataset for relation analysis and topic extraction. The effectiveness of the proposed methods is shown by the simulations.

  • Crosstalk Analysis and Measurement Technique for High Frequency Signal Transfer in MEMs Probe Pins

    Duc Long LUONG  Hyeonju BAE  Wansoo NAH  

     
    PAPER

      Vol:
    E95-C No:9
      Page(s):
    1459-1464

    This paper develops crosstalk analysis and measurement techniques for the design and fabrication of the MEMs (Micro-ElectroMechanical System) probe card. By introducing more ground pins into the connector pins, the crosstalk characteristics can be improved, and a design guide for parameters such as pin size and pitch is proposed to satisfy the given crosstalk limit of -30 dB for reliable high-speed signal transfer. The paper also presents a novel method to characterize the scattering parameters of multiport interconnect circuits with a 4-port VNA (Vector Network Analyzer). By re-normalizing scattering matrices with different reference impedances at the other ports, data obtained from 4-port configuration measurements can be synthesized to build a full scattering matrix of the DUT (Device-Under-Test, MEMs probe connector pins). In comparison with the conventional 2-port VNA re-normalization method, the proposed technique has two advantages: shorter measuring time, and enhanced accuracy even with open-ended unmeasured ports. Good agreement between the estimated and correct S parameters verifies the validity of the proposed algorithm.

  • Self-Clustering Symmetry Detection

    Bei HE  Guijin WANG  Chenbo SHI  Xuanwu YIN  Bo LIU  Xinggang LIN  

     
    LETTER-Image Recognition, Computer Vision

      Vol:
    E95-D No:9
      Page(s):
    2359-2362

    This paper presents a self-clustering algorithm to detect symmetry in images. We combine correlations of orientations, scales and descriptors as a triple feature vector to evaluate each feature pair while low confidence pairs are regarded as outliers and removed. Additionally, all confident pairs are preserved to extract potential symmetries since one feature point may be shared by different pairs. Further, each feature pair forms one cluster and is merged and split iteratively based on the continuity in the Cartesian and concentration in the polar coordinates. Pseudo symmetric axes and outlier midpoints are eliminated during the process. Experiments demonstrate the robustness and accuracy of our algorithm visually and quantitatively.

  • A Constant-Round Resettably-Sound Resettable Zero-Knowledge Argument in the BPK Model

    Seiko ARITA  

     
    PAPER-Cryptography and Information Security

      Vol:
    E95-A No:8
      Page(s):
    1390-1401

    In resetting attacks against a proof system, a prover or a verifier is reset and forced to use the same random tape on various inputs as many times as an adversary wants. The recent deployment of cloud computing gives these attacks new importance. This paper shows that argument systems for any NP language that are both resettably-sound and resettable zero-knowledge can be realized by a constant-round protocol in the BPK model. To that end, we define and construct a resettably-extractable conditional commitment scheme.

  • Chaotic Behavior in a Switching Delay Circuit

    Akihito MATSUO  Hiroyuki ASAHARA  Takuji KOUSAKA  

     
    PAPER-Nonlinear Problems

      Vol:
    E95-A No:8
      Page(s):
    1329-1336

    This paper clarifies the bifurcation structure of the chaotic attractor in an interrupted circuit with switching delay from theoretical and experimental view points. First, we introduce the circuit model and its dynamics. Next, we define the return map in order to investigate the bifurcation structure of the chaotic attractor. Finally, we discuss the dynamical effect of switching delay in the existence region of the chaotic attractor compared with that of a circuit with ideal switching.

  • Pedestrian Detection Using Gradient Local Binary Patterns

    Ning JIANG  Jiu XU  Satoshi GOTO  

     
    PAPER-Coding & Processing

      Vol:
    E95-A No:8
      Page(s):
    1280-1287

    In recent years, local-pattern-based features have attracted increasing interest in object detection and recognition systems. The Local Binary Pattern (LBP) feature is widely used in texture classification and face detection, but the original definition of LBP is not suitable for human detection. In this paper, we propose a novel feature named gradient local binary patterns (GLBP) for human detection. In this feature, the original 256 local binary patterns are reduced to 56 patterns, called uniform patterns, which are used to generate a 56-bin histogram. The gradient value of each pixel is used as its weight when computing the values of the 56 histogram bins, whereas this weight is always the same in LBP-based features. Experiments on the INRIA dataset show that the proposed GLBP feature is more discriminative than histogram of orientated gradient (HOG), Semantic Local Binary Patterns (S-LBP) and histogram of template (HOT). In our experiments, the window size is fixed, which means the performance can be improved by boosting methods. The computation of the GLBP feature is also parallel, which makes it easy to accelerate in hardware. These factors make the GLBP feature feasible for real-time pedestrian detection.
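    The abstract does not say how the 56 patterns are selected. One reading that matches the count is that they are the 8-bit patterns with exactly two circular 0/1 transitions, which excludes the all-zeros and all-ones codes. The Python sketch below checks that count under this assumption; the function name is illustrative and not taken from the paper.

```python
def circular_transitions(pattern, bits=8):
    """Count 0/1 changes between adjacent bits, treating the code as circular."""
    # Rotate right by one bit, then XOR: a set bit marks a position whose
    # neighbor differs from it.
    rotated = (pattern >> 1) | ((pattern & 1) << (bits - 1))
    return bin(pattern ^ rotated).count("1")


# Assumption: the 56 patterns are those with exactly two transitions,
# i.e. a single contiguous run of ones of length 1 to 7 at any of 8 offsets.
two_transition_patterns = [p for p in range(256) if circular_transitions(p) == 2]
```

    Under this reading, each 3x3 neighborhood code either maps to one of these 56 bins or is ignored, and a pixel's contribution to its bin would be its gradient value rather than a constant count.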

  • Automatic Road Area Extraction from Printed Maps Based on Linear Feature Detection

    Sebastien CALLIER  Hideo SAITO  

     
    PAPER-Segmentation

      Vol:
    E95-D No:7
      Page(s):
    1758-1765

    Raster maps are widely available in everyday life and can contain a huge amount of information of any kind, conveyed for example through labels, pictograms, or color codes. However, extracting roads from those maps is not easy because of these overlapping features. In this paper, we focus on an automated method that extracts roads by using linear feature detection to search for seed points that have a high probability of belonging to roads. The linear features are lines of pixels of homogeneous color in each direction around each pixel. The seeds are then expanded before deciding whether to keep or discard the extracted element. Because this method is not mainly based on color segmentation, it is also suitable for handwritten maps, for example. The experimental results demonstrate that in most cases our method gives results similar to those of usual methods without needing any previous data or user input, although it does need some knowledge of the target maps, and that it works with handwritten maps drawn following some basic rules, where usual methods fail.

  • Discovery of Predicate-Oriented Relations among Named Entities Extracted from Thai Texts

    Nattapong TONGTEP  Thanaruk THEERAMUNKONG  

     
    PAPER-Artificial Intelligence, Data Mining

      Vol:
    E95-D No:7
      Page(s):
    1932-1946

    Extracting named entities (NEs) and their relations is more difficult in Thai than in other languages due to several Thai specific characteristics, including no explicit boundaries for words, phrases and sentences; few case markers and modifier clues; high ambiguity in compound words and serial verbs; and flexible word orders. Unlike most previous works which focused on NE relations of specific actions, such as work_for, live_in, located_in, and kill, this paper proposes more general types of NE relations, called predicate-oriented relation (PoR), where an extracted action part (verb) is used as a core component to associate related named entities extracted from Thai Texts. Lacking a practical parser for the Thai language, we present three types of surface features, i.e. punctuation marks (such as token spaces), entity types and the number of entities and then apply five alternative commonly used learning schemes to investigate their performance on predicate-oriented relation extraction. The experimental results show that our approach achieves the F-measure of 97.76%, 99.19%, 95.00% and 93.50% on four different types of predicate-oriented relation (action-location, location-action, action-person and person-action) in crime-related news documents using a data set of 1,736 entity pairs. The effects of NE extraction techniques, feature sets and class unbalance on the performance of relation extraction are explored.
