Keyword Search Result

[Keyword] extraction(301hit)

81-100hit(301hit)

  • Predominant Melody Extraction from Polyphonic Music Signals Based on Harmonic Structure

    Jea-Yul YOON  Chai-Jong SONG  Hochong PARK  

     
    LETTER-Music Information Processing

      Vol:
    E96-D No:11
      Page(s):
    2504-2507

    A new method for predominant melody extraction from polyphonic music signals based on harmonic structure is proposed. The proposed method first extracts a set of fundamental frequency candidates by analyzing the distance between spectral peaks. Then, the predominant fundamental frequency is selected by pitch tracking according to the harmonic strength of the selected candidates. Finally, the method runs pitch smoothing on a large temporal scale for eliminating pitch doubling error, and conducts voicing frame detection. The proposed method shows the best overall performance for ADC 2004 DB in the MIREX 2011 audio melody extraction task.

  • A Single Tooth Segmentation Using PCA-Stacked Gabor Filter and Active Contour

    Pramual CHOORAT  Werapon CHIRACHARIT  Kosin CHAMNONGTHAI  Takao ONOYE  

     
    PAPER-Image Processing

      Vol:
    E96-A No:11
      Page(s):
    2169-2178

    In tooth contour extraction, the intensity difference between the tooth and dental bone in x-ray images is insufficient. This difference must be enhanced in order to improve the accuracy of tooth segmentation. This paper proposes a method to enhance the intensity difference between the tooth and dental bone. This method consists of an estimation of tooth orientation (intensity projection, smoothing filter, and peak detection) and PCA-Stacked Gabor with ellipse Gabor banks. Tooth orientation estimation is performed to determine the angle of a single oriented tooth. PCA-Stacked Gabor with ellipse Gabor banks is then used, in particular to enhance the border between the tooth and dental bone. Finally, active contour extraction is performed in order to determine the tooth contour. In the experiment, in comparison with the conventional active contour without edge (ACWE) method, the average mean square error (MSE) values of extracted tooth contour points are reduced from 26.93% and 16.02% to 19.07% and 13.42% for tooth x-ray type I and type H images, respectively.

  • Face Retrieval in Large-Scale News Video Datasets

    Thanh Duc NGO  Hung Thanh VU  Duy-Dinh LE  Shin'ichi SATOH  

     
    PAPER-Image Recognition, Computer Vision

      Vol:
    E96-D No:8
      Page(s):
    1811-1825

    Face retrieval in news video has been identified as a challenging task due to the huge variations in the visual appearance of the human face. Although several approaches have been proposed to deal with this problem, their extremely high computational cost limits their scalability to large-scale video datasets that may contain millions of faces of hundreds of characters. In this paper, we introduce approaches for face retrieval that are scalable to such datasets while maintaining performance competitive with state-of-the-art approaches. To utilize the variability of face appearances in video, we use a set of face images called a face-track to represent the appearance of a character in a video shot. Our first proposal is an approach for extracting face-tracks. We use a point tracker to explore the connections between detected faces belonging to the same character and then group them into one face-track. We present techniques to make the approach robust against common problems caused by flash lights, partial occlusions, and scattered appearances of characters in news videos. In the second proposal, we introduce an efficient approach to match face-tracks for retrieval. Instead of using all the faces in the face-tracks to compute their similarity, our approach obtains a representative face for each face-track. The representative face is computed from faces that are sampled from the original face-track. As a result, we significantly reduce the computational cost of face-track matching while taking into account the variability of faces in face-tracks to achieve high matching accuracy. Experiments are conducted on two face-track datasets extracted from real-world news videos, of such scales that have never been considered in the literature. One dataset contains 1,497 face-tracks of 41 characters extracted from 370 hours of TRECVID videos. The other dataset provides 5,567 face-tracks of 111 characters observed from a television news program (NHK News 7) over 11 years. We make both datasets publicly accessible to the research community. The experimental results show that our proposed approaches achieved a remarkable balance between accuracy and efficiency.

  • Track Extraction for Accelerated Targets in Dense Environments Using Variable Gating MLPDA

    Masanori MORI  Takashi MATSUZAKI  Hiroshi KAMEDA  Toru UMEZAWA  

     
    PAPER-Sensing

      Vol:
    E96-B No:8
      Page(s):
    2173-2179

    MLPDA (Maximum Likelihood Probabilistic Data Association) has attracted a great deal of attention as an effective target track extraction method in high false density environments. However, to extract an accelerated target track on a 2-dimensional plane, the computational load of the conventional MLPDA is extremely high, since it needs to search for the most-likely position, velocity and acceleration of the target in 6-dimensional space. In this paper, we propose VG-MLPDA (Variable Gating MLPDA), which consists of the following two steps. The first step is to search for the target's position and velocity among candidates with the assumed acceleration by using variable gates, which take into account both the observation noise and the difference between the assumed and true acceleration. The second step is to search for the most-likely position, velocity and acceleration using a maximization algorithm while reducing the gate volume. Simulation results show the validity of our method.

  • Creating Chinese-English Comparable Corpora

    Degen HUANG  Shanshan WANG  Fuji REN  

     
    PAPER-Natural Language Processing

      Vol:
    E96-D No:8
      Page(s):
    1853-1861

    Comparable corpora are valuable resources for many NLP applications, and extensive research has been done on information mining based on comparable corpora in recent years. Because few large-scale comparable corpora are publicly available at present, this paper presents a bi-directional CLIR-based method for creating comparable corpora from two independent news collections in different languages. The original Chinese and English document collections are each crawled from XinHuaNet and formatted in a consistent manner. For each document from the two collections, the best query keywords are extracted to represent the essential content of the document, and then the keywords are translated into the language of the other collection. The translated queries are run against the collection in the same language to pick up the candidate documents in the other language, and candidates are aligned based on their publication dates and similarity scores. Results show that our approach significantly outperforms previous approaches to the construction of Chinese-English comparable corpora.

  • Bayesian Word Alignment and Phrase Table Training for Statistical Machine Translation

    Zezhong LI  Hideto IKEDA  Junichi FUKUMOTO  

     
    PAPER-Natural Language Processing

      Vol:
    E96-D No:7
      Page(s):
    1536-1543

    In most phrase-based statistical machine translation (SMT) systems, the translation model relies on word alignment, which serves as a constraint for the subsequent building of a phrase table. Word alignment is usually inferred by GIZA++, which implements all the IBM models and the HMM model in the framework of Expectation Maximization (EM). In this paper, we present a fully Bayesian inference for word alignment. Unlike the EM approach, the Bayesian inference makes use of all possible parameter values rather than estimating a single parameter value, from which we expect a more robust inference. After inferring the word alignment, current SMT systems usually train the phrase table from the Viterbi word alignment, which is prone to learning incorrect phrases because of word alignment mistakes. To overcome this drawback, a new phrase extraction method is proposed based on multiple Gibbs samples from Bayesian inference for word alignment. Empirical results show promising improvements over baselines in alignment quality as well as translation performance.

  • Low Complexity Keypoint Extraction Based on SIFT Descriptor and Its Hardware Implementation for Full-HD 60 fps Video

    Takahiro SUZUKI  Takeshi IKENAGA  

     
    PAPER

      Vol:
    E96-A No:6
      Page(s):
    1376-1383

    Scale-Invariant Feature Transform (SIFT) has lately attracted attention in computer vision as a robust keypoint detection algorithm that is invariant to scale, rotation and illumination changes. However, its computational complexity is too high for practical real-time applications. This paper proposes a low complexity keypoint extraction algorithm based on the SIFT descriptor and the use of a database, together with its real-time hardware implementation for Full-HD resolution video. The proposed algorithm computes the SIFT descriptor on keypoints obtained by corner detection and selects a scale from the database. The keypoint detection and descriptor computation modules can be parallelized in the hardware, because in the proposed algorithm, unlike in SIFT, which computes a scale, these modules do not depend on each other. The processing time of descriptor computation in this hardware is independent of the number of keypoints because descriptor generation is pipelined per pixel. Evaluation results show that the proposed algorithm in software is 12 times faster than SIFT. Moreover, the proposed hardware on an FPGA is 427 times faster than SIFT and 61 times faster than the proposed algorithm in software. The proposed hardware performs keypoint extraction and matching at 60 fps for Full-HD video.

  • A Method of Data Embedding and Extracting for Information Retrieval Considering Mobile Devices

    Mitsuji MUNEYASU  Hiroshi KUDO  Takafumi SHONO  Yoshiko HANADA  

     
    PAPER

      Vol:
    E96-A No:6
      Page(s):
    1214-1221

    In this paper, we propose an improved data embedding and extraction method for information retrieval considering the use of mobile devices. Although the conventional method has demonstrated good results for images captured by cellular phones, some problems remain with this method. One problem is the lack of consideration of the construction of the code grouping in the code grouping method. In this paper, a new construction method for code grouping is proposed, and it is shown that a suitable grouping of the codes can be found. Another problem is the correction method of lens distortion, which is time-consuming. Therefore, to improve the processing speed, the golden section search method is adopted to estimate the distortion coefficients. In addition, a new tuning algorithm for the gain coefficient in the embedding process is also proposed. Experimental results show an increase in the detection rate for embedding data and a reduction of the processing time.
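The abstract above names golden section search as the way the distortion coefficients are estimated. A minimal Python sketch of that generic one-dimensional search follows. The cost function `toy_distortion_cost`, the bracket `[0.0, 1.0]`, and the target value `0.12` are hypothetical stand-ins chosen for illustration; the paper's actual cost function and coefficient range are not given in the abstract.

```python
import math


def golden_section_search(f, lo, hi, tol=1e-6):
    """Return an approximate minimizer of a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2  # 1/phi, about 0.618
    a, b = lo, hi
    c = b - inv_phi * (b - a)  # left interior point
    d = a + inv_phi * (b - a)  # right interior point
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new right point.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new left point.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2


def toy_distortion_cost(k):
    """Hypothetical cost: smallest when the coefficient k equals 0.12."""
    return (k - 0.12) ** 2 + 0.5


k_hat = golden_section_search(toy_distortion_cost, 0.0, 1.0)
print(round(k_hat, 4))
```

Each iteration reuses one of the two previous function evaluations, so only one new evaluation of the cost is needed per step, which is the usual reason for preferring this search when the cost is expensive to compute.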

  • Bidirectional Local Template Patterns: An Effective and Discriminative Feature for Pedestrian Detection

    Jiu XU  Ning JIANG  Satoshi GOTO  

     
    PAPER

      Vol:
    E96-A No:6
      Page(s):
    1204-1213

    In this paper, a novel feature named bidirectional local template patterns (B-LTP) is proposed for use in pedestrian detection in still images. B-LTP is a combination and modification of two features, histogram of templates (HOT) and center-symmetric local binary patterns (CS-LBP). For each pixel, B-LTP defines four templates, each of which contains the pixel itself and two neighboring center-symmetric pixels. For each template, it then calculates information from the relationships among these three pixels and from the two directional transitions across these pixels. Moreover, because the feature length of B-LTP is small, it consumes less memory and computational power. Experimental results on the INRIA dataset show that the speed and detection rate of our proposed B-LTP feature outperform those of other features such as histogram of oriented gradients (HOG), HOT, and covariance matrix (COV).

  • Extracting Events from Web Documents for Social Media Monitoring Using Structured SVM

    Yoonjae CHOI  Pum-Mo RYU  Hyunki KIM  Changki LEE  

     
    LETTER-Natural Language Processing

      Vol:
    E96-D No:6
      Page(s):
    1410-1414

    Event extraction is vital to social media monitoring and social event prediction. In this paper, we propose a method for social event extraction from web documents by identifying binary relations between named entities. There have been many studies on relation extraction, but their aims were mostly academic. For practical application, we try to identify 130 relation types that comprise 31 predefined event types, which address business and public issues. We use a structured Support Vector Machine, a state-of-the-art classifier, to capture relations. We apply our method to news, blogs and tweets collected from the Internet and discuss the results.

  • Query-by-Sketch Image Retrieval Using Edge Relation Histogram

    Yoshiki KUMAGAI  Gosuke OHASHI  

     
    PAPER-Image Processing and Video Processing

      Vol:
    E96-D No:2
      Page(s):
    340-348

    There has recently been much research on content-based image retrieval (CBIR) that uses image features including color, shape, and texture. In CBIR, feature extraction is important because the retrieval result depends on the image feature. Query-by-sketch image retrieval is one form of CBIR, and it is efficient because users simply have to draw a sketch to retrieve the desired images. In this type of retrieval, selecting the optimum feature extraction method is important because the retrieval result depends on the image feature. We have developed a query-by-sketch image retrieval method that uses an edge relation histogram (ERH) as a global and local feature intended for binary line images. This histogram is based on the patterns of distribution of other line pixels centered on each line pixel that have been obtained by global and local processing. ERH, which is a shift- and scale-invariant feature, focuses on the relation among the edge pixels. It is fairly simple to describe rotation- and symmetry-invariant features, and query-by-sketch image retrieval using ERH makes it possible to perform retrievals that are not affected by position, size, rotation, or mirroring. We applied the proposed method to 20,000 images in the Corel Photo Gallery. Experimental results showed that it was an effective means of retrieving images.

  • New POI Construction with Street-Level Imagery

    Chillo GA  Jeongho LEE  Won Hee LEE  Kiyun YU  

     
    LETTER-Data Engineering, Web Information Systems

      Vol:
    E96-D No:1
      Page(s):
    129-133

    We present a novel point of interest (POI) construction approach based on street-level imagery (SLI) such as Google StreetView. Our method consists of: (1) the creation of a conflation map between an SLI trace and a vector map; (2) the detection of the corresponding buildings between the SLI scene and the conflation map; and (3) POI name extraction from a signboard in the SLI scene by user-interactive text recognition. Finally, a POI is generated through a combination of the POI name and attributes of the building object on a vector map. The proposed method showed recall of 92.99% and precision of 97.10% for real-world POIs.

  • On Improving JPEG Entropy Coding by means of Sub-Stream Extraction

    Youngjin KIM  Hyun Joon SHIN  Jung-Ju CHOI  Youngcheul WEE  

     
    LETTER-Image Processing and Video Processing

      Vol:
    E95-D No:11
      Page(s):
    2737-2740

    We introduce an entropy coding method to enhance the compression efficiency of JPEG. Because run-length coding and early-termination work more effectively for longer zero sequences, we extract ones and negative ones from the coefficients and reduce the magnitude of all coefficients by one. The extracted coefficients are encoded with a designated entropy coding method. The proposed method can transmit images in two parts progressively, where the first contains JPEG-compatible image with a small amount of degradation and the second is used to add fine details. Our method improves the compression ratio by more than 5% without sacrificing the efficiency of JPEG.
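The splitting step described in the abstract above (pull out the coefficients equal to +1 and -1, then shrink every magnitude by one) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the per-zero flag layout of the side stream, and the sample coefficients are all hypothetical, and the paper's actual sub-stream format and entropy coder are not described in the abstract.

```python
def split_substream(coeffs):
    """Reduce every nonzero magnitude by one and record what each zero used to be.

    Assumed side-stream layout (not from the paper): one flag per zero in the
    reduced stream, holding the original value 0, +1, or -1.
    """
    reduced, flags = [], []
    for c in coeffs:
        if c > 0:
            r = c - 1
        elif c < 0:
            r = c + 1
        else:
            r = 0
        reduced.append(r)
        if r == 0:
            flags.append(c)  # c is 0, +1, or -1 here
    return reduced, flags


def merge_substream(reduced, flags):
    """Invert split_substream and recover the original coefficients."""
    flag_iter = iter(flags)
    restored = []
    for r in reduced:
        if r > 0:
            restored.append(r + 1)
        elif r < 0:
            restored.append(r - 1)
        else:
            restored.append(next(flag_iter))
    return restored


coeffs = [5, -1, 0, 1, 0, 0, -3, 1, 0, 0]  # made-up sample values
reduced, flags = split_substream(coeffs)
print(reduced)  # [4, 0, 0, 0, 0, 0, -2, 0, 0, 0]
print(merge_substream(reduced, flags) == coeffs)  # True
```

In this toy input the reduced stream has longer runs of zeros than the original, which is the property the abstract says run-length coding and early termination benefit from.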

  • Topic Extraction for Documents Based on Compressibility Vector

    Nuo ZHANG  Toshinori WATANABE  

     
    PAPER-Artificial Intelligence, Data Mining

      Vol:
    E95-D No:10
      Page(s):
    2438-2446

    Nowadays, a great many e-documents are accessed on the Internet. It would be helpful if those documents could be automatically analyzed and their significant contents extracted. Similarity analysis and topic extraction are widely used as document relation analysis techniques. Most of the proposed methods need processes such as stemming and stop-word removal. In those methods, natural language processing (NLP) technology is necessary, and hence they are dependent on the language features and the dataset. In this study, we propose novel document relation analysis and topic extraction methods based on text compression. Our proposed approaches do not require NLP, and can also automatically evaluate documents. We test our proposal on model documents, URCS, and the Reuters-21578 dataset, for relation analysis and topic extraction. The effectiveness of the proposed methods is shown by the simulations.
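To give a feel for how text compression can measure document relatedness without any NLP preprocessing, here is a short Python sketch of the normalized compression distance (NCD). NCD is a separate, well-known compression-based measure and is not the compressibility vector of the paper, whose construction the abstract does not describe. The sample texts are invented for illustration.

```python
import zlib


def compressed_size(text):
    """Size in bytes of the zlib-compressed UTF-8 encoding of text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def ncd(x, y):
    """Normalized compression distance: small when x and y share structure."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


doc_finance = (
    "Shares rallied on Tuesday as investors welcomed stronger quarterly "
    "earnings, while bond yields slipped and the central bank kept its "
    "benchmark interest rate unchanged for a third straight meeting."
)
doc_cooking = (
    "Whisk two eggs with flour and milk, pour the batter into a hot "
    "buttered pan, flip once golden, and serve the pancakes with fresh "
    "berries and a drizzle of maple syrup."
)

print(ncd(doc_finance, doc_finance) < ncd(doc_finance, doc_cooking))  # True
```

The measure needs no tokenizer, stemmer, or stop-word list, which matches the language independence the abstract claims for compression-based approaches.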

  • Crosstalk Analysis and Measurement Technique for High Frequency Signal Transfer in MEMs Probe Pins

    Duc Long LUONG  Hyeonju BAE  Wansoo NAH  

     
    PAPER

      Vol:
    E95-C No:9
      Page(s):
    1459-1464

    This paper develops a methodology of crosstalk analysis/measurement techniques for the design and fabrication of the MEMS (Micro-ElectroMechanical System) probe card. By introducing more ground pins into the connector pins, the crosstalk characteristics can be enhanced, and a design guide for parameters such as pin size and pitch is proposed to satisfy the given crosstalk limitation of -30 dB for reliable high speed signal transfer. The paper also presents a novel method to characterize scattering parameters of multiport interconnect circuits with a 4-port VNA (Vector Network Analyzer). By employing the re-normalization of scattering matrices with different reference impedances at other ports, data obtained from 4-port configuration measurements can be synthesized to build a full scattering matrix of the DUT (Device-Under-Test, MEMS probe connector pins). In comparison to the conventional 2-port VNA re-normalization method, the proposed technique has two advantages: reduced measuring time, and enhanced accuracy even with open-ended unmeasured ports. Good agreement between the estimated and correct S parameters verifies the validity of the proposed algorithm.

  • Self-Clustering Symmetry Detection

    Bei HE  Guijin WANG  Chenbo SHI  Xuanwu YIN  Bo LIU  Xinggang LIN  

     
    LETTER-Image Recognition, Computer Vision

      Vol:
    E95-D No:9
      Page(s):
    2359-2362

    This paper presents a self-clustering algorithm to detect symmetry in images. We combine correlations of orientations, scales and descriptors as a triple feature vector to evaluate each feature pair while low confidence pairs are regarded as outliers and removed. Additionally, all confident pairs are preserved to extract potential symmetries since one feature point may be shared by different pairs. Further, each feature pair forms one cluster and is merged and split iteratively based on the continuity in the Cartesian and concentration in the polar coordinates. Pseudo symmetric axes and outlier midpoints are eliminated during the process. Experiments demonstrate the robustness and accuracy of our algorithm visually and quantitatively.

  • Pedestrian Detection Using Gradient Local Binary Patterns

    Ning JIANG  Jiu XU  Satoshi GOTO  

     
    PAPER-Coding & Processing

      Vol:
    E95-A No:8
      Page(s):
    1280-1287

    In recent years, local pattern based features have attracted increasing interest in object detection and recognition systems. The Local Binary Pattern (LBP) feature is widely used in texture classification and face detection, but the original definition of LBP is not suitable for human detection. In this paper, we propose a novel feature named gradient local binary patterns (GLBP) for human detection. In this feature, the original 256 local binary patterns are reduced to 56 patterns. These 56 patterns, named uniform patterns, are used for generating a 56-bin histogram. The gradient value of each pixel is used as the weight when computing the values of the 56 histogram bins, whereas the weight is always the same in LBP-based features. Experiments performed on the INRIA dataset show that the proposed GLBP feature is more discriminative than histogram of oriented gradients (HOG), Semantic Local Binary Patterns (S-LBP) and histogram of templates (HOT). In our experiments, the window size is fixed. That means the performance can be improved by boosting methods. The computation of the GLBP feature is also parallelizable, which makes hardware acceleration easy. These factors make the GLBP feature feasible for real-time pedestrian detection.
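The reduction from 256 patterns to 56 can be checked with a few lines of Python. The sketch below assumes that the 56 patterns are the 8-bit codes with exactly two circular 0/1 transitions (that is, the usual uniform patterns without the all-zeros and all-ones codes); the abstract does not state this definition, so treat it as an assumption. The `samples` list of (LBP code, gradient magnitude) pairs is made up to illustrate gradient-weighted voting.

```python
def circular_transitions(pattern, bits=8):
    """Count 0/1 changes between circularly adjacent bits of an LBP code."""
    rotated = (pattern >> 1) | ((pattern & 1) << (bits - 1))
    return bin(pattern ^ rotated).count("1")


# Assumed definition: keep codes with exactly two circular transitions.
two_transition = [p for p in range(256) if circular_transitions(p) == 2]
print(len(two_transition))  # 56

# Map each retained code to a histogram bin.
bin_index = {code: i for i, code in enumerate(two_transition)}

# Hypothetical (LBP code, gradient magnitude) pairs for a few pixels.
samples = [(0b00001111, 2.5), (0b00001111, 1.0), (0b11100000, 4.0), (0b01010101, 9.0)]

histogram = [0.0] * len(two_transition)
for code, gradient in samples:
    if code in bin_index:  # codes outside the 56 patterns are ignored
        histogram[bin_index[code]] += gradient  # gradient acts as the vote weight

print(sum(histogram))  # 7.5
```

A plain LBP histogram would add 1 for every pixel; weighting by gradient magnitude lets strong edges contribute more to the descriptor than flat regions.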

  • Discovery of Predicate-Oriented Relations among Named Entities Extracted from Thai Texts

    Nattapong TONGTEP  Thanaruk THEERAMUNKONG  

     
    PAPER-Artificial Intelligence, Data Mining

      Vol:
    E95-D No:7
      Page(s):
    1932-1946

    Extracting named entities (NEs) and their relations is more difficult in Thai than in other languages due to several Thai-specific characteristics, including no explicit boundaries for words, phrases and sentences; few case markers and modifier clues; high ambiguity in compound words and serial verbs; and flexible word orders. Unlike most previous works, which focused on NE relations of specific actions, such as work_for, live_in, located_in, and kill, this paper proposes a more general type of NE relation, called predicate-oriented relation (PoR), where an extracted action part (verb) is used as a core component to associate related named entities extracted from Thai texts. Lacking a practical parser for the Thai language, we present three types of surface features, i.e. punctuation marks (such as token spaces), entity types and the number of entities, and then apply five alternative commonly used learning schemes to investigate their performance on predicate-oriented relation extraction. The experimental results show that our approach achieves F-measures of 97.76%, 99.19%, 95.00% and 93.50% on four different types of predicate-oriented relation (action-location, location-action, action-person and person-action) in crime-related news documents using a data set of 1,736 entity pairs. The effects of NE extraction techniques, feature sets and class imbalance on the performance of relation extraction are explored.

  • Automatic Road Area Extraction from Printed Maps Based on Linear Feature Detection

    Sebastien CALLIER  Hideo SAITO  

     
    PAPER-Segmentation

      Vol:
    E95-D No:7
      Page(s):
    1758-1765

    Raster maps are widely available in everyday life, and can contain a huge amount of information of any kind using, e.g., labels, pictograms, or color codes. However, it is not an easy task to extract roads from those maps because of those overlapping features. In this paper, we focus on an automated method to extract roads by using linear feature detection to search for seed points having a high probability of belonging to roads. Those linear features are lines of pixels of homogeneous color in each direction around each pixel. The seeds are then expanded before choosing to keep or to discard the extracted element. Because this method is not mainly based on color segmentation, it is also suitable for handwritten maps, for example. The experimental results demonstrate that in most cases our method gives results similar to usual methods without needing any previous data or user input, although it does need some knowledge of the target maps; it also works with handwritten maps drawn following some basic rules, whereas usual methods fail.

  • Bias-Voltage-Dependent Subcircuit Model for Millimeter-Wave CMOS Circuit

    Kosuke KATAYAMA  Mizuki MOTOYOSHI  Kyoya TAKANO  Ryuichi FUJIMOTO  Minoru FUJISHIMA  

     
    PAPER

      Vol:
    E95-C No:6
      Page(s):
    1077-1085

    In this paper, we propose a new method for the bias-dependent parameter extraction of a MOSFET, which covers DC to over 100 GHz. The DC MOSFET model provided by the chip foundry is assumed to be correct, and the core DC characteristics are designed to be asymptotically recovered at low frequencies. This is carried out by representing the corrections required at high frequencies using a bias-dependent Y matrix, assuming that a parasitic nonlinear two-port matrix (Y-wrapper) is connected in parallel with the core MOSFET. The Y-wrapper can also handle the nonreciprocity of the parasitic components, that is, the asymmetry of the Y matrix. The reliability of the Y-wrapper model is confirmed through the simulation and measurement of a one-stage common-source amplifier operating at several bias points. This paper does not discuss nonlinearity.
