IEICE TRANSACTIONS on Information

  • Impact Factor

    0.72

  • Eigenfactor

    0.002

  • article influence

    0.1

  • Cite Score

    1.4

Volume E95-D No.5  (Publication Date:2012/05/01)

    Special Section on Recent Advances in Multimedia Signal Processing Techniques and Applications
  • FOREWORD Open Access

    Haizhou LI  

     
    FOREWORD

      Page(s):
    1181-1181
  • Selected Topics from LVCSR Research for Asian Languages at Tokyo Tech

    Sadaoki FURUI  

     
    PAPER-Speech Processing

      Page(s):
    1182-1194

    This paper presents our recent work in regard to building Large Vocabulary Continuous Speech Recognition (LVCSR) systems for the Thai, Indonesian, and Chinese languages. For Thai, since there is no word boundary in the written form, we have proposed a new method for automatically creating word-like units from a text corpus, and applied topic and speaking style adaptation to the language model to recognize spoken-style utterances. For Indonesian, we have applied proper noun-specific adaptation to acoustic modeling, and rule-based English-to-Indonesian phoneme mapping to solve the problem of large variation in proper noun and English word pronunciation in a spoken-query information retrieval system. In spoken Chinese, long organization names are frequently abbreviated, and abbreviated utterances cannot be recognized if the abbreviations are not included in the dictionary. We have proposed a new method for automatically generating Chinese abbreviations, and by expanding the vocabulary using the generated abbreviations, we have significantly improved the performance of spoken query-based search.

  • Spoken Document Retrieval Leveraging Unsupervised and Supervised Topic Modeling Techniques

    Kuan-Yu CHEN  Hsin-Min WANG  Berlin CHEN  

     
    PAPER-Speech Processing

      Page(s):
    1195-1205

    This paper describes the application of two attractive categories of topic modeling techniques to the problem of spoken document retrieval (SDR), viz. document topic model (DTM) and word topic model (WTM). Apart from using the conventional unsupervised training strategy, we explore a supervised training strategy for estimating these topic models, imagining a scenario in which user query logs along with click-through information of relevant documents can be utilized to build an SDR system. This attempt has the potential to associate relevant documents with queries even if they do not share any of the query words, thereby improving retrieval quality over the baseline system. Likewise, we also study a novel use of pseudo-supervised training to associate relevant documents with queries through a pseudo-feedback procedure. Moreover, in order to lessen SDR performance degradation caused by imperfect speech recognition, we investigate leveraging different levels of index features for topic modeling, including words, syllable-level units, and their combination. We provide a series of experiments conducted on the TDT (TDT-2 and TDT-3) Chinese SDR collections. The empirical results show that the methods deduced from our proposed modeling framework are very effective when compared with a few existing retrieval approaches.

  • Broadcast News Story Segmentation Using Conditional Random Fields and Multimodal Features

    Xiaoxuan WANG  Lei XIE  Mimi LU  Bin MA  Eng Siong CHNG  Haizhou LI  

     
    PAPER-Speech Processing

      Page(s):
    1206-1215

    In this paper, we propose integrating multimodal features using conditional random fields (CRFs) for the segmentation of broadcast news stories. We study story boundary cues from lexical, audio and video modalities, where lexical features consist of lexical similarity, chain strength and overall cohesiveness; acoustic features involve pause duration, pitch, speaker change and audio event type; and visual features contain shot boundaries, anchor faces and news title captions. These features are extracted at a sequence of boundary candidate positions in the broadcast news. A linear-chain CRF is used to label each candidate as boundary or non-boundary based on the multimodal features. Important interlabel relations and contextual feature information are effectively captured by the sequential learning framework of CRFs. Story segmentation experiments show that the CRF approach outperforms other popular classifiers, including decision trees (DTs), Bayesian networks (BNs), naive Bayesian classifiers (NBs), multilayer perceptrons (MLPs), support vector machines (SVMs) and maximum entropy (ME) classifiers.

  • Foreign Language Tutoring in Oral Conversations Using Spoken Dialog Systems

    Sungjin LEE  Hyungjong NOH  Jonghoon LEE  Kyusong LEE  Gary Geunbae LEE  

     
    PAPER-Speech Processing

      Page(s):
    1216-1228

    Although there have been enormous investments in English education all around the world, the style of English instruction has changed little. Considering the shortcomings of the current teaching-learning methodology, we have been investigating advanced computer-assisted language learning (CALL) systems. This paper summarizes a set of POSTECH approaches, including theories, technologies, systems, and field studies, and provides relevant pointers. On top of state-of-the-art spoken dialog system technologies, a variety of adaptations have been applied to overcome problems caused by the numerous errors and variations naturally produced by non-native speakers. Furthermore, a number of methods have been developed for generating educational feedback that helps learners become proficient. Integrating these efforts resulted in intelligent educational robots – Mero and Engkey – and virtual 3D language learning games, Pomy. To verify the effects of our approaches on students' communicative abilities, we have conducted a field study at an elementary school in Korea. The results showed that our CALL approaches can be enjoyable and fruitful activities for students. Although the results of this study bring us a step closer to understanding computer-based education, more studies are needed to consolidate the findings.

  • Selective Gammatone Envelope Feature for Robust Sound Event Recognition

    Yi Ren LENG  Huy Dat TRAN  Norihide KITAOKA  Haizhou LI  

     
    PAPER-Audio Processing

      Page(s):
    1229-1237

    Conventional features for Automatic Speech Recognition and Sound Event Recognition, such as Mel-Frequency Cepstral Coefficients (MFCCs), have been shown to perform poorly in noisy conditions. We introduce an auditory feature based on the gammatone filterbank, the Selective Gammatone Envelope Feature (SGEF), for Robust Sound Event Recognition, where channel selection and the filterbank envelope are used to reduce the effect of noise in specific noise environments. In experiments with Hidden Markov Model (HMM) recognizers, we show that our feature outperforms MFCCs significantly in four different noisy environments at various signal-to-noise ratios.

  • Optimizing a Virtual Re-Convergence System to Reduce Visual Fatigue in Stereoscopic Camera

    Jae Gon KIM  Jun-Dong CHO  

     
    PAPER-Image Processing

      Page(s):
    1238-1247

    In this paper, we propose an optimized virtual re-convergence system designed to reduce the visual fatigue caused by binocular stereoscopy. Our idea for reducing visual fatigue is to apply virtual re-convergence based on an optimized disparity map that contains more depth information in the negative disparity area than in the positive area. Our system therefore facilitates a unique search-range scheme, especially for negative disparity exploration. In addition, we use a dedicated method based on a so-called Global-Shift Value (GSV), which is the total shift value of each image in stereoscopy, to converge on the main object that most affects visual fatigue. The experimental result, which is a subjective assessment by participants, shows that the proposed method makes stereoscopy significantly more comfortable and attractive to view than existing methods.

  • Novel Algorithm for Polar and Spherical Fourier Analysis on Two and Three Dimensional Images

    Zhuo YANG  Sei-ichiro KAMATA  

     
    PAPER-Image Processing

      Page(s):
    1248-1255

    Polar and spherical Fourier analysis can be used to extract rotation-invariant features for image retrieval and pattern recognition tasks. They have been shown to be superior to other methods in describing rotation-invariant features of two- and three-dimensional images. Based on mathematical properties of trigonometric functions and associated Legendre polynomials, fast algorithms are proposed to increase the computation speed for multimedia applications such as real-time systems and large multimedia databases. The symmetric points are computed simultaneously. Inspired by relative prime number theory, a systematic analysis is given in this paper, and a novel algorithm is deduced that provides even faster speed. The proposed method is 9–15% faster than previous work. Experimental results on two- and three-dimensional images are given to illustrate the effectiveness of the proposed method. Multimedia signal processing applications that need real-time polar and spherical Fourier analysis can benefit from this work.

  • Automatic Determination of the Appropriate Number of Clusters for Multispectral Image Data

    Kitti KOONSANIT  Chuleerat JARUSKULCHAI  

     
    PAPER-Image Processing

      Page(s):
    1256-1263

    Nowadays, clustering is a popular tool for exploratory data analysis, with one technique being K-means clustering. Determining the appropriate number of clusters is a significant problem in K-means clustering because the results of the K-means technique depend on the number of clusters chosen. The appropriate number of clusters is often needed in advance as an input parameter to the K-means algorithm, so determining it automatically is desirable. We propose a new method for automatic determination of the appropriate number of clusters using an extended co-occurrence matrix technique called a tri-co-occurrence matrix technique for multispectral imagery in the pre-clustering steps. The proposed method was tested using a dataset with a known number of clusters. The experimental results were compared with ground truth images and evaluated in terms of accuracy, with the numerical result of the tri-co-occurrence providing an accuracy of 84.86%. The results from the tests confirmed the effectiveness of the proposed method in finding the appropriate number of clusters and were compared with the original co-occurrence matrix technique and other algorithms.

  • A Linear Manifold Color Descriptor for Medicine Package Recognition

    Kenjiro SUGIMOTO  Koji INOUE  Yoshimitsu KUROKI  Sei-ichiro KAMATA  

     
    PAPER-Image Processing

      Page(s):
    1264-1271

    This paper presents a color-based method for medicine package recognition, called a linear manifold color descriptor (LMCD). It describes a color distribution (a set of color pixels) of a color package image as a linear manifold (an affine subspace) in the color space, and recognizes an anonymous package by linear manifold matching. Mainly due to the low dimensionality of color spaces, LMCD can provide more compact description and faster computation than description styles based on histograms and dominant colors. This paper also proposes distance-based dissimilarities for linear manifold matching. Specially designed for color distribution matching, the proposed dissimilarities are theoretically more appropriate than J-divergence and canonical angles. Experiments on medicine package recognition validate that LMCD outperforms competitors including MPEG-7 color descriptors in terms of description size, computational cost and recognition rate.

  • A Correlation-Based Watermarking Technique of 3-D Meshes via Cyclic Signal Processing

    Toshiyuki UTO  Yuka TAKEMURA  Hidekazu KAMITANI  Kenji OHUE  

     
    PAPER-Image Processing

      Page(s):
    1272-1279

    This paper describes a blind watermarking scheme through cyclic signal processing. With the spread of high-speed networks, there is a growing demand for copyright protection for multimedia data. For efficient watermarking of images, there are two major approaches: a quantization-based method and a correlation-based method. In this paper, we propose a correlation-based watermarking technique for three-dimensional (3-D) polygonal models using fast Fourier transforms (FFTs). To generate a watermark with desirable properties, similar to a pseudonoise signal, an impulse signal on a two-dimensional (2-D) space is spread through the FFT, multiplication by a complex sinusoid signal, and the inverse FFT. This watermark, i.e., the spread impulse signal, in a transform domain is converted to a spatial domain by an inverse wavelet transform, and embedded into 3-D data aligned by principal component analysis (PCA). In the detection procedure, after realigning the watermarked mesh model through PCA, we map the 3-D data onto the 2-D space via block segmentation and an averaging operation. The 2-D data are processed by the inverse system, i.e., the FFT, division by the complex sinusoid signal, and the inverse FFT. From the resulting 2-D signal, we detect the position of the maximum value as a signature. For 3-D bunny models, detection rates and information capacity are shown to evaluate the performance of the proposed method.

  • Efficiently Finding Individuals from Video Dataset

    Pengyi HAO  Sei-ichiro KAMATA  

     
    PAPER-Video Processing

      Page(s):
    1280-1287

    We are interested in retrieving video shots or videos containing particular people from a video dataset. Owing to the large variations in pose, illumination conditions, occlusions, hairstyles and facial expressions, face tracks have recently been researched in the fields of face recognition, face retrieval and name labeling from videos. However, when the number of face tracks is very large, conventional methods, which match all or some pairs of faces in face tracks, will not be effective. Therefore, in this paper, an efficient method for finding a given person from a video dataset is presented. In our study, in addition to performing research on face tracks in a single video, we also consider how to organize all the faces in videos in a dataset and how to improve the search quality in the query process. Different videos may include the same person; thus, the management of individuals in different videos will be useful for their retrieval. The proposed method includes the following three points. (i) Face tracks of the same person appearing for a period in each video are first connected on the basis of scene information with a time constraint, then all the people in one video are organized by a proposed hierarchical clustering method. (ii) After obtaining the organizational structure of all the people in one video, the people are organized into an upper layer by affinity propagation. (iii) Finally, in the process of querying, a remeasuring method based on the index structure of videos is performed to improve the retrieval accuracy. We also build a video dataset that contains six types of videos: films, TV shows, educational videos, interviews, press conferences and domestic activities. The formation of face tracks in the six types of videos is first researched, then experiments are performed on this video dataset containing more than 1 million faces and 218,786 face tracks. The results show that the proposed approach has high search quality and a short search time.

  • Efficient Tracking of News Topics Based on Chronological Semantic Structures in a Large-Scale News Video Archive

    Ichiro IDE  Tomoyoshi KINOSHITA  Tomokazu TAKAHASHI  Hiroshi MO  Norio KATAYAMA  Shin'ichi SATOH  Hiroshi MURASE  

     
    PAPER-Video Processing

      Page(s):
    1288-1300

    Recent advances in digital storage technology have enabled us to archive a large volume of video data. Thanks to this trend, we have archived more than 1,800 hours of video data from a daily Japanese news show in the last ten years. For the effective use of such a large news video archive, we consider analysis of its chronological and semantic structure to be important. We also consider that showing users how news topics develop helps their understanding of current affairs more than providing a list of relevant news stories, as most current news video retrieval systems do. Therefore, in this paper, we propose a structuring method for a news video archive, together with an interface that visualizes the structure, so that users can efficiently track the development of news topics according to their interest. The proposed news video structure, namely the “topic thread structure”, is obtained as a result of an analysis of the chronological and semantic relation between news stories. Meanwhile, the proposed interface, namely “mediaWalker II”, allows users to track the development of news topics along the topic thread structure, and at the same time watch the video footage corresponding to each news story. Analyses of the topic thread structures obtained by applying the proposed method to actual news video footage revealed interesting and comprehensible relations between news topics in the real world. At the same time, analyses of their size quantified the efficiency of tracking a user's topic-of-interest based on the proposed topic thread structure. We consider this a first step towards facilitating video authoring by users based on existing contents in a large-scale news video archive.

  • Layered Multicast Encryption of Motion JPEG2000 Code Streams for Flexible Access Control

    Takayuki NAKACHI  Kan TOYOSHIMA  Yoshihide TONOMURA  Tatsuya FUJII  

     
    PAPER-Video Processing

      Page(s):
    1301-1312

    In this paper, we propose a layered multicast encryption scheme that provides flexible access control to motion JPEG2000 code streams. JPEG2000 generates layered code streams and offers flexible scalability in characteristics such as resolution and SNR. The layered multicast encryption proposal allows a sender to multicast the encrypted JPEG2000 code streams such that only designated groups of users can decrypt the layered code streams. While keeping the layering functionality, the proposed method offers useful properties such as 1) video quality control using only one private key, 2) guaranteed security, and 3) low computational complexity comparable to conventional non-layered encryption. Simulation results show the usefulness of the proposed method.
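
    The abstract does not say how a single key can control access to several quality layers. One common construction with that property is a one-way hash chain, in which the key of each lower layer is the hash of the key of the layer above it. The sketch below assumes that construction purely for illustration; the function names, the use of SHA-256 and the four-layer setup are hypothetical and are not taken from the paper.

```python
import hashlib


def derive_layer_keys(top_key: bytes, num_layers: int) -> list:
    """Return keys[0..num_layers-1], where keys[i] = SHA-256(keys[i + 1]).

    Assumption for illustration: a one-way hash chain, so the key of a
    higher layer yields every lower-layer key but not the reverse.
    """
    keys = [b""] * num_layers
    keys[-1] = top_key
    for i in range(num_layers - 2, -1, -1):
        keys[i] = hashlib.sha256(keys[i + 1]).digest()
    return keys


def keys_available_to(user_key: bytes, user_layer: int) -> list:
    """Keys a user can compute when given only the key of `user_layer`."""
    return derive_layer_keys(user_key, user_layer + 1)


# Hypothetical example: four layers, sender holds the top-layer key.
ALL_KEYS = derive_layer_keys(b"sender-top-layer-key", 4)
# A user entitled to layers 0..2 receives only ALL_KEYS[2].
USER_KEYS = keys_available_to(ALL_KEYS[2], 2)
```

    Under this assumption a user holding the layer-2 key can decrypt layers 0 to 2, while recovering the layer-3 key would require inverting the hash.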

  • Low-Complexity Coarse-Level Mode-Mapping Based H.264/AVC to H.264/SVC Spatial Transcoding for Video Conferencing

    Lei SUN  Jie LENG  Jia SU  Yiqing HUANG  Hiroomi MOTOHASHI  Takeshi IKENAGA  

     
    PAPER-Video Processing

      Page(s):
    1313-1323

    Scalable Video Coding (SVC) was standardized as an extension of H.264/AVC with the intention to provide flexible adaptation to heterogeneous networks and different end-user requirements, which provides great scalability in multi-point applications such as video conferencing. However, due to the existence of H.264/AVC-based systems, transcoding between AVC and SVC becomes necessary. Most existing works focus on temporal transcoding, quality transcoding or SVC-to-AVC spatial transcoding, while the straightforward re-encoding method requires high computational cost. This paper proposes a low-complexity AVC-to-SVC spatial transcoder based on coarse-level mode mapping for video conferencing scenes. First, to omit unnecessary motion estimations (ME) for layers with reduced resolution, an ME skipping scheme based on AVC mode distribution is proposed with an adaptive search range. Then a probability-profile based scheme is proposed for further mode skipping. After that, three coarse-level mode-mapping methods are presented for fast mode decision, and the adaptive usage of the three methods is discussed. Finally, motion vector (MV) refinement is introduced for further lower-layer time reduction. As for the top layer, direct encapsulation is proposed to preserve better quality, and another scheme involving inter-layer predictions is also provided for bandwidth-crucial applications. Simulation results show that the proposed transcoder achieves up to 92.6% time reduction without significant coding efficiency loss compared to the re-encoding method.

  • An Immersive VR System for Sports Education

    Peng SONG  Shuhong XU  Wee Teck FONG  Ching Ling CHIN  Gim Guan CHUA  Zhiyong HUANG  

     
    PAPER-Signal Processing

      Page(s):
    1324-1331

    The development of new technologies has undoubtedly promoted the advance of modern education, and Virtual Reality (VR) technologies in particular have made education more visually accessible for students. However, classroom education has been the focus of VR applications, whereas not much research has been done on promoting sports education using VR technologies. In this paper, an immersive VR system is designed and implemented to create a more intuitive and visual way of teaching tennis. A scalable system architecture is proposed in addition to the hardware setup layout, which can be used for various immersive interactive applications such as architecture walkthroughs, military training simulations, other sports game simulations, interactive theaters, and telepresent exhibitions. Realistic interaction experience is achieved through accurate and robust hybrid tracking technology, while the virtual human opponent is animated in real time using shader-based skin deformation. Potential future extensions are also discussed to improve the teaching/learning experience.

  • Registration Method of Sparse Representation Classification Method

    Jing WANG  Guangda SU  

     
    LETTER-Image Processing

      Page(s):
    1332-1335

    Sparse representation based classification (SRC) has emerged as a new paradigm for solving face recognition problems. Further research found that the main limitation of SRC is the assumption of pixel-accurate alignment between the test image and the training set. A. Wagner used a series of linear programs that iteratively minimize the sparsity of the registration error. In this paper, we propose another face registration method, called the three-point positioning method. Experiments show that our proposed method achieves better performance.

  • Geometrical Positioning Schemes Based on Hybrid Lines of Position

    Chien-Sheng CHEN  Jium-Ming LIN  Wen-Hsiung LIU  Ching-Lung CHI  

     
    LETTER-Signal Processing

      Page(s):
    1336-1340

    To measure the mobile station (MS) location more accurately, several kinds of measurements can be integrated. In this paper, we propose several simpler methods that use the time of arrival (TOA) at three base stations (BSs) and the angle of arrival (AOA) at the serving BS to estimate the MS location in non-line-of-sight (NLOS) environments. From a geometric viewpoint, each TOA value measured at a BS defines a circle. Rather than applying nonlinear circular lines of position (LOP), the proposed methods use linear LOP, which makes determining the MS location much easier. Numerical results demonstrate that the calculation time with linear LOP is much less than with circular LOP. Although the location precision with linear LOP is reduced slightly, the proposed methods can still provide a precise MS location while greatly reducing the computational effort. In addition, the proposed methods can mitigate the NLOS effect with little effort, simply by applying the weighted sum of the intersections between the different linear LOP and the AOA line, without requiring a priori knowledge of NLOS error statistics. Simulation results show that the proposed methods can always yield superior performance in comparison with the Taylor series algorithm (TSA) and the hybrid lines of position algorithm (HLOP).
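
    The step from circular to linear LOP can be illustrated with a small sketch. Subtracting the equations of two TOA circles cancels the quadratic terms and leaves a straight line through their intersection points; two such lines intersect at the position estimate. The code below covers only this geometric step with noise-free ranges. It does not include the AOA line, the weighted sum of intersections or any NLOS handling described in the abstract, and the base-station layout and function names are hypothetical.

```python
import math


def linear_lop(bs_i, r_i, bs_j, r_j):
    """Line a*x + b*y = c obtained by subtracting circle j from circle i."""
    (xi, yi), (xj, yj) = bs_i, bs_j
    a = 2.0 * (xj - xi)
    b = 2.0 * (yj - yi)
    c = r_i ** 2 - r_j ** 2 + (xj ** 2 + yj ** 2) - (xi ** 2 + yi ** 2)
    return a, b, c


def intersect(line1, line2):
    """Intersection of two lines given as (a, b, c), using Cramer's rule."""
    a1, b1, c1 = line1
    a2, b2, c2 = line2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("lines are parallel (collinear base stations)")
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det


def locate(base_stations, ranges):
    """Estimate the MS position from three TOA ranges via two linear LOP."""
    l01 = linear_lop(base_stations[0], ranges[0], base_stations[1], ranges[1])
    l02 = linear_lop(base_stations[0], ranges[0], base_stations[2], ranges[2])
    return intersect(l01, l02)


# Hypothetical layout in metres; ranges are exact (no noise, no NLOS bias).
BSS = [(0.0, 0.0), (1000.0, 0.0), (0.0, 1000.0)]
TRUE_MS = (300.0, 400.0)
RANGES = [math.dist(TRUE_MS, bs) for bs in BSS]
ESTIMATE = locate(BSS, RANGES)
```

    With exact ranges the two lines meet at the true position. With NLOS-biased ranges the three pairwise lines no longer share one point, which is the situation the weighted combination in the abstract is meant to address.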

  • Special Section on Formal Approach
  • FOREWORD Open Access

    Shoji YUEN  

     
    FOREWORD

      Page(s):
    1341-1341
  • Formal Verification of Effectiveness of Control Activities in Business Processes

    Yasuhito ARIMOTO  Shusaku IIDA  Kokichi FUTATSUGI  

     
    PAPER-Formal Methods

      Page(s):
    1342-1354

    It has been an important issue to deal with risks in business processes for achieving companies' goals. This paper introduces a method for applying a formal method to the analysis of risks and control activities in business processes, in order to evaluate control activities consistently and exhaustively and to make scientific discussion of the evaluation results possible. We focus on document flows in business activities and on control activities and risks related to documents, because documents play important roles in business. In our method, document flows including control activities are modeled, and the OTS/CafeOBJ method is used to verify that risks of document falsification are avoided by the control activities in the model. The verification is done through interaction between humans and the CafeOBJ system with theorem proving, and it raises the potential to discuss the result scientifically because the interaction gives us rigorous reasons why the result is derived from the verification.

  • Efficient Multi-Valued Bounded Model Checking for LTL over Quasi-Boolean Algebras

    Jefferson O. ANDRADE  Yukiyoshi KAMEYAMA  

     
    PAPER-Model Checking

      Page(s):
    1355-1364

    Multi-valued Model Checking extends classical, two-valued model checking to multi-valued logic such as Quasi-Boolean logic. The added expressivity is useful in dealing with such concepts as incompleteness and uncertainty in target systems, while it comes with the cost of time and space. Chechik and others proposed an efficient reduction from multi-valued model checking problems to two-valued ones, but to the authors' knowledge, no study was done for multi-valued bounded model checking. In this paper, we propose a novel, efficient algorithm for multi-valued bounded model checking. A notable feature of our algorithm is that it is not based on reduction of multi-values into two-values; instead, it generates a single formula which represents multi-valuedness by a suitable encoding, and asks a standard SAT solver to check its satisfiability. Our experimental results show a significant improvement in the number of variables and clauses and also in execution time compared with the reduction-based one.
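
    As a small illustration of the kind of logic involved, a quasi-Boolean (De Morgan) algebra is a distributive lattice with an involutive negation that satisfies the De Morgan laws but need not satisfy the law of the excluded middle. The sketch below builds one standard four-valued example, the product lattice 2 × 2 with a negation that swaps and complements the components, and checks these properties by enumeration. It is not the encoding or algorithm of the paper, and all names are chosen for illustration.

```python
from itertools import product

# Four truth values as pairs of booleans, ordered component-wise.
VALUES = list(product((False, True), repeat=2))
TOP = (True, True)
BOTTOM = (False, False)


def meet(x, y):
    """Greatest lower bound (conjunction), taken component-wise."""
    return (x[0] and y[0], x[1] and y[1])


def join(x, y):
    """Least upper bound (disjunction), taken component-wise."""
    return (x[0] or y[0], x[1] or y[1])


def neg(x):
    """Quasi-Boolean negation: swap the components and complement each."""
    return (not x[1], not x[0])


# Check involution and both De Morgan laws over all values.
INVOLUTIVE = all(neg(neg(x)) == x for x in VALUES)
DE_MORGAN = all(
    neg(meet(x, y)) == join(neg(x), neg(y))
    and neg(join(x, y)) == meet(neg(x), neg(y))
    for x in VALUES
    for y in VALUES
)
# Values for which "x or not x" fails to reach TOP.
EXCLUDED_MIDDLE_FAILS = [x for x in VALUES if join(x, neg(x)) != TOP]
```

    The two intermediate values are fixed points of the negation, so the excluded middle fails for them; this is what distinguishes such algebras from Boolean ones.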

  • Decidability of the Security against Inference Attacks Using a Functional Dependency on XML Databases

    Kenji HASHIMOTO  Hiroto KAWAI  Yasunori ISHIHARA  Toru FUJIWARA  

     
    PAPER-Database Security

      Page(s):
    1365-1374

    This paper discusses verification of the security against inference attacks on XML databases in the presence of a functional dependency. So far, we have provided a verification method for k-secrecy, which is a metric for the security against inference attacks on databases. Intuitively, k-secrecy means that the number of candidates of sensitive data (i.e., the result of an unauthorized query) of a given database instance cannot be narrowed down to k-1 by using available information such as authorized queries and their results. In this paper, we consider a functional dependency on database instances as part of the available information. Functional dependencies help attackers to reduce the number of candidates for the sensitive information. The verification method we have provided cannot be naively extended to the k-secrecy problem with a functional dependency. The method requires that the candidate set can be captured by a tree automaton, but the candidate set when a functional dependency is considered cannot always be captured by a tree automaton. We show that the ∞-secrecy problem in the presence of a functional dependency is decidable when a given unauthorized query is represented by a deterministic top-down tree transducer, without explicitly computing the candidate set.

  • Refactoring Problem of Acyclic Extended Free-Choice Workflow Nets to Acyclic Well-Structured Workflow Nets

    Shingo YAMAGUCHI  

     
    LETTER-Formal Methods

      Page(s):
    1375-1379

    A workflow net (WF-net for short) is a Petri net which represents a workflow. There are two important subclasses of WF-nets: extended free-choice (EFC for short) and well-structured (WS for short). It is known that most actual workflows can be modeled as EFC WF-nets; acyclic WS is a subclass of acyclic EFC but has more analysis methods. An acyclic EFC WF-net may be transformed to an acyclic WS WF-net without changing the external behavior of the net. We name such a transformation acyclic EFC WF-net refactoring. We give a formal definition of the acyclic EFC WF-net refactoring problem. We also give a necessary condition and a sufficient condition for solving the problem. Those conditions can be checked in polynomial time. These results enhance the analysis power of acyclic EFC WF-nets.

  • Stochastic Power Minimization of Real-Time Tasks with Probabilistic Computations under Discrete Clock Frequencies

    Hyung Goo PAEK  Jeong Mo YEO  Kyong Hoon KIM  Wan Yeon LEE  

     
    LETTER-System Analysis

      Page(s):
    1380-1383

    The proposed scheduling scheme minimizes the mean power consumption of real-time tasks with probabilistic computation amounts while meeting their deadlines. Our study formally solves the minimization problem under finitely discrete clock frequencies with irregular power consumptions, whereas state-of-the-art studies did so under infinitely continuous clock frequencies with regular power consumptions.

  • Regular Section
  • A Survey on Mining Software Repositories Open Access

    Woosung JUNG  Eunjoo LEE  Chisu WU  

     
    SURVEY PAPER-Software Engineering

      Page(s):
    1384-1406

    This paper presents fundamental concepts, the overall process and recent research issues of Mining Software Repositories (MSR). Data sources (such as source control systems, bug tracking systems and archived communications), data types, and techniques used for general MSR problems are also presented. Finally, evaluation approaches, opportunities and challenges are given.

  • Stable Adaptive Work-Stealing for Concurrent Many-Core Runtime Systems

    Yangjie CAO  Hongyang SUN  Depei QIAN  Weiguo WU  

     
    PAPER-Fundamentals of Information Systems

      Page(s):
    1407-1416

    The proliferation of many-core architectures has led to the explosive development of parallel applications using programming models such as OpenMP, TBB, and Cilk/Cilk++. With an increasing number of cores, however, it becomes even harder to efficiently schedule parallel applications on these resources, since current many-core runtime systems still lack effective mechanisms to support collaborative scheduling of these applications. In this paper, we study feedback-driven adaptive scheduling based on work stealing, which provides an efficient solution for concurrently executing a set of applications on many-core systems. To dynamically estimate the number of cores desired by each application, a stable feedback-driven adaptive algorithm, called SAWS, is proposed using active workers and the length of active deques, which capture the runtime characteristics of the applications well. Furthermore, a prototype system is built by extending the Cilk runtime system, and the experimental results, which are obtained on a Sun Fire server, show that SAWS has more advantages for scheduling concurrent parallel applications. Specifically, compared with the existing algorithms A-Steal and WS-EQUI, SAWS improves performance by up to 12.43% and 21.32%, respectively, with respect to mean response time, and 25.78% and 46.98%, respectively, with respect to processor utilization.
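
    For readers unfamiliar with work stealing, the basic mechanism can be sketched as follows: each worker owns a double-ended queue, takes its own work from one end, and an idle worker steals from the opposite end of another worker's deque. The single-threaded simulation below shows only this mechanism. It does not implement SAWS or its feedback-driven core estimation; the class and function names are hypothetical, and choosing the victim with the longest deque is a simplification made for determinism (real runtimes typically pick victims at random).

```python
from collections import deque


class Worker:
    """A worker with its own deque of pending tasks."""

    def __init__(self, name):
        self.name = name
        self.tasks = deque()
        self.done = []

    def push(self, task):
        self.tasks.append(task)  # owner pushes at the bottom

    def pop_local(self):
        return self.tasks.pop() if self.tasks else None  # owner pops bottom

    def steal_from(self, victim):
        # a thief takes the oldest task from the top of the victim's deque
        return victim.tasks.popleft() if victim.tasks else None


def run(workers):
    """Round-robin simulation; returns the number of successful steals."""
    steals = 0
    while any(w.tasks for w in workers):
        for w in workers:
            task = w.pop_local()
            if task is None:
                # simplification: steal from the worker with the longest deque
                victim = max(workers, key=lambda v: len(v.tasks))
                task = w.steal_from(victim)
                if task is not None:
                    steals += 1
            if task is not None:
                w.done.append(task)
    return steals


# Hypothetical workload: worker A starts with six tasks, worker B with none.
A, B = Worker("A"), Worker("B")
for t in range(6):
    A.push(t)
STEALS = run([A, B])
```

    In this run the idle worker ends up executing half of the tasks, all of them obtained by stealing from the opposite end of the busy worker's deque.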

  • A Power-Saving Technique for the OSGi Platform

    Kuo-Yi CHEN  Chin-Yang LIN  Tien-Yan MA  Ting-Wei HOU  

     
    PAPER-Software System

      Page(s):
    1417-1426

    With more digital home appliances and network devices having OSGi as the software management platform, the power-saving capability of the OSGi platform has become a critical issue. This paper is aimed at improving the power-efficiency of the OSGi platform, i.e. reducing the energy consumption with minimum performance degradation. The key to this study is an efficient power-saving technique which exploits the runtime information already available in a Java virtual machine (JVM), the base software of the OSGi platform, to best determine the timing of performing DVFS (Dynamic Voltage and Frequency Scaling). This, technically, involves a phase detection scheme that identifies the memory phase of the OSGi-enabled device/server in a correct and almost effortless way. The overhead of the power-saving procedure is thus minimized, and the system performance is well maintained. We have implemented and evaluated the proposed power-saving approach on an OSGi server, where the Apache Felix OSGi implementation and the DaCapo benchmarks were applied. The results show that this approach can achieve real power-efficiency for the OSGi platform, in which the power consumption is significantly reduced and the performance remains highly competitive, compared with the other power-saving techniques.

  • PrefixSummary: A Directory Structure for Selective Probing on Wireless Stream of Heterogeneous XML Data

    Chang-Sup PARK  Jun Pyo PARK  Yon Dohn CHUNG  

     
    PAPER-Data Engineering, Web Information Systems

      Page(s):
    1427-1435

    Wireless broadcasting of heterogeneous XML data has become popular in many applications, where energy-efficient processing of user queries at the mobile client is a critical issue. This paper proposes a new index structure for a wireless stream of heterogeneous XML data to enhance tuning time performance in processing path queries on the stream. The index, called PrefixSummary, stores for each location path in the XML data the address of the bucket in the stream that contains the first XML node in the stream satisfying that location path. We present algorithms to generate a broadcast stream with the proposed index and to process a path query on the stream efficiently by exploiting the index. We also suggest a replication scheme for PrefixSummary within a broadcast cycle to reduce latency in query processing. By analysis and experiment, we show that the proposed PrefixSummary approach can significantly reduce tuning time for processing path queries, while also achieving reasonable access time performance by means of replication of the index over the broadcast stream.

  • Logarithmic Adaptive Quantization Projection for Audio Watermarking

    Xuemin ZHAO  Yuhong GUO  Jian LIU  Yonghong YAN  Qiang FU  

     
    PAPER-Information Network

      Page(s):
    1436-1445

    In this paper, a logarithmic adaptive quantization projection (LAQP) algorithm for digital watermarking is proposed. Conventional quantization index modulation uses a fixed quantization step in the watermark embedding procedure, which leads to poor fidelity. Moreover, the conventional methods are sensitive to the value-metric scaling attack. The LAQP method combines the quantization projection scheme with a perceptual model. In comparison to some conventional quantization methods with a perceptual model, LAQP only needs to calculate the perceptual model in the embedding procedure, avoiding the decoding errors introduced by differences between the perceptual models used in the embedding and decoding procedures. Experimental results show that the proposed watermarking scheme achieves better fidelity and is robust against common signal processing attacks. More importantly, the proposed scheme is invariant to the value-metric scaling attack.
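
    The sensitivity of fixed-step quantization index modulation to amplitude scaling, noted in the abstract above, can be illustrated with a minimal sketch. This shows only the conventional scalar baseline, not the LAQP algorithm itself; the function names `qim_embed` and `qim_decode` and the numeric values are assumptions chosen for illustration.

```python
def qim_embed(x, bit, step):
    """Quantize x onto the lattice for `bit`: multiples of `step`,
    shifted by step/2 when bit == 1 (illustrative scalar QIM)."""
    offset = bit * step / 2.0
    return round((x - offset) / step) * step + offset


def qim_decode(y, step):
    """Return the bit whose lattice lies closest to the received value y."""
    d0 = abs(y - qim_embed(y, 0, step))
    d1 = abs(y - qim_embed(y, 1, step))
    return 0 if d0 <= d1 else 1


step = 1.0                         # fixed quantization step (assumed value)
marked = qim_embed(3.3, 1, step)   # embed bit 1 into a host sample
clean_bit = qim_decode(marked, step)
scaled_bit = qim_decode(marked * 1.4, step)  # amplitude scaling by 1.4
print(clean_bit, scaled_bit)
```

    Without scaling the embedded bit is recovered, whereas after the sample is multiplied by 1.4 the decoder, which still assumes the original step, picks the wrong lattice.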

  • Caching-Based Multi-Swarm Collaboration for Improving Content Availability in BitTorrent

    HyunYong LEE  Masahiro YOSHIDA  Akihiro NAKAO  

     
    PAPER-Information Network

      Page(s):
    1446-1453

    Despite its great success, BitTorrent suffers from the content unavailability problem, in which peers cannot complete their content downloads because some chunks are missing; this is caused by a shortage of seeders who hold the content in its entirety. The multi-swarm collaboration approach is a natural choice for improving content availability, since content unavailability cannot be overcome easily by one swarm. Most existing multi-swarm collaboration approaches, however, suffer from content-related limitations, which limit their application scopes. In this paper, we introduce a new kind of multi-swarm collaboration that utilizes a swarm as temporary storage. In a nutshell, the collaborating swarms cache each other's chunks that are likely to become unavailable before the content unavailability happens, and share the cached chunks when it happens. Our approach enables any swarms to collaborate with each other without the content-related limitations. Simulation results show that our approach increases the number of download completions by over 50% (26%) compared to normal BitTorrent (the existing bundling approach) with low overhead. In addition, our approach improves download completion time by around 30% compared to the existing bundling approach. The results also show that peers participating in our approach enjoy better performance than other peers, which can serve as a peer incentive.

  • Two-Microphone Noise Reduction Using Spatial Information-Based Spectral Amplitude Estimation

    Kai LI  Yanmeng GUO  Qiang FU  Junfeng LI  Yonghong YAN  

     
    PAPER-Speech and Hearing

      Page(s):
    1454-1464

    Traditional two-microphone noise reduction algorithms that deal with highly nonstationary directional noises generally use the direction of arrival or phase difference information. The performance of these algorithms deteriorates when diffuse noises coexist with nonstationary directional noises in realistic adverse environments. In this paper, we present a two-channel noise reduction algorithm using a spatial information-based speech estimator and a spatial-information-controlled soft-decision noise estimator to improve the noise reduction performance in realistic nonstationary noisy environments. A target presence probability estimator based on Bayes' rule, using both phase difference and magnitude squared coherence, is proposed for soft-decision noise estimation, so that the two cues can share complementary advantages when both directional noises and diffuse noises are present. The performance of the proposed two-microphone noise reduction algorithm is evaluated by noise reduction, log-spectral distance (LSD), and word recognition rate (WRR) of a distant-talking ASR system in a real room's noisy environment. Experimental results show that the proposed algorithm achieves better noise suppression than the comparative dual-channel noise reduction algorithms without further distorting the desired signal components.

  • Model Shrinkage for Discriminative Language Models

    Takanobu OBA  Takaaki HORI  Atsushi NAKAMURA  Akinori ITO  

     
    PAPER-Speech and Hearing

      Page(s):
    1465-1474

    This paper describes a technique for overcoming the model shrinkage problem in automatic speech recognition (ASR), which allows application developers and users to control the model size with less degradation of accuracy. Recently, models for ASR systems have tended to be large, and this can constitute a bottleneck for developers and users without special knowledge of ASR when they introduce the ASR function. Specifically, although discriminative language models (DLMs) have gained increasing attention as an approach for improving recognition accuracy, they are usually designed in a high-dimensional parameter space. Our proposed method can be applied to linear models including DLMs, in which the score of an input sample is given by the inner product of its features and the model parameters, and it can shrink models with simple computation by obtaining simple statistics, namely the square sums of feature values appearing in a data set. Our experimental results show that our proposed method can shrink a DLM with little degradation in accuracy and performs properly whether or not the data for obtaining the statistics are the same as the data for training the model.

  • Implementation and Optimization of Image Processing Algorithms on Embedded GPU

    Nitin SINGHAL  Jin Woo YOO  Ho Yeol CHOI  In Kyu PARK  

     
    PAPER-Image Processing and Video Processing

      Page(s):
    1475-1484

    In this paper, we analyze the key factors underlying the implementation, evaluation, and optimization of image processing and computer vision algorithms on an embedded GPU using the OpenGL ES 2.0 shader model. First, we present the characteristics of the embedded GPU and its inherent advantages when compared to an embedded CPU. Additionally, we propose techniques to achieve increased performance with optimized shader design. To show the effectiveness of the proposed techniques, we employ cartoon-style non-photorealistic rendering (NPR), speeded-up robust feature (SURF) detection, and stereo matching as our example algorithms. Performance is evaluated in terms of the execution time and the speed-up achieved in comparison with the implementation on the embedded CPU.

  • A Comparative Study of Rotation Angle Estimation Methods Based on Complex Moments

    Jong-Min LEE  Whoi-Yul KIM  

     
    PAPER-Image Recognition, Computer Vision

      Page(s):
    1485-1493

    Determining the rotation angle between two images is essential when comparing images that may include rotational variation. While there are three representative methods that utilize the phases of Zernike moments (ZMs) to estimate rotation angles, very little work has been done to compare the performances of these methods. In this paper, we compare the performances of these three methods and propose a new, angular radial transform (ART)-based method. Our method extends Revaud et al.'s method [1] and uses the phase of angular radial transform coefficients instead of ZMs. We show that our proposed method outperforms the ZM-based method using the MPEG-7 shape dataset when computation times are compared or in terms of the root mean square error vs. coverage.

  • Image Description with Local Patterns: An Application to Face Recognition

    Wei ZHOU  Alireza AHRARY  Sei-ichiro KAMATA  

     
    PAPER-Image Recognition, Computer Vision

      Page(s):
    1494-1505

    In this paper, we propose a novel approach for representing the local features of a digital image using 1D Local Patterns by Multi-Scans (1DLPMS). We also consider extensions and simplifications of the proposed approach for facial image analysis. The proposed approach consists of three steps. In the first step, the gray values of pixels in the image are represented as a vector giving the local neighborhood intensity distributions of the pixels. Then, multi-scans are applied to capture different spatial information in the image, with the advantage of less computation than other traditional approaches, such as Local Binary Patterns (LBP). The second step is encoding the local features based on different encoding rules using 1D local patterns. This transformation is expected to be less sensitive to illumination variations while preserving the appearance of images embedded in the original gray scale. In the final step, Grouped 1D Local Patterns by Multi-Scans (G1DLPMS) is applied to make the proposed approach computationally simpler and easy to extend. Next, we further formulate a boosted algorithm to extract the most discriminant local features. The evaluation results demonstrate that the proposed approach outperforms the conventional approaches in terms of accuracy in applications of face recognition, gender estimation, and facial expression.

  • Self Evolving Modular Network

    Kazuhiro TOKUNAGA  Nobuyuki KAWABATA  Tetsuo FURUKAWA  

     
    PAPER-Biocybernetics, Neurocomputing

      Page(s):
    1506-1518

    We propose a novel modular network called the Self-Evolving Modular Network (SEEM). The SEEM has a modular network architecture with a graph structure and the following advantages: (1) new modules are added incrementally to allow the network to adapt in a self-organizing manner, and (2) the graph's paths are formed based on the relationships between the models represented by the modules. The SEEM is expected to be applicable to evolving the functions of an autonomous robot in a self-organizing manner through interaction with the robot's environment, and to categorizing large-scale information. This paper presents the architecture and an algorithm for the SEEM. Moreover, the performance characteristics and effectiveness of the network are shown by simulations using cubic functions and a set of 3D objects.

  • A 1-Cycle 1.25 GHz Bufferless Router for 3D Network-on-Chip

    Chaochao FENG  Zhonghai LU  Axel JANTSCH  Minxuan ZHANG  

     
    LETTER-Computer System

      Page(s):
    1519-1522

    In this paper, we propose a 1-cycle high-performance 3D bufferless router with a 3-stage permutation network. The proposed router utilizes the 3-stage permutation network instead of the serialized switch allocator and 7×7 crossbar to achieve a frequency of 1.25 GHz in TSMC 65 nm technology. Compared with the other two 3D bufferless routers, the proposed router occupies less area and consumes less power. Simulation results under both synthetic and application workloads illustrate that the proposed router achieves lower average packet latency than the other two 3D bufferless routers.

  • Spectral Magnitude Adjustment for MCLT-Based Acoustic Data Transmission

    Hwan Sik YUN  Kiho CHO  Nam Soo KIM  

     
    LETTER-Information Network

      Page(s):
    1523-1526

    Acoustic data transmission is a technique which embeds data in a sound wave imperceptibly and detects it at a receiver. The data are embedded in an original audio signal and transmitted through the air by playing back the data-embedded audio using a loudspeaker. At the receiver, the data are extracted from the received audio signal captured by a microphone. In our previous work, we proposed an acoustic data transmission system designed based on phase modification of the modulated complex lapped transform (MCLT) coefficients. In this paper, we propose the spectral magnitude adjustment (SMA) technique which not only enhances the quality of the data-embedded audio signal but also improves the transmission performance of the system.

  • A Reliable Tag Anti-Collision Algorithm for Mobile Tags

    Xiaodong DENG  Mengtian RONG  Tao LIU  

     
    LETTER-Information Network

      Page(s):
    1527-1530

    As RFID technology is being more widely adopted, it is fairly common to read mobile tags using RFID systems, such as packages on a conveyor belt and unit loads on a pallet jack or forklift truck. In RFID systems, multiple tags use a shared medium for communicating with a reader. It is quite possible that tags will exit the reading area without being read, which results in tag leaking. In this letter, a reliable tag anti-collision algorithm for mobile tags is proposed. It reliably estimates the expected number of tags arriving during a time slot when new tags continually enter the reader's reading area and no tag leaves without being read. In addition, it gives priority across read cycles to tags that arrived early, and uses the expected number of tags arriving during a time slot to determine the number of slots in the initial inventory round of the next read cycle. Simulation results show that the reliability of the proposed algorithm is close to that of the DFSA algorithm when the expected number of tags entering the reading area during a time slot is given, and is better than that of the DFSA algorithm when the number of time slots in the initial inventory round of the next read cycle is set to 1, assuming that the number of tags arriving during a time slot follows a Poisson distribution.

  • Classification Based on Predictive Association Rules of Incomplete Data

    Jeonghun YOON  Dae-Won KIM  

     
    LETTER-Artificial Intelligence, Data Mining

      Page(s):
    1531-1535

    Classification based on predictive association rules (CPAR) is a widely used associative classification method. Despite its efficiency, the analysis results obtained by CPAR will be influenced by missing values in the data sets, and thus it is not always possible to correctly analyze the classification results. In this letter, we improve CPAR to deal with the problem of missing data. The effectiveness of the proposed method is demonstrated using various classification examples.

  • A Tree-Structured Deterministic Small-World Network

    Shi-Ze GUO  Zhe-Ming LU  Guang-Yu KANG  Zhe CHEN  Hao LUO  

     
    LETTER-Artificial Intelligence, Data Mining

      Page(s):
    1536-1538

    The small-world property is common to many real-life social, technological, and biological networks. Small-world networks distinguish themselves from others by their high clustering coefficient and short average path length. In the past dozen years, many probabilistic small-world networks and some deterministic small-world networks have been proposed utilizing various mechanisms. In this Letter, we propose a new deterministic small-world network model by first constructing a binary-tree structure and then adding links between each pair of brother nodes and links between each grandfather node and its four grandson nodes. Furthermore, we give the analytic solutions to several topological characteristics, which show that the proposed model is a small-world network.
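
    The construction described in the abstract above can be sketched as follows. This is a minimal illustration that assumes a complete binary tree with heap-style node numbering; the function names and the numerical (rather than analytic) evaluation of the clustering coefficient and average path length are choices made for this sketch, not the authors' implementation.

```python
from collections import deque


def build_network(depth):
    """Adjacency sets for a complete binary tree (root at depth 0) plus
    brother-brother and grandfather-grandson links. Nodes are 1..n,
    with the children of node k numbered 2k and 2k + 1."""
    n = 2 ** (depth + 1) - 1
    adj = {v: set() for v in range(1, n + 1)}

    def link(u, v):
        adj[u].add(v)
        adj[v].add(u)

    for v in range(2, n + 1):
        link(v, v // 2)        # tree edge to the parent
        if v % 2 == 1:
            link(v, v - 1)     # link between the two brother nodes
        if v >= 4:
            link(v, v // 4)    # link to the grandfather node
    return adj


def average_path_length(adj):
    """Mean shortest-path length over all ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


def clustering_coefficient(adj):
    """Average of the local clustering coefficients of all nodes."""
    coeffs = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        coeffs.append(2 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)


net = build_network(6)
print(len(net), clustering_coefficient(net), average_path_length(net))
```

    Printing these two quantities for increasing depths gives a quick numerical check of the small-world behaviour that the Letter establishes analytically.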

  • Discovery of Information Diffusion Process in Social Networks

    Kwanho KIM  Jae-Yoon JUNG  Jonghun PARK  

     
    LETTER-Office Information Systems, e-Business Modeling

      Page(s):
    1539-1542

    Information diffusion analysis in social networks is significant because it enables us to deeply understand dynamic social interactions among users. In this paper, we introduce approaches to discovering the information diffusion process in social networks based on process mining. Process mining techniques are applied from three perspectives: social network analysis, process discovery, and community recognition. We then present experimental results using real-life social network data. The proposed techniques are expected to be employed as new analytical tools in online social networks such as blogs and wikis by company marketers, politicians, news reporters, and online writers.

  • Speaker Change Detection Based on a Weighted Distance Measure over the Centroid Model

    Jin Soo SEO  

     
    LETTER-Speech and Hearing

      Page(s):
    1543-1546

    Speaker change detection involves the identification of the time indices of an audio stream, where the identity of the speaker changes. This paper proposes novel measures for speaker change detection over the centroid model, which divides the feature space into non-overlapping clusters for effective speaker-change comparison. The centroid model is a computationally-efficient variant of the widely-used mixture-distribution based background models for speaker recognition. Experiments on both synthetic and real-world data were performed; the results show that the proposed approach yields promising results compared with the conventional statistical measures.

  • Discriminative Projection Selection Based Face Image Hashing

    Cagatay KARABAT  Hakan ERDOGAN  

     
    LETTER-Image Recognition, Computer Vision

      Page(s):
    1547-1551

    Face image hashing is an emerging method used in biometric verification systems. In this paper, we propose a novel face image hashing method based on a new technique called discriminative projection selection. We apply the Fisher criterion for selecting the rows of a random projection matrix in a user-dependent fashion. Moreover, another contribution of this paper is to employ a bimodal Gaussian mixture model at the quantization step. Our simulation results on three different databases demonstrate that the proposed method has superior performance in comparison to previously proposed random projection based methods.

  • An Interleaving Updating Framework of Disparity and Confidence Map for Stereo Matching

    Chenbo SHI  Guijin WANG  Xiaokang PEI  Bei HE  Xinggang LIN  

     
    LETTER-Image Recognition, Computer Vision

      Page(s):
    1552-1555

    In this paper, we propose an interleaving updating framework of disparity and confidence map (IUFDCM) for stereo matching to eliminate the redundant and interfering information from unreliable pixels. In contrast to other propagation algorithms that use matching cost as messages, IUFDCM updates the disparity map and the confidence map in an interleaving manner. Based on the Confidence-based Support Window (CSW), the disparity map is updated adaptively to alleviate the effect of input parameters. The reassignment for unreliable pixels with larger probability keeps ground truth depending on reliable messages. Consequently, the confidence map is updated according to the previous disparity map and the left-right consistency. The top ranks on the Middlebury benchmark for different error thresholds demonstrate that our algorithm is competitive with the best stereo matching algorithms at present.
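
    The left-right consistency cue mentioned in the abstract above is a standard stereo check, and a minimal one-scanline sketch is given here. It is not the IUFDCM update rule; the function name, the tolerance parameter `tol`, and the convention that a left pixel at column x with disparity d matches the right pixel at column x - d are assumptions of this sketch.

```python
def left_right_consistent(disp_left, disp_right, tol=1):
    """Flag, for one scanline, the left-image pixels whose disparity agrees
    with the disparity stored at the matching right-image pixel."""
    width = len(disp_left)
    flags = []
    for x, d in enumerate(disp_left):
        xr = x - d  # matching column in the right image (assumed convention)
        ok = 0 <= xr < width and abs(d - disp_right[xr]) <= tol
        flags.append(ok)
    return flags


# Toy integer disparities for a five-pixel scanline (made-up values).
left = [0, 1, 1, 2, 5]
right = [1, 1, 2, 0, 0]
print(left_right_consistent(left, right, tol=0))
```

    Pixels that fail the check, including those whose match falls outside the image, would typically be treated as unreliable.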

  • Global-Context Based Salient Region Detection in Nature Images

    Hong BAO  De XU  Yingjun TANG  

     
    LETTER-Image Recognition, Computer Vision

      Page(s):
    1556-1559

    Visual saliency detection provides an alternative methodology for image description in many applications such as adaptive content delivery and image retrieval. One of the main aims of visual attention in computer vision is to detect and segment the salient regions in an image. In this paper, we employ matrix decomposition to detect salient objects in natural images. To efficiently eliminate high-contrast noise regions in the background, we integrate global context information into saliency detection. Therefore, the most salient region can be easily selected as the one that is globally most isolated. The proposed approach intrinsically provides an alternative methodology for modeling attention with low implementation complexity. Experiments show that our approach achieves much better performance than existing state-of-the-art methods.