IEICE TRANSACTIONS on Information and Systems

  • Impact Factor: 0.72
  • Eigenfactor: 0.002
  • Article Influence: 0.1
  • CiteScore: 1.4

Volume E98-D No.5 (Publication Date: 2015/05/01)

    Special Section on Data Engineering and Information Management
  • FOREWORD Open Access

    Shinsuke NAKAJIMA

    FOREWORD
      Page(s): 1000-1000
  • Cache-Conscious Data Access for DBMS in Multicore Environments

    Fang XI  Takeshi MISHIMA  Haruo YOKOTA

    PAPER
      Publicized: 2015/01/21
      Page(s): 1001-1012

    In recent years, dramatic improvements have been made to computer hardware. In particular, the number of cores on a chip has been growing exponentially, enabling an ever-increasing number of processes to be executed in parallel. Having been originally developed for single-core processors, database (DB) management systems (DBMSs) running on multicore processors suffer from cache conflicts as the number of concurrently executing DB processes (DBPs) increases. Therefore, a cache-efficient solution for arranging the execution of concurrent DBPs on multicore platforms would be highly attractive for DBMSs. In this paper, we propose CARIC-DA, middleware for achieving higher performance in DBMSs on multicore processors, by reducing cache misses with a new cache-conscious dispatcher for concurrent queries. CARIC-DA logically range-partitions the dataset into multiple subsets. This enables different processor cores to access different subsets by ensuring that different DBPs are pinned to different cores and by dispatching queries to DBPs according to the data-partitioning information. In this way, CARIC-DA is expected to achieve better performance via a higher cache hit rate for the private cache of each core. It can also balance the loads between cores by changing the range of each subset. Note that CARIC-DA is pure middleware, meaning that it avoids any modification to existing operating systems (OSs) and DBMSs, thereby making it more practical. This is important because the source code for existing DBMSs is large and complex, making it very expensive to modify. We implemented a prototype that uses unmodified existing Linux and PostgreSQL environments, and evaluated the effectiveness of our proposal on three different multicore platforms. The performance evaluation against benchmarks revealed that CARIC-DA achieved improved cache hit rates and higher performance.
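
    The dispatching idea lends itself to a short sketch. Below is a minimal Python illustration, assuming integer keys, one DB process (DBP) pinned per core, and partition bounds that cover the whole key domain; the names and the partitioning scheme are illustrative, not the paper's implementation.

        import bisect
        import os

        # Upper bound of each logical key range; partition i serves keys
        # up to PARTITION_BOUNDS[i]. One partition per core/DBP.
        PARTITION_BOUNDS = [25_000, 50_000, 75_000, 100_000]

        def partition_of(key):
            """Index of the range partition that owns `key` (assumes
            key <= PARTITION_BOUNDS[-1])."""
            return bisect.bisect_left(PARTITION_BOUNDS, key)

        def pin_to_core(pid, core):
            """Pin a DBP to one core so its partition's data stays in that
            core's private cache (Linux-only; done once at startup)."""
            os.sched_setaffinity(pid, {core})

        def dispatch(query_key, dbp_pids):
            """Route a query to the DBP whose core caches the target range."""
            return dbp_pids[partition_of(query_key)]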

  • Accordion: An Efficient Gear-Shifting for a Power-Proportional Distributed Data-Placement Method

    Hieu Hanh LE  Satoshi HIKIDA  Haruo YOKOTA

    PAPER
      Publicized: 2015/01/21
      Page(s): 1013-1026

    Power-aware distributed file systems for efficient Big Data processing are increasingly moving toward power-proportional designs. However, current data placement methods for such systems have not carefully considered the effect of gear-shifting during operation. When the system shifts to a higher gear, it must reallocate the datasets that were updated in a lower gear, while a subset of the nodes was inactive, without disrupting the servicing of client requests. Inefficient gear-shifting that requires a large amount of data reallocation greatly degrades system performance. To address this challenge, this paper proposes a data placement method known as Accordion, which uses data replication to arrange the data layout comprehensively and provide efficient gear-shifting. Compared with current methods, Accordion reduces the amount of data transferred, which significantly shortens the period required to reallocate updated data during gear-shifting and thereby improves system performance. The effect of this reduction is larger in higher gears, so Accordion is suitable for smooth gear-shifting in multigear systems. Moreover, the times at which the active nodes serve requests are well distributed, so Accordion offers higher scalability than existing methods in terms of I/O throughput. Accordion does not impose any strict constraint on the number of nodes in the system; therefore, it is expected to work well in practical environments. Extensive empirical experiments on actual machines, using an Accordion prototype based on the Hadoop Distributed File System, demonstrated that the proposed method significantly reduces the period required to transfer updated data, i.e., by 66% compared with an existing method.

  • k-Dominant Skyline Query Computation in MapReduce Environment

    Md. Anisuzzaman SIDDIQUE  Hao TIAN  Yasuhiko MORIMOTO

    PAPER
      Publicized: 2015/01/21
      Page(s): 1027-1034

    Filtering out uninteresting data is important for utilizing “big data”. The skyline query is a popular filtering technique that selects from a large database the set of objects not dominated by any other object. However, a skyline query often retrieves too many objects to analyze intensively, especially for high-dimensional datasets. To solve this problem, k-dominant skyline queries have been introduced. Moreover, databases sometimes become too large to process in a centralized environment, and conventional algorithms for computing k-dominant skyline queries are not well suited to parallel and distributed environments such as the MapReduce framework. In this paper, we propose an efficient parallel algorithm for processing k-dominant skyline queries in the MapReduce framework. Extensive experiments demonstrate the scalability of the proposed algorithm on synthetic big datasets under different settings of data distribution, dimensionality, and cardinality.
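
    For reference, the k-dominance test that underlies such queries can be written compactly. The following Python sketch uses the standard definition (smaller values assumed better) and a naive single-machine filter; it is not the paper's MapReduce algorithm itself.

        def k_dominates(p, q, k):
            """True if p is no worse than q in at least k dimensions and
            strictly better in at least one of those dimensions."""
            no_worse = [i for i in range(len(p)) if p[i] <= q[i]]
            return len(no_worse) >= k and any(p[i] < q[i] for i in no_worse)

        def k_dominant_skyline(points, k):
            """Naive filter: keep points not k-dominated by any other.
            The paper distributes this test across MapReduce workers."""
            return [p for p in points
                    if not any(k_dominates(q, p, k) for q in points if q != p)]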

  • 3D Objects Tracking by MapReduce GPGPU-Enhanced Particle Filter

    Jieyun ZHOU  Xiaofeng LI  Haitao CHEN  Rutong CHEN  Masayuki NUMAO

    PAPER
      Publicized: 2015/01/21
      Page(s): 1035-1044

    Object tracking methods have been widely used in video surveillance, motion monitoring, robotics, and related fields. The particle filter is one of the most promising methods, but it is difficult to apply to real-time object tracking because of its high computation cost. To reduce the processing cost without sacrificing tracking quality, this paper proposes a new method for real-time 3D object tracking that parallelizes the particle filter algorithm with a MapReduce architecture running on a GPGPU. Our methods are as follows. First, we use a Kinect to obtain 3D information about the objects. Unlike conventional 2D-based object tracking, 3D object tracking adds depth information: it tracks along not only the x and y axes but also the z axis, and the depth information can correct some errors of 2D tracking. Second, to solve the high-computation-cost problem, we use the MapReduce architecture on a GPGPU to parallelize the particle filter algorithm. We implement the particle filter algorithms on the GPU and evaluate the performance by running a program on CUDA 5.5.
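
    The division of labor between the map and reduce phases can be illustrated with a small CPU-side sketch in Python. The motion and likelihood models below are placeholders, not the paper's 3D models, and the GPGPU execution is elided.

        import numpy as np

        rng = np.random.default_rng(0)

        def map_step(particle, observation):
            """Per-particle work (one GPU thread each, in the paper's design):
            propagate through a motion model and compute a weight."""
            moved = particle + rng.normal(0.0, 1.0, size=particle.shape)
            weight = np.exp(-np.sum((moved - observation) ** 2) / 2.0)
            return moved, weight

        def reduce_step(pairs):
            """Gather the weights, normalize them, and resample."""
            particles = np.array([p for p, _ in pairs])
            weights = np.array([w for _, w in pairs])
            weights /= weights.sum()
            idx = rng.choice(len(particles), size=len(particles), p=weights)
            return particles[idx]

        # One tracking iteration over 1000 particles in (x, y, z).
        particles = rng.normal(size=(1000, 3))
        observation = np.array([1.0, 2.0, 0.5])
        particles = reduce_step([map_step(p, observation) for p in particles])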

  • A Linguistics-Driven Approach to Statistical Parsing for Low-Resourced Languages

    Prachya BOONKWAN  Thepchai SUPNITHI

    PAPER
      Publicized: 2015/01/21
      Page(s): 1045-1052

    Developing a practical and accurate statistical parser for low-resourced languages is a hard problem, because it requires large-scale treebanks, which are expensive and labor-intensive to build from scratch. Unsupervised grammar induction theoretically offers a way to overcome this hurdle by learning hidden syntactic structures from raw text automatically. However, the accuracy of grammar induction is still impractically low, because frequent collocations of units that are not linguistically associable are common, resulting in dependency attachment errors. We introduce a novel approach to building a statistical parser for low-resourced languages by using language parameters as a guide for grammar induction. The intuition behind this paper is that most dependency attachment errors involve frequently used word orders, which can be captured by a small prescribed set of linguistic constraints, while the rest of the language can be learned statistically by grammar induction. We then show that covering the most frequent grammar rules via our language parameters has a strong impact on parsing accuracy in 12 languages.

    Regular Section
  • A Detection and Measurement Approach for Memory Leaked Objects in Java Programs

    Qiao YU  Shujuan JIANG  Yingqi LIU

    PAPER-Software Engineering
      Publicized: 2015/02/04
      Page(s): 1053-1061

    A memory leak occurs when useless objects cannot be released for a long time during program execution. In serious cases, leaked objects may cause memory overflow, degrade system performance, and even crash the system. This paper presents a dynamic approach for detecting and measuring memory-leaked objects in Java programs. First, our approach tracks the program via the Java Debug Interface (JDI) and records heap information to identify potentially leaked objects. Second, we introduce a memory-leaking confidence metric to measure the influence of these objects on the program. Finally, we select three open-source programs to evaluate the efficiency of our approach, and we choose ten programs from the DaCapo 9.12 benchmark suite to assess its time overhead. The experimental results show that our approach is able to detect and measure memory-leaked objects efficiently.
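
    The detection idea can be pictured with a schematic: take periodic heap snapshots (the paper gathers heap information via JDI) and flag classes whose live-instance counts only grow. The Python below is a hypothetical illustration; the snapshot data and the monotonic-growth heuristic are ours, not the paper's exact criterion.

        def potentially_leaked(snapshots):
            """snapshots: list of {class_name: live_instance_count} dicts
            taken at successive points during program execution."""
            suspects = []
            for cls in snapshots[0]:
                counts = [s.get(cls, 0) for s in snapshots]
                if all(a < b for a, b in zip(counts, counts[1:])):
                    suspects.append((cls, counts[-1]))
            return suspects

        snapshots = [
            {"Order": 100, "Cache$Entry": 50},
            {"Order": 90,  "Cache$Entry": 120},
            {"Order": 110, "Cache$Entry": 300},
        ]
        print(potentially_leaked(snapshots))  # [('Cache$Entry', 300)]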

  • A Similarity-Based Concepts Mapping Method between Ontologies

    Jie LIU  Linlin QIN  Jing GAO  Aidong ZHANG

    PAPER-Artificial Intelligence, Data Mining
      Publicized: 2015/01/26
      Page(s): 1062-1072

    Ontology mapping is important in many areas, such as information integration, the semantic web, and knowledge management, so its effectiveness needs further study. This paper puts forward a method for mapping concepts between different ontologies in the same field. First, algorithms are proposed for calculating four individual similarities (of concept name, property, instance, and structure) between two concepts. Their features are as follows: a new WordNet-based method computes the semantic similarity between concept names; the property similarity algorithm forms a property similarity matrix between concepts, which is then reduced to a single numerical similarity; a new vector space model algorithm computes the instance similarity; and structure parameters, namely the numbers of properties, instances, and sub-concepts and the hierarchy depths of the two concepts, are added to the structure similarity calculation. The similarity of each concept pair is then represented as a vector. Finally, a Support Vector Machine (SVM) accomplishes mapping discovery by training and learning on the similarity vectors. In this algorithm, harmony and reliability are used as the weights of the four individual similarities, which increases its accuracy and reliability. Experiments achieve good results and show that the proposed method outperforms many other similarity-based algorithms.
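
    The final mapping-discovery step reduces to a standard supervised setup: one 4-dimensional similarity vector (name, property, instance, structure) per concept pair, classified as mapped or not. A minimal sketch, with illustrative feature values and scikit-learn standing in for whatever SVM implementation the authors used:

        from sklearn import svm

        # [name_sim, property_sim, instance_sim, structure_sim] per pair.
        X_train = [[0.9, 0.8, 0.7, 0.6],   # known mapping
                   [0.2, 0.1, 0.3, 0.2],   # known non-mapping
                   [0.8, 0.9, 0.6, 0.7],
                   [0.1, 0.3, 0.2, 0.1]]
        y_train = [1, 0, 1, 0]

        clf = svm.SVC(kernel="rbf")
        clf.fit(X_train, y_train)

        # Decide whether a new concept pair should be mapped.
        print(clf.predict([[0.85, 0.75, 0.65, 0.7]]))  # e.g. [1]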

  • Direct Density Ratio Estimation with Convolutional Neural Networks with Application in Outlier Detection

    Hyunha NAM  Masashi SUGIYAMA

    PAPER-Artificial Intelligence, Data Mining
      Publicized: 2015/01/28
      Page(s): 1073-1079

    Recently, the ratio of probability density functions was demonstrated to be useful in solving various machine learning tasks such as outlier detection, non-stationarity adaptation, feature selection, and clustering. The key idea of this density ratio approach is that the ratio is estimated directly, so that difficult density estimation is avoided. So far, parametric and non-parametric direct density ratio estimators with various loss functions have been developed, and the kernel least-squares method was demonstrated to be highly useful in terms of both accuracy and computational efficiency. On the other hand, recent studies in pattern recognition have shown that deep architectures such as convolutional neural networks can significantly outperform kernel methods. In this paper, we propose to use convolutional neural networks for density ratio estimation, and we experimentally show that the proposed method tends to outperform the kernel-based method in outlying image detection.
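
    For orientation, the least-squares fitting principle behind such direct estimators can be stated compactly. Writing the ratio as r(x) = p_nu(x)/p_de(x), the standard least-squares objective (of which a CNN-based estimator is one instance; the paper's exact loss may differ) is

        J(r) = \frac{1}{2}\int \Bigl(r(x) - \frac{p_{\mathrm{nu}}(x)}{p_{\mathrm{de}}(x)}\Bigr)^{2} p_{\mathrm{de}}(x)\,dx
             = \frac{1}{2}\int r(x)^{2}\, p_{\mathrm{de}}(x)\,dx - \int r(x)\, p_{\mathrm{nu}}(x)\,dx + \mathrm{const},

    whose empirical version needs only samples from the two densities, never the densities themselves:

        \hat{J}(r) = \frac{1}{2 n_{\mathrm{de}}} \sum_{j=1}^{n_{\mathrm{de}}} r\bigl(x_{j}^{\mathrm{de}}\bigr)^{2} - \frac{1}{n_{\mathrm{nu}}} \sum_{i=1}^{n_{\mathrm{nu}}} r\bigl(x_{i}^{\mathrm{nu}}\bigr).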

  • Robust Visual Tracking via Coupled Randomness

    Chao ZHANG  Yo YAMAGATA  Takuya AKASHI

    PAPER-Image Recognition, Computer Vision
      Publicized: 2015/02/04
      Page(s): 1080-1088

    Tracking algorithms for arbitrary objects are widely researched in the field of computer vision. An initialized bounding box is given as the input, and the algorithm is then required to track the object in the subsequent frames on the fly. Tracking-by-detection is one of the main research branches of online tracking. However, two issues must still be addressed to improve performance. 1) The limited processing time requires the model to extract low-dimensional and discriminative features from the training samples. 2) The model must balance the prior and new appearance information of the object to maintain its relocation ability and avoid the drifting problem. In this paper, we propose a real-time tracking algorithm called coupled randomness tracking (CRT) that focuses on these two issues. One source of randomness is random projection, and the other is online random forests (ORFs). In CRT, the gray-scale feature is compressed by a sparse measurement matrix, and ORFs are used to train the sample sequence online. During the training procedure, we introduce a tree-discarding strategy that helps the ORFs adapt to fast appearance changes caused by illumination, occlusion, etc. Our method can constantly adapt to the object's latest appearance changes while keeping its prior appearance information. The experimental results show that our algorithm performs robustly on many publicly available benchmark videos and outperforms several state-of-the-art algorithms. Additionally, our algorithm can easily be turned into a parallel program.
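
    The compression half of CRT admits a short sketch. The sparse {+√s, 0, −√s} matrix below is the standard Achlioptas-style construction; the paper's exact measurement matrix may differ.

        import numpy as np

        rng = np.random.default_rng(0)

        def sparse_measurement_matrix(n_low, n_high, s=3):
            """Entries are ±sqrt(s) with probability 1/(2s) each and 0
            otherwise, scaled so projections preserve norms in expectation;
            most entries are zero, so projection is cheap."""
            probs = [1 / (2 * s), 1 - 1 / s, 1 / (2 * s)]
            values = rng.choice([np.sqrt(s), 0.0, -np.sqrt(s)],
                                size=(n_low, n_high), p=probs)
            return values / np.sqrt(n_low)

        R = sparse_measurement_matrix(n_low=50, n_high=10_240)
        feature = rng.random(10_240)   # high-dimensional gray-scale feature
        compressed = R @ feature       # 50-dimensional input for the ORFs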

  • A Hybrid Topic Model for Multi-Document Summarization

    JinAn XU  JiangMing LIU  Kenji ARAKI

    PAPER-Natural Language Processing
      Publicized: 2015/02/09
      Page(s): 1089-1094

    Topic features are useful in improving text summarization. However, independence among topics is a strong restriction in most topic models, and relaxing it can capture text structure more deeply. This paper proposes a hybrid topic model that generates multi-document summaries using a combination of the Hidden Topic Markov Model (HTMM), the surface texture model, and the topic transition model. Based on the topic transition model, regular topic transition probabilities are used during summary generation. This approach eliminates the topic independence assumption of the Latent Dirichlet Allocation (LDA) model. Experimental results show the advantage of combining the three models. In short, this paper attempts to realize an advanced summarization system by relaxing topic independence and integrating surface texture and shallow semantics in documents.
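
    The point of the Markov topic structure can be seen in miniature: under HTMM, the topic of sentence t depends on the topic of sentence t-1 through a transition matrix, rather than being drawn independently as in LDA. The matrix and topic count below are illustrative.

        import numpy as np

        rng = np.random.default_rng(0)

        # transition[i][j] = P(next topic = j | current topic = i);
        # heavy diagonal, so topics tend to persist across sentences.
        transition = np.array([[0.8, 0.1, 0.1],
                               [0.1, 0.8, 0.1],
                               [0.1, 0.1, 0.8]])

        def sample_topic_sequence(n_sentences, start_topic=0):
            topics = [start_topic]
            for _ in range(n_sentences - 1):
                topics.append(int(rng.choice(3, p=transition[topics[-1]])))
            return topics

        print(sample_topic_sequence(10))  # runs of repeated topics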

  • Noise Tolerant Heart Rate Extraction Algorithm Using Short-Term Autocorrelation for Wearable Healthcare Systems

    Shintaro IZUMI  Masanao NAKANO  Ken YAMASHITA  Yozaburo NAKAI  Hiroshi KAWAGUCHI  Masahiko YOSHIMOTO

    PAPER-Biological Engineering
      Publicized: 2015/01/26
      Page(s): 1095-1103

    This report describes a robust method for extracting the instantaneous heart rate (IHR) from noisy electrocardiogram (ECG) signals. Generally, R-waves are extracted from the ECG using a threshold, and the IHR is calculated from the R-wave intervals. However, in wearable healthcare systems, noise increases the incidence of misdetection and false detection because power consumption and electrode distance are constrained to reduce size and weight. To prevent incorrect detection, we propose a short-term autocorrelation (STAC) technique. The proposed method extracts the IHR by finding the search-window shift length that maximizes the correlation coefficient between the template window and the search window, exploiting the beat-by-beat similarity of the QRS complex waveform. It therefore requires no threshold calculation process and is robust in noisy environments. The proposed method was evaluated using the MIT-BIH arrhythmia and noise stress test databases. Simulation results show that it achieves a state-of-the-art success rate of IHR extraction in a noise stress test with muscle and motion artifacts.
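
    The window-matching step can be sketched in a few lines of numpy: slide a search window along the ECG and keep the lag that maximizes the correlation coefficient with the template; that lag is the beat interval, hence the IHR. Window lengths and the sampling rate below are illustrative.

        import numpy as np

        def ihr_by_stac(ecg, fs, template_len, min_lag, max_lag):
            """IHR (beats per minute) for a template at the start of `ecg`;
            assumes len(ecg) >= max_lag + template_len."""
            template = ecg[:template_len]
            best_lag, best_corr = min_lag, -1.0
            for lag in range(min_lag, max_lag):
                window = ecg[lag:lag + template_len]
                corr = np.corrcoef(template, window)[0, 1]
                if corr > best_corr:
                    best_lag, best_corr = lag, corr
            return 60.0 * fs / best_lag

        # e.g. fs = 360 Hz (MIT-BIH); lags 120..540 span 180 down to 40 bpm:
        # ihr_by_stac(ecg, 360, template_len=108, min_lag=120, max_lag=540)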

  • On the Probability of Certificate Revocation in Combinatorial Certificate Management Schemes

    Dae Hyun YUM

    LETTER-Information Network
      Publicized: 2015/02/18
      Page(s): 1104-1107

    To enhance the privacy of vehicle owners, combinatorial certificate management schemes assign each certificate to a large enough group of vehicles so that it will be difficult to link a certificate to any particular vehicle. When an innocent vehicle shares a certificate with a misbehaving vehicle and the certificate on the misbehaving vehicle has been revoked, the certificate on the innocent vehicle also becomes invalid and is said to be covered. When a group of misbehaving vehicles collectively share all the certificates assigned to an innocent vehicle and these certificates are revoked, the innocent vehicle is said to be covered. We point out that the previous analysis of the vehicle cover probability is not correct and then provide a new and exact analysis of the vehicle cover probability.
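
    The quantity at issue is easy to estimate numerically, which also makes an exact analysis checkable. A Monte Carlo sketch, assuming each vehicle's n certificates are drawn uniformly at random from a pool of b certificates (all parameter values below are illustrative):

        import random

        def cover_probability(b, n, c, trials=100_000):
            """Estimate P(c revoked vehicles jointly cover all n
            certificates of an innocent vehicle)."""
            pool = range(b)
            covered = 0
            for _ in range(trials):
                innocent = set(random.sample(pool, n))
                revoked = set()
                for _ in range(c):
                    revoked |= set(random.sample(pool, n))
                if innocent <= revoked:
                    covered += 1
            return covered / trials

        print(cover_probability(b=100, n=5, c=20))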

  • A Deduplication-Enabled P2P Protocol for VM Image Distribution

    Choonhwa LEE  Sungho KIM  Eunsam KIM

    LETTER-Information Network
      Publicized: 2015/02/19
      Page(s): 1108-1111

    This paper presents a novel peer-to-peer protocol for efficiently distributing virtual machine images in a datacenter. Its primary idea is to improve the performance of peer-to-peer content delivery by employing deduplication to exploit the similarity both among and within VM images in cloud datacenters. The efficacy of the proposed scheme is validated through an evaluation that demonstrates substantial performance gains.
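
    The deduplication side of such a protocol can be illustrated with fixed-size chunking: identify each chunk of an image by its SHA-256 digest and transfer a chunk only if an identical one has not been seen before, across or within images. The chunk size and in-memory store below are illustrative; the paper's protocol details may differ.

        import hashlib

        CHUNK_SIZE = 4 * 1024 * 1024   # 4 MiB chunks
        chunk_store = {}               # digest -> chunk bytes

        def add_image(data):
            """Return the image's chunk-digest recipe, storing only chunks
            not already present (the only ones a peer must actually send)."""
            recipe = []
            for off in range(0, len(data), CHUNK_SIZE):
                chunk = data[off:off + CHUNK_SIZE]
                digest = hashlib.sha256(chunk).hexdigest()
                chunk_store.setdefault(digest, chunk)
                recipe.append(digest)
            return recipe

        def rebuild_image(recipe):
            return b"".join(chunk_store[d] for d in recipe)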

  • Face Verification Based on the Age Progression Rules

    Kai FANG  Shuoyan LIU

    LETTER-Image Recognition, Computer Vision
      Publicized: 2015/01/26
      Page(s): 1112-1115

    Appearance changes conform to certain rules for the same person, while for different individuals the changes are uncontrolled. Hence, this paper studies age progression rules to tackle the face verification task. The age progression rules are discovered in the difference space of facial image pairs. To this end, we first represent an image pair as a matrix whose elements are the differences of a set of visual words. The age progression rules are then trained with a Support Vector Machine (SVM) based on this matrix representation. Finally, we use these rules to accomplish face verification tasks. The proposed approach is tested on the FGnet dataset and a collection of real-world images from identification cards. The experimental results demonstrate the effectiveness of the proposed method for identity verification.

  • Discriminative Dictionary Learning with Low-Rank Error Model for Robust Crater Recognition

    An LIU  Maoyin CHEN  Donghua ZHOU

    LETTER-Image Recognition, Computer Vision
      Publicized: 2015/02/18
      Page(s): 1116-1119

    Robust crater recognition is a research focus in deep space exploration missions, and sparse representation methods can achieve the desired robustness and accuracy. To cope with the corruption and noise incurred by complex topography and varied illumination in planetary images, a robust crater recognition approach is proposed based on dictionary learning with a low-rank error correction model in a sparse representation framework. In this approach, a compact and discriminative dictionary is learned from all the training images. A low-rank error correction term is introduced into the dictionary learning to deal with gross errors and corruption. Experimental results on crater images show that the proposed method achieves competitive performance in both recognition accuracy and efficiency.
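
    One plausible form of such an objective, for orientation (the paper's exact formulation may differ): with training images stacked as the columns of Y, a dictionary D, sparse codes X, and an error term E,

        \min_{D, X, E} \; \|Y - DX - E\|_{F}^{2} + \lambda \|X\|_{1} + \gamma \|E\|_{*},

    where the l1 penalty keeps the codes sparse and the nuclear norm \|E\|_{*} keeps the error term low-rank, so that it absorbs gross, structured corruption instead of distorting the dictionary.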