The search functionality is under construction.
The search functionality is under construction.

IEICE TRANSACTIONS on Information

  • Impact Factor

    0.59

  • Eigenfactor

    0.002

  • article influence

    0.1

  • Cite Score

    1.4

Advance publication (published online immediately after acceptance)

Volume E103-D No.1  (Publication Date:2020/01/01)

    Special Section on Enriched Multimedia — Application of Multimedia Technology and Its Security —
  • FOREWORD Open Access

    Keiichi IWAMURA  

     
    FOREWORD

      Page(s):
    1-1
  • Lightweight Authentication for MP4 Format Container Using Subtitle Track

    KokSheik WONG  ChuanSheng CHAN  AprilPyone MAUNGMAUNG  

     
    INVITED PAPER

      Pubricized:
    2019/10/24
      Page(s):
    2-10

    With massive utilization of video in every aspect of our daily lives, managing videos is crucial and demanding. The rich literature of data embedding has proven its viability in managing as well as enriching videos and other multimedia contents, but conventional methods are designed to operate in the media/compression layer. In this work, the synchronization between the audio-video and subtitle tracks within an MP4 format container is manipulated to insert data. Specifically, the data are derived from the statistics of the audio samples and video frames, and it serves as the authentication data for verification purpose. When needed, the inserted authentication data can be extracted and compared against the information computed from the received audio samples and video frames. The proposed method is lightweight because simple statistics, i.e., ‘0’ and ‘1’ at the bit stream level, are treated as the authentication data. Furthermore, unlike conventional MP4 container format-based data insertion technique, the bit stream size remains unchanged before and after data insertion using the proposed method. The proposed authentication method can also be deployed for joint utilization with any existing authentication technique for audio / video as long as these media can be multiplexed into a single bit stream and contained within an MP4 container. Experiments are carried out to verify the basic functionality of the proposed technique as an authentication method.

  • Improvement of the Quality of Visual Secret Sharing Schemes with Constraints on the Usage of Shares

    Mariko FUJII  Tomoharu SHIBUYA  

     
    PAPER

      Pubricized:
    2019/10/07
      Page(s):
    11-24

    (k,n)-visual secret sharing scheme ((k,n)-VSSS) is a method to divide a secret image into n images called shares that enable us to restore the original image by only stacking at least k of them without any complicated computations. In this paper, we consider (2,2)-VSSS to share two secret images at the same time only by two shares, and investigate the methods to improve the quality of decoded images. More precisely, we consider (2,2)-VSSS in which the first secret image is decoded by stacking those two shares in the usual way, while the second one is done by stacking those two shares in the way that one of them is used reversibly. Since the shares must have some subpixels that inconsistently correspond to pixels of the secret images, the decoded pixels do not agree with the corresponding pixels of the secret images, which causes serious degradation of the quality of decoded images. To reduce such degradation, we propose several methods to construct shares that utilize 8-neighbor Laplacian filter and halftoning. Then we show that the proposed methods can effectively improve the quality of decoded images. Moreover, we demonstrate that the proposed methods can be naturally extended to (2,2)-VSSS for RGB images.

  • Image Identification of Encrypted JPEG Images for Privacy-Preserving Photo Sharing Services

    Kenta IIDA  Hitoshi KIYA  

     
    PAPER

      Pubricized:
    2019/10/25
      Page(s):
    25-32

    We propose an image identification scheme for double-compressed encrypted JPEG images that aims to identify encrypted JPEG images that are generated from an original JPEG image. To store images without any visual sensitive information on photo sharing services, encrypted JPEG images are generated by using a block-scrambling-based encryption method that has been proposed for Encryption-then-Compression systems with JPEG compression. In addition, feature vectors robust against JPEG compression are extracted from encrypted JPEG images. The use of the image encryption and feature vectors allows us to identify encrypted images recompressed multiple times. Moreover, the proposed scheme is designed to identify images re-encrypted with different keys. The results of a simulation show that the identification performance of the scheme is high even when images are recompressed and re-encrypted.

  • Neural Watermarking Method Including an Attack Simulator against Rotation and Compression Attacks

    Ippei HAMAMOTO  Masaki KAWAMURA  

     
    PAPER

      Pubricized:
    2019/10/23
      Page(s):
    33-41

    We have developed a digital watermarking method that use neural networks to learn embedding and extraction processes that are robust against rotation and JPEG compression. The proposed neural networks consist of a stego-image generator, a watermark extractor, a stego-image discriminator, and an attack simulator. The attack simulator consists of a rotation layer and an additive noise layer, which simulate the rotation attack and the JPEG compression attack, respectively. The stego-image generator can learn embedding that is robust against these attacks, and also, the watermark extractor can extract watermarks without rotation synchronization. The quality of the stego-images can be improved by using the stego-image discriminator, which is a type of adversarial network. We evaluated the robustness of the watermarks and image quality and found that, using the proposed method, high-quality stego-images could be generated and the neural networks could be trained to embed and extract watermarks that are robust against rotation and JPEG compression attacks. We also showed that the robustness and image quality can be adjusted by changing the noise strength in the noise layer.

  • Blind Bandwidth Extension with a Non-Linear Function and Its Evaluation on Automatic Speaker Verification

    Ryota KAMINISHI  Haruna MIYAMOTO  Sayaka SHIOTA  Hitoshi KIYA  

     
    PAPER

      Pubricized:
    2019/10/25
      Page(s):
    42-49

    This study evaluates the effects of some non-learning blind bandwidth extension (BWE) methods on state-of-the-art automatic speaker verification (ASV) systems. Recently, a non-linear bandwidth extension (N-BWE) method has been proposed as a blind, non-learning, and light-weight BWE approach. Other non-learning BWEs have also been developed in recent years. For ASV evaluations, most data available to train ASV systems is narrowband (NB) telephone speech. Meanwhile, wideband (WB) data have been used to train the state-of-the-art ASV systems, such as i-vector, d-vector, and x-vector. This can cause sampling rate mismatches when all datasets are used. In this paper, we investigate the influence of sampling rate mismatches in the x-vector-based ASV systems and how non-learning BWE methods perform against them. The results showed that the N-BWE method improved the equal error rate (EER) on ASV systems based on the x-vector when the mismatches were present. We researched the relationship between objective measurements and EERs. Consequently, the N-BWE method produced the lowest EERs on both ASV systems and obtained the lower RMS-LSD value and the higher STOI score.

  • Secure Overcomplete Dictionary Learning for Sparse Representation

    Takayuki NAKACHI  Yukihiro BANDOH  Hitoshi KIYA  

     
    PAPER

      Pubricized:
    2019/10/09
      Page(s):
    50-58

    In this paper, we propose secure dictionary learning based on a random unitary transform for sparse representation. Currently, edge cloud computing is spreading to many application fields including services that use sparse coding. This situation raises many new privacy concerns. Edge cloud computing poses several serious issues for end users, such as unauthorized use and leak of data, and privacy failures. The proposed scheme provides practical MOD and K-SVD dictionary learning algorithms that allow computation on encrypted signals. We prove, theoretically, that the proposal has exactly the same dictionary learning estimation performance as the non-encrypted variant of MOD and K-SVD algorithms. We apply it to secure image modeling based on an image patch model. Finally, we demonstrate its performance on synthetic data and a secure image modeling application for natural images.

  • Multi-Scale Chroma n-Gram Indexing for Cover Song Identification

    Jin S. SEO  

     
    LETTER

      Pubricized:
    2019/10/23
      Page(s):
    59-62

    To enhance cover song identification accuracy on a large-size music archive, a song-level feature summarization method is proposed by using multi-scale representation. The chroma n-grams are extracted in multiple scales to cope with both global and local tempo changes. We derive index from the extracted n-grams by clustering to reduce storage and computation for DB search. Experiments on the widely used music datasets confirmed that the proposed method achieves the state-of-the-art accuracy while reducing cost for cover song search.

  • Non-Blind Speech Watermarking Method Based on Spread-Spectrum Using Linear Prediction Residue

    Reiya NAMIKAWA  Masashi UNOKI  

     
    LETTER

      Pubricized:
    2019/10/23
      Page(s):
    63-66

    We propose a method of non-blind speech watermarking based on direct spread spectrum (DSS) using a linear prediction scheme to solve sound distortion due to spread spectrum. Results of evaluation simulations revealed that the proposed method had much lower sound-quality distortion than the DSS method while having almost the same bit error ratios (BERs) against various attacks as the DSS method.

  • A Weighted Viewport Quality Metric for Omnidirectional Images

    Huyen T. T. TRAN  Trang H. HOANG  Phu N. MINH  Nam PHAM NGOC  Truong CONG THANG  

     
    LETTER

      Pubricized:
    2019/10/10
      Page(s):
    67-70

    Thanks to the ability to bring immersive experiences to users, Virtual Reality (VR) technologies have been gaining popularity in recent years. A key component in VR systems is omnidirectional content, which can provide 360-degree views of scenes. However, at a given time, only a portion of the full omnidirectional content, called viewport, is displayed corresponding to the user's current viewing direction. In this work, we first develop Weighted-Viewport PSNR (W-VPSNR), an objective quality metric for quality assessment of omnidirectional content. The proposed metric takes into account the foveation feature of the human visual system. Then, we build a subjective database consisting of 72 stimuli with spatial varying viewport quality. By using this database, an evaluation of the proposed metric and four conventional metrics is conducted. Experiment results show that the W-VPSNR metric well correlates with the mean opinion scores (MOS) and outperforms the conventional metrics. Also, it is found that the conventional metrics do not perform well for omnidirectional content.

  • Regular Section
  • HeteroRWR: A Novel Algorithm for Top-k Co-Author Recommendation with Fusion of Citation Networks

    Sufen ZHAO  Rong PENG  Meng ZHANG  Liansheng TAN  

     
    PAPER-Fundamentals of Information Systems

      Pubricized:
    2019/09/26
      Page(s):
    71-84

    It is of great importance to recommend collaborators for scholars in academic social networks, which can benefit more scientific research results. Facing the problem of data sparsity of co-author recommendation in academic social networks, a novel recommendation algorithm named HeteroRWR (Heterogeneous Random Walk with Restart) is proposed. Different from the basic Random Walk with Restart (RWR) model which only walks in homogeneous networks, HeteroRWR implements multiple random walks in a heterogeneous network which integrates a citation network and a co-authorship network to mine the k mostly valuable co-authors for target users. By introducing the citation network, HeteroRWR algorithm can find more suitable candidate authors when the co-authorship network is extremely sparse. Candidate recommenders will not only have high topic similarities with target users, but also have good community centralities. Analyses on the convergence and time efficiency of the proposed approach are presented. Extensive experiments have been conducted on DBLP and CiteSeerX datasets. Experimental results demonstrate that HeteroRWR outperforms state-of-the-art baseline methods in terms of precision and recall rate even in the case of incorporating an incomplete citation dataset.

  • Cloud Annealing: A Novel Simulated Annealing Algorithm Based on Cloud Model

    Shanshan JIAO  Zhisong PAN  Yutian CHEN  Yunbo LI  

     
    PAPER-Fundamentals of Information Systems

      Pubricized:
    2019/09/27
      Page(s):
    85-92

    As one of the most popular intelligent optimization algorithms, Simulated Annealing (SA) faces two key problems, the generation of perturbation solutions and the control strategy of the outer loop (cooling schedule). In this paper, we introduce the Gaussian Cloud model to solve both problems and propose a novel cloud annealing algorithm. Its basic idea is to use the Gaussian Cloud model with decreasing numerical character He (Hyper-entropy) to generate new solutions in the inner loop, while He essentially indicates a heuristic control strategy to combine global random search of the outer loop and local tuning search of the inner loop. Experimental results in function optimization problems (i.e. single-peak, multi-peak and high dimensional functions) show that, compared with the simple SA algorithm, the proposed cloud annealing algorithm will lead to significant improvement on convergence and the average value of obtained solutions is usually closer to the optimal solution.

  • Node-Disjoint Paths Problems in Directed Bijective Connection Graphs

    Keiichi KANEKO  

     
    PAPER-Fundamentals of Information Systems

      Pubricized:
    2019/09/26
      Page(s):
    93-100

    In this paper, we extend the notion of bijective connection graphs to introduce directed bijective connection graphs. We propose algorithms that solve the node-to-set node-disjoint paths problem and the node-to-node node-disjoint paths problem in a directed bijective connection graph. The time complexities of the algorithms are both O(n4), and the maximum path lengths are both 2n-1.

  • A Generalized Theory Based on the Turn Model for Deadlock-Free Irregular Networks

    Ryuta KAWANO  Ryota YASUDO  Hiroki MATSUTANI  Michihiro KOIBUCHI  Hideharu AMANO  

     
    PAPER-Computer System

      Pubricized:
    2019/10/08
      Page(s):
    101-110

    Recently proposed irregular networks can reduce the latency for both on-chip and off-chip systems with a large number of computing nodes and thus can improve the performance of parallel applications. However, these networks usually suffer from deadlocks in routing packets when using a naive minimal path routing algorithm. To solve this problem, we focus attention on a lately proposed theory that generalizes the turn model to maintain the network performance with deadlock-freedom. The theorems remain a challenge of applying themselves to arbitrary topologies including fully irregular networks. In this paper, we advance the theorems to completely general ones. Moreover, we provide a feasible implementation of a deadlock-free routing method based on our advanced theorem. Experimental results show that the routing method based on our proposed theorem can improve the network throughput by up to 138 % compared to a conventional deterministic minimal routing method. Moreover, when utilized as the escape path in Duato's protocol, it can improve the throughput by up to 26.3 % compared with the conventional up*/down* routing.

  • Genetic Node-Mapping Methods for Rapid Collective Communications

    Takashi YOKOTA  Kanemitsu OOTSU  Takeshi OHKAWA  

     
    PAPER-Computer System

      Pubricized:
    2019/10/10
      Page(s):
    111-129

    Inter-node communication is essential in parallel computation. The performance of parallel processing depends on the efficiencies in both computation and communication, thus, the communication cost is not negligible. A parallel application program involves a logical communication structure that is determined by the interchange of data between computation nodes. Sometimes the logical communication structure mismatches to that in a real parallel machine. This mismatch results in large communication costs. This paper addresses the node-mapping problem that rearranges logical position of node so that the degree of mismatch is decreased. This paper assumes that parallel programs execute one or more collective communications that follow specific traffic patterns. An appropriate node-mapping achieves high communication performance. This paper proposes a strong heuristic method for solving the node-mapping problem and adapts the method to a genetic algorithm. Evaluation results reveal that the proposed method achieves considerably high performance; it achieves 8.9 (4.9) times speed-up on average in single-(two-)traffic-pattern cases in 32×32 torus networks. Specifically, for some traffic patterns in small-scale networks, the proposed method finds theoretically optimized solutions. Furthermore, this paper discusses in deep about various issues in the proposed method that employs genetic algorithm, such as population of genes, number of generations, and traffic patterns. This paper also discusses applicability to large-scale systems for future practical use.

  • Efficient Supergraph Search Using Graph Coding

    Shun IMAI  Akihiro INOKUCHI  

     
    PAPER-Data Engineering, Web Information Systems

      Pubricized:
    2019/09/26
      Page(s):
    130-141

    This paper proposes a method for searching for graphs in the database which are contained as subgraphs by a given query. In the proposed method, the search index does not require any knowledge of the query set or the frequent subgraph patterns. In conventional techniques, enumerating and selecting frequent subgraph patterns is computationally expensive, and the distribution of the query set must be known in advance. Subsequent changes to the query set require the frequent patterns to be selected again and the index to be reconstructed. The proposed method overcomes these difficulties through graph coding, using a tree structured index that contains infrequent subgraph patterns in the shallow part of the tree. By traversing this code tree, we are able to rapidly determine whether multiple graphs in the database contain subgraphs that match the query, producing a powerful pruning or filtering effect. Furthermore, the filtering and verification steps of the graph search can be conducted concurrently, rather than requiring separate algorithms. As the proposed method does not require the frequent subgraph patterns and the query set, it is significantly faster than previous techniques; this independence from the query set also means that there is no need to reconstruct the search index when the query set changes. A series of experiments using a real-world dataset demonstrate the efficiency of the proposed method, achieving a search speed several orders of magnitude faster than the previous best.

  • Proposal and Evaluation of Visual Analytics Interface for Time-Series Data Based on Trajectory Representation

    Rei TAKAMI  Yasufumi TAKAMA  

     
    PAPER-Human-computer Interaction

      Pubricized:
    2019/09/30
      Page(s):
    142-151

    This paper proposes a visual analytics (VA) interface for time-series data so that it can solve the problems arising from the property of time-series data: a collision between interaction and animation on the temporal aspect, collision of interaction between the temporal and spatial aspects, and the trade-off of exploration accuracy, efficiency, and scalability between different visualization methods. To solve these problems, this paper proposes a VA interface that can handle temporal and spatial changes uniformly. Trajectories can show temporal changes spatially, of which direct manipulation enables to examine the relationship among objects either at a certain time point or throughout the entire time range. The usefulness of the proposed interface is demonstrated through experiments.

  • Blob Detection Based on Soft Morphological Filter

    Weiqing TONG  Haisheng LI  Guoyue CHEN  

     
    PAPER-Pattern Recognition

      Pubricized:
    2019/10/02
      Page(s):
    152-162

    Blob detection is an important part of computer vision and a special case of region detection with important applications in the image analysis. In this paper, the dilation operator in standard mathematical morphology is firstly extended to the order dilation operator of soft morphology, three soft morphological filters are designed by using the operator, and a novel blob detection algorithm called SMBD is proposed on that basis. SMBD had been proven to have better performance of anti-noise and blob shape detection than similar blob filters based on mathematical morphology like Quoit and N-Quoit in terms of theoretical and experimental aspects. Additionally, SMBD was also compared to LoG and DoH in different classes, which are the most commonly used blob detector, and SMBD also achieved significantly great results.

  • Measuring Semantic Similarity between Words Based on Multiple Relational Information

    Jianyong DUAN  Yuwei WU  Mingli WU  Hao WANG  

     
    PAPER-Natural Language Processing

      Pubricized:
    2019/09/27
      Page(s):
    163-169

    The similarity of words extracted from the rich text relation network is the main way to calculate the semantic similarity. Complex relational information and text content in Wikipedia website, Community Question Answering and social network, provide abundant corpus for semantic similarity calculation. However, most typical research only focused on single relationship. In this paper, we propose a semantic similarity calculation model which integrates multiple relational information, and map multiple relationship to the same semantic space through learning representing matrix and semantic matrix to improve the accuracy of semantic similarity calculation. In experiments, we confirm that the semantic calculation method which integrates many kinds of relationships can improve the accuracy of semantic calculation, compared with other semantic calculation methods.

  • A Log-Based Testing Approach for Detecting Faults Caused by Incorrect Assumptions About the Environment

    Sooyong JEONG  Ajay Kumar JHA  Youngsul SHIN  Woo Jin LEE  

     
    LETTER-Software Engineering

      Pubricized:
    2019/10/04
      Page(s):
    170-173

    Embedded software developers assume the behavior of the environment when specifications are not available. However, developers may assume the behavior incorrectly, which may result in critical faults in the system. Therefore, it is important to detect the faults caused by incorrect assumptions. In this letter, we propose a log-based testing approach to detect the faults. First, we create a UML behavioral model to represent the assumed behavior of the environment, which is then transformed into a state model. Next, we extract the actual behavior of the environment from a log, which is then incorporated in the state model, resulting in a state model that represents both assumed and actual behaviors. Existing testing techniques based on the state model can be used to generate test cases from our state model to detect faults.

  • Mode Normalization Enhanced Recurrent Model for Multi-Modal Semantic Trajectory Prediction

    Shaojie ZHU  Lei ZHANG  Bailong LIU  Shumin CUI  Changxing SHAO  Yun LI  

     
    LETTER-Artificial Intelligence, Data Mining

      Pubricized:
    2019/10/04
      Page(s):
    174-176

    Multi-modal semantic trajectory prediction has become a new challenge due to the rapid growth of multi-modal semantic trajectories with text message. Traditional RNN trajectory prediction methods have the following problems to process multi-modal semantic trajectory. The distribution of multi-modal trajectory samples shifts gradually with training. It leads to difficult convergency and long training time. Moreover, each modal feature shifts in different directions, which produces multiple distributions of dataset. To solve the above problems, MNERM (Mode Normalization Enhanced Recurrent Model) for multi-modal semantic trajectory is proposed. MNERM embeds multiple modal features together and combines the LSTM network to capture long-term dependency of trajectory. In addition, it designs Mode Normalization mechanism to normalize samples with multiple means and variances, and each distribution normalized falls into the action area of the activation function, so as to improve the prediction efficiency while improving greatly the training speed. Experiments on real dataset show that, compared with SERM, MNERM reduces the sensitivity of learning rate, improves the training speed by 9.120 times, increases HR@1 by 0.03, and reduces the ADE by 120 meters.

  • On the Detection of Malicious Behaviors against Introspection Using Hardware Architectural Events

    Huaizhe ZHOU  Haihe BA  Yongjun WANG  Tie HONG  

     
    LETTER-Artificial Intelligence, Data Mining

      Pubricized:
    2019/10/09
      Page(s):
    177-180

    The arms race between offense and defense in the cloud impels the innovation of techniques for monitoring attacks and unauthorized activities. The promising technique of virtual machine introspection (VMI) becomes prevalent for its tamper-resistant capability. However, some elaborate exploitations are capable of invalidating VMI-based tools by breaking the assumption of a trusted guest kernel. To achieve a more reliable and robust introspection, we introduce a practical approach to monitor and detect attacks that attempt to subvert VMI in this paper. Our approach combines supervised machine learning and hardware architectural events to identify those malicious behaviors which are targeted at VMI techniques. To demonstrate the feasibility, we implement a prototype named HyperMon on the Xen hypervisor. The results of our evaluation show the effectiveness of HyperMon in detecting malicious behaviors with an average accuracy of 90.51% (AUC).

  • UMMS: Efficient Superpixel Segmentation Driven by a Mixture of Spatially Constrained Uniform Distribution

    Pengyu WANG  Hongqing ZHU  Ning CHEN  

     
    LETTER-Image Processing and Video Processing

      Pubricized:
    2019/10/02
      Page(s):
    181-185

    A novel superpixel segmentation approach driven by uniform mixture model with spatially constrained (UMMS) is proposed. Under this algorithm, each observation, i.e. pixel is first represented as a five-dimensional vector which consists of colour in CLELAB space and position information. And then, we define a new uniform distribution through adding pixel position, so that this distribution can describe each pixel in input image. Applied weighted 1-Norm to difference between pixels and mean to control the compactness of superpixel. In addition, an effective parameter estimation scheme is introduced to reduce computational complexity. Specifically, the invariant prior probability and parameter range restrict the locality of superpixels, and the robust mean optimization technique ensures the accuracy of superpixel boundaries. Finally, each defined uniform distribution is associated with a superpixel and the proposed UMMS successfully implements superpixel segmentation. The experiments on BSDS500 dataset verify that UMMS outperforms most of the state-of-the-art approaches in terms of segmentation accuracy, regularity, and rapidity.