IEICE TRANSACTIONS on Information

  • Impact Factor

    0.72

  • Eigenfactor

    0.002

  • Article Influence

    0.1

  • Cite Score

    1.4

Volume E102-D No.1  (Publication Date:2019/01/01)

    Special Section on Enriched Multimedia — Making Multimedia More Convenient and Safer —
  • FOREWORD Open Access

    Keiichi IWAMURA  

     
    FOREWORD

      Page(s):
    1-1
  • Robust Image Identification with DC Coefficients for Double-Compressed JPEG Images

    Kenta IIDA  Hitoshi KIYA  

     
    PAPER

      Publicized:
    2018/10/19
      Page(s):
    2-10

    When images are shared via social networking services (SNS) and cloud photo storage services (CPSS), the uploaded JPEG images are known to be mostly re-compressed by the providers. To address this situation, this paper proposes a new image identification scheme for double-compressed JPEG images. The aim is to detect a single-compressed image that has the same original image as the double-compressed ones. The proposed scheme performs the identification with a feature extracted only from the DC coefficients among the DCT coefficients. This feature not only makes the scheme robust against errors caused by double compression but also allows identification across images of different sizes. The simulation results demonstrate the effectiveness of the proposed scheme in terms of querying performance.
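
    The abstract above builds its feature from DC coefficients only. As a minimal sketch of what those values are, the function below computes the DC coefficient of each 8×8 block, assuming an orthonormal 2-D DCT and the usual JPEG level shift of 128. The function name and the list-of-rows image layout are illustrative assumptions, and the step that turns DC coefficients into the paper's identification feature is not described in the abstract, so it is not attempted here.

```python
def dc_coefficients(gray, block=8):
    """Return the DC coefficient of every block x block tile of a grayscale image.

    `gray` is a list of rows of 0-255 integers whose height and width are
    multiples of `block` (an assumption made to keep the sketch short).
    For an orthonormal 2-D DCT-II, the DC term of a tile equals the sum of
    its level-shifted samples divided by `block`, i.e. block * mean.
    """
    height, width = len(gray), len(gray[0])
    dcs = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            total = sum(
                gray[y][x] - 128  # JPEG level shift
                for y in range(top, top + block)
                for x in range(left, left + block)
            )
            row.append(total / block)
        dcs.append(row)
    return dcs
```

    Because each DC value is proportional to a block mean, it changes little under re-quantization of the higher-frequency coefficients, which is one plausible reading of why a DC-only feature tolerates double compression.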

  • Image Manipulation Specifications on Social Networking Services for Encryption-then-Compression Systems

    Tatsuya CHUMAN  Kenta IIDA  Warit SIRICHOTEDUMRONG  Hitoshi KIYA  

     
    PAPER

      Publicized:
    2018/10/19
      Page(s):
    11-18

    Encryption-then-Compression (EtC) systems have been proposed to securely transmit images through an untrusted channel provider. In this study, EtC systems were applied to social media like Twitter that carry out image manipulations. The block scrambling-based encryption schemes used in EtC systems were evaluated in terms of their robustness against image manipulation on social media. The aim was to investigate how five social networking service (SNS) providers, Facebook, Twitter, Google+, Tumblr and Flickr, manipulate images and to determine whether the encrypted images uploaded to SNS providers can avoid being distorted by such manipulations. In an experiment, encrypted and non-encrypted JPEG images were uploaded to various SNS providers. The results show that EtC systems are applicable to the five SNS providers.

  • Image Watermarking Technique Using Embedder and Extractor Neural Networks

    Ippei HAMAMOTO  Masaki KAWAMURA  

     
    PAPER

      Publicized:
    2018/10/19
      Page(s):
    19-30

    An autoencoder has the potential ability to compress and decompress information. In this work, we consider the process of generating a stego-image from an original image and watermarks as compression, and the process of recovering the original image and watermarks from the stego-image as decompression. We propose embedder and extractor neural networks based on the autoencoder. The embedder network learns a mapping from the DCT coefficients of the original image and a watermark to those of the stego-image. The extractor network learns a mapping from the DCT coefficients of the stego-image to the watermark. Once the proposed neural network has been trained, it can embed watermarks into, and extract them from, unlearned test images. We investigated the relation between the number of neurons and network performance by computer simulations and found that the trained neural network could provide high-quality stego-images and watermarks with few errors. We also evaluated the robustness against JPEG compression and found that, when suitable parameters were used, the watermarks were extracted with an average BER lower than 0.01 and image quality over 35 dB when the quality factor Q was over 50. We also investigated how our neural network represents the watermarks in the stego-image. There are two possibilities: distributed representation and sparse representation. By examining the output of the stego layer (3rd layer), we found that a distributed representation emerged at an early learning step and a sparse representation emerged at a later step.
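
    The abstract above reports a bit error rate (BER) and an image quality in dB. The sketch below shows how such figures are commonly computed. It assumes the dB value is a peak signal-to-noise ratio (PSNR) for 8-bit images, which the abstract does not state, and the function names are illustrative.

```python
import math


def bit_error_rate(sent, received):
    """Fraction of watermark bits that differ after extraction."""
    if len(sent) != len(received):
        raise ValueError("bit sequences must have the same length")
    errors = sum(1 for s, r in zip(sent, received) if s != r)
    return errors / len(sent)


def psnr(original, stego, peak=255):
    """PSNR in dB between two flat lists of pixel values (assumed 8-bit)."""
    mse = sum((a - b) ** 2 for a, b in zip(original, stego)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)
```

    Under this reading, a BER below 0.01 means fewer than one watermark bit in a hundred is extracted incorrectly.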

  • Permutation-Based Signature Generation for Spread-Spectrum Video Watermarking

    Hiroshi ITO  Tadashi KASEZAWA  

     
    PAPER

      Publicized:
    2018/10/19
      Page(s):
    31-40

    Generation of secure signatures suitable for spread-spectrum video watermarking is proposed. The method embeds a message, which is a two-dimensional binary pattern, into a three-dimensional volume, such as video, by addition of a signature. The message can be a mark or a logo indicating the copyright information. The signature is generated by shuffling or permuting random matrices along the third (time) axis so that the message is extracted when they are accumulated after demodulation with the correct key. In this way, a message is hidden in a signature that has equal probability of decoding any variation of the message, and the key determines which one is extracted. The security of the proposed method, stemming from the permutation, is evaluated as resistance to blind estimation of secret information. The matrix-based permutation allows the message to survive spatial down-sampling without sacrificing security. The downside of the proposed method is that it needs more data or frames to decode reliable information than conventional spread-spectrum modulation. However, this is minimized by segmenting the matrices and applying the permutation to sub-matrices independently. Message detectability is theoretically analyzed. The superiority of our method in terms of robustness to blind message estimation and down-sampling is verified experimentally.

  • Robust and Secure Data Hiding for PDF Text Document

    Minoru KURIBAYASHI  Takuya FUKUSHIMA  Nobuo FUNABIKI  

     
    PAPER

      Publicized:
    2018/10/19
      Page(s):
    41-47

    The spaces between words and paragraphs are popular places for embedding data in data hiding techniques for text documents. Because text documents have low redundancy, the payload is necessarily small. As conventional methods insert each bit of data independently into specific spaces, a malicious party may be able to modify the data without causing serious visible distortions. In this paper, we regard a collection of space lengths as a one-dimensional feature vector and embed a watermark into its frequency components. To keep the embedded information secret, a random permutation and dither modulation are introduced into the operation. Furthermore, robustness against additive noise is enhanced by controlling the payload. Through experiments, we evaluated the trade-off among payload, distortion, and robustness in the proposed method.
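
    The abstract above names dither modulation as one of its building blocks. The following is a minimal scalar sketch of binary dither modulation, assuming a quantization step `step` and a key-dependent `dither` value. Both names are illustrative, and the paper's actual pipeline (frequency transform of the space lengths, random permutation, payload control) is not reproduced.

```python
def dm_embed(value, bit, step, dither=0.0):
    """Quantize `value` onto the lattice that encodes `bit` (0 or 1).

    The two lattices are offset from each other by step / 2, and both are
    shifted by the secret `dither`.
    """
    offset = dither + bit * step / 2.0
    return step * round((value - offset) / step) + offset


def dm_extract(value, step, dither=0.0):
    """Return the bit whose lattice has a point closest to `value`."""
    distances = [abs(value - dm_embed(value, bit, step, dither)) for bit in (0, 1)]
    return 0 if distances[0] <= distances[1] else 1
```

    In this sketch the embedded bit survives any additive noise smaller than `step / 4`, while a larger step costs more distortion, which mirrors the payload, distortion, and robustness trade-off the abstract mentions.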

  • Adaptive Tiling Selection for Viewport Adaptive Streaming of 360-degree Video

    Duc V. NGUYEN  Huyen T. T. TRAN  Truong Cong THANG  

     
    LETTER

      Publicized:
    2018/10/19
      Page(s):
    48-51

    360-degree video is an important component of the emerging Virtual Reality. In this paper, we propose a new adaptation method for tiling-based viewport adaptive streaming of 360-degree video. The proposed method is able to dynamically select the best tiling scheme given the network conditions and user status. Experiments show that our proposed method can improve the viewport quality by up to 2.3 dB compared to a conventional fixed tiling method.

    Regular Section
  • Accelerating Large-Scale Interconnection Network Simulation by Cellular Automata Concept

    Takashi YOKOTA  Kanemitsu OOTSU  Takeshi OHKAWA  

     
    PAPER-Computer System

      Publicized:
    2018/10/05
      Page(s):
    52-74

    State-of-the-art parallel systems employ a huge number of computing nodes that are connected by an interconnection network. An interconnection network (ICN) plays an important role in a parallel system, since it is responsible for communication capability. In general, an ICN shows non-linear phenomena in its communication performance, most of which are caused by congestion. Thus, designing a large-scale parallel system requires sufficient discussion through repeated simulation runs. This raises another problem: simulating large-scale systems at a reasonable cost. This paper presents a promising solution by introducing the cellular automata concept, which originated in our prior work. Assuming 2D-torus topologies to simplify the discussion, this paper discusses the fundamental design of router functions in terms of cellular automata, the data structure of packets, alternative modeling of a router function, and miscellaneous optimizations. The proposed models have a good affinity with GPGPU technology and, as a representative speed-up result, the GPU-based simulator accelerates simulation by up to about 1264 times compared with sequential execution on a single CPU. Furthermore, since the proposed models are applicable to the shared memory model, a multithreaded implementation of the proposed methods achieves a speed-up of about 162 times at the maximum.

  • Empirical Studies of a Kernel Density Estimation Based Naive Bayes Method for Software Defect Prediction

    Haijin JI  Song HUANG  Xuewei LV  Yaning WU  Yuntian FENG  

     
    PAPER-Software Engineering

      Publicized:
    2018/10/03
      Page(s):
    75-84

    Software defect prediction (SDP) plays a significant part in allocating testing resources reasonably, reducing testing costs, and ensuring software quality. One of the most widely used algorithms for SDP models is Naive Bayes (NB) because of its simplicity, effectiveness and robustness. In NB, when a data set has continuous or numeric attributes, they are generally assumed to follow normal distributions, and the probability density function of the normal distribution is incorporated into their conditional probability estimates. However, after conducting a Kolmogorov-Smirnov test, we find that the 21 main software metrics follow non-normal distributions at the 5% significance level. Therefore, this paper proposes an improved NB approach, which estimates the conditional probabilities of NB with kernel density estimation on the training data sets, to help improve the prediction accuracy of NB for SDP. To evaluate the proposed method, we carry out experiments on 34 software releases obtained from 10 open source projects provided by the PROMISE repository. Four well-known classification algorithms are included for comparison, namely Naive Bayes, Support Vector Machine, Logistic Regression and Random Tree. The results show that the new method is more successful than the four well-known classification algorithms on most software releases.
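
    The idea in the abstract above, replacing the normal-distribution assumption in Naive Bayes with a kernel density estimate per class and attribute, can be sketched as follows. The Gaussian kernel, the single fixed `bandwidth`, and the class and function names are assumptions made for illustration; the abstract does not say which kernel or bandwidth rule the paper uses.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function estimated from `samples` with a Gaussian kernel."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


class KdeNaiveBayes:
    """Naive Bayes whose per-attribute likelihoods come from kernel density estimates."""

    def __init__(self, bandwidth=1.0):
        self.bandwidth = bandwidth  # fixed bandwidth: an assumption of this sketch

    def fit(self, rows, labels):
        self.priors = {}
        self.densities = {}
        for label in set(labels):
            members = [row for row, y in zip(rows, labels) if y == label]
            self.priors[label] = len(members) / len(rows)
            # One independent 1-D density per attribute (the "naive" assumption).
            self.densities[label] = [
                gaussian_kde([m[j] for m in members], self.bandwidth)
                for j in range(len(rows[0]))
            ]
        return self

    def predict(self, row):
        def log_posterior(label):
            score = math.log(self.priors[label])
            for value, density in zip(row, self.densities[label]):
                score += math.log(density(value) + 1e-300)  # guard against log(0)
            return score

        return max(self.priors, key=log_posterior)
```

    A hypothetical usage with two made-up software metrics per module: `KdeNaiveBayes().fit(rows, labels).predict([1.8, 11.5])` returns the label whose estimated densities best explain the metric values.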

  • A Semantic Management Method of Simulation Models in GNSS Distributed Simulation Environment

    Guo-chao FAN  Chun-sheng HU  Xue-en ZHENG  Cheng-dong XU  

     
    PAPER-Data Engineering, Web Information Systems

      Publicized:
    2018/10/09
      Page(s):
    85-92

    In a GNSS (Global Navigation Satellite System) Distributed Simulation Environment (GDSE), simulation tasks can be designed with models shared on the Internet. However, a large amount of model information and many relations between models need to be managed in a GDSE. In particular, when there is a large number of shared models, model retrieval becomes extremely complex. To meet the management demands of a GDSE and improve model retrieval efficiency, the characteristics of service simulation models are first analysed. A semantic management method for simulation models is then proposed, and a model management architecture is designed. Compared with the traditional retrieval approach, it takes less retrieval time and gives more accurate results. The simulation results show that retrieval in the semantic management module understands user needs well and helps users obtain appropriate models rapidly, which improves the efficiency of simulation task design.

  • On-Demand Generalization of Road Networks Based on Facility Search Results

    Daisuke YAMAMOTO  Masaki MURASE  Naohisa TAKAHASHI  

     
    PAPER-Data Engineering, Web Information Systems

      Publicized:
    2018/10/16
      Page(s):
    93-103

    Road generalization is a method for thinning out road networks to allow easy viewing according to the size of the map. Most conventional road generalization methods mainly focus on the length of a stroke, which is a chain of links with good continuity based on the principle of perceptual grouping applied to network data such as roads and rivers. However, in the case of facility search in a web map service, for example, a “restaurant guide map,” a road generalization mechanism can be more effective if it depends not only on the stroke length but also on the facility search results. Accordingly, in this study, we implement an on-demand road generalization method that adapts to both the facility search results and the stroke length. Moreover, a sufficiently fast response speed is achieved for practical use in web map services. In particular, this study proposes a fat-stroke model that links facility information to individual strokes and implements a road generalization method that uses this model to improve the response time. In addition, we develop a prototype based on the proposed system. The system evaluation results are based on three indicators, namely, response time of the road generalization system, connectivity between strokes, and connectivity between stroke and facilities. Our experimental results suggest that the proposed method can yield improved response times by a factor of 100 or more while affording higher connectivity.

  • Selecting Orientation-Insensitive Features for Activity Recognition from Accelerometers

    Yasser MOHAMMAD  Kazunori MATSUMOTO  Keiichiro HOASHI  

     
    PAPER-Information Network

      Publicized:
    2018/10/05
      Page(s):
    104-115

    Activity recognition from sensors is a classification problem over time-series data. Some research in the area utilizes handcrafted time-domain and frequency-domain features that differ between datasets. Another categorically different approach is to use deep learning methods for feature learning. This paper explores a middle ground in which an off-the-shelf feature extractor is used to generate a large number of candidate time-domain features, followed by a feature selector that was designed to reduce the bias toward specific classification techniques. Moreover, this paper advocates the use of features that are mostly insensitive to sensor orientation and shows their applicability to the activity recognition problem. The proposed approach is evaluated using six different publicly available datasets collected under various conditions using different experimental protocols. It shows comparable or higher accuracy than state-of-the-art methods on most datasets while usually using an order of magnitude fewer features.

  • Design of High-Speed Easy-to-Expand CC-Link Parallel Communication Module Based on R-IN32M3

    Yeong-Mo YEON  Seung-Hee KIM  

     
    PAPER-Information Network

      Publicized:
    2018/10/09
      Page(s):
    116-123

    The CC-Link proposed by the Mitsubishi Electric Company is an industrial network used exclusively in most industries. However, the probabilities of data loss and interference with equipment control increase if the transmission time is greater than the link scan time of 381 µs. The link scan time can be reduced by designing the CC-Link module as an external microprocessor (MPU) interface of R-IN32M3; however, it then suffers from expandability issues. Thus, in this paper, we propose a new CC-Link module utilizing R-IN32M3 to improve the expandability. In our CC-Link module, we devise a dual-port RAM (DPRAM) function in an external I/O module, which enables parallel communication between the DPRAM and the external MPU. Our experiment with the implemented CC-Link prototype demonstrates that our design improves both the communication speed, owing to the parallel communication between the DPRAM and the external MPU, and the expandability of remote I/O. Our design achieves miniaturization of the CC-Link module, wiring reduction, and an approximately 30% reduction in the link scan time. Furthermore, because we utilize the Renesas R-IN32M3 and Xilinx XC95144XL chips, both widely used in diverse application areas, the designed CC-Link module reduces the investment cost. The proposed design is expected to contribute significantly to the utilization of programmable logic controller memory and I/O expansion for factory automation and to improved investment efficiency in the flat panel display industry.

  • A High Throughput Device-to-Device Wireless Communication System

    Amin JAMALI  Seyed Mostafa SAFAVI HEMAMI  Mehdi BERENJKOUB  Hossein SAIDI  Masih ABEDINI  

     
    PAPER-Information Network

      Publicized:
    2018/10/15
      Page(s):
    124-132

    Device-to-device (D2D) communication in cellular networks is defined as direct communication between two mobile users without traversing the base station (BS) or core network. D2D communication can occur on the cellular frequencies (i.e., inband) or unlicensed spectrum (i.e., outband). A high-capacity IEEE 802.11-based outband device-to-device communication system for cellular networks is introduced in this paper. Transmissions in device-to-device connections are managed using our proposed medium access control (MAC) protocol. In the proposed MAC protocol, the backoff window size is adjusted dynamically by considering the current network status and utilizing an appropriate transmission attempt rate. Our protocol design covers both the case in which the request to send/clear to send (RTS/CTS) mechanism is used and the case in which it is not. We also describe mechanisms for guaranteeing quality of service (QoS) and enhancing the reliability of the system. Moreover, the performance of the system in the presence of channel impairments is investigated analytically and through simulations. Analytical and simulation results demonstrate that our proposed system has high throughput and can provide different levels of QoS for its users.

  • Towards Privacy-Preserving Location Sharing over Mobile Online Social Networks Open Access

    Juan CHEN  Shen SU  Xianzhi WANG  

     
    PAPER-Information Network

      Publicized:
    2018/10/18
      Page(s):
    133-146

    Location sharing services have recently gained momentum over mobile online social networks (mOSNs), given the increasing popularity of GPS-capable mobile devices such as smart phones. Despite the convenience of location sharing, it brings severe privacy risks. Though many efforts have been made to protect user privacy during location sharing, many of them rely on the extensive deployment of trusted Cellular Towers (CTs) and some incur excessive time overhead. More importantly, little research so far supports complete privacy, including location privacy, identity privacy and social relation privacy. We propose SAM, a new System Architecture for mOSNs, and P3S, a Privacy-Preserving Protocol based on SAM, to address the above issues for privacy-preserving location sharing over mOSNs. SAM and P3S differ from previous work in providing complete privacy for location sharing services over mOSNs. Theoretical analysis and extensive experimental results demonstrate the feasibility and efficiency of the proposed system and protocol.

  • Automated Detection of Children at Risk of Chinese Handwriting Difficulties Using Handwriting Process Information: An Exploratory Study

    Zhiming WU  Tao LIN  Ming LI  

     
    PAPER-Educational Technology

      Publicized:
    2018/10/22
      Page(s):
    147-155

    Handwriting difficulties (HWDs) in children have adverse effects on their confidence and academic progress. Detecting HWDs is the first crucial step toward clinical or teaching intervention for children with HWDs. To date, automatically detecting HWDs remains a challenge, although digitizing tablets have provided an opportunity to automatically collect handwriting process information. In particular, to the best of our knowledge, no study has explored the potential of combining machine learning algorithms with handwriting process information to automatically detect Chinese HWDs in children. To bridge the gap, we first conducted an experiment to collect sample data and then compared the performance of five commonly used classification algorithms (Decision tree, Support Vector Machine (SVM), Artificial Neural Network, Naïve Bayesian and k-Nearest Neighbor) in detecting HWDs. The results showed that: (1) only a small proportion (13%) of children had Chinese HWDs, and each classification model on the imbalanced dataset (39 children at risk of HWDs versus 261 typical children) produced results that were better than random guesses, indicating the possibility of using classification algorithms to detect Chinese HWDs; (2) the SVM model had the best performance in detecting Chinese HWDs among the five classification models; and (3) the performance of the SVM model, especially its sensitivity, could be significantly improved by employing the Synthetic Minority Oversampling Technique to handle the class-imbalanced data. This study offers new insights into which handwriting features are predictive of Chinese HWDs in children and proposes a method that can help clinical and educational professionals automatically detect children at risk of Chinese HWDs.
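
    The Synthetic Minority Oversampling Technique mentioned in the abstract above creates new minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below illustrates that core step only. The function name, the neighbour count `k`, and the seeding are assumptions, and the handwriting features themselves are not modelled.

```python
import random


def smote(minority, n_synthetic, k=2, seed=0):
    """Create synthetic minority samples by interpolating toward near neighbours.

    `minority` is a list of at least two feature vectors (lists of floats).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # The k nearest minority neighbours of `base` (squared Euclidean distance).
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic
```

    With the class sizes reported in the abstract (39 versus 261), fully balancing the two classes would take 261 − 39 = 222 synthetic samples; whether the study balanced the classes fully is not stated in the abstract.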

  • Visual Emphasis of Lip Protrusion for Pronunciation Learning

    Siyang YU  Kazuaki KONDO  Yuichi NAKAMURA  Takayuki NAKAJIMA  Hiroaki NANJO  Masatake DANTSUJI  

     
    PAPER-Educational Technology

      Publicized:
    2018/10/22
      Page(s):
    156-164

    Pronunciation is a fundamental factor in speaking and listening. However, instructions for important articulation have not been sufficiently provided in conventional computer-assisted language learning (CALL) systems. One typical case is the articulation of rounded vowels. Although lip protrusion is essential for their correct pronunciation, the perception of lip protrusion is often difficult for beginners. To tackle this issue, we propose an innovative method that will provide a comprehensive visual explanation for articulation. Lip movements are three-dimensionally measured, and face images or videos are pseudocoloured on the basis of the movements. The coloured regions represent the lip protrusion of rounded vowels. To verify the learning effect of the proposed method, we conducted experiments with Japanese undergraduates in Chinese classes. The results showed that our method has advantages over conventional video materials.

  • Temporal and Spatial Analysis of Local Body Sway Movements for the Identification of People

    Takuya KAMITANI  Hiroki YOSHIMURA  Masashi NISHIYAMA  Yoshio IWAI  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2018/10/09
      Page(s):
    165-174

    We propose a method for accurately identifying people using temporal and spatial changes in local movements measured from video sequences of body sway. Existing methods identify people using gait features that mainly represent the large swinging of the limbs. The use of gait features introduces a problem in that the identification performance decreases when people stop walking and maintain an upright posture. To extract informative features, our method measures small swings of the body, referred to as body sway. We extract the power spectral density as a feature from local body sway movements by dividing the body into regions. To evaluate the identification performance using our method, we collected three original video datasets of body sway sequences. The first dataset contained a large number of participants in an upright posture. The second dataset included variation over the long term. The third dataset represented body sway in different postures. The results on the datasets confirmed that our method using local movements measured from body sway can extract informative features for identification.

  • Real-Time Sparse Visual Tracking Using Circulant Reverse Lasso Model

    Chenggang GUO  Dongyi CHEN  Zhiqi HUANG  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2018/10/09
      Page(s):
    175-184

    Sparse representation has been successfully applied to visual tracking. Recent progress in sparse tracking has mainly been made within the particle filter framework. However, most sparse trackers need to extract complex feature representations for each particle in a limited sample space, leading to high computational cost and inferior tracking performance. To deal with these issues, we propose a novel sparse tracking method based on the circulant reverse lasso model. Benefiting from the properties of circulant matrices, densely sampled target candidates are implicitly generated by cyclically shifting the base feature descriptors, and then embedded into a reverse sparse reconstruction model as a dictionary to encode a robust appearance template. The alternating direction method of multipliers is employed to solve the reverse sparse model, and the optimization can be carried out efficiently in the frequency domain, which enables the proposed tracker to run in real time. The calculated sparse coefficient map represents the similarity scores between the template and the circularly shifted samples. Thus, the target location can be directly predicted from the coordinates of the peak coefficient. A scale-aware template updating strategy is combined with correlation filter template learning to take into account both appearance deformations and scale variations. Both quantitative and qualitative evaluations on two challenging tracking benchmarks demonstrate that the proposed algorithm performs favorably against several state-of-the-art sparse representation based tracking methods.
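
    The abstract above relies on a property of circulant matrices: products with all cyclic shifts of a base vector can be evaluated in the frequency domain. The 1-D sketch below checks that identity with a naive DFT. It illustrates only this building block under illustrative names; the reverse lasso model and the ADMM solver are not reproduced.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (a real tracker would use an FFT)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = [
        sum(v * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t, v in enumerate(values))
        for k in range(n)
    ]
    return [x / n for x in out] if inverse else out


def circulant(base):
    """Matrix whose row i is `base` cyclically shifted right by i positions."""
    n = len(base)
    return [[base[(j - i) % n] for j in range(n)] for i in range(n)]


def responses_direct(base, template):
    """Dot product of the template with every cyclic shift, computed explicitly."""
    return [sum(a * b for a, b in zip(row, template)) for row in circulant(base)]


def responses_fourier(base, template):
    """The same responses via conj(DFT(base)) * DFT(template) and an inverse DFT."""
    fb, ft = dft(base), dft(template)
    product = [b.conjugate() * t for b, t in zip(fb, ft)]
    return [x.real for x in dft(product, inverse=True)]
```

    Both functions return one score per cyclic shift, so the index of the largest score plays the role of the peak location that the abstract uses to predict the target position.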

  • A Robot Model That Obeys a Norm of a Human Group by Participating in the Group and Interacting with Its Members

    Yotaro FUSE  Hiroshi TAKENOUCHI  Masataka TOKUMARU  

     
    PAPER-Kansei Information Processing, Affective Information Processing

      Publicized:
    2018/10/03
      Page(s):
    185-194

    Herein, we propose a robot model that obeys the norm of a group by interacting with the group members. Using this model, a robot system learns the norm of the group as a group member itself. People with individual differences form a group and a characteristic norm that reflects the group members' personalities. When robots join a group that includes humans, they need to obey this characteristic norm: the group norm. We investigated whether the robot system generates a decision-making criterion for obeying group norms by learning from interactions through reinforcement learning. In the experiment, human group members and the robot system answered the same easy quizzes, which could have several vague answers. For cases in which the group members initially answered differently from one another, we investigated whether they came to answer the quizzes while considering the group norm. To avoid bias toward the system's answers, one participant in each group only obeyed the system's decisions and pretended to answer the questions, whereas the other participants were unaware of the system and of this arrangement. Our experiments revealed that the group comprising the participants and the robot system forms group norms. The proposed model enables a social robot to make decisions socially, adjusting its behavior to common sense not only in a large human society but also in smaller human groups, e.g., local communities. Therefore, we presume that such robots can join human groups by interacting with their members and adapt to these groups by adjusting their own behaviors. However, further studies are required to reveal whether the robots' answers affect people and whether participants can form a group norm based on a robot's answers even when they recognize that they are interacting in a group that includes a real robot.

  • Cycle Time Improvement of EtherCAT Networks with Embedded Linux-Based Master

    Hyun-Chul YI  Joon-Young CHOI  

     
    LETTER-Software System

      Publicized:
    2018/10/11
      Page(s):
    195-197

    We improve the cycle time performance of EtherCAT networks with an embedded Linux-based master by developing a Linux Ethernet driver optimized for EtherCAT operation. The Ethernet driver establishes a direct interface between the master module and the Ethernet controllers of embedded systems by removing the involvement of the Linux network stack and the New API (NAPI) of standard Ethernet drivers. Consequently, time-consuming memory copy operations are reduced and the processing of EtherCAT frames is accelerated. To demonstrate the effect of the developed Ethernet driver, we set up EtherCAT networks composed of an embedded Linux-based master and commercial off-the-shelf slaves, and the experimental results confirm that the cycle time performance is significantly improved.

  • JPEG Steganalysis Based on Multi-Projection Ensemble Discriminant Clustering

    Yan SUN  Guorui FENG  Yanli REN  

     
    LETTER-Information Network

      Publicized:
    2018/10/15
      Page(s):
    198-201

    In this paper, we propose a novel algorithm called multi-projection ensemble discriminant clustering (MPEDC) for JPEG steganalysis. The scheme uses the optimal projection of the linear discriminant analysis (LDA) algorithm and obtains additional projection vectors, similar to the optimal vector, by a micro-rotation method. MPEDC incorporates the unsupervised K-means algorithm to make a comprehensive classification decision adaptively. The power of the proposed method is demonstrated on three steganographic methods with three feature extraction methods. Experimental results show that the accuracy can be improved using iterative discriminant classification.

  • Millimeter-Wave Radar Target Recognition Algorithm Based on Collaborative Auto-Encoder

    Yilu MA  Zhihui YE  Yuehua LI  

     
    LETTER-Pattern Recognition

      Publicized:
    2018/10/03
      Page(s):
    202-205

    Conventional target recognition methods usually suffer from information loss and target-aspect sensitivity when applied to radar high resolution range profile (HRRP) recognition. Thus, effectively establishing a robust and discriminative feature representation can significantly improve the performance of practical radar applications. In this work, we present a novel feature extraction method, based on a modified collaborative auto-encoder, for millimeter-wave radar HRRP recognition. A latent frame-specific weight vector is trained for the samples in a frame, which contributes to retaining local information for different targets. Experimental results demonstrate that the proposed algorithm obtains higher target recognition accuracy than conventional target recognition algorithms.

  • Real-Time Head Action Recognition Based on HOF and ELM

    Tie HONG  Yuan Wei LI  Zhi Ying WANG  

     
    LETTER-Pattern Recognition

      Publicized:
    2018/10/05
      Page(s):
    206-209

    This paper studies head action recognition, a specific problem in action recognition. Unlike most existing research, our head action recognition problem is specifically defined to meet the requirements of some practical applications. Based on our definition, we build a corresponding head action dataset that contains many challenging cases. For action recognition, we propose a real-time head action recognition framework based on HOF and ELM. The framework consists of face detection based ROI determination, HOF feature extraction in the ROI, and ELM based action prediction. Experiments show that our method achieves good accuracy and is efficient enough for practical applications.
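
    As a rough illustration of the feature extraction step in the abstract above, and assuming HOF denotes a histogram of optical flow, the sketch below bins flow vectors by direction and weights them by magnitude. The bin count, the weighting, and the normalization are assumptions; the flow field is taken as given, and the face detection and ELM stages are not shown.

```python
import math


def hof(flow, bins=8):
    """Magnitude-weighted, L1-normalized histogram of flow directions.

    `flow` is an iterable of (dx, dy) displacement vectors inside the region
    of interest.
    """
    hist = [0.0] * bins
    width = 2.0 * math.pi / bins
    for dx, dy in flow:
        magnitude = math.hypot(dx, dy)
        if magnitude == 0.0:
            continue  # a static pixel has no direction
        angle = math.atan2(dy, dx) % (2.0 * math.pi)
        # min() guards against the rare case where rounding yields exactly 2*pi.
        hist[min(int(angle / width), bins - 1)] += magnitude
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

    A head turning mostly to one side would concentrate the histogram mass in the bins around that direction, giving a compact vector that a classifier can consume.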

  • Side Scan Sonar Image Super Resolution via Region-Selective Sparse Coding

    Jaihyun PARK  Bonhwa KU  Youngsaeng JIN  Hanseok KO  

     
    LETTER-Image Processing and Video Processing

      Publicized:
    2018/10/22
      Page(s):
    210-213

    Side scan sonar using low frequency can quickly search a wide range, but the images acquired are of low quality. The image super resolution (SR) method can mitigate this problem. The SR method typically uses sparse coding, but accurately estimating sparse coefficients incurs substantial computational costs. To reduce processing time, we propose a region-selective sparse coding based SR system that emphasizes object regions. In particular, the region that contains objects of interest is detected in side scan sonar based underwater images so that the subsequent sparse coding based SR process can be selectively applied. The effectiveness of the proposed method is verified by the reduced processing time required for image reconstruction while preserving the same level of visual quality as conventional methods.

  • Fast Visual Odometry Based Sparse Geometric Constraint for RGB-D Camera Open Access

    Ruibin GUO  Dongxiang ZHOU  Keju PENG  Yunhui LIU  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2018/10/09
      Page(s):
    214-218

    Pose estimation is a basic requirement for the autonomous behavior of robots. In this article, we present a robust and fast visual odometry method that obtains camera poses from RGB-D images. We first propose a motion estimation method based on a sparse geometric constraint and derive the analytic Jacobian of the geometric cost function to improve the convergence performance. We then use our motion estimation method to replace the tracking thread in ORB-SLAM to improve its runtime performance. Experimental results show that our method is twice as fast as ORB-SLAM while maintaining similar accuracy.

  • Symmetric Decomposition of Convolution Kernels

    Jun OU  Yujian LI  

     
    LETTER-Biocybernetics, Neurocomputing

      Publicized:
    2018/10/18
      Page(s):
    219-222

    Speeding up network layers and reducing the number of network parameters are active research topics for convolutional neural networks (CNNs). In this paper, we propose a novel method, namely, symmetric decomposition of convolution kernels (SDKs). It symmetrically separates k×k convolution kernels into (k×1 and 1×k) or (1×k and k×1) kernels. We conduct comparison experiments with network models designed by SDKs on the MNIST and CIFAR-10 datasets. Compared with the corresponding CNNs, these models obtain good recognition performance, with a 1.1×-1.5× speedup and a more than 30% reduction in network parameters. The experimental results indicate that our method is useful and effective for CNNs in practice, in terms of both speedup and parameter reduction.
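
    The parameter saving from splitting a k×k kernel into k×1 and 1×k kernels can be checked with a short calculation. The sketch below assumes that every layer has the same number of input and output channels and that biases are ignored. These are simplifying assumptions, and the function names are illustrative rather than taken from the paper.

```python
def conv_params(kernel_h, kernel_w, c_in, c_out):
    """Number of weights in one convolution layer, biases ignored."""
    return kernel_h * kernel_w * c_in * c_out


def sdk_reduction(k, channels):
    """Fractional parameter saving when a k x k layer becomes k x 1 followed by 1 x k."""
    full = conv_params(k, k, channels, channels)
    split = conv_params(k, 1, channels, channels) + conv_params(1, k, channels, channels)
    return 1.0 - split / full
```

    Under these assumptions the saving is 1 − 2/k regardless of the channel count: one third for k = 3, which is in line with the "more than 30%" figure in the abstract, and 60% for k = 5.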