IEICE global.ieice.org Site

Keyword Search Result

[Keyword] SI(16314hit)

3401-3420hit(16314hit)

Error Correction Using Long Context Match for Smartphone Speech Recognition
Yuan LIANG Koji IWANO Koichi SHINODA

PAPER-Speech and Hearing

Publicized: 2015/07/31
Vol: E98-D No:11
Page(s): 1932-1942
Most error correction interfaces for speech recognition applications on smartphones require the user to first mark an error region and choose the correct word from a candidate list. We propose a simple multimodal interface to make the process more efficient. We develop Long Context Match (LCM) to get candidates that complement the conventional word confusion network (WCN). Assuming that not only the preceding words but also the succeeding words of the error region are validated by users, we use such contexts to search higher-order n-gram corpora for matching word sequences. For this purpose, we also utilize Web text data. Furthermore, we propose a combination of LCM and WCN (“LCM + WCN”) to provide users with candidate lists that are more relevant than those yielded by WCN alone. We compare our interface with the WCN-based interface on the Corpus of Spontaneous Japanese (CSJ). Our proposed “LCM + WCN” method improved the 1-best accuracy by 23%, improved the Mean Reciprocal Rank (MRR) by 28%, and our interface reduced the user's load by 12%.
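The two ranking metrics named in this abstract, 1-best accuracy and Mean Reciprocal Rank (MRR), can be sketched as follows. This is a minimal illustration of the standard definitions, not the authors' evaluation code; the function names and the toy candidate lists are assumptions made up for the example.

```python
def one_best_accuracy(candidate_lists, references):
    """Fraction of error regions whose top-ranked candidate is the correct word."""
    hits = sum(
        1 for cands, ref in zip(candidate_lists, references)
        if cands and cands[0] == ref
    )
    return hits / len(references)


def mean_reciprocal_rank(candidate_lists, references):
    """Average of 1/rank of the correct word; contributes 0 when it is absent."""
    total = 0.0
    for cands, ref in zip(candidate_lists, references):
        if ref in cands:
            total += 1.0 / (cands.index(ref) + 1)
    return total / len(references)


# Hypothetical candidate lists for three error regions (illustrative data only).
example_lists = [
    ["recognize", "recognise"],   # correct word ranked 1st
    ["beach", "speech"],          # correct word ranked 2nd
    ["a", "b", "c"],              # correct word missing from the list
]
example_refs = ["recognize", "speech", "z"]

accuracy = one_best_accuracy(example_lists, example_refs)
mrr = mean_reciprocal_rank(example_lists, example_refs)
```

MRR rewards a candidate list for placing the correct word near the top even when it is not first, which is why it is reported alongside 1-best accuracy.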
Image Modification Based on a Visual Saliency Map for Guiding Visual Attention
Hironori TAKIMOTO Tatsuhiko KOKUI Hitoshi YAMAUCHI Mitsuyoshi KISHIHARA Kensuke OKUBO

PAPER-Image Recognition, Computer Vision

Publicized: 2015/08/13
Vol: E98-D No:11
Page(s): 1967-1975
It is commonly believed that, to improve interaction between humans and electronic devices, it is effective to draw the viewer's attention to a particular object. Augmented reality (AR) applications can call attention to real objects by overlaying highlight effects or visual stimuli (such as arrows) on a physical scene. Sometimes, more subtle effects would be desirable, in which case it would be necessary to smoothly and naturally guide the user's gaze without external stimuli. Here, a novel image modification method is proposed for directing a viewer's gaze to specific regions of interest. The proposed method uses saliency analysis and color modulation to create modified images in which the region of interest is the most salient region in the entire image. The proposed saliency map model that is used during saliency analysis reduces computational costs and improves the naturalness of the image using the LAB color space and simplified normalization. During color modulation, the modulation value of each LAB component is determined in order to consider the relationship between the LAB components and the saliency value. With the image obtained in this manner, the viewer's attention is smoothly and naturally attracted to a specific region. Gaze measurements as well as subjective experiments were conducted to verify the effectiveness of the proposed method. The results show that a viewer's visual attention is indeed attracted toward the specified region without any sense of discomfort or disruption when the proposed method is used.
HTTP Traffic Classification Based on Hierarchical Signature Structure
Sung-Ho YOON Jun-Sang PARK Ji-Hyeok CHOI Youngjoon WON Myung-Sup KIM

LETTER-Information Network

Publicized: 2015/08/19
Vol: E98-D No:11
Page(s): 1994-1997
Considering diversified HTTP types, the performance bottleneck of signature-based classification must be resolved. We define a signature model classifying the traffic in multiple dimensions and suggest a hierarchical signature structure to remove signature redundancy and minimize search space. Our experiments on campus traffic demonstrated 1.8 times faster processing speed than the Aho-Corasick matching algorithm in Snort.
Blind Image Deblurring Using Weighted Sum of Gaussian Kernels for Point Spread Function Estimation
Hong LIU BenYong LIU

LETTER-Image Processing and Video Processing

Publicized: 2015/08/05
Vol: E98-D No:11
Page(s): 2026-2029
Point spread function (PSF) estimation plays a paramount role in image deblurring, and traditionally it is solved by parameter estimation of a certain preassumed PSF shape model. In real life, the PSF shape is generally arbitrary and complicated. It is therefore assumed in this manuscript that a PSF may be decomposed as a weighted sum of a certain number of Gaussian kernels, with weight coefficients estimated in an alternating manner, and an l1 norm-based total variation (TVl1) algorithm is adopted to recover the latent image. Experiments show that the proposed method can achieve satisfactory performance on synthetic and realistic blurred images.
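The idea of modeling a PSF as a weighted sum of Gaussian kernels can be sketched as below. This is only an illustration under simplifying assumptions: the kernels are isotropic and centered, the weights are given rather than estimated, and the names `gaussian_kernel` and `mixture_psf` are hypothetical. The abstract does not specify the kernel parameterization, and the alternating weight estimation and TVl1 recovery are not shown.

```python
import math


def gaussian_kernel(size, sigma):
    """Normalized isotropic 2-D Gaussian on a size x size grid (size must be odd)."""
    half = size // 2
    kernel = [
        [math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2.0 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]
    total = sum(map(sum, kernel))
    return [[v / total for v in row] for row in kernel]


def mixture_psf(size, sigmas, weights):
    """PSF as a weighted sum of Gaussian kernels, renormalized to sum to 1."""
    kernels = [gaussian_kernel(size, s) for s in sigmas]
    psf = [
        [sum(w * k[y][x] for w, k in zip(weights, kernels)) for x in range(size)]
        for y in range(size)
    ]
    total = sum(map(sum, psf))
    return [[v / total for v in row] for row in psf]


# Illustrative values only: three kernel widths with assumed weights.
psf = mixture_psf(size=9, sigmas=[0.8, 1.5, 3.0], weights=[0.5, 0.3, 0.2])
```

Because a PSF redistributes light without creating or destroying it, the result is renormalized so that its entries sum to one.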
An Efficient and Universal Conical Hypervolume Evolutionary Algorithm in Three or Higher Dimensional Objective Space
Weiqin YING Yuehong XIE Xing XU Yu WU An XU Zhenyu WANG

LETTER-Numerical Analysis and Optimization

Vol: E98-A No:11
Page(s): 2330-2335
The conical area evolutionary algorithm (CAEA) has a very high run-time efficiency for bi-objective optimization, but it cannot tackle problems with more than two objectives. In this letter, a conical hypervolume evolutionary algorithm (CHEA) is proposed to extend the CAEA to a higher dimensional objective space. CHEA partitions objective spaces into a series of conical subregions and retains only one elitist individual for every subregion within a compact elitist archive. Additionally, each offspring needs to be compared only with the elitist individual in the same subregion in terms of the local hypervolume scalar indicator. Experimental results on 5-objective test problems show that CHEA can obtain satisfactory overall performance in both run-time efficiency and solution quality.
Improvement of Colorization-Based Coding Using Optimization by Novel Colorization Matrix Construction and Adaptive Color Conversion
Kazu MISHIBA Takeshi YOSHITOME

PAPER-Image Processing and Video Processing

Publicized: 2015/07/31
Vol: E98-D No:11
Page(s): 1943-1949
This study improves the compression efficiency of Lee's colorization-based coding framework by introducing a novel colorization matrix construction and an adaptive color conversion. Colorization-based coding methods reconstruct color components in the decoder by colorization, which adds color to a base component (a grayscale image) using scant color information. The colorization process can be expressed as a linear combination of a few column vectors of a colorization matrix. Thus it is important for colorization-based coding to make a colorization matrix whose column vectors effectively approximate color components. To make a colorization matrix, Lee's colorization-based coding framework first obtains a base and color components by RGB-YCbCr color conversion, and then performs a segmentation method on the base component. Finally, the entries of a colorization matrix are created using the segmentation results. To improve compression efficiency in this framework, we construct a colorization matrix based on the correlation between the base and color components. Furthermore, we embed an edge-preserving smoothing filtering process into the colorization matrix to reduce artifacts. To achieve further improvement, our method uses adaptive color conversion instead of RGB-YCbCr color conversion. Our proposed color conversion maximizes the sum of the local variance of a base component, which increases the difference in intensities at region boundaries. Since segmentation methods partition images based on this difference, our adaptive color conversion leads to better segmentation results. Experiments showed that our method has higher compression efficiency than the conventional method.
Mixture Hyperplanes Approximation for Global Tracking
Song GU Zheng MA Mei XIE

LETTER-Pattern Recognition

Publicized: 2015/08/13
Vol: E98-D No:11
Page(s): 2008-2012
Template tracking has been extensively studied in Computer Vision with a wide range of applications. A general framework is to construct a parametric model to predict movement and to track the target. The difference in intensity between the pixels belonging to the current region and the pixels of the selected target allows a straightforward prediction of the region position in the current image. Traditional methods track the object based on the assumption that the relationship between the intensity difference and the region position is linear or non-linear. They result in poor tracking performance when just one model is adopted. This paper proposes a method, called Mixture Hyperplanes Approximation, which is based on a finite mixture of generalized linear regression models to perform robust tracking. Moreover, a fast learning strategy is discussed, which improves the robustness against noise. Experiments demonstrate the performance and stability of Mixture Hyperplanes Approximation.
High-Speed and Local-Changes Invariant Image Matching
Chao ZHANG Takuya AKASHI

PAPER-Image Recognition, Computer Vision

Publicized: 2015/08/03
Vol: E98-D No:11
Page(s): 1958-1966
In recent years, many variants of key point based image descriptors have been designed for image matching, and they have achieved remarkable performance. However, for some images, local features appear to be inapplicable. Since these images usually have many local changes around key points compared with a normal image, we define this special image category as the image with local changes (IL). An IL pair (ILP) refers to an image pair which contains a normal image and its IL. An ILP usually loses local visual similarities between the two images while still holding global visual similarity. When an IL is given as a query image, the purpose of this work is to match the corresponding ILP in a large scale image set. As a solution, we use a compressed HOG feature descriptor to extract global visual similarity. For the nearest neighbor search problem, we propose random projection indexed KD-tree forests (rKDFs) to match ILPs efficiently instead of exhaustive linear search. rKDFs are built with large scale low-dimensional KD-trees. Each KD-tree is built in a random projection indexed subspace and contributes to the final result equally through a voting mechanism. We evaluated our method on a benchmark which contains 35,000 candidate images and 5,000 query images. The results show that our method is efficient for solving local-changes invariant image matching problems.
Performance of a Bayesian-Network-Model-Based BCI Using Single-Trial EEGs
Maiko SAKAMOTO Hiromi YAMAGUCHI Toshimasa YAMAZAKI Ken-ichi KAMIJO Takahiro YAMANOI

PAPER-Biocybernetics, Neurocomputing

Publicized: 2015/08/06
Vol: E98-D No:11
Page(s): 1976-1981
We have proposed a new Bayesian network model (BNM) framework for single-trial-EEG-based Brain-Computer Interface (BCI). The BNM was constructed as follows. In order to discriminate between left and right hands to be imagined from single-trial EEGs measured during the movement imagery tasks, the BNM has the following three steps: (1) independent component analysis (ICA) for each of the single-trial EEGs; (2) equivalent current dipole source localization (ECDL) for projections of each IC on the scalp surface; (3) BNM construction using the ECDL results. The BNMs were composed of nodes and edges which correspond to the brain sites where ECDs are located, and their connections, respectively. The connections were quantified as node activities by conditional probabilities calculated by probabilistic inference in each trial. The BNM-based BCI is compared with the common spatial pattern (CSP) method. For ten healthy subjects, there was no significant difference between the two methods. Our BNM might reflect each subject's strategy for task execution.
Privacy-Preserving Decision Tree Learning with Boolean Target Class
Hiroaki KIKUCHI Kouichi ITOH Mebae USHIDA Hiroshi TSUDA Yuji YAMAOKA

PAPER-Cryptography and Information Security

Vol: E98-A No:11
Page(s): 2291-2300
This paper studies a privacy-preserving decision tree learning protocol (PPDT) for vertically partitioned datasets. In vertically partitioned datasets, a single class (target) attribute is shared by both parties or carefully treated by either party in existing studies. The proposed scheme allows both parties to have independent class attributes in a secure way and to combine multiple class attributes with an arbitrary Boolean function, which gives parties some flexibility in data mining. Our proposed PPDT protocol reduces the CPU-intensive computation of logarithms by approximating them with a piecewise linear function defined by light-weight fundamental operations of addition and constant multiplication so that information gain for attributes can be evaluated in a secure function evaluation scheme. Using the UCI Machine Learning dataset and a synthesized dataset, the proposed protocol is evaluated in terms of its accuracy and the sizes of trees.
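The piecewise linear approximation of the logarithm mentioned in this abstract can be illustrated in the clear (without any secure computation) as follows. This is a rough sketch, not the protocol from the paper: the equal-width segmentation, the interval [1, 2], the segment count, and the function names are all assumptions chosen for the example. Each evaluation needs only one multiplication by a precomputed constant and one addition.

```python
import math


def build_segments(lo, hi, n):
    """Precompute (x0, x1, slope, intercept) for n equal-width chords of log2 on [lo, hi]."""
    width = (hi - lo) / n
    edges = [lo + i * width for i in range(n)] + [hi]
    segments = []
    for x0, x1 in zip(edges, edges[1:]):
        slope = (math.log2(x1) - math.log2(x0)) / (x1 - x0)
        intercept = math.log2(x0) - slope * x0
        segments.append((x0, x1, slope, intercept))
    return segments


def piecewise_log2(x, segments):
    """Approximate log2(x) with one constant multiplication and one addition."""
    for x0, x1, slope, intercept in segments:
        if x0 <= x <= x1:
            return slope * x + intercept
    raise ValueError("x is outside the approximated interval")


# Illustrative setup: 8 segments on [1, 2].
segments = build_segments(1.0, 2.0, 8)
max_error = max(
    abs(piecewise_log2(1.0 + k / 1000.0, segments) - math.log2(1.0 + k / 1000.0))
    for k in range(1001)
)
```

Using more segments lowers the approximation error at the cost of a larger lookup table, which is the trade-off that would affect the accuracy of the resulting trees.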
Fractional Pilot Reuse in Massive MIMO System
Chao ZHANG Lu MA

LETTER-Communication Theory and Signals

Vol: E98-A No:11
Page(s): 2356-2359
Pilot contamination is a serious problem that hinders capacity increase in massive MIMO systems. Similar to Fractional Frequency Reuse (FFR) in the OFDMA system, Fractional Pilot Reuse (FPR) is proposed for the massive MIMO system. FPR can be further classified into strict FPR and soft FPR. The detailed FPR schemes with pilot assignment and the mathematical models are also provided. With FPR, the capacity and the transmission quality can be improved, as shown by metrics such as a higher Signal to Interference and Noise Ratio (SINR) of the pilots, a higher coverage probability, and a higher system capacity.
Bridging the Gap between Tenant CMDB and Device Status in Multi-Tenant Datacenter Networking
Yosuke HIMURA Yoshiko YASUDA

PAPER

Vol: E98-B No:11
Page(s): 2132-2140
Multi-tenant datacenter networking, with which multiple customer networks (tenants) are virtualized and consolidated in a single shared physical infrastructure, has recently become a promising approach to reduce device cost, thanks to advances in virtualization technologies for various networking devices (e.g., switches, firewalls, load balancers). Since network devices are configured with low-level commands (no context of tenants), network engineers need to manually manage the context of tenants in different stores such as spreadsheets and/or a configuration management database (CMDB). The use of a CMDB is also effective in increasing the ‘visibility’ of tenant configurations (e.g., information sharing among various teams). However, in contrast to the ideal use, only a limited portion of the network configuration is stored in the CMDB in order to reduce the amount of ‘double configuration management’ between device settings (running information) and the CMDB (stored information). In the present work, we aim to bridge the gap between the CMDB and device status. Our basic approach is to automatically analyze per-device configuration settings to recover per-tenant network-wide configuration (running information) based on a graph-traversal technique applied over an abstracted graph representation of device settings (to handle various types of vendor-specific devices). The recovered running information of per-tenant network configurations is automatically uploaded to the CMDB. An implementation of this methodology is applied to a datacenter environment in which the management of about 100 tenants involves approximately 5,000 CMDB records, and our practical experience is that this methodology makes it possible to double the number of CMDB records. We also discuss possible use cases enabled by this methodology.
Robust ASR Based on ETSI Advanced Front-End Using Complex Speech Analysis
Keita HIGA Keiichi FUNAKI

PAPER

Vol: E98-A No:11
Page(s): 2211-2219
The advanced front-end (AFE) for automatic speech recognition (ASR) was standardized by the European Telecommunications Standards Institute (ETSI). The AFE provides speech enhancement realized by an iterative Wiener filter (IWF) in which a smoothed FFT spectrum over adjacent frames is used to design the filter. We have previously proposed robust time-varying complex Auto-Regressive (TV-CAR) speech analysis for an analytic signal and evaluated its performance in speech processing tasks such as F0 estimation and speech enhancement. TV-CAR analysis can estimate a more accurate spectrum than FFT, especially at low frequencies, because of the nature of the analytic signal. In addition, TV-CAR can estimate a more accurate speech spectrum in the presence of additive noise. In this paper, a time-invariant version of wide-band TV-CAR analysis is introduced to the IWF in the AFE and is evaluated using the CENSREC-2 database and its baseline script.
Beamwidth Scaling in Wireless Networks with Outage Constraints
Trung-Anh DO Won-Yong SHIN

PAPER-Fundamental Theories for Communications

Vol: E98-B No:11
Page(s): 2202-2211
This paper analyzes the impact of directional antennas in improving the transmission capacity, defined as the maximum allowable spatial node density of successful transmissions multiplied by their data rate with a given outage constraint, in wireless networks. We consider the case where the gain $G_m$ for the mainlobe of beamwidth can scale at an arbitrarily large rate. Under the beamwidth scaling model, the transmission capacity is analyzed for all path-loss attenuation regimes for the following two network configurations. In dense networks, in which the spatial node density increases with the antenna gain $G_m$, the transmission capacity scales as $G_m^{4/\alpha}$, where $\alpha$ denotes the path-loss exponent. On the other hand, in extended networks of fixed node density, the transmission capacity scales logarithmically in $G_m$. For comparison, we also show an ideal antenna model where there is no sidelobe beam. In addition, computer simulations are performed, which show trends consistent with our analytical behaviors. Our analysis sheds light on a new understanding of the fundamental limit of outage-constrained ad hoc networks operating in the directional mode.
Food Image Enhancement by Adjusting Intensity and Saturation in RGB Color Space
Chiaki UEDA Minami IBATA Tadahiro AZETSU Noriaki SUETAKE Eiji UCHINO

PAPER

Vol: E98-A No:11
Page(s): 2220-2228
In a food image acquired by a digital camera, the intensity and saturation components are sometimes decreased depending on the illumination environment. In this case, the food image does not look delicious. In general, RGB components are transformed into hue, saturation and intensity components, and then the saturation and intensity components are enhanced so that the food image looks delicious. However, these processes are complex and involve a gamut problem. In this paper, we propose an intensity and saturation enhancement method for food images that preserves the hue in the RGB color space. In this method, the intensity components are first enhanced while avoiding saturation deterioration. Then the saturation components of the regions having hue components that frequently appear in foods are enhanced. In order to illustrate the effectiveness of the proposed method, enhancement experiments using several food images are conducted.
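One well-known way to raise intensity directly in RGB without changing hue is to multiply all three channels by the same factor, clamped so that no channel leaves the valid range. The sketch below shows only that generic idea and is not the method proposed in the paper; the function name, the [0, 1] channel range, and the use of HSV hue from Python's `colorsys` module for checking are assumptions for illustration.

```python
import colorsys


def scale_intensity_preserving_hue(rgb, gain):
    """Scale R, G, B by a common factor, limited so the largest channel stays <= 1."""
    r, g, b = rgb
    peak = max(r, g, b)
    if peak == 0.0:
        return rgb  # pure black has no hue to preserve
    k = min(gain, 1.0 / peak)  # clamp the gain to stay inside the RGB gamut
    return (r * k, g * k, b * k)


# Illustrative pixel with channels in [0, 1].
pixel = (0.4, 0.2, 0.1)
brighter = scale_intensity_preserving_hue(pixel, 1.5)
clamped = scale_intensity_preserving_hue(pixel, 10.0)

hue_before = colorsys.rgb_to_hsv(*pixel)[0]
hue_after = colorsys.rgb_to_hsv(*brighter)[0]
```

Because the channel ratios are unchanged, the HSV hue of the scaled pixel matches that of the original, and clamping the gain sidesteps the out-of-gamut values that a round trip through a separate hue-saturation-intensity space can produce.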
Ensemble and Multiple Kernel Regressors: Which Is Better?
Akira TANAKA Hirofumi TAKEBAYASHI Ichigaku TAKIGAWA Hideyuki IMAI Mineichi KUDO

PAPER-Neural Networks and Bioengineering

Vol: E98-A No:11
Page(s): 2315-2324
For the last few decades, learning with multiple kernels, represented by the ensemble kernel regressor and the multiple kernel regressor, has attracted much attention in the field of kernel-based machine learning. Although their efficacy has been investigated numerically in many works, their theoretical grounds have not been investigated sufficiently, since we do not have a theoretical framework to evaluate them. In this paper, we introduce a unified framework for evaluating kernel regressors with multiple kernels. On the basis of the framework, we analyze the generalization errors of the ensemble kernel regressor and the multiple kernel regressor, and give a sufficient condition for the ensemble kernel regressor to outperform the multiple kernel regressor in terms of the generalization error in the noise-free case. We also show, by giving examples, that each kernel regressor can be better than the other when the sufficient condition does not hold, which supports the importance of the sufficient condition.
Multi-Focus Image Fusion Based on Multiple Directional LOTs
Zhiyu CHEN Shogo MURAMATSU

LETTER-Image

Vol: E98-A No:11
Page(s): 2360-2365
This letter proposes an image fusion method which adopts a union of multiple directional lapped orthogonal transforms (DirLOTs). DirLOTs are used to generate symmetric orthogonal discrete wavelet transforms and then to construct a union of unitary transforms as a redundant dictionary with a multiple directional property. The multiple DirLOTs can overcome a disadvantage of separable wavelets in representing images which contain slant textures and edges. We analyse the characteristic of local luminance contrast, and propose a fusion rule based on the interscale relation of wavelet coefficients. Relying on the above, a novel image fusion method is proposed. Some experimental results show that the proposed method is able to significantly improve the fusion performance over that of the conventional discrete wavelet transforms.
An Encryption-then-Compression System for JPEG/Motion JPEG Standard
Kenta KURIHARA Masanori KIKUCHI Shoko IMAIZUMI Sayaka SHIOTA Hitoshi KIYA

PAPER

Vol: E98-A No:11
Page(s): 2238-2245
In many multimedia applications, image encryption has to be conducted prior to image compression. This paper proposes a JPEG-friendly perceptual encryption method that can be applied prior to JPEG and Motion JPEG compression. The proposed encryption scheme provides approximately the same compression performance as JPEG compression without any encryption, for both grayscale and color images. It is also shown that the proposed scheme, which consists of four block-based encryption steps, provides a reasonably high level of security. Most conventional perceptual encryption schemes have not been designed for international compression standards, but this paper focuses on the JPEG and Motion JPEG standards, which are among the most widely used image compression standards. In addition, this paper considers an efficient key management scheme, which makes the keys easy to manage when multiple keys are used for encryption.
Distributed Utility Maximization with Backward Physical Signaling in Interference-Limited Wireless Systems
Hye J. KANG Chung G. KANG

PAPER-Network

Vol: E98-B No:10
Page(s): 2033-2039
In this paper, we consider a distributed power control scheme that can maximize overall capacity of an interference-limited wireless system in which the same radio resource is spatially reused among different transmitter-receiver pairs. This power control scheme employs a gradient-descent method in each transmitter, which adapts its own transmit power to co-channel interference dynamically to maximize the total weighted sum rate (WSR) of the system over a given interval. The key contribution in this paper is to propose a common feedback channel, over which a backward physical signal is accumulated for computing the gradient of the transmit power in each transmitter, thereby significantly reducing signaling overhead for the distributed power control. We show that the proposed power control scheme can achieve almost 95% of its theoretical upper WSR bound, while outperforming the non-power-controlled system by roughly 63% on average.
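The gradient-based adaptation of transmit power toward a higher weighted sum rate (WSR) can be sketched as follows. This toy version computes a finite-difference gradient centrally from a known gain matrix, so it does not model the backward physical signaling or the common feedback channel that the paper proposes; the gain values, weights, noise level, step size, and function names are all assumptions for illustration.

```python
import math


def weighted_sum_rate(power, gain, weights, noise):
    """WSR in bit/s/Hz; gain[j][i] is the link gain from transmitter j to receiver i."""
    n = len(power)
    total = 0.0
    for i in range(n):
        interference = sum(gain[j][i] * power[j] for j in range(n) if j != i)
        sinr = gain[i][i] * power[i] / (noise + interference)
        total += weights[i] * math.log2(1.0 + sinr)
    return total


def gradient_power_control(power0, gain, weights, noise, p_max,
                           step=0.05, iters=200, eps=1e-6):
    """Projected gradient ascent on the WSR; returns the best power vector seen."""
    power = list(power0)
    best_power = list(power)
    best_value = weighted_sum_rate(power, gain, weights, noise)
    for _ in range(iters):
        base = weighted_sum_rate(power, gain, weights, noise)
        grad = []
        for i in range(len(power)):
            probe = list(power)
            probe[i] += eps  # forward finite difference for transmitter i
            grad.append((weighted_sum_rate(probe, gain, weights, noise) - base) / eps)
        # Move along the gradient, then project each power back into [0, p_max].
        power = [min(p_max, max(0.0, p + step * g)) for p, g in zip(power, grad)]
        value = weighted_sum_rate(power, gain, weights, noise)
        if value > best_value:
            best_power, best_value = list(power), value
    return best_power, best_value


# Illustrative two-link system with weak cross-interference.
gain = [[1.0, 0.1], [0.2, 1.0]]
weights = [1.0, 1.0]
initial_power = [0.1, 0.1]
initial_wsr = weighted_sum_rate(initial_power, gain, weights, 0.1)
final_power, final_wsr = gradient_power_control(initial_power, gain, weights, 0.1, p_max=1.0)
```

In a distributed setting each transmitter would need its own gradient component without knowing the full gain matrix, which is the signaling problem the abstract's feedback channel is meant to address.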
An Improved Platform for Multi-Agent Based Stock Market Simulation in Distributed Environment
Ce YU Xiang CHEN Chunyu WANG Hutong WU Jizhou SUN Yuelei LI Xiaotao ZHANG

PAPER-Fundamentals of Information Systems

Publicized: 2015/06/25
Vol: E98-D No:10
Page(s): 1727-1735
Multi-agent based simulation has been widely used in behavioral finance, and several single-process simulation platforms with Agent-Based Modeling (ABM) have been proposed. However, traditional simulations of stock markets on single-processor computers are limited by the computing capability, since financial researchers need ever larger numbers of agents and ever more rounds to evolve agents' intelligence and obtain more useful data. This paper introduces a distributed multi-agent simulation platform, named PSSPAM, for stock market simulation, focusing on large-scale parallel agents, the communication system and simulation scheduling. A logical architecture for distributed artificial stock market simulation is proposed, containing four loosely coupled modules: agent module, market module, communication system and user interface. With customizable trading strategies inside, agents are deployed to multiple computing nodes. Agents exchange messages with each other and with the market based on a customizable network topology through a uniform communication system. With a large number of agent threads, a round scheduling strategy is used during the simulation, and a worker pool is applied in the market module. Financial researchers can design their own financial models and run the simulation through the user interface, without caring about the complexity of parallelization and related problems. Two groups of experiments are conducted, one with internal communication between agents and the other without communication between agents, to verify that PSSPAM is compatible with the data from Euronext-NYSE. The platform also shows fair scalability and performance under different parallelism configurations.
