
Keyword Search Result

[Keyword] CRI (505 hits)

Showing 61-80 of 505 hits

  • Infants' Pain Recognition Based on Facial Expression: Dynamic Hybrid Descriptions

    Ruicong ZHI  Ghada ZAMZMI  Dmitry GOLDGOF  Terri ASHMEADE  Tingting LI  Yu SUN  

     
    PAPER-Artificial Intelligence, Data Mining

  Publicized:
    2018/04/20
      Vol:
    E101-D No:7
      Page(s):
    1860-1869

    The accurate assessment of infants' pain is important for understanding their medical conditions and developing suitable treatment. Pediatric studies have reported that inadequate treatment of infants' pain can cause various neuroanatomical and psychological problems. Because infants cannot communicate verbally, there is growing interest in automatic pain assessment systems that provide continuous and accurate assessment. In this paper, we propose a new set of pain facial activity features to describe infants' facial expressions of pain. Both dynamic facial texture features and dynamic geometric features are extracted from video sequences and used to classify infants' facial expressions as pain or no pain. For the dynamic analysis of facial expression, we construct a spatiotemporal representation for the texture features and a time-series representation (i.e., a time series of frame-level features) for the geometric features. Multiple facial features are combined through both feature fusion and decision fusion schemes to evaluate their effectiveness in infants' pain assessment. Experiments are conducted on videos acquired from NICU infants, and the best accuracy of the proposed pain assessment approaches is 95.6%. Moreover, we find that although decision fusion does not achieve higher accuracy than feature fusion, its False Negative Rate (6.2%) is much lower than that of feature fusion (25%).
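
    As a rough illustration of the difference between the two fusion schemes above, the following sketch uses deliberately simple nearest-class-mean classifiers; the paper's actual classifiers, features, and weights are not specified in the abstract, so all names and parameters here are hypothetical.

        import numpy as np

        def nearest_mean_score(x, mean_pain, mean_no_pain):
            # Positive score favors "pain": x is closer to the pain-class mean.
            return np.linalg.norm(x - mean_no_pain) - np.linalg.norm(x - mean_pain)

        def feature_fusion(tex, geo, means):
            # Concatenate dynamic texture and geometric features, classify once.
            x = np.concatenate([tex, geo])
            return nearest_mean_score(x, *means["joint"]) > 0

        def decision_fusion(tex, geo, means, w=(0.5, 0.5)):
            # Classify each feature stream separately, then combine the decisions.
            s = (w[0] * nearest_mean_score(tex, *means["texture"])
                 + w[1] * nearest_mean_score(geo, *means["geometric"]))
            return s > 0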

  • Effects of Automated Transcripts on Non-Native Speakers' Listening Comprehension

    Xun CAO  Naomi YAMASHITA  Toru ISHIDA  

     
    PAPER-Human-computer Interaction

  Publicized:
    2017/11/24
      Vol:
    E101-D No:3
      Page(s):
    730-739

    Previous research has shown that transcripts generated by automatic speech recognition (ASR) technologies can improve the listening comprehension of non-native speakers (NNSs). However, we still lack a detailed understanding of how ASR transcripts affect NNSs' listening comprehension. To explore this issue, we conducted two studies. The first study examined how the current presentation of ASR transcripts affected NNSs' listening comprehension. Twenty NNSs engaged in two listening tasks, each under a different condition: C1) audio only and C2) audio plus ASR transcripts. The participants pressed a button whenever they encountered a comprehension problem and explained each problem in subsequent interviews. From our data analysis, we found that NNSs adopted different strategies when using the ASR transcripts: some followed the transcripts throughout, while others checked them only when necessary. NNSs also appeared to have difficulty following imperfect and slightly delayed transcripts while listening to speech; many reported difficulty concentrating on listening/reading or shifting between the two. The second study explored how different display methods for ASR transcripts affected NNSs' listening experiences. We focused on two display methods: 1) an accuracy-oriented display, which shows transcripts only after the speech input analysis is complete, and 2) a speed-oriented display, which shows the interim analysis results of the speech input. We conducted a laboratory experiment with 22 NNSs who engaged in two listening tasks with ASR transcripts presented via the two display methods. We found that the more the NNSs paid attention to listening to the audio, the more they tended to prefer the speed-oriented transcripts, and vice versa. Mismatched transcripts were found to have negative effects on NNSs' listening comprehension. Our findings have implications for improving the presentation of ASR transcripts to more effectively support NNSs.

  • ArchHDL: A Novel Hardware RTL Modeling and High-Speed Simulation Environment

    Shimpei SATO  Ryohei KOBAYASHI  Kenji KISE  

     
    PAPER-Design Methodology and Platform

  Publicized:
    2017/11/17
      Vol:
    E101-D No:2
      Page(s):
    344-353

    LSIs are generally designed through four stages: architectural design, logic design, circuit design, and physical design. In architectural design and logic design, designers describe their target hardware at RTL. However, they generally use different languages for each phase: typically, a general-purpose programming language such as C or C++ for architectural design and a hardware description language such as Verilog HDL or VHDL for logic design. This is a time-consuming way to design hardware, and a more efficient design environment is required. In this paper, we propose a new hardware modeling and high-speed simulation environment for architectural design and logic design. Our environment enables hardware to be written and verified in a single language. It consists of (1) a new hardware description language called ArchHDL, whose simulation is faster than Verilog HDL simulation, and (2) a source code translation tool from ArchHDL code to Verilog HDL code. ArchHDL is a new language for hardware RTL modeling based on C++. The key features of this language are that (1) designers describe a combinational circuit as a function and (2) the ArchHDL library realizes non-blocking assignment in C++. Using these features, designers can write hardware seamlessly, from abstract-level descriptions down to RTL descriptions, in a Verilog HDL-like style. Source code in ArchHDL is converted to Verilog HDL code by the translation tool and used for synthesis targeting FPGAs or ASICs. To evaluate our environment, we implemented a practical many-core processor in ArchHDL and measured the simulation speed on an Intel CPU and an Intel Xeon Phi processor. On the Intel CPU, ArchHDL simulation runs about 4.5 times faster than Synopsys VCS simulation. We also confirmed that RTL simulation with ArchHDL is efficiently parallelized on the Intel Xeon Phi processor. Finally, we converted the ArchHDL code to Verilog HDL code and estimated the hardware utilization on an FPGA: implementing a 48-node many-core processor consumes 71% of the resources of a Virtex-7 FPGA.
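
    ArchHDL itself is a C++ library; the sketch below only illustrates its two key ideas in a language-neutral way: a combinational circuit written as a plain function of register values, and non-blocking assignment emulated by a two-phase update in which every register latches its next value simultaneously at the clock edge.

        class Reg:
            def __init__(self, value=0):
                self.q = value          # current (visible) value
                self.d = value          # next value, written with <= semantics
            def assign(self, value):    # non-blocking: takes effect at the clock
                self.d = value

        def clock(regs):
            for r in regs:              # all registers update together, as in HDL
                r.q = r.d

        # A 2-bit counter: combinational logic is a function of register outputs.
        count = Reg(0)
        def comb():
            count.assign((count.q + 1) % 4)

        for _ in range(5):
            comb()
            clock([count])
            print(count.q)              # 1, 2, 3, 0, 1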

  • A Describing Method of an Image Processing Software in C for a High-Level Synthesis Considering a Function Chaining

    Akira YAMAWAKI  Seiichi SERIKAWA  

     
    PAPER-Design Methodology and Platform

  Publicized:
    2017/11/17
      Vol:
    E101-D No:2
      Page(s):
    324-334

    This paper presents a method for describing image processing software in C for high-level synthesis (HLS) that considers function chaining to realize efficient hardware. Sophisticated image processing is typically built as a sequence of primitives represented as sub-functions, such as gray scaling, filtering, binarization, and thinning. Conventional work has shown generic description methods for individual sub-functions so that HLS technology can generate efficient hardware modules. However, few studies have focused on a systematic method for describing the single top function that chains these sub-functions together. With the proposed method, any number of sub-functions can be chained while maintaining the pipeline structure. Thus, the image processing can achieve near-ideal performance of 1 pixel per clock even when the processing chain is long. In addition, deadlock caused by a mismatch between the numbers of pushes and pops on the FIFOs connecting the functions is implicitly eliminated, and interpolation of the border pixels is performed. A case study on Canny edge detection, which chains several sub-functions, demonstrates that our proposal can easily realize the expected hardware. Experimental results on a ZYNQ FPGA show that our description can be converted to pipelined hardware of moderate size and achieves a performance gain of more than 70 times over software execution. Moreover, the C program restructured according to the proposed method shows a small performance degradation of 8% compared with the original C software in a comparative evaluation performed on the Cortex-A9 embedded processor in the ZYNQ FPGA. This indicates that our describing method makes it possible to establish a unified image processing library of HLS-ready software that can be executed either on a CPU or as a hardware module for HW/SW co-design.
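
    The paper's method is written in C for an HLS tool; the following sketch only mimics the structure of function chaining: each sub-function consumes and produces a pixel stream, and chained generators play the role of the FIFOs between hardware stages. The stages and constants are illustrative.

        def grayscale(pixels):
            # Stage 1: one RGB pixel in, one gray pixel out per step.
            for r, g, b in pixels:
                yield (299 * r + 587 * g + 114 * b) // 1000

        def threshold(stream, t=128):
            # Stage 2: binarization, still 1 pixel in / 1 pixel out per step.
            for p in stream:
                yield 1 if p >= t else 0

        def pipeline(pixels):
            # Any number of stages can be chained while keeping the
            # streaming (pipelined) structure intact.
            return threshold(grayscale(pixels))

        print(list(pipeline([(255, 255, 255), (0, 0, 0), (200, 10, 10)])))  # [1, 0, 0]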

  • A Simple and Effective Generalization of Exponential Matrix Discriminant Analysis and Its Application to Face Recognition

    Ruisheng RAN  Bin FANG  Xuegang WU  Shougui ZHANG  

     
    LETTER-Pattern Recognition

  Publicized:
    2017/10/18
      Vol:
    E101-D No:1
      Page(s):
    265-268

    Exponential discriminant analysis (EDA) is an effective method that has been proposed and widely used to solve the so-called small-sample-size (SSS) problem. In this paper, a simple and effective generalization of EDA, named GEDA, is presented. GEDA uses a general exponential function whose base is larger than Euler's number e. Owing to this property, the distance between samples belonging to different classes becomes larger than in EDA, so the discriminatory power is further emphasized. Experimental results on the Extended Yale and CMU-PIE face databases show that GEDA achieves better recognition performance than EDA.
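
    A minimal sketch of the core idea, assuming the usual discriminant-analysis setup with between- and within-class scatter matrices Sb and Sw: GEDA replaces EDA's matrix exponential exp(S) with a general base a > e via the identity a^S = expm(ln(a) * S). The base value and the eigen-solver below are illustrative choices.

        import numpy as np
        from scipy.linalg import expm, eigh

        def geda_projection(Sb, Sw, a=4.0, dim=2):
            """Discriminant projection from scatter matrices using base-a
            matrix exponentials; a = e recovers ordinary EDA."""
            Eb = expm(np.log(a) * Sb)   # a^Sb: expands between-class distances
            Ew = expm(np.log(a) * Sw)   # a^Sw: always full rank, avoiding SSS
            # Generalized eigenproblem Eb v = lambda Ew v; keep leading vectors.
            vals, vecs = eigh(Eb, Ew)
            return vecs[:, np.argsort(vals)[::-1][:dim]]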

  • Robust Sparse Signal Recovery in Impulsive Noise Using Bayesian Methods

    Jinyang SONG  Feng SHEN  Xiaobo CHEN  Di ZHAO  

     
    LETTER-Digital Signal Processing

      Vol:
    E101-A No:1
      Page(s):
    273-278

    In this letter, robust sparse signal recovery is considered in the presence of heavy-tailed impulsive noise. Two Bayesian approaches are developed within a framework that models the noise with the Laplace distribution. By rewriting the noise-fitting term as a reweighted quadratic function optimized in the sparse signal space, the Type I Maximum A Posteriori (MAP) approach is proposed. Next, by exploiting the hierarchical structure of the sparse prior and the likelihood function, we develop the Type II Evidence Maximization approach, optimized in the hyperparameter space. Numerical results verify the effectiveness of the proposed methods in the presence of impulsive noise.
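
    The Type I step can be read as an iteratively reweighted least-squares loop; the sketch below is one standard way to realize a "reweighted quadratic" noise-fitting term under a Laplace noise model. The weight lam, the smoothing constant eps, and the plain l1 prior are illustrative assumptions, not the letter's exact formulation.

        import numpy as np

        def laplace_map_irls(A, y, lam=0.1, eps=1e-4, iters=50):
            # Start from the least-squares solution so reweighting is well defined.
            x = np.linalg.lstsq(A, y, rcond=None)[0]
            for _ in range(iters):
                r = y - A @ x
                # Laplace noise term ||y - Ax||_1 as a reweighted quadratic:
                # |r_i| = r_i^2 / |r_i|, giving per-sample weights 1 / |r_i|.
                W = np.diag(1.0 / np.maximum(np.abs(r), eps))
                # The sparse (l1) prior on x is reweighted the same way.
                D = np.diag(1.0 / np.maximum(np.abs(x), eps))
                x = np.linalg.solve(A.T @ W @ A + lam * D, A.T @ W @ y)
            return x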

  • Tolerance Evaluation of Audio Watermarking Method Based on Modification of Sound Pressure Level between Channels

    Harumi MURATA  Akio OGIHARA  Shigetoshi HAYASHI  

     
    LETTER

  Publicized:
    2017/10/16
      Vol:
    E101-D No:1
      Page(s):
    68-71

    We have proposed an audio watermarking method based on modifying the sound pressure level between channels. The method exploits the invariability of sound localization under sound processing such as MP3 compression, and the imperceptibility of slight changes in sound localization. In this paper, we evaluate the method's tolerance against various attacks with reference to the IHC criteria.
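
    The abstract does not spell out the embedding rule, so the following is only a hypothetical sketch of the general idea named in the title: each frame encodes one bit by slightly tilting the relative sound pressure level of the left and right channels, which barely moves the perceived localization. The frame size and delta are illustrative values.

        import numpy as np

        def embed_bits(stereo, bits, frame=4096, delta=1.02):
            out = stereo.astype(float).copy()        # shape: (n_samples, 2)
            for i, b in enumerate(bits):
                s = slice(i * frame, (i + 1) * frame)
                if b:
                    out[s, 0] *= delta               # tilt level toward left
                    out[s, 1] /= delta
                else:
                    out[s, 0] /= delta               # tilt level toward right
                    out[s, 1] *= delta
            return out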

  • HMM-Based Maximum Likelihood Frame Alignment for Voice Conversion from a Nonparallel Corpus

    Ki-Seung LEE  

     
    LETTER-Speech and Hearing

  Publicized:
    2017/08/23
      Vol:
    E100-D No:12
      Page(s):
    3064-3067

    One of the problems associated with voice conversion from a nonparallel corpus is how to find the best match or alignment between the source and target vector sequences without linguistic information. In a previous study, alignment was achieved by minimizing the distance between the source vector and the transformed vector. This method, however, yielded a sequence of feature vectors that was not well matched to the underlying speaker model. In this letter, the vectors are selected from the candidates by maximizing the overall likelihood of the selected vectors with respect to the target model in the HMM context. Both objective and subjective evaluations were carried out using the CMU ARCTIC database to verify the effectiveness of the proposed method.
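
    A hedged sketch of the selection step described above: given per-frame candidate target vectors, their emission log-likelihoods under the target model, and HMM transition scores, a Viterbi pass picks the candidate sequence with maximum overall likelihood. The interface (log_emit, log_trans) is an assumption for illustration.

        import numpy as np

        def select_frames(log_emit, log_trans):
            """log_emit: (T, K) candidate emission log-likelihoods;
            log_trans: (K, K) transition log-probs between candidates.
            Returns the index of the chosen candidate per frame."""
            T, K = log_emit.shape
            score = log_emit[0].copy()
            back = np.zeros((T, K), dtype=int)
            for t in range(1, T):
                tmp = score[:, None] + log_trans     # (prev K, next K)
                back[t] = tmp.argmax(axis=0)
                score = tmp.max(axis=0) + log_emit[t]
            path = [int(score.argmax())]
            for t in range(T - 1, 0, -1):            # backtrack best sequence
                path.append(int(back[t, path[-1]]))
            return path[::-1]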

  • Identification and Application of Invariant Critical Paths under NBTI Degradation

    Song BIAN  Shumpei MORITA  Michihiro SHINTANI  Hiromitsu AWANO  Masayuki HIROMOTO  Takashi SATO  

     
    PAPER

      Vol:
    E100-A No:12
      Page(s):
    2797-2806

    As technology scales semiconductor devices further, aging-induced device degradation has become one of the major threats to device reliability. Moreover, aging mechanisms such as negative bias temperature instability (NBTI) are known to be sensitive to workload (i.e., signal probability), which is hard to predict at design time. In this work, we analyze the workload dependence of NBTI degradation using a processor and propose a novel technique to estimate the worst-case paths. Our approach exploits the fact that the deterministic nature of the circuit structure limits the amount of NBTI degradation on different paths, and we propose a two-stage path extraction algorithm to identify the invariant critical paths (ICPs) of the processor. Utilizing these paths, we also propose an optimization technique that mitigates NBTI degradation by replacing the internal node control logic in the design. In numerical experiments on two processor designs, we achieved a nearly 300x reduction in the sheer number of paths for both designs; utilizing the extracted ICPs, we achieved a 96x-197x speedup without loss in mitigation gain.

  • GOCD: Gradient Order Curve Descriptor

    Hongmin LIU  Lulu CHEN  Zhiheng WANG  Zhanqiang HUO  

     
    PAPER-Image Recognition, Computer Vision

  Publicized:
    2017/09/15
      Vol:
    E100-D No:12
      Page(s):
    2973-2983

    In this paper, the concept of gradient order is introduced and a novel gradient order curve descriptor (GOCD) for curve matching is proposed. The GOCD is constructed in the following main steps: first, a curve support region independent of the dominant orientation is determined and divided into several sub-regions based on gradient magnitude order; then, a gradient order feature (GOF) is generated for each feature point by encoding the local gradient information of the sample points; finally, the descriptor is obtained from the description matrix of the GOFs. Since GOCD captures both local and global gradient information, it is more distinctive and robust than existing curve matching methods. Experiments under various changes, such as illumination, viewpoint, image rotation, JPEG compression, and noise, show the strong performance of GOCD. Furthermore, an application to image mosaicing proves that GOCD can be used successfully in practice.
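
    A minimal sketch of the gradient-magnitude-order pooling idea, under the assumption that mag and ori hold the gradient magnitude and orientation (in [0, 2*pi)) of the sample points in a curve's support region; the sub-region and bin counts are illustrative, not the paper's exact construction.

        import numpy as np

        def gradient_order_feature(mag, ori, n_regions=4, n_bins=8):
            order = np.argsort(mag)                   # rank points by magnitude
            groups = np.array_split(order, n_regions) # magnitude-order sub-regions
            feat = []
            for g in groups:
                # Magnitude-weighted orientation histogram per sub-region.
                h, _ = np.histogram(ori[g], bins=n_bins, range=(0, 2 * np.pi),
                                    weights=mag[g])
                feat.append(h / (np.linalg.norm(h) + 1e-12))
            return np.concatenate(feat)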

  • Weighted Voting of Discriminative Regions for Face Recognition

    Wenming YANG  Riqiang GAO  Qingmin LIAO  

     
    LETTER-Image Recognition, Computer Vision

  Publicized:
    2017/08/04
      Vol:
    E100-D No:11
      Page(s):
    2734-2737

    This paper presents a strategy, Weighted Voting of Discriminative Regions (WVDR), to improve face recognition performance, especially in Small Sample Size (SSS) and occlusion situations. In WVDR, we extract discriminative regions according to facial key points and abandon the remaining parts. Considering that different regions of the face contribute differently to recognition, we assign weights to the regions for weighted voting. We construct a decision dictionary from the recognition results of the selected regions in the training phase, and this dictionary is used in a self-defined loss function to obtain the weights. The final identity of a test sample is decided by weighted voting over the selected regions. In this paper, we combine the WVDR strategy with CRC and SRC separately, and extensive experiments show that our method outperforms the baseline and some representative algorithms.
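
    A minimal sketch of the weighted-voting step, assuming each selected region has already produced a class prediction and a learned weight; the decision-dictionary and loss-function machinery that produce the weights are beyond what the abstract specifies.

        import numpy as np

        def weighted_vote(region_preds, region_weights, n_classes):
            """region_preds: predicted class id per region;
            region_weights: learned reliability of each region.
            Returns the winning class id."""
            scores = np.zeros(n_classes)
            for pred, w in zip(region_preds, region_weights):
                scores[pred] += w                  # each region casts a weighted vote
            return int(scores.argmax())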

  • AIGIF: Adaptively Integrated Gradient and Intensity Feature for Robust and Low-Dimensional Description of Local Keypoint

    Songlin DU  Takeshi IKENAGA  

     
    PAPER-Vision

      Vol:
    E100-A No:11
      Page(s):
    2275-2284

    Establishing local visual correspondences between images taken under different conditions is an important and challenging task in computer vision. A common solution is to detect keypoints in the images and then match them with a feature descriptor. This paper proposes a robust and low-dimensional local feature descriptor named Adaptively Integrated Gradient and Intensity Feature (AIGIF). The proposed AIGIF descriptor partitions the support region surrounding each keypoint into sub-regions and classifies each sub-region into one of two categories: edge-dominated or smoothness-dominated. For edge-dominated sub-regions, gradient magnitude and orientation features are extracted; for smoothness-dominated sub-regions, an intensity feature is extracted. The gradient and intensity features are integrated to generate the descriptor. Image matching experiments were conducted to evaluate the performance of the proposed AIGIF. Compared with SIFT, AIGIF reduces the feature dimension by 75% (from 128 bytes to 32 bytes); compared with SURF, it reduces the feature dimension by 87.5% (from 256 bytes to 32 bytes); and compared with the state-of-the-art ORB descriptor, which has the same feature dimension as AIGIF, it achieves higher accuracy and robustness. In summary, AIGIF combines the advantages of gradient and intensity features, achieving relatively high accuracy and robustness with a low feature dimension.
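
    A hedged sketch of the per-sub-region adaptive choice: an edge-dominated patch contributes a gradient-orientation histogram, while a smooth one contributes an intensity histogram. The edge test, threshold tau, and bin counts are illustrative assumptions, not the paper's exact parameters.

        import numpy as np

        def aigif_subregion(patch, tau=10.0, n_bins=8):
            gy, gx = np.gradient(patch.astype(float))
            mag = np.hypot(gx, gy)
            if mag.mean() > tau:                   # edge-dominated sub-region
                ori = np.arctan2(gy, gx)
                h, _ = np.histogram(ori, bins=n_bins, range=(-np.pi, np.pi),
                                    weights=mag)
            else:                                  # smoothness-dominated sub-region
                h, _ = np.histogram(patch, bins=n_bins, range=(0, 255))
            return h / (np.linalg.norm(h) + 1e-12)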

  • Ground Plane Detection with a New Local Disparity Texture Descriptor

    Kangru WANG  Lei QU  Lili CHEN  Jiamao LI  Yuzhang GU  Dongchen ZHU  Xiaolin ZHANG  

     
    LETTER-Pattern Recognition

  Publicized:
    2017/06/27
      Vol:
    E100-D No:10
      Page(s):
    2664-2668

    In this paper, a novel approach is proposed for stereo vision-based ground plane detection at the superpixel level, implemented by employing a Disparity Texture Map in a convolutional neural network architecture. In particular, the Disparity Texture Map is calculated with a new Local Disparity Texture Descriptor (LDTD). The experimental results demonstrate superior performance on the KITTI dataset.

  • Image Retrieval Framework Based on Dual Representation Descriptor

    Yuichi YOSHIDA  Tsuyoshi TOYOFUKU  

     
    PAPER-Image Processing and Video Processing

  Publicized:
    2017/07/06
      Vol:
    E100-D No:10
      Page(s):
    2605-2613

    Descriptor aggregation techniques such as the Fisher vector and the vector of locally aggregated descriptors (VLAD) are used in most image retrieval frameworks. Extracting local descriptors takes time, and geometric verification requires considerable storage if a real-valued descriptor such as SIFT is used. Moreover, applying binary descriptors to such a framework yields worse retrieval performance than using real-valued descriptors. Our approach tackles these issues with a dual representation descriptor that has the advantages of both real-valued and binary descriptors. The real-valued part of the dual representation descriptor is aggregated into a VLAD to achieve high retrieval accuracy, while the binary part is used to find correspondences in the geometric verification stage to reduce the required storage. We implemented a dual representation descriptor that can be extracted in semi-real time by using the CARD descriptor. We evaluated the accuracy of our image retrieval framework, including geometric verification, on three datasets (Holidays, UKBench, and Stanford Mobile Visual Search). The results indicate that our framework is as accurate as one that uses SIFT. In addition, the experiments show that the image retrieval speed and storage requirements of our framework are as efficient as those of a framework that uses ORB.
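
    For reference, a minimal sketch of the VLAD aggregation used for the real-valued part of the descriptor: residuals of local descriptors to their nearest codeword are accumulated per codeword, then power- and l2-normalized. The normalization details here are common practice, not necessarily those of the paper.

        import numpy as np

        def vlad(descriptors, codebook):
            """descriptors: (N, D) real-valued local descriptors;
            codebook: (K, D) cluster centers. Returns a K*D VLAD vector."""
            K, D = codebook.shape
            # Nearest center: argmin ||x - c||^2 = argmax (x.c - 0.5||c||^2).
            assign = np.argmax(descriptors @ codebook.T
                               - 0.5 * (codebook ** 2).sum(1), axis=1)
            v = np.zeros((K, D))
            for k in range(K):
                sel = descriptors[assign == k]
                if len(sel):
                    v[k] = (sel - codebook[k]).sum(0)   # accumulate residuals
            v = np.sign(v) * np.sqrt(np.abs(v))         # power normalization
            v = v.ravel()
            return v / (np.linalg.norm(v) + 1e-12)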

  • Visualizing Web Images Using Fisher Discriminant Locality Preserving Canonical Correlation Analysis

    Kohei TATENO  Takahiro OGAWA  Miki HASEYAMA  

     
    PAPER

  Publicized:
    2017/06/14
      Vol:
    E100-D No:9
      Page(s):
    2005-2016

    A novel dimensionality reduction method for visualizing Web images, Fisher Discriminant Locality Preserving Canonical Correlation Analysis (FDLP-CCA), is presented in this paper. FDLP-CCA can integrate two modalities and discriminate target items in terms of their semantics by considering the unique characteristics of the two modalities. In this paper, we focus on Web images with text uploaded to social networking services as these two modalities. Specifically, text features have high discriminative power in terms of semantics, while visual features of images capture their perceptual relationships. To exploit both of these characteristics, FDLP-CCA estimates the correlation between the text and visual features while taking into account the cluster structure based on the text features and the local structures based on the visual features. Thus, FDLP-CCA can integrate the different modalities and provide separated manifolds that organize enhanced compactness within each natural cluster.

  • Incorporating Security Constraints into Mixed-Criticality Real-Time Scheduling

    Hyeongboo BAEK  Jinkyu LEE  

     
    PAPER-Software System

  Publicized:
    2017/05/31
      Vol:
    E100-D No:9
      Page(s):
    2068-2080

    While conventional studies on real-time systems have mostly considered only the real-time constraint, recent research initiatives are trying to incorporate a security constraint into real-time scheduling, recognizing that violating either of the two constraints can cause catastrophic losses for humans, the system, and even the environment. The focus of most studies, however, has been single-criticality systems; the security of mixed-criticality systems has received scant attention, even though security is also a critical design issue for such systems. In this paper, we address the information leakage that arises from resources shared by tasks with different security levels in mixed-criticality systems. We define a new security constraint that employs a pre-flushing mechanism to cleanse the state of shared resources whenever information leakage is possible. We then propose a new non-preemptive real-time scheduling algorithm and a schedulability analysis that incorporate this security constraint for mixed-criticality systems. Our evaluation demonstrates that a large number of real-time tasks can be scheduled under the new security constraint without a significant performance loss.
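
    A toy sketch of the pre-flushing idea under a simple assumption: information can leak when execution passes from a higher-security task to a lower-security one through shared state, so a flush is inserted at exactly those transitions. The actual constraint and scheduling algorithm in the paper are more involved.

        def schedule_with_preflush(jobs):
            """jobs: list of (name, security_level) in execution order.
            Returns the order with flush operations inserted where needed."""
            timeline, prev_level = [], None
            for name, level in jobs:
                if prev_level is not None and level < prev_level:
                    timeline.append("FLUSH")   # cleanse shared state at the drop
                timeline.append(name)
                prev_level = level
            return timeline

        # Example: a flush appears only between high-to-low transitions.
        print(schedule_with_preflush([("t1", 2), ("t2", 1), ("t3", 1), ("t4", 3)]))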

  • Reordering-Based Test Pattern Reduction Considering Critical Area-Aware Weighted Fault Coverage

    Masayuki ARAI  Kazuhiko IWASAKI  

     
    PAPER

      Vol:
    E100-A No:7
      Page(s):
    1488-1495

    Shrinking feature sizes and higher levels of integration in semiconductor manufacturing technologies are widening the gap between the defect levels estimated at the design stage and those reported for fabricated devices. In this paper, we propose a unified weighted fault coverage metric that includes both bridge and open faults, using the critical area as the incidence rate of each fault. We then propose a test pattern reordering scheme that incorporates this weighted fault coverage with the aim of reducing test costs. We apply a greedy algorithm to reorder the test patterns generated by the bridge and stuck-at automatic test pattern generators (ATPG), evaluating the relationship between the number of patterns and the weighted fault coverage. Experimental results show that this reordering scheme reduces the number of test patterns, on average, by approximately 50%. Our results also indicate that relaxing the coverage constraint can drastically reduce test pattern set sizes to a level comparable to traditional 100%-coverage stuck-at pattern sets, while still targeting the majority of bridge faults and keeping the defect level to no more than 10 defective parts per million (DPPM) at a 99% manufacturing yield.
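
    A minimal sketch of the greedy reordering with critical-area-aware weights: at each step, pick the pattern whose newly covered faults carry the largest total weight. The data structures are illustrative.

        def greedy_reorder(patterns, fault_weight):
            """patterns: dict pattern_id -> set of fault ids it detects;
            fault_weight: dict fault id -> critical-area weight.
            Returns pattern ids ordered by marginal weighted coverage."""
            remaining, covered, order = dict(patterns), set(), []
            while remaining:
                best = max(remaining,
                           key=lambda p: sum(fault_weight[f]
                                             for f in remaining[p] - covered))
                gain = sum(fault_weight[f] for f in remaining[best] - covered)
                if gain == 0:
                    break              # remaining patterns add no new coverage
                covered |= remaining.pop(best)
                order.append(best)
            return order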

  • Narrow Fingerprint Template Synthesis by Clustering Minutiae Descriptors

    Zhiqiang HU  Dongju LI  Tsuyoshi ISSHIKI  Hiroaki KUNIEDA  

     
    PAPER-Pattern Recognition

  Publicized:
    2017/03/08
      Vol:
    E100-D No:6
      Page(s):
    1290-1302

    Narrow swipe sensors have been widely used in embedded systems such as smartphones. However, the captured image is much smaller than that obtained by a traditional area sensor, so limited template coverage is the performance bottleneck of such systems. Aiming to increase the geometric coverage of templates, a novel fingerprint template synthesis scheme is proposed in this study. The method synthesizes multiple input fingerprints into a wider template by clustering minutiae descriptors, and it consists of two modules. First, a user-behavior-based Registration Pattern Inspection (RPI) algorithm selects the qualified candidates. Second, an iterative clustering algorithm, Modified Fuzzy C-Means (MFCM), processes the large number of minutiae descriptors and generates the final template. Experiments conducted on a swipe fingerprint database validate that this method yields significant improvements, reducing the FRR (False Reject Rate) and EER (Equal Error Rate).
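
    The abstract names a Modified Fuzzy C-Means without detailing the modification, so the sketch below shows only the baseline fuzzy c-means iteration over descriptor vectors X; the cluster count c, fuzzifier m, and initialization are illustrative.

        import numpy as np

        def fcm(X, c, m=2.0, iters=100, seed=0):
            rng = np.random.default_rng(seed)
            U = rng.dirichlet(np.ones(c), size=len(X))   # memberships (N, c)
            for _ in range(iters):
                W = U ** m
                centers = (W.T @ X) / W.sum(0)[:, None]  # weighted means (c, D)
                d = np.linalg.norm(X[:, None] - centers[None], axis=2) + 1e-12
                U = 1.0 / (d ** (2 / (m - 1)))           # standard FCM update
                U /= U.sum(1, keepdims=True)
            return centers, U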

  • Robust Singing Transcription System Using Local Homogeneity in the Harmonic Structure

    Hoon HEO  Kyogu LEE  

     
    PAPER-Music Information Processing

  Publicized:
    2017/02/18
      Vol:
    E100-D No:5
      Page(s):
    1114-1123

    Automatic music transcription from audio has long been one of the most intriguing problems and a challenge in the field of music information retrieval, because it requires a series of low-level tasks, such as onset/offset detection and F0 estimation, followed by high-level post-processing for symbolic representation. In this paper, a comprehensive transcription system for monophonic singing voice based on harmonic structure analysis is proposed. Given precise tracking of the fundamental frequency, a novel acoustic feature is derived to signify the harmonic structure in singing voice signals, regardless of loudness and pitch. It is then used to generate a parametric mixture model based on the von Mises-Fisher distribution, so that the model represents the intrinsic harmonic structures within a region of smoothly connected notes. To identify the note boundaries, the local homogeneity of the harmonic structure is exploited by two different methods: self-similarity analysis and a hidden Markov model. The proposed system identifies note attributes including onset time, duration, and pitch. Evaluations are conducted from various aspects to verify the performance improvement of the proposed system and its robustness, using the latest evaluation methodology for singing transcription. The results show that the proposed system significantly outperforms other systems, including the state-of-the-art ones.

  • Stochastic Dykstra Algorithms for Distance Metric Learning with Covariance Descriptors

    Tomoki MATSUZAWA  Eisuke ITO  Raissa RELATOR  Jun SESE  Tsuyoshi KATO  

     
    PAPER-Pattern Recognition

  Publicized:
    2017/01/13
      Vol:
    E100-D No:4
      Page(s):
    849-856

    In recent years, covariance descriptors have received considerable attention as a strong representation of a set of points. In this research, we propose a new metric learning algorithm for covariance descriptors based on the Dykstra algorithm, in which the current solution is projected onto a half-space at each iteration, and which runs in O(n^3) time. We empirically demonstrate that randomizing the order of the half-spaces in the proposed Dykstra-based algorithm significantly accelerates convergence to the optimal solution. Furthermore, we show that the proposed approach yields promising experimental results on pattern recognition tasks.
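
    A minimal sketch of Dykstra's algorithm specialized to half-space constraints a_i . x <= b_i, with the randomized (stochastic) sweep order that the paper reports to accelerate convergence; representing the metric-learning constraints as half-spaces is assumed here for illustration.

        import numpy as np

        def stochastic_dykstra(x0, A, b, cycles=100, seed=0):
            """A: (K, n) constraint normals; b: (K,) offsets.
            Projects x0 onto the intersection of the half-spaces."""
            rng = np.random.default_rng(seed)
            x = x0.astype(float).copy()
            p = np.zeros((len(A), len(x0)))      # one correction per half-space
            for _ in range(cycles):
                for i in rng.permutation(len(A)):    # shuffled sweep order
                    y = x + p[i]                      # re-add previous correction
                    viol = A[i] @ y - b[i]
                    # Projection onto the half-space a_i . x <= b_i.
                    x_new = y - max(viol, 0.0) * A[i] / (A[i] @ A[i])
                    p[i] = y - x_new                  # store new correction
                    x = x_new
            return x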
