IEICE global.ieice.org Site

Keyword Search Result

[Keyword] tract(469hit)

121-140hit(469hit)

Muffled and Brisk Speech Evaluation with Criterion Based on Temporal Differentiation of Vocal Tract Area Function
Masanori MORISE Satoshi TSUZUKI Hideki BANNO Kenji OZAWA

LETTER-Speech and Hearing

Publicized:
2014/09/17
Vol:
E97-D No:12
Page(s):
3230-3233
This research deals with muffled speech as the evaluation target and introduces a criterion for evaluating the auditory impression in muffled speech. It focuses on the vocal tract area function (VTAF) to evaluate the auditory impression, and the criterion uses temporal differentiation of this function to track the temporal variation of the shape of the mouth. The experimental results indicate that the proposed criterion can be used to evaluate the auditory impression as well as the subjective impression.
Scene Analysis from Viewing Orientations in a Shooting Environment of Multiple Mobile Phones
Shogo TOKAI Takayoshi MORIOKA Hiroyuki HASE

LETTER

Vol:
E97-A No:11
Page(s):
2178-2180
We propose a method for extracting the scene situation using the orientation sensors of multiple mobile phones. Using the orientations recorded with the videos, we analyze where the views concentrate and treat that point as the notable position of the scene for each video frame. In an experiment on a soccer scene, the extracted points can be related to the trajectory of the soccer ball.
Sunshine-Change-Tolerant Moving Object Masking for Realizing both Privacy Protection and Video Surveillance
Yoichi TOMIOKA Hikaru MURAKAMI Hitoshi KITAZAWA

PAPER-Image Recognition, Computer Vision

Vol:
E97-D No:9
Page(s):
2483-2492
Recently, video surveillance systems have been widely introduced in various places, and protecting the privacy of objects in the scene has been as important as ensuring security. Masking each moving object with a background subtraction method is an effective technique to protect its privacy. However, the background subtraction method is heavily affected by sunshine change, and a redundant masking by over-extraction is inevitable. Such superfluous masking disturbs the quality of video surveillance. In this paper, we propose a moving object masking method combining background subtraction and machine learning based on Real AdaBoost. This method can reduce the superfluous masking while maintaining the reliability of privacy protection. In the experiments, we demonstrate that the proposed method achieves about 78-94% accuracy for classifying superfluous masking regions and moving objects.
Activity Recognition Based on an Accelerometer in a Smartphone Using an FFT-Based New Feature and Fusion Methods
Yang XUE Yaoquan HU Lianwen JIN

LETTER-Human-computer Interaction

Vol:
E97-D No:8
Page(s):
2182-2186
With the development of personal electronic equipment, the use of a smartphone with a tri-axial accelerometer to detect human physical activity is becoming popular. In this paper, we propose a new feature based on FFT for activity recognition from tri-axial acceleration signals. To improve the classification performance, two fusion methods, minimal distance optimization (MDO) and variance contribution ranking (VCR), are proposed. The new proposed feature achieves a recognition rate of 92.41%, which outperforms six traditional time- or frequency-domain features. Furthermore, the proposed fusion methods effectively improve the recognition rates. In particular, the average accuracy based on class fusion VCR (CFVCR) is 97.01%, which results in an improvement in accuracy of 4.14% compared with the results without any fusion. Experiments confirm the effectiveness of the new proposed feature and fusion methods.
A Hybrid Approach to Electrolaryngeal Speech Enhancement Based on Noise Reduction and Statistical Excitation Generation
Kou TANAKA Tomoki TODA Graham NEUBIG Sakriani SAKTI Satoshi NAKAMURA

PAPER-Voice Conversion and Speech Enhancement

Vol:
E97-D No:6
Page(s):
1429-1437
This paper presents an electrolaryngeal (EL) speech enhancement method capable of significantly improving naturalness of EL speech while causing no degradation in its intelligibility. An electrolarynx is an external device that artificially generates excitation sounds to enable laryngectomees to produce EL speech. Although proficient laryngectomees can produce quite intelligible EL speech, it sounds very unnatural due to the mechanical excitation produced by the device. Moreover, the excitation sounds produced by the device often leak outside, adding to EL speech as noise. To address these issues, there are mainly two conventional approaches to EL speech enhancement: noise reduction and statistical voice conversion (VC). The former approach usually causes no degradation in intelligibility but yields only small improvements in naturalness as the mechanical excitation sounds remain essentially unchanged. On the other hand, the latter approach significantly improves naturalness of EL speech using spectral and excitation parameters of natural voices converted from acoustic parameters of EL speech, but it usually causes degradation in intelligibility owing to errors in conversion. We propose a hybrid approach using a noise reduction method for enhancing spectral parameters and a statistical voice conversion method for predicting excitation parameters. Moreover, we further modify the prediction process of the excitation parameters to improve its prediction accuracy and reduce adverse effects caused by unvoiced/voiced prediction errors. The experimental results demonstrate that the proposed method yields significant improvements in naturalness compared with EL speech while keeping intelligibility high enough.
File and Task Abstraction in Task Workflow Patterns for File Recommendation Using File-Access Log (Open Access)
Qiang SONG Takayuki KAWABATA Fumiaki ITOH Yousuke WATANABE Haruo YOKOTA

PAPER

Vol:
E97-D No:4
Page(s):
634-643
The numbers of files in file systems have increased dramatically in recent years. Office workers spend much time and effort searching for the documents required for their jobs. To reduce these costs, we propose a new method for recommending files and operations on them. Existing technologies for recommendation, such as collaborative filtering, suffer from two problems. First, they can only work with documents that have been accessed in the past, so that they cannot recommend when only newly generated documents are inputted. Second, they cannot easily handle sequences involving similar or differently ordered elements because of the strict matching used in the access sequences. To solve these problems, such minor variations should be ignored. In our proposed method, we introduce the concepts of abstract files as groups of similar files used for a similar purpose, abstract tasks as groups of similar tasks, and frequent abstract workflows grouped from similar workflows, which are sequences of abstract tasks. In experiments using real file-access logs, we confirmed that our proposed method could extract workflow patterns with longer sequences and higher support-count values, which are more suitable as recommendations. In addition, the F-measure for the recommendation results was improved significantly, from 0.301 to 0.598, compared with a method that did not use the concepts of abstract tasks and abstract workflows.
FPGA Implementation of Exclusive Block Matching for Robust Moving Object Extraction and Tracking
Yoichi TOMIOKA Ryota TAKASU Takashi AOKI Eiichi HOSOYA Hitoshi KITAZAWA

PAPER-Image Processing and Video Processing

Vol:
E97-D No:3
Page(s):
573-582
Hardware acceleration is an essential technique for extracting and tracking moving objects in real time. It is desirable to design tracking algorithms such that they are applicable for parallel computations on hardware. Exclusive block matching methods are designed for hardware implementation, and they can realize detailed motion extraction as well as robust moving object tracking. In this study, we develop tracking hardware based on an exclusive block matching method on FPGA. This tracking hardware is based on a two-dimensional systolic array architecture, and can realize robust moving object extraction and tracking at more than 100 fps for QVGA images using the high parallelism of an exclusive block matching method, synchronous shift data transfer, and special circuits to accelerate searching the exclusive correspondence of blocks.
Discrete Abstraction of Stochastic Nonlinear Systems
Shun-ichi AZUMA George J. PAPPAS

PAPER

Vol:
E97-A No:2
Page(s):
452-458
This paper addresses the discrete abstraction problem for stochastic nonlinear systems with continuous-valued state. The proposed solution is based on a function, called the bisimulation function, which provides a sufficient condition for the existence of a discrete abstraction for a given continuous system. We first introduce the bisimulation function and show how the function solves the problem. Next, a convex optimization based method for constructing a bisimulation function is presented. Finally, the proposed framework is demonstrated by a numerical simulation.
Efficient Multiply-by-3 and Divide-by-3 Algorithms and Their Fast Hardware Implementation
Chin-Long WEY Ping-Chang JUI Gang-Neng SUNG

PAPER-VLSI Design Technology and CAD

Vol:
E97-A No:2
Page(s):
616-623
This study presents efficient algorithms for performing multiply-by-3 (3N) and divide-by-3 (N/3) operations with additions and subtractions, respectively. No multiplications or divisions are needed. Full adders (FA) and full subtractors (FS) can be used to realize the 3N and N/3 operations, respectively. For fast hardware implementation, this paper introduces two basic cells, UCA and UCS, for the 3N and N/3 operations, respectively. For the 3N operation, UCA-based ripple carry adder (RCA) and carry lookahead adder (CLA) designs are proposed, and their speed performances are estimated based on the delay data of a standard cell library in the TSMC 0.18µm CMOS process. Results show that the 16-bit UCA-based RCA is about 3 times faster than the conventional FA-based RCA and even 25% faster than the FA-based CLA. The proposed 16-bit and 64-bit UCA-based CLAs are 62% and 36% faster than the conventional FA-based CLAs, respectively. For N/3 operations, a ripple borrow subtractor (RBS) is also presented. The 16-bit UCS-based RBS is about 15.5% faster than the 16-bit FS-based RBS.
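The idea of computing 3N and N/3 without a multiplier or divider can be illustrated in software. The sketch below is only an assumption-laden illustration, not the UCA/UCS cell design from the paper: `multiply_by_3` uses the identity 3N = 2N + N (one shift and one addition), and `divide_by_3` is a generic bit-serial restoring division that uses only shifts, comparisons and subtractions. The function names and the `width` parameter are hypothetical.

```python
def multiply_by_3(n: int) -> int:
    """Return 3*n using one left shift and one addition (3N = 2N + N)."""
    return (n << 1) + n


def divide_by_3(n: int, width: int = 16) -> tuple:
    """Return (quotient, remainder) of n / 3 for an unsigned `width`-bit n.

    Generic restoring division, one bit per step, using only shifts,
    comparisons and subtractions. This is NOT the paper's UCS-based design.
    """
    if n < 0 or n >= (1 << width):
        raise ValueError("n must fit in an unsigned integer of the given width")
    quotient = 0
    remainder = 0
    # Process the dividend from the most significant bit down.
    for bit in range(width - 1, -1, -1):
        remainder = (remainder << 1) | ((n >> bit) & 1)
        quotient <<= 1
        if remainder >= 3:
            remainder -= 3   # the only arithmetic operation is a subtraction
            quotient |= 1
    return quotient, remainder
```

In hardware the shift is free wiring, so 3N costs a single adder pass; the paper's contribution concerns making that adder and the corresponding subtractor chain faster, which this sketch does not model.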
Discrete Abstraction for a Class of Stochastic Hybrid Systems Based on Bounded Bisimulation
Koichi KOBAYASHI Yasuhito FUKUI Kunihiko HIRAISHI

PAPER

Vol:
E97-A No:2
Page(s):
459-467
A stochastic hybrid system can express complex dynamical systems such as biological systems and communication networks, but computation for analysis and control is frequently difficult. In this paper, for a class of stochastic hybrid systems, a discrete abstraction method in which a given system is transformed into a finite-state system is proposed based on the notion of bounded bisimulation. In the existing discrete abstraction method based on bisimulation, the computational procedure does not terminate in general. In the proposed method, only the behavior over a finite time interval is expressed as a finite-state system, and termination is guaranteed. Furthermore, analysis of genetic toggle switches is also discussed as an application.
Semi-Automatically Extracting Features from Source Code of Android Applications
Tetsuya KANDA Yuki MANABE Takashi ISHIO Makoto MATSUSHITA Katsuro INOUE

LETTER-Software Engineering

Vol:
E96-D No:12
Page(s):
2857-2859
It is not always easy for an Android user to choose the most suitable application for a particular task from the great number of applications available. In this paper, we propose a semi-automatic approach to extract feature names from Android applications. The case study verifies that we can associate common sequences of Android API calls with feature names.
Personal Information Extraction from Korean Obituaries
Kyoung-Soo HAN

LETTER-Artificial Intelligence, Data Mining

Vol:
E96-D No:12
Page(s):
2873-2876
Pieces of personal information, such as personal names and relationships, are crucial in text mining applications. Obituaries are good sources for this kind of information. This study proposes an effective method for extracting various facts about people from obituary Web pages. Experiments show that the proposed method achieves high performance in terms of recall and precision.
Fast Information Retrieval Method from Printed Images Considering Mobile Devices
Aya HIYAMA Mitsuji MUNEYASU

LETTER-Image Processing

Vol:
E96-A No:11
Page(s):
2194-2197
In information retrieval from printed images considering the use of mobile devices, the correction of geometrical deformation and lens distortion is required, posing a heavy computational burden. In this paper, we propose a method of reducing the computational burden for such corrections. This method consists of improved extraction to find a line segment of a frame, the reconsideration of the interpolation method for image correction, and the optimization of image resolution in the correction process. The proposed method can reduce the number of computations significantly. The experimental result shows the effectiveness of the proposed method.
A Steganographic Scheme Based on Formula Fully Exploiting Modification Directions
Wen-Chung KUO Ming-Chih KAO

PAPER-Cryptography and Information Security

Vol:
E96-A No:11
Page(s):
2235-2243
Many EMD-type data hiding schemes have been proposed. However, the data hiding capacity is less than 2bpp when the embedding procedure uses formula operations. In order to improve the data hiding capacity from 1bpp to 4.5bpp, a new data hiding scheme is proposed in this paper based on a formula using the fully exploiting modification directions method (FEMD). By using our proposed theorem, the secret data can be embedded by formula operations directly without using a lookup matrix. The simulation results and performance analysis show the proposed scheme not only maintains good embedding capacity and stegoimage quality but also solves the overflow problem. It does so without using extra memory resources and performs within a reasonable computing time. The resource usage and capabilities of this scheme are well matched to the constraints and requirements of resource-scarce mobile devices.
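For readers unfamiliar with EMD-type embedding, the following minimal sketch shows the classic exploiting-modification-direction scheme on a group of n pixels, which carries one digit in base 2n+1 by changing at most one pixel by ±1. It is an illustration under stated assumptions: it is the basic EMD extraction function, not the FEMD formula or theorem proposed in the paper, and the function names `emd_extract` and `emd_embed` are hypothetical. It does not handle the overflow case (pixels at 0 or 255) that the abstract says the proposed scheme solves.

```python
def emd_extract(pixels):
    """Classic EMD extraction function: f = (sum of i * g_i) mod (2n + 1)."""
    n = len(pixels)
    return sum(i * g for i, g in enumerate(pixels, start=1)) % (2 * n + 1)


def emd_embed(pixels, digit):
    """Embed one base-(2n+1) digit by changing at most one pixel by +/-1.

    Sketch of basic EMD only; pixel overflow/underflow is not handled.
    """
    n = len(pixels)
    base = 2 * n + 1
    if not 0 <= digit < base:
        raise ValueError("digit must be in the range 0..2n")
    s = (digit - emd_extract(pixels)) % base
    stego = list(pixels)
    if s == 0:
        return stego              # the group already encodes the digit
    if s <= n:
        stego[s - 1] += 1         # increase g_s by one
    else:
        stego[base - s - 1] -= 1  # decrease g_(2n+1-s) by one
    return stego
```

With two pixels per group, each group carries one base-5 digit, and extraction needs only `emd_extract` on the stego pixels with no lookup matrix.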
A Single Tooth Segmentation Using PCA-Stacked Gabor Filter and Active Contour
Pramual CHOORAT Werapon CHIRACHARIT Kosin CHAMNONGTHAI Takao ONOYE

PAPER-Image Processing

Vol:
E96-A No:11
Page(s):
2169-2178
In tooth contour extraction there is insufficient intensity difference in x-ray images between the tooth and dental bone. This difference must be enhanced in order to improve the accuracy of tooth segmentation. This paper proposes a method to improve the intensity difference between the tooth and dental bone. This method consists of an estimation of tooth orientation (intensity projection, smoothing filter, and peak detection) and PCA-Stacked Gabor with ellipse Gabor banks. Tooth orientation estimation is performed to determine the angle of a single oriented tooth. PCA-Stacked Gabor with ellipse Gabor banks is then used, in particular to enhance the border between the tooth and dental bone. Finally, active contour extraction is performed in order to determine tooth contour. In the experiment, in comparison with the conventional active contour without edge (ACWE) method, the average mean square error (MSE) values of extracted tooth contour points are reduced from 26.93% and 16.02% to 19.07% and 13.42% for tooth x-ray type I and type H images, respectively.
Predominant Melody Extraction from Polyphonic Music Signals Based on Harmonic Structure
Jea-Yul YOON Chai-Jong SONG Hochong PARK

LETTER-Music Information Processing

Vol:
E96-D No:11
Page(s):
2504-2507
A new method for predominant melody extraction from polyphonic music signals based on harmonic structure is proposed. The proposed method first extracts a set of fundamental frequency candidates by analyzing the distance between spectral peaks. Then, the predominant fundamental frequency is selected by pitch tracking according to the harmonic strength of the selected candidates. Finally, the method runs pitch smoothing on a large temporal scale for eliminating pitch doubling error, and conducts voicing frame detection. The proposed method shows the best overall performance for ADC 2004 DB in the MIREX 2011 audio melody extraction task.
On the Complexity of Inference and Completion of Boolean Networks from Given Singleton Attractors
Hao JIANG Takeyuki TAMURA Wai-Ki CHING Tatsuya AKUTSU

PAPER-General Fundamentals and Boundaries

Vol:
E96-A No:11
Page(s):
2265-2274
In this paper, we consider the problem of inferring a Boolean network (BN) from a given set of singleton attractors, where it is required that the resulting BN has the same set of singleton attractors as the given one. We show that the problem can be solved in linear time if the number of singleton attractors is at most two and each Boolean function is restricted to be a conjunction or disjunction of literals. We also show that the problem can be solved in polynomial time if more general Boolean functions can be used. In addition to the inference problem, we study two network completion problems from a given set of singleton attractors: adding the minimum number of edges to a given network, and determining Boolean functions to all nodes when only network structure of a BN is given. In particular, we show that the latter problem cannot be solved in polynomial time unless P=NP, by means of a polynomial-time Turing reduction from the complement of the another solution problem for the Boolean satisfiability problem.
Track Extraction for Accelerated Targets in Dense Environments Using Variable Gating MLPDA
Masanori MORI Takashi MATSUZAKI Hiroshi KAMEDA Toru UMEZAWA

PAPER-Sensing

Vol:
E96-B No:8
Page(s):
2173-2179
MLPDA (Maximum Likelihood Probabilistic Data Association) has attracted a great deal of attention as an effective target track extraction method in high false density environments. However, to extract an accelerated target track on a 2-dimensional plane, the computational load of the conventional MLPDA is extremely high, since it needs to search for the most-likely position, velocity and acceleration of the target in 6-dimensional space. In this paper, we propose VG-MLPDA (Variable Gating MLPDA), which consists of the following two steps. The first step is to search for the target's position and velocity among candidates with the assumed acceleration by using variable gates, which take into account both the observation noise and the difference between assumed and true acceleration. The second step is to search for the most-likely position, velocity and acceleration using a maximization algorithm while reducing the gate volume. Simulation results show the validity of our method.
Spectral Subtraction Based on Non-extensive Statistics for Speech Recognition
Hilman PARDEDE Koji IWANO Koichi SHINODA

PAPER-Speech and Hearing

Vol:
E96-D No:8
Page(s):
1774-1782
Spectral subtraction (SS) is an additive noise removal method which is derived in an extensive framework. In spectral subtraction, it is assumed that speech and noise spectra follow Gaussian distributions and are independent of each other. Hence, noisy speech also follows a Gaussian distribution. The spectral subtraction formula is obtained by maximizing the likelihood of the noisy speech distribution with respect to its variance. However, it is well known that noisy speech observed in real situations often follows a heavy-tailed distribution, not a Gaussian distribution. In this paper, we introduce a q-Gaussian distribution in the non-extensive statistics to represent the distribution of noisy speech and derive a new spectral subtraction method based on it. We found that the q-Gaussian distribution fits the noisy speech distribution better than the Gaussian distribution does. Our speech recognition experiments using the Aurora-2 database showed that the proposed method, q-spectral subtraction (q-SS), outperformed the conventional SS method.
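As background, conventional power spectral subtraction can be sketched in a few lines. The example below is a minimal illustration of the conventional SS baseline only, not the q-SS method derived in the paper; the function name and the `alpha` (over-subtraction factor) and `floor` (spectral floor) parameters are assumptions chosen for illustration, and the inputs are assumed to be per-bin power spectra of one frame.

```python
def spectral_subtract(noisy_power, noise_power, alpha=1.0, floor=0.01):
    """Conventional power spectral subtraction for one frame.

    For each frequency bin, subtract the (scaled) noise power estimate from
    the noisy power and clamp the result to a small fraction of the noisy
    power so that the estimate never becomes negative.
    """
    if len(noisy_power) != len(noise_power):
        raise ValueError("spectra must have the same number of bins")
    return [
        max(y - alpha * n, floor * y)
        for y, n in zip(noisy_power, noise_power)
    ]
```

The clamping step is needed because the noise estimate can exceed the observed power in individual bins; how the subtraction rule changes under a heavy-tailed noisy-speech model is the subject of the paper and is not reproduced here.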
Face Retrieval in Large-Scale News Video Datasets
Thanh Duc NGO Hung Thanh VU Duy-Dinh LE Shin'ichi SATOH

PAPER-Image Recognition, Computer Vision

Vol:
E96-D No:8
Page(s):
1811-1825
Face retrieval in news video has been identified as a challenging task due to the huge variations in the visual appearance of the human face. Although several approaches have been proposed to deal with this problem, their extremely high computational cost limits their scalability to large-scale video datasets that may contain millions of faces of hundreds of characters. In this paper, we introduce approaches for face retrieval that are scalable to such datasets while maintaining competitive performances with state-of-the-art approaches. To utilize the variability of face appearances in video, we use a set of face images called face-track to represent the appearance of a character in a video shot. Our first proposal is an approach for extracting face-tracks. We use a point tracker to explore the connections between detected faces belonging to the same character and then group them into one face-track. We present techniques to make the approach robust against common problems caused by flash lights, partial occlusions, and scattered appearances of characters in news videos. In the second proposal, we introduce an efficient approach to match face-tracks for retrieval. Instead of using all the faces in the face-tracks to compute their similarity, our approach obtains a representative face for each face-track. The representative face is computed from faces that are sampled from the original face-track. As a result, we significantly reduce the computational cost of face-track matching while taking into account the variability of faces in face-tracks to achieve high matching accuracy. Experiments are conducted on two face-track datasets extracted from real-world news videos, of such scales that have never been considered in the literature. One dataset contains 1,497 face-tracks of 41 characters extracted from 370 hours of TRECVID videos. The other dataset provides 5,567 face-tracks of 111 characters observed from a television news program (NHK News 7) over 11 years. We make both datasets publicly accessible to the research community. The experimental results show that our proposed approaches achieved a remarkable balance between accuracy and efficiency.
