The search functionality is under construction.

Keyword Search Result

[Keyword] tract(469hit)

121-140hit(469hit)

  • Muffled and Brisk Speech Evaluation with Criterion Based on Temporal Differentiation of Vocal Tract Area Function

    Masanori MORISE  Satoshi TSUZUKI  Hideki BANNO  Kenji OZAWA  

     
    LETTER-Speech and Hearing

      Publicized:
    2014/09/17
      Vol:
    E97-D No:12
      Page(s):
    3230-3233

    This research deals with muffled speech as the evaluation target and introduces a criterion for evaluating the auditory impression in muffled speech. It focuses on the vocal tract area function (VTAF) to evaluate the auditory impression, and the criterion uses temporal differentiation of this function to track the temporal variation of the shape of the mouth. The experimental results indicate that the proposed criterion can be used to evaluate the auditory impression as well as the subjective impression.

  • Scene Analysis from Viewing Orientations in a Shooting Environment of Multiple Mobile Phones

    Shogo TOKAI  Takayoshi MORIOKA  Hiroyuki HASE  

     
    LETTER

      Vol:
    E97-A No:11
      Page(s):
    2178-2180

    We propose a method for extracting the scene situation from the orientation sensors of multiple mobile phones in a shooting environment. Using the orientations recorded with the videos, we analyze where the phones' views concentrate in each video frame and treat that point as a notable position in the scene. In an experiment on a soccer scene, the extracted points could be related to the trajectory of the soccer ball.

  • Sunshine-Change-Tolerant Moving Object Masking for Realizing both Privacy Protection and Video Surveillance

    Yoichi TOMIOKA  Hikaru MURAKAMI  Hitoshi KITAZAWA  

     
    PAPER-Image Recognition, Computer Vision

      Vol:
    E97-D No:9
      Page(s):
    2483-2492

    Recently, video surveillance systems have been widely introduced in various places, and protecting the privacy of objects in the scene has become as important as ensuring security. Masking each moving object with a background subtraction method is an effective technique to protect its privacy. However, the background subtraction method is heavily affected by sunshine change, and redundant masking caused by over-extraction is inevitable. Such superfluous masking degrades the quality of video surveillance. In this paper, we propose a moving object masking method combining background subtraction and machine learning based on Real AdaBoost. This method can reduce the superfluous masking while maintaining the reliability of privacy protection. In the experiments, we demonstrate that the proposed method achieves about 78-94% accuracy for classifying superfluous masking regions and moving objects.

  • Activity Recognition Based on an Accelerometer in a Smartphone Using an FFT-Based New Feature and Fusion Methods

    Yang XUE  Yaoquan HU  Lianwen JIN  

     
    LETTER-Human-computer Interaction

      Vol:
    E97-D No:8
      Page(s):
    2182-2186

    With the development of personal electronic equipment, the use of a smartphone with a tri-axial accelerometer to detect human physical activity is becoming popular. In this paper, we propose a new feature based on FFT for activity recognition from tri-axial acceleration signals. To improve the classification performance, two fusion methods, minimal distance optimization (MDO) and variance contribution ranking (VCR), are proposed. The new proposed feature achieves a recognition rate of 92.41%, which outperforms six traditional time- or frequency-domain features. Furthermore, the proposed fusion methods effectively improve the recognition rates. In particular, the average accuracy based on class fusion VCR (CFVCR) is 97.01%, which results in an improvement in accuracy of 4.14% compared with the results without any fusion. Experiments confirm the effectiveness of the new proposed feature and fusion methods.

  • A Hybrid Approach to Electrolaryngeal Speech Enhancement Based on Noise Reduction and Statistical Excitation Generation

    Kou TANAKA  Tomoki TODA  Graham NEUBIG  Sakriani SAKTI  Satoshi NAKAMURA  

     
    PAPER-Voice Conversion and Speech Enhancement

      Vol:
    E97-D No:6
      Page(s):
    1429-1437

    This paper presents an electrolaryngeal (EL) speech enhancement method capable of significantly improving naturalness of EL speech while causing no degradation in its intelligibility. An electrolarynx is an external device that artificially generates excitation sounds to enable laryngectomees to produce EL speech. Although proficient laryngectomees can produce quite intelligible EL speech, it sounds very unnatural due to the mechanical excitation produced by the device. Moreover, the excitation sounds produced by the device often leak outside, adding to EL speech as noise. To address these issues, there are two main conventional approaches to EL speech enhancement: noise reduction and statistical voice conversion (VC). The former approach usually causes no degradation in intelligibility but yields only small improvements in naturalness as the mechanical excitation sounds remain essentially unchanged. On the other hand, the latter approach significantly improves naturalness of EL speech using spectral and excitation parameters of natural voices converted from acoustic parameters of EL speech, but it usually causes degradation in intelligibility owing to errors in conversion. We propose a hybrid approach using a noise reduction method for enhancing spectral parameters and a statistical voice conversion method for predicting excitation parameters. Moreover, we further modify the prediction process of the excitation parameters to improve its prediction accuracy and reduce adverse effects caused by unvoiced/voiced prediction errors. The experimental results demonstrate that the proposed method yields significant improvements in naturalness compared with EL speech while keeping intelligibility high enough.

  • File and Task Abstraction in Task Workflow Patterns for File Recommendation Using File-Access Log [Open Access]

    Qiang SONG  Takayuki KAWABATA  Fumiaki ITOH  Yousuke WATANABE  Haruo YOKOTA  

     
    PAPER

      Vol:
    E97-D No:4
      Page(s):
    634-643

    The number of files in file systems has increased dramatically in recent years. Office workers spend much time and effort searching for the documents required for their jobs. To reduce these costs, we propose a new method for recommending files and operations on them. Existing technologies for recommendation, such as collaborative filtering, suffer from two problems. First, they can only work with documents that have been accessed in the past, so they cannot make recommendations when only newly generated documents are given as input. Second, they cannot easily handle sequences involving similar or differently ordered elements because of the strict matching used in the access sequences. To solve these problems, such minor variations should be ignored. In our proposed method, we introduce the concepts of abstract files as groups of similar files used for a similar purpose, abstract tasks as groups of similar tasks, and frequent abstract workflows grouped from similar workflows, which are sequences of abstract tasks. In experiments using real file-access logs, we confirmed that our proposed method could extract workflow patterns with longer sequences and higher support-count values, which are more suitable as recommendations. In addition, the F-measure for the recommendation results was improved significantly, from 0.301 to 0.598, compared with a method that did not use the concepts of abstract tasks and abstract workflows.

  • FPGA Implementation of Exclusive Block Matching for Robust Moving Object Extraction and Tracking

    Yoichi TOMIOKA  Ryota TAKASU  Takashi AOKI  Eiichi HOSOYA  Hitoshi KITAZAWA  

     
    PAPER-Image Processing and Video Processing

      Vol:
    E97-D No:3
      Page(s):
    573-582

    Hardware acceleration is an essential technique for extracting and tracking moving objects in real time. It is desirable to design tracking algorithms such that they are applicable for parallel computations on hardware. Exclusive block matching methods are designed for hardware implementation, and they can realize detailed motion extraction as well as robust moving object tracking. In this study, we develop tracking hardware based on an exclusive block matching method on FPGA. This tracking hardware is based on a two-dimensional systolic array architecture, and can realize robust moving object extraction and tracking at more than 100 fps for QVGA images using the high parallelism of an exclusive block matching method, synchronous shift data transfer, and special circuits to accelerate searching the exclusive correspondence of blocks.

  • Discrete Abstraction of Stochastic Nonlinear Systems

    Shun-ichi AZUMA  George J. PAPPAS  

     
    PAPER

      Vol:
    E97-A No:2
      Page(s):
    452-458

    This paper addresses the discrete abstraction problem for stochastic nonlinear systems with continuous-valued state. The proposed solution is based on a function, called the bisimulation function, which provides a sufficient condition for the existence of a discrete abstraction for a given continuous system. We first introduce the bisimulation function and show how the function solves the problem. Next, a convex optimization based method for constructing a bisimulation function is presented. Finally, the proposed framework is demonstrated by a numerical simulation.

  • Efficient Multiply-by-3 and Divide-by-3 Algorithms and Their Fast Hardware Implementation

    Chin-Long WEY  Ping-Chang JUI  Gang-Neng SUNG  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E97-A No:2
      Page(s):
    616-623

    This study presents efficient algorithms for performing multiply-by-3 (3N) and divide-by-3 (N/3) operations using only additions and subtractions, respectively. No multiplications or divisions are needed. A full adder (FA) and a full subtractor (FS) can be used to realize the 3N and N/3 operations, respectively. For fast hardware implementation, this paper introduces two basic cells, UCA and UCS, for the 3N and N/3 operations, respectively. For the 3N operation, UCA-based ripple carry adder (RCA) and carry lookahead adder (CLA) designs are proposed, and their speed performance is estimated based on the delay data of a standard cell library in the TSMC 0.18µm CMOS process. Results show that the 16-bit UCA-based RCA is about 3 times faster than the conventional FA-based RCA and even 25% faster than the FA-based CLA. The proposed 16-bit and 64-bit UCA-based CLAs are 62% and 36% faster than the conventional FA-based CLAs, respectively. For the N/3 operation, a ripple borrow subtractor (RBS) is also presented. The 16-bit UCS-based RBS is about 15.5% faster than the 16-bit FS-based RBS.
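
As a rough illustration of the arithmetic in the abstract above, the Python sketch below computes 3N with one shift and one addition, and recovers N/3 with a bit-serial subtraction based on the identity Q = N - 2Q. It is not the authors' UCA/UCS cell design. The function names, the `width` parameter, and the restriction to exact multiples of 3 are assumptions made for this example.

```python
def mul3(n: int) -> int:
    """Return 3*n using one shift and one addition (3N = N + 2N)."""
    return n + (n << 1)


def div3_exact(n: int, width: int = 16) -> int:
    """Return n // 3 for a non-negative multiple of 3 that fits in `width` bits.

    Uses Q = N - 2Q: bit i of Q is bit i of N minus bit i-1 of Q minus the
    incoming borrow, evaluated from the least significant bit upward.
    (Illustrative assumption: only exact division is handled.)
    """
    if n < 0 or n >= (1 << width):
        raise ValueError("n must fit in `width` unsigned bits")
    q = 0
    prev_q_bit = 0  # bit i-1 of Q, which is bit i of 2Q
    borrow = 0
    for i in range(width):
        diff = ((n >> i) & 1) - prev_q_bit - borrow
        q_bit = diff & 1               # difference bit (mod 2)
        borrow = 1 if diff < 0 else 0  # borrow into the next bit position
        q |= q_bit << i
        prev_q_bit = q_bit
    if mul3(q) != n:
        raise ValueError("n is not a multiple of 3")
    return q
```

The loop mirrors a ripple-borrow structure: each quotient bit depends on the previous quotient bit and the borrow, so the delay grows with the word length in this naive form.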

  • Discrete Abstraction for a Class of Stochastic Hybrid Systems Based on Bounded Bisimulation

    Koichi KOBAYASHI  Yasuhito FUKUI  Kunihiko HIRAISHI  

     
    PAPER

      Vol:
    E97-A No:2
      Page(s):
    459-467

    A stochastic hybrid system can express complex dynamical systems such as biological systems and communication networks, but computation for analysis and control is frequently difficult. In this paper, for a class of stochastic hybrid systems, a discrete abstraction method in which a given system is transformed into a finite-state system is proposed based on the notion of bounded bisimulation. In the existing discrete abstraction method based on bisimulation, the computational procedure does not terminate in general. In the proposed method, only the behavior over a finite time interval is expressed as a finite-state system, and termination is guaranteed. Furthermore, analysis of genetic toggle switches is also discussed as an application.

  • Semi-Automatically Extracting Features from Source Code of Android Applications

    Tetsuya KANDA  Yuki MANABE  Takashi ISHIO  Makoto MATSUSHITA  Katsuro INOUE  

     
    LETTER-Software Engineering

      Vol:
    E96-D No:12
      Page(s):
    2857-2859

    It is not always easy for an Android user to choose the most suitable application for a particular task from the great number of applications available. In this paper, we propose a semi-automatic approach to extract feature names from Android applications. The case study verifies that we can associate common sequences of Android API calls with feature names.

  • Personal Information Extraction from Korean Obituaries

    Kyoung-Soo HAN  

     
    LETTER-Artificial Intelligence, Data Mining

      Vol:
    E96-D No:12
      Page(s):
    2873-2876

    Pieces of personal information, such as personal names and relationships, are crucial in text mining applications. Obituaries are good sources for this kind of information. This study proposes an effective method for extracting various facts about people from obituary Web pages. Experiments show that the proposed method achieves high performance in terms of recall and precision.

  • Fast Information Retrieval Method from Printed Images Considering Mobile Devices

    Aya HIYAMA  Mitsuji MUNEYASU  

     
    LETTER-Image Processing

      Vol:
    E96-A No:11
      Page(s):
    2194-2197

    In information retrieval from printed images considering the use of mobile devices, the correction of geometrical deformation and lens distortion is required, posing a heavy computational burden. In this paper, we propose a method of reducing the computational burden for such corrections. This method consists of improved extraction to find a line segment of a frame, the reconsideration of the interpolation method for image correction, and the optimization of image resolution in the correction process. The proposed method can reduce the number of computations significantly. The experimental result shows the effectiveness of the proposed method.

  • A Steganographic Scheme Based on Formula Fully Exploiting Modification Directions

    Wen-Chung KUO  Ming-Chih KAO  

     
    PAPER-Cryptography and Information Security

      Vol:
    E96-A No:11
      Page(s):
    2235-2243

    Many EMD-type data hiding schemes have been proposed. However, the data hiding capacity is less than 2bpp when the embedding procedure uses formula operations. In order to improve the data hiding capacity from 1bpp to 4.5bpp, a new data hiding scheme is proposed in this paper based on a formula using the fully exploiting modification directions method (FEMD). By using our proposed theorem, the secret data can be embedded by formula operations directly without using a lookup matrix. The simulation results and performance analysis show the proposed scheme not only maintains good embedding capacity and stegoimage quality but also solves the overflow problem. It does so without using extra memory resources and performs within a reasonable computing time. The resource usage and capabilities of this scheme are well matched to the constraints and requirements of resource-scarce mobile devices.

  • A Single Tooth Segmentation Using PCA-Stacked Gabor Filter and Active Contour

    Pramual CHOORAT  Werapon CHIRACHARIT  Kosin CHAMNONGTHAI  Takao ONOYE  

     
    PAPER-Image Processing

      Vol:
    E96-A No:11
      Page(s):
    2169-2178

    In tooth contour extraction there is insufficient intensity difference in x-ray images between the tooth and dental bone. This difference must be enhanced in order to improve the accuracy of tooth segmentation. This paper proposes a method to improve the intensity between the tooth and dental bone. This method consists of an estimation of tooth orientation (intensity projection, smoothing filter, and peak detection) and PCA-Stacked Gabor with ellipse Gabor banks. Tooth orientation estimation is performed to determine the angle of a single oriented tooth. PCA-Stacked Gabor with ellipse Gabor banks is then used, in particular to enhance the border between the tooth and dental bone. Finally, active contour extraction is performed in order to determine tooth contour. In the experiment, in comparison with the conventional active contour without edge (ACWE) method, the average mean square error (MSE) values of extracted tooth contour points are reduced from 26.93% and 16.02% to 19.07% and 13.42% for tooth x-ray type I and type H images, respectively.

  • Predominant Melody Extraction from Polyphonic Music Signals Based on Harmonic Structure

    Jea-Yul YOON  Chai-Jong SONG  Hochong PARK  

     
    LETTER-Music Information Processing

      Vol:
    E96-D No:11
      Page(s):
    2504-2507

    A new method for predominant melody extraction from polyphonic music signals based on harmonic structure is proposed. The proposed method first extracts a set of fundamental frequency candidates by analyzing the distance between spectral peaks. Then, the predominant fundamental frequency is selected by pitch tracking according to the harmonic strength of the selected candidates. Finally, the method runs pitch smoothing on a large temporal scale for eliminating pitch doubling error, and conducts voicing frame detection. The proposed method shows the best overall performance for ADC 2004 DB in the MIREX 2011 audio melody extraction task.

  • On the Complexity of Inference and Completion of Boolean Networks from Given Singleton Attractors

    Hao JIANG  Takeyuki TAMURA  Wai-Ki CHING  Tatsuya AKUTSU  

     
    PAPER-General Fundamentals and Boundaries

      Vol:
    E96-A No:11
      Page(s):
    2265-2274

    In this paper, we consider the problem of inferring a Boolean network (BN) from a given set of singleton attractors, where it is required that the resulting BN has the same set of singleton attractors as the given one. We show that the problem can be solved in linear time if the number of singleton attractors is at most two and each Boolean function is restricted to be a conjunction or disjunction of literals. We also show that the problem can be solved in polynomial time if more general Boolean functions can be used. In addition to the inference problem, we study two network completion problems from a given set of singleton attractors: adding the minimum number of edges to a given network, and determining Boolean functions to all nodes when only network structure of a BN is given. In particular, we show that the latter problem cannot be solved in polynomial time unless P=NP, by means of a polynomial-time Turing reduction from the complement of the another solution problem for the Boolean satisfiability problem.
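
To make the term concrete: a singleton attractor of a Boolean network is a state that the synchronous update maps to itself. The Python sketch below enumerates such states by brute force for a small network. It only checks a given network and is not the inference or completion algorithm from the paper. The function name and the 3-node example network are hypothetical.

```python
from itertools import product


def singleton_attractors(update_functions):
    """Return all states s with F(s) == s, where F applies every update function.

    Brute force over all 2**n states, so this is only practical for small n.
    """
    n = len(update_functions)
    fixed_points = []
    for state in product((0, 1), repeat=n):
        next_state = tuple(int(f(state)) for f in update_functions)
        if next_state == state:
            fixed_points.append(state)
    return fixed_points


# Hypothetical network: each function is a conjunction or disjunction of literals.
example_network = [
    lambda s: s[1] and s[2],  # x0 <- x1 AND x2
    lambda s: s[0] or s[2],   # x1 <- x0 OR x2
    lambda s: s[2],           # x2 <- x2
]
```

For the example network, the fixed points are the all-zero and all-one states, so `singleton_attractors(example_network)` returns `[(0, 0, 0), (1, 1, 1)]`.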

  • Track Extraction for Accelerated Targets in Dense Environments Using Variable Gating MLPDA

    Masanori MORI  Takashi MATSUZAKI  Hiroshi KAMEDA  Toru UMEZAWA  

     
    PAPER-Sensing

      Vol:
    E96-B No:8
      Page(s):
    2173-2179

    MLPDA (Maximum Likelihood Probabilistic Data Association) has attracted a great deal of attention as an effective target track extraction method in high false density environments. However, to extract an accelerated target track on a 2-dimensional plane, the computational load of the conventional MLPDA is extremely high, since it needs to search for the most likely position, velocity and acceleration of the target in 6-dimensional space. In this paper, we propose VG-MLPDA (Variable Gating MLPDA), which consists of the following two steps. The first step is to search for the target's position and velocity among candidates with the assumed acceleration by using variable gates, which take into account both the observation noise and the difference between the assumed and true acceleration. The second step is to search for the most likely position, velocity and acceleration using a maximization algorithm while reducing the gate volume. Simulation results show the validity of our method.

  • Spectral Subtraction Based on Non-extensive Statistics for Speech Recognition

    Hilman PARDEDE  Koji IWANO  Koichi SHINODA  

     
    PAPER-Speech and Hearing

      Vol:
    E96-D No:8
      Page(s):
    1774-1782

    Spectral subtraction (SS) is an additive noise removal method which is derived in an extensive framework. In spectral subtraction, it is assumed that speech and noise spectra follow Gaussian distributions and are independent of each other. Hence, noisy speech also follows a Gaussian distribution. The spectral subtraction formula is obtained by maximizing the likelihood of the noisy speech distribution with respect to its variance. However, it is well known that noisy speech observed in real situations often follows a heavy-tailed distribution, not a Gaussian distribution. In this paper, we introduce a q-Gaussian distribution in the non-extensive statistics to represent the distribution of noisy speech and derive a new spectral subtraction method based on it. We found that the q-Gaussian distribution fits the noisy speech distribution better than the Gaussian distribution does. Our speech recognition experiments using the Aurora-2 database showed that the proposed method, q-spectral subtraction (q-SS), outperformed the conventional SS method.
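
For orientation, a minimal Python sketch of conventional power-domain spectral subtraction on a single frame is given below. It illustrates only the baseline SS idea, not the q-SS method proposed in the paper. The over-subtraction factor `alpha` and the spectral floor `beta` are common practical additions that are assumed here and are not taken from the abstract.

```python
def spectral_subtract(noisy_power, noise_power, alpha=1.0, beta=0.01):
    """Subtract a noise power estimate from a noisy power spectrum, bin by bin.

    noisy_power: per-bin power of the noisy speech frame
    noise_power: per-bin estimate of the noise power
    alpha: over-subtraction factor (assumed parameter)
    beta: spectral floor as a fraction of the noisy power (assumed parameter)
    """
    if len(noisy_power) != len(noise_power):
        raise ValueError("spectra must have the same number of bins")
    return [
        max(y - alpha * n, beta * y)  # floor keeps the estimate non-negative
        for y, n in zip(noisy_power, noise_power)
    ]
```

Bins where the noise estimate exceeds the observed power fall back to the floor value instead of going negative.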

  • Face Retrieval in Large-Scale News Video Datasets

    Thanh Duc NGO  Hung Thanh VU  Duy-Dinh LE  Shin'ichi SATOH  

     
    PAPER-Image Recognition, Computer Vision

      Vol:
    E96-D No:8
      Page(s):
    1811-1825

    Face retrieval in news video has been identified as a challenging task due to the huge variations in the visual appearance of the human face. Although several approaches have been proposed to deal with this problem, their extremely high computational cost limits their scalability to large-scale video datasets that may contain millions of faces of hundreds of characters. In this paper, we introduce approaches for face retrieval that are scalable to such datasets while maintaining competitive performances with state-of-the-art approaches. To utilize the variability of face appearances in video, we use a set of face images called face-track to represent the appearance of a character in a video shot. Our first proposal is an approach for extracting face-tracks. We use a point tracker to explore the connections between detected faces belonging to the same character and then group them into one face-track. We present techniques to make the approach robust against common problems caused by flash lights, partial occlusions, and scattered appearances of characters in news videos. In the second proposal, we introduce an efficient approach to match face-tracks for retrieval. Instead of using all the faces in the face-tracks to compute their similarity, our approach obtains a representative face for each face-track. The representative face is computed from faces that are sampled from the original face-track. As a result, we significantly reduce the computational cost of face-track matching while taking into account the variability of faces in face-tracks to achieve high matching accuracy. Experiments are conducted on two face-track datasets extracted from real-world news videos, of such scales that have never been considered in the literature. One dataset contains 1,497 face-tracks of 41 characters extracted from 370 hours of TRECVID videos. The other dataset provides 5,567 face-tracks of 111 characters observed from a television news program (NHK News 7) over 11 years. 
We make both datasets publicly accessible to the research community. The experimental results show that our proposed approaches achieved a remarkable balance between accuracy and efficiency.
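
The idea of matching face-tracks through one representative face each can be sketched in Python as follows. The abstract does not state how the representative is computed or how faces are sampled, so the evenly spaced sampling, the use of a mean feature vector, and the function name are all assumptions made for this illustration.

```python
def representative_face(track, k):
    """Average k evenly spaced feature vectors from a face-track.

    track: list of equal-length feature vectors, one per detected face
    k: number of faces to sample (assumed parameter)
    """
    if not track or k <= 0:
        raise ValueError("track must be non-empty and k must be positive")
    k = min(k, len(track))
    step = len(track) / k
    sampled = [track[int(i * step)] for i in range(k)]
    dim = len(sampled[0])
    # Mean of the sampled vectors, computed per feature dimension.
    return [sum(face[d] for face in sampled) / k for d in range(dim)]
```

Comparing two tracks then costs one vector comparison instead of one comparison per pair of faces, which is the kind of saving the abstract describes.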
