Author Search Result

[Author] Masashi UNOKI (8 hits)

1-8 of 8 hits
  • Method of Audio Watermarking Based on Adaptive Phase Modulation

    Nhut Minh NGO  Masashi UNOKI  

     
    PAPER

      Publicized:
    2015/10/21
      Vol:
    E99-D No:1
      Page(s):
    92-101

    This paper proposes a watermarking method for digital audio signals based on adaptive phase modulation. Audio signals are usually non-stationary, i.e., their characteristics vary over time. Features for watermarking are usually selected without taking this variability into account, which affects the performance of the whole watermarking system. The proposed method embeds a watermark into an audio signal by adaptively modulating its phase with the watermark using IIR all-pass filters. The frequency location of the pole-zero pair of an IIR all-pass filter, which characterizes the filter's transfer function, is adapted on the basis of the signal power distribution over sub-bands in the magnitude spectrum domain. The pole-zero locations are adapted so that the phase modulation produces only slight distortion in watermarked signals, to achieve the best sound quality. The experimental results show that the proposed method could embed inaudible watermarks into various kinds of audio signals and correctly detect watermarks without the aid of original signals. A reasonable trade-off between inaudibility and robustness could be obtained by balancing the phase modulation scheme. The proposed method can embed a watermark into audio signals at up to 100 bits per second with 99% accuracy when there is no attack and at 6 bits per second with 94.3% accuracy under attacks.
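As a rough illustration of why all-pass filters suit phase-based embedding, the sketch below evaluates a first-order IIR all-pass filter, H(z) = (-a + z^-1) / (1 - a z^-1), which has its pole at z = a and its zero at z = 1/a. The first-order form, the function name `allpass_response`, and the two pole values are assumptions chosen for illustration; they are not taken from the paper, whose filter design and adaptation rule are not given in the abstract.

```python
import cmath
import math


def allpass_response(a, omega):
    """Response of H(z) = (-a + z^-1) / (1 - a z^-1) at radian frequency omega."""
    z_inv = cmath.exp(-1j * omega)
    return (-a + z_inv) / (1 - a * z_inv)


# Compare two hypothetical pole positions over a few frequencies in (0, pi).
# The magnitude stays at 1 while the phase depends on the pole position.
for a in (0.3, 0.7):
    for k in range(1, 8):
        omega = k * math.pi / 8
        h = allpass_response(a, omega)
        print(f"a={a:.1f} omega={omega:.3f} "
              f"|H|={abs(h):.6f} phase={cmath.phase(h):+.3f} rad")
```

Because the magnitude response is flat, moving the pole-zero pair changes only the phase, which is the degree of freedom such a scheme can use to carry watermark bits.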

  • Speech Enhancement Based on Noise Eigenspace Projection

    Dongwen YING  Masashi UNOKI  Xugang LU  Jianwu DANG  

     
    PAPER-Speech and Hearing

      Vol:
    E92-D No:5
      Page(s):
    1137-1145

    Reducing noise with less speech distortion is a challenging issue for speech enhancement. We propose a novel approach for reducing noise at the cost of less speech distortion. A noise signal can generally be considered to consist of two components, a "white-like" component with a uniform energy distribution and a "color" component with a concentrated energy distribution in some frequency bands. An approach based on noise eigenspace projections is proposed to pack the color component into a subspace, named "noise subspace". This subspace is then removed from the eigenspace to reduce the color component. For the white-like component, a conventional enhancement algorithm is adopted as a complementary processor. We tested our algorithm on a speech enhancement task using speech data from the Texas Instruments and Massachusetts Institute of Technology (TIMIT) dataset and noise data from NOISEX-92. The experimental results show that the proposed algorithm efficiently reduces noise with little speech distortion. Objective and subjective evaluations confirmed that the proposed algorithm outperformed conventional enhancement algorithms.

  • Singular-Spectrum Analysis for Digital Audio Watermarking with Automatic Parameterization and Parameter Estimation (Open Access)

    Jessada KARNJANA  Masashi UNOKI  Pakinee AIMMANEE  Chai WUTIWIWATCHAI  

     
    PAPER-Information Network

      Publicized:
    2016/05/16
      Vol:
    E99-D No:8
      Page(s):
    2109-2120

    This paper proposes a blind, inaudible, robust digital-audio watermarking scheme based on singular-spectrum analysis, which relates to watermarking techniques based on singular value decomposition. We decompose a host signal into its oscillatory components and modify the amplitudes of some of those components according to a watermark bit and an embedding rule. To improve the sound quality of a watermarked signal while maintaining robustness, differential evolution is introduced to find optimal parameters of the proposed scheme. Test results show that, although a trade-off between inaudibility and robustness still persists, the difference in sound quality between the original signal and the watermarked one is considerably smaller. This improved scheme is robust against many attacks, such as MP3 and MP4 compression and band-pass filtering. However, there is a drawback: some music-dependent parameters need to be shared between the embedding and extraction processes. To overcome this drawback, we propose a method for automatic parameter estimation. By incorporating the estimation method into the framework, those parameters need not be shared, and the test results show that it can blindly decode watermark bits with an accuracy of 99.99%. This paper not only proposes a new technique and scheme but also discusses the singular value and its physical interpretation.
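The decomposition step of singular-spectrum analysis can be sketched as follows: embed the signal in a trajectory (Hankel) matrix, take a rank-1 component, and map it back to a series by diagonal averaging. This is a minimal sketch under stated assumptions: it extracts only the leading component, uses plain power iteration instead of a full singular value decomposition, and the function names, window length, and test series are all hypothetical. The paper's embedding rule and parameter estimation are not shown.

```python
import math


def trajectory_matrix(series, window):
    """Hankel embedding: entry (i, j) holds series[i + j]."""
    cols = len(series) - window + 1
    return [[series[i + j] for j in range(cols)] for i in range(window)]


def leading_component(matrix, iterations=200):
    """Rank-1 term sigma * u * v^T for the largest singular value (power iteration)."""
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(cols)] * cols  # assumed start vector
    for _ in range(iterations):
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            raise ValueError("start vector is orthogonal to the data")
        v = [c / norm for c in w]
    sigma_u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return [[sigma_u[i] * v[j] for j in range(cols)] for i in range(rows)]


def diagonal_average(matrix):
    """Average each anti-diagonal to turn a matrix back into a series."""
    rows, cols = len(matrix), len(matrix[0])
    sums = [0.0] * (rows + cols - 1)
    counts = [0] * (rows + cols - 1)
    for i in range(rows):
        for j in range(cols):
            sums[i + j] += matrix[i][j]
            counts[i + j] += 1
    return [s / c for s, c in zip(sums, counts)]


# Hypothetical host signal: slow trend plus an oscillation.
series = [0.05 * t + math.sin(2 * math.pi * t / 12) for t in range(60)]
component = diagonal_average(leading_component(trajectory_matrix(series, window=20)))
print([round(c, 3) for c in component[:5]])
```

A watermarking scheme of the kind described would scale the singular values of selected components before reconstruction; here the leading component is returned unmodified.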

  • Speech Analysis Method Based on Source-Filter Model Using Multivariate Empirical Mode Decomposition

    Surasak BOONKLA  Masashi UNOKI  Stanislav S. MAKHANOV  Chai WUTIWIWATCHAI  

     
    PAPER-Speech and Hearing

      Vol:
    E99-A No:10
      Page(s):
    1762-1773

    We propose a speech analysis method based on the source-filter model using multivariate empirical mode decomposition (MEMD). The proposed method takes multiple adjacent frames of a speech signal into account by combining their log spectra into multivariate signals. The multivariate signals are then decomposed into intrinsic mode functions (IMFs). The IMFs are divided into two groups using the peak of the autocorrelation function (ACF) of an IMF. The first group characterized by a spectral fine structure is used to estimate the fundamental frequency F0 by using the ACF, whereas the second group characterized by the frequency response of the vocal-tract filter is used to estimate formant frequencies by using a peak picking technique. There are two advantages of using MEMD: (i) the variation in the number of IMFs is eliminated in contrast with single-frame based empirical mode decomposition and (ii) the common information of the adjacent frames aligns in the same order of IMFs because of the common mode alignment property of MEMD. These advantages make the analysis more accurate than with other methods. As opposed to the conventional linear prediction (LP) and cepstrum methods, which rely on the LP order and cut-off frequency, respectively, the proposed method automatically separates the glottal-source and vocal-tract filter. The results showed that the proposed method exhibits the highest accuracy of F0 estimation and correctly estimates the formant frequencies of the vocal-tract filter.
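The ACF step can be illustrated on a synthetic spectral fine structure: harmonics spaced F0 apart produce a ripple along the frequency axis, and the lag of the autocorrelation peak gives that spacing. The sketch below is a simplified illustration, not the paper's pipeline: it skips MEMD entirely, and the cosine ripple, the 10 Hz bin spacing, the lag search range, and the function names are assumptions.

```python
import math


def autocorrelation(x, max_lag):
    """Biased autocorrelation r[lag] = sum_k x[k] * x[k + lag], lag = 0..max_lag."""
    n = len(x)
    return [sum(x[k] * x[k + lag] for k in range(n - lag))
            for lag in range(max_lag + 1)]


def estimate_f0(fine_structure, bin_spacing_hz, min_lag, max_lag):
    """Estimate F0 as the harmonic spacing found at the ACF peak along frequency."""
    acf = autocorrelation(fine_structure, max_lag)
    best_lag = max(range(min_lag, max_lag + 1), key=lambda lag: acf[lag])
    return best_lag * bin_spacing_hz


# Synthetic fine structure: a ripple that repeats every F0 / bin_spacing bins.
bin_spacing_hz = 10.0
true_f0_hz = 120.0
period_bins = true_f0_hz / bin_spacing_hz
ripple = [math.cos(2 * math.pi * k / period_bins) for k in range(256)]

print(estimate_f0(ripple, bin_spacing_hz, min_lag=4, max_lag=64))
```

The lower bound on the lag keeps the search away from the trivial peak at lag zero; with real speech the bounds would be set from the expected F0 range.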

  • Robust, Blindly-Detectable, and Semi-Reversible Technique of Audio Watermarking Based on Cochlear Delay Characteristics

    Masashi UNOKI  Ryota MIYAUCHI  

     
    PAPER

      Vol:
    E98-D No:1
      Page(s):
    38-48

    We previously proposed an inaudible non-blind digital-audio watermarking approach based on cochlear delay (CD) characteristics. There are, however, three remaining issues with regard to blind-detectability, frame synchronization related to confidentiality, and reversibility. We attempted to solve these issues in developing the proposed approach by taking blind-detectability and reversibility of audio watermarking into consideration. Frame synchronization was also incorporated into the proposed approach to improve confidentiality. We evaluated inaudibility, robustness, and reversibility with the new approach by carrying out three objective tests (PEAQ, LSD, and bit-detection or SNR) and six robustness tests. The results revealed that inaudible, robust, blindly-detectable, and semi-reversible watermarking based on CD could be accomplished.

  • MTF-Based Kalman Filtering with Linear Prediction for Power Envelope Restoration in Noisy Reverberant Environments

    Yang LIU  Shota MORITA  Masashi UNOKI  

     
    PAPER-Digital Signal Processing

      Vol:
    E99-A No:2
      Page(s):
    560-569

    This paper proposes a method based on modulation transfer function (MTF) to restore the power envelope of noisy reverberant speech by using a Kalman filter with linear prediction (LP). Its advantage is that it can simultaneously suppress the effects of noise and reverberation by restoring the smeared MTF without measuring room impulse responses. This scheme has two processes: power envelope subtraction and power envelope inverse filtering. In the subtraction process, the statistical properties of observation noise and driving noise for power envelope are investigated for the criteria of the Kalman filter which requires noise to be white and Gaussian. Furthermore, LP coefficients drastically affect the Kalman filter performance, and a method is developed for deriving LP coefficients from noisy reverberant speech. In the dereverberation process, an inverse filtering method is applied to remove the effects of reverberation. Objective experiments were conducted under various noisy reverberant conditions to evaluate how well the proposed Kalman filtering method based on MTF improves the signal-to-error ratio (SER) and correlation between restored power envelopes compared with conventional methods. Results showed that the proposed Kalman filtering method based on MTF can improve SER and correlation more than conventional methods.
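The role of linear prediction inside a Kalman filter can be sketched with the simplest case: a first-order predictor x[n] = a x[n-1] + w[n] as the state model, observed in white Gaussian noise. This is an illustrative sketch only. The first-order model, the known coefficient `a`, the noise variances, and the synthetic data are assumptions, and the MTF-based power envelope processing and inverse filtering described in the abstract are not included.

```python
import random


def kalman_ar1(observations, a, q, r):
    """Scalar Kalman filter for x[n] = a*x[n-1] + w[n], y[n] = x[n] + v[n].

    q is the variance of the driving noise w, r the variance of the
    observation noise v; both are assumed white and Gaussian.
    """
    x_est, p_est = 0.0, 1.0
    estimates = []
    for y in observations:
        x_pred = a * x_est                 # prediction from the LP model
        p_pred = a * a * p_est + q
        gain = p_pred / (p_pred + r)       # Kalman gain
        x_est = x_pred + gain * (y - x_pred)
        p_est = (1.0 - gain) * p_pred
        estimates.append(x_est)
    return estimates


def simulate(n, a, q, r, seed=0):
    """Generate a synthetic AR(1) state sequence and its noisy observations."""
    rng = random.Random(seed)
    x = 0.0
    states, observations = [], []
    for _ in range(n):
        x = a * x + rng.gauss(0.0, q ** 0.5)
        states.append(x)
        observations.append(x + rng.gauss(0.0, r ** 0.5))
    return states, observations


def mean_squared_error(u, v):
    return sum((s - t) ** 2 for s, t in zip(u, v)) / len(u)


states, observations = simulate(2000, a=0.95, q=0.1, r=1.0)
restored = kalman_ar1(observations, a=0.95, q=0.1, r=1.0)
print("MSE of noisy observations:", mean_squared_error(observations, states))
print("MSE after Kalman filtering:", mean_squared_error(restored, states))
```

The sketch also shows why the abstract stresses the LP coefficients: the prediction step depends entirely on `a`, so a poorly estimated coefficient degrades every later update.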

  • Speech Watermarking Method Based on Formant Tuning

    Shengbei WANG  Masashi UNOKI  

     
    PAPER

      Vol:
    E98-D No:1
      Page(s):
    29-37

    This paper proposes a speech watermarking method based on the concept of formant tuning. The characteristic that formant tuning can improve the sound quality of synthesized speech was employed to achieve inaudibility for watermarking. In the proposed method, formants were first extracted with linear prediction (LP) analysis and then embedded with watermarks by symmetrically controlling a pair of line spectral frequencies (LSFs) as formant tuning. We evaluated the proposed method in two kinds of experiments on inaudibility and robustness, in comparison with other methods. Inaudibility was evaluated with objective and subjective tests, and robustness was evaluated with speech codecs and speech processing. The results revealed that the proposed method could satisfy both the inaudibility and the robustness that are required for speech watermarking.

  • Non-Blind Speech Watermarking Method Based on Spread-Spectrum Using Linear Prediction Residue

    Reiya NAMIKAWA  Masashi UNOKI  

     
    LETTER

      Publicized:
    2019/10/23
      Vol:
    E103-D No:1
      Page(s):
    63-66

    We propose a method of non-blind speech watermarking based on direct spread spectrum (DSS) using a linear prediction scheme to address the sound distortion caused by spread spectrum. Results of evaluation simulations revealed that the proposed method had much lower sound-quality distortion than the DSS method while having almost the same bit error ratios (BERs) against various attacks as the DSS method.
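For context, the baseline DSS scheme that the abstract compares against can be sketched as below: each bit is spread with a pseudo-noise (PN) sequence and added to the host, and a non-blind detector subtracts the known host before correlating with the same PN sequence. This sketch shows plain DSS only, not the proposed linear-prediction-residue variant; the frame length, embedding strength `alpha`, random host signal, and function names are assumptions.

```python
import random


def make_pn(length, rng):
    """Pseudo-noise sequence of +1/-1 chips."""
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def embed(host, bits, pn, alpha):
    """Add alpha * (+pn or -pn) to each frame of the host, one bit per frame."""
    frame = len(pn)
    out = list(host)
    for i, bit in enumerate(bits):
        sign = 1.0 if bit else -1.0
        for j in range(frame):
            out[i * frame + j] += alpha * sign * pn[j]
    return out


def detect_non_blind(received, host, pn, n_bits):
    """Subtract the known host, then correlate each frame with the PN sequence."""
    frame = len(pn)
    bits = []
    for i in range(n_bits):
        corr = sum((received[i * frame + j] - host[i * frame + j]) * pn[j]
                   for j in range(frame))
        bits.append(1 if corr > 0 else 0)
    return bits


def bit_error_ratio(sent, got):
    return sum(s != g for s, g in zip(sent, got)) / len(sent)


rng = random.Random(1)
pn = make_pn(64, rng)
bits = [rng.randint(0, 1) for _ in range(16)]
host = [rng.gauss(0.0, 1.0) for _ in range(len(bits) * len(pn))]  # stand-in for speech

watermarked = embed(host, bits, pn, alpha=0.01)
decoded = detect_non_blind(watermarked, host, pn, len(bits))
print("BER without attack:", bit_error_ratio(bits, decoded))
```

Raising `alpha` makes detection more robust to attacks but adds more audible noise, which is the distortion problem the abstract refers to.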