The search functionality is under construction.

Keyword Search Result

[Keyword] duality(13hit)

1-13hit
  • Individuality-Preserving Gait Pattern Prediction Based on Gait Feature Transitions

    Tsuyoshi HIGASHIGUCHI  Norimichi UKITA  Masayuki KANBARA  Norihiro HAGITA  

     
    PAPER-Pattern Recognition

      Publicized:
    2018/07/20
      Vol:
    E101-D No:10
      Page(s):
    2501-2508

    This paper proposes a method for predicting individuality-preserving gait patterns. Physical rehabilitation can be performed using visual and/or physical instructions by physiotherapists or exoskeletal robots. However, template-based rehabilitation may produce discomfort and pain in a patient because of deviations from the natural gait of each patient. Our work addresses this problem by predicting an individuality-preserving gait pattern for each patient. In this prediction, the transition of the gait patterns is modeled by associating the sequence of a 3D skeleton in gait with its continuous-value gait features (e.g., walking speed or step width). In the space of the prediction model, the arrangement of the gait patterns is optimized so that (1) similar gait patterns are close to each other and (2) the gait feature changes smoothly between neighboring gait patterns. This model makes it possible to predict individuality-preserving gait patterns for each patient even if his/her various gait patterns are not available for prediction. The effectiveness of the proposed method is demonstrated quantitatively with two datasets.

  • Non-Native Text-to-Speech Preserving Speaker Individuality Based on Partial Correction of Prosodic and Phonetic Characteristics

    Yuji OSHIMA  Shinnosuke TAKAMICHI  Tomoki TODA  Graham NEUBIG  Sakriani SAKTI  Satoshi NAKAMURA  

     
    PAPER-Speech and Hearing

      Publicized:
    2016/08/30
      Vol:
    E99-D No:12
      Page(s):
    3132-3139

    This paper presents a novel non-native speech synthesis technique that preserves the individuality of a non-native speaker. Cross-lingual speech synthesis based on voice conversion or Hidden Markov Model (HMM)-based speech synthesis is a technique to synthesize foreign language speech using a target speaker's natural speech uttered in his/her mother tongue. Although the technique holds promise to improve a wide variety of applications, it tends to cause degradation of the target speaker's individuality in synthetic speech compared to intra-lingual speech synthesis. This paper proposes a new approach to speech synthesis that preserves speaker individuality by using non-native speech spoken by the target speaker. Although the use of non-native speech makes it possible to preserve the speaker individuality in the synthesized target speech, naturalness is significantly degraded as the synthesized speech waveform is directly affected by unnatural prosody and pronunciation often caused by differences in the linguistic systems of the source and target languages. To improve naturalness while preserving speaker individuality, we propose (1) a prosody correction method based on model adaptation, and (2) a phonetic correction method based on spectrum replacement for unvoiced consonants. The experimental results using English speech uttered by native Japanese speakers demonstrate that (1) the proposed methods are capable of significantly improving naturalness while preserving the speaker individuality in synthetic speech, and (2) the proposed methods also improve intelligibility as confirmed by a dictation test.

  • Voice Conversion Based on Speaker-Dependent Restricted Boltzmann Machines

    Toru NAKASHIKA  Tetsuya TAKIGUCHI  Yasuo ARIKI  

     
    PAPER-Voice Conversion and Speech Enhancement

      Vol:
    E97-D No:6
      Page(s):
    1403-1410

    This paper presents a voice conversion technique using speaker-dependent Restricted Boltzmann Machines (RBM) to build high-order eigen spaces of source/target speakers, where it is easier to convert the source speech to the target speech than in the traditional cepstrum space. We build a deep conversion architecture that concatenates the two speaker-dependent RBMs with neural networks, expecting that they automatically discover abstractions to express the original input features. Under this concept, if we train the RBMs using only the speech of an individual speaker that includes various phonemes while keeping the speaker individuality unchanged, it can be considered that there are fewer phonemes and relatively more speaker individuality in the output features of the hidden layer than original acoustic features. Training the RBMs for a source speaker and a target speaker, we can then connect and convert the speaker individuality abstractions using Neural Networks (NN). The converted abstraction of the source speaker is then back-propagated into the acoustic space (e.g., MFCC) using the RBM of the target speaker. We conducted speaker-voice conversion experiments and confirmed the efficacy of our method with respect to subjective and objective criteria, comparing it with the conventional Gaussian Mixture Model-based method and an ordinary NN.

  • Spectral Features for Perceptually Natural Phoneme Replacement by Another Speaker's Speech

    Reiko TAKOU  Hiroyuki SEGI  Tohru TAKAGI  Nobumasa SEIYAMA  

     
    PAPER-Speech and Hearing

      Vol:
    E95-A No:4
      Page(s):
    751-759

    The frequency regions and spectral features that can be used to measure the perceived similarity and continuity of voice quality are reported here. A perceptual evaluation test was conducted to assess the naturalness of spoken sentences in which either a vowel or a long vowel of the original speaker was replaced by that of another. Correlation analysis between the evaluation score and the spectral feature distance was conducted to select the spectral features that were expected to be effective in measuring the voice quality and to identify the appropriate speech segment of another speaker. The mel-frequency cepstrum coefficient (MFCC) and the spectral center of gravity (COG) in the low-, middle-, and high-frequency regions were selected. A perceptual paired comparison test was carried out to confirm the effectiveness of the spectral features. The results showed that the MFCC was effective for spectra across a wide range of frequency regions, the COG was effective in the low- and high-frequency regions, and the effective spectral features differed among the original speakers.

  • Functional Duality between Distributed Source Coding and Broadcast Channel Coding in the Case of Correlated Messages

    Suhan CHOI  Hichan MOON  Eunchul YOON  

     
    LETTER-Fundamental Theories for Communications

      Vol:
    E95-B No:1
      Page(s):
    275-278

    In this letter, functional duality between distributed source coding (DSC) with correlated messages and broadcast channel coding (BCC) with correlated messages is considered. It is shown that under certain conditions, for a given DSC problem with correlated messages, a functional dual BCC problem with correlated messages can be obtained, and vice versa. That is, the optimal encoder-decoder mappings for one problem become the optimal decoder-encoder mappings for the dual problem. Furthermore, the correlation structure of the messages in the two dual problems and the source distortion and channel cost measure for this duality are specified.

  • Joint Design of Uplink-Downlink MIMO Relay Networks Using Duality

    Seungwon CHOI  Jung-Hyun PARK  Seokkwon KIM  Dong-Jo PARK  

     
    LETTER-Wireless Communication Technologies

      Vol:
    E95-B No:1
      Page(s):
    333-336

    This letter introduces a joint design method for uplink-downlink multiple-input multiple-output (MIMO) relay communication systems in which the source nodes transmit information to the destination nodes with the help of a relay. We propose a signal forwarding scheme based on the minimum mean-square error (MMSE) approach in uplink relay systems. Exploiting the duality of relay systems, we also propose a relaying scheme for downlink relay systems. Simulation results confirm that the proposed joint design method improves the performance of the relay systems compared with that of conventional relaying schemes in uplink and downlink MIMO relay systems.

  • A Per-User QoS Enhancement Strategy via Downlink Cooperative Transmission Using Distributed Antennas

    Byungseok LEE  Ju Wook JANG  Sang-Gyu PARK  Wonjin SUNG  

     
    LETTER

      Vol:
    E93-B No:12
      Page(s):
    3538-3541

    In this letter, we address a strategy to enhance the signal-to-interference plus noise ratio (SINR) of the worst-case user by using cooperative transmission from a set of geographically separated antennas. Unlike previously reported schemes which are based on either the power control of individual antennas or cooperative orthogonal transmission, the presented strategy utilizes the minimum-mean-squared error (MMSE) filter structure for beamforming, which provides increased robustness to the external interference as well as the background noise at the receiver. By iteratively updating the cooperative transmission beamforming vector and power control (PC), the balanced SINR is obtained for all users, while the transmission power from each antenna also converges to within the constrained value. It is demonstrated that the proposed MMSE beamforming significantly outperforms other existing schemes in terms of the achievable minimum SINR.

  • The Gaussian MIMO Broadcast Channel under Receive Power Protection Constraints (Open Access)

    Ian Dexter GARCIA  Kei SAKAGUCHI  Kiyomichi ARAKI  

     
    PAPER

      Vol:
    E93-B No:12
      Page(s):
    3448-3460

    A Gaussian MIMO broadcast channel (GMBC) models the MIMO transmission of Gaussian signals from a transmitter to one or more receivers. Its capacity region and different precoding schemes for it have been well investigated, especially for the case wherein there are only transmit power constraints. In this paper, a special case of GMBC is investigated, wherein receive power constraints are also included. By imposing receive power constraints, the model, called protected GMBC (PGMBC), can be applied to certain scenarios in spatial spectrum sharing, secretive communications, mesh networks and base station cooperation. The sum capacity, capacity region, and application examples for the PGMBC are discussed in this paper. Sub-optimum precoding algorithms are also proposed for the PGMBC, where standard user precoding techniques are performed over a BC with a modified channel, which we refer to as the "protection-implied BC." In the protection-implied BC, the receiver protection constraints have been implied in the channel, which means that by satisfying the transmit power constraints on the protection-implied channel, receiver protection constraints are guaranteed to be met. Any standard single-user or multi-user MIMO precoding scheme may then be performed on the protection-implied channel. When SINR-matching duality-based precoding is applied on the protection-implied channel, sum capacity under full protection constraints (zero receive power) and near-sum capacity under partial protection constraints (limited non-zero receive power) are achieved, as verified by simulations.

  • Effective Use of Geometric Information for Clustering and Related Topics

    Tetsuo ASANO  

     
    INVITED SURVEY PAPER-Algorithms for Geometric Problems

      Vol:
    E83-D No:3
      Page(s):
    418-427

    This paper surveys how geometric information can be effectively used for efficient algorithms, with a focus on clustering problems. Given a complete weighted graph G of n vertices, is there a partition of the vertex set into k disjoint subsets so that the maximum weight of an inner-cluster edge (whose two endpoints both belong to the same subset) is minimized? This problem is known to be NP-complete even for k = 3. The case of k = 2, that is, the bipartition problem, is solvable in polynomial time. On the other hand, in the geometric setting where vertices are points in the plane and weights of edges equal the distances between corresponding points, the same problem is solvable in polynomial time even for k ≥ 3 as long as k is a fixed constant. For the case k = 2, effective use of a geometric property of an optimal solution leads to considerable improvement in the computational complexity. Other related topics are also discussed.
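
    The polynomial-time claim for k = 2 can be illustrated with a standard argument, which is not necessarily the method used in the survey: a bipartition whose heaviest inner-cluster edge has weight at most t exists exactly when the graph formed by the edges heavier than t can be 2-coloured. The sketch below tries each candidate t in increasing order. The function names and the (u, v, weight) edge format are assumptions made for this illustration.

```python
from collections import deque
from typing import List, Optional, Tuple

Edge = Tuple[int, int, float]  # (u, v, weight); hypothetical input format


def two_color(n: int, edges: List[Tuple[int, int]]) -> Optional[List[int]]:
    """Return a 0/1 colouring with no monochromatic edge, or None if impossible."""
    adj: List[List[int]] = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    color = [-1] * n
    for start in range(n):
        if color[start] != -1:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:  # breadth-first search over one connected component
            u = queue.popleft()
            for v in adj[u]:
                if color[v] == -1:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    return None  # odd cycle: not 2-colourable
    return color


def best_bipartition(n: int, edges: List[Edge]) -> Tuple[float, List[int]]:
    """Return (smallest feasible threshold t, side of each vertex).

    t is -inf when no edge is forced inside a cluster (e.g. n <= 2).
    """
    thresholds = [float("-inf")] + sorted({w for _, _, w in edges})
    for t in thresholds:
        # Every edge heavier than t must cross between the two subsets.
        heavy = [(u, v) for u, v, w in edges if w > t]
        coloring = two_color(n, heavy)
        if coloring is not None:
            return t, coloring
    raise AssertionError("unreachable: the largest weight is always feasible")
```

    Each feasibility check is a linear-time graph traversal and there are at most as many thresholds as distinct edge weights, so the whole procedure is polynomial. A binary search over the sorted weights would reduce the number of checks.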

  • Escape-Time Modified Algorithm for Generating Fractal Images Based on Petri Net Reachability

    Hussein Karam HUSSEIN  Aboul-Ella HASSANIEN  Masayuki NAKAJIMA  

     
    PAPER-Image Processing,Computer Graphics and Pattern Recognition

      Vol:
    E82-D No:7
      Page(s):
    1101-1108

    This paper presents a new approach to computer image generation via three proposed methods for translating the evolution of a Petri net into fractal image synthesis. The idea is derived from the concept of fractal iteration principles in the escape-time algorithm and chaos game. The approach uses a Petri net as a powerful abstract modeling tool for fractal image synthesis via its duality, deadlock, inhibitor arc, firing sequence and marking reachability. The objective of this approach is to enhance the analysis technique of a Petri net and use it as a novel technique for fractal image synthesis. Generating fractal images via the dynamics of a Petri net allows an easy and direct proof for the similarity and correspondence between the dynamics of complex quadratic fractals by the recursive procedure of the escape-time algorithm and the state of a Petri net via a reachability problem. The reachability problem will be manipulated in terms of the dynamics of the fractal in order to generate images via three proposed methods. Validation of our approach is given by discussion and an illustration of some experimental results.

  • Topological Walk Revisited

    Tetsuo ASANO  Takeshi TOKUYAMA  

     
    PAPER

      Vol:
    E81-A No:5
      Page(s):
    751-756

    Topological Walk is an algorithm that can sweep an arrangement of n lines in O(n^2) time and O(n) space. This paper revisits Topological Walk to give a new interpretation of it in contrast with Topological Sweep. We also survey applications of Topological Walk to make the distinction clearer.

  • Perceptual Contributions of Static and Dynamic Features of Vocal Tract Characteristics to Talker Individuality

    Weizhong ZHU  Hideki KASUYA  

     
    PAPER-Acoustics

      Vol:
    E81-A No:2
      Page(s):
    268-274

    Experiments were performed to investigate perceptual contributions of static and dynamic features of vocal tract characteristics to talker individuality. An ARX (Auto-regressive with exogenous input) speech production model was used to extract separately voice source and vocal tract parameters from a Japanese sentence, /aoiueoie/ ("Say blue top" in English) uttered by three males. The Discrete Cosine Transform (DCT) was applied to resolve formant trajectories of the speech signal into static and dynamic components. The perceptual contributions were quantitatively studied by systematically replacing the corresponding formant components of the sentences between the three talkers. Results of the experiments show that the static (average) feature of the vocal tract is a primary cue to talker individuality.

  • Partial Construction of an Arrangement of Lines and Its Application to Optimal Partitioning of Bichromatic Point Set

    Tetsuo ASANO  Takeshi TOKUYAMA  

     
    PAPER

      Vol:
    E77-A No:4
      Page(s):
    595-600

    This paper presents an efficient algorithm for constructing at-most-k levels of an arrangement of n lines in the plane in time O(nk+n log n), which is optimal since Ω(nk) line segments are included there. The algorithm can sweep the at-most-k levels of the arrangement using O(n) space. Although Everett et al. recently gave an algorithm for constructing the at-most-k levels with the same time complexity independently, our algorithm is superior with respect to the space complexity as a sweep algorithm. Then, we apply the algorithm to a bipartitioning problem of a bichromatic point set: For r red points and b blue points in the plane and a directed line L, the figure of demerit fd(L) associated with L is defined to be the sum of the number of blue points below L and that of red ones above L. The problem we are going to consider is to find an optimal partitioning line to minimize the figure of demerit. Given a number k, our algorithm first determines whether there is a line whose figure of demerit is at most k, and further finds an optimal bipartitioning line if there is one. It runs in O(kn+n log n) time (n=r+b), which is subquadratic if k is sublinear.
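
    The figure of demerit defined above can be evaluated directly for any single line by counting points on each side. The sketch below is a brute-force illustration of that definition only, not the paper's O(kn + n log n) algorithm. It assumes that the directed line L is given by two points a and b, that "above" means to the left of the direction from a to b, and that points lying exactly on L are not counted; these conventions and the function names are assumptions.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]


def side(p: Point, a: Point, b: Point) -> float:
    """Cross product: positive if p lies to the left of the directed line a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def figure_of_demerit(red: Iterable[Point], blue: Iterable[Point],
                      a: Point, b: Point) -> int:
    """fd(L) = (# blue points below L) + (# red points above L).

    Assumption: "above" is the left side of the directed line a -> b.
    """
    blue_below = sum(1 for p in blue if side(p, a, b) < 0)
    red_above = sum(1 for p in red if side(p, a, b) > 0)
    return blue_below + red_above


# Usage: the x-axis directed to the right, then the same line reversed.
red_points = [(0.0, 1.0), (1.0, 2.0)]
blue_points = [(0.0, -1.0)]
print(figure_of_demerit(red_points, blue_points, (0.0, 0.0), (1.0, 0.0)))
print(figure_of_demerit(red_points, blue_points, (1.0, 0.0), (0.0, 0.0)))
```

    Reversing the direction of L swaps "above" and "below", which is why the definition is stated for a directed line.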