The search functionality is under construction.

Keyword Search Result

[Keyword] rhythm(15hit)

1-15hit
  • Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis

    Kenichi FUJITA  Atsushi ANDO  Yusuke IJIMA  

     
    PAPER-Speech and Hearing

      Publicized:
    2023/10/06
      Vol:
    E107-D No:1
      Page(s):
    93-104

    This paper proposes a speech rhythm-based method for speaker embeddings to model phoneme duration using a few utterances by the target speaker. Speech rhythm is one of the essential factors among speaker characteristics, along with acoustic features such as F0, for reproducing individual utterances in speech synthesis. A novel feature of the proposed method is the rhythm-based embeddings extracted from phonemes and their durations, which are known to be related to speaking rhythm. They are extracted with a speaker identification model similar to the conventional spectral feature-based one. We conducted three experiments, speaker embeddings generation, speech synthesis with generated embeddings, and embedding space analysis, to evaluate the performance. The proposed method demonstrated a moderate speaker identification performance (15.2% EER), even with only phonemes and their duration information. The objective and subjective evaluation results demonstrated that the proposed method can synthesize speech with speech rhythm closer to the target speaker than the conventional method. We also visualized the embeddings to evaluate the relationship between the distance of the embeddings and the perceptual similarity. The visualization of the embedding space and the analysis of this relationship indicated that the distribution of embeddings reflects the subjective and objective similarity.

  • Mathematical Analysis of Phase Resetting Control Mechanism during Rhythmic Movements

    Kazuki NAKADA  Keiji MIURA  

     
    INVITED PAPER

      Vol:
    E103-A No:2
      Page(s):
    398-406

    Possible functional roles of the phase resetting control during rhythmic movements have been attracting much attention in the field of robotics. The phase resetting control is a control mechanism in which the phase shift of periodic motion is induced depending on the timing of a given perturbation, leading to dynamical stability such as a rapid transition from an unstable state to a stable state in rhythmic movements. A phase response curve (PRC) is used to quantitatively evaluate the phase shift in the phase resetting control. It has been demonstrated that an optimal PRC for bipedal walking becomes bimodal. The PRCs acquired by reinforcement learning in simulated biped walking are qualitatively consistent with measured results obtained from experiments. In this study, we considered how such characteristics are obtained from a mathematical point of view. First, we assumed a symmetric Bonhoeffer-Van der Pol oscillator and phase excitable element known as an active rotator as a model of the central pattern generator for controlling rhythmic movements. Second, we constructed feedback control systems by combining them with manipulators. Next, we numerically computed the PRCs of such systems and compared the resulting PRCs. Furthermore, we approximately calculated analytical solutions of the PRCs. Based on the results, we systematically investigated the parameter dependence of the analytical PRCs. Finally, we investigated the requirements for realizing an optimal PRC for the phase resetting control during rhythmic movements.

  • Rhythm Tap Technique for Cross-Device Interaction Enabling Uniform Operation for Various Devices Open Access

    Hirohito SHIBATA  Junko ICHINO  Shun'ichi TANO  Tomonori HASHIYAMA  

     
    PAPER-Human-computer Interaction

      Publicized:
    2019/09/19
      Vol:
    E102-D No:12
      Page(s):
    2515-2523

    This paper proposes a novel interaction technique to transfer data across various types of digital devices in a uniform manner and to allow specifying what kind of data should be sent. In our framework, when users tap multiple devices rhythmically, data corresponding to the rhythm (transfer type) are transferred from the device tapped first (source device) to the other (target device). It is easy to operate, applicable to a wide range of devices, and extensible in the sense that we can adopt new transfer types by adding new rhythms. A subjective evaluation and a simulation suggest that our approach is feasible. We also discuss suggestions and limitations for implementing the technique.

  • Deformable Part Model Based Arrhythmia Detection Using Time Domain Features

    Yuuka HIRAO  Yoshinori TAKEUCHI  Masaharu IMAI  Jaehoon YU  

     
    PAPER-Digital Signal Processing

      Vol:
    E100-A No:11
      Page(s):
    2221-2229

    Heart disease is one of the major causes of death in many advanced countries. For prevention or treatment of heart disease, getting an early diagnosis from a long-term electrocardiogram (ECG) examination is necessary. However, analyzing this large amount of data can be a large burden on medical experts. To reduce the burden and support the analysis, this paper proposes an arrhythmia detection method based on a deformable part model, which absorbs individual variation of the ECG waveform and enables the detection of various arrhythmias. Moreover, to detect arrhythmia with low processing delay, the proposed method utilizes only time domain features. In experiments, the proposed method achieved an F-measure of 0.91 for arrhythmia detection.

  • Real-Time and Memory-Efficient Arrhythmia Detection in ECG Monitors Using Antidictionary Coding

    Takahiro OTA  Hiroyoshi MORITA  Adriaan J. de Lind van WIJNGAARDEN  

     
    PAPER-Source Coding

      Vol:
    E96-A No:12
      Page(s):
    2343-2350

    This paper presents a real-time and memory-efficient arrhythmia detection system with binary classification that uses antidictionary coding for the analysis and classification of electrocardiograms (ECGs). The measured ECG signals are encoded using a lossless antidictionary encoder, and the system subsequently uses the compression rate to distinguish between normal beats and arrhythmia. An automated training data procedure is used to construct the automatons, which are probabilistic models used to compress the ECG signals, and to determine the threshold value for detecting the arrhythmia. Real-time computer simulations with samples from the MIT-BIH arrhythmia database show that the averages of sensitivity and specificity of the proposed system are 97.8% and 96.4% for premature ventricular contraction detection, respectively. The automatons are constructed using training data and comprise only 11 kilobytes on average. The low complexity and low memory requirements make the system particularly suitable for implementation in portable ECG monitors.

  • A Method for Predicting Stressed Words in Teaching Materials for English Jazz Chants

    Ryo NAGATA  Kotaro FUNAKOSHI  Tatsuya KITAMURA  Mikio NAKANO  

     
    PAPER-Educational Technology

      Vol:
    E95-D No:11
      Page(s):
    2658-2663

    To acquire a second language, one must develop an ear and tongue for the correct stress and intonation patterns of that language. In English language teaching, there is an effective method called Jazz Chants for working on the sound system. In this paper, we propose a method for predicting stressed words, which play a crucial role in Jazz Chants. The proposed method is specially designed for stress prediction in Jazz chants. It exploits several sources of information including words, POSs, sentence types, and the constraint on the number of stressed words in a chant text. Experiments show that the proposed method achieves an F-measure of 0.939 and outperforms the other methods implemented for comparison. The proposed method is expected to be useful in supporting non-native teachers of English when they teach chants to students and create chant texts with stress marks from arbitrary texts.

  • Causality of Frontal and Occipital Alpha Activity Revealed by Directed Coherence

    Gang WANG  Kazutomo YUNOKUCHI  

     
    PAPER-Medical Engineering

      Vol:
    E85-D No:8
      Page(s):
    1334-1340

    Recently there has been increased attention to the causality among biomedical signals. The causality between brain structures involved in the generation of alpha activity is examined based on EEG signals acquired simultaneously in the frontal and occipital regions of the scalp. The concept of directed coherence (DC) is introduced as a means of resolving two-signal observations into the constituent components of original signals, the interaction between signals and the influence of one signal source on the other, through autoregressive modeling. The technique was applied to EEG recorded from 11 normal subjects with eyes closed. Through an analysis of the directed coherence, it was found that in both the left and right hemispheres, alpha rhythms with relatively low frequency had a significantly higher correlation in the frontal-occipital direction than in the opposite direction. In the upper alpha frequency band, a significantly higher DC was observed in the occipital-frontal direction, and the right-left DC in the occipital area was consistently higher. The activity of rhythms near 10 Hz was widespread. These results suggest that there is a difference in the genesis and the structure of information transmission in the lower and upper band, and for 10-Hz alpha waves.

  • Chaotic Features of Rhythmic Joint Movement

    Hirokazu IWASE  Atsuo MURATA  

     
    LETTER-Medical Engineering

      Vol:
    E85-D No:7
      Page(s):
    1175-1179

    The purpose of this study is to show the chaotic features of rhythmic joint movement. Depending on the experimental conditions, one (or both) elbow angle(s) was (were) measured by one (or two) goniometer(s). Pacing was provided for six different frequencies presented in random order. When the frequency of the pace increased, the fractal dimension and first Lyapunov exponent tended to increase. Moreover, the first Lyapunov exponent obtained positive values for all of the observed data. These results indicate that there is chaos in rhythmic joint movement and that the larger the frequency, the more chaotic the joint movement becomes.

  • BP Neural Networks Approach for Identifying Biological Signal Source in Circadian Data Fluctuations

    Youssouf CISSE  Yohsuke KINOUCHI  Hirofumi NAGASHINO  Masatake AKUTAGAWA  

     
    PAPER-Biocybernetics, Neurocomputing

      Vol:
    E85-D No:3
      Page(s):
    568-576

    Almost all land animals coordinate their behavior with circadian rhythms, matching their functions to the daily cycles of lightness and darkness that result from the 24-hour rotation of the earth. External stimuli, such as daily life activities or other sources from our environment, may influence the internal rhythmicity of sleep and waking properties. However, the rhythms are regulated by homeostasis to keep their activity constant while fluctuating under the incessant influence of external forces. A modeling study has been developed to identify homeostatic dynamics properties underlying the circadian rhythm activity of Sleep and Wake data measured from normal subjects, using an MA (Moving Average) model associated with the Backpropagation (BP) algorithm. As a result, we found that the neural network can capture the regularity and irregularity components included in the data. The order of the MA neural network model depends on the subject's behavior; the first two orders are usually dominant in the case of no strong external forces. The adaptive dynamic changes are evaluated by the change of weight vectors, a kind of internal representation of the trained network. The dynamic is kept in a steady state for more than 20 days at most. Identified properties reflect the subject's behavior, and hence may be useful for medical diagnoses of disorders related to circadian rhythms.

  • Evaluation of Mental Workload by Variability of Pupil Area

    Atsuo MURATA  Hirokazu IWASE  

     
    LETTER-Medical Engineering

      Vol:
    E83-D No:5
      Page(s):
    1187-1190

    It is generally known that the autonomic nervous system regulates the pupil. In this study, we attempted to assess mental workload on the basis of the fluctuation rhythm in the pupil area. Controlling the respiration interval, we measured the pupil area during mental tasking for one minute. We simultaneously measured the respiration curve to monitor the respiration interval. We required the subject to perform two mental tasks. One was a mathematical division task, the difficulty of which was set to two, three, four, and five dividends. The other was a Sternberg memory search task, which had four work levels defined by the number of memory sets. In the Sternberg memory search, the number of memory sets changed from five to eight. In this way, we changed the mental workload induced by mental loading. As a result of calculating an autoregressive (AR) power spectrum, we could observe two peaks which corresponded to the blood pressure variation and respiratory sinus arrhythmia under a low workload. With an increased workload, the spectral peak related to the respiratory sinus arrhythmia disappeared. The ratio of the power at the low frequency band, from 0.05-0.15 Hz, to the power at the respiration frequency band, from 0.35-0.4 Hz, increased with the work level. In conclusion, the fluctuation of the pupil area is a promising means for the evaluation of mental workload or autonomic nervous function.

  • An Efficient R-R Interval Detection for ECG Monitoring System

    Takashi KOHAMA  Shogo NAKAMURA  Hiroshi HOSHINO  

     
    PAPER-Medical Electronics and Medical Information

      Vol:
    E82-D No:10
      Page(s):
    1425-1432

    The recording of electrocardiogram (ECG) signals for the purpose of finding arrhythmias takes 24 hours. Generally speaking, changes in R-R intervals are used to detect arrhythmias. Our purpose is to develop an algorithm which efficiently detects R-R intervals. This system uses the R-wave position to calculate R-R intervals and then detects any arrhythmias. The algorithm searches for only the short time duration estimated from the most recent R-wave position in order to detect the next R-wave efficiently. We call this duration a WINDOW. A WINDOW is decided according to a proposed search algorithm so that the next R-wave can be expected in the WINDOW. In a case in which an S-wave is enhanced for some reason such as the manner in which the electrodes are installed in the system, the S-wave positions are taken to calculate the peak intervals instead of the R-wave. However, baseline wander and noise contained in the ECG signal have a deterrent effect on the accuracy with which the R-wave or the S-wave position is determined. In order to improve detection, the ECG signal is preprocessed using a Band-Pass Filter (BPF) which is composed of simple Cascaded Integrator Comb (CIC) filters. The American Heart Association (AHA) database was used in the simulation with the proposed algorithm. Accurate detection of the R-wave position was achieved in 99% of cases and efficient extraction of R-R intervals was possible.

  • Assessment of Fatigue by Pupillary Response

    Atsuo MURATA  

     
    PAPER-Systems and Control

      Vol:
    E80-A No:7
      Page(s):
    1318-1323

    This study was conducted to assess the relationship between fatigue and pupillary responses. Pupillary responses, ECG and blood pressure were measured for 24 hours every 30 min in 8 subjects. A questionnaire was used to rate subjective feeling of fatigue. Twenty-four hours were divided equally into four 6-hour blocks. Subjective feeling of fatigue increased markedly in the fourth block, and the difference in subjective fatigue between fourth and first blocks was significant. Of nine pupillary responses, the pupil diameter was found to decrease with time. With respect to the function of the autonomic nervous system such as heart rate, systolic blood pressure and diastolic blood pressure, only heart rate was found to be sensitive to the increased subjective feeling of fatigue. A significant difference was found in the mean pupil diameter and mean heart rate between the last and first blocks. This result indicates that pupil diameter is related to fatigue and can be used to assess fatigue. Possible implications for fatigue assessment are discussed.

  • Multimodal Interaction in Human Communication

    Keiko WATANUKI  Kenji SAKAMOTO  Fumio TOGAWA  

     
    PAPER

      Vol:
    E78-D No:6
      Page(s):
    609-615

    We are developing multimodal man-machine interfaces through which users can communicate by integrating speech, gaze, facial expressions, and gestures such as nodding and finger pointing. Such multimodal interfaces are expected to provide more flexible, natural and productive communications between humans and computers. To achieve this goal, we have taken the approach of modeling human behavior in the context of ordinary face-to-face conversations. As the first step, we have implemented a system which utilizes video and audio recording equipment to capture verbal and nonverbal information in interpersonal communications. Using this system, we have collected data from a task-oriented conversation between a guest (subject) and a receptionist at a company reception desk, and quantitatively analyzed these data with respect to multi-modalities which would be functional in fluid interactions. This paper presents detailed analyses of the data collected: (1) head nodding and eye-contact are related to the beginning and end of speaking turns, acting to supplement speech information; (2) listener responses occur after an average of 0.35 sec. from the receptionist's utterance of a keyword, and turn-taking for tag-questions occurs after an average of 0.44 sec.; and (3) there is a rhythmical coordination between speakers and listeners.

  • Experimental Discussion on Measurement of Mental Workload--Evaluation of Mental Workload by HRV Measures--

    Atsuo MURATA  

     
    PAPER-Ergonomics and medical Engineering

      Vol:
    E77-A No:2
      Page(s):
    409-416

    The aim of this study is to evaluate mental workload (MWL) quantitatively by HRV (Heart Rate Variability) measures. The electrocardiography and the respiration curve were recorded in five different epochs (1) during a rest condition and (2) during mental arithmetic tasks (addition). In the experiment, subjects added two numbers. The work levels (figures of the number in the addition) were set to one figure, two figures, three figures and four figures. The work level had effects on the mean percent correct, the number of answers and the mean processing time. The psychological evaluation on mental workload obtained by the method of paired comparison increased with the work level. Among the statistical HRV measures, the number of peak and trough waves could distinguish between the rest and the mental loading. However, mental workload for each work level was not evaluated quantitatively by the measure. The HRV measures were also calculated from the power spectrum estimated by the autoregressive (AR) model identification. The ratio of the low frequency power to the high frequency power increased linearly with the work level. In conclusion, the HRV measures obtained by the AR power spectrum analysis were found to be sensitive to changes of mental workload.

  • Multiwave: A Wavelet-Based ECG Data Compression Algorithm

    Nitish V. THAKOR  Yi-chun SUN  Hervé RIX  Pere CAMINAL  

     
    PAPER

      Vol:
    E76-D No:12
      Page(s):
    1462-1469

    The MultiWave data compression algorithm is based on the multiresolution wavelet technique for decomposing electrocardiogram (ECG) signals into their coarse and successively more detailed components. At each successive resolution, or scale, the data are convolved with appropriate filters and then the alternate samples are discarded. This procedure results in a data compression rate that increases on a dyadic scale with successive wavelet resolutions. ECG signals recorded from patients with normal sinus rhythm, supraventricular tachycardia, and ventricular tachycardia are analyzed. The data compression rates and the percentage distortion levels at each resolution are obtained. The performance of the MultiWave data compression algorithm is shown to be superior to another algorithm (the Turning Point algorithm) that also carries out data reduction on a dyadic scale.