IEICE global.ieice.org Site

Keyword Search Result

[Keyword] factor analysis (8 hits)

Tensor Factor Analysis for Arbitrary Speaker Conversion
Daisuke SAITO Nobuaki MINEMATSU Keikichi HIROSE

PAPER-Speech and Hearing

Publicized:
2020/03/13
Vol:
E103-D No:6
Page(s):
1395-1405
This paper describes a novel approach to flexible control of speaker characteristics using a tensor representation of multiple Gaussian mixture models (GMMs). In voice conversion studies, realizing conversion from/to an arbitrary speaker's voice is one of the important objectives. For this purpose, eigenvoice conversion (EVC) based on an eigenvoice GMM (EV-GMM) was proposed. In EVC, a speaker space is constructed from GMM supervectors, which are high-dimensional vectors derived by concatenating the mean vectors of each speaker's GMM. In the speaker space, each speaker is represented by a small number of weight parameters of eigen-supervectors. In this paper, we revisit the construction of the speaker space by introducing tensor factor analysis of the training data set. In our approach, each speaker is represented as a matrix whose rows and columns correspond, respectively, to the dimensions of the mean vector and the Gaussian components. The speaker space is derived by tensor factor analysis of the set of these matrices. Our approach can solve an inherent problem of the supervector representation, and it improves the performance of voice conversion. In addition, the effects of speaker adaptive training before factorization are also investigated. Experimental results of one-to-many voice conversion demonstrate the effectiveness of the proposed approach.
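The difference between the supervector and matrix representations described in the abstract above can be sketched as follows. This is a minimal illustration, assuming NumPy, hypothetical sizes (4 speakers, 3 Gaussian components, 2-dimensional mean vectors), and random placeholder means. It is not the authors' implementation and does not perform the factor analysis itself.

```python
import numpy as np

# Hypothetical sizes chosen only for illustration (not from the paper).
n_speakers, n_components, dim = 4, 3, 2

rng = np.random.default_rng(0)
# means[s, m] is the mean vector of Gaussian component m for speaker s
# (random placeholder values, not real GMM parameters).
means = rng.normal(size=(n_speakers, n_components, dim))

# Supervector view: concatenate each speaker's component means into one vector.
supervectors = means.reshape(n_speakers, n_components * dim)

# Matrix view: rows index the mean-vector dimension,
# columns index the Gaussian component.
speaker_matrices = means.transpose(0, 2, 1)

# Stacking the matrices gives a third-order tensor
# (speaker x dimension x component).
print(supervectors.shape)      # (4, 6)
print(speaker_matrices.shape)  # (4, 2, 3)
```

In the matrix view, the component index and the feature dimension stay on separate axes, so a tensor factorization can treat them separately, whereas the supervector flattens both into a single axis.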
A Trend-Shift Model for Global Factor Analysis of Investment Products
Makoto KIRIHATA Qiang MA

PAPER-Artificial Intelligence, Data Mining

Publicized:
2019/08/13
Vol:
E102-D No:11
Page(s):
2205-2213
Recently, more and more people have started investing. Understanding the factors affecting financial products is important for making investment decisions. However, it is difficult for novices to understand these factors because various factors affect each other. Various techniques have been studied, but conventional factor analysis methods focus on revealing the impact of factors locally over a certain period, and it is not easy to predict net asset values. As a reasonable solution for the prediction of net asset values, in this paper, we propose a trend shift model for the global analysis of factors by introducing trend change points as shift interference variables into state space models. In addition, to realize the trend shift model efficiently, we propose an effective trend detection method, TP-TBSM (two-phase TBSM), by extending TBSM (trend-based segmentation method). Compared with TBSM, TP-TBSM can detect trends flexibly by reducing the dependence on parameters. We conduct experiments with eleven investment trust products and reveal the usefulness and effectiveness of the proposed model and method.
Parametric Models for Mutual Kernel Matrix Completion
Rachelle RIVERO Tsuyoshi KATO

PAPER-Fundamentals of Information Systems

Publicized:
2018/09/26
Vol:
E101-D No:12
Page(s):
2976-2983
Recent studies utilize multiple kernel learning to deal with the incomplete-data problem. In this study, we introduce new methods that not only complete multiple incomplete kernel matrices simultaneously, but also allow control of the flexibility of the model by parameterizing the model matrix. By imposing restrictions on the model covariance, overfitting of the data is avoided. A limitation of kernel matrix estimation done via optimization of an objective function is that the positive definiteness of the result is not guaranteed. In view of this limitation, our proposed methods employ the LogDet divergence, which ensures the positive definiteness of the resulting inferred kernel matrix. We empirically show that our proposed restricted covariance models, employed with the LogDet divergence, yield significant improvements in the generalization performance of previous completion methods.
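For reference, the LogDet divergence mentioned in the abstract above is commonly defined for n x n symmetric positive definite matrices X and Y as tr(X Y^-1) - log det(X Y^-1) - n. The sketch below computes this standard form with NumPy. It assumes this is the definition used in the paper; the function name `logdet_divergence` is hypothetical, and the code does not reproduce the authors' completion method.

```python
import numpy as np

def logdet_divergence(x, y):
    """LogDet divergence between symmetric positive definite matrices x and y.

    Hypothetical helper for illustration; uses the standard definition
    tr(x y^-1) - log det(x y^-1) - n.
    """
    n = x.shape[0]
    # Solve y @ z = x instead of forming inv(y) explicitly;
    # z has the same trace and determinant as x @ inv(y).
    z = np.linalg.solve(y, x)
    sign, logabsdet = np.linalg.slogdet(z)
    if sign <= 0:
        # A non-positive determinant means an input was not positive definite.
        raise ValueError("determinant of x @ inv(y) must be positive")
    return np.trace(z) - logabsdet - n

a = np.array([[2.0, 0.0], [0.0, 1.0]])
print(logdet_divergence(a, a))          # zero when both arguments are equal
print(logdet_divergence(a, np.eye(2)))  # 1 - ln(2) for this pair
```

The log-determinant term grows without bound as the first argument approaches a singular matrix, which is why minimizing this divergence keeps the estimate inside the positive definite cone.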
Speaker Adaptation Based on PARAFAC2 of Transformation Matrices for Continuous Speech Recognition
Yongwon JEONG Sangjun LIM Young Kuk KIM Hyung Soon KIM

LETTER-Speech and Hearing

Vol:
E96-D No:9
Page(s):
2152-2155
We present an acoustic model adaptation method where the transformation matrix for a new speaker is given by the product of bases and a weight matrix. The bases are built from the parallel factor analysis 2 (PARAFAC2) of training speakers' transformation matrices. We perform continuous speech recognition experiments using the WSJ0 corpus.
Psychological Effects of Ambient Illumination Control and Illumination Layout While Viewing Various Video Images
Takuya IWANAMI Ayano KIKUCHI Keita HIRAI Toshiya NAKAGUCHI Norimichi TSUMURA Yoichi MIYAKE

PAPER-Vision

Vol:
E94-A No:2
Page(s):
493-499
Recently, enhancing the user's visual experience has become a trend for TV displays. This trend comes from the fact that changes in ambient illumination while viewing a liquid crystal display (LCD) significantly affect human impressions. However, the psychological effects caused by the combination of displayed video images and ambient illumination have not been investigated. In the present research, we clarify the relationship between ambient illumination and psychological effects while viewing video images displayed on an LCD by using a questionnaire-based semantic differential (SD) method and a factor analysis method. Six kinds of video images were displayed under different illumination colors and layouts and rated by 15 observers. According to the analysis, under illumination control around the LCD with displayed video images, the feelings of 'activity' and 'evaluating' were rated higher than under the fluorescent ceiling condition. In particular, simultaneous illumination control around the display and the ceiling enhanced the feelings of 'activity' and 'evaluating' while maintaining 'comfort.' Moreover, the feeling of 'activity' under illumination control around the LCD and the ceiling while viewing music video images was rated clearly higher than that with natural scene video images.
Adaptive Ambient Illumination Based on Color Harmony Model
Ayano KIKUCHI Keita HIRAI Toshiya NAKAGUCHI Norimichi TSUMURA Yoichi MIYAKE

LETTER-Color

Vol:
E92-A No:12
Page(s):
3372-3375
We investigated the relationship between ambient illumination and psychological effect by applying a modified color harmony model. We verified the proposed model by analyzing the correlation between psychological value and modified color harmony score. Experimental results showed that it is possible to obtain the best color for illumination using this model.
Opinion Model Using Psychological Factors for Interactive Multimodal Services
Kazuhisa YAMAGISHI Takanori HAYASHI

PAPER

Vol:
E89-B No:2
Page(s):
281-288
We propose the concept of an opinion model for interactive multimodal services and apply it to an audiovisual communication service. First, psychological factors of an audiovisual communication service were extracted by using the semantic differential (SD) technique and factor analysis. Forty subjects participated in subjective tests and performed point-to-point conversational tasks on a PC-based video phone that exhibited various network qualities. The subjects assessed those qualities on the basis of 25 pairs of adjectives. Two psychological factors, i.e., an aesthetic feeling and a feeling of activity, were extracted from the results. Then, quality impairment factors affecting these two psychological factors were analyzed. We found that the aesthetic feeling was affected by IP packet loss and video coding bit rate, and the feeling of activity depended on delay time, video packet loss, video coding bit rate, and video frame rate. Using this result, we formulated an opinion model derived from the relationships among quality impairment factors, psychological factors, and overall quality. The validation test results indicated that the estimation error of our model was almost equivalent to the statistical reliability of the subjective score.
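The factor extraction step described in the abstract above (semantic differential ratings reduced to a few psychological factors) can be sketched roughly as follows. This is an illustration under stated assumptions: the ratings are synthetic placeholders rather than the study's data, only the sizes (40 subjects, 25 adjective pairs, 2 factors) come from the abstract, and principal-component extraction on the correlation matrix stands in for whichever factor analysis procedure the authors actually used.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_scales, n_factors = 40, 25, 2  # sizes taken from the abstract

# Synthetic placeholder ratings: two latent factors plus noise
# (NOT the study's data).
latent = rng.normal(size=(n_subjects, n_factors))
true_loadings = rng.normal(size=(n_factors, n_scales))
noise = 0.5 * rng.normal(size=(n_subjects, n_scales))
ratings = latent @ true_loadings + noise

# Correlation matrix between the adjective-pair scales.
corr = np.corrcoef(ratings, rowvar=False)

# Principal-component extraction: keep the eigenvectors with the
# largest eigenvalues and scale them into factor loadings.
eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1][:n_factors]
loadings = eigvecs[:, order] * np.sqrt(eigvals[order])

# Communality: share of each scale's variance explained by the kept factors.
communalities = (loadings ** 2).sum(axis=1)
print(loadings.shape)  # (25, 2)
```

Each row of `loadings` indicates how strongly one adjective pair relates to each retained factor; interpreting and naming the factors (for example, 'aesthetic feeling') is done by inspecting which scales load heavily on them.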
An Acoustically Oriented Vocal-Tract Model
Hani C. YEHIA Kazuya TAKEDA Fumitada ITAKURA

PAPER-Speech Processing and Acoustics

Vol:
E79-D No:8
Page(s):
1198-1208
The objective of this paper is to find a parametric representation for the vocal-tract log-area function that is directly and simply related to basic acoustic characteristics of the human vocal-tract. The importance of this representation is associated with the solution of the articulatory-to-acoustic inverse problem, where a simple mapping from the articulatory space onto the acoustic space can be very useful. The method is as follows: Firstly, given a corpus of log-area functions, a parametric model is derived following a factor analysis technique. After that, the articulatory space, defined by the parametric model, is filled with approximately uniformly distributed points, and the corresponding first three formant frequencies are calculated. These formants define an acoustic space onto which the articulatory space maps. In the next step, an independent component analysis technique is used to determine acoustic and articulatory coordinate systems whose components are as independent as possible. Finally, using singular value decomposition, acoustic and articulatory coordinate systems are rotated so that each of the first three components of the articulatory space has major influence on one, and only one, component of the acoustic space. An example showing how the proposed model can be applied to the solution of the articulatory-to-acoustic inverse problem is given at the end of the paper.