1-7hit |
KuanChao CHU Satoshi YAMAZAKI Hideki NAKAYAMA
This work focuses on training dataset enhancement of informative relational triplets for Scene Graph Generation (SGG). Due to the lack of effective supervision, the current SGG model predictions perform poorly for informative relational triplets with inadequate training samples. Therefore, we propose two novel training dataset enhancement modules: Feature Space Triplet Augmentation (FSTA) and Soft Transfer. FSTA leverages a feature generator trained to generate representations of an object in relational triplets. The biased prediction based sampling in FSTA efficiently augments artificial triplets focusing on the challenging ones. In addition, we introduce Soft Transfer, which assigns soft predicate labels to general relational triplets to make more supervisions for informative predicate classes effectively. Experimental results show that integrating FSTA and Soft Transfer achieve high levels of both Recall and mean Recall in Visual Genome dataset. The mean of Recall and mean Recall is the highest among all the existing model-agnostic methods.
Yu TSAO Ting-Yao HU Sakriani SAKTI Satoshi NAKAMURA Lin-shan LEE
This study proposes a variable selection linear regression (VSLR) adaptation framework to improve the accuracy of automatic speech recognition (ASR) with only limited and unlabeled adaptation data. The proposed framework can be divided into three phases. The first phase prepares multiple variable subsets by applying a ranking filter to the original regression variable set. The second phase determines the best variable subset based on a pre-determined performance evaluation criterion and computes a linear regression (LR) mapping function based on the determined subset. The third phase performs adaptation in either model or feature spaces. The three phases can select the optimal components and remove redundancies in the LR mapping function effectively and thus enable VSLR to provide satisfactory adaptation performance even with a very limited number of adaptation statistics. We formulate model space VSLR and feature space VSLR by integrating the VS techniques into the conventional LR adaptation systems. Experimental results on the Aurora-4 task show that model space VSLR and feature space VSLR, respectively, outperform standard maximum likelihood linear regression (MLLR) and feature space MLLR (fMLLR) and their extensions, with notable word error rate (WER) reductions in a per-utterance unsupervised adaptation manner.
Rameswar DEBNATH Haruhisa TAKAHASHI
The choice of kernel is an important issue in the support vector machine algorithm, and the performance of it largely depends on the kernel. Up to now, no general rule is available as to which kernel should be used. In this paper we investigate two kernels: Gaussian RBF kernel and polynomial kernel. So far Gaussian RBF kernel is the best choice for practical applications. This paper shows that the polynomial kernel in the normalized feature space behaves better or as good as Gaussian RBF kernel. The polynomial kernel in the normalized feature space is the best alternative to Gaussian RBF kernel.
Amaro LIMA Heiga ZEN Yoshihiko NANKAKU Chiyomi MIYAJIMA Keiichi TOKUDA Tadashi KITAMURA
This paper describes an approach to feature extraction in speech recognition systems using kernel principal component analysis (KPCA). This approach represents speech features as the projection of the mel-cepstral coefficients mapped into a feature space via a non-linear mapping onto the principal components. The non-linear mapping is implicitly performed using the kernel-trick, which is a useful way of not mapping the input space into a feature space explicitly, making this mapping computationally feasible. It is shown that the application of dynamic (Δ) and acceleration (ΔΔ) coefficients, before and/or after the KPCA feature extraction procedure, is essential in order to obtain higher classification performance. Better results were obtained by using this approach when compared to the standard technique.
Jongcheol KIM Taewon KIM Yasuo SUGA
This paper proposes a new approach to fuzzy inference system for modeling nonlinear systems based on measured input and output data. In the suggested fuzzy inference system, the number of fuzzy rules and parameter values of membership functions are automatically decided by using the extended kernel method. The extended kernel method individually performs linear transformation and kernel mapping. Linear transformation projects input space into linearly transformed input space. Kernel mapping projects linearly transformed input space into high dimensional feature space. Especially, the process of linear transformation is needed in order to solve difficulty determining the type of kernel function which presents the nonlinear mapping in according to nonlinear system. The structure of the proposed fuzzy inference system is equal to a Takagi-Sugeno fuzzy model whose input variables are weighted linear combinations of input variables. In addition, the number of fuzzy rules can be reduced under the condition of optimizing a given criterion by adjusting linear transformation matrix and parameter values of kernel functions using the gradient descent method. Once a structure is selected, coefficients in consequent part are determined by the least square method. Simulated results of the proposed technique are illustrated by examples involving benchmark nonlinear systems.
Toshiaki TAKEDA Hiroki MIZOE Koichiro KISHI Takahide MATSUOKA
To investigate necessary conditions for the object recognition by simulations using neural network models is one of ways to acquire suggestions for understanding the neuronal representation of objects in the brain. In the present study, we trained a three layered neural network to form a geometrical feature representation in its output layer using back-propagation algorithm. After training using 73 learning examples, 65 testing patterns made by various combinations of above features could be recognized with the network at a rate of 95.3% appropriate response. We could classify four types of hidden layer units on the basis of effects on the output layer.
Klaus OBERMAYER Helge RITTER Klaus J. SCHULTEN
Topographic maps begin to be recognized as one of the major computational structures underlying neural computation in the brain. They provide dimension-reducing projections between feature spaces that seem to be established and maintained under the participation of selforganizing, adaptive processes. In this contribution, we investigate how well the structure of such maps can be replicated by simple adaptive processes of the kind proposed by Kohonen. We will particularly address the important issue, how the dimensionality of the input space affects the spatial organization of the resulting map.