Many single-model methods have been applied to real-time short-term traffic flow prediction. However, since traffic flow data is a mixture of various components, the performance of a single model is limited. Therefore, we propose Multi-Long-Short Term Memory Models, which improve traffic flow prediction accuracy compared with state-of-the-art models.
Yasutaka KAMEI Takahiro MATSUMOTO Kazuhiro YAMASHITA Naoyasu UBAYASHI Takashi IWASAKI Shuichi TAKAYAMA
Nowadays, open source software (OSS) systems are adopted by proprietary software projects. To reduce the risk of using problematic OSS systems (e.g., causing system crashes), it is important for proprietary software projects to assess OSS systems in advance. Therefore, OSS quality assessment models are studied to obtain information regarding the quality of OSS systems. Although the OSS quality assessment models are partially validated using a small number of case studies, to the best of our knowledge, there are few studies that empirically report how industrial projects actually use OSS quality assessment models in their own development process. In this study, we empirically evaluate the cost and effectiveness of OSS quality assessment models at Fujitsu Kyushu Network Technologies Limited (Fujitsu QNET). To conduct the empirical study, we collect datasets from (a) 120 OSS projects that Fujitsu QNET's projects actually used and (b) 10 problematic OSS projects that caused major problems in the projects. We find that (1) it takes average and median times of 51 and 49 minutes, respectively, to gather all assessment metrics per OSS project and (2) there is a possibility that we can filter problematic OSS systems by using the threshold derived from a pool of assessment metrics. Fujitsu QNET's developers agree that our results lead to improvements in Fujitsu QNET's OSS assessment process. We believe that our work significantly contributes to the empirical knowledge about applying OSS assessment techniques to industrial projects.
Ryo MASUMURA Taichi ASAMI Takanobu OBA Hirokazu MASATAKI Sumitaka SAKAUCHI Akinori ITO
This paper proposes a novel domain adaptation method that can utilize out-of-domain text resources and partially domain-matched text resources in language modeling. A major problem in domain adaptation is that it is hard to obtain adequate adaptation effects from out-of-domain text resources. To tackle the problem, our idea is to carry out model merger in a latent variable space created from latent words language models (LWLMs). The latent variables in the LWLMs are represented as specific words selected from the observed word space, so LWLMs can share a common latent variable space. This enables us to perform flexible mixture modeling with consideration of the latent variable space. This paper presents two types of mixture modeling, i.e., LWLM mixture models and LWLM cross-mixture models. The LWLM mixture models can perform latent word space mixture modeling to mitigate the domain mismatch problem. Furthermore, in the LWLM cross-mixture models, LMs that are individually constructed from partially matched text resources are split into two element models, each of which can be subjected to mixture modeling. For both approaches, this paper also describes methods to optimize mixture weights using a validation data set. Experiments show that the mixture in latent word space can achieve performance improvements for both target domain and out-of-domain compared with that in observed word space.
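The idea of optimizing mixture weights on a validation set can be illustrated with the standard EM procedure for linearly interpolated language models. This is only a minimal sketch in the observed word space, not the LWLM latent-space mixture described above; the function name `em_mixture_weights` and the input layout are assumptions made for illustration, and every validation token is assumed to receive a nonzero probability from at least one model.

```python
def em_mixture_weights(component_probs, n_iters=50):
    """Estimate linear interpolation weights by EM.

    component_probs[t][k] is the probability that model k assigns to
    validation token t (hypothetical input layout).
    """
    n_tokens = len(component_probs)
    n_models = len(component_probs[0])
    weights = [1.0 / n_models] * n_models  # start from a uniform mixture

    for _ in range(n_iters):
        # E-step: accumulate each model's posterior responsibility per token.
        resp_totals = [0.0] * n_models
        for probs in component_probs:
            mix = sum(w * p for w, p in zip(weights, probs))
            for k in range(n_models):
                resp_totals[k] += weights[k] * probs[k] / mix
        # M-step: new weights are the average responsibilities.
        weights = [r / n_tokens for r in resp_totals]
    return weights


if __name__ == "__main__":
    # Toy validation set: model 0 fits every token better than model 1.
    toy = [[0.5, 0.1]] * 10
    print(em_mixture_weights(toy))
```

Each iteration cannot decrease the validation likelihood of the interpolated model, which is why this procedure is a common baseline for weight tuning.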
Topic modeling is a well-known method that is widely applied not only to text data mining but also to multimedia data analysis such as video data analysis. However, existing models cannot adequately handle time dependency and multimodal data modeling for video data that generally contain image information and speech information. In this paper, we therefore propose a novel topic model, sequential symmetric correspondence hierarchical Dirichlet processes (Seq-Sym-cHDP), extended from sequential conditionally independent hierarchical Dirichlet processes (Seq-CI-HDP) and sequential correspondence hierarchical Dirichlet processes (Seq-cHDP), to improve the multimodal data modeling mechanism by controlling the pivot assignments with a latent variable. An inference scheme for Seq-Sym-cHDP based on a posterior representation sampler is also developed in this work. Finally, we demonstrate through experiments that our model outperforms other baseline models.
Yuan GAO Chengdong WU Xiaosheng YU Wei ZHOU Jiahui WU
Efficient optic disc (OD) segmentation plays a significant role in retinal image analysis and retinal disease screening. In this paper, we present a fully automatic segmentation approach called double boundary extraction for OD segmentation. The proposed approach consists of the following two stages. First, we utilize an unsupervised learning technique and a statistical method based on OD boundary information to obtain the initial contour adaptively. Second, the final optic disc boundary is extracted using the proposed LSO model. The performance of the proposed method is tested on the public DIARETDB1 database, and the experimental results demonstrate the effectiveness and advantage of the proposed method.
This paper proposes an iterative scheme between human action classification and pose estimation in still images. Initial action classification is achieved only by global image features that consist of the responses of various object filters. The classification likelihood of each action weights human poses estimated by the pose models of multiple sub-action classes. Such fine-grained action-specific pose models allow us to robustly identify the pose of a target person under the assumption that similar poses are observed in each action. From the estimated pose, pose features are extracted and used with global image features for action re-classification. This iterative scheme can mutually improve action classification and pose estimation. Experimental results with a public dataset demonstrate the effectiveness of the proposed method both for action classification and pose estimation.
Thomas WILHELEM Hiroyuki OKUDA Tatsuya SUZUKI
This paper presents a novel identification method for hybrid dynamical system models, where parameters have stochastic and time-varying characteristics. The proposed parameter identification scheme is based on a modified implementation of particle filtering, together with a time-smoothing technique. Parameters of the identified model are considered as time-varying random variables. Parameters are identified independently at each time step, using Bayesian inference implemented as an iterative particle filtering method. The time dynamics of the parameters are smoothed using a distribution-based moving average technique. Modes of the hybrid system model are handled independently, allowing any type of nonlinear piecewise model to be identified. The proposed identification scheme has a low computational burden, and it can be implemented for online use. The effectiveness of the scheme is verified by numerical experiments, and an application of the method is proposed: analysis of driving behavior through identified time-varying parameters.
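To make the particle-filtering idea concrete, the following is a minimal sketch of a generic bootstrap particle filter that tracks one scalar parameter. It is not the modified scheme of the paper: the measurement model `y = theta * x + noise`, the uniform prior on [0, 5], the Gaussian jitter, and all names and default values are assumptions chosen purely for illustration.

```python
import math
import random


def particle_filter_estimate(observations, inputs, n_particles=500,
                             noise_std=0.1, jitter_std=0.02, seed=0):
    """Track a scalar parameter theta in the assumed model y = theta * x + noise."""
    rng = random.Random(seed)
    # Assumed prior: theta uniformly distributed on [0, 5].
    particles = [rng.uniform(0.0, 5.0) for _ in range(n_particles)]
    estimates = []
    for y, x in zip(observations, inputs):
        # Diffuse particles slightly so the parameter may drift over time.
        particles = [p + rng.gauss(0.0, jitter_std) for p in particles]
        # Weight each particle by the Gaussian likelihood of the observation.
        weights = [math.exp(-0.5 * ((y - p * x) / noise_std) ** 2)
                   for p in particles]
        total = sum(weights)
        if total == 0.0:
            # All likelihoods underflowed: fall back to uniform weights.
            weights = [1.0] * n_particles
            total = float(n_particles)
        # Posterior-mean estimate of theta at this time step.
        estimates.append(sum(w * p for w, p in zip(weights, particles)) / total)
        # Resample in proportion to the weights (bootstrap step).
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


if __name__ == "__main__":
    xs = [1.0 + 0.1 * i for i in range(20)]
    ys = [2.0 * x for x in xs]  # noiseless data generated with theta = 2
    print(particle_filter_estimate(ys, xs)[-1])
```

A moving average over the returned `estimates` would be the simplest stand-in for the time-smoothing step; the paper's distribution-based smoothing is not reproduced here.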
This paper proposes a method for human pose estimation in still images. The proposed method achieves occlusion-aware appearance modeling. Appearance modeling with less accurate appearance data is problematic because it adversely affects the entire training process. The proposed method evaluates the effectiveness of mitigating the influence of occluded body parts in training sample images. In order to improve occlusion evaluation by a discriminatively-trained model, occlusion images are synthesized and employed with non-occlusion images for discriminative modeling. The score of this discriminative model is used for weighting each sample in the training process. Experimental results demonstrate that our approach improves the performance of human pose estimation in contrast to base models.
Somchai PHATTHANACHUANCHOM Rawesak TANAWONGSUWAN
Color transfer is a simple process that changes the color tone of one image (source) to look like that of another image (target). In transferring colors between images, several issues need to be considered, including partial color transfer, trial-and-error, and multiple target color transfer. Our approach enables users to transfer colors partially and locally by letting them select their regions of interest from image segmentation. Since there are many ways to transfer colors from a set of target regions to a set of source regions, we introduce a region exploration and navigation approach in which users can choose their preferred color tones to transfer one region at a time and gradually customize towards their desired results. The preferred color tones can sometimes come from more than one image; therefore, our method is extended to allow users to select their preferred color tones from multiple images. Our experimental results show the flexibility of our approach in generating reasonable segmented regions of interest and in enabling users to explore the possible results more conveniently.
Youwei LU Shogo OKADA Katsumi NITTA
We propose a novel method, built upon the hierarchical Dirichlet process hidden semi-Markov model, to reveal the content structures of unstructured domain-specific texts. The content structures of texts consisting of sequential local contexts are useful for tasks such as text retrieval, classification, and text mining. The prominent feature of our model is the use of recursive uniform partitioning, a stochastic process that takes a view different from existing HSMMs in modeling state duration. We show that recursive uniform partitioning plays an important role in avoiding rapid switching between hidden states. Remarkably, our method greatly outperforms others in terms of ranking performance in our text retrieval experiments, and provides more accurate features for SVM to achieve higher F1 scores in our text classification experiments. These experimental results suggest that our method can yield improved representations of domain-specific texts. Furthermore, we present a method for automatically discovering the local contexts that account for why a text is classified as a positive instance in supervised learning settings.
There are increasing demands for improved analysis of multimodal data that consist of multiple representations, such as multilingual documents and text-annotated images. One promising approach for analyzing such multimodal data is latent topic models. In this paper, we propose conditionally independent generalized relational topic models (CI-gRTM) for predicting unknown relations across different multiple representations of multimodal data. We developed CI-gRTM as a multimodal extension of discriminative relational topic models called generalized relational topic models (gRTM). We demonstrated through experiments with multilingual documents that CI-gRTM can more effectively predict both multilingual representations and relations between two different language representations compared with several state-of-the-art baseline models that can predict only either multilingual representations or unimodal relations.
Qiao YU Shujuan JIANG Yanmei ZHANG
Class imbalance has drawn much attention from researchers in software defect prediction. In practice, the performance of defect prediction models may be affected by the class imbalance problem. In this paper, we present an approach to evaluating the performance stability of defect prediction models on imbalanced datasets. First, random sampling is applied to convert the original imbalanced dataset into a set of new datasets with different levels of imbalance ratio. Second, typical prediction models are selected to make predictions on these newly constructed datasets, and the Coefficient of Variation (C·V) is used to evaluate the performance stability of different models. Finally, an empirical study is designed to evaluate the performance stability of six prediction models, which are widely used in software defect prediction. The results show that the performance of C4.5 is unstable on imbalanced datasets, and the performance of Naive Bayes and Random Forest is more stable than that of the other models.
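The stability measure mentioned above, the coefficient of variation, is the standard deviation of a set of scores divided by their mean; a lower value indicates more stable performance across imbalance ratios. The sketch below is illustrative only: the abstract does not say whether a population or sample standard deviation is used (the population form is assumed here), and the scores in the usage example are made-up placeholders, not results from the paper.

```python
import statistics


def coefficient_of_variation(scores):
    """Return std / mean for one model's scores across resampled datasets.

    Uses the population standard deviation; this choice is an assumption.
    """
    mean = statistics.mean(scores)
    return statistics.pstdev(scores) / mean


if __name__ == "__main__":
    # Hypothetical scores of two models at five imbalance ratios.
    model_a = [0.80, 0.79, 0.81, 0.80, 0.78]
    model_b = [0.85, 0.70, 0.60, 0.75, 0.55]
    print("model A C.V:", coefficient_of_variation(model_a))
    print("model B C.V:", coefficient_of_variation(model_b))
```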
Video data mining based on topic models has recently emerged as a very popular research topic. In this paper, we present a novel topic model named sequential correspondence hierarchical Dirichlet processes (Seq-cHDP) to learn the hidden structure within video data. The Seq-cHDP model can be regarded as an extended hierarchical Dirichlet processes (HDP) model with two important features: one is the time-dependency mechanism that connects neighboring video frames on the basis of a time-dependent Markovian assumption, and the other is the correspondence mechanism that provides a solution for dealing with multimodal data such as the mixture of visual words and speech words extracted from video files. A cascaded Gibbs sampling method is applied to implement the inference task of Seq-cHDP. We present a comprehensive experimental evaluation of Seq-cHDP and demonstrate that Seq-cHDP outperforms other baseline models.
Kei SAWADA Akira TAMAMORI Kei HASHIMOTO Yoshihiko NANKAKU Keiichi TOKUDA
This paper proposes a Bayesian approach to image recognition based on separable lattice hidden Markov models (SL-HMMs). The geometric variations of the object to be recognized, e.g., size, location, and rotation, are an essential problem in image recognition. SL-HMMs, which have been proposed to reduce the effect of geometric variations, can perform elastic matching both horizontally and vertically. This makes it possible to model not only invariances to the size and location of the object but also nonlinear warping in both dimensions. The maximum likelihood (ML) method has been used in training SL-HMMs. However, in some image recognition tasks, it is difficult to acquire sufficient training data, and the ML method suffers from the over-fitting problem when there is insufficient training data. This study aims to accurately estimate SL-HMMs using the maximum a posteriori (MAP) and variational Bayesian (VB) methods. The MAP and VB methods can utilize prior distributions representing useful prior information, and the VB method is expected to obtain high generalization ability by marginalization of model parameters. Furthermore, to overcome the local maximum problem in the MAP and VB methods, the deterministic annealing expectation maximization algorithm is applied for training SL-HMMs. Face recognition experiments performed on the XM2VTS database indicated that the proposed method offers significantly improved image recognition performance. Additionally, comparative experiment results showed that the proposed method was more robust to geometric variations than convolutional neural networks.
Spatial stochastic models have been widely used for performance analysis of wireless communication networks. This is because the performance of wireless networks depends on the spatial configuration of wireless nodes, and the irregularity of node locations in a real wireless network can be captured by a spatial point process. Most works on such spatial stochastic models of wireless networks have adopted homogeneous Poisson point processes as the models of wireless node locations. While this adoption makes the models analytically tractable, it assumes that the wireless nodes are located independently of each other, and their spatial correlation is ignored. Recently, the authors have proposed to adopt the Ginibre point process, one of the determinantal point processes, as the deployment model of base stations (BSs) in cellular networks. The determinantal point processes constitute a class of repulsive point processes and have been attracting attention due to their mathematically interesting properties and efficient simulation methods. In this tutorial, we provide a brief guide to the Ginibre point process and its variant, the α-Ginibre point process, as models of BS deployments in cellular networks and show some existing results on the performance analysis of cellular network models with α-Ginibre deployed BSs. The authors hope that readers will use such point processes as a tool for analyzing various problems arising in future cellular networks.
Denise H. GOYA Dionathan NAKAMURA Routo TERADA
Two new authenticated key agreement protocols in the certificateless setting are presented in this paper. Both are proved secure in the extended Canetti-Krawczyk model, under the BDH assumption. The first one is more efficient than Lippold et al.'s (LBG) protocol, and is proved secure in the same security model. The second protocol is proved secure under Swanson et al.'s security model, a weaker model. As far as we know, our second proposed protocol is the first one proved secure in Swanson et al.'s security model. If no pre-computations are done, the first protocol is about 26% faster than LBG, and the second protocol is about 49% faster than LBG and about 31% faster than the first one. If pre-computations of some operations are done, our two protocols remain faster.
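The three speedup figures quoted above can be checked for consistency with a few lines of arithmetic. The sketch assumes that "x% faster" means the running time is reduced by x% relative to the reference protocol; the abstract does not define the term, and the function name is illustrative. Under that reading, the 26% and 49% figures imply the stated 31%. Reading the figures as throughput gains instead would give about 18%, which does not match.

```python
def relative_speedup(a_vs_base, b_vs_base):
    """Fraction by which B is faster than A.

    a_vs_base and b_vs_base are the fractional running-time reductions of
    A and B relative to a common baseline (assumed interpretation).
    """
    time_a = 1.0 - a_vs_base  # A's running time, with the baseline as 1.0
    time_b = 1.0 - b_vs_base  # B's running time, with the baseline as 1.0
    return 1.0 - time_b / time_a


if __name__ == "__main__":
    # First protocol: 26% faster than LBG; second protocol: 49% faster.
    print(round(relative_speedup(0.26, 0.49), 2))
```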
Ryo MASUMURA Taichi ASAMI Takanobu OBA Hirokazu MASATAKI Sumitaka SAKAUCHI Satoshi TAKAHASHI
This paper aims to improve the domain robustness of language modeling for automatic speech recognition (ASR). To this end, we focus on applying the latent words language model (LWLM) to ASR. LWLMs are generative models whose structure is based on Bayesian soft class-based modeling with vast latent variable space. Their flexible attributes help us to efficiently realize the effects of smoothing and dimensionality reduction and so address the data sparseness problem; LWLMs constructed from limited domain data are expected to robustly cover unknown multiple domains in ASR. However, the attribute flexibility seriously increases computation complexity. If we rigorously compute the generative probability for an observed word sequence, we must consider the huge quantities of all possible latent word assignments. Since this is computationally impractical, some approximation is inevitable for ASR implementation. To solve the problem and apply this approach to ASR, this paper presents an n-gram approximation of LWLM. The n-gram approximation is a method that approximates LWLM as a simple back-off n-gram structure, and offers LWLM-based robust one-pass ASR decoding. Our experiments verify the effectiveness of our approach by evaluating perplexity and ASR performance in not only in-domain data sets but also out-of-domain data sets.
Ryo MASUMURA Taichi ASAMI Takanobu OBA Hirokazu MASATAKI Sumitaka SAKAUCHI Akinori ITO
This paper aims to investigate the performance improvements made possible by combining various major language model (LM) technologies and to reveal the interactions between LM technologies in spontaneous automatic speech recognition tasks. While it is clear that recent practical LMs have several problems, isolated use of major LM technologies does not appear to offer sufficient performance. In consideration of this fact, combining various LM technologies has also been examined. However, previous works focused only on modeling technologies with limited text resources, and did not consider other important technologies in practical language modeling, i.e., use of external text resources and unsupervised adaptation. This paper, therefore, employs not only manual transcriptions of target speech recognition tasks but also external text resources. In addition, unsupervised LM adaptation based on multi-pass decoding is added to the combination. We divide LM technologies into three categories and employ key ones, including recurrent neural network LMs and discriminative LMs. Our experiments show the effectiveness of combining various LM technologies in not only in-domain tasks, the subject of our previous work, but also out-of-domain tasks. Furthermore, we reveal the relationships between the technologies in both tasks.
We propose part-segment (PS) features for estimating an articulated pose in still images. The PS feature evaluates the image likelihood of each body part (e.g. head, torso, and arms) robustly to background clutter and nuisance textures on the body. While general gradient features (e.g. HOG) might include many nuisance responses, the PS feature represents only the region of the body part by iterative segmentation while updating the shape prior of each part. In contrast to similar segmentation features, part segmentation is improved by part-specific shape priors that are optimized by training images with fully-automatically obtained seeds. The shape priors are modeled efficiently based on clustering for fast extraction of PS features. The PS feature is fused complementarily with gradient features using discriminative training and adaptive weighting for robust and accurate evaluation of part similarity. Comparative experiments with public datasets demonstrate improvement in pose estimation by the PS features.
Akihiro TATENO Tomoaki NAGAOKA Kazuyuki SAITO Soichi WATANABE Masaharu TAKAHASHI Koichi ITO
With the development and diverse use of wireless radio terminals, it is necessary to estimate the specific absorption rate (SAR) of the human body from such devices under various exposure situations. In particular, tablet computers may be used for a long time while placed near the abdomen. There has been insufficient evaluation of the SAR for the human body from tablet computers. Therefore, we investigated the SAR of various configurations of a commercial tablet computer using numerical models with the anatomical structures of Japanese males and females, respectively. We find that the 10-g-averaged SAR of the tablet computer is strongly altered by the tablet's orientation, i.e., from -7.3 dB to -22.6 dB. When the tablet computer is moved parallel to the height direction, the relative standard deviations of the 10-g-averaged SAR for the male and female models are within 40%. In addition, those for the different tilts of the computer are within 20%. The fluctuations of the 10-g-averaged SAR for the seated human models are within ±1.5 dB in all cases.
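The orientation range quoted above can also be read as a linear ratio. The sketch below assumes the values are power-type decibels, i.e., 10·log10 of a ratio, which is the usual convention for SAR but is not stated in the abstract; the function name is illustrative.

```python
def db_to_power_ratio(db_value):
    """Convert a power-type decibel value (10 * log10) to a linear ratio."""
    return 10.0 ** (db_value / 10.0)


if __name__ == "__main__":
    # Spread between the two orientation extremes: -7.3 dB versus -22.6 dB.
    spread_db = -7.3 - (-22.6)
    print(spread_db, db_to_power_ratio(spread_db))
```

Under this assumption, the 15.3 dB spread corresponds to roughly a 34-fold difference in 10-g-averaged SAR between the two extreme orientations.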