Yasser MOHAMMAD Kazunori MATSUMOTO Keiichiro HOASHI
Activity recognition from sensors is a classification problem over time-series data. Some research in the area utilizes handcrafted time- and frequency-domain features that differ between datasets. A categorically different approach is to use deep learning methods for feature learning. This paper explores a middle ground in which an off-the-shelf feature extractor is used to generate a large number of candidate time-domain features, followed by a feature selector that was designed to reduce the bias toward specific classification techniques. Moreover, this paper advocates the use of features that are mostly insensitive to sensor orientation and shows their applicability to the activity recognition problem. The proposed approach is evaluated using six different publicly available datasets collected under various conditions using different experimental protocols, and it shows comparable or higher accuracy than state-of-the-art methods on most datasets while usually using an order of magnitude fewer features.
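As a rough illustration of what an orientation-insensitive feature can look like, the sketch below computes statistics of the acceleration magnitude, which do not change when the sensor is rotated. The choice of magnitude-based features, the sample values, and all function names are assumptions made for illustration; the abstract does not state which features the paper actually uses.

```python
import math

def magnitude_series(samples):
    """Euclidean norm of each (x, y, z) sample; a rotation of the sensor leaves it unchanged."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]

def rotate_z(samples, angle):
    """Rotate every sample about the z-axis to simulate a differently oriented sensor."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in samples]

def mean(values):
    return sum(values) / len(values)

def std(values):
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

# Hypothetical accelerometer window (m/s^2); not taken from any dataset in the paper.
samples = [(0.1, 9.7, 0.3), (0.4, 9.9, -0.2), (1.2, 8.8, 0.5), (-0.3, 10.4, 0.1)]
rotated = rotate_z(samples, math.pi / 3)

# Magnitude-based features agree (up to rounding) across orientations.
features = (mean(magnitude_series(samples)), std(magnitude_series(samples)))
features_rotated = (mean(magnitude_series(rotated)), std(magnitude_series(rotated)))

# A per-axis feature, by contrast, shifts when the sensor is rotated.
x_mean = mean([s[0] for s in samples])
x_mean_rotated = mean([r[0] for r in rotated])
```

Comparing `features` with `features_rotated` against `x_mean` with `x_mean_rotated` shows why per-axis statistics can fail to transfer between datasets recorded with different sensor placements.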
Kazunori MATSUMOTO Kazuo HASHIMOTO
Call tracking data contains a calling address, a called address, a service type, and other attributes that are useful for predicting a customer's calling activity. Call tracking data is becoming a target of data mining for telecommunication carriers. Conventional data-mining programs control the number of association rules found with two types of thresholds (minimum confidence and minimum support); however, they often generate too many association rules because of the wide variety of patterns found in call tracking data. This paper proposes a new method to reduce the number of generated rules. The proposed method tests each generated rule based on the Akaike Information Criterion (AIC) without using conventional thresholds. Experiments with artificial call tracking data show the high performance of the proposed method.
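One way to picture an AIC-based rule test is to compare two models of a candidate rule A → B on its 2×2 contingency table: a model in which A and B are independent (two free parameters) and a model in which they are dependent (three free parameters). The rule is kept only if the dependent model has the lower AIC. This is a minimal sketch under that assumption; the abstract does not specify the exact models the paper compares, and the function names are hypothetical.

```python
import math

def _xlogy(count, prob):
    """count * log(prob), with the convention that an empty cell contributes 0."""
    return count * math.log(prob) if count > 0 else 0.0

def aic_independent(n11, n10, n01, n00):
    """AIC of the model where A and B are independent (parameters: P(A), P(B))."""
    n = n11 + n10 + n01 + n00
    pa = (n11 + n10) / n
    pb = (n11 + n01) / n
    loglik = (_xlogy(n11, pa * pb) + _xlogy(n10, pa * (1 - pb))
              + _xlogy(n01, (1 - pa) * pb) + _xlogy(n00, (1 - pa) * (1 - pb)))
    return -2.0 * loglik + 2 * 2

def aic_dependent(n11, n10, n01, n00):
    """AIC of the saturated model (three free cell probabilities)."""
    n = n11 + n10 + n01 + n00
    loglik = sum(_xlogy(c, c / n) for c in (n11, n10, n01, n00))
    return -2.0 * loglik + 2 * 3

def rule_is_supported(n11, n10, n01, n00):
    """Keep the rule A -> B only if modeling the dependence lowers the AIC.

    n11: records with A and B, n10: A without B,
    n01: B without A,          n00: neither (total must be non-zero).
    """
    return aic_dependent(n11, n10, n01, n00) < aic_independent(n11, n10, n01, n00)

strong = rule_is_supported(40, 10, 10, 40)    # A and B co-occur far more than chance
spurious = rule_is_supported(25, 25, 25, 25)  # A carries no information about B
```

Because the penalty term grows with the number of parameters, a rule that merely restates the marginal frequencies is rejected without any minimum-support or minimum-confidence setting.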
Chaima DHAHRI Kazunori MATSUMOTO Keiichiro HOASHI
Upcoming mood prediction plays an important role in areas such as bipolar depression disorder in psychology, and quality of life and recommendations in health-related quality-of-life research. The mood in this study is defined as the general emotional state of a user. In contrast to emotions, which are more specific and vary within a day, the mood is described as having either a positive or negative valence[1]. We propose an autonomous system that predicts the upcoming user mood based on the user's online activities over cyber, social, and physical spaces without using extra devices or sensors. Recently, many researchers have relied on online social networks (OSNs) to detect user mood. However, most existing works have focused on inferring the current mood, and only a few have focused on predicting the upcoming mood. For this reason, we define a new goal of predicting the upcoming mood. First, we collected ground-truth data over two months from 383 subjects. Then, we studied the correlation between the extracted features and the user's mood. Finally, we used these features to train two predictive systems: generalized and personalized. The results suggest a statistically significant correlation between tomorrow's mood and today's activities on OSNs, which can be used to develop a decent predictive system with an average accuracy of 70% and a recall of 75% for the correlated users. This performance increased to an average accuracy of 79% and a recall of 80% for active users who have more than 30 days of history data. Moreover, we showed that, for non-active users, referring to a generalized system can compensate for the lack of data at the early stage of the system, whereas once enough data is available for a user, a personalized system is used to predict that user's upcoming mood individually.
Dung Duc NGUYEN Maike ERDMANN Tomoya TAKEYOSHI Gen HATTORI Kazunori MATSUMOTO Chihiro ONO
The abundance of information published on the Internet makes filtering of hazardous Web pages a difficult yet important task. Supervised learning methods such as Support Vector Machines (SVMs) can be used to identify hazardous Web content. However, scalability is a big challenge, especially if we have to train multiple classifiers, since different policies exist on what kind of information is hazardous. We therefore propose two different strategies to train multiple SVMs for personalized Web content filters. The first strategy identifies common data clusters and then performs optimization on these clusters in order to obtain good initial solutions for the individual problems. This initialization shortens the path to the optimal solutions and reduces the training time on the individual training sets. The second strategy is to train all SVMs simultaneously. We introduce an SMO-based kernel-biased heuristic that balances the reduction rate of the individual objective functions against the computational cost of the kernel matrix. The heuristic relies primarily on the optimality conditions of all optimization problems and secondarily on the pre-calculated part of the whole kernel matrix. This strategy increases the amount of information shared among learning tasks, thus reducing the number of kernel calculations and the training time. In our experiments on inconsistently labeled training examples, both strategies were able to predict hazardous Web pages accurately (> 91%) with training times of only 26% and 18%, respectively, of that of normal sequential training.
Kazuo HASHIMOTO Kazunori MATSUMOTO Norio SHIRATORI
This paper introduces probabilistic modeling of alarm observation delay and presents a novel method of model-based diagnosis for time-series observation. First, a fault model is defined by associating an event tree rooted at each fault hypothesis with probabilistic variables representing temporal delay. The most probable hypothesis is then obtained by selecting the one whose Akaike information criterion (AIC) is minimal. Simulation shows that the AIC-based hypothesis selection achieves high precision in diagnosis.
Gen HATTORI Chihiro ONO Kazunori MATSUMOTO Fumiaki SUGAYA
Mobile agent technology is applied to, among other things, the remote management of large-scale networks and real-world-oriented entertainment systems. In order to communicate, the agents exchange messages with one another and migrate repeatedly among terminals. Although these systems accomplish their tasks efficiently by using a large number of mobile agents, they have a serious problem in that the number of messages between agents increases in proportion to the square of the number of agents. These systems have to reduce communication costs, such as the number of hosts relaying messages; however, the conventional message-delivering schemes alone cannot keep the communication costs to a minimum under all conditions. To minimize the communication costs, we propose a hybrid message-delivering scheme that dynamically selects the optimal message-delivering scheme. First, we evaluate the communication costs of the conventional schemes and design the hybrid message-delivering scheme. Then we perform simulation evaluations to derive the threshold value for switching schemes so as to minimize the communication costs.
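The quadratic growth in messages and the idea of switching between delivery schemes can be sketched as follows. The two cost models here (a home-registry scheme and a forwarding-chain scheme) and their hop counts are hypothetical stand-ins chosen for illustration; the abstract does not name the conventional schemes the paper evaluates or give its cost formulas.

```python
def pairwise_messages(num_agents):
    """Messages needed when every agent sends one message to every other agent."""
    return num_agents * (num_agents - 1)

def home_registry_cost(messages, migrations):
    """Hypothetical cost: each message is relayed via a home host (2 hops),
    and each migration sends one location update to the home host."""
    return 2 * messages + migrations

def forwarding_chain_cost(messages, migrations):
    """Hypothetical worst-case cost: migrations are free, but each message
    follows the whole chain of previously visited hosts."""
    return messages * (migrations + 1)

def choose_scheme(messages, migrations):
    """Pick the scheme with the lower estimated cost for the observed workload."""
    costs = {
        "forwarding_chain": forwarding_chain_cost(messages, migrations),
        "home_registry": home_registry_cost(messages, migrations),
    }
    return min(costs, key=costs.get)

growth = [pairwise_messages(n) for n in (10, 20, 40)]
stationary_agent = choose_scheme(messages=10, migrations=0)
roaming_agent = choose_scheme(messages=10, migrations=5)
```

Under these assumed cost models, an agent that rarely migrates is served more cheaply by forwarding, while a frequently migrating agent favors the registry; the point at which the preferred scheme flips plays the role of the switching threshold described above.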