Kazumichi SATO Keisuke ISHIBASHI Tsuyoshi TOYONO Haruhisa HASEGAWA Hideaki YOSHINO
Botnet threats, such as server attacks and spam e-mail distribution, have been increasing. Infected hosts must therefore be found and their malicious activities mitigated. An effective method for finding infected hosts is to use a blacklist of domain names. To receive attack commands from a Command and Control (C&C) server, a bot attempts to resolve the domain names of C&C servers. We can thus detect infected hosts by finding those that send queries for blacklisted (black) domain names. However, we cannot find all infected hosts because blacklists are incomplete: there are many black domain names, and their lifetimes are short, so a blacklist cannot cover them all. We therefore present a method for finding unknown black domain names by using DNS query data together with an existing blacklist of known black domain names. To achieve this, we focus on the DNS queries sent by infected hosts. Because of C&C server redundancy, a single bot sends queries for several black domain names. We use the co-occurrence relation between two different domain names to find unknown black domain names and extend the blacklist: if a domain name frequently co-occurs with a known black domain name, we assume that it is also black. A cross-validation evaluation of the proposed method showed that 91.2% of the domain names on the validation list scored in the top 1%.
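The co-occurrence idea can be sketched as follows. This is a minimal illustration, assuming DNS query logs are available as (host, domain) pairs; the scoring below (the fraction of a domain's querying hosts that also query a known black domain) is a simplified stand-in for the paper's co-occurrence measure, and all names in the example are hypothetical.

```python
from collections import defaultdict

def cooccurrence_scores(queries, blacklist):
    """Score each unknown domain by how often it co-occurs, per host,
    with domains already on the blacklist (illustrative scoring only)."""
    domains_by_host = defaultdict(set)
    for host, domain in queries:          # queries: iterable of (host, domain)
        domains_by_host[host].add(domain)

    hosts_by_domain = defaultdict(set)
    for host, domains in domains_by_host.items():
        for d in domains:
            hosts_by_domain[d].add(host)

    scores = {}
    for d, hosts in hosts_by_domain.items():
        if d in blacklist:
            continue
        # fraction of hosts querying d that also query at least one known black domain
        co = sum(1 for h in hosts if domains_by_host[h] & blacklist)
        scores[d] = co / len(hosts)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Example: rank candidate domains; high-scoring names are blacklist candidates.
queries = [("10.0.0.1", "c2-a.example"), ("10.0.0.1", "c2-b.example"),
           ("10.0.0.2", "benign.example")]
print(cooccurrence_scores(queries, blacklist={"c2-a.example"}))
```

In this toy example, c2-b.example scores 1.0 because its only querying host also queries a known black domain, while benign.example scores 0.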
Akio WATANABE Keisuke ISHIBASHI Tsuyoshi TOYONO Keishiro WATANABE Tatsuaki KIMURA Yoichi MATSUO Kohei SHIOMOTO Ryoichi KAWAHARA
In current large-scale IT systems, troubleshooting has become more complicated due to the diversification of failure causes, which has increased operational costs. Clarifying the troubleshooting process has thus become important, although doing so is time-consuming. We propose a method of automatically extracting a workflow, a graph representing a troubleshooting process, from multiple trouble tickets. Our method extracts an operator's actions from free-format text and aligns related sentences across multiple trouble tickets. It uses a stochastic model to detect a resolution, a frequent action pattern that helps us understand how to solve a problem. We validated our method using trouble-ticket data captured from a real network operation and showed that it can extract a workflow for identifying the cause of a failure.
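As a rough illustration of the extraction step, the sketch below pulls a coarse action sequence out of free-format ticket text using a hypothetical keyword list and counts recurring action subsequences across tickets. The paper's method learns actions and uses a stochastic model rather than fixed rules, so treat this only as a sketch of the idea; the ticket texts are invented.

```python
import re
from collections import Counter

# Hypothetical keyword list; the paper's extraction is learned, not rule-based.
ACTION_VERBS = {"check", "restart", "replace", "escalate", "verify"}

def extract_actions(ticket_text):
    """Pull a coarse action sequence out of one ticket's free-format text."""
    actions = []
    for sentence in re.split(r"[.\n]", ticket_text.lower()):
        for verb in ACTION_VERBS:
            if verb in sentence:
                actions.append(verb)
                break
    return actions

def frequent_patterns(tickets, length=2):
    """Count action subsequences that recur across tickets; frequent ones
    approximate the 'resolution' patterns described in the abstract."""
    counts = Counter()
    for text in tickets:
        seq = extract_actions(text)
        for i in range(len(seq) - length + 1):
            counts[tuple(seq[i:i + length])] += 1
    return counts.most_common()

tickets = ["Check interface status. Restart the line card. Verify traffic recovered.",
           "Check interface status. Replace optical module. Verify traffic recovered."]
print(frequent_patterns(tickets))
```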
Katsunori MATSUURA Yoshitsugu TSUCHIYA Tsuyoshi TOYONO Kenji TAKAHASHI
Availability of network access "anytime and anywhere" will impose new requirements on presence services: server load sharing and privacy protection. In such environments, presence services will have to handle sensor-device information with maximum consideration for users' privacy. In this paper, we propose FieldCast, a peer-to-peer system architecture for exchanging presence information in ubiquitous computing environments. In our proposal, presence information is exchanged directly among a user's own computing resources. We present evaluation results that demonstrate the feasibility of our proposal.
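The direct-exchange idea can be illustrated with a toy in-memory model. Peer discovery, transport, and the actual FieldCast protocol are not represented here, and the class and method names are assumptions for illustration only.

```python
# A toy model of direct peer-to-peer presence exchange: each user's own
# node pushes updates straight to subscribed peers, with no central server.
class PresenceNode:
    def __init__(self, owner):
        self.owner = owner
        self.subscribers = []      # other users' PresenceNode instances
        self.received = {}         # owner -> latest presence seen

    def subscribe(self, peer):
        """Ask peer to push its presence updates to this node."""
        peer.subscribers.append(self)

    def publish(self, status):
        for peer in self.subscribers:
            # a privacy-policy check would go here before disclosing status
            peer.received[self.owner] = status

alice, bob = PresenceNode("alice"), PresenceNode("bob")
bob.subscribe(alice)               # bob wants alice's presence
alice.publish("in the office")
print(bob.received)                # {'alice': 'in the office'}
```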
Tatsuaki KIMURA Akio WATANABE Tsuyoshi TOYONO Keisuke ISHIBASHI
Recent carrier-grade networks use many network elements (switches, routers) and servers to provide various network-based services (e.g., video on demand, online gaming) that demand higher quality and better reliability. Network log data generated by these elements, such as router syslogs, are rich sources for quickly detecting signs of critical failures and thus maintaining service quality. However, log data consist of a large number of text messages written in unstructured formats and cover various types of network events (e.g., operator logins, link downs); thus, the log messages that genuinely matter for network operation are difficult to find automatically. We propose a proactive failure-detection system for large-scale networks. It automatically finds abnormal patterns of log messages in a massive amount of data without requiring prior knowledge of the data formats used and can detect critical failures before they occur. To handle unstructured log messages, the system has an online log-template-extraction component that automatically extracts the format of each log message. After template extraction, the system uses supervised machine learning to associate critical failures with the log data that appeared before them. By associating each log message with a log template, we can characterize the generation patterns of log messages, such as burstiness, rather than relying only on keywords in the messages (e.g., ERROR, FAIL). We validated our system using log data collected from a large production network and, through a case study, evaluated its ability to detect signs of actual failures of network equipment.
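The template-extraction idea can be sketched as follows: variable tokens (numbers, IP addresses, hexadecimal IDs) are masked so that messages of the same type collapse onto one template, which downstream learning can then use instead of raw text. This regex-based version is a simplification of the system's online extraction, and the sample log lines are invented.

```python
import re
from collections import Counter

# Tokens that vary between occurrences of the same message type (numbers,
# IP addresses, hex IDs) are masked; what remains approximates a template.
VARIABLE = re.compile(r"\b(\d{1,3}(\.\d{1,3}){3}|0x[0-9a-f]+|\d+)\b", re.I)

def to_template(message):
    return VARIABLE.sub("<*>", message)

def template_counts(log_lines):
    """Map each raw log line to a template and count occurrences,
    so learning can work on template IDs and their generation patterns."""
    return Counter(to_template(line) for line in log_lines)

logs = ["Interface ge-0/0/1 down, reason 42",
        "Interface ge-0/0/2 down, reason 17",
        "Login from 192.0.2.10 succeeded"]
print(template_counts(logs))
```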
Tatsuaki KIMURA Keisuke ISHIBASHI Tatsuya MORI Hiroshi SAWADA Tsuyoshi TOYONO Ken NISHIMATSU Akio WATANABE Akihiro SHIMODA Kohei SHIOMOTO
Network equipment, such as routers, switches, and RADIUS servers, generates various log messages induced by network events such as hardware failures and protocol flaps. In large production networks, analyzing these log messages is crucial for diagnosing network anomalies; however, it has become challenging for two reasons. First, the log messages are unstructured text generated in accordance with vendor-specific rules. Second, the network events that induce the log messages span several geographical locations, network layers, protocols, and services. We developed a method that tackles these obstacles with two techniques: statistical template extraction (STE) and log tensor factorization (LTF). The former leverages a statistical clustering technique to automatically extract primary templates from unstructured log messages. The latter builds a statistical model that captures the spatio-temporal patterns of log messages. Such patterns provide useful insights into the impact and patterns of hidden network events. We evaluate our techniques using a massive amount of network log messages collected from a large operating network and confirm that our model fits the data well. We also present several case studies that validate the usefulness of our method.
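A minimal sketch of the tensor-factorization step is given below, assuming log messages have already been mapped to templates by an STE-like step. It uses the tensorly library's non-negative CP decomposition as a stand-in for the paper's LTF model; the tensor shape, rank, and random data are illustrative assumptions.

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import non_negative_parafac

# Toy count tensor: axes are (log template, network location, time window).
counts = np.random.poisson(lam=1.0, size=(20, 5, 48)).astype(float)

# Non-negative CP decomposition: each rank-1 component ties together the
# templates, locations, and time windows that tend to co-occur, i.e., a
# candidate spatio-temporal pattern of a hidden network event.
cp = non_negative_parafac(tl.tensor(counts), rank=3, n_iter_max=200)
template_f, location_f, time_f = cp.factors

for r in range(3):
    top_templates = np.argsort(template_f[:, r])[::-1][:3]
    print(f"component {r}: dominant template indices {top_templates.tolist()}")
```

Inspecting the factor matrices per component shows which templates dominate each pattern and where and when that pattern is active.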