IEICE global.ieice.org Site

Keyword Search Result

[Keyword] failure detection (6 hits)


Proactive Failure Detection Learning Generation Patterns of Large-Scale Network Logs
Tatsuaki KIMURA Akio WATANABE Tsuyoshi TOYONO Keisuke ISHIBASHI

PAPER-Network Management/Operation

Publicized: 2018/08/13
Vol: E102-B No:2
Page(s): 306-316
Recent carrier-grade networks use many network elements (switches, routers) and servers for various network-based services (e.g., video on demand, online gaming) that demand higher quality and better reliability. Network log data generated from these elements, such as router syslogs, are rich sources for quickly detecting the signs of critical failures to maintain service quality. However, log data contain a large number of text messages written in an unstructured format and contain various types of network events (e.g., operator's login, link down); thus, genuinely important log messages for network operation are difficult to find automatically. We propose a proactive failure-detection system for large-scale networks. It automatically finds abnormal patterns of log messages from a massive amount of data without requiring previous knowledge of data formats used and can detect critical failures before they occur. To handle unstructured log messages, the system has an online log-template-extraction part for automatically extracting the format of a log message. After template extraction, the system associates critical failures with the log data that appeared before them on the basis of supervised machine learning. By associating each log message with a log template, we can characterize the generation patterns of log messages, such as burstiness, not just the keywords in log messages (e.g. ERROR, FAIL). We used real log data collected from a large production network to validate our system and evaluated the system in detecting signs of actual failures of network equipment through a case study.
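The idea of mapping each message to a template and then characterizing generation patterns per template can be illustrated with a minimal sketch. It is not the paper's online template-extraction method: the regex masking rule, the function names, and the 60-second window are all assumptions chosen for illustration only.

```python
import re
from collections import Counter

# Assumed masking rule (hypothetical): IPv4 addresses, hex values, and
# plain integers are treated as variable fields of a log message.
_VARIABLE = re.compile(r"\b(?:\d{1,3}(?:\.\d{1,3}){3}|0x[0-9a-fA-F]+|\d+)\b")


def to_template(message: str) -> str:
    """Replace variable-looking tokens with '*' to approximate a log template."""
    return _VARIABLE.sub("*", message)


def template_counts(logs, window_sec=60):
    """Count messages per (time-window index, template).

    logs: iterable of (timestamp_in_seconds, message) pairs.
    A sudden jump in the count for one template is a crude burstiness signal.
    """
    counts = Counter()
    for ts, msg in logs:
        counts[(int(ts // window_sec), to_template(msg))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        (1, "link down on port 3"),
        (5, "link down on port 7"),
        (70, "link down on port 3"),
    ]
    for key, n in sorted(template_counts(sample).items()):
        print(key, n)
```

Per-window counts like these could serve as input features for a supervised classifier, which is the role the abstract assigns to the generation patterns; the feature design used in the paper is not given in the abstract.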
A Novel Failure Detection Circuit for SUMPLE Using Variability Index
Leiou WANG Donghui WANG Chengpeng HAO

BRIEF PAPER-Electronic Circuits

Vol: E101-C No:2
Page(s): 139-142
The combining loss of SUMPLE, an important signal-combining approach, increases when a sensor in an array fails. A novel failure detection circuit for SUMPLE based on a variability index is proposed. This circuit can effectively judge whether a sensor has failed. Simulation results validate its effectiveness relative to existing algorithms.
Failure Detection in P2P-Grid System
Huan WANG Hideroni NAKAZATO

PAPER-Grid System

Publicized: 2015/09/15
Vol: E98-D No:12
Page(s): 2123-2131
Peer-to-peer (P2P)-Grid systems are being investigated as a platform for converging Grid and P2P networks in the construction of large-scale distributed applications. The highly dynamic nature of P2P-Grid systems greatly affects the execution of distributed programs. Uncertainty caused by arbitrary node failure and departure significantly affects the availability of computing resources and system performance. Checkpoint-and-restart is the most common scheme for fault tolerance because it periodically saves the execution progress onto stable storage. In this paper, we suggest a checkpoint-and-restart mechanism as a fault-tolerant method for applications on P2P-Grid systems. A failure detection mechanism is a necessary prerequisite to fault tolerance and fault recovery in general. Given the highly dynamic nature of nodes within P2P-Grid systems, any failure should be detected to ensure effective task execution. Therefore, the failure detection mechanism was studied as an integral part of P2P-Grid systems. We discuss how the design of various failure detection algorithms affects their performance in terms of the average failure detection time of nodes. Numerical analysis results and an implementation evaluation are also provided to show the different average failure detection times in real systems for various failure detection algorithms. The comparison shows the shortest average failure detection time, by 8.8 s, on the basis of the WP failure detector. Our lowest mean time to recovery (MTTR) is also shown to have a distinct advantage, with a reduction of about 5.5 s over its counterparts.
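For readers unfamiliar with failure detectors, the general idea of detecting node failure and measuring detection time can be sketched with a simple heartbeat-timeout detector. This is a generic illustration, not the WP failure detector or any of the algorithms compared in the paper; the class name, method names, and the fixed timeout are assumptions.

```python
class HeartbeatDetector:
    """Hypothetical timeout-based detector: a node is suspected once no
    heartbeat has been received from it for longer than `timeout` seconds."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}  # node id -> time of the latest heartbeat

    def heartbeat(self, node, now):
        """Record that a heartbeat from `node` arrived at time `now`."""
        self.last_seen[node] = now

    def suspected(self, now):
        """Return the sorted list of nodes whose heartbeat is overdue at `now`."""
        return sorted(n for n, t in self.last_seen.items()
                      if now - t > self.timeout)


def detection_time(last_heartbeat, crash_time, timeout):
    """Delay between a crash and the moment this detector suspects the node.

    Suspicion starts at last_heartbeat + timeout, so a crash shortly after a
    heartbeat takes longer to detect than one shortly before the next beat.
    """
    return last_heartbeat + timeout - crash_time


if __name__ == "__main__":
    d = HeartbeatDetector(timeout=3.0)
    d.heartbeat("a", 0.0)
    d.heartbeat("b", 2.0)
    print(d.suspected(4.0))
    print(detection_time(last_heartbeat=2.0, crash_time=2.5, timeout=3.0))
```

Averaging `detection_time` over many crash instants gives one simple notion of average failure detection time; how the paper defines and measures it is not stated in the abstract.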
Bayesian Optimal Release Time Based on Inflection S-Shaped Software Reliability Growth Model
Hee Soo KIM Dong Ho PARK Shigeru YAMADA

PAPER-Reliability, Maintainability and Safety Analysis

Vol: E92-A No:6
Page(s): 1485-1493
The inflection S-shaped software reliability growth model (SRGM) proposed by Ohba (1984) is one of the well-known SRGMs. This paper deals with the optimal software release problem with regard to the expected software cost under this model based on the Bayesian approach. To reflect the effect of the learning experience for the updated software system, we consider several improvement factors to adjust the values of parameters characterizing the inflection S-shaped SRGM. Appropriate prior distributions are assumed for such factors and the expected total software cost is formulated. The optimal release time is shown to be finite and uniquely determined. Because of the flexibility of prior distributions, the proposed Bayesian methods may be applied in many different situations. Numerical results are presented on the basis of real data.
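For reference, the inflection S-shaped SRGM is commonly written with the mean value function below. The symbols $a$, $b$, $\psi$, and $r$ follow the usual textbook presentation and are assumptions here, since the abstract does not give the notation; the paper's Bayesian improvement factors and cost function are not reproduced.

```latex
% m(t): expected cumulative number of faults detected by time t
% a > 0: expected total number of faults, b > 0: fault detection rate
% psi = (1 - r)/r, where 0 < r <= 1 is the inflection rate
\[
  m(t) = \frac{a\,\bigl(1 - e^{-bt}\bigr)}{1 + \psi\, e^{-bt}},
  \qquad \psi = \frac{1 - r}{r}
\]
% Differentiating m(t) gives the fault detection intensity:
\[
  \lambda(t) = \frac{dm(t)}{dt}
             = \frac{a\,b\,(1 + \psi)\, e^{-bt}}{\bigl(1 + \psi\, e^{-bt}\bigr)^{2}}
\]
```

With $r = 1$ (so $\psi = 0$) the model reduces to the exponential form $m(t) = a(1 - e^{-bt})$, and $m(t) \to a$ as $t \to \infty$ for any admissible $r$.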
Object Tracking with Target and Background Samples
Chunsheng HUA Haiyuan WU Qian CHEN Toshikazu WADA

PAPER-Image Recognition, Computer Vision

Vol: E90-D No:4
Page(s): 766-774
In this paper, we present a general object tracking method based on a newly proposed pixel-wise clustering algorithm. Tracking an object in a cluttered environment is challenging because a target object may have a concave shape or apertures (e.g. a hand or a comb). In those cases, it is difficult to separate the target from the background completely by simply modifying the shape of the search area. Our algorithm solves the problem by 1) describing the target object by a set of pixels; 2) using a K-means based algorithm to detect all target pixels. To realize stable and reliable detection of target pixels, we first use a 5D feature vector to describe both the color ("Y, U, V") and the position ("x, y") of each pixel uniformly. This enables simultaneous adaptation to both the color and geometric features during tracking. Second, we use a variable ellipse model to describe the shape of the search area and to model the surrounding background. This guarantees stable object tracking under various geometric transformations. Robust tracking is realized by classifying the pixels within the search area into "target" and "background" groups with a K-means clustering based algorithm that uses the "positive" and "negative" samples. We also propose a method that can detect tracking failure and recover from it during tracking by making use of both the "positive" and "negative" samples. This feature makes our method a more reliable tracking algorithm because it can rediscover the target after it has been lost. Through extensive experiments under various environments and conditions, the effectiveness and efficiency of the proposed algorithm are confirmed.
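The classification step described above, assigning each pixel to "target" or "background" using positive and negative samples in a 5D (Y, U, V, x, y) feature space, can be sketched as a nearest-center comparison. This is a simplified illustration, not the paper's algorithm: the function names are hypothetical, the cluster centers are assumed to be given, and no scaling between color and position components is applied, although a real tracker would need one.

```python
import math


def nearest_distance(pixel, centers):
    """Euclidean distance from a 5D pixel feature to its closest center."""
    return min(math.dist(pixel, c) for c in centers)


def classify_pixels(pixels, target_centers, background_centers):
    """Label each (Y, U, V, x, y) pixel by whichever group has the closer center.

    target_centers play the role of "positive" samples and
    background_centers the role of "negative" samples.
    """
    labels = []
    for p in pixels:
        d_target = nearest_distance(p, target_centers)
        d_background = nearest_distance(p, background_centers)
        labels.append("target" if d_target < d_background else "background")
    return labels


if __name__ == "__main__":
    pixels = [(200, 120, 130, 10, 10), (50, 128, 128, 40, 40)]
    target_centers = [(210, 118, 132, 12, 11)]
    background_centers = [(40, 128, 128, 45, 42)]
    print(classify_pixels(pixels, target_centers, background_centers))
```

Using negative samples alongside positive ones means a pixel is rejected because it resembles the background more than the target, rather than because it exceeds a fixed distance threshold from the target.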
DRIC: Dependable Grid Computing Framework
Hai JIN Xuanhua SHI Weizhong QIANG Deqing ZOU

PAPER-Grid Computing

Vol: E89-D No:2
Page(s): 612-623
Grid computing presents a new trend in distributed and Internet computing for coordinating large-scale resource sharing and problem solving in dynamic, multi-institutional virtual organizations. Because of the diverse failures and error conditions in grid environments, developing, deploying, and executing applications over the grid is a challenge; dependability is thus a key factor for grid computing. This paper presents a dependable grid computing framework, called DRIC, that provides an adaptive failure detection service and a policy-based failure handling mechanism. The failure detection service in DRIC adapts to users' QoS requirements and system conditions, and the failure-handling mechanism can be optimized by a policy engine using a decision-making method. The performance evaluation results show that this framework is scalable and highly efficient, with low overhead.