A Proposal of Event Correlation for Distributed Network Fault Management and Its Evaluation

Nei KATO; Kohei OHTA; Tomohiro IKA; Glenn MANSFIELD; Yoshiaki NEMOTO

Summary:

In a distributed network management environment, an NMS (Network Management Station) interacts with several agents in different sub-networks. In the network fault management context, the NMS detects symptoms that indicate some abnormality, e.g., a surge in ICMP traffic, which may be caused by some network malfunction or misuse. The occurrence of a symptom is an event. A large number of events may be detected by an NMS. The sheer number of these events makes it difficult, if not impossible, for an NMS to diagnose them. Generally, a fault may have a cascading effect which may, in turn, give rise to a very large number of events. The sequence of events and their correlation play an important role in fault management and diagnosis. In the distributed environment of today's networks, the absence of any uniform time for reference makes this a challenging task. In the present network management framework of SNMP, a Manager maintains a notion of the clock of the agent it interacts with. But this mechanism is inadequate to determine the sequence of events and their correlation, more so in a distributed environment which may involve several managers. In this paper we propose a mechanism for ordering and correlating events detected in a large-scale network which is managed in a distributed manner within the SNMP framework. Our algorithm uses the concept of a Network Management Clock (NMC). The NMC is a virtual clock maintained by a manager based on sysUpTime readings from each SNMP agent. In this paper, the algorithm, its implementation and evaluation will be discussed.
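
The summary gives no algorithmic detail, so the following is only a minimal sketch of the general idea of placing events from several agents on one manager-side timeline using sysUpTime readings. It is not the paper's NMC algorithm. The names `VirtualClock`, `Event`, `record_poll`, and the agent labels are hypothetical, and the sketch assumes that polling latency and sysUpTime wrap-around or agent restarts can be ignored. The only SNMP fact relied on is that sysUpTime is expressed in TimeTicks, i.e. hundredths of a second.

```python
from dataclasses import dataclass

TICKS_PER_SECOND = 100  # SNMP TimeTicks are hundredths of a second


@dataclass(frozen=True)
class Event:
    agent: str             # hypothetical agent label
    sys_uptime_ticks: int  # agent's sysUpTime when the symptom was observed
    symptom: str


class VirtualClock:
    """Hypothetical manager-side clock; an illustration, not the paper's NMC."""

    def __init__(self):
        # agent -> manager time (seconds) at which that agent's sysUpTime was 0
        self._offsets = {}

    def record_poll(self, agent, manager_time_s, sys_uptime_ticks):
        # Assumes the reading is taken at manager_time_s (no network delay).
        self._offsets[agent] = manager_time_s - sys_uptime_ticks / TICKS_PER_SECOND

    def to_manager_time(self, event):
        # Translate the agent-local timestamp onto the manager's timeline.
        return self._offsets[event.agent] + event.sys_uptime_ticks / TICKS_PER_SECOND

    def order(self, events):
        # Order events from different agents by their translated time.
        return sorted(events, key=self.to_manager_time)


clock = VirtualClock()
clock.record_poll("router-a", manager_time_s=1000.0, sys_uptime_ticks=50_000)
clock.record_poll("router-b", manager_time_s=1000.0, sys_uptime_ticks=20_000)

events = [
    Event("router-a", 50_500, "ICMP surge"),
    Event("router-b", 20_300, "link down"),
]
ordered = clock.order(events)
```

With these made-up readings, the raw sysUpTime values would suggest that the `router-b` event came first only by coincidence of boot times; after translation, the `router-b` event maps to manager time 1003.0 s and the `router-a` event to 1005.0 s, so the ordering rests on a common reference rather than on each agent's own uptime.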

Publication: IEICE TRANSACTIONS on Communications Vol.E82-B No.6 pp.859-867

Publication Date: 1999/06/25

Type of Manuscript: Special Section PAPER (Special Issue on Distributed Processing for Controlling Telecommunications Systems)

Nei KATO, Kohei OHTA, Tomohiro IKA, Glenn MANSFIELD, Yoshiaki NEMOTO, "A Proposal of Event Correlation for Distributed Network Fault Management and Its Evaluation," in IEICE TRANSACTIONS on Communications, vol. E82-B, no. 6, pp. 859-867, June 1999.
URL: https://global.ieice.org/en_transactions/communications/10.1587/e82-b_6_859/_p

@ARTICLE{e82-b_6_859,
author={Nei KATO and Kohei OHTA and Tomohiro IKA and Glenn MANSFIELD and Yoshiaki NEMOTO},
journal={IEICE TRANSACTIONS on Communications},
title={A Proposal of Event Correlation for Distributed Network Fault Management and Its Evaluation},
year={1999},
volume={E82-B},
number={6},
pages={859-867},
abstract={In a distributed network management environment, an NMS (Network Management Station) interacts with several agents in different sub-networks. In the network fault management context, the NMS detects symptoms that indicate some abnormality, e.g., a surge in ICMP traffic, which may be caused by some network malfunction or misuse. The occurrence of a symptom is an event. A large number of events may be detected by an NMS. The sheer number of these events makes it difficult, if not impossible, for an NMS to diagnose them. Generally, a fault may have a cascading effect which may, in turn, give rise to a very large number of events. The sequence of events and their correlation play an important role in fault management and diagnosis. In the distributed environment of today's networks, the absence of any uniform time for reference makes this a challenging task. In the present network management framework of SNMP, a Manager maintains a notion of the clock of the agent it interacts with. But this mechanism is inadequate to determine the sequence of events and their correlation, more so in a distributed environment which may involve several managers. In this paper we propose a mechanism for ordering and correlating events detected in a large-scale network which is managed in a distributed manner within the SNMP framework. Our algorithm uses the concept of a Network Management Clock (NMC). The NMC is a virtual clock maintained by a manager based on sysUpTime readings from each SNMP agent. In this paper, the algorithm, its implementation and evaluation will be discussed.},
month={June},
}

TY - JOUR
TI - A Proposal of Event Correlation for Distributed Network Fault Management and Its Evaluation
T2 - IEICE TRANSACTIONS on Communications
SP - 859
EP - 867
AU - Nei KATO
AU - Kohei OHTA
AU - Tomohiro IKA
AU - Glenn MANSFIELD
AU - Yoshiaki NEMOTO
PY - 1999
JO - IEICE TRANSACTIONS on Communications
VL - E82-B
IS - 6
JA - IEICE TRANSACTIONS on Communications
Y1 - June 1999
AB - In a distributed network management environment, an NMS (Network Management Station) interacts with several agents in different sub-networks. In the network fault management context, the NMS detects symptoms that indicate some abnormality, e.g., a surge in ICMP traffic, which may be caused by some network malfunction or misuse. The occurrence of a symptom is an event. A large number of events may be detected by an NMS. The sheer number of these events makes it difficult, if not impossible, for an NMS to diagnose them. Generally, a fault may have a cascading effect which may, in turn, give rise to a very large number of events. The sequence of events and their correlation play an important role in fault management and diagnosis. In the distributed environment of today's networks, the absence of any uniform time for reference makes this a challenging task. In the present network management framework of SNMP, a Manager maintains a notion of the clock of the agent it interacts with. But this mechanism is inadequate to determine the sequence of events and their correlation, more so in a distributed environment which may involve several managers. In this paper we propose a mechanism for ordering and correlating events detected in a large-scale network which is managed in a distributed manner within the SNMP framework. Our algorithm uses the concept of a Network Management Clock (NMC). The NMC is a virtual clock maintained by a manager based on sysUpTime readings from each SNMP agent. In this paper, the algorithm, its implementation and evaluation will be discussed.
ER -