Information about Fault Management
In network management, fault management is the set of functions that detect, isolate, and correct malfunctions in a telecommunications network, compensate for environmental changes, and include maintaining and examining error logs, accepting and acting on error detection notifications, tracing and identifying faults, carrying out sequences of diagnostics tests, correcting faults, reporting error conditions, and localizing and tracing faults by examining and manipulating database information.
When a fault or event occurs, a network component will often send a notification to the network operator using a protocol such as SNMP. An alarm is a persistent indication of a fault that clears only when the triggering condition has been resolved. A current list of problems occurring on the network component is often kept in the form of an active alarm list such as is defined in RFC 3877,the Alarm MIB.
Fault management systems may use complex filtering systems to assign alarms to severity levels. These can range in severity from debug to emergency, as in the syslog protocol.[1] Alternatively, they could use the ITU X.733 Alarm Reporting Function's perceived severity field. This takes on values of cleared, indeterminate, critical, major, minor or warning. Note that the latest version of the syslog protocol draft under development within the IETF includes a mapping between these two different sets of severities. It is considered good practice to send a notification not only when a problem has occurred, but also when it has been resolved. The latter notification would have a severity of clear.
A fault management console allows a network administrator or system operator to monitor events from multiple systems and perform actions based on this information. Ideally, a fault management system should be able to correctly identify events and automatically take action, either launching a program or script to take corrective action, or activating notification software that allows a human to take proper intervention (i.e. send e-mail or SMS text to a mobile phone). Some notification systems also have escalation rules that will notify a chain of individuals based on availability and severity of alarm.
There are two primary ways to perform fault management - these are active and passive. Passive fault management is done by collecting alarms from devices (normally via SNMP) when something happens in the devices. In this mode, the fault management system only knows if a device it is monitoring is smart enough to throw and error and report it to the management tool. However, if the device being monitored fails completely or locks up, it won't throw an alarm and the problem will not be detected. Active fault management addresses this issue by actively monitoring devices via tools such as PING to determine if the device is active and responding. If the device stops responding, active monitoring will throw an alarm showing the device as unavailable and allows for the proactive correction of the problem.
..... Click the link for more information.
When a fault or event occurs, a network component will often send a notification to the network operator using a protocol such as SNMP. An alarm is a persistent indication of a fault that clears only when the triggering condition has been resolved. A current list of problems occurring on the network component is often kept in the form of an active alarm list such as is defined in RFC 3877,the Alarm MIB.
Fault management systems may use complex filtering systems to assign alarms to severity levels. These can range in severity from debug to emergency, as in the syslog protocol.[1] Alternatively, they could use the ITU X.733 Alarm Reporting Function's perceived severity field. This takes on values of cleared, indeterminate, critical, major, minor or warning. Note that the latest version of the syslog protocol draft under development within the IETF includes a mapping between these two different sets of severities. It is considered good practice to send a notification not only when a problem has occurred, but also when it has been resolved. The latter notification would have a severity of clear.
A fault management console allows a network administrator or system operator to monitor events from multiple systems and perform actions based on this information. Ideally, a fault management system should be able to correctly identify events and automatically take action, either launching a program or script to take corrective action, or activating notification software that allows a human to take proper intervention (i.e. send e-mail or SMS text to a mobile phone). Some notification systems also have escalation rules that will notify a chain of individuals based on availability and severity of alarm.
There are two primary ways to perform fault management - these are active and passive. Passive fault management is done by collecting alarms from devices (normally via SNMP) when something happens in the devices. In this mode, the fault management system only knows if a device it is monitoring is smart enough to throw and error and report it to the management tool. However, if the device being monitored fails completely or locks up, it won't throw an alarm and the problem will not be detected. Active fault management addresses this issue by actively monitoring devices via tools such as PING to determine if the device is active and responding. If the device stops responding, active monitoring will throw an alarm showing the device as unavailable and allows for the proactive correction of the problem.
Notes
References
Network management refers to the maintenance and administration of computer networks and telecommunications networks at the top level.
Network management is the execution of the set of functions required for controlling, planning, allocating, deploying, coordinating, and
..... Click the link for more information.
Network management is the execution of the set of functions required for controlling, planning, allocating, deploying, coordinating, and
..... Click the link for more information.
SET may stand for:
..... Click the link for more information.
- Sanlih Entertainment Television, a television channel in Taiwan
- Secure electronic transaction, a protocol used for credit card processing,
..... Click the link for more information.
glitch (Also known as "Bug") is a short-lived fault in a system. The term is particularly common in the computing and electronics industries, and in circuit bending, as well as among players of video games, although it is applied to all types of systems including human
..... Click the link for more information.
..... Click the link for more information.
Data logging is the practice of recording sequential data, often chronologically.
..... Click the link for more information.
Etymology
To log is a verbed derivative of the noun logbook; the verb form means to record in a logbook, and may have been coined in the 1820s...... Click the link for more information.
In general, detection is the extraction of intelligence from a carrier signal in a communication system. Note that this may be either an overt signal, as in a conventional radio broadcast, or a covert signal, as in steganography.
..... Click the link for more information.
..... Click the link for more information.
database is a structured collection of records or data that is stored in a computer system so that a computer program or person using a query language can consult it to answer queries. The records retrieved in answer to queries are information that can be used to make decisions.
..... Click the link for more information.
..... Click the link for more information.
Information is the result of processing, gathering, manipulating and organizing data in a way that adds to the knowledge of the receiver. In other words, it is the context in which data is taken.
..... Click the link for more information.
..... Click the link for more information.
The simple network management protocol (SNMP) forms part of the internet protocol suite as defined by the Internet Engineering Task Force (IETF). SNMP is used by network management systems to monitor network-attached devices for conditions that warrant administrative
..... Click the link for more information.
..... Click the link for more information.
A management information base (MIB) stems from the OSI/ISO Network management model and is a type of database used to manage the devices in a communications network. It comprises a collection of objects in a (virtual) database used to manage entities (such as routers and switches)
..... Click the link for more information.
..... Click the link for more information.
syslog is a standard for forwarding log messages in an IP network. The term "syslog" is often used for both the actual syslog protocol, as well as the application or library sending syslog messages.
..... Click the link for more information.
..... Click the link for more information.
Overview
The terms network administrator, network specialist and network analyst designate job positions of engineers involved in computer networks, the people who carry out network administration...... Click the link for more information.
Sysop (pronounced /ˈsɪs.ɒp/, or more rarely, /ˈsaɪ.sɒp/) is short for "system operator".
..... Click the link for more information.
..... Click the link for more information.
E-mail (short for electronic mail; often also abbreviated as e-mail, email or simply mail) is a store and forward method of composing, sending, storing, and receiving messages over electronic communication systems.
..... Click the link for more information.
..... Click the link for more information.
Text messaging, or texting is the common term for the sending of "short" (160 characters or fewer) text messages, using the Short Message Service, from mobile phones.
..... Click the link for more information.
..... Click the link for more information.
mobile phone or cell phone is a long-range, portable electronic device used for mobile communication. In addition to the standard voice function of a telephone, current mobile phones can support many additional services such as SMS for text messaging, email, packet switching
..... Click the link for more information.
..... Click the link for more information.
The simple network management protocol (SNMP) forms part of the internet protocol suite as defined by the Internet Engineering Task Force (IETF). SNMP is used by network management systems to monitor network-attached devices for conditions that warrant administrative
..... Click the link for more information.
..... Click the link for more information.
ping is a computer network tool used to test whether a particular host is reachable across an IP network. It works by sending ICMP “echo request” packets to the target host and listening for ICMP “echo response” replies.
..... Click the link for more information.
..... Click the link for more information.
Federal Standard 1037C, entitled Telecommunications: Glossary of Telecommunication Terms is a United States Federal Standard, issued by the General Services Administration pursuant to the Federal Property and Administrative Services Act of 1949, as amended.
..... Click the link for more information.
..... Click the link for more information.
MIL-STD-188 is a series of U.S. military standards relating to telecommunications.
..... Click the link for more information.
Purpose
Faced with “past technical deficiencies in telecommunications systems and equipment and software…that were traced to basic inadequacies in the application of..... Click the link for more information.
This article is copied from an article on Wikipedia.org - the free encyclopedia created and edited by online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of the wikipedia encyclopedia articles provide accurate and timely information please do not assume the accuracy of any particular article. This article is distributed under the terms of GNU Free Documentation License.
Herod_Archelaus