Why is fault management needed?
If your network is large enough to justify a central master station, your alarm remotes will typically report via a standard protocol. If you have a smaller network, you'll probably want to purchase alarm remotes that can notify you directly, without the need for a central master at all.
While most RTUs can be used alone, accessed using their web interface, it may benefit you to manage your fault detection system with a master station as your network and fault detection needs grow. This organizes alarms and notifications and gives network administrators a comprehensive view of the health of their network.
Your master station will be the central player. Only one is typically required, but multiple master stations can be useful for redundancy, multi-tiered monitoring of massive networks, or both.
Alarm remotes, although much smaller and less expensive than your central master, are much more numerous and durable. That's because they must be deployed at your remote sites, which can experience some very extreme conditions depending on your local geography.
One thing to keep in mind is that one of the most important features of a central master station is the ability to provide you with situational awareness, especially during a crisis. A master station with a user-friendly, intuitive interface is especially helpful. Understanding a specially formatted list of alarms, even if they are conveniently color-coded, is much less intuitive than viewing alarms on an overhead map of your network's region. Therefore, a master station that can display alarms on a map reduces the training requirements you have for your staff.
A network fault management system is key to guarding against expensive downtime and otherwise preventable equipment failure.
However, if you've just been put in charge of selecting, recommending, or purchasing a network management system for your organization, it's natural to have many questions in mind. Where should you start? What should you look for in remote monitoring equipment? Which features are essential, and which can you live without? How can you make sure your network is fully protected without spending budget on equipment you won't use?
We're an experienced and trusted monitoring systems manufacturer, so we've heard and answered all these questions from many of our clients.
What are the basics of network fault management and monitoring? Find out below.
The fault management cycle
Fault management operates on a continuous cycle that always looks for problems on your network. The fault management tool checks the network and discovers problems that affect performance or data transmission.
The tool alerts the user to the problem. If a tool creates multiple alerts about the same problem, it automatically correlates them and combines them into one alert before sending it.
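The correlation step described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `Alert` fields, the device names, and the rule that two alerts describe "the same problem" when they share a device and a problem key are all hypothetical. Real tools usually correlate on richer information, such as network topology or timing.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    device: str   # hypothetical device name, e.g. "rtu-1"
    problem: str  # hypothetical problem key, e.g. "link_down"
    message: str  # free-text detail from the monitoring tool


def correlate(alerts):
    """Combine alerts that share a device and problem into one summary each.

    Assumption: (device, problem) is a good enough key for "same problem".
    """
    groups = defaultdict(list)
    for alert in alerts:
        groups[(alert.device, alert.problem)].append(alert)

    combined = []
    for (device, problem), members in groups.items():
        combined.append({
            "device": device,
            "problem": problem,
            "count": len(members),  # how many raw alerts were merged
            "messages": [a.message for a in members],
        })
    return combined
```

With this sketch, three raw alerts about the same link on the same device would reach the user as a single combined alert with a count of three.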
By doing so, the fault management system can better prevent faults in those areas. It does so by correcting the conditions that may cause those faults. To achieve that, the system executes programs or scripts to perform minor fixes that are neither complex nor time-consuming. The same programs or scripts also enable the fault management system to automatically solve actual faults. A fault management system creates detailed logs of system status and the preventive or reactive actions it took.
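As a rough sketch of the idea of scripted fixes plus detailed logging, the Python below maps fault types to small fix functions and records what happened in each case. Everything here is an assumption made for illustration: the fault dictionary layout, the `service_stopped` fault type, and the `restart_service` placeholder do not come from any particular product.

```python
import logging

logger = logging.getLogger("fault_manager")


def restart_service(fault):
    """Hypothetical minor fix; a real one would call a script or an API."""
    return True


# Assumed mapping from fault type to the script that handles it.
FIXES = {"service_stopped": restart_service}


def handle_fault(fault, fixes=FIXES):
    """Try an automated fix, log the outcome, and report whether it worked."""
    fix = fixes.get(fault["type"])
    if fix is None:
        logger.warning("No automated fix for %s on %s; manual action needed",
                       fault["type"], fault["device"])
        return False
    try:
        resolved = bool(fix(fault))
    except Exception:
        # Log the full traceback so the failure can be analyzed later.
        logger.exception("Fix for %s on %s failed",
                         fault["type"], fault["device"])
        return False
    logger.info("Fix for %s on %s finished, resolved=%s",
                fault["type"], fault["device"], resolved)
    return resolved
```

Returning `False` whenever no fix exists or the fix fails gives the caller a simple signal that the fault should be handed to a person.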
From the perspective of fault prevention, logging with details is extremely important. Now you know how a fault management system works and what its main features are. The next step is to distinguish between active and passive fault management systems.
Active fault management systems use strategies such as ping or port status checks to query devices and nodes. That lets them determine the status of those devices and nodes on a routine basis. In other words, they proactively identify and correct the conditions that could lead to future faults.
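A port status check of the kind mentioned above might look like the following minimal Python sketch. It uses only the standard library's `socket.create_connection`. The function names, the default timeout, and the example addresses are assumptions for illustration, and a real system would run such checks on a schedule.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def poll(targets, timeout=2.0):
    """Check each (host, port) pair once and map it to its status."""
    return {(host, port): port_is_open(host, port, timeout)
            for host, port in targets}
```

For example, `poll([("192.0.2.10", 80), ("192.0.2.11", 22)])` would return a dictionary marking each target as reachable or not. The addresses are placeholders from the documentation range.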
On the other hand, passive fault management systems monitor the network for actual fault events that have already occurred. The fault management workflow is cyclical and continuous. It starts with fault detection, proceeds through several steps until fault resolution, and ends where it began: fault detection. This is the general fault management cycle, described in more detail below.
However, any fault management system may implement a specific process that goes beyond the basic steps below. Suppose a fault management system is monitoring a network. It discovers an interruption in service delivery, or finds that service performance is deficient. The fault management system then determines the source of the fault and its location in the network topology. So, the system already knows where the smoke is coming from.
But you know a bad thing never comes alone. What if there are a bunch of fault events all related to each other? A single fault can trigger multiple alarms.
But a flood of alarms could overwhelm the network administrators. So fault management systems first correlate the related alarms. Only after that do they fire a single aggregated alarm for network administrators. Once the alarm is out to the network administrator, the fault management system automatically performs a quick and simple fix. It executes programs or scripts to get the service up and running again as soon as possible. Is the service automatically restored, available, and working? If so, the fault is resolved. But what if the fault demands a slower, more complex fix?
Depending on the complexity of the fault, automatic restoration of service may not be possible. In those cases, the network administrator or a competent technician performs a manual intervention. In this last step of the workflow, someone manually solves the fault. The resolution may be a correction, a repair, or a replacement. At this point, you may be wondering what you need to do to put things into practice.
Allow me to show you the way in the next section. You can either develop your own fault management system or buy one. You can start by working on the most important root causes and observed signs of fault.
Or focus on an area of your network. Or even on a type of device or node. FCAPS is useful for establishing a straightforward common ground for talking about network management with corporate management. And it still applies today.
What is fault tolerance in networking?
Fault tolerance is the property that enables a system to continue operating properly in the event of the failure of, or one or more faults within, some of its components.
What is a network management system?
A network management system (NMS) is an application or set of applications that lets network administrators manage a network's independent components inside a bigger network management framework. An NMS may be used to monitor both software and hardware components in a network.
What is FCAPS in network management?
FCAPS is an acronym for fault, configuration, accounting, performance, and security, the management categories into which the ISO model divides network management tasks.
What is a configuration management system?
Configuration management (CM) is a systems engineering process for establishing and maintaining consistency of a product's performance, functional, and physical attributes with its requirements, design, and operational information throughout its life.
What is an alarm in networking?
Network alarm monitoring systems are devices that are mostly centrally located.