The most critical HMI notification, the alarm, has important information for both operators and maintenance personnel. Our panel of experts advises how to manage alarms to ensure high-severity alarms receive the attention they need.
Alarms are an important part of any machine interface. But how do you design for alarm management to be certain operators receive only relevant alarms and aren’t overwhelmed by multiple alarms or false trips?
Marcel Voigt, senior solutions engineer, B&R Industrial Automation
We can design the alarm system so that, based on the user logged in, only top-level alarms are displayed.
For the most part this should be limited to a "one liner" in the alarm banner.
Details can be provided by request, as well as potential options as to how to fix the problem. This could be in the form of text, pictures or videos.
For more advanced users such as service technicians, the level and details of a given alarm would include a back trace of the alarm.
That would allow diagnosis of the alarm in more detail, such as drive faults.
For example, in the case of an e-stop, there would be several follow-up faults, such as every drive reporting the loss of its Drive Enable.
The operator would simply see the e-stop alarm with the advice to release the e-stop.
If the service technician logs in, he would see all of the alarms and faults that led to or followed the e-stop being pressed.
Another example could be a temperature alarm. The operator doesn't really care at what temperature the alarm triggered, but the service tech might want to know.
In summary, show the operator only the essentials that help him/her get the machine running again, while showing the service tech details that allow him/her to detect or fix more substantial problems.
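The role-based view described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Alarm fields, the role names "operator" and "service", and the caused_by link between a follow-up fault and its root alarm are hypothetical and are not taken from any B&R product.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Alarm:
    alarm_id: str
    text: str
    advice: str = ""
    # Hypothetical link: id of the root alarm when this is a follow-up fault.
    caused_by: Optional[str] = None


def visible_alarms(alarms: List[Alarm], role: str) -> List[Alarm]:
    """Service technicians see the full back trace; everyone else sees root alarms only."""
    if role == "service":
        return list(alarms)
    return [a for a in alarms if a.caused_by is None]


def banner_line(alarm: Alarm) -> str:
    """Build the one-line text for the alarm banner."""
    return f"{alarm.text}: {alarm.advice}" if alarm.advice else alarm.text


# Example: an e-stop followed by the drive faults it causes.
active = [
    Alarm("ESTOP", "E-stop pressed", advice="Release the e-stop"),
    Alarm("DRV1_EN", "Drive 1 lost Drive Enable", caused_by="ESTOP"),
    Alarm("DRV2_EN", "Drive 2 lost Drive Enable", caused_by="ESTOP"),
]

for alarm in visible_alarms(active, "operator"):
    print(banner_line(alarm))
```

Calling visible_alarms(active, "service") instead returns all three entries, which is the back trace a technician would want.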
Vaidhyanath “Doc” Nanjundaiah, director—marketing & customer success, EZAutomation
If your PLC has the IIoT (MQTT)/Industry 4.0 feature, you can definitely implement or design a good alarm-management system. For instance, if the PLC is IIoT-ready, you can program it to send alarms or messages. You can optimize it by authorizing certain people or groups such as operators, supervisors and managers to receive specific alarms. This way plant personnel are not overwhelmed, and the relevant real-time data are sent only to the appropriate personnel.
Color, sound or both are the cues people recognize most readily and most quickly at a human interface, followed by intelligent text.
Jeff Winter, CSP, FS Eng (TÜV Rheinland), director, safety practice, at Grantek Systems Integration, Control System Integrators Association (CSIA) member
Alarms can absolutely be overwhelming and counter-productive if overused. More successful alarm-management programs include the following two aspects:
1. Standardization in visual/audible alarming requirements. Ensure all alarm types (for example, color of indicator light, solid vs. flashing) and methods of alarming (for example, stack lights, operator displays, buzzers) are consistent across all pieces of equipment and have the same look and feel. Without this standardization, it can be very difficult to develop operator training and associated safety policies to ensure safe interaction with equipment. The last thing you want is for a flashing red light on one machine to be critical and a flashing red light on another machine to be noncritical.
2. Standardization in alarm hierarchy and employee training. This includes ensuring that specific alarms are easily distinguishable as being higher priorities than others. Typically safety-related alarms are at the top of the alarm hierarchy and should be uniformly conveyed to all affected personnel. An alarm hierarchy needs to be accompanied by effective alarm training for employees. Similar to a fire alarm system, every employee in the company gets training on understanding how the system functions and what to do when it is activated. All employees, not just those actively working on the equipment, should be trained on how to identify a safety-specific alarm on a piece of equipment, especially one that may be in another part of the facility, and how to react in the event of an emergency.
Todd Ebright, staff engineer, MartinCSI, Control System Integrators Association (CSIA) member
Following a standard practice that includes alarm prioritization, color coding, succinct descriptions and corrective action is essential in the design of alarm management. By prioritizing alarms, putting the most severe at the top, operators will be able to determine where to focus their attention in troubleshooting issues. Color coding is also very helpful in prioritizing alarms, and it is best to limit color variations. For instance, a dark red could represent a high-level alarm, a lighter red could indicate a less-severe alarm, and yellow could show a warning that does not require immediate attention. Restricting the color palette to only a few variations will make it easy for operators to determine severity with a quick glance. It is also recommended to keep alarm descriptors brief. Wordy alarm descriptions can cause confusion and waste time. A succinct description tells the operator what is in alarm and where to look. Including a corrective action on the alarm display will enable the operator to quickly resolve any issues. Alarms can be filtered by user role to show only the ones that are applicable to them. Assigning alarm types to different user roles will ensure users see the alarms that are relevant to them when logging into the machine interface. It can also be helpful to programmatically look for alarms that recur regularly by including an occurrence count on the alarm display. A large count could indicate a potential hardware issue that needs to be resolved, for example, a faulty sensor that is tripping more than usual and potentially needs to be recalibrated.
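As a rough sketch of the prioritization and occurrence count described above, the Python below collapses raw alarm events into one display row per alarm, sorted with the most severe first. The severity names, the three-color palette mapping and the sample alarm texts are assumptions chosen for illustration only.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical severity table: rank (lower sorts first) and display color.
SEVERITY = {
    "high": (0, "dark red"),
    "medium": (1, "light red"),
    "warning": (2, "yellow"),
}


def build_display(events: List[Tuple[str, str]]) -> List[Dict]:
    """Turn (alarm_name, severity) events into display rows, most severe first."""
    counts = Counter(name for name, _ in events)
    severity_of = {name: sev for name, sev in events}
    rows = [
        {
            "alarm": name,
            "severity": severity_of[name],
            "color": SEVERITY[severity_of[name]][1],
            "count": n,  # occurrence count shown next to the alarm
        }
        for name, n in counts.items()
    ]
    # Sort by severity rank, then by how often the alarm recurs.
    rows.sort(key=lambda r: (SEVERITY[r["severity"]][0], -r["count"], r["alarm"]))
    return rows


events = [
    ("Infeed sensor blocked", "warning"),
    ("Guard door open", "high"),
    ("Infeed sensor blocked", "warning"),
    ("Low air pressure", "medium"),
    ("Infeed sensor blocked", "warning"),
]

for row in build_display(events):
    print(row["color"], row["alarm"], "x", row["count"])
```

In this sample the warning sits at the bottom of the list, but its count of three is the kind of figure that might point to a sensor needing attention.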
HMIs can display a great deal of information and offer multiple levels of interaction, and, depending on the person using the HMI, they can provide what is needed to diagnose a problem or start a troubleshooting process. With industry entering a high-productivity era, it is equally important that the right person or group is alerted depending on the problem at hand, be it false triggers by sensors or a material outage. As things stand today, most machines come with standard three- or five-segment stack lights and buzzers, and the most they can communicate is that the machine is down and someone needs to come and troubleshoot it. Naturally, the closest person is the operator of that machine, and, beyond material replenishment and basic troubleshooting, there is not much to gain by having an operator respond to that alarm. In short, if the machine demands more attention than the operator can give, then the plant is looking at some downtime and operator time wasted attending to that alarm.
Figure 1: Use visual-indication tower lights that could change instantaneously from stack-light mode to a run-light mode.
If manufacturers are really serious about shortening that downtime, there are multiple ways to do that. The most involved way would be sending texts and alerts about each problem to the maintenance teams or operators, depending on the situation. That could quickly become expensive and overwhelming to your maintenance teams and operators.
The other alternative could be to use visual-indication tower lights that could change instantaneously from stack-light mode to a run-light mode (Figure 1). There is a wide variety of information the light can communicate by selecting combinations of background and foreground (running segment) colors, and the speed and intensity of the running segment. They can tremendously help to visualize the entire state of the plant in just a glance. For example, under normal operator-interaction conditions, such as material replenishment, one can choose green as the background color, indicating there are no machine troubles, while the running segment shows orange, indicating that the machine will soon need replenishment. When the machine is out of materials for processing, the running segment color could change to red to indicate that the machine is stopped due to material outage, which is something the operator can handle.
In another scenario, if the machine stops unexpectedly in the middle of operation because a sensor tripped inadvertently, it's time to call electrical maintenance, and the light could change the background color to blue or something similar with a red running segment, indicating that the machine is stopped. The maintenance supervisor can then see that light from afar and send the right person.
If that sensor tripping is a common ailment of the system and the sensor may someday need replacement, the run-light can continue with the blue background color but a green running segment.
The point is, with various color combinations to choose from, the entire plant can create its own alarm manual that would offer a consistent approach to handling all different alarms for maintenance teams or operators without overwhelming them.
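One way to picture such a plant-wide alarm manual is as a simple lookup table. The Python sketch below encodes the color combinations from the examples above; the condition names and the responder column are hypothetical, and it assumes the background stays green when the machine runs out of material and that recurring sensor trips still go to electrical maintenance.

```python
from typing import Dict, NamedTuple


class LightState(NamedTuple):
    background: str
    running_segment: str
    responder: str


# Hypothetical alarm manual: condition name -> run-light colors and who responds.
ALARM_MANUAL: Dict[str, LightState] = {
    "material_low": LightState("green", "orange", "operator"),
    # Assumption: background remains green for a material outage.
    "material_out": LightState("green", "red", "operator"),
    "unexpected_sensor_trip": LightState("blue", "red", "electrical maintenance"),
    # Assumption: a known recurring trip is still a maintenance matter.
    "recurring_sensor_trip": LightState("blue", "green", "electrical maintenance"),
}


def light_for(condition: str) -> LightState:
    """Look up the run-light state for a condition; unknown conditions raise KeyError."""
    return ALARM_MANUAL[condition]


state = light_for("unexpected_sensor_trip")
print(state.background, state.running_segment, "->", state.responder)
```

Keeping the table in one place is what makes the approach consistent: every machine reads from the same manual, so the same color pair always means the same thing.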
Alarms are intended for two different audiences: operators and maintenance engineers. While maintenance should have access to all current and past alarms, machine operators are only interested in knowing whether the machine is operational and if any action is required. You should serve operators simple alarms that are easy to understand. To avoid overwhelming them with multiple alarms, it is best practice to only display the current alarms and clear them from the operator view when the issue is resolved. HMIs with sound output also make it easy to provide audio instructions to the operator. For the maintenance team, you should display full records complete with details and timestamps that will help to diagnose and troubleshoot the machine. Coupling data logging and trend graphs with alarms also makes troubleshooting easier when you can jump directly to the relevant data. When a smart Web server is enabled, these records and even troubleshooting videos and documents can be viewed remotely on a tablet without interfering with the operation of the machine. With some HMIs, you can also issue push e-mail notifications of high-severity alarms to promptly notify management of bigger concerns.
Operators can be simply alerted by alarm pilot devices on the machine, and operations can use the alarms from the HMI to get more detailed information. This is a simple solution. Also, HMI alarms typically have many different levels of severity that can be programmed to dictate how each one gets reported. Different alarms can notify different users, depending on job level.
Travis Cox, co-director of sales engineering, Inductive Automation
With more devices than ever being connected to monitoring systems, there’s a greater chance that a critical alarm could be lost in a sea of ordinary ones. You need a system that keeps alarm notifications organized, so you can quickly sift through the clutter and home in on those alarms that should be addressed immediately.
Four key processes are especially important—prioritization of alarms, removing chattering alarms, consolidation and escalation. When you prioritize alarms correctly, you make sure all alarms are not treated equally. According to The Alarm Management Handbook, an excellent reference on SCADA alarming, only 20% of your alarms should be categorized as “high” or “emergency.” Most of your alarms should be set at lower priorities. The lowest level, requiring no action, should be designated as “diagnostic.” Prioritization helps you to focus on critical alarms first.
Chattering alarms are alarms that repeat excessively in a short period of time. For example, temperature readings can chatter because temperature sensors are very sensitive, and small fluctuations can occur often as the temperature goes over a threshold briefly and then goes back to normal. If you don’t adjust your system, this would cause several unnecessary alarms in a short period of time. You can remove the chattering by telling your system to send an alarm only if the temperature stays beyond the threshold for a defined period of time.
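The time-delay idea can be sketched as follows. This is a minimal illustration, not any vendor's implementation: the function name, the sample format of (timestamp in seconds, value) and the 80-degree threshold are assumptions.

```python
from typing import Iterable, List, Tuple


def debounced_alarm_times(
    samples: Iterable[Tuple[float, float]], threshold: float, hold_s: float
) -> List[float]:
    """Return the timestamps at which an alarm is raised.

    An alarm is raised only after the value has stayed above the threshold
    continuously for at least hold_s seconds. Samples must be in time order.
    """
    raised = []
    above_since = None  # time the value first went above the threshold
    active = False      # True while the alarm is already raised
    for t, value in samples:
        if value > threshold:
            if above_since is None:
                above_since = t
            if not active and t - above_since >= hold_s:
                active = True
                raised.append(t)
        else:
            # Back to normal: reset the timer and clear the alarm.
            above_since = None
            active = False
    return raised


# A temperature that flickers around 80 before settling above it.
samples = [(0, 79), (1, 81), (2, 79), (3, 82), (4, 79), (5, 81),
           (6, 82), (7, 83), (8, 81), (9, 82), (10, 84), (11, 78)]

print(debounced_alarm_times(samples, threshold=80, hold_s=0))  # no delay
print(debounced_alarm_times(samples, threshold=80, hold_s=5))  # 5-second delay
```

With no delay the sample data raises three alarms; with a five-second delay only the sustained excursion raises one.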
Consolidation involves telling your system to hold onto alarms for a while before notifying you. That gives the system time to take in more alarms, so it can collect a batch and send them all at once. That way, you get one message with a list of alarms, rather than each alarm notification coming at you individually. Setting a delay of a few seconds can bring the alarms to you in groups. This helps operators to process the information more quickly.
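A minimal Python sketch of consolidation, assuming alarms arrive as (timestamp in seconds, text) pairs in time order and that a batch closes once the delay measured from its first alarm has passed. The function name and sample alarm texts are hypothetical.

```python
from typing import List, Tuple


def consolidate(alarms: List[Tuple[float, str]], delay_s: float) -> List[List[str]]:
    """Group alarms that arrive within delay_s of the first alarm in a batch."""
    batches: List[List[str]] = []
    batch_start = None
    for t, text in alarms:
        # Start a new batch when there is none yet or the delay window has passed.
        if batch_start is None or t - batch_start > delay_s:
            batches.append([])
            batch_start = t
        batches[-1].append(text)
    return batches


alarms = [
    (0.0, "Pump 2 low pressure"),
    (1.0, "Tank 4 low level"),
    (2.5, "Valve 7 failed to open"),
    (10.0, "Conveyor 1 stalled"),
]

for batch in consolidate(alarms, delay_s=3.0):
    print("One message:", "; ".join(batch))
```

Here four alarms become two messages, the first carrying the three alarms that arrived within the three-second window.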
Escalation involves telling your system to notify others if the primary operator isn’t responding within a defined timeframe. And the type of notification can be escalated, as well. For example, instead of an email, an escalated alarm notification could be a voice message or text message. An escalated message can be sent to an individual or to groups. You should set up your escalation so repeated messages are sent to the right people until the issue is resolved.
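An escalation chain can be expressed as a short table of delays, recipients and notification types. The sketch below is an assumption-laden example: the timings, recipients and channels are invented for illustration, and it does not cover repeating a message at the same step.

```python
from typing import List, Optional, Tuple

# Hypothetical chain: (seconds after the alarm, recipient, notification type).
ESCALATION_CHAIN = [
    (0, "primary operator", "email"),
    (300, "shift supervisor", "text message"),
    (900, "maintenance group", "voice message"),
]


def notifications_due(
    elapsed_s: float, acknowledged_at: Optional[float] = None
) -> List[Tuple[str, str]]:
    """Return the (recipient, type) steps that have fired after elapsed_s seconds.

    Steps scheduled at or after the acknowledgement time never fire.
    """
    due = []
    for after_s, recipient, channel in ESCALATION_CHAIN:
        if after_s > elapsed_s:
            continue
        if acknowledged_at is not None and after_s >= acknowledged_at:
            continue
        due.append((recipient, channel))
    return due


print(notifications_due(100))                       # only the operator so far
print(notifications_due(1000))                      # nobody responded
print(notifications_due(1000, acknowledged_at=400)) # supervisor acknowledged
```

Acknowledging the alarm at 400 seconds stops the chain before the maintenance group is called, which keeps escalation from bothering people once someone has taken ownership.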
If you use these four techniques, you’ll go a long way toward helping your operators respond quickly and efficiently to the alarms that need to be addressed.
Sopan Khurana, applications engineer, Patlite
Many operators develop alarm fatigue, tuning out repeating alarms that occur in excess. To combat this issue, we suggest designing alarm management to minimize the number of different alarm tones and integrate a mixture of visual and voice alerts to engage more senses. MP3 voice alert devices are typically field-programmable and can be customized to annunciate precise alarm conditions compared to identifying a multitude of different tones and sounds. Including a visual-indication component such as an LED signal tower adds another alert layer even if operators decide to deactivate the audible alarm component. Implementing triggers for sending custom-worded text messages and emails to managers for certain alarm conditions is a great option to consider, as well.
False alarms usually show up as activations that last only a short interval before the alarm is turned off. Wireless data-acquisition systems attached to the alarms can map the history of all alarms on the factory floor. This overview of all the machines on the floor can show exactly which alarms are the most problematic due to false trips.
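As a rough illustration of mining that history, the Python below counts activations that were cleared within a short time and ranks alarms by that count. The record format of (alarm name, on time, off time), the 10-second cutoff and the sample alarm names are assumptions; a short activation is only a hint of a false trip, not proof.

```python
from collections import Counter
from typing import List, Tuple


def short_trip_counts(
    history: List[Tuple[str, float, float]], max_duration_s: float
) -> List[Tuple[str, int]]:
    """Count activations cleared within max_duration_s, most frequent first.

    history holds (alarm_name, on_time_s, off_time_s) records.
    """
    counts = Counter(
        name for name, on_s, off_s in history if off_s - on_s <= max_duration_s
    )
    return counts.most_common()


history = [
    ("Press 3 jam", 0, 4),
    ("Press 3 jam", 60, 63),
    ("Oven overtemp", 100, 700),   # long activation, likely a real event
    ("Press 3 jam", 200, 202),
    ("Conveyor stall", 300, 305),
]

for name, count in short_trip_counts(history, max_duration_s=10):
    print(name, count)
```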