Alarms


Introduction

IoT applications often rely on notifications via, for example, SMS or email, but some cases require native alarm functionality. Waylay provides such a native alarm service, which is exposed over REST and has a user interface that is part of the Waylay administration console. The alarm functionality integrates with the rules engine: Waylay supports actuators that let you create, update or clear alarms automatically, based on the outcome of rules.

How can this feature be used in practice?

Alarms allow you to track how incidents persist over time. The Waylay Alarm Service gives you the status and the count of the alarms, and also provides interfaces that allow you to acknowledge alarms, change their severity level or simply close them. Since the alarm service is exposed over REST, you can integrate the alarm information in your own application.

Example of using the Alarm Service

The best way to describe this feature is with one simple rule. We will create an alarm as soon as a dice sensor returns an odd state (ONE, THREE, FIVE), and clear all alarms of a given type (in this case type DICE) when the dice returns an even state (TWO, FOUR, SIX).
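The decision made by this rule can be sketched in a few lines of Python. This is only an illustration of the logic, not Waylay code: the `Dice` enum and the `alarm_action` function are hypothetical names, and the returned strings simply stand in for triggering the create or clear actuator.

```python
from enum import Enum


class Dice(Enum):
    """The six dice states named in the rule description."""
    ONE = 1
    TWO = 2
    THREE = 3
    FOUR = 4
    FIVE = 5
    SIX = 6


def alarm_action(state: Dice) -> str:
    """Return 'create' for odd states and 'clear' for even states.

    Hypothetical helper: the strings stand in for the alarm actuators
    that the rule would trigger for alarms of type DICE.
    """
    return "create" if state.value % 2 == 1 else "clear"
```

For example, `alarm_action(Dice.THREE)` yields `"create"`, while `alarm_action(Dice.SIX)` yields `"clear"`.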

alarm

If the alarm actuator is triggered while an alarm that was created earlier has not yet been cleared, the Alarm Service increases the count of the open alarm. After running the task for a couple of minutes, this is what we can see:

alarms

If we look at one alarm that has a count higher than 1, we can see the following:

alarms

Listening to alarm events

The Waylay engine also listens to events sent by the alarm service and injects these events into tasks with AlarmEventSensors that run for the resources on which the alarm was created. By using the AlarmEventSensor as a filter, you can implement more automated handling of alarms.

Alarm severity escalation

As an example of this more automated handling, consider the following template, which implements automatic alarm severity escalation, i.e. it increases the severity of an alarm if the alarm occurs repeatedly. Remember that, to avoid an explosion of open alarms, only one alarm of a given type can be open on the same resource at any point in time. If the error condition leading to this alarm keeps occurring, a user will probably want its severity to increase automatically.

alarm

The template uses the AlarmEventSensor, the Function sensor and the updateWaylayAlarm actuator. The settings for each of them are:

alarm

Alarm Handling Service

Some devices have multiple embedded sensors (location, motion, status, temperature, etc.), which can be used at the same time for different business use cases. The corresponding business rules are often very different and fire different alarms, such as a motion alarm, a temperature alarm or a network outage. Depending on the severity and type of these alarms, different alarm handling logic may be needed: for example, simply logging minor alarms versus sending an SMS for critical alarms. The alarm handling service allows you to separate the alarm handling logic from the metric handling logic, which improves the readability and maintainability of the business logic.

Business rules can create alarms with a type and severity. The alarm service publishes them on the alarm channel, and the alarm handling rules listen on that channel and process the alarms using the ‘AlarmEventSensor’. The AlarmEventSensor can filter on resource, severity or alarm type.
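As a rough illustration of this kind of filtering, the sketch below models an alarm event and a filter on resource, alarm type and severity. The `AlarmEvent` and `AlarmFilter` classes and their field names are assumptions made for this example; they do not reflect the actual AlarmEventSensor configuration or event payload.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AlarmEvent:
    """Hypothetical, simplified alarm event as seen on the alarm channel."""
    resource: str
    alarm_type: str
    severity: str


@dataclass(frozen=True)
class AlarmFilter:
    """Hypothetical filter: a criterion left as None matches any value."""
    resource: Optional[str] = None
    alarm_type: Optional[str] = None
    severity: Optional[str] = None

    def matches(self, event: AlarmEvent) -> bool:
        # Every criterion that is set must equal the event's value.
        pairs = (
            (self.resource, event.resource),
            (self.alarm_type, event.alarm_type),
            (self.severity, event.severity),
        )
        return all(wanted is None or wanted == actual for wanted, actual in pairs)
```

A handling rule for critical temperature alarms could then use something like `AlarmFilter(alarm_type="TEMPERATURE", severity="CRITICAL")`, while a logging rule could filter on severity alone.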

alarm

The example below shows a rule dedicated to handling alarms of different alarm types and severities.

alarm

We used the “IsParameterEqual” sensor in this rule. This is a convenient sensor that checks whether the value of a parameter in the rawData matches a given regular expression.
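The check this sensor performs can be approximated as follows. This is a minimal sketch, not the sensor's implementation: the function name and signature are hypothetical, and it assumes the whole value must match the expression (a full match rather than a substring search).

```python
import re
from typing import Any, Mapping


def is_parameter_equal(raw_data: Mapping[str, Any], parameter: str, pattern: str) -> bool:
    """Return True if raw_data[parameter] fully matches the regular expression.

    Assumption: a missing parameter counts as "not equal", and values are
    compared by their string form.
    """
    if parameter not in raw_data:
        return False
    return re.fullmatch(pattern, str(raw_data[parameter])) is not None
```

With this sketch, `is_parameter_equal({"severity": "CRITICAL"}, "severity", "CRITICAL|MAJOR")` is true, whereas the same call with a value of `"MINOR"` is false.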

alarm

When a request to create an alarm comes into the alarm service, the service checks whether an open (ACTIVE or ACKNOWLEDGED) alarm of the same type already exists on the same source (resource). If there is such an open alarm, no new alarm is opened; instead, the count on the existing alarm is incremented. This avoids an explosion of open alarms if an error condition continues to occur.
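The deduplication behaviour described above can be modelled with a small in-memory store. This is a sketch under stated assumptions, not the alarm service itself: the `Alarm` and `AlarmStore` classes are hypothetical, and the `CLEARED` status name is an assumption (the text only names ACTIVE and ACKNOWLEDGED as open statuses).

```python
from dataclasses import dataclass
from typing import List

# Statuses that the text describes as "open".
OPEN_STATUSES = {"ACTIVE", "ACKNOWLEDGED"}


@dataclass
class Alarm:
    source: str
    alarm_type: str
    severity: str
    status: str = "ACTIVE"
    count: int = 1


class AlarmStore:
    """Hypothetical in-memory model of the create/clear behaviour."""

    def __init__(self) -> None:
        self._alarms: List[Alarm] = []

    def _find_open(self, source: str, alarm_type: str) -> List[Alarm]:
        return [
            a for a in self._alarms
            if a.source == source
            and a.alarm_type == alarm_type
            and a.status in OPEN_STATUSES
        ]

    def create(self, source: str, alarm_type: str, severity: str) -> Alarm:
        # If an open alarm of this type exists on this source, bump its count.
        open_alarms = self._find_open(source, alarm_type)
        if open_alarms:
            open_alarms[0].count += 1
            return open_alarms[0]
        alarm = Alarm(source, alarm_type, severity)
        self._alarms.append(alarm)
        return alarm

    def clear(self, source: str, alarm_type: str) -> None:
        # "CLEARED" is an assumed status name for a closed alarm.
        for alarm in self._find_open(source, alarm_type):
            alarm.status = "CLEARED"
```

Calling `create` twice for the same source and type returns the same alarm with a count of 2; after `clear`, the next `create` opens a fresh alarm with a count of 1.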

We can write dedicated logic to update the alarm severity level depending on the occurrences count using the ‘UpdateWaylayAlarm’ actuator.

alarm

Each of the AlarmEventSensor nodes filters for alarms of a certain severity (warning/minor/major). The function nodes check whether the count of the associated alarm has reached a certain threshold. If the count is above the threshold, the updateWaylayAlarm actuators update the alarm to the “next” severity.

Note also that when the function nodes are triggered on the “Updated” alarmEvent state, the task effectively prevents a user from manually setting the severity of an alarm lower than the count allows: if a user lowers the severity of an alarm, the rule triggers, checks the count and raises the severity again to the level that matches the count. Changing the above template to trigger the function nodes only on the “Occured again” state lets a user lower the severity of an alarm.
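The escalation logic and the effect of the trigger state can be sketched as follows. All names and values here are assumptions for illustration: the threshold numbers are invented, the severity order assumes warning, minor, major, critical, and `OCCURRED_AGAIN` and `UPDATED` are hypothetical identifiers standing in for the alarm event states mentioned above.

```python
from typing import Tuple

# Assumed severity order, lowest to highest.
SEVERITY_ORDER = ["WARNING", "MINOR", "MAJOR", "CRITICAL"]

# Hypothetical thresholds: escalate once the count is above this value.
ESCALATION_THRESHOLDS = {"WARNING": 3, "MINOR": 6, "MAJOR": 10}

# Hypothetical identifiers for the alarm event states.
OCCURRED_AGAIN = "occurred_again"
UPDATED = "updated"


def next_severity(severity: str, count: int) -> str:
    """Return the "next" severity if the count is above the threshold."""
    threshold = ESCALATION_THRESHOLDS.get(severity)
    if threshold is None or count <= threshold:
        return severity
    return SEVERITY_ORDER[SEVERITY_ORDER.index(severity) + 1]


def handle_event(
    event_state: str,
    severity: str,
    count: int,
    trigger_states: Tuple[str, ...] = (OCCURRED_AGAIN, UPDATED),
) -> str:
    """Apply one escalation step, but only for the configured event states."""
    if event_state not in trigger_states:
        return severity
    return next_severity(severity, count)
```

Suppose a user manually lowers an alarm with a count of 7 to WARNING. With the default trigger states, the resulting update event escalates it again, one step per event. With `trigger_states=(OCCURRED_AGAIN,)`, the update event is ignored and the lowered severity stays in place until the alarm occurs again.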

alarm