IoT applications often rely on notifications via SMS or email, but some cases require native alarm functionality. Waylay provides a native alarm service that is exposed over REST and has a user interface in the Waylay administration console. The alarm functionality integrates with the rules engine: Waylay supports actuators that create, update or clear alarms automatically based on the outcome of rules.
How can this feature be used in practice?
Alarms allow you to track the persistence of incidents over time. The Waylay Alarm Service gives you the status and count of alarms, and also provides interfaces to acknowledge alarms, change their severity level or close them. Since the alarm service is exposed over REST, you can integrate alarm information into your own application.
Example of using the Alarm Service
The best way to describe this feature is by using one simple rule. We will create alarms as soon as a dice sensor turns odd states (
and clear all alarms of a given type (in this case type
dice gives back even states (
In case the alarm actuator is triggered and the alarm has previously been created but was not yet cleared, the count of the opened alarm will be increased by the Alarm Service. After running the task for a couple of minutes, this is what we can see:
If we look at an alarm with a count higher than 1, we can see the following:
Listening to alarm events
The Waylay engine also listens to events sent by the alarm service and injects these events into tasks with AlarmEventSensors that run for the resources on which the alarm was created.
By using the AlarmEventSensor as a filter, more automated handling of alarms can be implemented.
Alarm severity escalation
As an example of this more automated handling, consider the following template, which implements automatic alarm severity escalation: it increases the severity of an alarm when the alarm occurs repeatedly. Remember that, to avoid an explosion of open alarms, only one alarm of a given type can be open on the same resource at any point in time. If the error condition leading to this alarm keeps occurring, a user would probably like its severity to increase automatically.
The template uses the AlarmEventSensor, the Function sensor and the updateWaylayAlarm actuator.
The settings for each of them are:
Alarm Handling Service
Some devices have multiple embedded sensors (location, motion, status, temperature, etc.) that can be used at the same time for different business use cases. The corresponding business rules are often very different and fire different alarms, such as a motion alarm, a temperature alarm or a network outage. Depending on the severity and type of these alarms, different alarm handling logic may be needed: for example, simply logging minor alarms versus sending an SMS for critical alarms. The alarm handling service allows you to separate the alarm handling logic from the metric handling logic, which improves the readability and maintainability of the business logic.
Business rules can create alarms with a type and severity. The alarm service publishes them on the alarm channel, and the alarm handling rules listen on the alarm channel and process the alarms using the ‘AlarmEventSensor’. The AlarmEventSensor can filter on a resource, severity or alarm type.
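To make the separation between alarm handling and metric handling concrete, here is a minimal Python sketch of filtering alarm events by resource, severity or type and routing them to different handlers. It is not the Waylay implementation: the `AlarmEvent` and `AlarmFilter` classes, their field names, the `dispatch` helper and the severity strings are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class AlarmEvent:
    """Illustrative alarm event; field names are assumptions, not the Waylay schema."""
    resource: str
    alarm_type: str
    severity: str


@dataclass(frozen=True)
class AlarmFilter:
    """Hypothetical filter: a criterion left as None matches any value."""
    resource: Optional[str] = None
    alarm_type: Optional[str] = None
    severity: Optional[str] = None

    def matches(self, event: AlarmEvent) -> bool:
        pairs = (
            (self.resource, event.resource),
            (self.alarm_type, event.alarm_type),
            (self.severity, event.severity),
        )
        return all(wanted is None or wanted == actual for wanted, actual in pairs)


Rule = Tuple[AlarmFilter, Callable[[AlarmEvent], str]]


def dispatch(event: AlarmEvent, rules: List[Rule]) -> List[str]:
    """Run every handler whose filter matches and collect the handler results."""
    return [handler(event) for flt, handler in rules if flt.matches(event)]


# Minor alarms are only logged, critical alarms trigger an SMS (both simulated as strings).
rules: List[Rule] = [
    (AlarmFilter(severity="MINOR"), lambda e: f"log: {e.alarm_type} on {e.resource}"),
    (AlarmFilter(severity="CRITICAL"), lambda e: f"sms: {e.alarm_type} on {e.resource}"),
]

minor = AlarmEvent("device-1", "temperature", "MINOR")
critical = AlarmEvent("device-1", "network outage", "CRITICAL")
```

Calling `dispatch(minor, rules)` runs only the logging handler, while `dispatch(critical, rules)` runs only the SMS handler, so the rules that produce alarms never need to know how each alarm is handled.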
The example below shows a rule dedicated to handling alarms of different types and severities.
We used the “IsParameterEqual” sensor in this rule. This is a convenient sensor that checks whether the value of a parameter in the rawData matches a given regular expression.
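The check performed by such a sensor can be sketched in a few lines of Python. This is an illustration only, not the sensor's actual code: the function name `is_parameter_equal` is hypothetical, and it assumes that the whole value must match the expression (full match) and that a missing parameter counts as no match.

```python
import re
from typing import Any, Mapping


def is_parameter_equal(raw_data: Mapping[str, Any], parameter: str, pattern: str) -> bool:
    """Return True when raw_data[parameter], rendered as text, fully matches pattern."""
    if parameter not in raw_data:
        # Assumption: a missing parameter is treated as "not equal".
        return False
    return re.fullmatch(pattern, str(raw_data[parameter])) is not None


# Illustrative payload; the keys are assumptions, not the Waylay alarm event schema.
event = {"type": "temperature", "severity": "CRITICAL"}
```

With this payload, `is_parameter_equal(event, "severity", "CRITICAL|MAJOR")` is true, while a pattern of `"MINOR"` or a lookup of a parameter that is absent from the payload is false.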
When a request to create an alarm comes into the alarm service, the service checks whether an open (ACTIVE or ACKNOWLEDGED) alarm of the same type already exists on the same source (resource). If there is such an open alarm, no new alarm is opened, but the count on the existing alarm is incremented. This avoids an explosion of open alarms if the error condition continues to happen.
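The deduplication behaviour described above can be modelled with a small in-memory sketch. It is not the alarm service itself: `AlarmStore`, `Alarm` and their fields are hypothetical names, the `CLEARED` status is an assumed label for a closed alarm, and leaving the severity of an existing alarm unchanged on a repeated request is also an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Open states as named in the text; any other status counts as closed.
OPEN_STATES = {"ACTIVE", "ACKNOWLEDGED"}


@dataclass
class Alarm:
    resource: str
    alarm_type: str
    severity: str
    status: str = "ACTIVE"
    count: int = 1


class AlarmStore:
    """In-memory stand-in for the alarm service (illustrative only)."""

    def __init__(self) -> None:
        # At most one open alarm per (resource, alarm type) pair.
        self._open: Dict[Tuple[str, str], Alarm] = {}

    def create(self, resource: str, alarm_type: str, severity: str) -> Alarm:
        key = (resource, alarm_type)
        existing = self._open.get(key)
        if existing is not None and existing.status in OPEN_STATES:
            # Same type on the same resource is still open: bump the count only.
            existing.count += 1
            return existing
        alarm = Alarm(resource, alarm_type, severity)
        self._open[key] = alarm
        return alarm

    def clear(self, resource: str, alarm_type: str) -> None:
        alarm = self._open.pop((resource, alarm_type), None)
        if alarm is not None:
            alarm.status = "CLEARED"  # assumed name for the closed state


store = AlarmStore()
first = store.create("dice-1", "odd", "WARNING")
second = store.create("dice-1", "odd", "WARNING")
```

Here `second` is the same object as `first` and its count is 2. After `store.clear("dice-1", "odd")`, the next create request opens a fresh alarm with a count of 1.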
We can write dedicated logic to update the alarm severity level depending on the occurrence count using the ‘UpdateWaylayAlarm’ actuator.
Each of the AlarmEventSensor nodes filters for alarms of a certain severity (warning/minor/major). The function nodes check whether the count of the associated alarm has reached a certain threshold. If the count is above the threshold, the updateWaylayAlarm actuators update the alarm to the “next” severity.
Note also that when the function nodes are triggered on the “Updated” alarmEvent state, the task prevents a user from manually setting the severity of an alarm lower than the count allows: if a user lowers the severity of an alarm, the rule triggers, checks the count and updates the severity again to the severity for that count. Changing the above template to trigger the function nodes only on the “Occured again” state enables a user to lower the severity of an alarm.
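The escalation logic and the effect of the trigger state can be sketched as follows. This is a simplified model rather than the template's actual function node code: the threshold values, the `EventType` enum, and the `next_severity` and `handle_event` functions are hypothetical, and each event escalates by at most one step.

```python
from enum import Enum
from typing import Optional, Set

SEVERITY_ORDER = ["WARNING", "MINOR", "MAJOR", "CRITICAL"]
# Hypothetical thresholds: leave a severity once the count exceeds this value.
THRESHOLDS = {"WARNING": 3, "MINOR": 6, "MAJOR": 10}


class EventType(Enum):
    """Illustrative stand-ins for the alarm event states named in the text."""
    OCCURRED_AGAIN = "occurred_again"
    UPDATED = "updated"


def next_severity(severity: str, count: int) -> Optional[str]:
    """Return the next severity if the count is above the threshold, else None."""
    threshold = THRESHOLDS.get(severity)
    if threshold is None or count <= threshold:
        return None
    return SEVERITY_ORDER[SEVERITY_ORDER.index(severity) + 1]


def handle_event(event_type: EventType, severity: str, count: int,
                 trigger_on: Set[EventType]) -> str:
    """Return the severity the alarm has after the rule has processed one event."""
    if event_type not in trigger_on:
        return severity  # the rule does not run, so a manual change is kept
    return next_severity(severity, count) or severity


strict = {EventType.OCCURRED_AGAIN, EventType.UPDATED}
lenient = {EventType.OCCURRED_AGAIN}
```

Suppose a user manually lowers an alarm with a count of 5 to WARNING, which produces an update event. With `strict`, `handle_event(EventType.UPDATED, "WARNING", 5, strict)` raises it back to MINOR; with `lenient`, the same call leaves it at WARNING until the alarm occurs again.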
- Alarms (in any state) that are older than 1 year will be automatically purged from the database.
- The history of the last 1000 event occurrences of alarms will be retained.
- The alarm count will be kept after the purge and will never be decreased.
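The retention rules above can be illustrated with a small sketch. It rests on several assumptions that the text does not confirm: age is measured from the creation time, one year is taken as 365 days, the history limit applies per alarm, and the count is tracked separately from the history so that trimming the history never lowers it. `AlarmRecord` and `purge_expired` are hypothetical names.

```python
from collections import deque
from datetime import datetime, timedelta, timezone
from typing import Any, Deque, List

MAX_HISTORY = 1000                 # last 1000 event occurrences are retained
MAX_AGE = timedelta(days=365)      # assumption: "1 year" taken as 365 days


class AlarmRecord:
    """Illustrative alarm record with a capped history and a separate counter."""

    def __init__(self, created_at: datetime) -> None:
        self.created_at = created_at
        self.count = 0
        # A bounded deque silently drops the oldest entry once it is full.
        self.history: Deque[Any] = deque(maxlen=MAX_HISTORY)

    def record_occurrence(self, event: Any) -> None:
        self.count += 1            # never decreased, even when history is trimmed
        self.history.append(event)


def purge_expired(alarms: List[AlarmRecord], now: datetime) -> List[AlarmRecord]:
    """Keep only alarms created within MAX_AGE of now, whatever their state."""
    return [alarm for alarm in alarms if now - alarm.created_at <= MAX_AGE]


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
recent = AlarmRecord(created_at=now)
for i in range(1200):
    recent.record_occurrence(i)
old = AlarmRecord(created_at=now - timedelta(days=400))
```

After the 1200 occurrences, `recent.count` is still 1200 while `recent.history` holds only the most recent 1000 entries, and `purge_expired([recent, old], now)` keeps only `recent`.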