AWS CloudWatch Alarms
Overview
AWS CloudWatch Alarms is a monitoring tool that helps you create alarms on your AWS Resources and AWS Services. This enables you to be alerted as soon as issues occur in your infrastructure. AWS CloudWatch Alarms can be configured for any AWS CloudWatch metric, such as the CPU Utilization of an AWS EC2 Instance or the number of 5XX Errors on an AWS Application Load Balancer. AWS CloudWatch Alarms can automatically send alerts via AWS Simple Notification Service when alarms are triggered.
Introduction to AWS CloudWatch Alarms
AWS CloudWatch is an extensive observability platform built into AWS. It can collect metrics and logs, monitor AWS Resources and AWS Services, react to operational changes, and help you analyze your metrics. It offers many features such as Dashboards, Events, Logs, and the topic of this article: AWS CloudWatch Alarms.
AWS CloudWatch Alarms is part of the monitoring tools that AWS CloudWatch offers. AWS CloudWatch Alarms, as the name suggests, allows you to create alarms that monitor different AWS Resources and AWS Services against conditions you define. These alarms can be configured to alert you or your team when the specified conditions are met. AWS CloudWatch Alarms can be used for simple use cases like monitoring the CPU Usage of an AWS EC2 Instance as well as more complex ones like anomaly detection using past metric data.
Concepts in CloudWatch Alarms
Before we jump into using AWS CloudWatch Alarms, we need to understand the following concepts:
- Namespaces: A namespace is essentially a group of CloudWatch metrics. The AWS namespaces typically use the following naming convention: AWS/service. For example, AWS EC2 uses the AWS/EC2 namespace.
- Metrics: Metrics are the fundamental component in CloudWatch. A metric represents a set of data points that are published to CloudWatch over time. For example, the CPU usage of a particular AWS EC2 instance is one metric provided by AWS EC2.
- Units: Each metric has a unit of measure, such as Bytes, Seconds, Count, or Percent. AWS CloudWatch associates a unit with each metric.
- Periods: A period is the length of time associated with a specific CloudWatch statistic. Each statistic represents an aggregation of the metric data collected over a specified period. Periods are defined in seconds. For example, to specify ten minutes, the period would be 600.
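To make these four concepts concrete, the sketch below writes one metric down as a plain Python dictionary. The keys mirror a subset of the parameters of the CloudWatch GetMetricStatistics API; the instance ID is a made-up placeholder and `minutes_to_period` is a helper introduced only for this illustration.

```python
def minutes_to_period(minutes: int) -> int:
    """Convert minutes to a CloudWatch period, which is expressed in seconds."""
    return minutes * 60


# One metric, described by its namespace, name, unit, and period.
metric_query = {
    "Namespace": "AWS/EC2",                # AWS/service naming convention
    "MetricName": "CPUUtilization",
    "Dimensions": [
        {"Name": "InstanceId", "Value": "i-0123456789abcdef0"},  # hypothetical ID
    ],
    "Unit": "Percent",
    "Period": minutes_to_period(10),       # ten minutes, in seconds
    "Statistics": ["Average"],
}

print(metric_query["Period"])  # 600
```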
Features of CloudWatch Alarms
Let's take a look at the features of AWS CloudWatch Alarms:
- Metric Alarms: A metric alarm monitors a single CloudWatch metric or the result of a math expression based on CloudWatch metrics. For example, a metric alarm can be set for an AWS EC2 Instance when the CPU utilization is over 60% for three periods of five minutes.
- High-Resolution Alarms: Standard alarms are evaluated over periods of one minute or longer. For metrics published at high resolution, you can instead set an alarm with a period of ten or thirty seconds. These are called High-Resolution alarms, and they cost more than standard metric alarms. For example, a high-resolution alarm can be set for an AWS EC2 Instance when the CPU utilization is over 60% for three periods of thirty seconds.
- Composite Alarms: A composite alarm takes into account the alarm states of a combination of multiple alarms. This combination is defined by a rule expression. The composite alarm goes into the ALARM state only when the rule expression evaluates to true. For example, a composite alarm can be set for an AWS EC2 Instance when the CPU utilization is over 60% and disk write operations are over 2000 for three periods of five minutes.
- Alerts: You can add alerts and notifications for when the state of a CloudWatch Alarm changes. These alerts can be set up easily using an AWS Simple Notification Service (SNS) topic. The AWS SNS topic can be configured to send notifications as SMS or email.
- Alarm Actions: Other than Alerts, you can also perform some other actions when the CloudWatch Alarm state changes. For EC2-based metrics, you can perform EC2 actions such as stopping or terminating instances. You can also perform actions to scale an Auto Scaling group.
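As a rough illustration of the composite alarm feature, the sketch below builds a rule expression that combines two existing alarms with AND, using the `ALARM("name")` form of the rule syntax. The alarm names and the SNS topic ARN are hypothetical, and the dictionary keys are the ones the PutCompositeAlarm API is assumed to take.

```python
def alarm_rule_all(alarm_names):
    """Build a rule that is true only when every listed alarm is in ALARM."""
    return " AND ".join(f'ALARM("{name}")' for name in alarm_names)


composite_params = {
    "AlarmName": "ec2-cpu-and-disk",  # hypothetical name
    "AlarmRule": alarm_rule_all(["ec2-cpu-high", "ec2-disk-write-high"]),
    # Hypothetical SNS topic that receives the single combined notification.
    "AlarmActions": ["arn:aws:sns:us-west-2:123456789012:ops-alerts"],
}

print(composite_params["AlarmRule"])
# ALARM("ec2-cpu-high") AND ALARM("ec2-disk-write-high")
```

Swapping `" AND "` for `" OR "` would give a rule that fires when any one of the child alarms is in the ALARM state.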
How are AWS CloudWatch Alarms Evaluated?
A metric alarm can have the following possible states:
- OK: The metric is within the defined threshold
- ALARM: The metric is outside of the defined threshold
- INSUFFICIENT_DATA: The metric is not available or not enough data is available for the metric to determine the alarm state
Let's take a look at how AWS CloudWatch Alarms are evaluated:
- When you create an alarm, you need to specify a threshold and three settings:
- Threshold: The value of the metric that CloudWatch compares when evaluating an alarm. If a period's value is greater than the threshold, then the metric is considered as breached or crossed for that period or data point. For example, for an AWS EC2 Instance metric alarm on the CPU Utilization, the threshold can be set to 60%.
- Period: The length of time over which to evaluate the metric or expression, in seconds.
- Evaluation Period: The number of the most recent periods, or data points, to evaluate when determining the alarm state.
- Datapoints to Alarm: The number of data points within the Evaluation Period that must be breaching for the alarm to go to the ALARM state.
- The "Datapoints to Alarm" value must be less than or equal to the "Evaluation Period". For example, let's take a Period of one minute, an Evaluation Period of five, and a Datapoints to Alarm of three.
- After you have created the AWS CloudWatch Alarm, AWS starts to continuously monitor the metric as per the defined period.
- Initially the CloudWatch Alarm is in the INSUFFICIENT_DATA state as there is not enough data to determine the state.
- When the Evaluation Period has passed, CloudWatch checks whether at least three of the five most recent data points have crossed the threshold.
- If no, the alarm state is set to OK.
- If yes, the alarm state is changed to ALARM. If there are notifications configured, CloudWatch automatically triggers these notifications.
- This repeats indefinitely until the alarm is stopped or deleted.
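The evaluation loop above can be sketched in a few lines of Python. This is a simplified model rather than CloudWatch's actual implementation: it assumes a "greater than" comparison, treats a partially filled window as INSUFFICIENT_DATA, and ignores missing data points. The function name and the sample CPU values are made up for the example.

```python
def evaluate(datapoints, threshold, evaluation_periods, datapoints_to_alarm):
    """Return the alarm state for the most recent `evaluation_periods` points."""
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"  # not enough data collected yet
    window = datapoints[-evaluation_periods:]
    breaching = sum(1 for value in window if value > threshold)
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"


# Threshold 60, Evaluation Period 5, Datapoints to Alarm 3.
cpu = [35, 72, 58, 65, 81]
print(evaluate(cpu[:3], 60, 5, 3))                # INSUFFICIENT_DATA
print(evaluate(cpu, 60, 5, 3))                    # ALARM (72, 65 and 81 breach)
print(evaluate([35, 72, 58, 40, 81], 60, 5, 3))   # OK (only 72 and 81 breach)
```

Note that the three breaching data points do not have to be consecutive; they only have to fall inside the five-point window.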
Advanced AWS CloudWatch Alarms
AWS CloudWatch supports two advanced alarms:
- Composite Alarms: As described earlier, Composite Alarms use a combination of alarms to determine the final state of the Composite Alarm. This reduces the alarm noise and you will receive a single alarm notification instead of one for each affected resource.
- Anomaly Detection: AWS CloudWatch Anomaly Detection applies Machine Learning models to continuously analyze metric data and identify anomalous behavior. AWS CloudWatch continuously captures various AWS Resource metrics, which it uses to generate trend patterns. With these trend patterns, AWS can predict anomalies in your metric data. You can then create AWS CloudWatch Alarms whose thresholds adjust to these patterns, such as the time of day, the day of the week, the current season, or changing trends.
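For anomaly detection, an alarm compares a metric against a band of expected values instead of a fixed number. The sketch below shows what the request parameters might look like, assuming the PutMetricAlarm API with a `Metrics` list, the `ANOMALY_DETECTION_BAND` metric math function, and `ThresholdMetricId` pointing at the band. The function name, alarm name, and band width of 2 are illustrative choices; check the API reference before relying on the exact field layout.

```python
def anomaly_alarm_params(instance_id, band_width=2):
    """Parameters for an alarm that fires when CPU rises above the expected band."""
    return {
        "AlarmName": f"cpu-anomaly-{instance_id}",  # hypothetical naming scheme
        "ComparisonOperator": "GreaterThanUpperThreshold",
        "EvaluationPeriods": 3,
        "ThresholdMetricId": "band",  # the threshold is the band, not a number
        "Metrics": [
            {
                "Id": "m1",
                "ReturnData": True,
                "MetricStat": {
                    "Metric": {
                        "Namespace": "AWS/EC2",
                        "MetricName": "CPUUtilization",
                        "Dimensions": [
                            {"Name": "InstanceId", "Value": instance_id},
                        ],
                    },
                    "Period": 300,
                    "Stat": "Average",
                },
            },
            {
                "Id": "band",
                # A larger second argument widens the band of "normal" values.
                "Expression": f"ANOMALY_DETECTION_BAND(m1, {band_width})",
                "ReturnData": True,
            },
        ],
    }


params = anomaly_alarm_params("i-0123456789abcdef0")  # hypothetical instance ID
print(params["Metrics"][1]["Expression"])  # ANOMALY_DETECTION_BAND(m1, 2)
```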
Configuring How CloudWatch Alarms Treat Missing Data
There may be instances where not every data point for a metric gets reported to CloudWatch. For example, this can happen when a network connection is lost or an instance is abruptly restarted. CloudWatch provides options to specify how to treat missing data points when evaluating an alarm. This lets you avoid false positives when missing data doesn't indicate a problem.
For each alarm, CloudWatch can be configured to treat missing data points as any of the following options:
- Not Breaching: The missing data points are considered "good", as if they have not crossed the threshold
- Breaching: The missing data points are considered "bad", as if they have crossed the threshold
- Ignore: The current alarm state is maintained
- Missing: The default state, considers the data as missing. If all the data points in the alarm evaluation range are missing, the alarm transitions to the INSUFFICIENT_DATA state.
You need to choose the right option based on the metric you are evaluating and the circumstances. When evaluating an AWS CloudWatch Alarm with missing data, the same steps as a normal AWS CloudWatch Alarm are followed - except the option you choose above is applied accordingly.
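The effect of the four options can be illustrated with a deliberately simplified model in which a missing data point is represented by `None`. This follows the descriptions above rather than CloudWatch's real evaluation logic, which handles more cases. The option strings match the values accepted by the alarm's `TreatMissingData` setting; the function itself is hypothetical.

```python
def evaluate_with_missing(window, threshold, datapoints_to_alarm,
                          treat_missing="missing", current_state="OK"):
    """Evaluate one window of data points, where None marks a missing point."""
    if treat_missing not in ("missing", "ignore", "breaching", "notBreaching"):
        raise ValueError(f"unknown option: {treat_missing}")

    present = [v for v in window if v is not None]
    missing_count = len(window) - len(present)
    breaching = sum(1 for v in present if v > threshold)

    if treat_missing == "breaching":
        breaching += missing_count  # missing points count as "bad"
    elif treat_missing in ("missing", "ignore") and not present:
        # Nothing to evaluate: "ignore" keeps the state, "missing" cannot decide.
        return current_state if treat_missing == "ignore" else "INSUFFICIENT_DATA"
    # "notBreaching" needs no adjustment: missing points count as "good".

    return "ALARM" if breaching >= datapoints_to_alarm else "OK"


gap = [None, None, None]
print(evaluate_with_missing(gap, 60, 3, "breaching"))        # ALARM
print(evaluate_with_missing(gap, 60, 3, "notBreaching"))     # OK
print(evaluate_with_missing(gap, 60, 3, "ignore", "ALARM"))  # ALARM
print(evaluate_with_missing(gap, 60, 3, "missing"))          # INSUFFICIENT_DATA
```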
Getting Started with AWS CloudWatch Alarms
Let's get started with using AWS CloudWatch Alarms by creating a simple alarm for CPU Utilization on an AWS EC2 Instance. This CloudWatch Alarm will be triggered if the CPU Utilization of the EC2 Instance crosses 60% over three consecutive periods of five minutes.
Pre-Requisites
- An AWS Account with an AWS EC2 Instance
If you do not have an AWS EC2 Instance, create one by following the instructions in this link.
Steps
- Log in to your AWS Account. Go to the AWS Region where you have your AWS EC2 Instance.
- Open the AWS Console. Search for "CloudWatch" in the Search Bar. Select CloudWatch.
- On the left navigation, select All alarms under Alarms. Then click Create alarm.
- Click on Select metric.
- Search for "CPUUtilization" in the search bar. Find and select your AWS EC2 instance. Then click Select metric.
- Configure the threshold as "60" to represent the 60% CPU utilization threshold value. Configure the Datapoints to alarm to be "3 out of 3". Keep all other settings as the default values. Click "Next".
- In the "Configure actions" section, you can set up notifications to an SNS topic. For now, you can click Remove to clear all notifications. Then click "Next".
- In the "Add name and description" section, add a name. Then click "Next".
- Review the configuration and then click Create alarm.
- Your AWS CloudWatch Alarm will be created and you will be able to monitor the alarm using the AWS Console.
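The same alarm can also be created programmatically. The sketch below is one possible equivalent of the console steps, assuming the boto3 SDK and its `put_metric_alarm` call; the function names, alarm name, and instance ID are made up for the example. Building the parameters is kept separate from the AWS call so that the settings can be inspected without credentials.

```python
def cpu_alarm_params(instance_id):
    """Parameters for: average CPU over 60% for three periods of five minutes."""
    return {
        "AlarmName": f"cpu-high-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,                # five minutes, in seconds
        "EvaluationPeriods": 3,
        "DatapointsToAlarm": 3,       # "3 out of 3"
        "Threshold": 60.0,
        "ComparisonOperator": "GreaterThanThreshold",
        # No AlarmActions: notifications are left out, as in the console steps.
    }


def create_alarm(params):
    """Send the request to AWS. Requires boto3 and configured credentials."""
    import boto3  # third-party SDK, imported here so the rest runs without it
    boto3.client("cloudwatch").put_metric_alarm(**params)


params = cpu_alarm_params("i-0123456789abcdef0")  # hypothetical instance ID
print(params["AlarmName"])  # cpu-high-i-0123456789abcdef0
```

Calling `create_alarm(params)` would then create the alarm in the Region that your AWS credentials and configuration point to.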
How to Delete AWS CloudWatch Alarms?
You can follow the below steps to delete an AWS CloudWatch Alarm.
Steps
- Log in to your AWS Account.
- Open the AWS Console. Search for "CloudWatch" in the Search Bar. Select CloudWatch.
- On the left navigation, select All alarms under Alarms.
- Select the Alarm you want to delete.
- Go to Options. Click Delete.
- After confirming you want to delete the CloudWatch Alarm, AWS will delete the alarm in a few seconds.
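Deletion can be scripted as well. The sketch below assumes a boto3 CloudWatch client, whose `delete_alarms` method takes a list of alarm names. To keep the example runnable without an AWS account, it uses a small stand-in class (`RecordingClient`, invented for this illustration) that records the request instead of sending it.

```python
def delete_alarms(client, alarm_names):
    """Delete the named alarms using a CloudWatch client (or a stand-in)."""
    if alarm_names:
        client.delete_alarms(AlarmNames=list(alarm_names))


class RecordingClient:
    """Stand-in for a CloudWatch client: records calls, never contacts AWS."""

    def __init__(self):
        self.deleted = []

    def delete_alarms(self, AlarmNames):
        self.deleted.extend(AlarmNames)


dry_run = RecordingClient()
delete_alarms(dry_run, ["cpu-high-i-0123456789abcdef0"])  # hypothetical alarm name
print(dry_run.deleted)  # ['cpu-high-i-0123456789abcdef0']
```

With real credentials, passing `boto3.client("cloudwatch")` instead of `RecordingClient()` would send the request to AWS.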
Pricing
AWS CloudWatch Alarms, like other AWS CloudWatch services, does not have any up-front costs. You only pay for what you use, every month.
As part of the AWS Free Tier, you get 10 AWS CloudWatch Alarms. High-Resolution alarms are excluded.
AWS CloudWatch Alarms are prorated by the hour and you are charged only while the alarms are running. The pricing model is:
- Standard Metric Alarm: $0.10 per alarm metric per month
- High Resolution Metric Alarm: $0.30 per alarm metric per month
- Composite Alarm: $0.50 per composite alarm per month
Note: The above values are provided for us-west-2 (Oregon). The pricing might vary in different AWS Regions.
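A quick way to estimate a monthly bill from these numbers is sketched below. It assumes that the listed prices are monthly rates, that every alarm runs for the whole month, and that the 10 free alarms offset standard metric alarms only. Those assumptions, and the function name, are illustrative, so check the current pricing page for your Region.

```python
# Prices from the list above (us-west-2), assumed to be per alarm per month.
PRICE = {"standard": 0.10, "high_resolution": 0.30, "composite": 0.50}
FREE_STANDARD_ALARMS = 10  # Free Tier allowance; high-resolution alarms excluded


def monthly_cost(standard=0, high_resolution=0, composite=0):
    """Estimate the monthly alarm cost in US dollars."""
    billable_standard = max(0, standard - FREE_STANDARD_ALARMS)
    total = (billable_standard * PRICE["standard"]
             + high_resolution * PRICE["high_resolution"]
             + composite * PRICE["composite"])
    return round(total, 2)


# 25 standard alarms (15 billable), 4 high-resolution, 2 composite:
# 15 * 0.10 + 4 * 0.30 + 2 * 0.50 = 3.70
print(monthly_cost(standard=25, high_resolution=4, composite=2))  # 3.7
```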
Conclusion
- AWS CloudWatch Alarms is a monitoring tool that lets you create alarms on AWS Resources and AWS Services.
- AWS CloudWatch Alarms are based on a few core CloudWatch concepts - namespaces, metrics, units, and periods.
- AWS CloudWatch Alarms have the following features - metric alarms, high-resolution alarms, composite alarms, alerts, and alarm actions.
- AWS CloudWatch Alarms can be in three different states - OK, ALARM, and INSUFFICIENT_DATA.
- There are two advanced AWS CloudWatch Alarms - composite alarms and anomaly detection.
- AWS CloudWatch Alarms can treat missing data based on four options - missing, not breaching, breaching, and ignore.
- AWS CloudWatch charges only for what you use and there are no upfront costs.