I am creating a CloudWatch metric alarm on the CPUUtilization metric with a threshold of 80% and the comparison operator “GreaterThanOrEqualToThreshold”, as shown below:
resource "aws_cloudwatch_metric_alarm" "ec2-high-cpu-alarm" {
  # One alarm per monitored instance
  count = length(local.monitor_instance_tags)

  alarm_name          = "ec2-high-cpu-alarm-for-${local.monitor_instance_tags[count.index]}"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  evaluation_periods  = 1
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = var.alarm_monitor_period
  statistic           = "Average"
  threshold           = 80
  alarm_description   = "This metric monitors ec2 cpu utilization exceeding 80%"

  dimensions = {
    InstanceId = local.monitor_instance_ids[count.index]
  }

  # The same targets are notified on ALARM and on return to OK
  alarm_actions             = local.alarm_notify_actions
  ok_actions                = local.alarm_notify_actions
  insufficient_data_actions = []
  actions_enabled           = true
}
With this alarm, I get “ALARM” state notifications when the threshold is reached or exceeded, but the notification messages do not expose the current metric value.
My requirement is to receive a “WARNING” notification when CPU utilization is between 80% and 90%, and a critical “ALARM” notification once it crosses 90%. Ideally, CloudWatch alarms would have an additional WARNING state, but currently only 3 states are possible: OK, INSUFFICIENT_DATA, ALARM.
Case 1:
Question: How can I check the current metric value while the alarm is in the ALARM state?
With that value, I could customize the alarm notification message (alarm_name or description) to add a “WARNING” or “CRITICAL ALARM” label.
Case 2:
Question: Otherwise, is it possible to include the current metric value in the notification message?
That way, the receiver can judge the severity of the issue.
NOTE: I want to avoid adding multiple alarms for the same metric that differ only in their threshold values and notification messages.
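For reference, the duplicated setup I would like to avoid can be sketched roughly as follows. This is only an illustration under assumptions: `local.cpu_severity_levels`, `local.cpu_alarms`, the resource name `ec2_cpu_alarm_by_severity`, and the label strings are hypothetical and not part of my actual configuration; the remaining `local.*` and `var.*` values are the ones used in the resource above.

```hcl
locals {
  # Hypothetical map of severity label => CPU threshold (percent)
  cpu_severity_levels = {
    WARNING  = 80
    CRITICAL = 90
  }

  # One entry per (instance, severity) pair, keyed by "<tag>-<label>"
  cpu_alarms = merge([
    for idx, tag in local.monitor_instance_tags : {
      for label, threshold in local.cpu_severity_levels :
      "${tag}-${label}" => {
        tag         = tag
        label       = label
        threshold   = threshold
        instance_id = local.monitor_instance_ids[idx]
      }
    }
  ]...)
}

# Two alarms per instance that differ only in threshold and label
resource "aws_cloudwatch_metric_alarm" "ec2_cpu_alarm_by_severity" {
  for_each = local.cpu_alarms

  alarm_name          = "${each.value.label}-ec2-high-cpu-alarm-for-${each.value.tag}"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  evaluation_periods  = 1
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = var.alarm_monitor_period
  statistic           = "Average"
  threshold           = each.value.threshold
  alarm_description   = "${each.value.label}: ec2 cpu utilization at or above ${each.value.threshold}%"

  dimensions = {
    InstanceId = each.value.instance_id
  }

  alarm_actions = local.alarm_notify_actions
  ok_actions    = local.alarm_notify_actions
}
```

With this layout, an instance above 90% would have both its WARNING and its CRITICAL alarm in the ALARM state at the same time, which is part of why I would prefer a single alarm.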
Please suggest ways to handle this requirement.