Issue
I'm running an AWS EC2 m5.large (a non-burstable instance). I have set up one of AWS CloudWatch's default metrics (CPU %) plus some custom metrics (memory and disk usage) in my dashboard.
But when I compare the numbers CloudWatch reports with the actual usage I see when I log in to the Ubuntu 20.04 server, they are pretty far apart...
Actual usage:
CPU: ~ 35 %
Memory: ~ 33 %
CloudWatch report:
CPU: ~ 10 %
Memory: ~ 50-55 %
https://www.screencast.com/t/o1nAnOFjVZW
I have followed AWS's own instructions to add the metrics for memory and disk usage (because CloudWatch doesn't have access to OS-level data out of the box): https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/mon-scripts.html
When the numbers are this far apart, it is impossible to set up useful alarms and notifications. I can't believe that is what AWS wants to provide to people who chose to follow their original instructions. The only thing that matches exactly is the disk usage %.
Solution
HOW TO INSTALL THE AWS CLOUDWATCH AGENT ON UBUNTU 20.04 (THE NEWER WAY, INSTEAD OF THE OLD SCRIPT: "CloudWatchMonitoringScripts")
1. sudo wget https://s3.amazonaws.com/amazoncloudwatch-agent/debian/amd64/latest/amazon-cloudwatch-agent.deb
2. sudo dpkg -i -E ./amazon-cloudwatch-agent.deb
3. sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard
4. Go through all the steps in the wizard (The result is saved here: /opt/aws/amazon-cloudwatch-agent/bin/config.json)
Hint: I answered:
- The default for most questions, and otherwise:
- NO --> Do you want to store the config in the SSM parameter store? (When I answered YES, it failed later on because of a permission issue that I didn't know how to resolve, and I don't think I need SSM for this.)
- YES --> Do you want to turn on StatsD daemon?
- YES --> Do you want to monitor metrics from CollectD?
- NO --> Do you have any existing CloudWatch Log Agent?
Steps 5 and 6 prevent this error: "Error parsing /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml, open /usr/share/collectd/types.db: no such file or directory" (see https://github.com/awsdocs/amazon-cloudwatch-user-guide/issues/1)
5. sudo mkdir -p /usr/share/collectd/
6. sudo touch /usr/share/collectd/types.db
7. sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json -s
8. /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status
{
"status": "running",
"starttime": "2020-06-07T10:04:41+00:00",
"version": "1.245315.0"
}
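For convenience, steps 1 to 8 above can be collected into one script. This is only a sketch, not an official AWS installer: the DRY_RUN switch and the run/install_agent helper names are my own assumptions, and the commands themselves are exactly the ones listed above. With DRY_RUN=1 (the default here, chosen for safety) it only prints what it would run. The wizard in step 3 is still interactive when run for real.

```shell
#!/bin/sh
# Sketch: the manual install steps above, collected into one script.
# DRY_RUN is a hypothetical switch (not part of the agent); 1 = print only.
set -eu

DRY_RUN="${DRY_RUN:-1}"
AGENT_DIR=/opt/aws/amazon-cloudwatch-agent
DEB_URL=https://s3.amazonaws.com/amazoncloudwatch-agent/debian/amd64/latest/amazon-cloudwatch-agent.deb

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

install_agent() {
  # Steps 1-2: download and install the agent package.
  run sudo wget "$DEB_URL"
  run sudo dpkg -i -E ./amazon-cloudwatch-agent.deb
  # Steps 3-4: interactive wizard, writes $AGENT_DIR/bin/config.json.
  run sudo "$AGENT_DIR/bin/amazon-cloudwatch-agent-config-wizard"
  # Steps 5-6: create an empty types.db so the CollectD section parses.
  run sudo mkdir -p /usr/share/collectd/
  run sudo touch /usr/share/collectd/types.db
  # Step 7: load the config and start the agent.
  run sudo "$AGENT_DIR/bin/amazon-cloudwatch-agent-ctl" -a fetch-config -m ec2 \
    -c "file:$AGENT_DIR/bin/config.json" -s
  # Step 8: check that the agent reports "running".
  run "$AGENT_DIR/bin/amazon-cloudwatch-agent-ctl" -m ec2 -a status
}

install_agent
```

Run it once as is to review the printed commands, then with DRY_RUN=0 to execute them.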
- https://www.screencast.com/t/42VWgoS88Z (Create IAM role, add policies and make it the server default role).
- https://www.screencast.com/t/fAUUHCPe (CloudWatch new custom metrics)
- https://www.screencast.com/t/8J0Saw0co (data match OK now)
- https://www.screencast.com/t/x0PxOa799 (data match OK now)
I realized that the second I log in to the machine, CPU usage goes up from 10 % to 30 % and stays there (some increase was to be expected, but not that much in my opinion), which in my case explains the big difference earlier. I honestly don't know if this way is more precise than the older script, but it should be the right way to do it in 2020 :-) You also get access to 179 custom metrics when selecting "Advanced" during the wizard (even though only a few would be valuable to most people).
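If you want to cross-check the CPU number from the server side over a fixed interval (for example from a cron job, so that no interactive login is involved), something like the following could work. It is a rough sketch and not how CloudWatch or the agent computes its value: the function names cpu_busy_pct and sample_cpu are made up for this example, and counting idle + iowait as "not busy" is my assumption. It relies only on the Linux /proc/stat "cpu" line, whose first fields are user, nice, system, idle, iowait, irq, softirq and steal.

```shell
#!/bin/sh
# Sketch: CPU busy % between two samples of the aggregate "cpu" line in /proc/stat.
set -eu

# cpu_busy_pct "<first cpu line>" "<second cpu line>"
# Sums user..steal (fields 2-9) as total time; idle + iowait counts as not busy.
cpu_busy_pct() {
  printf '%s\n%s\n' "$1" "$2" | awk '
    {
      total = 0
      for (i = 2; i <= 9 && i <= NF; i++) total += $i
      idle = $5 + $6
      if (NR == 1) { t0 = total; i0 = idle }
      else         { t1 = total; i1 = idle }
    }
    END {
      dt = t1 - t0
      if (dt > 0) printf "%.1f\n", 100 * (dt - (i1 - i0)) / dt
      else        print "0.0"
    }'
}

# sample_cpu [seconds]: take two samples that many seconds apart (default 5).
sample_cpu() {
  first="$(grep '^cpu ' /proc/stat)"
  sleep "${1:-5}"
  second="$(grep '^cpu ' /proc/stat)"
  cpu_busy_pct "$first" "$second"
}
```

Calling sample_cpu 60 prints the busy percentage over one minute, which you can then hold against the CloudWatch graph for the same minute.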
Answered By - PabloDK