Automating Node Metrics Collection Using Ansible Dynamic Inventory

Abhishek Mishra

1. Introduction

In large-scale infrastructure environments, it’s common to manage dozens or even hundreds of servers or virtual machines. Suppose you’re asked to provide a node metrics report — including CPU usage, RAM usage, and available storage — for 100+ servers every day at a specific time (e.g., 3 PM). Doing this manually is inefficient and error-prone.

This project demonstrates how to automate the process using Ansible with a dynamic inventory file, specifically for AWS EC2 instances.


2. Use Case

  • Scenario: 100+ servers/VMs in AWS.

  • Goal: Automatically collect and report CPU, RAM, and storage metrics at 3 PM.

  • Solution: Ansible with AWS EC2 dynamic inventory and SSH key automation.


3. Environment Setup

3.1 Install Ansible on Master Server

We only need to install Ansible on the master (control) server — not on worker nodes.

sudo add-apt-repository --yes --update ppa:ansible/ansible
sudo apt install ansible -y
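Later steps also call the AWS CLI and jq, and the amazon.aws.aws_ec2 inventory plugin needs the boto3 and botocore Python packages on the control node. A quick pre-flight check can catch a missing tool before anything else runs. The sketch below is only an illustration; the helper name missing_tools is hypothetical.

```shell
#!/bin/bash
# Print the name of every listed tool that is not found on PATH.
# (missing_tools is a hypothetical helper name, not part of Ansible.)
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Example call with the tools used in this walkthrough:
# missing_tools ansible ansible-inventory aws jq ssh-keygen
```

If the function prints nothing, every listed tool was found.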

Next, write a Tag.sh script that gives the 10 VMs predictable Name tags.

Automating EC2 Instance Renaming with AWS CLI

When managing multiple EC2 instances in AWS, especially in environments like development or staging, consistent and meaningful naming is crucial. Instead of manually renaming instances via the AWS Console, we can automate the process with a simple Bash script that tags EC2 instances sequentially.

In this post, we’ll walk through a script that:

  1. Finds all running EC2 instances in a specific environment (e.g., dev)

  2. Sorts them in a predictable order

  3. Tags them with sequential names like web-01, web-02, etc.


The Script

#!/bin/bash

# Fetch IDs of running instances tagged Environment=dev
instance_ids=$(aws ec2 describe-instances \
  --filters "Name=tag:Environment,Values=dev" "Name=instance-state-name,Values=running" \
  --query 'Reservations[*].Instances[*].InstanceId' \
  --output text)

# Sort instance IDs deterministically
sorted_ids=($(echo "$instance_ids" | tr '\t' '\n' | sort))

# Rename instances sequentially
counter=1
for id in "${sorted_ids[@]}"; do
  name="web-$(printf "%02d" $counter)"
  echo "Tagging $id as $name"
  aws ec2 create-tags --resources "$id" \
    --tags Key=Name,Value="$name"
  ((counter++))
done
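Before tagging real instances, it can help to preview which ID would get which name. The sketch below mirrors the sort-and-count logic of the script but makes no AWS calls: it reads IDs on stdin and prints the planned mapping. The function name plan_names and the sample IDs are made up for illustration.

```shell
#!/bin/bash
# Read instance IDs (space, tab, or newline separated) on stdin and print
# "<id> <prefix>-NN" pairs in sorted order. Nothing is sent to AWS.
plan_names() {
  tr -s ' \t' '\n\n' | sed '/^$/d' | LC_ALL=C sort |
    awk -v p="${1:-web}" '{ printf "%s %s-%02d\n", $1, p, NR }'
}

# Example with made-up IDs:
# printf 'i-0bbb\ti-0aaa\n' | plan_names web
```

Piping the output of the describe-instances command into plan_names gives a dry run of the renaming.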


4. SSH Key Generation

Create an SSH key pair on the master node to enable passwordless login to worker nodes:

ssh-keygen -t rsa -b 4096 -C "Ansible-Master"
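The command above prompts for a file path and a passphrase. For unattended setup, a wrapper like the one below could create the key only when it is missing. This is a sketch under two assumptions: the key lives at the default ~/.ssh/id_rsa (the path the injection script reads later), and an empty passphrase is acceptable for your environment. The function name ensure_key is hypothetical.

```shell
#!/bin/bash
# Create the key pair only if the private key file does not exist yet.
# -N "" sets an empty passphrase (an assumption) so ssh-keygen never prompts.
ensure_key() {
  key_path="${1:-$HOME/.ssh/id_rsa}"
  if [ -f "$key_path" ]; then
    echo "exists: $key_path"
  else
    mkdir -p "$(dirname "$key_path")"
    ssh-keygen -q -t rsa -b 4096 -C "Ansible-Master" -N "" -f "$key_path"
    echo "created: $key_path"
  fi
}

# Example: ensure_key "$HOME/.ssh/id_rsa"
```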

5. Inventory Configuration

5.1 Static vs Dynamic Inventory

  • Static: Best for a fixed, small set of servers.

  • Dynamic: Automatically fetches instances from AWS — ideal for large-scale, frequently changing environments.

5.2 Dynamic Inventory File (inventory/aws_ec2.yaml)

plugin: amazon.aws.aws_ec2
regions:
  - ap-south-1
filters:
  tag:Environment: dev
  instance-state-name: running
compose:
  ansible_host: public_ip_address
keyed_groups:
  - key: tags.Name
    prefix: name
  - key: tags.Environment
    prefix: env

6. Viewing Inventory

List the discovered hosts and their group hierarchy:

ansible-inventory -i inventory/aws_ec2.yaml --graph
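With many instances the graph gets long, so a per-group host count is a handy sanity check that the filters and keyed_groups did what you expect. The sketch below assumes the usual --graph layout, where group lines look like "|--@group:" and host lines look like "|--hostname". The function name hosts_per_group is hypothetical.

```shell
#!/bin/bash
# Read `ansible-inventory --graph` output on stdin and print
# "<group> <count>" for every group that directly contains hosts.
hosts_per_group() {
  awk '
    /--@/ { sub(/.*--@/, ""); sub(/:$/, ""); group = $0; next }  # group line
    /--/  { count[group]++ }                                     # host line
    END   { for (g in count) print g, count[g] }
  ' | LC_ALL=C sort
}

# Example:
# ansible-inventory -i inventory/aws_ec2.yaml --graph | hosts_per_group
```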

7. Ansible Configuration

Disable interactive SSH host key checking in ansible.cfg:

[defaults]
inventory = ./inventory/aws_ec2.yaml
host_key_checking = False

[ssh_connection]
ssh_args = -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null

8. Injecting Public Keys into Worker Nodes

Use the following script to push the master node’s public key to all worker nodes:

#!/bin/bash

# Define vars
PEM_FILE="big-project.pem"
PUB_KEY=$(cat ~/.ssh/id_rsa.pub)
REMOTE_USER="ubuntu"  # or ec2-user; named REMOTE_USER so the shell's own $USER is not overwritten
INVENTORY_FILE="inventory/aws_ec2.yaml"

# Extract hostnames/IPs from dynamic inventory
HOSTS=$(ansible-inventory -i "$INVENTORY_FILE" --list | jq -r '._meta.hostvars | keys[]')

for HOST in $HOSTS; do
  echo "Injecting key into $HOST"
  # $PUB_KEY is expanded locally before the command is sent to the worker
  ssh -o StrictHostKeyChecking=no -i "$PEM_FILE" "$REMOTE_USER@$HOST" "
    mkdir -p ~/.ssh && \
    echo \"$PUB_KEY\" >> ~/.ssh/authorized_keys && \
    chmod 700 ~/.ssh && \
    chmod 600 ~/.ssh/authorized_keys
  "
done
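Because the key is appended with >>, running the script a second time adds a duplicate line to authorized_keys. One way to make the append idempotent is to check for the exact line first. The sketch below shows that logic as a local function (append_key_once is a hypothetical name); the same grep test could replace the echo line in the remote command.

```shell
#!/bin/bash
# Append a public key to an authorized_keys file only if that exact
# line is not already present, then tighten the file permissions.
append_key_once() {
  key="$1"
  file="$2"
  mkdir -p "$(dirname "$file")"
  touch "$file"
  # -x: match whole line, -F: treat the key as a fixed string, not a regex
  grep -qxF -- "$key" "$file" || printf '%s\n' "$key" >> "$file"
  chmod 600 "$file"
}

# Example: append_key_once "$(cat ~/.ssh/id_rsa.pub)" ~/.ssh/authorized_keys
```

Once Ansible can already reach the hosts, the ansible.posix.authorized_key module is another option for managing keys.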

9. Email Integration

  • Use an App Password (for Gmail/Outlook) to securely send automated reports via email.

  • The Ansible playbook can be configured to gather node metrics and send them at the specified time.
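The report playbook refers to the variables smtp_server, smtp_port, email_user, email_pass, and alert_recipient, so they have to be supplied from somewhere outside the playbook. One possible approach, sketched below, is to generate a vars file from environment variables and keep it readable only by the current user. The environment variable names, the file name mail_vars.yaml, and the default port 587 are all assumptions made for this example.

```shell
#!/bin/bash
# Write the mail settings the report playbook expects into a vars file.
# Environment variable names and the default port are assumptions.
# Values containing double quotes or backslashes would need extra escaping.
write_mail_vars() {
  out="${1:-mail_vars.yaml}"
  (
    umask 077   # new file is readable and writable by the owner only
    cat > "$out" <<EOF
smtp_server: "${SMTP_SERVER}"
smtp_port: ${SMTP_PORT:-587}
email_user: "${EMAIL_USER}"
email_pass: "${EMAIL_PASS}"
alert_recipient: "${ALERT_RECIPIENT}"
EOF
  )
}

# Example: write_mail_vars mail_vars.yaml
#          ansible-playbook playbook.yaml -e @mail_vars.yaml
```

Keep the generated file out of version control; encrypting it with ansible-vault is a common alternative.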


10. Run Ansible playbook

collect_metrics.yaml

- name: Collect VM metrics
  hosts: env_dev
  become: true
  gather_facts: true
  tasks:

    - name: Install sysstat (for mpstat)
      apt:
        name: sysstat
        state: present
        update_cache: true  # refresh the package index first; fresh instances may have a stale cache
      when: ansible_os_family == "Debian"

    - name: Install sysstat (RedHat/CentOS)
      yum:
        name: sysstat
        state: present
      when: ansible_os_family == "RedHat"

    - name: Get CPU usage via mpstat
      shell: "mpstat 1 1 | awk '/Average/ && $NF ~ /[0-9.]+/ {print 100 - $NF}'"
      register: cpu_usage
      changed_when: false  # read-only command, so never report "changed"

    - name: Get memory usage
      shell: "free | awk '/Mem/{printf(\"%.2f\", $3/$2 * 100.0)}'"
      register: mem_usage
      changed_when: false

    - name: Get disk usage
      shell: "df / | awk 'NR==2 {print $5}' | tr -d '%'"
      register: disk_usage
      changed_when: false

    - name: Set metrics fact
      set_fact:
        vm_metrics:
          hostname: "{{ inventory_hostname }}"
          cpu: "{{ cpu_usage.stdout | float | round(2) }}"
          mem: "{{ mem_usage.stdout | float | round(2) }}"
          disk: "{{ disk_usage.stdout | float | round(2) }}"
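The three shell tasks each reduce a command's output to one number: CPU usage is 100 minus the %idle column of the mpstat "Average" line, memory usage is used divided by total from the "Mem" line of free, and disk usage is the Use% column of df for /. To try the filters outside Ansible, they can be wrapped as small functions that read captured output on stdin. The function names below are made up for this sketch.

```shell
#!/bin/bash
# Each function applies the same filter as the matching playbook task.

# mpstat: last column of the "Average" line is %idle, so usage = 100 - idle
cpu_used_pct()  { awk '/Average/ && $NF ~ /[0-9.]+/ {print 100 - $NF}'; }

# free: on the "Mem" line, column 2 is total and column 3 is used
mem_used_pct()  { awk '/Mem/ {printf("%.2f", $3/$2 * 100.0)}'; }

# df: second line, fifth column is Use%; strip the percent sign
disk_used_pct() { awk 'NR==2 {print $5}' | tr -d '%'; }

# Examples on a live machine:
# mpstat 1 1 | cpu_used_pct
# free       | mem_used_pct
# df /       | disk_used_pct
```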

playbook.yaml

- import_playbook: collect_metrics.yaml
- import_playbook: send_report.yaml

send_report.yaml

- name: Send consolidated VM report
  hosts: localhost
  gather_facts: true
  vars:
    collected_metrics: >-
      {{
        hostvars |
        dict2items |
        selectattr('value.vm_metrics', 'defined') |
        map(attribute='value.vm_metrics') |
        list
      }}
    timestamp: "{{ ansible_date_time.date }} {{ ansible_date_time.time }}"
    subject_line: "📊 VM Report – {{ ansible_date_time.date }} {{ ansible_date_time.hour }}:{{ ansible_date_time.minute }}"
  tasks:
    - name: Send animated HTML report via email
      mail:
        host: "{{ smtp_server }}"
        port: "{{ smtp_port }}"
        username: "{{ email_user }}"
        password: "{{ email_pass }}"
        to: "{{ alert_recipient }}"
        subject: "{{ subject_line }}"
        body: "{{ lookup('template', 'templates/report_email_animated.html.j2') }}"
        subtype: html

ansible-playbook playbook.yaml
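To get the report at 3 PM every day, as the use case asks, the command above needs a scheduler. Cron is one simple option. The sketch below only prints the crontab line so you can review it before installing; the project directory and the log file name are hypothetical, and 15:00 is interpreted in the control server's local time zone.

```shell
#!/bin/bash
# Print a crontab line that runs the playbook daily at 15:00.
# The project directory and report.log are hypothetical placeholders.
cron_line() {
  project_dir="${1:-$HOME/node-metrics}"
  printf '0 15 * * * cd %s && ansible-playbook playbook.yaml >> %s/report.log 2>&1\n' \
    "$project_dir" "$project_dir"
}

# To append it to the current user's crontab:
# ( crontab -l 2>/dev/null; cron_line "$HOME/node-metrics" ) | crontab -
```

Cron runs with a minimal PATH, so you may need the absolute path to ansible-playbook in the entry.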

11. Output

  • Dynamic inventory listing all active instances.

  • Automated SSH key injection for passwordless connections.

  • Scheduled Ansible playbook to collect CPU, RAM, and storage usage and email the report.


12. Conclusion

With Ansible dynamic inventory and automated SSH configuration, managing and monitoring 100+ AWS EC2 instances becomes seamless (I used only 10 here). This approach not only saves time but also reduces the chances of manual errors.
