Hashnode Documentation: Server Health Monitor

Nishant Kumar

📜 History

"I built this script after my server crashed mid-game because I ignored a full disk. Now it’s my overzealous guardian angel—it nags me via email before things explode. Turns out, logs are cheaper than therapy."


👤 Author

Name: Nishant Yadav
Background: "Bash scripting hobbyist who treats terminal errors like cryptic poetry. I still panic when I see ‘segmentation fault’, but at least my scripts email me about it now."
Social: GitHub | Email


📖 Summary

A Bash script that monitors CPU, RAM, and disk usage like a paranoid sysadmin. Sends email alerts when thresholds are breached and logs everything (because trust issues).

Key Features:

  • Configurable thresholds (no code editing required).

  • Anti-spam logic: Emails go out only after ALERT_THRESHOLD breaches within a single run (the counter is not saved between runs).

  • Logs: For when you need receipts to prove your server is dramatic.
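
As a rough sketch of the first feature: the script sources `server-health.conf` from the current directory if it exists, so any variable assigned there overrides the built-in default of the same name. The values and the email address below are illustrative placeholders, not recommendations.

```bash
# server-health.conf (sourced by load_config; plain Bash assignments)
# Example values only; pick thresholds that suit your server.
MAX_CPU=80
MAX_MEM=75
MAX_DISK=85
ADMIN_EMAIL="admin@example.com"   # placeholder address
ALERT_THRESHOLD=1                 # counted within one run, which makes at most 3 checks
```

Anything left out of the file keeps the default set at the top of the script.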


📝 Notes

The Struggle™:

  • “Debugging Bash floats is like asking a toaster to do calculus. Thank you, bc -l, for existing.”

  • “Gmail blocked my alerts until I learned about app passwords. Now my script is basically a tattletale.”
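
To make the first note concrete: Bash's `[ ... -gt ... ]` and `(( ... ))` only understand integers, which is why the script pipes its CPU and memory comparisons through `bc -l`. Below is a minimal sketch of that pattern as a reusable helper. `float_gt` is a hypothetical name, and the `awk` fallback is an addition for illustration on machines without `bc`, not something the script does.

```bash
#!/bin/bash

# float_gt A B: succeeds (exit status 0) when A is numerically greater than B.
float_gt() {
    if command -v bc >/dev/null 2>&1; then
        # bc prints 1 for true and 0 for false; (( 1 )) succeeds, (( 0 )) fails.
        (( $(echo "$1 > $2" | bc -l) ))
    else
        # Fallback: awk compares the two values as numbers.
        awk -v a="$1" -v b="$2" 'BEGIN { exit !(a > b) }'
    fi
}

if float_gt 92.5 90; then
    echo "92.5 is above the threshold of 90"
fi
```

With a helper like this, each check could read `if float_gt "$cpu_usage" "$MAX_CPU"; then ...` instead of repeating the `bc` pipeline.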

Future Plans:

  • Discord Alerts: “Replace emails with a bot that posts ‘RIP SERVER’ in #general.”

  • Panic Mode: “Auto-delete my Minecraft world if disk hits 95%. Priorities.”
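
The Discord idea could look roughly like this. It is only a sketch: `DISCORD_WEBHOOK_URL`, `discord_payload`, and `send_discord` are hypothetical names, and it assumes a Discord incoming webhook, which accepts a JSON body with a `content` field. The escaping covers backslashes and double quotes only, which is enough for the single-line messages `log_alert` builds.

```bash
#!/bin/bash

# Hypothetical setting; the real webhook URL would go in server-health.conf.
DISCORD_WEBHOOK_URL="${DISCORD_WEBHOOK_URL:-}"

# Build the JSON body, escaping backslashes first and then double quotes.
discord_payload() {
    local msg=$1
    msg=${msg//\\/\\\\}
    msg=${msg//\"/\\\"}
    printf '{"content":"%s"}' "$msg"
}

# POST the message to the webhook; -f makes curl fail on HTTP errors.
send_discord() {
    curl -fsS -H "Content-Type: application/json" \
        -d "$(discord_payload "$1")" "$DISCORD_WEBHOOK_URL"
}
```

Calling `send_discord "$message"` next to (or instead of) `send_notification` would be the only change needed in `log_alert`.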


🎯 Objectives

  1. Primary: “Prevent 2 AM server meltdowns with passive-aggressive emails.”

  2. Secondary: “Make logs so detailed they could be a Netflix documentary.”


🔌 Dependencies

  • Tools: mailutils, bc (for math that Bash can’t handle).

  • Tested On: Ubuntu/Debian. “If it breaks on Arch, blame the AUR gremlins.”
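
A missing tool otherwise shows up as a confusing error halfway through a run, so a quick check up front can help. This is a sketch, not part of the original script; `check_dependencies` is a hypothetical helper built on the standard `command -v` builtin. On Ubuntu/Debian the `mail` command comes from the `mailutils` package.

```bash
#!/bin/bash

# Report every tool that is not on PATH; return 1 if any is missing.
check_dependencies() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "Missing dependency: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling `check_dependencies mail bc top free df awk || exit 1` at the start of `main` would stop the run before any check executes.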


🚀 Examples

# Run a health check (like a responsible adult):  
./health-monitor.sh  

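# Trigger a fake alert to test your email setup (chaos mode).
# The script assigns MAX_CPU=90 itself, so a value set in your shell is ignored;
# lower the threshold in the config file instead (and remove the line afterwards):
echo 'MAX_CPU=5' >> server-health.conf && ./health-monitor.sh  # Prepare for spam!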

💻 The Code

"Here’s the script—comments included because I forget how my own code works."

#!/bin/bash

# Configuration (because hardcoding is for amateurs)
CONFIG_FILE="server-health.conf"  # Thresholds live here
LOG_FILE="health.log"             # Server's diary
ALERT_FILE="alerts.log"           # Panic log
MAX_CPU=90                        # "Why is my CPU crying?" threshold
MAX_MEM=85                        # RAM's breaking point
MAX_DISK=90                       # When your disk becomes a hoarder
ADMIN_EMAIL="nishantyadav2207@gmail.com"  # Where tears go
ALERT_THRESHOLD=1                 # How many warnings before spamming you

ALERT_COUNT=0                     # Tracks how mad the script is

# Load config (or cry trying)
load_config() {
    [ -f "$CONFIG_FILE" ] && source "$CONFIG_FILE"
}

# Check CPU Usage (spoiler: it's always 100%)
check_cpu() {
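    # Find the idle field by name; a fixed column like $8 breaks when top prints "ni,100.0 id"
    cpu_usage=$(top -bn1 | awk -F',' '/Cpu\(s\)/ {for (i = 1; i <= NF; i++) if ($i ~ /id/) {gsub(/[^0-9.]/, "", $i); print 100 - $i; exit}}')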
    printf -v cpu_usage "%.1f" "$cpu_usage"
    echo "CPU Usage: $cpu_usage%"  # Debug line for existential crises

    if (( $(echo "$cpu_usage > $MAX_CPU" | bc -l) )); then
        log_alert "CPU" "$cpu_usage" "$MAX_CPU"
    fi
}

# Check Memory Usage (where did all the RAM go?)
check_memory() {
    mem_usage=$(free -m | awk '/Mem:/ {print ($3/$2)*100}')
    printf -v mem_usage "%.1f" "$mem_usage"
    echo "Memory Usage: $mem_usage%"  # Debug line for denial

    if (( $(echo "$mem_usage > $MAX_MEM" | bc -l) )); then
        log_alert "Memory" "$mem_usage" "$MAX_MEM"
    fi
}

# Check Disk Usage (spoiler: it's always /tmp)
check_disk() {
    disk_usage=$(df -P / | awk 'NR==2 {gsub("%", "", $5); print $5}')
    echo "Disk Usage: $disk_usage%"  # Debug line for hoarders

    if [ "$disk_usage" -gt "$MAX_DISK" ]; then
        log_alert "Disk" "$disk_usage" "$MAX_DISK"
    fi
}

# Log alerts and maybe send an email (you've been warned)
log_alert() {
    local metric=$1
    local value=$2
    local max=$3
    local message="$metric usage high: ${value}% (Threshold: ${max}%)"

    echo "$(date) - WARNING: $message" >> "$ALERT_FILE"
    ((ALERT_COUNT++))

    if [ "$ALERT_COUNT" -ge "$ALERT_THRESHOLD" ]; then
        send_notification "$message"
        ALERT_COUNT=0  # Reset counter (no spam zone)
    fi
}

# Send email (requires mailutils setup. Good luck.)
send_notification() {
    local message=$1
    echo "$message" | mail -s "Server Alert" "$ADMIN_EMAIL"
}

# Generate a report nobody will read (but it's pretty)
generate_report() {
    echo "----- Server Health Report -----" >> "$LOG_FILE"
    echo "Timestamp: $(date)" >> "$LOG_FILE"
    echo "CPU: ${cpu_usage}% (Max: ${MAX_CPU}%)" >> "$LOG_FILE"
    echo "Memory: ${mem_usage}% (Max: ${MAX_MEM}%)" >> "$LOG_FILE"
    echo "Disk: ${disk_usage}% (Max: ${MAX_DISK}%)" >> "$LOG_FILE"
    echo "--------------------------------" >> "$LOG_FILE"
}

# Main function (where the magic happens)
main() {
    load_config
    check_cpu
    check_memory
    check_disk
    generate_report
}

# Let's roll!
main
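
One limitation worth noting: `ALERT_COUNT` lives in memory and starts at 0 on every run, so when the script is launched repeatedly (for example from cron) breaches are never counted across runs. A possible way around that, sketched under the assumption that a small state file is acceptable, is to keep the count on disk. `STATE_FILE`, `read_alert_count`, and `record_breach` are hypothetical names, not part of the script above.

```bash
#!/bin/bash

STATE_FILE="alert_count.state"   # hypothetical file name

# Print the saved count, falling back to 0 if the file is missing or garbled.
read_alert_count() {
    local count=0
    [ -f "$STATE_FILE" ] && count=$(cat "$STATE_FILE")
    [[ "$count" =~ ^[0-9]+$ ]] || count=0
    echo "$count"
}

# Add one breach to the saved count.
# Succeeds (status 0) and resets the count once the threshold is reached.
record_breach() {
    local threshold=$1 count
    count=$(( $(read_alert_count) + 1 ))
    if [ "$count" -ge "$threshold" ]; then
        echo 0 > "$STATE_FILE"
        return 0
    fi
    echo "$count" > "$STATE_FILE"
    return 1
}
```

In `log_alert`, the in-memory check could then become `if record_breach "$ALERT_THRESHOLD"; then send_notification "$message"; fi`, so an email goes out only after that many breaches across runs.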

💬 Human Touch

“This script is like a grumpy roommate who texts you ‘CLEAN YOUR DISK’ at 3 AM. It’s not perfect, but it works—kinda like my sleep schedule. Star the repo if you’ve ever cried over a segmentation fault!”


Latest Code: GitHub Repo
PS: If it breaks, blame the gremlins. Or capitalism.

Written by

Nishant Kumar

Hi, I'm Nishant—a beginner coder and tech enthusiast. I enjoy learning and working with Python, Linux, and Git. I'm also starting to explore DevOps to learn how to build, deploy, and maintain software efficiently. I use trusted resources like Harvard’s CS50P and ChatGPT to guide my learning journey. On my Hashnode page, I share simple tips and insights about coding and DevOps practices as I grow my skills each day. Join me on my journey to learn and share knowledge!