How much am I saving?

Jenny Veddeng

Navigating Azure Cost Management Without EA or MCA Access

Compute services are the largest expense for many clients, a trend amplified by the growing demand for AI and high-performance computing workloads. Yet not all users fully leverage the available cost-saving commitments. Is the effort of measuring and optimizing costs worthwhile? Most likely, yes, and you might be surprised by the potential savings. So, what can you do? Without an EA or MCA agreement, CSP customers struggle to leverage Azure's full suite of cost management tools. The complexity, and the need, increases with the size of your Azure environment, but so do the potential savings. This article focuses on measuring cost savings across subscriptions, with practical examples to get you started with something lightweight.

Commit to spending, and save costs

One of the challenges when deciding how much to commit is that Azure Advisor does not advise on commitment amounts at the management group level, for either reservations or savings plans. You have to perform cost analysis at the individual subscription level, which does simplify creating savings plans for those subscriptions, but it increases the effort and leaves you managing different commitment levels across various scopes.

Cost-saving measures

Azure Savings Plan for Compute (generally available since 2022) is a pricing model that lets customers save on compute costs by committing to a fixed hourly spend for one or three years, without having to specify a compute SKU or region, as you must with the savings model most people are familiar with: Reserved Instances (RI). With both, usage beyond the commitment is billed at standard pay-as-you-go rates.

  • Azure Reserved Instances: Offer savings of up to 72% compared to pay-as-you-go.

  • Azure Savings Plan for Compute: Offers savings of up to 65% compared to pay-as-you-go.

As with Reserved Instances, you specify a scope: resource group, subscription, management group, or shared. You can change the scope later, but not the amount you commit or the duration of the plan. RIs are more forgiving than savings plan commitments when it comes to changes and cancellations. A significant challenge with savings plans in particular is predicting future consumption across various services for the next one to three years. I have been implementing Savings Plan for Compute for one of Microsoft’s largest customers in Norway, where development happens fast(!): applications get deprecated and new products arrive. The risk of over-commitment is daunting without proper insight, so you need to keep track of consumption and add more savings plans as you go; that is a better approach than making one big commitment. Imagine having to do that for 100 subscriptions versus a few management groups.
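To make the over-commitment risk concrete, here is a minimal sketch in Python with made-up numbers (the commitment, the discount, and the usage levels are all hypothetical): a savings plan bills the committed amount every hour whether you use it or not, so savings turn negative as soon as eligible usage falls below the commitment.

# Illustrative sketch of over-commitment risk -- all numbers are hypothetical.
# A savings plan bills the committed amount every hour, used or not.

commitment_per_hour = 100.0  # NOK/hour committed (hypothetical)
plan_discount = 0.35         # assume ~35% off pay-as-you-go for covered usage

def hourly_savings(payg_equivalent_usage: float) -> float:
    """Savings vs. pay-as-you-go for one hour of eligible compute usage."""
    # The commitment buys usage worth commitment / (1 - discount) in PAYG terms.
    covered = min(payg_equivalent_usage, commitment_per_hour / (1 - plan_discount))
    overflow = payg_equivalent_usage - covered  # billed at full PAYG rates
    return payg_equivalent_usage - (commitment_per_hour + overflow)

for usage in [50, 100, 154, 200]:  # PAYG-equivalent NOK/hour of eligible compute
    print(f"usage {usage:>3} NOK/h -> savings {hourly_savings(usage):7.2f} NOK/h")

Break-even sits where eligible usage equals the commitment; everything below that is money spent on capacity you did not use, which is exactly why layering smaller plans beats one big bet.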

To understand the following practical steps, it's important to grasp these two terms:

  • Actual Cost: Think of this like your regular monthly bill. It shows the exact amount you were charged and paid for everything you used during that specific month, including pay-as-you-go services and any big one-time purchases (like buying a 1-year reservation).

  • Amortized Cost: This takes those big, upfront purchases (like reservations) and spreads their cost evenly over the time they cover (e.g., over 12 months for a 1-year reservation). It helps you see a more consistent daily or monthly cost, reflecting the effective price of using those reserved resources each day, rather than seeing a huge spike in the month you bought the reservation. The toy example below makes this concrete.
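A toy calculation with illustrative numbers:

# Toy example: how one upfront reservation purchase shows up in each view.
upfront_reservation = 12_000  # paid once in January, covers 12 months (hypothetical)
monthly_payg = 3_000          # other pay-as-you-go charges each month

actual_january = monthly_payg + upfront_reservation          # 15,000: one big spike
amortized_january = monthly_payg + upfront_reservation / 12  # 4,000: spread evenly

print(f"Actual: {actual_january}, Amortized: {amortized_january:.0f}")

Every other month of the year shows 3,000 in the actual view but 4,000 in the amortized view, which is what makes the amortized view the right baseline for comparing effective costs.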

I'll cover Azure Spot VMs in a future post, especially with the new ability to mix SKUs. Keep in mind that any Spot savings will also be reflected in these reports.

The practical part

I tested creating exports directly, storing them in a storage account, and generating reports with Python. But what if you could get the insights you need with just read permission, a lot more functionality, and simpler report creation? Azure-cost-cli is a simple command line tool that retrieves the cost of your Azure subscription. It uses the Azure Cost Management API and outputs the results to the console as text, CSV, markdown, or JSON, so it can be used in a workflow to get the cost of your subscription and feed it into subsequent steps.
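Because the output formats include JSON, wiring the tool into a workflow step is straightforward. Here is a minimal sketch in Python (the subscription ID is a placeholder, and the exact JSON shape is an assumption; inspect the real output before relying on it):

# Minimal workflow sketch: call azure-cost and reuse its JSON output.
# Assumes azure-cost-cli is installed and you are already logged in with `az`.
import json
import subprocess

subscription_id = "00000000-0000-0000-0000-000000000000"  # placeholder

result = subprocess.run(
    ["azure-cost", "dailyCosts",
     "--subscription", subscription_id,
     "--output", "json"],
    capture_output=True, text=True, check=True,
)

# Assumption: the JSON payload is a list of daily cost records.
daily_costs = json.loads(result.stdout)
print(f"Retrieved {len(daily_costs)} daily cost records")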

You can add filters when you retrieve data, so I could, for instance, get only the resources that have a certain tag. That means you can more easily group together the subscriptions that belong to the same application, collect the cost reports for a certain owner and send them by mail, or create a tag policy that tags all resources covered by Savings Plan for Compute to improve your optimization report. The opportunities are endless.

First, let’s get the data sets and store them in a local folder. Run the export for both AmortizedCost and ActualCost. What we want to do is combine the datasets and calculate the savings across the subscriptions by comparing the amortized and actual data. We can filter on PricingModel to make sure we are looking at one type at a time.

Amortized cost - Actual cost = Your savings

Installation is a one-liner: dotnet tool install --global azure-cost-cli

The script provided is a bit long, so for clarity: if you just want to test what you can do with the azure-cost-cli tool, the snippet below is what you’re looking for, and the full script follows right after it. Run azure-cost --help to see all possible parameters.

  cost_data=$(azure-cost dailyCosts \
      --subscription "$subscription_id" \
      --dimension "PricingModel" \
      --metric "$metric" \
      --timeframe Custom \
      --from "$start_date" \
      --to "$end_date" \
      --filter "$FILTER_STRING" \
      --output csv)
#!/usr/bin/env bash
set -euo pipefail

# ------------------------------------------------------------------------------
# Synopsis: Fetches Azure cost data for specified resource types across subscriptions.
#
# Description:
#   This script iterates through all accessible Azure subscriptions within a tenant,
#   fetches daily cost data for specific resource types (VMs, App Service Plans,
#   VMSS) for a defined date range, and saves the output as a CSV file for
#   each subscription. It skips subscriptions whose names start with "XSP".
#
#   The start date defaults to the first day of the previous month if not provided.
#   The end date is always the current day.
#
# Usage:
#   ./azure_cost_script.sh [START_DATE] [METRIC]
#
# Arguments:
#   START_DATE  (Optional) The start date for the cost query in YYYY-MM-DD format.
#               Defaults to the first day of the previous month if not provided.
#   METRIC      (Required) The cost metric to use (e.g., 'AmortizedCost',
#               'ActualCost').
#
# Requirements/Dependencies:
#   - Azure CLI (`az`): Must be installed and logged in.
#   - azure-cost-cli (`azure-cost`): Must be installed (`dotnet tool install --global azure-cost-cli`).
#   - jq: Must be installed for JSON processing.
#   - date: GNU `date` command assumed for default date calculation (may need adjustment for macOS/BSD).
#   - Environment Variable: `AZURE_TENANT_ID` must be set if login is required.
#   - Permissions: Sufficient Azure RBAC rights to list subscriptions and query cost data.
#
# Examples:
#   # Run with default start date (previous month start) and AmortizedCost
#   ./azure_cost_script.sh "$(date -d '-1 month' +'%Y-%m-01')" AmortizedCost
#
#   # Run for specific start date with ActualCost metric
#   ./azure_cost_script.sh 2024-07-15 ActualCost
#
# Exit Codes:
#   0: Success.
#   1: Argument error, dependency missing, login failure, or other execution error.
#   2: Invalid date format provided.
# ------------------------------------------------------------------------------

# --- Global Variables / Constants ---
# Set default metric, can be overridden by argument
readonly DEFAULT_METRIC="AmortizedCost"
# Define the resource types to filter for
readonly FILTER_STRING="ResourceType=Microsoft.Compute/virtualMachines;Microsoft.Web/serverfarms;Microsoft.Compute/virtualMachineScaleSets"
# Directory to store output files
readonly OUTPUT_DIR="cost_exports"
# Regex for YYYY-MM-DD date format validation
readonly DATE_REGEX="^[0-9]{4}-[0-9]{2}-[0-9]{2}$"

# --- Functions ---

# Usage: _log_info "message"
# Prints an informational message with timestamp to stdout.
_log_info() {
  local message="$1"
  echo "[INFO] - ${message}"
}

# Usage: _log_error "message" [exit_code]
# Prints an error message with timestamp to stderr. Exits with provided code or 1 if code > 0.
_log_error() {
  local message="$1"
  local exit_code="${2:-1}" # Default exit code is 1
  # Send errors to stderr
  echo "[ERROR] - ${message}" >&2
  # Exit only if exit_code is greater than 0
  if [[ "$exit_code" -gt 0 ]]; then
     exit "$exit_code"
  fi
}

# Usage: _print_usage
# Prints the usage instructions to stderr.
_print_usage() {
    echo "Usage: $0 [START_DATE] [METRIC]" >&2
    echo "Arguments:" >&2
    echo "  START_DATE  (Required) Start date in YYYY-MM-DD format." >&2
    echo "              Defaults to the first day of the previous month if not provided." >&2
    echo "  METRIC      (Required) Cost metric (e.g., 'AmortizedCost')." >&2
    echo "Example: $0 $(date -d '-1 month' +'%Y-%m-01') ActualCost" >&2
}


# Usage: _check_command "command_name" "installation_instructions"
# Checks if a command exists in PATH, exits with error if not.
_check_command() {
  local cmd="$1"
  local instructions="$2"
  if ! command -v "$cmd" >/dev/null 2>&1; then
    _log_error "Required command '$cmd' not found in PATH." 0 # Log error but don't exit yet
    _log_error "Installation instructions: $instructions" 1 # Now exit
  fi
  _log_info "✓ Dependency '$cmd' found."
}

# Checks for all required software dependencies.
_check_requirements() {
  _log_info "Checking required software..."
  _check_command "az" "Follow instructions at https://docs.microsoft.com/cli/azure/install-azure-cli"
  _check_command "azure-cost" "Run 'dotnet tool install --global azure-cost-cli'"
  _check_command "jq" "Install via package manager (e.g., 'apt install jq', 'brew install jq', 'choco install jq')"
  _check_command "date" "GNU 'date' command required for default start date calculation." # Check for date command
}

# Checks if logged into Azure CLI, attempts login if necessary.
# Requires AZURE_TENANT_ID environment variable if login is needed.
_check_login() {
  if az account show &>/dev/null; then
    _log_info "Already logged in to Azure."
  else
    _log_info "Not logged in to Azure. Attempting login..."
    # Check mandatory environment variable for login
    : "${AZURE_TENANT_ID:?Error: AZURE_TENANT_ID environment variable must be set for login. Exiting.}"

    # Attempt login (set -e will handle failure)
    az login --tenant "$AZURE_TENANT_ID"
    _log_info "Azure login successful."
  fi
}

# Fetches and saves cost data for a single subscription.
# Arguments:
#   $1: subscription_id
#   $2: subscription_name (sanitized)
#   $3: metric
#   $4: start_date
_execute_cost_commands() {
  local subscription_id="$1"
  local subscription_name="$2"
  local metric="$3"
  local start_date="$4" # Receive start_date as an argument
  local end_date
  local output_file
  local cost_data

  _log_info "Processing Subscription: '$subscription_name' (ID: $subscription_id)"

  # Set the current subscription context (errors handled by set -e)
  az account set --subscription "$subscription_id" &>/dev/null
  _log_info "Set context to subscription '$subscription_name'"

  # Create the output directory if it doesn't exist
  mkdir -p "$OUTPUT_DIR"

  # Define the output file path
  output_file="${OUTPUT_DIR}/${subscription_name}_${metric}_cost.csv"

  # Calculate the end date (today)
  end_date=$(date +%Y-%m-%d)

  _log_info "Fetching cost data..."
  _log_info "  Metric: $metric"
  _log_info "  Date Range: $start_date to $end_date" # Use passed start_date
  _log_info "  Filter: $FILTER_STRING"

  # Run azure-cost command (errors handled by set -e)
  # Capture output into a variable
  cost_data=$(azure-cost dailyCosts \
      --subscription "$subscription_id" \
      --dimension "PricingModel" \
      --metric "$metric" \
      --timeframe Custom \
      --from "$start_date" \
      --to "$end_date" \
      --filter "$FILTER_STRING" \
      --output csv)

  # Check if any cost data was retrieved (check if variable is non-empty)
  if [[ -n "$cost_data" ]]; then
    # Overwrite the file with new data
    echo "$cost_data" > "$output_file"
    _log_info "Cost data saved to '$output_file'"
  else
    # Use _log_info for warnings that don't stop execution
    _log_info "Warning: No cost data retrieved for '$subscription_name'. Output file not created/updated."
  fi

  _log_info "Completed processing for '$subscription_name'."
  echo "------------------------------------------------------------------" # Separator
}

# --- Main Logic ---
main() {
  local metric="${DEFAULT_METRIC}" # Default metric
  local start_date=""
  local subscriptions_json
  local subscription_json
  local subscription_id
  local subscription_name_raw
  local subscription_name_sanitized

  # --- Positional Argument Parsing ---
  if [[ -n "$1" ]]; then
    if [[ "$1" =~ $DATE_REGEX ]]; then
      start_date="$1"
    else
      _log_error "Error: Invalid start date format '$1'. Required format: YYYY-MM-DD." 2
      _print_usage # Print usage info
      exit 1
    fi
  else
    # Handle the case where the first argument (START_DATE) is missing
    start_date=$(date -d "-1 month" +%Y-%m-01)
    _log_info "Start date not provided, defaulting to first day of previous month: $start_date"
  fi

  if [[ -n "$2" ]]; then
    metric="$2"
  else
    _log_error "Error: Cost metric not provided." 1
    _print_usage
    exit 1
  fi

  _log_info "Using provided start date: $start_date"
  _log_info "Using cost metric: $metric"

  # Check dependencies and login status
  _check_requirements
  _check_login

  # Get all subscriptions in the tenant (errors handled by set -e)
  _log_info "Fetching all subscriptions..."
  subscriptions_json=$(az account list --all --output json)

  # Check if the JSON array is empty or null using jq
  if echo "$subscriptions_json" | jq -e '. == null or length == 0' > /dev/null; then
    _log_error "No subscriptions found or failed to retrieve subscriptions." 1
  fi

  _log_info "Iterating through subscriptions..."
  echo "------------------------------------------------------------------"

  # Iterate through each subscription using jq
  echo "$subscriptions_json" | jq -c '.[]' | while IFS= read -r subscription_json; do
    subscription_id=$(echo "$subscription_json" | jq -r '.id')
    subscription_name_raw=$(echo "$subscription_json" | jq -r '.name')

    # Sanitize subscription name for use in filenames
    subscription_name_sanitized=$(echo "$subscription_name_raw" | sed 's/[^a-zA-Z0-9_-]/_/g')

    # Skip subscriptions starting with "XSP"
    if [[ "$subscription_name_sanitized" == XSP* ]]; then
      _log_info "Skipping subscription: '$subscription_name_raw' (starts with XSP)"
      echo "------------------------------------------------------------------"
      continue # Skip to the next iteration
    fi

    # Execute the cost command function, passing the (potentially defaulted) start_date
    _execute_cost_commands "$subscription_id" "$subscription_name_sanitized" "$metric" "$start_date"

  done # End of while loop

  echo "------------------------------------------------------------------"
  _log_info "Script completed successfully."
  echo "------------------------------------------------------------------"
}

# --- Script Entrypoint ---
# Pass all script arguments ($@) to the main function
main "$@"

exit 0

After generating the exports you want, let’s make a Jupyter notebook and display our data! I’ve made the different parts into individual cells to keep things modular as I try things out. First, let’s load and combine the datasets.

import os
import pandas as pd

def load_and_combine_data():
    try:
        # Step 1: Load and Combine Data
        folder_path = 'cost_exports/'  # Specify the folder containing the CSV files

        # List all files in the folder
        all_files = os.listdir(folder_path)
        print("All files in the folder:", all_files)  # Debug: Display all files in the folder

        # Filter files based on suffixes
        amortized_files = [os.path.join(folder_path, f) for f in all_files if f.endswith('_AmortizedCost_cost.csv')]
        print("Filtered amortized files:", amortized_files) 

        actual_files = [os.path.join(folder_path, f) for f in all_files if f.endswith('_ActualCost_cost.csv')]
        print("Filtered actual files:", actual_files)  

        print(f"Found {len(amortized_files)} amortized files, {len(actual_files)} actual files ")

        #Load the data
        amortized_dfs = []
        for file in amortized_files:
            df = pd.read_csv(file)
            df['Date'] = pd.to_datetime(df['Date'], format='%m/%d/%Y')  # Convert date using the correct format
            df['SourceFile'] = os.path.basename(file)  # Add the source file name as a column
            amortized_dfs.append(df)

        actual_dfs = []
        for file in actual_files:
            df = pd.read_csv(file)
            df['Date'] = pd.to_datetime(df['Date'], format='%m/%d/%Y') # Convert date using the correct format
            df['SourceFile'] = os.path.basename(file)  # Add the source file name as a column
            actual_dfs.append(df)

        # Combine all amortized cost files into one DataFrame
        if amortized_dfs:
            amortized_data = pd.concat(amortized_dfs, ignore_index=True)
            print(f"Combined amortized data shape: {amortized_data.shape}")
        else:
            amortized_data = pd.DataFrame()
            print("No amortized data found")

        if actual_dfs:
            actual_data = pd.concat(actual_dfs, ignore_index=True)
            print(f"Combined actual data shape: {actual_data.shape}")
        else:
            actual_data = pd.DataFrame()
            print("No actual data found")

        return amortized_data, actual_data

    except Exception as e:
        print(f"Error in load_data_from_azure_blob: {e}")
        return pd.DataFrame(), pd.DataFrame()

# Call the function to load the data
amortized_data, actual_data = load_and_combine_data()

# Now you can process these dataframes as needed.
# Create a comparison of amortized and actual data.

# Ensure Date is in datetime format
actual_data['Date'] = pd.to_datetime(actual_data['Date'])
amortized_data['Date'] = pd.to_datetime(amortized_data['Date'])

# Summarize actual costs per month
actual_monthly_cost = actual_data.groupby(actual_data['Date'].dt.to_period('M'))['Cost'].sum()

# Summarize amortized costs per month
amortized_monthly_cost = amortized_data.groupby(amortized_data['Date'].dt.to_period('M'))['Cost'].sum()

# Combine actual and amortized costs for comparison
comparison_df = pd.DataFrame({
    'Amortized Cost': amortized_monthly_cost.round(0).astype(int),
    'Actual Cost': actual_monthly_cost.round(0).astype(int)
}).fillna(0)  # Fill NaN values with 0 for months where data might be missing

# Calculate savings per month
comparison_df['Savings'] = comparison_df['Amortized Cost'] - comparison_df['Actual Cost']

# Specify the date range for total savings calculation
start_date = '2024-08'  # Adjust as needed
end_date = '2025-04'    # Adjust as needed

# Filter the data for the specified date range
filtered_df = comparison_df.loc[start_date:end_date]

# Calculate total savings within the date range
total_savings = filtered_df['Savings'].sum()

# Print results
print("Monthly Costs and Savings Comparison in NOK:")
print(comparison_df)

print(f"\nTotal Savings from {start_date} to {end_date}: {total_savings} NOK")

Example output: here, I started purchasing savings plans in... can you guess which month?

The number of hours varies per month, so the total amount committed (as it is billed by the hour) will not be the same each month.
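A quick sketch of the arithmetic (the hourly rate is hypothetical):

# Why the monthly total of an hourly commitment varies (hypothetical rate).
import calendar

hourly_commitment = 10.0  # NOK/hour committed
for year, month in [(2024, 2), (2024, 4), (2024, 8)]:
    days = calendar.monthrange(year, month)[1]  # number of days in the month
    hours = days * 24
    print(f"{year}-{month:02d}: {hours} hours -> {hours * hourly_commitment:,.0f} NOK")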

I’m no data scientist, but let’s turn that into a graph.

import matplotlib.pyplot as plt

# Prepare data for visualization
comparison_df.reset_index(inplace=True)  # Reset index to use Date as a column
comparison_df['Date'] = comparison_df['Date'].astype(str)  # Convert to string for better labeling

# Plot the data
plt.figure(figsize=(12, 6))
bar_width = 0.25
x = range(len(comparison_df['Date']))

# Bar plots for Actual Cost, Amortized Cost, and Savings
plt.bar(x, comparison_df['Actual Cost'], width=bar_width, label='Actual Cost (NOK)', color='#a6cee3')
plt.bar([p + bar_width for p in x], comparison_df['Amortized Cost'], width=bar_width, label='Amortized Cost (NOK)', color='#fdbf6f')
plt.bar([p + 2 * bar_width for p in x], comparison_df['Savings'], width=bar_width, label='Savings (NOK)', color='#b2df8a')

# Add labels and title
plt.xlabel('Month', fontsize=12)
plt.ylabel('Cost (NOK)', fontsize=12)
plt.title('Monthly Comparison of Actual Cost, Amortized Cost and Savings', fontsize=14)
plt.xticks([p + bar_width for p in x], comparison_df['Date'], rotation=45)
plt.legend()

# Show the plot
plt.tight_layout()
plt.show()

Let’s take a look at the pricing models and explore some things you can play around with.

# Function to analyze pricing models and provide insights
def analyze_pricing_models(amortized_data):
    """Analyze pricing models including OnDemand, Reservations, and other types"""

    print("=== Azure Cost Analysis ===\n")

    # 1. Check OnDemand Usage
    zero_ondemand_months = amortized_data[amortized_data['Name'] == 'OnDemand'].groupby(
        amortized_data['Date'].dt.to_period('M')
    )['Cost'].sum()
    zero_ondemand_months = zero_ondemand_months[zero_ondemand_months == 0]

    if not zero_ondemand_months.empty:
        print("⚠️ Warning: Savings plan commitment exceeded actual usage in:")
        print(list(zero_ondemand_months.index))

    # 2. Analyze All Pricing Models
    total_costs = amortized_data.groupby('Name')['Cost'].sum()
    total_cost = total_costs.sum()
    pricing_model_percentages = (total_costs / total_cost) * 100

    # 3. Check for Reservations
    reservation_terms = ['Reserved', 'RI', 'Reservation']
    has_reservations = amortized_data['Name'].str.contains('|'.join(reservation_terms), 
                                                         case=False, 
                                                         na=False).any()

    # 4. Print Analysis Results
    print("\n📊 Cost Distribution by Pricing Model:")
    model_df = pricing_model_percentages.to_frame(name="Percentage (%)")
    model_df['Cost (NOK)'] = total_costs
    print(model_df)

    if has_reservations:
        reservation_data = amortized_data[
            amortized_data['Name'].str.contains('|'.join(reservation_terms), 
                                              case=False, 
                                              na=False)
        ]
        print("\n🔍 Reservation Details:")
        print(reservation_data.groupby('Name')['Cost'].agg(['count', 'sum']))

    # 5. Optimization Suggestions
    print("\n💡 Optimization Suggestions:")
    if pricing_model_percentages.get('SavingsPlan', 0) < 50:
        print("- Consider increasing savings plan commitment (currently <50%)")
    if not has_reservations:
        print("- No reservations found. Consider Reserved Instances for predictable workloads")

    # 6. Visualize Distribution
    plt.figure(figsize=(10, 6))

    # Pie chart
    plt.pie(
        pricing_model_percentages,
        labels=pricing_model_percentages.index,
        autopct='%1.1f%%',
        startangle=140,
        colors=['#a6cee3', '#fdbf6f', '#b2df8a', '#fb9a99']
    )
    plt.title("Cost Distribution by Pricing Model")

    plt.tight_layout()
    plt.show()

# Run the analysis
analyze_pricing_models(amortized_data)

All this will give you insight and overview, but if you want to use it to calculate the ideal commitment level, you need to consider that not all compute SKUs are covered by Savings Plan for Compute, meaning some of the compute resources that generate cost cannot be covered by savings plans. I have requested the ability to filter on SKUs in the azure-cost-cli tool and have my fingers crossed for the functionality. I mentioned that tags can be useful, but I haven’t found that the tool supports grouping by tag and retrieving the PricingModel in one go. For that, I had to use dailyCosts.

The next step, after I decide on everything I’d like to see and how to illustrate it, will most likely be integrating it into our Azure platform portal, running on Azure Container Apps, where we like to centralize self-service operations and insights into Azure information that is not easily available through the Azure portal. But it is just as possible to generate reports in a pipeline and, for instance, publish markdown reports with visuals to a specific page in Confluence, or simply send the reports by mail. Imagination is the limit.
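As a small sketch of the pipeline idea, pandas can render the comparison table from earlier as Markdown, ready to publish to a wiki page or drop into a mail (to_markdown requires the tabulate package):

# Render the monthly comparison as a Markdown report (requires `tabulate`).
markdown_report = comparison_df.to_markdown(index=False)

with open("cost_report.md", "w") as f:
    f.write("# Monthly cost and savings (NOK)\n\n")
    f.write(markdown_report + "\n")

print(markdown_report)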

Recommendations:

  • Get started by trying out the tool and seeing what insights you can get.

  • Determine what you want to do with the data.

  • Assess your landscape and future plans. Start with Reserved Instances, because you can get higher discounts with RIs and because it is easier to right-size a reservation when you know the usage pattern and the future plans for individual resources. Plan to combine that commitment with Savings Plan for Compute.

  • After getting a clear view of your past and future usage, make an Azure reservation commitment. Then, after monitoring your usage with the reservation in place, decide how much of the remaining compute should form the basis of your savings plan commitment (a rough sizing sketch follows after this list).

  • Keep monitoring and adding more commitments.

  • Explore other cost-saving measures and focus on right-sizing your resources.
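For the savings plan sizing step, one rough heuristic (my own illustration, not Microsoft’s recommendation algorithm) is to take a low percentile of your daily on-demand compute spend and convert it to an hourly figure, then commit only to that floor and layer more plans on top as real utilization data comes in:

# Rough, conservative floor for an hourly savings plan commitment.
# Heuristic for illustration only -- not Microsoft's recommendation logic.
# Uses the amortized_data frame loaded earlier in the notebook.
ondemand_daily = (
    amortized_data[amortized_data["Name"] == "OnDemand"]
    .groupby("Date")["Cost"]
    .sum()
)

conservative_daily = ondemand_daily.quantile(0.10)  # 10th percentile of daily spend
hourly_floor = conservative_daily / 24

# Note: the commitment is priced at the discounted rate, so the amount you
# actually commit can be lower than this pay-as-you-go-equivalent figure.
print(f"Conservative starting point: ~{hourly_floor:,.2f} NOK/hour")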
