Playing Around With Dictionaries in Python

Jason
8 min read

For work, I recently had an interesting project: passing multiple AWS Backup plans into a Python class that creates them all. Each plan has so many options to configure that the most logical way to pass the values through was a dictionary. It's a bit like a struct in Go.
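To make the struct comparison concrete, here is a minimal sketch using a TypedDict. The BackupPlanConfig type is hypothetical and only for illustration; the code in this post uses plain dictionaries.

```python
from typing import TypedDict


class BackupPlanConfig(TypedDict):
    """Hypothetical type describing one backup plan entry."""

    name: str
    rule_name: str
    schedule: str
    delete_after_days: int


# A dictionary of named plans, each with the same set of keys.
backup_configs: dict[str, BackupPlanConfig] = {
    "hourly": {
        "name": "bkp-hourly-ret7-jc",
        "rule_name": "hourly-backup-rule",
        "schedule": "cron(0 * * * ? *)",
        "delete_after_days": 7,
    },
}

# Fields are read by key, much like struct fields are read by name.
hourly = backup_configs["hourly"]
print(hourly["rule_name"], hourly["delete_after_days"])
```

The type hints are not enforced at runtime; they only help editors and type checkers catch a misspelled key.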

When approaching this task, the first milestone was to get Python to read the dictionary. I did this by hardcoding the dictionaries into my Python class and using a toggle to select one of them, to check that Python was reading them correctly.

The code below shows the backup_type toggle being passed in and the hardcoded dictionary.

import pulumi_aws as aws


class BackupPlanCreator:
    def __init__(self, vault_name: str, base_tags: dict, backup_type: str):
        self.vault_name = vault_name
        self.base_tags = base_tags
        self.backup_type = backup_type

        # Create the backup vault
        self.backup_vault = aws.backup.Vault(
            resource_name="backupVault",  # Unique resource name
            name=self.vault_name,  # Actual vault name
            tags=self.base_tags,
        )

        # Define configurations for each backup type
        self.backup_configs = {
            "hourly": {
                "name": "bkp-hourly-ret7-jc",
                "rule_name": "hourly-backup-rule",
                "target_vault_name": self.backup_vault.name,
                "schedule": "cron(0 * * * ? *)",  # Every hour
                "delete_after_days": 7,
                "start_window": 60,
                "completion_window": 120,
                "enable_windows_vss": True,
            },
            "daily": {
                "name": "bkp-daily-ret14-jc",
                "rule_name": "daily-backup-rule",
                "target_vault_name": self.backup_vault.name,
                "schedule": "cron(0 5 ? * * *)",  # Daily at 5 AM
                "delete_after_days": 30,
                "start_window": 60,
                "completion_window": 120,
                "enable_windows_vss": True,
            },
            "weekly": {
                "name": "bkp-weekly-ret365-jc",
                "rule_name": "weekly-backup-rule",
                "target_vault_name": self.backup_vault.name,
                "schedule": "cron(0 6 ? * 2 *)",  # Every Monday at 6 AM
                "delete_after_days": 365,
                "start_window": 60,
                "completion_window": 120,
                "enable_windows_vss": True,
            },
            "monthly": {
                "name": "bkp-monthly-ret365-jc",
                "rule_name": "monthly-backup-rule",
                "target_vault_name": self.backup_vault.name,
                "schedule": "cron(0 6 1 * ? *)",  # 1st of every month at 6 AM
                "delete_after_days": 365,
                "start_window": 60,
                "completion_window": 120,
                "enable_windows_vss": True,
            },
        }
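A side note on the schedule strings: AWS cron expressions have six fields, in the order minutes, hours, day of month, month, day of week, and year. The sketch below splits an expression into named fields so the difference between a weekly and a monthly schedule is easier to see. The describe_cron helper is hypothetical and not part of any AWS or Pulumi library.

```python
# Field order used by AWS cron expressions.
CRON_FIELDS = ("minutes", "hours", "day_of_month", "month", "day_of_week", "year")


def describe_cron(expression: str) -> dict:
    """Split an AWS-style 'cron(...)' expression into named fields."""
    if not (expression.startswith("cron(") and expression.endswith(")")):
        raise ValueError(f"Not a cron expression: {expression!r}")
    parts = expression[len("cron("):-1].split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"Expected {len(CRON_FIELDS)} fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


# A weekly schedule pins day_of_week (2 is Monday, since 1 is Sunday).
weekly = describe_cron("cron(0 6 ? * 2 *)")
# A monthly schedule pins day_of_month instead.
monthly = describe_cron("cron(0 6 1 * ? *)")
print(weekly["day_of_week"], monthly["day_of_month"])
```

The "?" placeholder goes in whichever of the two day fields is not being pinned.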

Below is the rest of the same class, which looks up the selected configuration in the dictionary and builds the backup_rules for the final plan.

        # Get the configuration for the specified backup type
        config = self.backup_configs.get(self.backup_type)
        if not config:
            raise ValueError(
                "Invalid backup type specified. Choose from: 'hourly', 'daily', 'weekly', 'monthly'."
            )

        # Define the backup rules using the configuration from the dictionary
        backup_rules = [
            {
                "rule_name": config["rule_name"],
                "target_vault_name": self.backup_vault.name,
                "completion_window": config["completion_window"],
                "copy_actions": [
                    {
                        "destination_vault_arn": self.backup_vault.arn,
                        "lifecycle": {
                            "cold_storage_after": 0,
                            "delete_after": config["delete_after_days"],
                            "opt_in_to_archive_for_supported_resources": False,
                        },
                    }
                ],
                "enable_continuous_backup": False,
                "lifecycle": {
                    "cold_storage_after": 0,
                    "delete_after": config["delete_after_days"],
                    "opt_in_to_archive_for_supported_resources": False,
                },
                "schedule": config["schedule"],
                "schedule_expression_timezone": "UTC",
                "start_window": config["start_window"],
            }
        ]

        # Define the advanced backup settings
        advanced_backup_settings = [
            {
                "backup_options": {
                    "WindowsVSS": (
                        "enabled" if config["enable_windows_vss"] else "disabled"
                    ),
                },
                "resource_type": "EC2",
            }
        ]

As you can see, no for loop is needed yet, as I am only checking that Python can read through the dictionaries and select the correct one. I then pass backup_type to the BackupPlanCreator class as a string. Here it is "daily", which selects the daily backup plan from the dictionary.

# Create the backup plan
backup_creator = backup.BackupPlanCreator(vault_name, base_tags, "daily")
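The lookup-and-validate step can be tried on its own, without Pulumi. This is a minimal sketch, assuming a standalone helper called select_config (a hypothetical name) and a cut-down dictionary. Unlike the class above, it builds the list of valid names from the dictionary keys instead of hardcoding them.

```python
def select_config(backup_configs: dict, backup_type: str) -> dict:
    """Return the configuration for backup_type, or raise a helpful error."""
    config = backup_configs.get(backup_type)
    if config is None:
        valid = ", ".join(repr(name) for name in backup_configs)
        raise ValueError(
            f"Invalid backup type {backup_type!r}. Choose from: {valid}."
        )
    return config


configs = {
    "hourly": {"rule_name": "hourly-backup-rule", "delete_after_days": 7},
    "daily": {"rule_name": "daily-backup-rule", "delete_after_days": 30},
}

print(select_config(configs, "daily")["rule_name"])

try:
    select_config(configs, "yearly")
except ValueError as err:
    print(err)
```

Using .get() returns None for an unknown key instead of raising a KeyError, which is what lets the code raise its own, friendlier error.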

This is great and shows me that I can process a dictionary. However, I wanted to take this a step further and pass the dictionary around. To do that, I moved the dictionary out of the class and into a variable, to see if I could pass it in and get the same result.

import pulumi
import pulumi_aws as aws
import jc_aws_backup as backup

# Define configurations for each backup type.
# target_vault_name is no longer set here: self does not exist outside the
# class, and the class fills it in from the vault it creates.
backup_configs = {
    "hourly": {
        "name": "bkp-hourly-ret7-jc",
        "rule_name": "hourly-backup-rule",
        "schedule": "cron(0 * * * ? *)",  # Every hour
        "delete_after_days": 7,
        "start_window": 60,
        "completion_window": 120,
        "enable_windows_vss": True,
    },
    "daily": {
        "name": "bkp-daily-ret14-jc",
        "rule_name": "daily-backup-rule",
        "schedule": "cron(0 5 ? * * *)",  # Daily at 5 AM
        "delete_after_days": 30,
        "start_window": 60,
        "completion_window": 125,
        "enable_windows_vss": True,
    },
    "weekly": {
        "name": "bkp-weekly-ret365-jc",
        "rule_name": "weekly-backup-rule",
        "schedule": "cron(0 6 ? * 2 *)",  # Every Monday at 6 AM
        "delete_after_days": 365,
        "start_window": 60,
        "completion_window": 120,
        "enable_windows_vss": True,
    },
    "monthly": {
        "name": "bkp-monthly-ret365-jc",
        "rule_name": "monthly-backup-rule",
        "schedule": "cron(0 6 1 * ? *)",  # 1st of every month at 6 AM
        "delete_after_days": 365,
        "start_window": 60,
        "completion_window": 120,
        "enable_windows_vss": True,
    },
}

# Create the backup plan (vault_name and base_tags are defined elsewhere)
backup_creator = backup.BackupPlanCreator(vault_name, base_tags, backup_configs, "daily")

And here is the class, which now takes the dictionary as a parameter.

import pulumi
import pulumi_aws as aws


class BackupPlanCreator:
    def __init__(
        self, vault_name: str, base_tags: dict, backup_configs: dict, backup_type: str
    ):
        self.base_tags = base_tags
        self.backup_type = backup_type
        self.backup_configs = backup_configs

        # Create the backup vault
        self.backup_vault = aws.backup.Vault(
            vault_name,
            tags=self.base_tags,
        )

        config = self.backup_configs.get(self.backup_type)
        if not config:
            raise ValueError(
                "Invalid backup type specified. Choose from: 'hourly', 'daily', 'weekly', 'monthly'."
            )
        # Define the backup rules using the configuration from the dictionary
        backup_rules = []
        for _, backup_details in self.backup_configs.items():
            backup_rules.append(
                aws.backup.PlanRuleArgs(
                    rule_name=backup_details["rule_name"],
                    target_vault_name=self.backup_vault.name,
                    schedule=backup_details["schedule"],
                    lifecycle=aws.backup.PlanRuleLifecycleArgs(
                        delete_after=backup_details["delete_after_days"]
                    ),
                    start_window=backup_details["start_window"],
                    completion_window=backup_details["completion_window"],
                    recovery_point_tags=self.base_tags,
                )
            )

        # Define the advanced backup settings
        advanced_backup_settings = []
        for _, backup_details in self.backup_configs.items():
            advanced_backup_settings.append(
                aws.backup.PlanAdvancedBackupSettingArgs(
                    backup_options={
                        "WindowsVSS": (
                            "enabled"
                            if backup_details["enable_windows_vss"]
                            else "disabled"
                        )
                    },
                    resource_type="EC2",
                )
            )

        # Create the backup plan, named after the selected backup type rather
        # than the loop variable left over from the loops above
        self.plan_resource = aws.backup.Plan(
            resource_name=config["rule_name"],
            rules=backup_rules,
            advanced_backup_settings=advanced_backup_settings,
            tags=self.base_tags,
        )

        # Export the backup plan ID
        pulumi.export("backupPlanId", self.plan_resource.id)

I also added a for loop here because my logic will change later: I will want all four backup plans created at once from the dictionary. More importantly, this showed me that I could pass a dictionary around, and if I could do that, I could also define it in the central Pulumi YAML file and load it into the class as a JSON object. When working with Pulumi, the idea is to pass everything through one central YAML file that is linked to each environment stack. That is always the end goal. I then changed my __main__.py code to look like this.

import pulumi
import pulumi_aws as aws
import json 
import jc_aws_backup as backup

backup_configs_json = pulumi.Config().require("backupConfigs")
vault_name = f"bkp-vault-{project_profile}-{project_short_region}"[:42]

# Load the backup configs from the YAML file; they arrive as a JSON string.
# Wrap the parsing in error handling so you can see where in the JSON it fails.
try:
    backup_configs = json.loads(backup_configs_json)
except json.JSONDecodeError as e:
    raise ValueError(f"Invalid JSON in backupConfigs: {e}") from e

# Create the backup plan
backup_creator = backup.BackupPlanCreator(
    vault_name=vault_name,
    base_tags=base_tags,
    backup_type="daily",
    backup_configs=backup_configs,
)

I also added some error handling. When I first passed the JSON through from the YAML file, I got the unhandled exception shown below, and it was not much help in figuring out where in the JSON the parsing was failing.

Diagnostics:
  pulumi:pulumi:Stack (jcontent-dev-v2-jcontent_dev_v2):
    error: Program failed with an unhandled exception:
    Traceback (most recent call last):
      File "C:\DevOps\infrastructure.pulumi\iac-development\jcontent-dev-v2\__main__.py", line 70, in <module>
        backup_configs = json.loads(backup_configs_json)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Python312\Lib\json\__init__.py", line 346, in loads
        return _default_decoder.decode(s)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Python312\Lib\json\decoder.py", line 337, in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Python312\Lib\json\decoder.py", line 355, in raw_decode
        raise JSONDecodeError("Expecting value", s, err.value) from None
    json.decoder.JSONDecodeError: Expecting value: line 23 column 27 (char 602)

With my own error handling, the message names the backupConfigs value and shows where in the JSON object the parsing is failing.
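To see what the wrapped error looks like without running Pulumi, here is a small standalone sketch. The load_backup_configs helper and the deliberately broken JSON string are made up for illustration; lineno, colno, and msg are standard attributes of json.JSONDecodeError.

```python
import json


def load_backup_configs(raw: str) -> dict:
    """Parse the JSON string, naming the config key if parsing fails."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as e:
        # e.lineno and e.colno locate the problem inside the JSON text.
        raise ValueError(
            f"Invalid JSON in backupConfigs at line {e.lineno}, "
            f"column {e.colno}: {e.msg}"
        ) from e


# The value for delete_after_days is missing on the second line.
broken = '{\n  "hourly": {"delete_after_days": }\n}'

try:
    load_backup_configs(broken)
except ValueError as err:
    print(err)
```

Raising with "from e" keeps the original JSONDecodeError attached, so nothing is lost if you do need the full traceback.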
I have also included the YAML file below to show how to pass in the object. This stumped me for a little while, and having an example would have helped.

encryptionsalt: VALUE
config:
  aws:region: us-east-2
  environment: jc_testing_pulumi
  backupConfigs: |
    {
      "hourly": {
        "name": "bkp-hourly-ret7-jc",
        "rule_name": "hourly-backup-rule",
        "schedule": "cron(0 * * * ? *)",
        "delete_after_days": 7,
        "start_window": 60,
        "completion_window": 120,
        "enable_windows_vss": true
      },
      "daily": {
        "name": "bkp-daily-ret14-jc",
        "rule_name": "daily-backup-rule",
        "schedule": "cron(0 5 ? * * *)",
        "delete_after_days": 30,
        "start_window": 60,
        "completion_window": 125,
        "enable_windows_vss": true
      },
      "weekly": {
        "name": "bkp-weekly-ret365-jc",
        "rule_name": "weekly-backup-rule",
        "schedule": "cron(0 6 ? * 2 *)",
        "delete_after_days": 365,
        "start_window": 60,
        "completion_window": 120,
        "enable_windows_vss": true
      },
      "monthly": {
        "name": "bkp-monthly-ret365-jc",
        "rule_name": "monthly-backup-rule",
        "schedule": "cron(0 6 1 * ? *)",
        "delete_after_days": 365,
        "start_window": 60,
        "completion_window": 120,
        "enable_windows_vss": true
      }
    }
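Because the JSON is typed by hand in a YAML file, it is easy to leave a key out. The sketch below is one possible check to run after json.loads and before any resources are created. The missing_keys helper is hypothetical, and the required key list is an assumption taken from the keys used in the JSON above.

```python
import json

# Keys every plan is assumed to need, taken from the JSON above.
REQUIRED_KEYS = {
    "name", "rule_name", "schedule", "delete_after_days",
    "start_window", "completion_window", "enable_windows_vss",
}


def missing_keys(backup_configs: dict) -> dict:
    """Map each plan name to the sorted list of required keys it lacks."""
    problems = {}
    for plan_name, details in backup_configs.items():
        missing = sorted(REQUIRED_KEYS - set(details))
        if missing:
            problems[plan_name] = missing
    return problems


# "hourly" is complete; "daily" is deliberately cut short.
raw = """
{
  "hourly": {
    "name": "bkp-hourly-ret7-jc",
    "rule_name": "hourly-backup-rule",
    "schedule": "cron(0 * * * ? *)",
    "delete_after_days": 7,
    "start_window": 60,
    "completion_window": 120,
    "enable_windows_vss": true
  },
  "daily": {"name": "bkp-daily-ret14-jc", "rule_name": "daily-backup-rule"}
}
"""
print(missing_keys(json.loads(raw)))
```

An empty result means every plan has all the keys the class reads.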

This prepared me for the final hurdle. What I actually want is for the backup plan class to create all four backup plans in one run and name them accordingly. To achieve this, I first removed backup_type from my class and then moved the aws.backup.Plan() call inside the for loop. I still needed each plan to get its own name; the code below shows how I did that.

import json
import pulumi
import pulumi_aws as aws


class BackupPlanCreator:
    def __init__(self, vault_name: str, base_tags: dict, backup_configs: dict):
        self.base_tags = base_tags
        self.backup_configs = backup_configs

        # Create the backup vault
        self.backup_vault = aws.backup.Vault(
            vault_name,
            tags=self.base_tags,
        )

        # Loop through each backup configuration and create a separate backup plan
        for backup_name, backup_details in self.backup_configs.items():
            # Create one plan per entry, named after its dictionary key
            backup_plan = aws.backup.Plan(
                resource_name=f"{backup_name}-backup-plan",
                rules=[
                    {
                        "rule_name": backup_details["rule_name"],
                        "target_vault_name": self.backup_vault.name,
                        "schedule": backup_details["schedule"],
                        "lifecycle": {
                            "delete_after": backup_details["delete_after_days"],
                        },
                        "start_window": backup_details["start_window"],
                        "completion_window": backup_details["completion_window"],
                        "recovery_point_tags": self.base_tags,
                    }
                ],
                # Define the advanced backup settings
                advanced_backup_settings=[
                    {
                        "backup_options": {
                            "WindowsVSS": (
                                "enabled"
                                if backup_details["enable_windows_vss"]
                                else "disabled"
                            ),
                        },
                        "resource_type": "EC2",
                    }
                ],
                tags=self.base_tags,
            )
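The naming comes from iterating over .items(), which yields each dictionary key together with its value. Here is the same pattern on its own, as a sketch with no Pulumi calls. The plan_resource_names helper is hypothetical and the dictionary is cut down to two entries.

```python
def plan_resource_names(backup_configs: dict) -> dict:
    """Map each plan's resource name to the rule name it would contain."""
    names = {}
    for backup_name, backup_details in backup_configs.items():
        # Same f-string as the resource_name argument in the class above.
        names[f"{backup_name}-backup-plan"] = backup_details["rule_name"]
    return names


configs = {
    "hourly": {"rule_name": "hourly-backup-rule"},
    "weekly": {"rule_name": "weekly-backup-rule"},
}

for resource_name, rule_name in plan_resource_names(configs).items():
    print(resource_name, "->", rule_name)
```

Since dictionary keys are unique, every plan gets a unique resource name, which Pulumi requires for resources of the same type.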

The call to this class is similar to before, just without backup_type being passed.

import pulumi
import pulumi_aws as aws
import json 
import jc_aws_backup as backup

backup_configs_json = pulumi.Config().require("backupConfigs")
vault_name = f"bkp-vault-{project_profile}-{project_short_region}"[:42]

# Load the backup configs from the YAML file; they arrive as a JSON string.
# Wrap the parsing in error handling so you can see where in the JSON it fails.
try:
    backup_configs = json.loads(backup_configs_json)
except json.JSONDecodeError as e:
    raise ValueError(f"Invalid JSON in backupConfigs: {e}") from e

# Create the backup plan
backup_creator = backup.BackupPlanCreator(
    vault_name=vault_name,
    base_tags=base_tags,
    backup_configs=backup_configs,
)

And that’s it. That’s how I managed to pass a dictionary around and evolve it into a useful JSON object that creates multiple backup plans in a single run.

Next, I will build on this dictionary and add some functionality that you see in Terraform: if you want to use default values, you can, and if you want to override them, you can. Right now every value in my dictionary has to be passed through, but an engineer will not want to do that. They will probably only want to configure certain parts of the dictionary from the YAML file.
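As a rough preview of the idea, and not necessarily how I will end up doing it, dictionary unpacking can layer overrides on top of defaults. The DEFAULTS values and the with_defaults helper below are assumptions for illustration only.

```python
# Assumed default values, taken from the settings most plans share.
DEFAULTS = {
    "start_window": 60,
    "completion_window": 120,
    "enable_windows_vss": True,
}


def with_defaults(overrides: dict) -> dict:
    """Return a new dict: defaults first, then any values the engineer set."""
    return {**DEFAULTS, **overrides}


daily = with_defaults({"rule_name": "daily-backup-rule", "completion_window": 125})
print(daily)
```

Keys on the right-hand side win, so only completion_window changes here and DEFAULTS itself is left untouched.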

But that is a cliffhanger for another blog post, another time.

Till next time, happy coding.


Written by

Jason

I am a cloud engineer, and I specialise in writing Go and Python to interact with the various cloud SDKs in AWS and Azure, plus some GCP and Hetzner.