AWS RDS Concept Essentials

Mohamed El Eraki
17 min read

Inception

Hello everyone. This article is part of the Terraform + AWS series, but it does not depend on any previous article. I use this series to publish AWS + Terraform projects and knowledge.

This article is based on practical experience and the AWS documentation. It summarizes, collects, and lists the important points.


Overview

Hello Gurus. Amazon RDS (Amazon Relational Database Service) is a web service that makes it easier to set up, operate, and scale a relational database in the AWS Cloud. It provides resizable capacity, automated backups, high availability and failover mechanisms, support for multiple database engines, and management of common database administration tasks.

💡
Why do you want to run a relational database in the AWS Cloud? Because AWS takes over many of the difficult and tedious management tasks of a relational database.

Amazon EC2 and on-premises databases

For a relational database on an on-premises server, you assume full responsibility for the server, operating system, and software. For a database on an Amazon EC2 instance, AWS manages the layers below the operating system. However, you still manage every database aspect yourself, for example the database engine, high availability, backups, and DB patching.

💡
Amazon EC2 isn't a fully managed service. Thus, when you run a database on Amazon EC2, you're more prone to user errors. For example, when you update the operating system or database software manually, you might accidentally cause application downtime. You might spend hours checking every change to identify and fix an issue.

Amazon RDS and Amazon EC2

Amazon RDS is a managed database service. It's responsible for most management tasks. By eliminating tedious manual tasks, Amazon RDS frees you to focus on your application and your users. AWS recommends Amazon RDS over Amazon EC2 as your default choice for most database deployments.

The following table compares Amazon EC2 and Amazon RDS.

Amazon RDS Advantages and Features

  • You can use the database products you are already familiar with: Db2, MariaDB, Microsoft SQL Server, MySQL, Oracle, and PostgreSQL.

  • Amazon RDS manages backups, software patching, automatic failure detection, and recovery.

  • You can turn on automated backups, or manually create your own backup snapshots. You can use these backups to restore a database. The Amazon RDS restore process works reliably and efficiently.

  • You can get high availability with a primary instance and a synchronous secondary instance that you can failover to when problems occur. You can also use read replicas to increase read scaling.

  • In addition to the security in your database package, you can help control who can access your RDS databases. To do so, you can use AWS Identity and Access Management (IAM) to define users and permissions. You can also help protect your databases by putting them in a virtual private cloud (VPC).

  • You can secure the master user credentials with AWS Secrets Manager.

The following image shows a typical use case of a dynamic website that uses Amazon RDS for database storage. AWS routes user traffic through Elastic Load Balancing, which forwards the requests to application servers. These application servers interact with RDS DB instances. The application servers and DB instances reside in different Availability Zones (AZs) within the same virtual private cloud (VPC). The primary DB instance replicates to a standby DB instance. Both DB instances are in private subnets within the VPC, which means that Internet users can't access them directly.

💡
The standby database doesn't accept application requests. If you need a readable copy, you can create a read replica instead of a standby. In a Multi-AZ deployment, Amazon RDS manages replication and failover in the background. With a read replica, you manage the failover yourself and accept that some data might be lost, because replication is asynchronous.


RDS Database engines

A DB engine is the specific relational database software that runs on your DB instance. Amazon RDS currently supports the following engines:

  • Db2

  • MariaDB

  • Microsoft SQL Server

  • MySQL

  • Oracle

  • PostgreSQL


Amazon RDS DB instances

A DB instance is an isolated database environment running in the cloud. It is the basic building block of Amazon RDS. A DB instance can contain multiple user-created databases, and can be accessed using the same client tools and applications you might use to access a standalone database instance. DB instances are simple to create and modify with the AWS command line tools, Amazon RDS API operations, or the AWS Management Console.

You can have up to 40 Amazon RDS DB instances, with the following limitations:

  • 10 for each SQL Server edition (Enterprise, Standard, Web, and Express) under the "license-included" model

  • 10 for Oracle under the "license-included" model

  • 40 for Db2 under the "bring-your-own-license" (BYOL) licensing model

  • 40 for MySQL, MariaDB, or PostgreSQL

  • 40 for Oracle under the "bring-your-own-license" (BYOL) licensing model


Database instance classes

A DB instance class determines the computation and memory capacity of a DB instance. A DB instance class consists of both the DB instance type and the size. Each instance type offers different compute, memory, and storage capabilities. For example, db.m6g is a general-purpose DB instance type powered by AWS Graviton2 processors. Within the db.m6g instance type, db.m6g.2xlarge is a DB instance class.
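The naming scheme above can be illustrated with a small shell sketch. The function names `instance_type` and `instance_size` are hypothetical helpers introduced here for illustration only; they rely on plain string splitting and do not call AWS.

```bash
#!/bin/bash

# Split a DB instance class such as "db.m6g.2xlarge" into its
# instance type ("db.m6g") and its size ("2xlarge").

instance_type() {
  # Drop the shortest suffix that starts at the last dot.
  printf '%s\n' "${1%.*}"
}

instance_size() {
  # Drop the longest prefix that ends at the last dot.
  printf '%s\n' "${1##*.}"
}

instance_type "db.m6g.2xlarge"   # prints db.m6g
instance_size "db.m6g.2xlarge"   # prints 2xlarge
```

A helper like this can be handy when a script needs to compare or resize instances within the same instance type.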

Database instance class types

As mentioned, RDS offers multiple instance classes, and the instance class you choose determines the instance class type. Check the instance class type links below:


Database instance storage

Amazon RDS stores database content on Amazon EBS, which provides durable, block-level storage volumes that you can attach to an RDS instance. DB instance storage comes in the following types:

  • General Purpose (SSD)

  • Provisioned IOPS (PIOPS)

  • Magnetic

The storage types differ in performance characteristics and price. You can tailor your storage performance and cost to the needs of your database.

Each DB instance has minimum and maximum storage requirements depending on the storage type and the database engine it supports. It's important to have sufficient storage so that your databases have room to grow. Also, sufficient storage makes sure that features for the DB engine have room to write content or log entries. For more information, see Amazon RDS DB instance storage.


Amazon RDS and VPC

You can run a DB instance on a virtual private cloud (VPC) using the Amazon Virtual Private Cloud (Amazon VPC) service. When you use a VPC, you have control over your virtual networking environment. You can choose your own IP address range, create subnets, and configure routing and access control lists. The basic functionality of Amazon RDS is the same whether it's running in a VPC or not.

Amazon RDS manages backups, software patching, automatic failure detection, and recovery. There's no additional cost to run your DB instance in a VPC. For more information on using Amazon VPC with RDS, see Amazon VPC VPCs and Amazon RDS.

Amazon RDS uses Network Time Protocol (NTP) to synchronize the time on DB instances.


AWS Regions and Availability Zones

You can run your DB instance in several Availability Zones, an option called a Multi-AZ deployment. When you choose this option, Amazon automatically provisions and maintains one or more secondary standby DB instances in a different Availability Zone. Your primary DB instance is replicated across Availability Zones to each secondary DB instance. This approach helps provide data redundancy and failover support, eliminate I/O freezes, and minimize latency spikes during system backups.

In a Multi-AZ DB clusters deployment, the secondary DB instances can also serve read traffic. For more information, see Configuring and managing a Multi-AZ deployment.

You have three deployment options, which differ in the redundancy and failover support they provide:

  1. Single database instance: Creates a single database instance with no standby.

  2. Multi-AZ database instance: Creates a primary and a standby DB instance in different AZs.

  3. Multi-AZ database cluster: Creates a DB cluster with a primary DB instance and two readable standby instances.


RDS Security Group

While creating an RDS instance, you can assign an existing security group or create a new one. A security group controls access to a DB instance by allowing access from the IP address ranges that you specify.
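As a rough sketch of what "allowing access from an IP address range" looks like in practice, the helper below prints (but does not run) an AWS CLI command that opens a database port to one CIDR range. The function names, the security group ID, and the CIDR range are hypothetical; the port numbers are the usual engine defaults and are an assumption here, since your instance may listen on a custom port.

```bash
#!/bin/bash

# Map an RDS engine name to its usual default port (assumed defaults).
default_db_port() {
  case "$1" in
    mysql|mariadb) echo 3306 ;;
    postgres*)     echo 5432 ;;
    oracle*)       echo 1521 ;;
    sqlserver*)    echo 1433 ;;
    db2*)          echo 50000 ;;
    *)             echo "unknown engine: $1" >&2; return 1 ;;
  esac
}

# Print the AWS CLI command that would allow one CIDR range to reach
# the engine's port on an existing security group. Nothing is executed.
ingress_command() {
  sg_id=$1
  engine=$2
  cidr=$3
  port=$(default_db_port "$engine") || return 1
  echo "aws ec2 authorize-security-group-ingress --group-id $sg_id --protocol tcp --port $port --cidr $cidr"
}

# Hypothetical values for illustration only.
ingress_command "sg-0123456789abcdef0" "mysql" "10.0.1.0/24"
```

Printing the command first makes it easy to review the rule before applying it; keep the CIDR range as narrow as your application allows.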


Database instance identifier

Each DB instance has a DB instance identifier. This customer-supplied name uniquely identifies the DB instance when interacting with the Amazon RDS API and AWS CLI commands. The DB instance identifier must be unique for that customer in an AWS Region.

The DB instance identifier forms part of the DNS hostname allocated to your instance by RDS. For example, if you specify db1 as the DB instance identifier, then RDS will automatically allocate a DNS endpoint for your instance. An example endpoint is db1.abcdefghijkl.us-east-1.rds.amazonaws.com, where db1 is your instance ID.

In the example endpoint db1.abcdefghijkl.us-east-1.rds.amazonaws.com, the string abcdefghijkl is a unique identifier for a specific combination of AWS Region and AWS account. The identifier abcdefghijkl in the example is internally generated by RDS and doesn't change for the specified combination of Region and account. Thus, all your DB instances in this Region share the same fixed identifier. Consider the following features of the fixed identifier:

  • If you rename your DB instance, the endpoint is different but the fixed identifier is the same. For example, if you rename db1 to renamed-db1, the new instance endpoint is renamed-db1.abcdefghijkl.us-east-1.rds.amazonaws.com.

  • If you delete and re-create a DB instance with the same DB instance identifier, the endpoint is the same.

  • If you use the same account to create a DB instance in a different Region, the internally generated identifier is different because the Region is different, as in db2.mnopqrstuvwx.us-west-1.rds.amazonaws.com.

💡
The abcdefghijkl value stays the same for all DB instances that your account creates in the same Region.
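To make the endpoint structure concrete, here is a minimal sketch that splits an endpoint into its parts using shell parameter expansion. The function name `endpoint_field` and its field names are hypothetical, and the sketch assumes the `<instance-id>.<fixed-id>.<region>.rds.amazonaws.com` layout described above.

```bash
#!/bin/bash

# Extract one part of an RDS endpoint such as
#   db1.abcdefghijkl.us-east-1.rds.amazonaws.com
# Supported fields: instance, fixed, region.
endpoint_field() {
  endpoint=$1
  field=$2
  rest=${endpoint#*.}        # drop "<instance-id>."
  after_fixed=${rest#*.}     # drop "<fixed-id>."
  case "$field" in
    instance) echo "${endpoint%%.*}" ;;
    fixed)    echo "${rest%%.*}" ;;
    region)   echo "${after_fixed%%.*}" ;;
    *)        echo "unknown field: $field" >&2; return 1 ;;
  esac
}

endpoint_field "db1.abcdefghijkl.us-east-1.rds.amazonaws.com" instance  # prints db1
endpoint_field "db1.abcdefghijkl.us-east-1.rds.amazonaws.com" fixed     # prints abcdefghijkl
endpoint_field "db1.abcdefghijkl.us-east-1.rds.amazonaws.com" region    # prints us-east-1
```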

Database name limitations

When creating a DB instance, some database engines require that a database name be specified. A DB instance can host multiple databases, a single Db2 database, or a single Oracle database with multiple schemas.

The database name value depends on the database engine:

  • For the Db2 database engine, the database name is the name of the database hosted in your DB instance. If you want to use Amazon RDS stored procedures to create or drop a database, then don't enter a database name when you create a DB instance.

  • For the MySQL and MariaDB database engines, the database name is the name of a database hosted in your DB instance. Databases hosted by the same DB instance must have a unique name within that instance.

  • For the Oracle database engine, database name is used to set the value of ORACLE_SID, which must be supplied when connecting to the Oracle RDS instance.

  • For the Microsoft SQL Server database engine, database name is not a supported parameter.

  • For the PostgreSQL database engine, the database name is the name of a database hosted in your DB instance. A database name is not required when creating a DB instance. Databases hosted by the same DB instance must have a unique name within that instance.

💡
Amazon RDS creates a master user account for your DB instance as part of the creation process. This master user has permission to create databases and to perform create, delete, select, update, and insert operations on tables the master user creates. You must set the master user password when you create a DB instance, but you can change it at any time using the AWS CLI, Amazon RDS API operations, or the AWS Management Console. You can also change the master user password and manage users using standard SQL commands.

IAM database authentication for MariaDB, MySQL, and PostgreSQL

You can authenticate to your DB instance using AWS Identity and Access Management (IAM) database authentication. IAM database authentication works with MariaDB, MySQL, and PostgreSQL. With this authentication method, you don't need to use a password when you connect to a DB instance. Instead, you use an authentication token.

An authentication token is a unique string of characters that Amazon RDS generates on request. Authentication tokens are generated using AWS Signature Version 4. Each token has a lifetime of 15 minutes. You don't need to store user credentials in the database, because authentication is managed externally using IAM. You can also still use standard database authentication. The token is only used for authentication and doesn't affect the session after it is established.

IAM database authentication provides the following benefits:

  • Network traffic to and from the database is encrypted using Secure Socket Layer (SSL) or Transport Layer Security (TLS). For more information about using SSL/TLS with Amazon RDS, see Using SSL/TLS to encrypt a connection to a DB instance or cluster.

  • You can use IAM to centrally manage access to your database resources, instead of managing access individually on each DB instance.

  • For applications running on Amazon EC2, you can use profile credentials specific to your EC2 instance to access your database instead of a password, for greater security.

    You can generate the IAM authentication token as follows and pass it to your application:

#!/bin/bash

# Generate an IAM authentication token
TOKEN=$(aws rds generate-db-auth-token \
  --hostname your-db-endpoint \
  --port 3306 \
  --region your-region \
  --username your-db-username)

# Export the token so the application can read it (quoted to keep it as one word)
export DB_TOKEN="$TOKEN"

In general, consider using IAM database authentication when your applications create fewer than 200 connections per second, and you don't want to manage usernames and passwords directly in your application code.
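Because each token is valid for 15 minutes, a long-running script may want to check whether a cached token is still usable before opening a new connection. The sketch below is one possible way to do that; `token_is_fresh`, `TOKEN_LIFETIME_SECONDS`, and the 60-second safety margin are assumptions made for illustration, not part of the AWS CLI.

```bash
#!/bin/bash

TOKEN_LIFETIME_SECONDS=900   # 15 minutes, as stated above

# Succeed (exit 0) if a token issued at Unix time $1 is still usable
# at Unix time $2. $3 is an optional safety margin in seconds (default 60).
token_is_fresh() {
  issued_at=$1
  now=$2
  margin=${3:-60}
  [ $(( now - issued_at )) -lt $(( TOKEN_LIFETIME_SECONDS - margin )) ]
}

# Typical use (not executed here):
#   ISSUED_AT=$(date +%s)
#   TOKEN=$(aws rds generate-db-auth-token ...)
#   ...
#   token_is_fresh "$ISSUED_AT" "$(date +%s)" || echo "regenerate the token"
```

The token matters only when a connection is opened, so an expired token does not affect sessions that are already established.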


Amazon RDS monitoring

There are several ways that you can track the performance and health of a DB instance. You can use the Amazon CloudWatch service to monitor the performance and health of a DB instance. CloudWatch performance charts are shown in the Amazon RDS console. You can also subscribe to Amazon RDS events to be notified about changes to a DB instance, DB snapshot, or DB parameter group. For more information, see Monitoring metrics in an Amazon RDS instance.


Performance Insights

Performance Insights in Amazon RDS expands on existing Amazon RDS monitoring features to illustrate and help you analyze your database performance. With the Performance Insights dashboard, you can visualize the database load on your Amazon RDS DB instance. You can also filter the load by waits, SQL statements, hosts, or users. For more information, see Monitoring DB load with Performance Insights on Amazon RDS.


Using Amazon RDS Proxy

By using Amazon RDS Proxy, you can allow your applications to pool and share database connections to improve their ability to scale. RDS Proxy makes applications more resilient to database failures by automatically connecting to a standby DB instance while preserving application connections. By using RDS Proxy, you can also enforce AWS Identity and Access Management (IAM) authentication for databases, and securely store credentials in AWS Secrets Manager.

Using RDS Proxy, you can handle unpredictable surges in database traffic. Otherwise, these surges might cause issues due to oversubscribing connections or new connections being created at a fast rate. RDS Proxy establishes a database connection pool and reuses connections in this pool. This approach avoids the memory and CPU overhead of opening a new database connection each time. To protect a database against oversubscription, you can control the number of database connections that are created.

RDS Proxy queues or throttles application connections that can't be served immediately from the connection pool. Although latencies might increase, your application can continue to scale without abruptly failing or overwhelming the database. If connection requests exceed the limits you specify, RDS Proxy rejects application connections (that is, it sheds load). At the same time, it maintains predictable performance for the load that RDS can serve with the available capacity.
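The queue-then-reject behavior can be pictured with a toy model. This is only an illustration of the idea under simplified assumptions (a fixed pool size and a fixed queue limit); it is not how RDS Proxy is implemented, and the function name `admit` and its parameters are hypothetical.

```bash
#!/bin/bash

# Toy admission model: given a pool size, a queue limit, and a number of
# incoming connection requests, report how many are served from the pool,
# how many wait in the queue, and how many are rejected (load shedding).
admit() {
  pool_size=$1
  queue_limit=$2
  incoming=$3

  served=$incoming
  if [ "$served" -gt "$pool_size" ]; then
    served=$pool_size
  fi

  waiting=$(( incoming - served ))
  queued=$waiting
  if [ "$queued" -gt "$queue_limit" ]; then
    queued=$queue_limit
  fi

  rejected=$(( waiting - queued ))
  echo "served=$served queued=$queued rejected=$rejected"
}

admit 10 5 8    # prints served=8 queued=0 rejected=0
admit 10 5 12   # prints served=10 queued=2 rejected=0
admit 10 5 20   # prints served=10 queued=5 rejected=5
```

The third call shows the trade-off described above: some requests fail fast, while the database keeps serving the load it has capacity for.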

You can reduce the overhead to process credentials and establish a secure connection for each new connection. RDS Proxy can handle some of that work on behalf of the database.

RDS Proxy is fully compatible with the engine versions that it supports. You can enable RDS Proxy for most applications with no code changes.


Secrets Manager integration

With AWS Secrets Manager, you can replace hard-coded credentials in your code, including database passwords, with an API call to Secrets Manager to retrieve the secret programmatically. For more information about Secrets Manager, see AWS Secrets Manager User Guide.
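As a minimal sketch of that API call from a shell script, the helper below wraps the AWS CLI `secretsmanager get-secret-value` command. The function name `get_db_secret` and the secret name in the usage comment are hypothetical, and the sketch assumes the AWS CLI is installed and configured with permission to read the secret.

```bash
#!/bin/bash

# Print the secret string stored under the given secret ID or ARN.
get_db_secret() {
  aws secretsmanager get-secret-value \
    --secret-id "$1" \
    --query SecretString \
    --output text
}

# Typical use (hypothetical secret name, not executed here):
#   DB_SECRET=$(get_db_secret "prod/mydatabase/admin")
```

Fetching the secret at run time keeps the password out of the script and out of version control; avoid echoing the value into logs.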


Synchronous vs. Asynchronous Replication: Which is Right for Your AWS RDS Instance?

The primary difference between synchronous and asynchronous RDS replication lies in how the data is copied from the primary database instance to the read replicas or standby instances.

Synchronous Replication:

  • Data Consistency: Synchronous replication ensures that data is written to the primary database and at least one standby or secondary instance before the write operation is considered complete. This guarantees data consistency across the instances.

  • Use Case: This is typically used in Multi-AZ deployments where high availability and data durability are critical. It ensures that failover instances are up-to-date with the primary instance.

  • Latency: There is a slight performance overhead due to the need to confirm that the data is written to both the primary and secondary instances before acknowledging the write operation.

  • Failover: In the event of a failure of the primary instance, a synchronous replica can be promoted to the primary role with minimal data loss, as it is fully consistent with the primary.

Asynchronous Replication:

  • Data Consistency: In asynchronous replication, the data is written to the primary database first, and the changes are then propagated to the read replicas in the background. This means there is a potential for lag between the primary instance and the read replicas.

  • Use Case: This is commonly used for read replicas where the primary goal is to offload read traffic from the primary instance. It is useful for improving read scalability and performance.

  • Latency: There is typically lower latency for write operations since the primary instance does not wait for the replicas to acknowledge the writes.

  • Lag: There can be a replication lag, where the read replicas may not have the most up-to-date data compared to the primary instance. This lag can vary based on the write load and network latency.

  • Failover: In case of primary instance failure, an asynchronous replica may be promoted to primary, but there may be some data loss due to the replication lag.

Summary of Use Cases:

  • Synchronous Replication (Multi-AZ):

    • High availability and durability.

    • Ensures minimal data loss during failover.

    • Suitable for critical applications requiring high data consistency.

  • Asynchronous Replication (Read Replicas):

    • Scalability and performance for read-heavy workloads.

    • Can be used across different regions for geographical redundancy.

    • Suitable for applications that can tolerate some level of replication lag.

Example in AWS RDS:

Multi-AZ Deployment (Synchronous Replication):

  • Automatically replicates data to a standby instance in a different availability zone.

  • Used for high availability and automatic failover.

  • Configuration is done by enabling Multi-AZ when creating the RDS instance.

Read Replicas (Asynchronous Replication):

  • Can be created within the same region or across different regions.

  • Used to offload read traffic and improve read performance.

  • Created as separate RDS instances. In Terraform, the replicate_source_db argument points to the primary instance.

💡
In Read Replicas, you manage the failover manually on your own.
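A manual failover to a read replica usually means promoting the replica to a standalone instance. The sketch below wraps the AWS CLI `rds promote-read-replica` command; the function name `promote_replica`, the `DRY_RUN` switch, and the replica identifier are assumptions for illustration. By default it only prints the command so you can review it first.

```bash
#!/bin/bash

# Promote a read replica to a standalone DB instance.
# With DRY_RUN=1 (the default) the command is printed, not executed.
promote_replica() {
  replica_id=$1
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "aws rds promote-read-replica --db-instance-identifier $replica_id"
  else
    aws rds promote-read-replica --db-instance-identifier "$replica_id"
  fi
}

# Hypothetical replica identifier for illustration only.
promote_replica "read-replica-db"
```

After promotion, the application still has to be pointed at the promoted instance's endpoint, and any writes that had not yet replicated are lost.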

Configuring and managing a Multi-AZ deployment

Multi-AZ deployments can have one standby or two standby DB instances. When the deployment has one standby DB instance, it's called a Multi-AZ DB instance deployment. A Multi-AZ DB instance deployment has one standby DB instance that provides failover support, but doesn't serve read traffic. When the deployment has two standby DB instances, it's called a Multi-AZ DB cluster deployment. A Multi-AZ DB cluster deployment has standby DB instances that provide failover support and can also serve read traffic.

You can use the AWS Management Console to determine whether a Multi-AZ deployment is a Multi-AZ DB instance deployment or a Multi-AZ DB cluster deployment. In the navigation pane, choose Databases, and then choose a DB identifier.


How to deploy an RDS read replica (not Multi-AZ) using Terraform

provider "aws" {
  region = "your-region" # Replace with your AWS Region
}

# Supply the password at apply time (for example via TF_VAR_db_password)
# instead of hard-coding it in the configuration.
variable "db_password" {
  type      = string
  sensitive = true
}

# Primary RDS instance
resource "aws_db_instance" "primary" {
  identifier           = "primary-db"
  engine               = "mysql" # Replace with your desired database engine
  instance_class       = "db.t3.micro"
  allocated_storage    = 20
  storage_type         = "gp2"
  db_name              = "mydatabase" # Older provider versions called this argument "name"
  username             = "admin"
  password             = var.db_password
  parameter_group_name = "default.mysql8.0" # Replace with the appropriate parameter group
  skip_final_snapshot  = true
  multi_az             = false

  # Automated backups must be enabled on the source of a read replica
  backup_retention_period = 1
}

# Read replica RDS instance
# The engine, storage size, and master credentials come from the source
# instance, so they are not set on the replica.
resource "aws_db_instance" "read_replica" {
  identifier           = "read-replica-db"
  instance_class       = aws_db_instance.primary.instance_class
  storage_type         = aws_db_instance.primary.storage_type
  parameter_group_name = aws_db_instance.primary.parameter_group_name
  skip_final_snapshot  = true
  replicate_source_db  = aws_db_instance.primary.identifier

  # Ensure Multi-AZ is disabled
  multi_az = false
}

What IAM permissions are required to allow users to access RDS?

To allow an IAM user to access Amazon RDS using IAM authentication, you need to grant specific permissions in an IAM policy. Below is an example of an IAM policy that allows an IAM user to access RDS and authenticate using IAM:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "rds-db:connect"
            ],
            "Resource": [
                "arn:aws:rds-db:<region>:<account-id>:dbuser:<db-cluster-identifier>/<db-username>"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "rds:DescribeDBInstances",
                "rds:DescribeDBClusters",
                "rds:DescribeDBClusterEndpoints",
                "rds:DescribeDBClusterParameterGroups",
                "rds:DescribeDBClusterSnapshotAttributes",
                "rds:DescribeDBClusterSnapshots",
                "rds:DescribeDBEngineVersions",
                "rds:DescribeDBLogFiles",
                "rds:DescribeDBParameterGroups",
                "rds:DescribeDBParameters",
                "rds:DescribeDBSecurityGroups",
                "rds:DescribeDBSnapshotAttributes",
                "rds:DescribeDBSnapshots",
                "rds:DescribeDBSubnetGroups",
                "rds:DescribeEventCategories",
                "rds:DescribeEventSubscriptions",
                "rds:DescribeEvents",
                "rds:DescribeOptionGroupOptions",
                "rds:DescribeOptionGroups",
                "rds:DescribeOrderableDBInstanceOptions",
                "rds:DescribePendingMaintenanceActions",
                "rds:DescribeReservedDBInstances",
                "rds:DescribeReservedDBInstancesOfferings"
            ],
            "Resource": "*"
        }
    ]
}

Explanation of the Policy

  1. rds-db:connect: This action allows the IAM user to connect to the RDS database using IAM authentication. The Resource ARN specifies the RDS database user and cluster.

  2. rds:Describe*: These actions allow the IAM user to describe various RDS resources and configurations, which might be necessary for monitoring and managing the database instances.

Customizing the Policy

  • <region>: Replace with the AWS region of your RDS instance (e.g., us-west-2).

  • <account-id>: Replace with your AWS account ID.

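  • <db-cluster-identifier>: Replace with the resource ID of your DB instance or cluster (for example, db-ABCDEFGHIJKL01234 for a DB instance), not its name. IAM database authentication matches on this resource ID, which is shown as Resource ID on the Configuration tab in the RDS console.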

  • <db-username>: Replace with the database username that will use IAM authentication.
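To avoid typos when filling in the placeholders, the ARN for the rds-db:connect statement can be assembled by a small helper. This is a sketch; `rds_db_arn` is a hypothetical function name, and all values in the example call are made up. The third argument is assumed to be the DB resource ID (the DbiResourceId value returned by `aws rds describe-db-instances`), not the instance name.

```bash
#!/bin/bash

# Build the resource ARN used by the rds-db:connect action.
# Arguments: region, account ID, DB resource ID, database user name.
rds_db_arn() {
  region=$1
  account_id=$2
  resource_id=$3
  db_user=$4
  echo "arn:aws:rds-db:${region}:${account_id}:dbuser:${resource_id}/${db_user}"
}

# The resource ID can be looked up with the AWS CLI (not executed here):
#   aws rds describe-db-instances --db-instance-identifier primary-db \
#     --query 'DBInstances[0].DbiResourceId' --output text

# Made-up values for illustration only.
rds_db_arn "us-west-2" "123456789012" "db-ABCDEFGHIJKL01234" "app_user"
```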


Creating and connecting to an RDS PostgreSQL Database

In this step, we learn how to create a simple Amazon RDS DB instance and connect to it from an EC2 instance. Follow the steps provided by AWS in the following guide:

Creating and connecting to a PostgreSQL DB instance




That's it. Very straightforward, very fast 🚀. I hope this article inspired you, and I would appreciate your feedback. Thank you.
