AWS Solutions Architect Associate Cheat Sheet

About This Cheat Sheet
This cheat sheet was prepared based on:
Udemy Practice tests by Jon Bonso, Neal Davis, Stephane Maarek
AWS Documentation
AWS FAQ
AWS Whitepapers
Note: This sheet is for last-minute or quick reference only. These were my notes for a final-day glance before the actual SAA exam.
Credits
All credits to excellent SAA-C03 course and practice tests by:
Adrian Cantrill
Chad Smith
Jon Bonso
Neal Davis
Ranga Karnam
Stephane Maarek
You can enroll in any ONE of the courses listed below + Practice Exams to gain knowledge and clear the certification.
Course Links
Adrian Cantrill Course
Chad Smith O'Reilly Live Classes
- O'Reilly Learning (search by "author: Chad Smith")
Jon Bonso Practice Tests
Neal Davis Course & Practice Tests
Ranga Karnam Course and Exam Review
Stephane Maarek Course & Practice Tests
In-Detail Cheat Sheets by Neal and Jon
AWS Official Study Guides and Trainings
SAA-C03 Update Information
The SAA-C03 exam launched in August 2022. Only the weight distribution across domains changed; there were no major changes from SAA-C02. See What's New with the SAA-C03 by Jon Bonso.
My Tips on Practice Exams
Practice exams are not to determine pass/fail, but to test your understanding of AWS services and choose the most appropriate service for a scenario.
Read the entire question and every option, think like a Solution Architect, and eliminate wrong answers.
Google for help during practice exams (not for answers) to learn and take notes.
After completion, check explanations for every answer and make notes.
Actual SAA-C03 is easier than practice exams (practice tests help you learn and prepare).
Connect with the Author
Are you preparing for SAA-C03? Have doubts or want to collaborate for AWS/DevOps certifications? Connect:
Twitter: venkatesh111
LinkedIn: venkatesh111
If planning for AWS Certified Developer Associate, see: AWS DVA-C01 Cheat Sheet
Key Words to watch out for in questions!
Region
Physical or Geographical location
Made up of two or more Availability Zones
Each region is isolated from others
Multiple Availability Zones per region
Data replication across regions is possible
Communication between regions via public internet
AWS Region Examples
| Code | Name |
| --- | --- |
| us-east-1 | US East (N. Virginia) |
| ap-south-1 | Asia Pacific (Mumbai) |
| eu-west-2 | Europe (London) |
| me-south-1 | Middle East (Bahrain) |
Availability Zone
Group of one or more discrete data centres
Redundant power, networking, connectivity
Low latency, high throughput, highly redundant network
Highly available, fault tolerant, scalable infrastructure
Durability
Resistance to data loss (the lower the likelihood of losing data, the higher the durability)
Example: Multiple copies of data in different locations increases durability
AWS S3 offers 99.999999999% durability
Availability
How readily a service is available
Example: More ATM machines = higher availability
Deploying EC2/RDS in multiple AZs increases availability
Resilient
Ability to recover from failures induced by load spikes, attacks, or component failures
Partial system failure doesn't take down the whole system
Fault Tolerant
- System remains operational even if some components fail
AWS Services
IAM
- Explicit deny policy always overrides explicit allow
IAM Roles
Using an IAM role for EC2 or an Auto Scaling group is more secure than providing access via IAM user credentials
ECS tasks can be assigned IAM roles, just like EC2 instances
If an EC2 instance needs to access other AWS services (e.g., S3), use an IAM role
To share CloudTrail logs between AWS accounts, use IAM roles
Cross Account Access
Use when your developers/ops need access to particular resources in two or more AWS accounts (e.g., PROD and TEST)
Temporary access to resources in a second account (use with STS)
Custom Identity Broker
- If your on-premises LDAP directory is not compatible with SAML and you want users to authenticate to AWS with LDAP, use a custom identity broker
Note: You cannot attach an IAM role to on-premises instances; use IAM user credentials instead
External ID
To give a third-party access to your AWS resources (delegate access)
Examples: monitoring your AWS account to help optimize costs, performing analytics, etc.
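A minimal boto3 sketch of cross-account access with STS; the account ID, role name, and external ID below are hypothetical placeholders:
```python
import boto3

# Assume a role in the second (e.g., PROD) account
sts = boto3.client("sts")
resp = sts.assume_role(
    RoleArn="arn:aws:iam::222222222222:role/ProdReadOnly",  # placeholder ARN
    RoleSessionName="cross-account-session",
    ExternalId="example-external-id",  # only if the role's trust policy requires it
)

# Use the temporary credentials to call services in that account
creds = resp["Credentials"]
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
print([b["Name"] for b in s3.list_buckets()["Buckets"]])
```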
IAM Best Practices
Lock away your AWS account root user access keys
Create individual IAM users
Enable MFA
Use user groups
Grant least privilege
Use roles for applications that run on Amazon EC2 instances
Use roles to delegate permissions
AWS IAM Best Practices Documentation
IAM permissions boundaries help you restrict IAM admin access and prevent privilege escalation or bypassing of other security rules.
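A minimal boto3 sketch of a permissions boundary; the policy content and user name are illustrative assumptions:
```python
import json
import boto3

iam = boto3.client("iam")

# The boundary caps the maximum permissions the user can ever exercise,
# no matter what permission policies an admin attaches later
boundary = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["s3:*", "dynamodb:*"], "Resource": "*"}
    ],
}
policy = iam.create_policy(
    PolicyName="DeveloperBoundary", PolicyDocument=json.dumps(boundary)
)

# Any allow outside the boundary is implicitly denied for this user
iam.create_user(
    UserName="dev-user",
    PermissionsBoundary=policy["Policy"]["Arn"],
)
```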
S3
Storage Classes Overview
| Storage Class | Use Case | Retrieval Time | Cost |
| --- | --- | --- | --- |
| S3 Intelligent-Tiering | Unpredictable/changing access patterns | Milliseconds | Moderate |
| S3 Standard | Frequently accessed data (>1/month) | Milliseconds | Standard |
| S3 Standard-IA | Infrequently accessed, retained ≥1 month | Milliseconds | Lower |
| S3 One Zone-IA | Reproducible data, lower resiliency requirement | Milliseconds | Lowest IA |
| S3 Glacier Instant Retrieval | Rarely accessed, immediate retrieval needed | Milliseconds | Archive |
| S3 Glacier Flexible Retrieval | Rarely accessed, minutes–hours retrieval | Minutes–12 hours | Archive |
| S3 Glacier Deep Archive | Lowest cost, hours retrieval | 12–48 hours | Lowest |
Glacier Retrieval Options
| Storage Class | Expedited | Standard | Bulk |
| --- | --- | --- | --- |
| S3 Glacier Instant Retrieval | N/A | N/A | N/A |
| S3 Glacier Flexible Retrieval | 1–5 min | 3–5 hours | 5–12 hours |
| S3 Glacier Deep Archive | N/A | ≤12 hours | ≤48 hours |
Replication at S3
SRR – Same Region Replication
Aggregate logs into a single bucket – If you store logs in multiple buckets or across multiple accounts, you can easily replicate logs into a single, in-Region bucket. Doing so allows for simpler processing of logs in a single location.
Configure live replication between production and test accounts – If you or your customers have production and test accounts that use the same data, you can replicate objects between those multiple accounts, while maintaining object metadata.
Abide by data sovereignty laws – You might be required to store multiple copies of your data in separate AWS accounts within a certain Region. Same-Region Replication can help you automatically replicate critical data when compliance regulations don't allow the data to leave your country.
Encryption at S3
Server-Side Encryption: Request Amazon S3 to encrypt your object before saving it on disks in its data centers and then decrypt it when you download the objects.
Client-Side Encryption: Encrypt data client-side and upload the encrypted data to Amazon S3. In this case, you manage the encryption process, the encryption keys, and related tools.
AWS KMS (SSE-KMS)
Encryption at rest
Automatic key rotation every 1 year
Operational efficiency (least manual efforts)
Server-side Encryption (SSE):
Customer Provided Keys (SSE-C)
S3 Managed Keys (SSE-S3)
KMS Managed Keys (SSE-KMS)
Client-side Encryption (CSE):
Customer managed master encryption keys (CSE-C)
KMS managed master encryption keys (CSE-KMS)
To ensure all objects uploaded to S3 are encrypted, create an S3 bucket policy that denies any S3 PUT request that does not include the x-amz-server-side-encryption header:
s3:x-amz-server-side-encryption: AES256 → use S3-managed keys (SSE-S3)
s3:x-amz-server-side-encryption: aws:kms → use AWS KMS-managed keys (SSE-KMS)
https://aws.amazon.com/blogs/security/how-to-prevent-uploads-of-unencrypted-objects-to-amazon-s3/
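A sketch of such a bucket policy applied with boto3, assuming a hypothetical bucket name and SSE-KMS as the required encryption:
```python
import json
import boto3

# Deny any PutObject request that doesn't ask for SSE-KMS encryption
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyUnencryptedPuts",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::my-example-bucket/*",
            "Condition": {
                "StringNotEquals": {"s3:x-amz-server-side-encryption": "aws:kms"}
            },
        }
    ],
}
boto3.client("s3").put_bucket_policy(
    Bucket="my-example-bucket", Policy=json.dumps(policy)
)
```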
For cost-effective analysis of data stored in S3, use Amazon Athena to run SQL queries.
S3 Object Lock
Prevents objects from being deleted or overwritten for a fixed amount of time or indefinitely
Object Lock to help meet regulatory requirements that require WORM (write-once-read-many) storage
Adds layer of protection against object changes and deletion
Object Lock can only be enabled at bucket creation (new buckets only)
Bucket versioning is automatically enabled (and can't be disabled) for Object Lock-enabled buckets
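A minimal boto3 sketch of Object Lock, assuming a hypothetical bucket in us-east-1 and a 365-day compliance-mode default retention:
```python
import boto3

s3 = boto3.client("s3")

# Object Lock must be enabled at creation; versioning is enabled automatically
s3.create_bucket(
    Bucket="compliance-archive-bucket",
    ObjectLockEnabledForBucket=True,
)

# WORM-protect every new object for 365 days by default
s3.put_object_lock_configuration(
    Bucket="compliance-archive-bucket",
    ObjectLockConfiguration={
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": "COMPLIANCE", "Days": 365}},
    },
)
```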
S3 Static website hosting
Only for static content, can also contain client-side scripts.
Does NOT support Server-side processing/scripting like PHP, JSP, or ASP.NET.
Accessing S3 from EC2 or ECS
IAM Role or Instance profile attached to EC2 to access S3
Data transfer between S3 and EC2 in same region is FREE
S3 Transfer Acceleration
Enables fast, easy, and secure transfers of files over long distances between your client and an S3 bucket
S3 Transfer acceleration uses globally distributed edge locations in Amazon CloudFront
Additional data transfer charges might apply.
Use for large-scale (more than 20 GB) uploads and downloads of data into S3 via edge locations
Use cases
Your customers upload to a centralized bucket from all over the world.
You transfer gigabytes to terabytes of data on a regular basis across continents.
You can't use all of your available bandwidth over the internet when uploading to Amazon S3.
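A minimal boto3 sketch, with a hypothetical bucket and file name:
```python
import boto3
from botocore.config import Config

# Enable acceleration on the bucket (one-time configuration)
boto3.client("s3").put_bucket_accelerate_configuration(
    Bucket="global-uploads-bucket",
    AccelerateConfiguration={"Status": "Enabled"},
)

# Clients then transfer through the nearest CloudFront edge location
s3_accel = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
s3_accel.upload_file("big-dataset.tar.gz", "global-uploads-bucket", "big-dataset.tar.gz")
```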
S3 Cost
Enabling versioning adds cost (each object version is charged)
Incomplete S3 multipart uploads are charged
Data transfer cost between S3 buckets in same region is free
VPC endpoint and S3
VPC endpoint for Amazon S3 to upload files/images from EC2 instance in private subnet
VPC endpoint for Amazon S3 reduces Direct connect costs
| Feature | Gateway Endpoint for S3 | Interface Endpoint for S3 |
| --- | --- | --- |
| Network traffic | Remains on AWS network | Remains on AWS network |
| IP addresses used | Amazon S3 public IP addresses | Private IP addresses from your VPC |
| Access from on-premises | Not allowed | Allowed |
| Access from another AWS Region | Not allowed | Allowed via VPC peering or Transit Gateway |
| Cost | Free | Not free |
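A minimal boto3 sketch of a gateway endpoint for S3; VPC, Region, and route table IDs are placeholders:
```python
import boto3

# Gateway endpoint: free, and reached via route table entries
boto3.client("ec2").create_vpc_endpoint(
    VpcEndpointType="Gateway",
    VpcId="vpc-0123456789abcdef0",
    ServiceName="com.amazonaws.us-east-1.s3",
    RouteTableIds=["rtb-0123456789abcdef0"],  # private subnet's route table
)
```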
EFS
EFS provides a hierarchical directory structure
Can be used with both AWS and on-premises resources
Access the files concurrently by multiple EC2 instances
Multiple compute instances, including EC2, ECS (both Fargate and EC2 nodes), and Lambda, can access an EFS file system at the same time
EFS Storage Classes
Reduce storage cost by moving to different EFS storage classes
EFS will automatically and transparently move your files to the lower cost regional EFS Standard-IA or EFS One Zone-IA based on the last time they were accessed
Amazon EFS Intelligent-Tiering: moves files between storage classes automatically
EFS Storage classes
EFS Standard-IA
EFS One Zone-IA
Bursting Throughput mode
It is the default mode, the amount of throughput scales as your file system grows, the more you store, the more throughput is available to you.
Does not incur any additional charges and you have a baseline rate of 50 KB/s per GB of throughput that comes included with the price you pay for your EFS standard storage.
Provisioned Throughput mode
Use when bursting isn't enough: the default bursting allowance is based on your file system size, so if your file system is relatively small but your use case requires a high throughput rate, the default Bursting Throughput mode may not process requests quickly enough. In that case, use Provisioned Throughput.
This option does incur additional charges where you will need to pay for any bursting above the default capacity allowed from the standard bursting throughput.
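A minimal boto3 sketch of creating a file system in Provisioned Throughput mode; the token and throughput value are illustrative:
```python
import boto3

boto3.client("efs").create_file_system(
    CreationToken="app-shared-fs",
    ThroughputMode="provisioned",        # default is "bursting"
    ProvisionedThroughputInMibps=128.0,  # additional charges apply
    Encrypted=True,
)
```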
Amazon FSx
Amazon FSx for Windows File Server
Supports Distributed File System Replication (DFSR) between on-premises Windows file servers and Amazon FSx
Accessible over Server Message Block (SMB) protocol
Amazon FSx is accessible from Windows, Linux, and MacOS
You can use Active Directory domain for authentication
Amazon FSx for Lustre
Enable high performance computing (HPC)
Mounting FSx for Lustre on an AWS Fargate launch type isn't supported. (Use EFS)
Can be mounted on EC2 worker nodes
AWS Snowball Edge
Offline data transfer from remote areas (or on-prem) to AWS
Unstable internet connection
Has on-board storage and compute power, providing local storage and processing capacity
Support local data processing and collection in disconnected environments such as ships, windmills, and remote factories
Used by disaster response teams during natural disasters like hurricanes and storms
AWS DataSync
Online data transfer from on-prem to AWS
Works over unstable internet connections
Can tolerate brief losses of internet access
- If a task is interrupted, for instance, if the network connection goes down or the AWS DataSync agent is restarted, the next run of the task will transfer missing files, and the data will be complete and consistent at the end of this run.
AWS DataSync is a secure, online service that automates and accelerates moving data between on premises and AWS storage services
DataSync can copy data between:
NFS, SMB, HDFS (Hadoop), EFS, FSx
Self-managed object storage, AWS Snowcone, AWS S3
To migrate SMB/NFS shares from on-premises to AWS, the choice is AWS DataSync
Use cases:
Migrate your data to AWS, Move data between on-premises and AWS
Reduce on-premises storage costs by moving data directly to S3 Glacier
Replicate your data into AWS S3
AWS Storage Gateway
File Gateway
NFS/SMB, over file protocol
Supports local (on-prem) caching
Amazon S3 File Gateway
Useful if your applications use file shares (NFS) to gather and store content that is then processed by numerous Amazon EC2 Linux instances
Data lakes, backups, and ML workflows
Amazon FSx File Gateway
Volume Gateway
iSCSI block storage
Hybrid cloud block storage
Supports local (on-prem) caching
Volume Gateway stores and manages on-premises data in Amazon S3 on your behalf
You can take point-in-time copies of your volumes (cached or stored) using AWS Backup
Cached Volume Gateway
Primary data is stored in Amazon S3
Frequently accessed data is retained locally (On-Prem) in the cache for low latency access
Stored Volume Gateway
Primary data is stored locally
Entire dataset is available for low latency access on premises while also asynchronously getting backed up to Amazon S3
Tape Gateway
iSCSI-based virtual tape library (VTL)
Replacement for your on-prem physical tapes, without changing existing backup workflows
Supports local (on-prem) caching, Caches virtual tapes on premises for low-latency data access
Encrypts data between the gateway and AWS for secure data transfer
Transitions virtual tapes between Amazon S3 and Amazon S3 Glacier Flexible Retrieval, or Amazon S3 Glacier Deep Archive, to minimize storage costs
CloudFront
Use geographic restrictions (geo blocking) to prevent/block users in specific geographic locations (nations) from accessing content that you're distributing through a CloudFront distribution
You can use CloudFront for on demand (VOD) or live streaming (real time) video
To reduce latency of the images/files hosted on S3 bucket use CloudFront
Cache media, can serve secret/private content
CloudFront is for delivery only (CDN); the maximum size of a single file delivered over CloudFront is 20 GB
CloudFront Origin Access Identity (OAI)
- Restrict access to Amazon S3 bucket so that objects can be accessed only through my Amazon CloudFront distribution
High Availability with CloudFront
You create an origin group with two origins: a primary and a secondary
If the primary origin is unavailable CloudFront automatically switches to the secondary origin
Example
You have an S3 bucket in us-west-1 and its data is replicated to ap-southeast-1:
Create an additional CloudFront origin pointing to the ap-southeast-1 bucket
Set up a CloudFront origin group with the us-west-1 bucket as the primary and the ap-southeast-1 bucket as the secondary
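A sketch of the origin group portion of a CloudFront DistributionConfig for this scenario; the origin IDs are placeholders for origins defined elsewhere in the distribution:
```python
# Fail over from the us-west-1 origin to the ap-southeast-1 origin
origin_group = {
    "Id": "s3-failover-group",
    "FailoverCriteria": {
        # Switch to the secondary when the primary returns any of these codes
        "StatusCodes": {"Quantity": 3, "Items": [500, 502, 503]}
    },
    "Members": {
        "Quantity": 2,
        "Items": [
            {"OriginId": "primary-us-west-1-bucket"},
            {"OriginId": "secondary-ap-southeast-1-bucket"},
        ],
    },
}
```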
Field-Level Encryption
Adds an additional layer of security that lets you protect specific data throughout system processing so that only certain applications can see it
Enable your users to securely upload sensitive information to your web servers
Lambda@Edge
Serve customized content depending on the device (mobile, desktop, tablet) from which users view the website
Improve search engine optimization (SEO) for your website
Route requests to different origins based on different viewer characteristics
Route requests to origins within a home region, based on a viewer's location
AWS Global Accelerator
With Global Accelerator, you are provided two global static public IPs that act as a fixed entry point to your application, improving availability
AWS Global Accelerator reduces internet latency
Corporate proxies (On-Prem) can also whitelist your application's static IP addresses in their firewalls
Provides static IP which we can bind in the on-prem firewall
Add or remove your AWS application endpoints, such as ALB, NLB, EC2 Instances, and Elastic IPs without making user-facing changes
Supports Real-Time Messaging Protocol (RTMP); delivers content over TCP from across the globe
AWS Global Accelerator also performs health checks automatically and routes traffic to healthy endpoints
CloudFront Vs Global Accelerator
CloudFront
- HTTP(S); cacheable content delivered to users over a CDN
Global Accelerator
HTTP and non-HTTP protocols such as TCP, UDP (gaming), RTMP (real-time video and audio), MQTT (IoT), and VoIP
Proxies traffic from the edge location closest to the user
Amazon EC2
Hibernation
You are not charged for Hibernated Instance usage. You pay only for the EBS volumes and Elastic IP Addresses attached to it. There are no other hourly charges (just like any other stopped instance)
Preserves the contents of the instance's memory (saved to the EBS root volume) while the instance is hibernated
The EBS root volume is restored to its previous state
The RAM contents are reloaded
Spot Instances
Cost-effective choice if your applications can be interrupted
Examples: data analysis, batch jobs, background processing
Placement Groups
- There is no charge for creating a placement group
Cluster
Packs instances close together inside an Availability Zone
Low-latency and high network performance necessary for tightly-coupled node-to-node communication
High Performance Computing HPC applications
Partition
Groups of instances in one partition do not share the underlying hardware with groups of instances in different partitions
Used by large distributed and replicated workloads, such as Hadoop, Cassandra, and Kafka
Spread
- Strictly places a small group of instances across distinct underlying hardware to reduce correlated failures
AWS Elastic Beanstalk
Ideal for simple web applications; NOT ideal for microservices
Supports Time Based scaling
You can scale AWS Elastic Beanstalk environments on a defined schedule, useful when you know when load changes will occur
Elastic Beanstalk automatically handles the deployment, from capacity provisioning, load balancing, auto-scaling to application health monitoring
ECS - Elastic Container Service
Auto Scaling can be triggered on an ECS service based on the ECS service's CPU utilization
For Amazon ECS cluster, using the Fargate ECS task launch type, use AWS Application Auto Scaling with target tracking policies to scale
By default, Fargate tasks are spread across Availability Zones
Elastic Load Balancers
NLB (Network Load Balancer)
Operates at Layer 4, TCP/UDP
Ultra-low latency
Can handle tens of millions of requests per second while maintaining high throughput at ultra-low latency
ALB (Application Load Balancer)
Operates at Layer 7, HTTP, HTTPS
You can create a listener rule on the ALB to redirect HTTP traffic to HTTPS
ALB supports path-based routing (route the traffic to different target group based on the url/path)
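A minimal boto3 sketch of both listener features; all ARNs are placeholders:
```python
import boto3

elbv2 = boto3.client("elbv2")

# Listener rule: redirect all HTTP traffic on port 80 to HTTPS
elbv2.create_listener(
    LoadBalancerArn="arn:aws:elasticloadbalancing:us-east-1:111111111111:loadbalancer/app/my-alb/abc123",
    Protocol="HTTP",
    Port=80,
    DefaultActions=[{
        "Type": "redirect",
        "RedirectConfig": {"Protocol": "HTTPS", "Port": "443", "StatusCode": "HTTP_301"},
    }],
)

# Path-based routing: send /api/* to a dedicated target group
elbv2.create_rule(
    ListenerArn="arn:aws:elasticloadbalancing:us-east-1:111111111111:listener/app/my-alb/abc123/def456",
    Priority=10,
    Conditions=[{"Field": "path-pattern", "Values": ["/api/*"]}],
    Actions=[{
        "Type": "forward",
        "TargetGroupArn": "arn:aws:elasticloadbalancing:us-east-1:111111111111:targetgroup/api-tg/xyz789",
    }],
)
```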
Session Management
Sticky Sessions or Session Affinity (local)
Route a site user to the particular web server that is managing that individual user's session
It's cost effective, generally fast because it eliminates network latency
Drawbacks:
In the event of node failure, session data is lost
If using ASG, traffic may be unevenly distributed
Distributed Session
ElastiCache for Redis, and ElastiCache for Memcached
Provide a shared data storage for sessions that can be accessible from any individual web server
There is additional cost and network latency
These are extremely fast and provide sub-millisecond latency
Can cache any data, not just HTTP sessions
AWS Session Management Documentation
Auto Scaling Groups
- You can also perform EC2 auto scaling based on an SQS queue (e.g., queue depth)
Scheduled Scaling
Schedule according to predictable load changes
Known holidays, known history
Example: Every week the traffic to your web application starts to increase on Wednesday, remains high on Thursday, and starts to decrease on Friday, you can configure a schedule for Amazon EC2 Auto Scaling to increase capacity on Wednesday and decrease capacity on Friday
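A minimal boto3 sketch of that Wednesday/Friday schedule; group name and sizes are illustrative:
```python
import boto3

asg = boto3.client("autoscaling")

# Scale out every Wednesday 08:00 UTC (cron: minute hour day month weekday)
asg.put_scheduled_update_group_action(
    AutoScalingGroupName="web-asg",
    ScheduledActionName="midweek-scale-out",
    Recurrence="0 8 * * 3",
    MinSize=4, MaxSize=12, DesiredCapacity=8,
)

# Scale back in every Friday 20:00 UTC
asg.put_scheduled_update_group_action(
    AutoScalingGroupName="web-asg",
    ScheduledActionName="friday-scale-in",
    Recurrence="0 20 * * 5",
    MinSize=2, MaxSize=4, DesiredCapacity=2,
)
```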
Dynamic Scaling
- Reactive in nature
Target Tracking Scaling Policy
Unpredictable workloads and traffic spikes
To keep the average aggregate CPU utilization of your Auto Scaling group at X percent
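A minimal boto3 sketch targeting 50% average CPU; names are placeholders:
```python
import boto3

boto3.client("autoscaling").put_scaling_policy(
    AutoScalingGroupName="web-asg",
    PolicyName="keep-cpu-at-50",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        # Scale out/in automatically to hold average CPU near the target
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 50.0,
    },
)
```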
Predictive Scaling
Regular patterns of traffic increases (business hours) and applications that take a long time to initialize
Potentially save you money on your EC2 bill by helping you avoid the need to overprovision capacity
Cyclical traffic, such as high use of resources during regular business hours and low use of resources during evenings and weekends
Recurring on-and-off workload patterns, such as batch processing, testing, or periodic data analysis
Applications that take a long time to initialize, causing a noticeable latency impact on application performance during scale-out events
Suspend-Resume Feature
Temporarily pause scaling activities
Useful when you are making a change or investigating a configuration issue
RDS
OLTP workloads → RDS
RDS Storage Auto Scaling automatically scales storage capacity in response to growing database workloads, with zero downtime
Using AWS DMS, you can migrate Oracle relational database running in an on-premises data center to Amazon RDS without modifying the application's code
RDS Read Replica
Read replica for read operation, helps improve RDS overall performance (as reads are redirected to read replica)
Read replicas support cross-Region deployment
When you create a read replica
Amazon RDS takes a DB snapshot of your source DB instance and begins replication
If you create multiple read replicas in parallel, only one snapshot is taken, at the start of the first create action
You experience a brief I/O suspension on your source DB instance while the DB snapshot occurs
Points to consider for creating read replica
You must enable automatic backups on the source DB instance by setting the backup retention period to a value other than 0
Long-running transaction can slow the process of creating the read replica
AWS recommend that you wait for long-running transactions to complete before creating a read replica
When to use Read Replica?
For performance improvement of RDS (note Multi-AZ is for DR)
For internal systems request data from the RDS DB instance
Scaling beyond the compute or I/O capacity of a single DB instance for read-heavy database workloads
Serving read traffic while the source DB instance is unavailable (data on the read replica may be "stale")
Business reporting or data warehousing scenarios: You may want business reporting queries to run against a read replica rather than your primary, production DB Instance
Additional RDS Notes
Standby instances (Multi-AZ) are limited to the same Region; read replicas can span multiple Regions
Use AWS Secrets Manager to protect your RDS database credentials with automatic rotation
ElastiCache for Redis to improve RDS DB instance speed/performance
Gaming leaderboard, Top 10 players, real-time score update
To create an encrypted RDS from unencrypted RDS
Take a Snapshot of the RDS instance
Create an encrypted copy of the snapshot
Restore the RDS instance from the encrypted snapshot
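A minimal boto3 sketch of the snapshot-copy-restore sequence; instance and snapshot identifiers are placeholders:
```python
import boto3

rds = boto3.client("rds")

# 1. Snapshot the unencrypted instance
rds.create_db_snapshot(DBInstanceIdentifier="mydb", DBSnapshotIdentifier="mydb-snap")
rds.get_waiter("db_snapshot_available").wait(DBSnapshotIdentifier="mydb-snap")

# 2. Copy the snapshot with a KMS key; the copy is encrypted
rds.copy_db_snapshot(
    SourceDBSnapshotIdentifier="mydb-snap",
    TargetDBSnapshotIdentifier="mydb-snap-encrypted",
    KmsKeyId="alias/aws/rds",
)
rds.get_waiter("db_snapshot_available").wait(DBSnapshotIdentifier="mydb-snap-encrypted")

# 3. Restore a new, encrypted instance from the encrypted snapshot
rds.restore_db_instance_from_db_snapshot(
    DBInstanceIdentifier="mydb-encrypted",
    DBSnapshotIdentifier="mydb-snap-encrypted",
)
```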
Multi AZ RDS Deployment
RPO less than 1 sec
Multi AZ RDS deployment is limited to same region, Cross-Region Multi-AZ isn't supported
RDS HA and DR Metrics
| Feature | RPO (approx.) | RTO (approx.) |
| --- | --- | --- |
| Amazon RDS Multi-AZ | 0 | 1–2 minutes |
| Read replica promotion (in-Region) | Minutes | < 5 minutes |
| PITR (in-Region) using automated backups | 5 minutes | Minutes–hours |
| PITR (cross-Region) using automated backups | 6–20 minutes | Minutes–hours |
| Snapshot restore | Hours | Minutes–hours |
AWS RDS Read Replicas Documentation
AWS RDS FAQ
AWS Aurora
Multi-Region DB
If you need a Multi-AZ DB and reads from the secondary suffer performance issues, Aurora is the choice (read replication latency is less than one second)
Aurora Auto Scaling for read replicas helps with any read replica latency issues
Aurora Global Database (Aurora's DR option)
Provides disaster recovery from Region-wide outages; use for DR between two different AWS Regions
Allows a single Amazon Aurora database to span multiple AWS regions
Replicates your data with no impact on database performance, enables fast local reads with low latency in each region
RPO is 1 second and RTO is less than 1 minute
Aurora Serverless
Automatically starts up, shuts down, and scales capacity up or down based on your application's needs
Run your database in the cloud without managing any database capacity
You can create a database endpoint without specifying the DB instance class size
Useful for infrequently accessed Database (For example, your database usage might be heavy for a short period of time, followed by long periods of light activity or no activity at all.)
AWS Aurora Serverless Documentation
Aurora Read Replicas
Useful to improve performance of primary DB of Amazon Aurora
Offload read workloads from the primary DB instance
Supports only read operations
Each Aurora DB cluster can have up to 15 Aurora Replicas
Maintain high availability by locating Aurora Replicas in separate Availability Zones
Aurora automatically fails over to an Aurora Replica in case the primary DB instance becomes unavailable
AWS RedShift
Data warehousing
Structured and semi-structured data
Complex or complicated analytical queries and joins
Amazon EMR
Perform Big Data analytics
Big Data processing, examples: Apache Spark, Hive, Presto
Lambda
Max runtime 15min
Minimal operational overhead expenditures → Lambda (not EKS or EC2 or ECS)
DynamoDB
Can handle several million queries per second at its peak and respond in milliseconds
If user data is stored as JSON documents, use DynamoDB (not RDS)
Key-value store → DynamoDB
The maximum item size in DynamoDB is 400 KB
On-Demand
Your application traffic is difficult to predict (unpredictable) and control
Flash sale
Your workload has large spikes of short duration, or if your average table utilization is well below the peak
New applications, or applications whose database workload is complex to forecast
Developers working on serverless stacks with pay-per-use pricing
SaaS provider and independent software vendors (ISVs) who want the simplicity and resource isolation of deploying a table per subscriber
VPC Endpoints for DynamoDB
Helps you to connect to DynamoDB within AWS network (VPC)
A route table entry must be created for the (gateway) endpoint
DynamoDB Time to Live (TTL)
- TTL is useful if you store items that lose relevance (delete items) after a specific time
Use cases:
Remove user or sensor data after one year of inactivity in an application
Archive expired items to an Amazon S3 data lake via Amazon DynamoDB Streams and AWS Lambda
Retain sensitive data for a certain amount of time according to contractual or regulatory obligations
Stop monitoring orders one month after they are placed
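A minimal boto3 sketch of TTL for the sensor-data use case; table and attribute names are illustrative:
```python
import time
import boto3

ddb = boto3.client("dynamodb")

# Tell DynamoDB which attribute holds the expiry time (epoch seconds)
ddb.update_time_to_live(
    TableName="SensorData",
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "expires_at"},
)

# The item is deleted automatically (no write capacity consumed) after ~1 year
ddb.put_item(
    TableName="SensorData",
    Item={
        "sensor_id": {"S": "sensor-42"},
        "reading": {"N": "21.5"},
        "expires_at": {"N": str(int(time.time()) + 365 * 24 * 3600)},
    },
)
```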
Secondary Index
Search item using more than one key: value
Tracking ID, or customer ID, or order ID
Minimal operational overhead expenditures → DynamoDB (not RDS)
DynamoDB Accelerator (DAX)
Fully managed, highly available In-memory caching system used in front of DynamoDB
Performance improvement from milliseconds to microseconds for DynamoDB
API Gateway
RESTful services, REST APIs
Minimal operational overhead expenditures → Lambda (not EKS or EC2 or ECS)
Serverless Microservices
For minimal operational overhead, design serverless:
Frontend/Web Layer
- S3 with CloudFront static website hosting
Application Layer
API Gateway and AWS Lambda functions
API Gateway, NLB and AWS Fargate
ALB and AWS ECS
DB Layer
DynamoDB → DB layer to store user data
Aurora → DB layer to store user data
ElastiCache → DB layer to store/cache user data
CloudWatch
To get memory- and disk-related metrics, install the CloudWatch agent
- Example: SwapUtilization
You can configure an Amazon CloudWatch alarm that triggers recovery of an EC2 instance if it becomes impaired (the recover action applies to system status check failures; use the reboot action for instance status check failures)
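A minimal boto3 sketch of such a recovery alarm; instance ID and Region are placeholders:
```python
import boto3

boto3.client("cloudwatch").put_metric_alarm(
    AlarmName="ec2-auto-recover",
    Namespace="AWS/EC2",
    MetricName="StatusCheckFailed_System",  # system check, not instance check
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    Statistic="Maximum",
    Period=60,
    EvaluationPeriods=2,
    Threshold=1,
    ComparisonOperator="GreaterThanOrEqualToThreshold",
    # Built-in recover action: moves the instance to healthy hardware
    AlarmActions=["arn:aws:automate:us-east-1:ec2:recover"],
)
```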
CloudTrail
By default, only Management events are logged and not data events
Additional charges apply for data or Insights events
Protecting CloudTrail Logs
Log to a dedicated and centralized Amazon S3 bucket
Enable CloudTrail log file integrity validation
Encrypting CloudTrail log files with AWS KMS-managed keys (SSE-KMS)
By default, the log files delivered by CloudTrail to your bucket are encrypted by SSE-S3
To provide a security layer that is directly manageable, you can instead use server-side encryption with AWS KMS-managed keys (SSE-KMS) for your CloudTrail log files
CloudTrail Security Best Practices
To share CloudTrail log files between multiple AWS accounts
Create an IAM role for each account that you want to share log files with
For each of these IAM roles, create an access policy that grants read-only access to the account you want to share the log files with
Have an IAM user in each account programmatically assume the appropriate role and retrieve the log files
CloudTrail Log Sharing Documentation
Route 53
A Record
A record value is always an IP address
A record maps your website like example.com to IP address (ex: Elastic IP)
CNAME
CNAME can never be an IP address
CNAME record maps a name to another name
You can't create a CNAME record at the zone apex (e.g., example.com itself)
Example:
An A record for example.com points to the server IP address
A CNAME record for www.example.com points to example.com
Alias
Alias record is an Amazon Route 53-specific virtual record
It works only with Amazon Route 53 (AWS specific resources)
AWS specific resources:
ELB
CloudFront Distribution
Elastic Beanstalk
S3 static websites
Can also point from one record in a hosted zone to another record
AAAA Record
- AAAA record is similar to an A record but it is for IPv6 addresses
MX Record
MX records (Mail Exchange records) are used for setting up email servers
MX records must be mapped correctly to deliver email to your address
Important: A CNAME can't be used for naked/root domain names. Root domain names must be mapped with either an A record or an Alias record (in Route 53).
Route 53 Routing Policies
Simple Routing Policy
- Route traffic to single resource
Failover Routing Policy
- Active-passive failover
Latency Routing Policy
- You have resources in multiple AWS Regions and you want to route traffic to the Region that provides the best latency with less round-trip time
Geolocation Routing Policy
- Route traffic based on the location of your users
Geoproximity Routing Policy
Route traffic based on the location of your resource
Shift traffic from resources in one location to resources in another
Multivalue Answer Routing Policy
You want Route 53 to respond to DNS queries with up to eight healthy records selected at random
Return multiple values for a DNS query and route traffic to multiple IP addresses
Associating a Route 53 health check with records
Weighted Routing Policy
- Route traffic to multiple resources in proportions (based on weight 30%, 60% etc) that you specify
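A minimal boto3 sketch of a 70/30 weighted split; the hosted zone ID, record name, and IPs are placeholders:
```python
import boto3

boto3.client("route53").change_resource_record_sets(
    HostedZoneId="Z0EXAMPLE12345",
    ChangeBatch={"Changes": [
        {"Action": "UPSERT", "ResourceRecordSet": {
            "Name": "app.example.com", "Type": "A",
            "SetIdentifier": "blue", "Weight": 70, "TTL": 60,
            "ResourceRecords": [{"Value": "203.0.113.10"}]}},
        {"Action": "UPSERT", "ResourceRecordSet": {
            "Name": "app.example.com", "Type": "A",
            "SetIdentifier": "green", "Weight": 30, "TTL": 60,
            "ResourceRecords": [{"Value": "203.0.113.20"}]}},
    ]},
)
```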
SQS
- Decouple your architecture
Standard Queues
At-least-once message delivery
Duplicate messages can be delivered
Best effort ordering
Better throughput than FIFO
FIFO Queues
Exactly-once processing
No Duplicate messages
FIFO Order
Lower throughput than standard queues
SQS Temporary Queue Client
Request-Response method, short-lived, lightweight messaging destinations
For when you don't intend to use SQS queues long-term
Leverages virtual queues instead of creating/deleting SQS queues
Additional SQS Notes
- Use SQS FIFO for asynchronous updates to a database, to avoid dropping writes
Deduplication of messages can be enabled by:
Enable content-based deduplication
Explicitly provide the message deduplication ID
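A minimal boto3 sketch of FIFO deduplication; queue name and message contents are illustrative:
```python
import boto3

sqs = boto3.client("sqs")

# FIFO queue names must end in ".fifo"
queue_url = sqs.create_queue(
    QueueName="orders.fifo",
    Attributes={"FifoQueue": "true", "ContentBasedDeduplication": "true"},
)["QueueUrl"]

sqs.send_message(
    QueueUrl=queue_url,
    MessageBody='{"order_id": "o-123", "action": "update"}',
    MessageGroupId="o-123",  # ordering scope
    # MessageDeduplicationId="o-123-v1",  # explicit ID; optional when
    #                                     # content-based deduplication is on
)
```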
Priority: Use separate queues (both can be standard queues) to prioritize work; the EC2 consumers implement the prioritization (e.g., poll the high-priority queue first)
Amazon Kinesis
Collect, process, and analyze real-time, streaming data
Real-time data processing
Clickstream data processing
It is fully managed, highly scalable
Default retention is 24 hours but can be extended to 7 days (useful when the destination, e.g., S3, is not getting all the data from Kinesis)
Kinesis Video Streams
- Securely stream video from connected devices to AWS for analytics, machine learning (ML), and other processing
Kinesis Data Streams
- Real-time data streaming service that can continuously capture gigabytes of data per second from hundreds of thousands of sources
Kinesis Data Firehose
- Capture, transform, and load data streams into AWS data stores for near real-time analytics
Amazon Kinesis Data Analytics
- Process data streams in real time with SQL or Apache Flink without having to learn new languages or change existing applications
SNS
Fanout Scenario
- Message published (S3 event notification) to an SNS topic is replicated and pushed to multiple endpoints, such as Kinesis Data Firehose delivery streams, Amazon SQS queues, HTTP(S) endpoints, and Lambda functions. This allows for parallel asynchronous processing
Example: Event-based strategy to run the multiple programs in parallel
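A minimal boto3 sketch of fanout to two SQS queues; topic, queue ARNs, and the message are placeholders (each queue also needs an access policy allowing this topic to send to it):
```python
import boto3

sns = boto3.client("sns")
topic_arn = sns.create_topic(Name="order-events")["TopicArn"]

# Subscribe multiple queues to the same topic
for queue_arn in [
    "arn:aws:sqs:us-east-1:111111111111:billing-queue",
    "arn:aws:sqs:us-east-1:111111111111:shipping-queue",
]:
    sns.subscribe(TopicArn=topic_arn, Protocol="sqs", Endpoint=queue_arn)

# One publish; every subscriber processes the message in parallel
sns.publish(TopicArn=topic_arn, Message='{"event": "ORDER_PLACED", "id": "o-123"}')
```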
SES
Send mail from within any application
Send email securely, globally, and at scale
Use Cases
Transactional emails (purchase confirmations or password resets)
Marketing emails (promotions, special offers and newsletters)
Mass email communications (notifications and announcements)
AWS Secrets Manager
Automatic Key Rotation possible
Protects your RDS database credentials with automatic rotation
If an application on EC2 or a Lambda function needs to retrieve credentials, the best choice is AWS Secrets Manager
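A minimal boto3 sketch of retrieving rotated credentials at runtime; the secret name and its JSON keys are assumptions:
```python
import json
import boto3

secret_string = boto3.client("secretsmanager").get_secret_value(
    SecretId="prod/rds/app-credentials"  # placeholder secret name
)["SecretString"]

creds = json.loads(secret_string)
db_user, db_password = creds["username"], creds["password"]
```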
AWS Inspector
Automated vulnerability management service that continually scans EC2 and container workloads for software vulnerabilities and unintended network exposure
AWS Inspector is specific to EC2 and Container workloads
Provides Automated Security Assessments for EC2 instances
Requires agent installation on EC2 for Host (vulnerability assessment/best practices) OR can do Network Assessment for EC2 without installing agent
AWS GuardDuty
Threat detection service that continuously monitors your AWS accounts and workloads for malicious activity
It uses Machine Learning, anomaly detection
Can protect against cryptocurrency attacks
Aim is to analyze logs:
CloudTrail Logs: unusual API calls, unauthorized deployments
VPC Flow Logs: unusual internal traffic, unusual IP address
DNS Logs: compromised EC2 instances sending encoded data within DNS queries
AWS Macie
Macie helps identify and alert you to sensitive data, such as personally identifiable information (PII)
AWS Macie is specific to S3
AWS Shield
- Protects against DDoS attacks
AWS WAF
AWS Web Application Firewall (WAF) protect web applications and APIs from attacks
AWS WAF is your first line of defense against web exploits
Use AWS WAF to protect your API Gateway APIs
Protect from SQL injection, Cross-site scripting (XSS)
Protect against HTTP flooding attacks
Use AWS WAF to allow or block requests from embargoed countries
Important Notes
AWS WAF rules are evaluated before other access control features, such as resource policies, IAM policies, Lambda authorizers, and Amazon Cognito authorizers
WAF can be integrated with Application Load Balancer (ALB) (and NOT NLB)
You can deploy AWS WAF on:
CloudFront
Application Load Balancer
API Gateway
AWS AppSync
AWS WAF with API Gateway Documentation
VPC
Expanding the VPC's IP Address Capacity
- It's NOT possible to change/modify the IP address range of an existing VPC or subnet
You can do one of the following:
Add an additional IPv4 CIDR block as a secondary CIDR to your VPC
Create a new VPC with your preferred CIDR block and then migrate the resources from your old VPC to the new VPC (if applicable)
Additional Notes:
You cannot disable IPv4 support for your VPC and subnet
You can have both IPv4 and IPv6, but not just IPv6 in your VPC
VPC Sharing
- Allows multiple AWS accounts (within Same AWS Organization) to create their application resources, such as EC2, RDS, Redshift clusters, and Lambda functions, into shared, centrally-managed virtual private clouds (VPCs)
Use case:
- EC2 from "Test Account" want to access Redshift cluster in "Prod Account"
VPC Flow Logs
Capture information about the IP traffic going to and from network interfaces in your VPC
VPC Flow log data can be published to:
Amazon CloudWatch Logs
Amazon S3
Flow logs can be used for:
Monitoring the traffic that is reaching your instance
Diagnosing overly restrictive security group rules
Determining the direction of the traffic to and from the network interfaces
NAT Gateway
NAT Gateway is resilient within a single-AZ (loss of AZ is loss of NAT Gateway)
Must create multiple NAT Gateway in multiple AZ for fault-tolerance
Launched in a public subnet; instances in private subnets can use it (with routes added) to reach the internet
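A minimal boto3 sketch of wiring this up; subnet and route table IDs are placeholders:
```python
import boto3

ec2 = boto3.client("ec2")

# NAT Gateway lives in a public subnet and needs an Elastic IP
eip = ec2.allocate_address(Domain="vpc")
nat = ec2.create_nat_gateway(
    SubnetId="subnet-0aaa1111bbbb2222c",  # public subnet
    AllocationId=eip["AllocationId"],
)
nat_id = nat["NatGateway"]["NatGatewayId"]
ec2.get_waiter("nat_gateway_available").wait(NatGatewayIds=[nat_id])

# Private route table sends internet-bound traffic to the NAT Gateway
ec2.create_route(
    RouteTableId="rtb-0ccc3333dddd4444e",  # private subnet's route table
    DestinationCidrBlock="0.0.0.0/0",
    NatGatewayId=nat_id,
)
```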
Site-to-Site VPN Connections
A Site-to-Site VPN connection can be established almost immediately
A Site-to-Site VPN connection is cheaper (compared to AWS Direct Connect)
A single VPN tunnel still has a maximum throughput of 1.25 Gbps
Use AWS Transit Gateway to scale an AWS Site-to-Site VPN throughput beyond a single IPsec tunnel's maximum limit of 1.25 Gbps limit
To resolve slower VPN connection, use a transit gateway with equal cost multipath routing and add additional VPN tunnels
Transit Gateway enables you to scale the IPsec VPN throughput with equal cost multi-path (ECMP) routing support over multiple VPN tunnels
Scaling VPN Throughput with Transit Gateway
AWS Direct Connect
- Data transfer pricing over Direct Connect is lower than data transfer pricing over the internet
Maximum Resiliency (as much resiliency as possible) for critical workloads
- Separate connections terminating on separate devices in more than one location
Resilience to:
Device failure
Connectivity failure
Complete location failure
High Resiliency for Critical Workloads
- One connection at multiple locations
Resilience to:
Device failure
Connectivity failure due to a fiber cut
Complete location failure
AWS Direct Connect Resiliency Recommendations
AWS Direct Connect Disaster Recovery
Multiple VPC Connection
- To connect multiple VPCs (Prod, Dev, Test) to on-premises while avoiding resource sharing among the connected networks, create an AWS Direct Connect connection plus a VPN connection for each VPC back to the data center. You cannot use a Transit Gateway here because resource sharing between the VPCs must be avoided.
Transit Gateway VPC Connection Documentation
AWS Organizations
You can use the aws:PrincipalOrgID condition key in your resource-based policies (e.g., S3 bucket policies) to easily restrict access to IAM principals from accounts in your AWS Organization.
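A sketch of a bucket policy using that condition key; bucket name and organization ID are placeholders:
```python
import json
import boto3

# Deny everyone except principals from accounts inside the organization
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "AllowOrgMembersOnly",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [
            "arn:aws:s3:::shared-org-bucket",
            "arn:aws:s3:::shared-org-bucket/*",
        ],
        "Condition": {"StringNotEquals": {"aws:PrincipalOrgID": "o-exampleorgid"}},
    }],
}
boto3.client("s3").put_bucket_policy(
    Bucket="shared-org-bucket", Policy=json.dumps(policy)
)
```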