Serverless Design for URL Shortening Service
1. Overview
1.1 What is a URL shortening Service?
It is a service that provides short aliases for long URLs. Instead of sharing a lengthy URL with your customers or peers, you generate and share a short URL. When someone clicks the short URL, they are redirected to the actual URL.
For example, given the long URL https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-setup-api-key-with-console.html, the URL shortening service returns the short URL https://tinyurl.com/5xcEdf56
Services that offer this functionality include TinyURL and Bitly.
2. Requirements:
2.1 Functional Requirements
The service should create a short URL for a given long URL.
Visiting a short URL should redirect the user to its original long URL.
The service should allow users to generate custom short URLs.
The service should allow users to set a TTL (time to live) for the short URL.
The service should allow users to delete a short URL.
2.2 Non-Functional Requirements
The service should be highly available.
The service should be fast and reliable.
The service should expose REST APIs so that it can be integrated with third-party applications.
The service should be able to track usage at the client level and have the flexibility to enable tier-based subscriptions.
3. Assumptions
For every short URL created, there can be 100 reads, i.e. a 1:100 write-to-read ratio.
The system generates 100 million URLs per day.
Write operations = 100 million / 24 hours / 3600 seconds ≈ 1160 writes per second.
Read operations = 100 million * 100 / 24 hours / 3600 seconds ≈ 116,000 reads per second.
Assuming the service runs for 100 years, the total number of records generated by the system would be 100 million * 365 days * 100 years = 3650 billion records.
Assuming a record size of 100 bytes, the total storage requirement would be 3650 billion records * 100 bytes = 365 TB.
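The estimates in this section can be reproduced with a few lines of arithmetic. The sketch below is only a calculator for the stated assumptions; the function and parameter names are illustrative and not part of the design.

```python
SECONDS_PER_DAY = 24 * 3600


def estimate_capacity(urls_per_day, reads_per_write, years, record_bytes):
    """Back-of-the-envelope numbers for the assumptions in this section."""
    writes_per_second = urls_per_day / SECONDS_PER_DAY
    reads_per_second = writes_per_second * reads_per_write
    total_records = urls_per_day * 365 * years
    storage_tb = total_records * record_bytes / 10**12  # 1 TB = 10^12 bytes
    return {
        "writes_per_second": writes_per_second,
        "reads_per_second": reads_per_second,
        "total_records": total_records,
        "storage_tb": storage_tb,
    }


estimate = estimate_capacity(
    urls_per_day=100_000_000, reads_per_write=100, years=100, record_bytes=100
)
for name, value in estimate.items():
    print(f"{name}: {value:,.0f}")
```

With these inputs the write rate comes to roughly 1,157 per second and the read rate to roughly 115,700 per second.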
4. High-Level Design
4.1 High-level flow
The following diagram depicts the high-level flow of URL shortening service
4.1.1 Shortening Request
The user triggers the short URL creation request by providing the Long URL from the browser.
The browser sends the request to the URL Shortening Service
URL Shortening Service creates the short URL for the input URL and stores the data in the Datastore.
URL Shortening Service returns the short URL to the User.
4.1.2 Redirection Request
The user enters the short URL in the browser.
The browser sends the request to the URL Shortening Service.
URL Shortening Service queries the Datastore to fetch the actual URL corresponding to the short URL.
URL Shortening Service sends the redirect URL to the browser.
4.2 APIs
The service exposes its functionality through the following REST APIs:
Shortening the URL
Redirecting the URL
Deleting the URL
4.2.1 Shortening the URL
API request definition for creating the short URL
createShortUrl(actualURL, customURL, ttl)
// actualURL - Required Field. URL which needs to be shortened.
// customURL - Optional Field. Custom short URL.
// ttl - Optional Field. Time in seconds for which the short URL needs to be active.
Return Value: shortURL
4.2.2 Redirecting the URL
API request definition for redirecting the URL
redirectURL(url)
// url - Required Field. Shortened URL for which the actual URL needs to be fetched from the Datastore.
Returns an HTTP redirect response with HTTP code 302.
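As a rough illustration of the redirect response, the sketch below shows a handler that returns a 302 in the shape used by API Gateway's Lambda proxy integration (statusCode, headers, body). The in-memory URL_TABLE, the handler name, and the "shortUrl" path parameter are assumptions made for this example; a real handler would query the Datastore instead.

```python
# Hypothetical in-memory stand-in for the Datastore lookup.
URL_TABLE = {
    "5xcEdf56": "https://docs.aws.amazon.com/apigateway/latest/developerguide/"
    "api-gateway-setup-api-key-with-console.html",
}


def redirect_url_handler(event, context=None):
    """Look up the short key and answer with an HTTP 302 redirect."""
    # "shortUrl" is an assumed path parameter name, e.g. GET /{shortUrl}.
    path_params = event.get("pathParameters") or {}
    short_key = path_params.get("shortUrl")
    long_url = URL_TABLE.get(short_key)
    if long_url is None:
        return {"statusCode": 404, "body": "Short URL not found"}
    # The browser follows the Location header to the original URL.
    return {"statusCode": 302, "headers": {"Location": long_url}, "body": ""}


response = redirect_url_handler({"pathParameters": {"shortUrl": "5xcEdf56"}})
print(response["statusCode"], response["headers"]["Location"])
```

Returning 404 for an unknown key is a choice made for this sketch; the API definition above does not specify the error case.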
4.2.3 Deleting the URL
API request definition for deleting the URL
deleteURL(url)
// url - Required Field. Shortened URL which needs to be deleted from the Datastore.
4.3 Design Diagram
4.3.1 Shortening URL request:
The user client (e.g. a browser) calls the URL Shortening service API gateway with the user’s API key to create the short URL.
API gateway triggers the ShortURLHandler Lambda to create the short URL for the actual URL in the request.
ShortURLHandler Lambda stores the URL mapping with other request params in the Dynamo DB and returns the short URL to the client.
4.3.2 Redirect URL request:
Browser calls the URL Shortening service API gateway to get the actual URL.
API gateway triggers the RedirectURLHandler Lambda to get the actual URL for the short URL in the request.
RedirectURLHandler Lambda queries the Dynamo DB with a short URL to fetch the actual URL and returns the actual URL to the client.
4.3.3 Delete URL request:
The user client calls the URL Shortening service API gateway to delete the short URL with the user’s API key.
API gateway triggers the DeleteURLHandler Lambda which deletes the short URL from the Dynamo DB.
4.4 Components
4.4.1 Short URL generator
Creating the short URL involves two techniques, URL encoding and key generation. The options considered are:
URL encoding
Base62 (Preferred)
MD5
Key Generation
4.4.1.1 URL Encoding through base62
A base is the set of digits or characters used to represent a number. Base 10 uses the digits [0–9], which we use in everyday life, and Base 62 uses [0–9][a-z][A-Z].
As per the assumptions made for the system, it should be able to handle 3650 billion records, which in turn means 3650 billion unique short URLs. The number of unique URLs that can be generated for different URL lengths is:
URL Length | Unique Records |
5 | ~916 million |
6 | ~56 billion |
7 | ~3500 billion |
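A URL of length 7 comes close to our estimate of 3650 billion unique short URLs (62^7 is about 3520 billion, roughly 96% of the estimate), so we use a length of 7. Covering the estimate in full would take a length of 8, which allows about 218 trillion unique URLs.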
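The capacity for each URL length is simply 62 raised to that length, which can be checked directly. The helper name below is illustrative.

```python
def base62_capacity(length):
    """Number of distinct base62 strings of exactly `length` characters."""
    return 62 ** length


REQUIRED_RECORDS = 3_650_000_000_000  # 3650 billion, from the assumptions

for length in (5, 6, 7, 8):
    capacity = base62_capacity(length)
    covers = "yes" if capacity >= REQUIRED_RECORDS else "no"
    print(f"length {length}: {capacity:,} keys, covers estimate: {covers}")
```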
Possible options to create Base 62:
Create Short URLs from random numbers: Generate a random number and convert it to base62. As more records get added to the Datastore the chances of collision increase.
Using a Counter or Key generation technique: This is commonly used for server-based services. It uses a centralized key generation service or ZooKeeper to distribute keys to each server, thereby avoiding collisions. See more under the Key Generation section.
4.4.1.2 URL encoding through MD5
The MD5 message-digest algorithm is a widely used hash function producing a 128-bit hash value (or 32 hexadecimal digits). We can take 7 characters from these 32 hexadecimal digits to form the short URL. This approach has a higher chance of collision, because the first 7 characters of the generated hash can be the same for multiple URLs, which in turn increases the number of Datastore queries.
Steps
Hash the long URL using MD5 and take the first 7 characters.
The first 7 characters could be the same for different long URLs, so query the Datastore to check for a collision.
Store the generated URL if it is not already present in the Datastore. If the short URL is already present, try the next 7 characters and so on.
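The steps above can be sketched as follows. This is a minimal illustration, not the service's implementation: a plain dict stands in for the Datastore, the function name is hypothetical, and "the next 7 characters" is interpreted here as sliding the 7-character window forward by one position, which is an assumption.

```python
import hashlib


def md5_short_key(long_url, existing, length=7):
    """Pick a 7-character key from the MD5 hex digest, skipping taken keys.

    `existing` is a dict standing in for the Datastore (short key -> long URL).
    """
    digest = hashlib.md5(long_url.encode("utf-8")).hexdigest()  # 32 hex chars
    for start in range(len(digest) - length + 1):
        candidate = digest[start:start + length]
        owner = existing.get(candidate)  # the "query the Datastore" step
        if owner is None or owner == long_url:
            existing[candidate] = long_url
            return candidate
    raise RuntimeError("every 7-character window of the digest is taken")


store = {}
print(md5_short_key("https://example.com/some/long/path", store))
```

Note that each lookup in the loop corresponds to one Datastore query, which is the cost the section above refers to.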
4.4.1.3 Key Generation
To create a unique short URL in the serverless system with minimal collisions, we will use timestamps as the key generator.
The epoch time in milliseconds, for example 1713585359347, can be converted to the base62 value UASDXFr, which can be used as the unique key for the short URL. In case of a conflict, we can append the first letter of the API gateway request ID or of the user’s apiKey. For example, suppose two requests arrive at the same timestamp 1713585359347, with base62 value UASDXFr, and the user's API key is AIzaSyDaGmWKa4JsXZ-HjGw7ISLn_3namBGewQe. To resolve the conflict while writing to Dynamo DB, we append the first letter A from the API key, thereby creating the short URL UASDXFrA.
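A minimal sketch of this key generation scheme is shown below. The digit ordering 0-9, A-Z, a-z is an assumption, chosen because it reproduces the UASDXFr value in the example; the function names and the set standing in for Dynamo DB are hypothetical.

```python
import time

# Assumed digit ordering: 0-9, then A-Z, then a-z.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def base62_encode(number):
    """Convert a non-negative integer to its base62 string."""
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number > 0:
        number, remainder = divmod(number, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def generate_short_key(existing_keys, api_key, epoch_ms=None):
    """Timestamp-based key; on conflict append the API key's first letter."""
    if epoch_ms is None:
        epoch_ms = int(time.time() * 1000)
    key = base62_encode(epoch_ms)
    if key in existing_keys:
        key += api_key[0]
    if key in existing_keys:
        # The scheme described above does not cover a second conflict.
        raise RuntimeError("short key conflict could not be resolved")
    existing_keys.add(key)
    return key


keys = set()
api_key = "AIzaSyDaGmWKa4JsXZ-HjGw7ISLn_3namBGewQe"
print(generate_short_key(keys, api_key, epoch_ms=1713585359347))  # UASDXFr
print(generate_short_key(keys, api_key, epoch_ms=1713585359347))  # UASDXFrA
```

Two requests from the same API key in the same millisecond after a first conflict would still collide, so a production version would need a further fallback.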
4.4.2 Database
4.4.2.1 Data Access Patterns
- Fetch the actual Long URL with the short URL.
We will go with Dynamo DB as the data store. Following is the DB schema:
Field | Type | Description |
shortURL | partitionKey | Shortened URL to which the actual long URL is mapped. |
longURL | | URL which was shortened. |
ttl | | Time in seconds the shortURL needs to be active. We can use the Dynamo DB TTL functionality: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/TTL.html |
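One detail worth illustrating: Dynamo DB TTL compares the TTL attribute against the current time as a Unix epoch timestamp in seconds, so a duration supplied by the user has to be converted into an expiry time before the item is stored. The sketch below builds such an item as a plain dict; the function name and the `now` parameter are illustrative assumptions.

```python
import time


def build_url_item(short_url, long_url, ttl_seconds=None, now=None):
    """Build the item to store, converting a TTL duration to an expiry time."""
    now = int(time.time()) if now is None else now
    item = {"shortURL": short_url, "longURL": long_url}
    if ttl_seconds is not None:
        # Dynamo DB TTL expects an absolute expiry time in epoch seconds.
        item["ttl"] = now + ttl_seconds
    return item


print(build_url_item("UASDXFr", "https://example.com/long", ttl_seconds=3600))
```

When writing the item with boto3, one option would be put_item with a ConditionExpression of attribute_not_exists(shortURL), so that a conflicting short URL is rejected rather than overwritten.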
4.5 Monitoring
The URL shortening service uses Amazon CloudWatch for monitoring system performance and business metrics.
Some of the success metrics to track:
Number of short URLs generated per day
Number of redirections served by the system per day
Number of unique users using the short URL service
4.6 Availability/Load balancer
Availability and load balancing are handled internally by AWS and do not require any infrastructure setup. The AWS SLA documentation states that AWS supports a monthly uptime of at least 99.95% for each AWS region.
4.7 Rate Limiting
Rate limiting or throttling can be set on AWS API gateway. Following are some of the basic throttling settings:
AWS throttling limits are applied across all accounts and clients in a region. These are set by AWS and the client cannot change this configuration.
Per-account limits are applied to all APIs in an account in a specified Region. The limits can be updated by contacting AWS customer support.
Per-API, per-stage throttling limits are applied at the API method level for a stage.
Per-client throttling limits are applied to clients that use API keys associated with your usage plan as the client identifier.
Reference: https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-request-throttling.html
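API Gateway's throttling is based on the token bucket algorithm, configured with a steady-state rate and a burst size. The sketch below illustrates the general technique only and is not AWS's implementation; the class and its parameters are hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
class TokenBucket:
    """Simple token bucket: `rate` tokens per second, at most `burst` stored."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.capacity = burst
        self.tokens = float(burst)  # start with a full bucket
        self.updated = 0.0

    def allow(self, now):
        """Return True if a request arriving at time `now` (seconds) may pass."""
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # the caller would answer with HTTP 429


bucket = TokenBucket(rate=1, burst=2)
print([bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)])
```

Per-client limits would correspond to keeping one such bucket per API key.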
Conclusion
The design can be easily extended to:
enable a tier-based subscription model in which users log in to the system and their requests are rate limited based on their subscription.
support custom short URLs.
Written by Aparna Vikraman