How to Create a URL Shortener with Custom Analytics using AWS Services

TL;DR: This article covers how to implement a basic URL shortener API with custom metrics. The sample project implements a serverless solution, deployed with AWS CDK, that relies on API Gateway, DynamoDB, Lambda, and custom CloudWatch metrics.

Introduction

Have you ever wondered how URL shorteners work under the hood? Or, even better, how to implement a similar solution that allows you to create custom analytics based on user behavior and types?

In this article, we’ll discuss what is needed to implement such features and provide a step-by-step guide (with a complete sample repository) for deploying them on your account.

Building the API

Overview

The API will have two different endpoints:

  • A private endpoint to allow for redirect configurations to be created

  • A public endpoint to redirect users to the desired URL

Diagram illustrating a URL redirect architecture using AWS services. Route 53 directs to API Gateway, which handles requests with two Lambda functions: CreateRedirectURL Lambda and RedirectURL Lambda. Data is logged to CloudWatch, and URL data is stored in a Redirects Table. An API key is used for authentication.

As the infrastructure diagram shows, the resources required to build the API are straightforward.

To build the API we took advantage of serverless AWS services, such as API Gateway, Lambdas, and DynamoDB.

The infrastructure was designed as a proof of concept, but one could harden it by adding IAM authentication, CloudFront, or WAF in front of the API.

Custom Domain with Route53

A custom Domain and an already configured Hosted Zone are prerequisites to deploying the provided CDK project, but one could easily remove that configuration and test the infrastructure with the default API Gateway domain.

Diagram showing a connection between Route 53 and an API Gateway. The text "Hosted Zone, Domain Certificate, A Record" is in between.

As part of the project, we use Route53 to configure a custom domain for our API Gateway, since the default API Gateway domain is too long to work as a “URL Shortener” in most scenarios.

The CDK project expects the Hosted Zone to be already configured, but it will automatically generate the Domain Certificate and A Record for the new API endpoints to allow for HTTPS traffic.

Redirects DynamoDB Table

A Single DynamoDB Table will be used to store the required redirect configurations.

Interface displaying database model details for "ShortenedUrls." Includes sections for primary key attributes with "urlId" as a string partition key and other attributes like "ttl," "originalURL," and "ttlInSeconds" with their types and sample data formats.

The Data model used by the sample project is straightforward, storing only a few attributes:

  • urlId - Used as the table's primary key (partition key) and as the path of the shortened URL. The redirect URLs are generated as https://link.<your_url>/<urlId>

  • ttl - Timestamp used to identify when the redirect record should be removed from the DynamoDB table

  • originalURL - The actual URL to which incoming requests should be redirected

  • ttlInSeconds - Number of seconds the record should live, as requested when it was created

The DynamoDB table is also configured with the pay-per-request pricing model, with time to live enabled on the ttl attribute, and to be deleted once the stack is removed.
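To make the relationship between ttl and ttlInSeconds concrete, here is a minimal sketch of how a record could be assembled. The interface mirrors the attributes listed above; the helper name buildRedirectRecord and the choice to pass the current time as a parameter are assumptions for illustration, not necessarily how the sample repository does it. DynamoDB's time to live feature expects the expiry as a Unix epoch timestamp in seconds.

```typescript
// Shape of one item in the redirects table, mirroring the attributes listed above.
interface RedirectRecord {
  urlId: string; // partition key, also the path of the short URL
  originalURL: string; // destination of the redirect
  ttlInSeconds: number; // lifetime requested when the record was created
  ttl: number; // absolute expiry as Unix epoch seconds
}

// Hypothetical helper: derives the absolute expiry from the requested lifetime.
function buildRedirectRecord(
  urlId: string,
  originalURL: string,
  ttlInSeconds: number,
  nowMs: number = Date.now(),
): RedirectRecord {
  const nowSeconds = Math.floor(nowMs / 1000);
  return { urlId, originalURL, ttlInSeconds, ttl: nowSeconds + ttlInSeconds };
}

const record = buildRedirectRecord(
  "example",
  "https://lhidalgo.dev/url-shortener-custom-metrics",
  360000,
  1700000000000, // fixed clock so the result is reproducible
);
console.log(record.ttl); // 1700360000
```

Keep in mind that DynamoDB removes expired items in the background rather than at the exact expiry time, so a record can still be returned for a while after its ttl has passed.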

Create URL endpoint

The Create URL endpoint will be the one responsible for creating and saving redirect configurations.

Flowchart showing user interaction with an API Gateway, protected by an API key. The user sends a PUT request, which goes through the gateway to a Lambda function named "CreateRedirect URL Lambda," which then updates a Redirects Table.

The configuration for this endpoint is straightforward: it is triggered by any POST request received on the configured domain that provides a given urlId, such as https://link.<your_url>/<urlId>.

This endpoint is private and requires an API Key to be called. A JSON schema is also configured so that API Gateway can validate any incoming request body.

Having API Gateway validate incoming requests allows for faster response times (in case of errors) and avoids invoking the Lambda function for invalid requests.
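As an illustration, a request model for this endpoint could look like the sketch below. API Gateway models are written in JSON Schema draft-04; the title, the pattern, the minimum, and the decision to mark both fields as required are assumptions made for this example and may differ from the schema in the sample repository.

```typescript
// Hypothetical request model for the Create URL endpoint.
// The field names come from the data model; the constraints are assumptions.
const createRedirectRequestSchema = {
  $schema: "http://json-schema.org/draft-04/schema#",
  title: "CreateRedirectRequest",
  type: "object",
  required: ["originalURL", "ttlInSeconds"],
  additionalProperties: false,
  properties: {
    // Only accept absolute http(s) URLs as redirect targets.
    originalURL: { type: "string", pattern: "^https?://.+" },
    // Lifetime of the redirect, at least one second.
    ttlInSeconds: { type: "integer", minimum: 1 },
  },
};

console.log(JSON.stringify(createRedirectRequestSchema, null, 2));
```

With a model like this attached to the method, a body without originalURL would be rejected by API Gateway before the Lambda function is ever invoked.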

The CreateRedirectURL Lambda in the sample project implements a basic approach, where it will:

  1. Ensure the received input is correct

  2. Build and send a conditional PUT request to DynamoDB, with a condition expression that prevents overwriting an existing record with the same primary key

  3. Based on the response:

    1. DynamoDB operation succeeded —> Build the success response and return the newly created redirect URL

    2. Operation failed —> Map the error and build the error response
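The steps above can be sketched as two small pure functions: one that builds the conditional put and one that maps a failure to a response. This is a sketch under assumptions: the function names and the 409/500 status codes are hypothetical, and the input shape is the subset of fields that PutCommand from @aws-sdk/lib-dynamodb accepts. The attribute_not_exists condition and the ConditionalCheckFailedException error name are standard DynamoDB behavior.

```typescript
// Subset of the input accepted by PutCommand from @aws-sdk/lib-dynamodb.
interface ConditionalPutInput {
  TableName: string;
  Item: Record<string, string | number>;
  ConditionExpression: string;
}

// Builds a put that only succeeds when no item with the same urlId exists yet.
function buildCreateRedirectPut(
  tableName: string,
  item: { urlId: string; originalURL: string; ttlInSeconds: number; ttl: number },
): ConditionalPutInput {
  return {
    TableName: tableName,
    Item: { ...item },
    ConditionExpression: "attribute_not_exists(urlId)",
  };
}

// Maps a failed DynamoDB call to an HTTP response. DynamoDB reports a failed
// condition with the error name "ConditionalCheckFailedException".
// The status codes chosen here are assumptions.
function mapPutError(error: { name?: string }): { statusCode: number; body: string } {
  if (error.name === "ConditionalCheckFailedException") {
    return { statusCode: 409, body: JSON.stringify({ message: "urlId already in use" }) };
  }
  return { statusCode: 500, body: JSON.stringify({ message: "Internal error" }) };
}
```

Inside the Lambda, the built input would be passed to the DynamoDB document client, for example client.send(new PutCommand(input)), with mapPutError applied in the catch block.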

As future improvements, one could implement more validations, link the created records to a given user / API Key, or even add some custom metrics to, for example, record how many redirects are created for given domains.

Redirection endpoint

The redirect URL endpoint is public and is the one responsible for creating the metrics and redirecting the user to the appropriate URLs.

Architecture diagram illustrating a user sending a GET request to an API Gateway public endpoint, which connects to a RedirectURL Lambda function. The function interacts with a Redirects Table to retrieve data and sends metrics to CloudWatch.

This endpoint is configured to be triggered for every GET request received on the configured domain that provides a given urlId - such as https://link.<your_domain>/<urlId>.

The responsibilities of this Lambda are:

  1. Validate the incoming request to ensure a valid path parameter has been provided

  2. Retrieve the redirect configuration from DynamoDB based on the provided urlId

  3. Inspect the request headers and DynamoDB response to generate and store the custom metrics in CloudWatch

  4. Based on the DynamoDB response:

    1. Redirect record found -> Return the success response with the redirect configuration

    2. No record found or DynamoDB request failed -> Map and return the appropriate error response
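The decision logic in steps 1 and 4 can be sketched without any AWS calls by modelling the DynamoDB lookup as a small result type. Everything named here (LookupResult, isValidUrlId, toOutcome, the urlId pattern, and the 404/500 status codes) is an assumption for illustration rather than the sample project's actual code.

```typescript
// Possible results of the DynamoDB lookup (step 2), modelled as plain data.
type LookupResult =
  | { kind: "found"; originalURL: string }
  | { kind: "missing" }
  | { kind: "failed" };

interface RedirectOutcome {
  success: boolean; // later reported with the custom metric
  response: { statusCode: number; headers?: Record<string, string>; body?: string };
}

// Assumed urlId format: 1 to 64 URL-safe characters.
const URL_ID_PATTERN = /^[A-Za-z0-9_-]{1,64}$/;

// Step 1: reject requests with a missing or malformed path parameter.
function isValidUrlId(urlId: string | undefined): urlId is string {
  return typeof urlId === "string" && URL_ID_PATTERN.test(urlId);
}

// Step 4: turn the lookup result into an HTTP response.
function toOutcome(lookup: LookupResult): RedirectOutcome {
  switch (lookup.kind) {
    case "found":
      return { success: true, response: { statusCode: 302, headers: { Location: lookup.originalURL } } };
    case "missing":
      return { success: false, response: { statusCode: 404, body: JSON.stringify({ message: "Redirect not found" }) } };
    case "failed":
      return { success: false, response: { statusCode: 500, body: JSON.stringify({ message: "Internal error" }) } };
  }
}
```

Keeping this mapping free of side effects means the handler only has to perform the lookup, emit the metric using the success flag, and return the response.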

The “magic” of this Lambda is how the success response is built, which allows browsers to redirect users to the provided URL. For example:

{
    "statusCode": 302,
    "headers": {
        "Location": "https://lhidalgo.dev/url-shortener-custom-metrics"
    }
}

The redirection works thanks to the returned status code: a 3XX status tells the browser that the requested resource lives at another URL, which is provided in the Location header. For our scenario, the relevant 3XX status codes are:

  • 301 -> Tells the browser that the URL has been moved permanently; the browser might use the cached response on subsequent requests instead of calling the short URL again.

  • 302 -> Tells the browser that the resource has been moved temporarily; by default, the browser will call the original (short) URL again on subsequent requests.

Since we want to track how many times a given URL is requested and create metrics based on the request information, the sample project uses the 302 status code. For other scenarios, where metrics are not implemented or only unique user requests are relevant, using the 301 status code is recommended to reduce the number of requests received.
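The choice between the two status codes can be captured in a small helper. This is only a sketch: the function name and the trackEveryRequest flag are hypothetical and exist purely to make the trade-off explicit.

```typescript
interface RedirectResponse {
  statusCode: 301 | 302;
  headers: { Location: string };
}

// trackEveryRequest = true  -> 302, browsers keep calling the short URL.
// trackEveryRequest = false -> 301, browsers may cache the redirect.
function buildRedirectResponse(
  originalURL: string,
  trackEveryRequest: boolean = true,
): RedirectResponse {
  return {
    statusCode: trackEveryRequest ? 302 : 301,
    headers: { Location: originalURL },
  };
}

console.log(buildRedirectResponse("https://lhidalgo.dev/url-shortener-custom-metrics").statusCode); // 302
```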

Custom Metrics

Overview

AWS CloudWatch will by default record some basic usage and performance metrics of deployed services, such as Lambda Functions, where metrics such as Invocation Count, Duration, and Error vs. Success Rate are automatically collected.

Apart from the default metrics, CloudWatch also allows you to create custom metrics to track specific aspects of your application.

When working with custom metrics, one should be aware of the format in which they are stored. A quick overview of the most relevant aspects would be:

  • Namespace -> a category under which multiple metrics can be created, especially useful for grouping related metrics or separating them by service or feature.

  • Metric Name -> the name that identifies a specific metric, for example RedirectRequest

  • Unit & Value -> the unit tells CloudWatch what kind of quantity is being recorded (for example Count), and the value is the recorded data point itself

  • Dimensions -> Key-Value pairs that attach additional context to the given metric.

Developers should be very mindful when implementing custom metrics, as the cost can escalate quickly. When using Dimensions, every unique combination of metric name and dimension values counts as a separate metric.
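To see how quickly the metric count grows, the following sketch counts the distinct metrics produced by a list of data points within a single namespace. The Datapoint type and both helper functions are hypothetical; they only model the counting rule described above, where the order of the dimensions does not matter but every different value does.

```typescript
interface Datapoint {
  metricName: string;
  dimensions: Record<string, string>;
}

// A metric is identified by its name plus its full set of dimension values.
function metricIdentity(point: Datapoint): string {
  const dims = Object.keys(point.dimensions)
    .sort()
    .map((key) => `${key}=${point.dimensions[key]}`)
    .join(",");
  return `${point.metricName}|${dims}`;
}

// Number of distinct custom metrics created by the given data points.
function countCustomMetrics(points: Datapoint[]): number {
  return new Set(points.map(metricIdentity)).size;
}

const points: Datapoint[] = [
  { metricName: "RedirectRequest", dimensions: { shortName: "a", browser: "Chrome" } },
  // Same metric as the first entry, only the key order differs.
  { metricName: "RedirectRequest", dimensions: { browser: "Chrome", shortName: "a" } },
  { metricName: "RedirectRequest", dimensions: { shortName: "a", browser: "Firefox" } },
  { metricName: "RedirectRequest", dimensions: { shortName: "b", browser: "Chrome" } },
];
console.log(countCustomMetrics(points)); // 3
```

Adding a high-cardinality dimension, such as a full URL, multiplies this count for every new value it takes.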

Implementing Custom Metrics

There are multiple ways to send custom metrics to CloudWatch. A quick overview of the different options would be:

| Approach | Benefits | Drawbacks |
|---|---|---|
| Log Filter Subscription Metric | Can be enabled, disabled, and changed without modifying the code. Completely async, doesn’t add any additional execution time to the requests. | Depends on CloudWatch Logs being enabled. Can be difficult to configure without structured logs. |
| Lambda Powertools Metrics package (CloudWatch Embedded Metric Format) | Package maintained by AWS. High customization potential. No async calls are required. | Depends on CloudWatch Logs being enabled. Can be difficult to configure without structured logs. Requires source code changes to add metrics. |
| AWS SDK CloudWatch PutMetricDataCommand | Most customizable approach. CloudWatch API requests can send multiple metrics at once. CloudWatch Logs can be disabled. | Requires source code changes to add metrics. Requires an async API request to emit metrics, adding some latency to the execution time. |

The provided sample project implements custom CloudWatch metrics using the AWS SDK v3 to keep the sample as vanilla as possible. The Lambda Powertools metrics package is recommended for any scenario where response times are critical, as it introduces no additional latency.
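For comparison, this is roughly what the Embedded Metric Format alternative writes under the hood: a structured log line that CloudWatch turns into a metric asynchronously. The sketch below builds such a line by hand instead of using the Powertools package, so the function name and the UrlShortener namespace are assumptions; the _aws envelope follows the Embedded Metric Format specification.

```typescript
// Builds one log line in CloudWatch Embedded Metric Format (EMF).
function buildEmfLogLine(
  namespace: string,
  metricName: string,
  value: number,
  dimensions: Record<string, string>,
  timestampMs: number = Date.now(),
): string {
  return JSON.stringify({
    _aws: {
      Timestamp: timestampMs, // milliseconds since the Unix epoch
      CloudWatchMetrics: [
        {
          Namespace: namespace,
          // One dimension set that contains every provided dimension key.
          Dimensions: [Object.keys(dimensions)],
          Metrics: [{ Name: metricName, Unit: "Count" }],
        },
      ],
    },
    // Dimension values and the metric value live at the root of the log event.
    ...dimensions,
    [metricName]: value,
  });
}

// In a Lambda function, printing the line to stdout is enough to emit the metric.
console.log(
  buildEmfLogLine("UrlShortener", "RedirectRequest", 1, { shortName: "example", success: "true" }),
);
```

Because the metric travels with the logs, the request does not wait for a CloudWatch API call, which is where the latency advantage over the SDK approach comes from.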

As part of our example, we create a single metric called RedirectRequest which includes the following dimensions:

  • shortName -> urlId / path parameter of the redirect URL

  • platform -> device platform/OS, fetched from the request headers

  • deviceLanguage -> device language, fetched from the request headers

  • browser -> browser used, fetched from the request headers

  • domain -> the domain of the redirection URL, e.g. lhidalgo.dev

  • redirectURL -> the full URL where the user will be redirected to, e.g. https://lhidalgo.dev/url-shortener-custom-metrics

  • success -> flag to identify if the user was successfully redirected or not

Having these dimensions will allow us to implement different analyses, for example the language distribution of users accessing lhidalgo.dev.
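A possible way to assemble this metric is sketched below. It builds the input object that would be passed to PutMetricDataCommand from @aws-sdk/client-cloudwatch, using the dimension names listed above. The UrlShortener namespace, the helper names, the "unknown" fallback value, and especially the deliberately naive User-Agent parsing are assumptions for illustration; the sample repository may derive these values differently.

```typescript
interface Dimension { Name: string; Value: string }

// Subset of the PutMetricData input used here.
interface PutMetricDataInput {
  Namespace: string;
  MetricData: Array<{ MetricName: string; Dimensions: Dimension[]; Unit: "Count"; Value: number }>;
}

type RequestHeaders = Record<string, string | undefined>;

// Header names may arrive in any casing, so look them up case-insensitively.
function getHeader(headers: RequestHeaders, wanted: string): string | undefined {
  const key = Object.keys(headers).find((k) => k.toLowerCase() === wanted.toLowerCase());
  return key === undefined ? undefined : headers[key];
}

// Naive User-Agent sniffing, for illustration only. Order matters: Edge and
// Chrome also advertise "Safari", Android also advertises "Linux".
function detectBrowser(ua: string): string {
  if (/Edg\//.test(ua)) return "Edge";
  if (/Firefox\//.test(ua)) return "Firefox";
  if (/Chrome\//.test(ua)) return "Chrome";
  if (/Safari\//.test(ua)) return "Safari";
  return "Other";
}

function detectPlatform(ua: string): string {
  if (/Android/.test(ua)) return "Android";
  if (/iPhone|iPad/.test(ua)) return "iOS";
  if (/Windows/.test(ua)) return "Windows";
  if (/Mac OS X/.test(ua)) return "macOS";
  if (/Linux/.test(ua)) return "Linux";
  return "Other";
}

// First language tag of Accept-Language, e.g. "en-US,en;q=0.9" gives "en-US".
function primaryLanguage(acceptLanguage: string | undefined): string {
  const first = (acceptLanguage ?? "").split(",")[0] ?? "";
  const tag = (first.split(";")[0] ?? "").trim();
  return tag || "unknown";
}

function domainOf(url: string | undefined): string {
  const match = /^https?:\/\/([^/:?#]+)/i.exec(url ?? "");
  return match?.[1] ?? "unknown";
}

// redirectURL is undefined when no redirect record was found.
function buildRedirectMetric(
  urlId: string,
  redirectURL: string | undefined,
  headers: RequestHeaders,
): PutMetricDataInput {
  const ua = getHeader(headers, "user-agent") ?? "";
  const dimensions: Record<string, string> = {
    shortName: urlId,
    platform: detectPlatform(ua),
    deviceLanguage: primaryLanguage(getHeader(headers, "accept-language")),
    browser: detectBrowser(ua),
    domain: domainOf(redirectURL),
    redirectURL: redirectURL ?? "unknown",
    success: String(redirectURL !== undefined),
  };
  return {
    Namespace: "UrlShortener", // hypothetical namespace
    MetricData: [
      {
        MetricName: "RedirectRequest",
        Dimensions: Object.keys(dimensions).map((key) => ({ Name: key, Value: dimensions[key] ?? "unknown" })),
        Unit: "Count",
        Value: 1,
      },
    ],
  };
}
```

Each redirect emits a single data point with a value of 1, so summing the metric over a period yields the number of requests for any dimension combination.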

Consuming Custom Metrics

CloudWatch metrics can be exported and consumed from third-party dashboards and analytics platforms, but the easiest approach is to use the AWS CloudWatch console. There, one can explore all the available metrics and even create a custom dashboard to simplify recurring access to specific metrics.

A screenshot of a metric query interface showing redirect data. It lists redirects to lhidalgo.dev, serverlessguru.com, and Google, displaying minimum, maximum, sum, and average values. Options to add queries, dynamic labels, and math functions are visible, with additional settings for statistics, period, and timezone.

As the screenshot above shows, the CloudWatch console also allows us to use Metric Math expressions to aggregate the metric data into our desired format.

Taking advantage of this, and thanks to the dimensions we attached to the metric, we can create a table view to analyze the redirect count to the different domains.

Deploying the provided repository

If you want to deploy the stack that was created as part of this article, feel free to fork the public repository and follow the next steps.

Requirements

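To be able to deploy and use this application without requiring any changes, you'll need a custom domain with an already configured Route53 Hosted Zone, as described in the Custom Domain section.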

Deployment steps

  1. Fork & Clone the repository

  2. Install the project dependencies: npm i

  3. Set the required environment variable DOMAIN_NAME to your custom domain. Examples:

  4. Synthesize the CloudFormation template to ensure the configuration is correct: npx cdk synth

    • Some errors might be thrown if the Route53 Hosted Zone doesn't exist or your current AWS role lacks the proper access to it

  5. Deploy the application: npx cdk deploy

  6. (Optional) Delete the application once you're done using it: npx cdk destroy

Using the API

  1. Retrieve the following values of the Stack Outputs printed during the deployment in the previous step

    • UrlShortenerCustomMetricsStack.APIKeyID —> You'll need to replace the string <api-key-id> with the value returned

    • UrlShortenerCustomMetricsStack.customAPIUrlOutput —> You'll need to replace the <your_url> with the value returned

  2. Retrieve the API Key value and use it to replace the <api-key-value> in the following examples. Some options to do so would be:

    • Navigate to the AWS Console and retrieve it manually

    • Execute the following AWS CLI command: aws apigateway get-api-key --api-key <api-key-id> --include-value --query "value" --output text

  3. Create your first redirection URL

         curl --location 'https://<your_url>/example' \
         --header 'Content-Type: application/json' \
         --header 'x-api-key: <api-key-value>' \
         --data '{
             "originalURL": "https://lhidalgo.dev/url-shortener-custom-metrics",
             "ttlInSeconds": 360000
         }'
    
  4. The previous curl command returns the redirection URL as part of its body. The response should be "https://<your_url>/example"

  5. Open that link on any browser and you'll be redirected to the URL you sent as originalURL in the previous request
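If you prefer to call the API from code, the request from step 3 can be assembled as follows. This is a sketch: buildCreateRedirectRequest is a hypothetical helper, and link.example.com and the API key below are placeholders for the values retrieved in steps 1 and 2. The method, headers, and body mirror the curl command above.

```typescript
interface CreateRedirectBody {
  originalURL: string;
  ttlInSeconds: number;
}

// Builds the URL and request options for the Create URL endpoint.
function buildCreateRedirectRequest(
  baseUrl: string,
  urlId: string,
  apiKey: string,
  body: CreateRedirectBody,
) {
  return {
    // Drop trailing slashes from the base URL and escape the urlId.
    url: `${baseUrl.replace(/\/+$/, "")}/${encodeURIComponent(urlId)}`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json", "x-api-key": apiKey },
      body: JSON.stringify(body),
    },
  };
}

const request = buildCreateRedirectRequest("https://link.example.com", "example", "<api-key-value>", {
  originalURL: "https://lhidalgo.dev/url-shortener-custom-metrics",
  ttlInSeconds: 360000,
});
console.log(request.url); // https://link.example.com/example
```

On a runtime with a global fetch (for example Node.js 18 or later), the request can then be sent with fetch(request.url, request.init).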

Conclusion

In conclusion, building a URL shortener with custom metrics using AWS services is easier than one might think, and it allows you to self-host your solution, giving you more flexibility over which statistics you want to create.

This article also showcases how easy it is to create custom CloudWatch metrics, which can be added to any existing project to increase its observability and provide valuable insights into the behavior of your application and users.

Written by

Lorenzo Hidalgo Gadea

💻 Full Stack Software Engineer and ☁ Serverless Developer, focused on building efficient and cost-effective applications using cloud-based technologies