Video transcoding using FFmpeg and AWS Lambda

In my previous article (Part 2), I discussed in depth the backend architecture, which is hosted entirely on AWS. In this article, I will talk about the video transcoding process itself: how videos are converted to a different format, e.g. livestreaming chunk files (HLS), MKV, MP4 and WebM.

Let's take a look at an overview of the transcoding process:

The S3 bucket publishes an event, which invokes a Lambda function.
The event contains the key, which is the file name of the video file in the S3 bucket.
The Lambda function is a Python script that fetches the video content, stores it in the Lambda's ephemeral storage and invokes the ffmpeg CLI tool as a subprocess.
The FFmpeg CLI tool is made available to the Lambda function as a Layer. A Layer can be thought of as storage that is accessible to all invocations of the Lambda; it is separate from the default ephemeral storage that AWS Lambda provides. Layers can be used to share dependencies, such as external packages, across functions, and they keep those dependencies separate from the code logic.
Finally, after the video conversion is complete, the converted video is saved to the ephemeral storage and then uploaded to a separate destination S3 bucket.
The destination bucket is configured to publish an event upon file upload, and the event is published to an SQS queue.
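
To make the first two steps concrete, here is a minimal sketch of what the handler receives and how the bucket and key can be read from it. The event below is a trimmed-down, hand-written sample: the bucket name and key are made-up placeholders, and real S3 events carry many more fields.

```python
from urllib.parse import unquote_plus

# Trimmed-down sample of an S3 "object created" event.
# The bucket name and key are hypothetical placeholders.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "my-input-bucket"},
                "object": {"key": "1234-abcd/mp4/720/my+holiday+video.mov"},
            },
        }
    ]
}


def extract_bucket_and_key(event):
    """Return (bucket, key) for the first record of an S3 event."""
    s3_info = event["Records"][0]["s3"]
    bucket = s3_info["bucket"]["name"]
    # Keys arrive URL-encoded in the event (spaces become "+"), so decode them.
    key = unquote_plus(s3_info["object"]["key"], encoding="utf-8")
    return bucket, key


bucket, key = extract_bucket_and_key(sample_event)
print(bucket, key)
```

The decoding step matters for file names with spaces or special characters; without it, the later download would look for a key that does not exist.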

AWS Lambda and FFmpeg layer

Creating the Lambda function

Before adding the layer, we need to create a Lambda function

To create a Lambda function, go to AWS Lambda and create a function.

You can create a basic IAM role that gives access to the S3 buckets, as our Python code will fetch the video file from S3 storage and upload the converted file. When done, click “Create function”.
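
As a rough illustration of what such a role could allow, the sketch below builds a policy document that permits reading from the input bucket and writing to the output bucket. The bucket names are hypothetical, and this is only one possible policy under those assumptions, not necessarily the one used in this project.

```python
import json


def build_s3_policy(input_bucket, output_bucket):
    """Build an IAM policy document: read from input, write to output."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Needed to download the source video
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{input_bucket}/*",
            },
            {
                # Needed to upload the converted video
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": f"arn:aws:s3:::{output_bucket}/*",
            },
        ],
    }


# Hypothetical bucket names, for illustration only
policy = build_s3_policy("my-input-bucket", "my-output-bucket")
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM console's policy editor and attached to the function's role.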

Creating the FFmpeg layer

We need to create the layer for AWS Lambda. To do that, go to AWS > Lambda > Layers and click on “Create layer”.

We will upload our ffmpeg tool as a zip file to be used as the layer.

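The ffmpeg binaries can be downloaded as a zip file from this link.

Once downloaded, you can unzip the binaries and upload the ffmpeg tool to the layer. Lambda extracts layer contents under /opt, and the script later in this article calls /opt/bin/ffmpeg, so the zip you upload should contain the binary at bin/ffmpeg.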
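
If you prefer to package the layer zip yourself, a minimal sketch could look like the following. It assumes you already have an ffmpeg binary built for the Lambda runtime on disk; the function name and the default zip file name are my own placeholders.

```python
import os
import stat
import zipfile


def build_layer_zip(ffmpeg_path, zip_path="ffmpeg-layer.zip"):
    """Package an ffmpeg binary as bin/ffmpeg inside a layer zip."""
    # Make sure the binary is executable; zipfile records the file mode.
    mode = os.stat(ffmpeg_path).st_mode
    os.chmod(ffmpeg_path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # Lambda extracts layers under /opt, so this entry becomes /opt/bin/ffmpeg.
        zf.write(ffmpeg_path, arcname="bin/ffmpeg")
    return zip_path
```

Calling `build_layer_zip("path/to/ffmpeg")` produces a zip that can be uploaded on the “Create layer” page.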

Adding the FFmpeg layer to AWS Lambda

From the selected box, you can click on “Layers” to add a layer

With this set up, our layer is now attached to the Lambda function.

Configuring the Lambda function

Before we move to the Python script, a few important configurations are needed.

Firstly, we need to increase the function timeout and memory, as video transcoding is a demanding process and it takes some time for ffmpeg to convert the video from one format to another.

Based on my testing, a 30 MB video file takes at most 2 minutes and about 1 GB of memory to convert. To leave a buffer, I have kept the limits a bit more relaxed in the configuration. If the timeout is left at the default (3 seconds), the function is terminated before the conversion completes.
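
The same settings can also be applied programmatically. The sketch below only builds the arguments for the Lambda API's `update_function_configuration` call; the function name and the default values (300 seconds, 2048 MB) are illustrative assumptions with some headroom over the measurements above, not the exact values from my configuration.

```python
def build_function_config(function_name, timeout_seconds=300, memory_mb=2048):
    """Build keyword arguments for Lambda's update_function_configuration."""
    # Lambda caps the function timeout at 900 seconds (15 minutes).
    if not 1 <= timeout_seconds <= 900:
        raise ValueError("timeout_seconds must be between 1 and 900")
    return {
        "FunctionName": function_name,
        "Timeout": timeout_seconds,  # seconds
        "MemorySize": memory_mb,     # MB
    }


# "video-transcoder" is a hypothetical function name
config = build_function_config("video-transcoder")
print(config)
# With boto3 installed and credentials configured, this could be applied with:
#   boto3.client("lambda").update_function_configuration(**config)
```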

In the triggers section, you can see that I have provided my input S3 bucket as the incoming event source. You can add a trigger by clicking on “Add trigger”, then choosing the event source and the action for which you want the event to be created.

Make sure that the input S3 bucket is configured to publish an event.

The above image shows the event notification I have configured on my input bucket. To create an event notification on the bucket, go to Bucket > Properties > Event notifications and choose Lambda as the destination.
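
For reference, the notification set up in the console corresponds roughly to the configuration sketched below. The ARN is a made-up placeholder, and reacting to all `s3:ObjectCreated:*` events is an assumption; you may want to restrict it to specific event types.

```python
def build_lambda_notification(function_arn, events=("s3:ObjectCreated:*",)):
    """Build an S3 notification configuration that targets a Lambda function."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": function_arn,
                "Events": list(events),
            }
        ]
    }


# Hypothetical ARN, for illustration only
notification = build_lambda_notification(
    "arn:aws:lambda:us-east-1:123456789012:function:video-transcoder"
)
print(notification)
# With boto3, this could be applied to the input bucket with:
#   boto3.client("s3").put_bucket_notification_configuration(
#       Bucket="my-input-bucket", NotificationConfiguration=notification)
# Note: outside the console, S3 must also be granted permission to invoke
# the function before this call succeeds.
```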

One last configuration step is to add an environment variable to the Lambda function.

I have added DESTINATION_BUCKET, which names the output bucket where the transcoded videos are stored.

Python script for video transcoding

import os
import subprocess
import urllib.parse

import boto3

# Initialise the S3 client once so it is reused across warm invocations
s3 = boto3.client('s3')

def lambda_handler(event, context):
    # Extract bucket name and key from S3 event
    bucket = event['Records'][0]['s3']['bucket']['name']
    # Parse any embedded special characters in the key
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')

    # Keys are expected to look like 'uuid/extension/resolution/filename'
    parts = key.split('/')
    if len(parts) < 4:
        raise ValueError("Invalid S3 key structure. Expected format is 'uuid/extension/resolution/filename'")

    # Video download path
    download_tmp_path = f'/tmp/{os.path.basename(key)}'
    # Extract target file extension
    target_file_extension = parts[1]
    # Extract target resolution
    target_file_resolution = parts[2]
    output_filename = f'output.{target_file_extension}'
    output_path = f'/tmp/output/{parts[0]}/'

    # Get the destination bucket from env variables
    dest_bucket = os.getenv('DESTINATION_BUCKET', None)

    # Raise an error if the destination bucket is undefined
    if not dest_bucket:
        raise ValueError("Destination bucket not defined, please set the DESTINATION_BUCKET environment variable")

    try:
        # Check that the video exists for the key (metadata only, no body download)
        s3.head_object(Bucket=bucket, Key=key)

        # download the video file from s3 to download_tmp_path
        s3.download_file(bucket, key, download_tmp_path)

        # FFmpeg video conversion and save video to output_path
        try:
            os.makedirs(output_path, exist_ok=True)
            convert_video(target_file_resolution, target_file_extension, download_tmp_path, output_path, output_filename)
        except Exception as e:
            # Log the reason so the failure can be diagnosed from the logs
            print(f'Unable to convert video file using ffmpeg: {e}')
            return

        # For all the files converted by ffmpeg, upload it to destination bucket
        for root, _, files in os.walk(output_path):
            for file in files:
                file_path = os.path.join(root, file)
                relative_path = os.path.relpath(file_path, output_path)
                s3_key = f'{extract_s3_prefix(key)}/{relative_path}'

                print(f"Uploading {file_path} to s3://{dest_bucket}/{s3_key}")
                s3.upload_file(file_path, dest_bucket, s3_key)
                print("Converted file uploaded successfully")
        return
    except Exception as e:
        print('Error executing Lambda, find reason for error below')
        raise e


def convert_video(tgt_res, target_format, input_file_path, output_file_path, output_filename):
    # FFmpeg target resolution command
    resolution = f"scale=-2:{tgt_res}"

    # codec command
    if target_format == "mp4":
        codec = ["-c:v", "libx264", "-preset", "slow", "-crf", "23", "-c:a", "aac"]
    elif target_format == "mkv":
        codec = ["-c:v", "libx264", "-preset", "slow", "-crf", "23", "-c:a", "aac"]
    elif target_format == "webm":
        codec = ["-c:v", "libvpx-vp9", "-crf", "30", "-b:v", "0", "-b:a" ,"128k", "-c:a", "libopus"]
    elif target_format == "m3u8":
        codec = [
        "-c:v", "libx264", "-preset", "veryfast", "-crf", "23",  # Video codec
        "-c:a", "aac",  # Audio codec
        "-start_number", "0",  # Start numbering segments from 0
        "-hls_time", "10",  # Segment duration in seconds
        "-hls_playlist_type", "vod",  # HLS playlist type
        ]
    else:
        raise ValueError("Unsupported target format")


    try:
        # start the subprocess from the lambda layer
        subprocess.run([
            "/opt/bin/ffmpeg", "-y", "-i", input_file_path,
            "-vf", resolution,
            *codec,
            os.path.join(output_file_path, output_filename)
        ], check=True)
    except subprocess.CalledProcessError as e:
        print('FFmpeg conversion failed')
        raise e  

def extract_s3_prefix(key):
    """
    Extract the S3 prefix up to and including the 'resolution' part of the key.

    :param key: The full S3 key (e.g., "uuid/format/resolution/xxx.mov").
    :return: The prefix up to the 'resolution' (e.g., "uuid/format/resolution/").
    """
    parts = key.split("/")  # Split the key by "/"
    if len(parts) >= 3:  # Ensure there are enough segments
        return "/".join(parts[:3])
    else:
        raise ValueError("Invalid key format: Cannot extract prefix")

The script has 4 steps in total:

From the incoming event, extract the target extension and resolution. A sample key looks like uuid/mp4/720/filename.mov, which means that the .mov video should be converted to an .mp4 video with 720p as the target resolution.
The file is downloaded to the function's ephemeral storage.
FFmpeg works as a CLI tool and supports a defined set of options for different kinds of video conversion. The function invokes the CLI as a subprocess and passes the options to it.
Finally, the converted video is stored at an output path and uploaded to the destination bucket.
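
The first step can be tried out locally without AWS or ffmpeg. The sketch below mirrors how the script derives its paths and scale filter from a key; `plan_conversion` is a hypothetical helper written for this illustration and is not part of the Lambda code.

```python
import os


def plan_conversion(key):
    """Derive the paths and ffmpeg scale filter the script would use for a key."""
    parts = key.split("/")
    if len(parts) < 4:
        raise ValueError("Expected format is 'uuid/extension/resolution/filename'")
    uuid, target_format, resolution = parts[0], parts[1], parts[2]
    return {
        # Where the source video is downloaded to
        "download_path": f"/tmp/{os.path.basename(key)}",
        # Where ffmpeg writes the converted video
        "output_file": f"/tmp/output/{uuid}/output.{target_format}",
        # -2 keeps the aspect ratio and rounds the width to an even number
        "scale_filter": f"scale=-2:{resolution}",
        # Prefix used for the uploaded objects in the destination bucket
        "upload_prefix": "/".join(parts[:3]),
    }


plan = plan_conversion("uuid/mp4/720/filename.mov")
for name, value in plan.items():
    print(f"{name}: {value}")
```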

Note that the ephemeral storage (/tmp) in AWS Lambda can be reused across multiple invocations while an execution environment stays warm, which is why the uuid (parts[0]) is used to give each output a unique path.
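
An alternative way to deal with reused storage, which is not what the script above does, is to work in a fresh scratch directory per invocation and delete it afterwards. The sketch below shows the idea with Python's `tempfile` module; `run_in_scratch_dir` and `fake_conversion` are hypothetical names, and the fake conversion only writes a dummy file.

```python
import os
import tempfile


def run_in_scratch_dir(work, base_dir=None):
    """Call work(scratch_dir) inside a fresh directory that is removed afterwards."""
    # On Lambda, base_dir would be "/tmp"; None uses the platform default.
    with tempfile.TemporaryDirectory(prefix="transcode-", dir=base_dir) as scratch_dir:
        return work(scratch_dir)


def fake_conversion(scratch_dir):
    # Stand-in for the download and ffmpeg steps: just write one output file.
    output_file = os.path.join(scratch_dir, "output.mp4")
    with open(output_file, "wb") as fh:
        fh.write(b"not a real video")
    return scratch_dir, os.listdir(scratch_dir)


used_dir, produced = run_in_scratch_dir(fake_conversion)
print(produced)
print(os.path.exists(used_dir))
```

With this approach, the upload to the destination bucket would have to happen inside the `work` callback, before the directory is removed.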

Final steps

The output bucket is configured to publish events to the SQS queue; each uploaded file produces its own event.

The SQS queue is created with default settings and delivers these events to consumer services.
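
On the consumer side, each SQS message body carries the S3 event as a JSON string. Here is a minimal sketch of how a consumer could read it, assuming the bucket publishes directly to SQS (not via SNS); the sample body is hand-written and the bucket name is a placeholder.

```python
import json
from urllib.parse import unquote_plus


def uploaded_objects(message_body):
    """Return (bucket, key) pairs from an S3 notification delivered via SQS."""
    payload = json.loads(message_body)
    results = []
    # S3 also sends a one-off "s3:TestEvent" message that has no "Records".
    for record in payload.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])
        results.append((bucket, key))
    return results


# Hand-written sample body with a hypothetical bucket name
sample_body = json.dumps({
    "Records": [
        {"s3": {"bucket": {"name": "my-output-bucket"},
                "object": {"key": "uuid/mp4/720/output.mp4"}}}
    ]
})
print(uploaded_objects(sample_body))
print(uploaded_objects(json.dumps({"Event": "s3:TestEvent"})))
```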

Conclusion

With this article, I conclude the video transcoder series. I had immense fun developing this application, and learning more about AWS from the documentation for a practical use case was an enriching experience.

Be sure to read the other two articles in this series: Serverless Video transcoder with AWS

I hope you enjoyed reading this, please find relevant links to the project / my profile below

GitHub Link

Live

My LinkedIn

Part 3: Video transcoding using FFmpeg and AWS Lambda