Part 3: Video transcoding using FFmpeg and AWS Lambda


In my previous article (Part 2), I discussed the backend architecture, which is hosted entirely on AWS, in depth. In this article, I will talk more about the video transcoding process itself: how the videos are converted to different formats, e.g. live-streaming (HLS) chunk files and MKV-, MP4- and WebM-based formats.
Let's take a look at an overview of the transcoding process:
1. The input S3 bucket publishes an event which invokes a Lambda function.
2. The event carries the key, which is the file name of the video file present in the S3 bucket (see the sample event sketch after this list).
3. The Lambda function is a Python script that fetches the video content, stores it in the Lambda's ephemeral storage, and invokes the ffmpeg CLI tool as a Python subprocess.
4. The FFmpeg CLI tool is made available to the Lambda runtime as a Layer. A Layer can be thought of as storage that is accessible to all Lambda invocations, separate from the default ephemeral storage that AWS Lambda provides. It can be used to share dependencies such as external packages across functions while keeping them separate from the code logic.
5. After the video conversion is complete, the converted video is persisted in storage and uploaded to a destination S3 bucket.
6. The destination bucket is configured to publish an event upon file upload, and the event is published to an SQS queue.
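For reference, the handler only reads a couple of fields from the incoming event. Below is a minimal sketch of an S3 "ObjectCreated" event record with illustrative values; note that S3 URL-encodes the object key, which is why the script decodes it with urllib.parse.unquote_plus.

# A trimmed-down S3 event record (illustrative values).
# Real events carry many more fields; only the ones the handler reads are shown.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "my-input-bucket"},            # hypothetical bucket name
                "object": {"key": "uuid/mp4/720/some+video.mov"}  # '+' decodes to a space
            }
        }
    ]
}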
AWS Lambda and FFmpeg layer
Creating the Lambda function
Before adding the layer, we need to create the Lambda function itself. To do that, go to AWS Lambda and create a function.
You can create a basic IAM role that grants access to the S3 buckets, since our Python code will fetch the video file from S3 storage. When done, click "Create function".
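As a rough sketch, the role only needs to read objects from the input bucket and write objects to the destination bucket (plus the usual CloudWatch Logs permissions from the basic execution role). The bucket, policy, and role names below are placeholders; you can of course create the same policy from the console instead.

import json
import boto3

# Hypothetical minimal policy: read from the input bucket, write to the output bucket.
policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:GetObject", "Resource": "arn:aws:s3:::my-input-bucket/*"},
        {"Effect": "Allow", "Action": "s3:PutObject", "Resource": "arn:aws:s3:::my-output-bucket/*"},
    ],
}

iam = boto3.client("iam")
resp = iam.create_policy(
    PolicyName="video-transcoder-s3-access",
    PolicyDocument=json.dumps(policy_document),
)
# Attach the policy to the function's execution role.
iam.attach_role_policy(RoleName="video-transcoder-role", PolicyArn=resp["Policy"]["Arn"])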
Creating the FFmpeg layer
We need to create the layer for AWS Lambda. To do that, go to AWS > Lambda > Layers and click on "Create layer".
We will upload the ffmpeg tool as a zip file to be used as the layer. You can download the FFmpeg binaries as an archive from a site that hosts static builds.
Once downloaded, extract the archive and package the ffmpeg binary into the layer zip under a bin/ directory. Layer contents are extracted under /opt at runtime, so the binary ends up at /opt/bin/ffmpeg, which is the path our script invokes later.
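If you prefer to script this step, here is a sketch of packaging and publishing the layer with boto3. The file paths, layer name, and runtime are assumptions; the important detail is the bin/ffmpeg path inside the zip.

import zipfile
import boto3

# Package the ffmpeg binary under bin/ so it lands at /opt/bin/ffmpeg in the runtime.
# Make sure the binary is executable (chmod +x ffmpeg) before zipping.
with zipfile.ZipFile("ffmpeg-layer.zip", "w", zipfile.ZIP_DEFLATED) as zf:
    zf.write("ffmpeg", arcname="bin/ffmpeg")  # path to the extracted static binary

lambda_client = boto3.client("lambda")
with open("ffmpeg-layer.zip", "rb") as f:
    lambda_client.publish_layer_version(
        LayerName="ffmpeg",                 # placeholder layer name
        Content={"ZipFile": f.read()},
        CompatibleRuntimes=["python3.12"],  # match your function's runtime
    )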
Adding the FFmpeg layer to AWS Lambda
From the function overview, you can click on "Layers" to add the layer we just created.
With these things set up, our layer is now attached to the Lambda function.
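Equivalently, the layer can be attached programmatically; a sketch with boto3, where the function name and layer version ARN are placeholders:

import boto3

lambda_client = boto3.client("lambda")
lambda_client.update_function_configuration(
    FunctionName="video-transcoder",
    Layers=["arn:aws:lambda:us-east-1:123456789012:layer:ffmpeg:1"],
)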
Configuring the Lambda function
Before we move to the Python script, there are a few important configurations that need to be done.
Firstly, we need to increase the function timeout and memory, as video transcoding is a demanding process and it takes FFmpeg some time to convert a video from one format to another.
Based on my testing, a 30 MB video file takes at most 2 minutes with 1 GB of memory to get converted. Keeping a buffer, I have kept the constraints a bit more relaxed in the configuration. If the timeout is left at its default, the process will be killed before the conversion completes.
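These settings live under the function's General configuration in the console, but can also be scripted. The numbers below are illustrative values with headroom over the measurements above, not the exact ones from my setup:

import boto3

lambda_client = boto3.client("lambda")
lambda_client.update_function_configuration(
    FunctionName="video-transcoder",  # placeholder function name
    Timeout=300,                      # seconds; the default of 3 s would kill the conversion
    MemorySize=2048,                  # MB; more memory also means more CPU for FFmpeg
)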
In the triggers section, you can see I have provided my input S3 bucket as the incoming event source. You can add the trigger by clicking "Add trigger", choosing the event source, and choosing the action for which you want the event to be created.
Make sure that the input S3 bucket is configured to publish an event.
The above image shows the event notification I have configured on my input bucket. To create an event notification on the bucket, go to Bucket > Properties > Event notifications and choose the Lambda function as the destination.
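The same notification can also be defined with boto3; a sketch where the bucket name and function ARN are placeholders (the function's resource policy must already allow S3 to invoke it, which the console trigger flow sets up for you):

import boto3

s3_client = boto3.client("s3")
s3_client.put_bucket_notification_configuration(
    Bucket="my-input-bucket",
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:video-transcoder",
                "Events": ["s3:ObjectCreated:*"],  # fire on every new object upload
            }
        ]
    },
)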
One last configuration step is to add an environment variable to the Lambda.
I have added the destination bucket, which forms the output bucket for storage of the transcoded videos.
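The script below reads this value as DESTINATION_BUCKET. Setting it programmatically would look like this sketch (the bucket name is a placeholder):

import boto3

lambda_client = boto3.client("lambda")
lambda_client.update_function_configuration(
    FunctionName="video-transcoder",
    Environment={"Variables": {"DESTINATION_BUCKET": "my-output-bucket"}},
)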
Python script for video transcoding
import os
import subprocess
import urllib.parse

import boto3

# Initialise the S3 client once per container so it is reused across warm invocations
s3 = boto3.client('s3')


def lambda_handler(event, context):
    # Extract bucket name and key from the S3 event
    bucket = event['Records'][0]['s3']['bucket']['name']
    # Decode any URL-encoded special characters in the key
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')

    parts = key.split('/')
    if len(parts) < 4:
        raise ValueError("Invalid S3 key structure. Expected format is 'uuid/extension/resolution/filename'")

    # Video download path in the function's ephemeral storage
    download_tmp_path = f'/tmp/{os.path.basename(key)}'
    # Extract target file extension and resolution from the key
    target_file_extension = parts[1]
    target_file_resolution = parts[2]
    output_filename = f'output.{target_file_extension}'
    output_path = f'/tmp/output/{parts[0]}/'

    # Get the destination bucket from the environment variables
    dest_bucket = os.getenv('DESTINATION_BUCKET')
    # Raise an error if the destination bucket is undefined
    if not dest_bucket:
        raise ValueError("Destination bucket not defined, please add the DESTINATION_BUCKET variable")

    try:
        # Check that the video exists in storage for the key
        s3.head_object(Bucket=bucket, Key=key)
        # Download the video file from S3 to download_tmp_path
        s3.download_file(bucket, key, download_tmp_path)

        # FFmpeg video conversion; the result is saved under output_path
        try:
            os.makedirs(output_path, exist_ok=True)
            convert_video(target_file_resolution, target_file_extension, download_tmp_path, output_path, output_filename)
        except Exception as e:
            print(f'Unable to convert video file using ffmpeg: {e}')
            return

        # Upload every file produced by ffmpeg (HLS jobs produce several) to the destination bucket
        for root, _, files in os.walk(output_path):
            for file in files:
                file_path = os.path.join(root, file)
                relative_path = os.path.relpath(file_path, output_path)
                s3_key = f'{extract_s3_prefix(key)}/{relative_path}'
                print(f"Uploading {file_path} to s3://{dest_bucket}/{s3_key}")
                s3.upload_file(file_path, dest_bucket, s3_key)
        print("Converted file uploaded successfully")
        return
    except Exception as e:
        print('Error executing Lambda, find reason for error below')
        raise e


def convert_video(tgt_res, target_format, input_file_path, output_file_path, output_filename):
    # FFmpeg scale filter: fix the height, keep the aspect ratio (-2 rounds the width to an even number)
    resolution = f"scale=-2:{tgt_res}"

    # Codec arguments per target format
    if target_format in ("mp4", "mkv"):
        codec = ["-c:v", "libx264", "-preset", "slow", "-crf", "23", "-c:a", "aac"]
    elif target_format == "webm":
        codec = ["-c:v", "libvpx-vp9", "-crf", "30", "-b:v", "0", "-b:a", "128k", "-c:a", "libopus"]
    elif target_format == "m3u8":
        codec = [
            "-c:v", "libx264", "-preset", "veryfast", "-crf", "23",  # Video codec
            "-c:a", "aac",                                           # Audio codec
            "-start_number", "0",                                    # Start numbering segments from 0
            "-hls_time", "10",                                       # Segment duration in seconds
            "-hls_playlist_type", "vod",                             # HLS playlist type
        ]
    else:
        raise ValueError("Unsupported target format")

    try:
        # Run the ffmpeg binary shipped in the layer (layers are mounted under /opt)
        subprocess.run([
            "/opt/bin/ffmpeg", "-y", "-i", input_file_path,
            "-vf", resolution,
            *codec,
            output_file_path + output_filename
        ], check=True)
    except subprocess.CalledProcessError as e:
        print('FFmpeg conversion failed')
        raise e


def extract_s3_prefix(key):
    """
    Extract the S3 prefix up to and including the 'resolution' part of the key.

    :param key: The full S3 key (e.g., "uuid/format/resolution/xxx.mov").
    :return: The prefix up to the 'resolution' (e.g., "uuid/format/resolution").
    """
    parts = key.split("/")
    if len(parts) >= 3:  # Ensure there are enough segments
        return "/".join(parts[:3])
    else:
        raise ValueError("Invalid key format: Cannot extract prefix")
The script has 4 steps in total:
1. From the incoming event, extract the target extension and resolution. A sample key looks like uuid/mp4/720/filename.mov, which means that the .MOV video is supposed to be converted to an .MP4 video, with 720p being the target resolution.
2. The file is downloaded into the ephemeral storage of the function.
3. The FFmpeg tool works as a CLI, and supports a defined set of commands that work for different sets of video conversions. The function invokes the CLI as a subprocess and passes the commands to it.
4. Finally, the converted video is stored at an output path and uploaded to the destination bucket.
Note that the ephemeral storage in AWS Lambda can be reused across multiple invocations of the Lambda, which is why the uuid (parts[0]) is used to build a unique path for the output.
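To make the path handling concrete, here is what the prefix logic produces for the sample key; plain Python, no AWS calls involved:

key = "uuid/mp4/720/filename.mov"

# extract_s3_prefix keeps the first three segments of the key:
prefix = "/".join(key.split("/")[:3])
print(prefix)  # -> "uuid/mp4/720"

# A single-file conversion is therefore uploaded as "uuid/mp4/720/output.mp4",
# while an HLS (m3u8) job uploads the playlist plus all of its segments under the same prefix.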
Final steps
The output bucket is configured to publish events to the SQS queue.
The SQS queue is created with default settings and delivers these events to consumer services.
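Downstream services can then long-poll the queue for new transcoded files. A minimal consumer sketch, where the queue URL is a placeholder:

import json
import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/transcoded-videos"  # placeholder

while True:
    # Long-poll for up to 20 seconds to avoid a busy loop of empty receives
    resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20)
    for msg in resp.get("Messages", []):
        body = json.loads(msg["Body"])  # the S3 event notification payload
        for record in body.get("Records", []):
            print("New transcoded file:", record["s3"]["object"]["key"])
        # Delete the message once processed so it is not redelivered
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])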
Conclusion
With this article, I conclude the video transcoder series. I had immense fun developing this application; learning more about AWS from the documentation for a practical use case was a valuable experience.
Be sure to read the other two articles in this series: Serverless Video transcoder with AWS.
I hope you enjoyed reading this. Please find relevant links to the project and my profile below.
Written by Abdulmateen Pitodia
Software engineer, passionate about learning and exploring distributed systems, tinkering around with Frontend, learning on the go.