Lightweight, Serverless Self-Hosted GitHub Runners on AWS


This article explores the benefits and implementation of self-hosted GitHub Actions runners, particularly within corporate environments using GitHub Enterprise Cloud, where network access, security, cost, and control are critical. We focus on a solution using AWS ECS Fargate to dynamically scale runners, leveraging serverless container execution to minimise infrastructure management and optimise costs. The guide covers setting up this solution to efficiently manage CI/CD workflows with ease.
In the world of modern software development, CI/CD pipelines are not just a convenience; they are the backbone of efficient and reliable delivery. GitHub Actions has emerged as a dominant force in this space, offering seamless integration with source control and a straightforward way to build, test, and deploy code. While GitHub’s hosted runners are fantastic for getting started, many teams adopt self-hosted solutions to meet specific operational needs, such as accessing private non-cloud resources, complying with stringent corporate governance, or ensuring CI/CD pipelines are not paused by GitHub Actions usage limits.
This is where self-hosted runners come in.
This is the first article in a series where we will design and build a fully serverless, event-driven, and cost-effective solution for self-hosted GitHub Actions runners using native AWS services. Let’s dive in.
A Quick Intro: GitHub Actions and Self-Hosted Runners
GitHub Actions is a CI/CD platform that allows you to automate your workflows directly from your GitHub repository. These workflows are triggered by events (like a git push or the creation of a pull request) and are executed by runners.
GitHub provides two main types of runners:
GitHub-Hosted Runners: These are virtual machines managed and maintained by GitHub. They come pre-loaded with a wide range of software and are the default, easy-to-use option.
Self-Hosted Runners: These are runners that you deploy and manage in your own environment, whether it’s a physical server, a virtual machine in the cloud, or a container. You install the GitHub Actions runner agent on them, register them with your repository, organization, or enterprise, and they become available to pick up jobs from your workflows.
The “Why”: Choosing the Right Runner for the Job
Before diving into the benefits of self-hosting, it’s worth noting the convenience of GitHub-hosted runners. For many non-Enterprise and even some Enterprise users, they are an excellent starting point, offering a quick and easy way to get CI/CD pipelines running with significant advantages:
Simplicity and Speed: You can get started in minutes without any infrastructure setup.
Managed Environment: GitHub handles the maintenance, software updates, and security of the runner fleet, allowing teams to focus on their code.
This article isn’t about pitting self-hosted runners against GitHub-hosted ones. Instead, it’s about exploring a powerful, lightweight solution for scenarios where self-hosting becomes a necessity or a strategic advantage.
For a deep dive into the factors to consider, see “When to choose GitHub-Hosted runners or self-hosted runners with GitHub Actions” on the GitHub Blog.
So, why go through the trouble of managing your own runners? The advantages are significant for specific use cases:
Cost Control: While GitHub offers a generous free tier of execution minutes, costs can escalate quickly once those limits are surpassed. GitHub-hosted runners are billed on a per-minute basis, with rates increasing for larger, more powerful machine types. For teams with frequent deployments or long-running jobs, these costs can become a significant operational expense. A self-hosted solution provides a powerful alternative for optimisation. By architecting a serverless system on AWS, you can leverage highly cost-effective compute options like Fargate containers or EC2 Spot Instances. These resources are provisioned just-in-time for a job and terminated immediately upon completion, meaning you only pay for the exact compute time used and eliminate any cost for idle runners.
Enhanced Security and Network Access: For many organisations, it’s critical that build artifacts never leave their private network. Self-hosted runners can be placed within your VPC, granting them secure access to internal resources like databases, artifact repositories, or private registries without exposing them to the public internet.
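To make the cost-control argument above a little more concrete, here is a minimal sketch of the arithmetic involved. It is an illustration only: the function names are hypothetical, every rate is a parameter you must fill in from current published pricing, and it ignores Lambda invocations, data transfer, storage, and any billing minimums.

```python
def hosted_cost(job_minutes: float, free_minutes: float, rate_per_minute: float) -> float:
    """Per-minute billing: only minutes beyond the included allowance are charged."""
    billable_minutes = max(0.0, job_minutes - free_minutes)
    return billable_minutes * rate_per_minute


def fargate_cost(job_minutes: float, vcpu: float, memory_gb: float,
                 vcpu_hour_rate: float, gb_hour_rate: float) -> float:
    """Just-in-time tasks: compute is billed only while a job is running."""
    hours = job_minutes / 60.0
    return hours * (vcpu * vcpu_hour_rate + memory_gb * gb_hour_rate)
```

Calling both functions with your own monthly job minutes and the rates that apply to your account gives a rough first comparison; it is not a substitute for a proper pricing calculation.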
A Look at Existing Self-Hosted Solutions
The need for better self-hosted runners is not new, and the community has produced some excellent solutions. Two popular ones are:
philips-labs/terraform-aws-github-runner: This is a very popular and robust Terraform module that provisions scalable, spot-instance-based runners on AWS. It’s a powerful solution but can be complex to configure and manage, and it relies on a fleet of EC2 instances.
actions-runner-controller/actions-runner-controller (ARC): This is the official, open-source solution from GitHub for running actions runners on a Kubernetes cluster. It’s the go-to for teams heavily invested in Kubernetes but can be overkill if you aren’t already running and managing a K8s cluster.
While both are fantastic, they can still introduce operational overhead. We wanted to see if we could build something even more “hands-off.”
The Solution: A Lightweight, Serverless Architecture
Our goal was to create an event-driven system that provisions a runner exactly when it’s needed and tears it down the moment the job is done, incurring virtually zero cost when idle.
Here is the high-level architecture:
Let’s break down how it works:
GitHub Webhook: It all starts with a webhook configured in a GitHub App that is installed in the GitHub Organization with access to all or selected repositories. The webhook is configured to fire on the workflow_job event. This event is perfect because it contains the context we need, such as whether the job was queued or completed.
AWS API Gateway: The webhook needs a secure, scalable endpoint to send its payload to. An API Gateway REST API is the perfect serverless front door: it receives the request from GitHub and is configured to trigger our primary Lambda function. We selected the REST API over the HTTP API for the extra layer of protection it offers: a resource policy that restricts calls to GitHub’s published IP ranges only.
AWS Lambda: This is the core of the solution. The Lambda function receives the webhook payload, validates that it came from a GitHub Organization we own, and inspects the runner labels to see what kind of runner is being requested. Its sole job is to make one decision: “Do I need to start a runner?” If the answer is yes, it makes calls to…
Request a runner registration token from GitHub.com
Use the registration token and extracted job details to provision a runner
AWS ECS Fargate: Instead of managing EC2 instances, we use Fargate, the serverless compute engine for containers. The Lambda function simply calls the RunTask API to start a new Fargate task with the provided environment variables. This task uses a pre-built Docker container image that has the GitHub Actions runner agent installed. The agent starts, registers itself with GitHub, runs the single job it was assigned, and then the task terminates.
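The validation and decision steps described for the Lambda function can be sketched as follows. This is a minimal illustration, not the implementation used in this series. It assumes the webhook is configured with a shared secret, so that GitHub sends an X-Hub-Signature-256 header, and that the payload is an organisation-level workflow_job event. The function names and the label-matching rule are hypothetical.

```python
import hashlib
import hmac
from typing import Optional


def signature_is_valid(secret: str, body: bytes, signature_header: Optional[str]) -> bool:
    """Compare GitHub's X-Hub-Signature-256 header with an HMAC of the raw body."""
    expected = "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected.encode(), (signature_header or "").encode())


def should_start_runner(payload: dict, allowed_orgs: set, runner_labels: set) -> bool:
    """Answer the one question the Lambda cares about: do I need to start a runner?"""
    if payload.get("action") != "queued":
        return False  # other workflow_job actions (e.g. completed) need no new runner
    org = payload.get("organization", {}).get("login")
    if org not in allowed_orgs:
        return False  # event did not come from an organisation we own
    requested = set(payload.get("workflow_job", {}).get("labels", []))
    # Assumption: start a runner only if it can satisfy every requested label.
    return bool(requested) and requested <= runner_labels
```

In a real handler the signature check would run first, on the raw request body exactly as received, before the JSON is parsed and passed to the decision function.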
This architecture is incredibly lightweight, secure, and cost-effective. We don’t have any idle EC2 instances waiting for jobs. We only pay for the Lambda execution and the exact duration of the Fargate task while the job is running.
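For readers who want a feel for the two calls the Lambda function makes, here is a hedged sketch using only the Python standard library plus boto3. It assumes a GitHub App installation access token is already available and has permission to manage the organisation's runners, and that the Fargate task runs in private subnets with outbound access to GitHub. All argument values, and the environment variable names you pass in, are placeholders rather than names taken from this series.

```python
import json
import urllib.request


def registration_token_request(org: str, installation_token: str) -> urllib.request.Request:
    """Build the POST that asks GitHub for a short-lived runner registration token."""
    return urllib.request.Request(
        url=f"https://api.github.com/orgs/{org}/actions/runners/registration-token",
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {installation_token}",
        },
    )


def fetch_registration_token(org: str, installation_token: str) -> str:
    """Send the request; the JSON response carries the token in its "token" field."""
    with urllib.request.urlopen(registration_token_request(org, installation_token)) as resp:
        return json.load(resp)["token"]


def run_task_kwargs(cluster: str, task_definition: str, container_name: str,
                    subnets: list, security_groups: list, env: dict) -> dict:
    """Assemble the keyword arguments for the ECS RunTask call (boto3: ecs.run_task)."""
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "launchType": "FARGATE",
        "count": 1,  # one task per queued job
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": subnets,
                "securityGroups": security_groups,
                "assignPublicIp": "DISABLED",  # assumes private subnets with egress
            }
        },
        "overrides": {
            "containerOverrides": [{
                "name": container_name,
                # Job details and the registration token reach the container as env vars.
                "environment": [{"name": k, "value": v} for k, v in sorted(env.items())],
            }]
        },
    }


def start_runner(**kwargs) -> str:
    """Start one Fargate task and return its ARN."""
    import boto3  # imported lazily so the helpers above work without boto3 installed
    response = boto3.client("ecs").run_task(**kwargs)
    if not response["tasks"]:
        raise RuntimeError(f"RunTask failed: {response.get('failures')}")
    return response["tasks"][0]["taskArn"]
```

Keeping the request and argument builders free of network calls makes them easy to unit test; only fetch_registration_token and start_runner talk to GitHub and AWS.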
What’s Next?
We’ve laid out the vision and the core components of our serverless self-hosted runner solution. In the next article in this series, we will get our hands dirty and walk through the first steps of implementation: setting up the GitHub App and webhook as the starting point of the build.
Stay tuned!