Using AI to generate repetitive code automatically with Python


https://github.com/JTSG1/GreyhoundDash
I’m building GreyhoundDash, a monitoring dashboard that can run either self-hosted or as SaaS.
The dashboard is intended as a tool to help users monitor their self-hosted applications from a single view. This is hardly a unique endeavour (there are plenty of alternatives such as Heimdall), but it serves as an interesting platform on which to base a few different experiments.
To recognise and health-check the long tail of open-source, self-hostable services, the dashboard needs a machine-readable definition for each one.
Hand-authoring those definitions doesn’t scale; the Awesome-Self-Hosted index alone lists hundreds of candidates.
Instead of grinding through them manually, I’m using large-language-model code generation to create the boilerplate classes automatically. This series documents the experiment—why I chose the approach, how the tooling is wired together, and what still needs tightening up.
Requirements
Core workflow
Detect new service: scan the curated list (JSON produced from Awesome-Self-Hosted).
Generate code: produce a Python subclass of ServiceBase that performs a simple “is-alive” check.
Create PR: commit the file on a new branch, push, and open a pull-request back to main. Branch/PR names must be traceable (feat/basic-svc-<slug>, etc.).
Determinism & safety
Same input → same output. If the model drifts, retry up to n times, then quarantine the service.
Validate the JSON schema with Pydantic before writing to disk; on failure, log and skip.
Cost & noise controls
Batching: group 10 services per PR (PR_FREQUENCY) so reviewers aren’t flooded.
Budget guard: stop once the estimated OpenAI spend or token quota for the run is hit.
Checkpointing: persist a completed_services.json so reruns pick up where they left off.
These constraints keep the process affordable and the repository maintainable while still letting the LLM do the grunt work. This article will focus on the first and third points.
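To make the retry-and-quarantine rule concrete, here is a minimal sketch; the function names and the quarantine file path are assumptions for illustration, not the project’s actual code:
import json
import logging
from pathlib import Path
from pydantic import ValidationError

MAX_RETRIES = 5  # give the model a few chances before giving up

def generate_with_retries(generate_fn, service_key,
                          quarantine_path="tooling/quarantined_services.json"):
    """Call the LLM-backed generate_fn; quarantine the service after repeated schema failures."""
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            return generate_fn(service_key)  # raises ValidationError on a schema mismatch
        except ValidationError as exc:
            logging.warning("%s: attempt %d failed validation: %s", service_key, attempt, exc)
    # All retries exhausted: record the service so later runs skip it
    quarantine_file = Path(quarantine_path)
    quarantined = json.loads(quarantine_file.read_text()) if quarantine_file.exists() else []
    quarantined.append(service_key)
    quarantine_file.write_text(json.dumps(quarantined, indent=2))
    return None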
Harvesting candidate services
Source: https://awesome-selfhosted.net/#software
Goal: Iterate over every entry and convert to a JSON format for easy parsing
This bit was relatively straightforward. I decided to parse the content of https://awesome-selfhosted.net/#software (an awesome project that I fully endorse) using the BS4 library. I won’t go into the specifics here, but I will share the code:
Extraction pipeline:
1: Fetch the HTML
import requests

def fetch_awesome_selfhosted():
    url = "https://awesome-selfhosted.net/#software"
    response = requests.get(url)
    if response.status_code != 200:
        raise Exception(f"Failed to fetch the page: {response.status_code}")
    return response.text
2: Parse with Beautiful Soup
BS4 is a great Python library for pulling data out of web pages and is commonly used for web scraping. Parsing HTML can be quite delicate, as any change to the structure of the page can throw off the code.
import logging
from bs4 import BeautifulSoup

def extract_services(html: str) -> dict[str, dict]:
    soup = BeautifulSoup(html, "html.parser")
    services, skipped = {}, []
    for s in soup.select("#software section"):
        name = s.h3.get_text(strip=True)
        link = s.select_one("a.external-link")
        if not link:
            skipped.append(name)
            continue
        services[to_camel(name)] = {  # to_camel is a small helper that normalises the key
            "name": name,
            "description": s.p.get_text(strip=True) if s.p else "",
            "homepage": link["href"],
        }
    if skipped:
        logging.warning("Skipped %d services without a homepage: %s",
                        len(skipped), ", ".join(skipped))
    return services
3: Persist
The list should be persisted so it can be used as input to the Language Model.
with open("tooling/awesome_selfhosted_services.json", "w") as f:
import json
json.dump(result, f, indent=2)
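Chained together, the three steps form a tiny one-shot script; this is a sketch of how I expect it to be run (the entrypoint itself isn’t shown in the excerpts above):
if __name__ == "__main__":
    html = fetch_awesome_selfhosted()    # 1: fetch the HTML
    result = extract_services(html)      # 2: parse out name/description/homepage
    with open("tooling/awesome_selfhosted_services.json", "w") as f:
        json.dump(result, f, indent=2)   # 3: persist the cache for the generator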
The scraper gives me a one-shot JSON cache with exactly the three fields the generator needs: name, description, and homepage. Whenever Awesome-Self-Hosted changes, I just rerun the script and the new entries appear in the file.
It’s deliberately minimal:
No retry/back-off and no DOM diffing—acceptable for tooling, not for prod.
Relies on the current page structure; if the maintainers rename a class, the selector fails and I’ll fix it then.
Runs on demand, outside the dashboard’s runtime path, so a crash can’t hurt users.
I also persist an awesome_selfhosted_services_completed.json checkpoint. Each code-gen run looks at that file first, skips anything already merged, and works on the remainder. This prevents duplicate processing and lets each run pick up only what hasn’t been done yet.
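A minimal sketch of that filtering step, assuming both files live under tooling/ and the completed checkpoint holds a flat list of service keys (the real shape may differ):
import json
from pathlib import Path

def load_pending_services():
    """Return only the services that have not yet been generated and merged."""
    all_services = json.loads(Path("tooling/awesome_selfhosted_services.json").read_text())
    completed_path = Path("tooling/awesome_selfhosted_services_completed.json")
    completed = set(json.loads(completed_path.read_text())) if completed_path.exists() else set()
    return {key: svc for key, svc in all_services.items() if key not in completed}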
Generating the service stubs
The LLM must return something I can deserialize without guess-work, so the interaction is schema-first:
Schema contract
A single ServiceResponseSchema (Pydantic) drives both the OpenAI call and local validation:
class ServiceResponseSchema(BaseModel):
    file_name: str  # e.g. "service_grafana.py"
    service: str    # Human-readable name
    code: str       # The Python subclass itself
    logo: str       # URL or data-URI (future feature)
    web_app: bool   # True ⇢ include, False ⇢ skip
Any field mismatch raises immediately; no fragile json.loads() spelunking.
Prompt template
Stored in tooling/service_generation_prompt.txt with Jinja-like tokens:
You are ChatGPT. Generate a subclass of {{ base_class_name }}.
``` python
{{ base_class_file }}
```
Sample implementations of this base class:
``` python
{{ implementation_samples }}
```
Response must comply with the below schema:
``` json
{{ json_schema }}
```
At runtime the script simply .replace()s the placeholders—no template engine needed.
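For illustration, the fill-and-validate round trip could look roughly like this. It is a sketch rather than the repo’s actual code: it assumes Pydantic v2, the openai package, and hypothetical file paths for the base class and sample implementations.
import json
from pathlib import Path
from openai import OpenAI

client = OpenAI()

# Hypothetical paths for illustration; the repo's layout may differ
base_class_source = Path("dashboard/services/service_base.py").read_text()
sample_sources = Path("dashboard/services/service_grafana.py").read_text()

prompt = Path("tooling/service_generation_prompt.txt").read_text()
prompt = (prompt
          .replace("{{ base_class_name }}", "ServiceBase")
          .replace("{{ base_class_file }}", base_class_source)
          .replace("{{ implementation_samples }}", sample_sources)
          .replace("{{ json_schema }}", json.dumps(ServiceResponseSchema.model_json_schema())))

response = client.chat.completions.create(
    model="gpt-4o-mini",
    temperature=0,                            # as deterministic as the API allows
    response_format={"type": "json_object"},  # force a JSON envelope
    messages=[{"role": "user", "content": prompt}],
)

# Pydantic rejects anything that doesn't match the contract
parsed = ServiceResponseSchema.model_validate_json(response.choices[0].message.content)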
Why JSON over code blocks?
Without a rigid envelope, the model may prepend helper text or miss imports, which breaks CI. JSON gives me a single thing to parse—and Pydantic enforces it at runtime.
Why a bare string replace instead of Jinja?
The prompt is configuration, not application code; simplicity beats another dependency.
Future cleanup: swap the naïve .replace() for str.format_map() so accidental {{ }} inside comments don’t expand.
Wiring the pipeline
The automation lives in /tooling and is split into two well-defined actors:
| Actor | Responsibility | Why separate |
| --- | --- | --- |
| ServiceGenerator | Iterate over the JSON cache; call OpenAI and validate the ServiceResponseSchema; write the new service_*.py files; update awesome_selfhosted_services_completed.json | Pure “business logic” & prompt-handling; no Git side-effects. |
| GithubPRCreator | Spawn a branch; git add/commit/push; open a pull-request via the GitHub API | Keeps VCS concerns isolated; lets you swap GitHub for GitLab later. |
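In skeleton form the split looks roughly like this (a sketch reconstructed from the table and the excerpts that follow, not the exact source):
class ServiceGenerator:
    """Talks to the LLM: prompt fill, schema validation, file writing, checkpointing."""
    def __init__(self, openai_client, github_pr_creator):
        self.openai_client = openai_client
        self.github_pr_creator = github_pr_creator

    def iterate_services(self):
        """Walk the JSON cache, generate stubs, and hand batches to the PR creator."""
        ...

class GithubPRCreator:
    """Talks to Git/GitHub: branching, committing, pushing, opening PRs."""
    def __init__(self, github_client, repo_name):
        self.repo = github_client.get_repo(repo_name)

    def create_branch_and_raise_pr(self, branch, services):
        ...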
GithubPRCreator (excerpt)
def create_branch_and_raise_pr(self, branch, services):
    # services is a list of (name, ok) tuples for this batch
    # 1. git checkout -b <branch>
    # 2. git add -A && git commit -m "feat: ..."
    # 3. git push -u origin <branch>
    pr = self.repo.create_pull(
        title=f"gen: {services[0][0]} → {services[-1][0]}",
        body="\n".join(f"{n}: {ok}" for n, ok in services),
        head=branch,
        base="main",
    )
Guard rails
Refuses to reuse an existing branch (fast-fail).
Logs the PR URL so CI can post a comment or Slack alert later.
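The first guard rail can be as simple as asking GitHub whether the ref already exists before creating it; here is a sketch using PyGithub (the repo’s actual check may differ):
from github import GithubException

def assert_branch_is_new(repo, branch):
    """Fail fast if the branch already exists on the remote."""
    try:
        repo.get_branch(branch)
    except GithubException:
        return  # 404: branch does not exist yet, safe to create
    raise RuntimeError(f"Branch {branch!r} already exists; refusing to reuse it")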
ServiceGenerator (key points)
PR_FREQUENCY = 10 # services per PR
BATCH_SIZE = 10 # services per run (keeps token spend predictable)
GPT_MODEL = "gpt-4o-mini"
Prompt fill → deterministic JSON response (temperature = 0).
Schema validation with Pydantic; on failure retry ≤ 5 then skip.
Write file only when web_app is True.
Book-keeping: append to completed.json so reruns resume cleanly.
Every PR_FREQUENCY successes → delegate to GithubPRCreator.
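For orientation, here is a condensed, purely illustrative sketch of how those key points could fit together inside iterate_services; it reuses the load_pending_services helper sketched earlier, and generate_service / mark_completed are hypothetical names, not the repo’s actual methods:
def iterate_services(self):
    batch = []  # (name, ok) pairs waiting to go into the next PR
    for key, svc in load_pending_services().items():
        parsed = self.generate_service(svc)        # prompt fill + retries + schema validation
        if parsed is None or not parsed.web_app:   # quarantined, or not a web app → skip
            continue
        Path(parsed.file_name).write_text(parsed.code)
        self.mark_completed(key)                   # book-keeping in the completed checkpoint
        batch.append((parsed.service, True))
        if len(batch) >= PR_FREQUENCY:
            branch = f"feat/basic-svc-{key}"       # traceable branch name
            self.github_pr_creator.create_branch_and_raise_pr(branch, batch)
            batch = []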
Entrypoint
import os
from github import Github
from openai import OpenAI

if __name__ == "__main__":
    gh = GithubPRCreator(Github(os.getenv("GH_TOKEN")), "JTSG1/GreyhoundDash")
    ServiceGenerator(openai_client=OpenAI(), github_pr_creator=gh).iterate_services()
The main block wires the two actors together; nothing else runs in global scope, which makes the module import-safe for future unit tests.
Why this structure?
A tight, two-class design keeps the surface area small: one side talks to the LLM, the other to Git. Each part can be mocked in isolation, with retries, cost limits, and PR hygiene built in.
Wrap-up
What’s solved?
GreyhoundDash can now turn a raw list of open-source services into ready-to-review Python stubs, batch-commit them, and open pull-requests—all without a human typing the same boilerplate thirty times.
What’s still brittle?
Scraper depends on Awesome-Self-Hosted’s HTML staying stable.
Generated code is unit-tested only for syntax, not runtime behaviour.
Logo retrieval and licence compliance are TODOs.
OpenAI cost control is coarse (batch ceiling rather than per-token budgeting).
Immediate next steps
Add a CI job that lints each generated file and executes a dummy health-check.
Swap the HTML scrape for the project’s YAML source to decouple from DOM churn.
Enforce a spend cap via the OpenAI usage API and abort runs that hit it.
Store prompt + schema version in git so regressions are traceable.
Calling for help
Spot a service the scraper missed? Open an issue.
See a way to harden the GitHub automation (e.g., branch naming, PAT scope)? Let me know.
The repo is at github.com/JTSG1/GreyhoundDash. Star it to keep up to date with development!