Automating Dockerfile Generation Using Python & Large Language Models (LLMs)


Project Overview
Imagine a Python script that automatically generates a Dockerfile based on the developer's input.
For example:
Input: Java → Output: Java-based Dockerfile
Input: Rust → Output: Rust-based Dockerfile
Input: Ruby on Rails → Output: Ruby on Rails Dockerfile
This automation is powered by Generative AI and Large Language Models (LLMs) to support any programming language dynamically!
Why Not Just Use ChatGPT?
Yes, ChatGPT can generate Dockerfiles, but:
It requires manual input each time.
Many MNCs block access to ChatGPT and similar tools.
It cannot be easily automated for DevOps workflows.
Solution? We will integrate LLMs into a Python script for a smooth and automated developer experience!
Understanding Large Language Models (LLMs)
LLMs are AI models designed for text-based tasks. They fall into two categories:
A. Local LLMs
Run on your local machine or your company's private servers.
More secure and private (e.g., Meta's Llama, DeepSeek, IBM Granite).
Require setup, infrastructure, and maintenance.
B. Hosted LLMs (Cloud-based)
Hosted by providers like OpenAI (ChatGPT), Google (Gemini), or DeepSeek.
Pay per API call (cost depends on the number of tokens used).
No infrastructure maintenance required, but less privacy and security.
| Feature | Local LLMs | Hosted LLMs |
| --- | --- | --- |
| Security | High (runs in-house) | Risky (third-party) |
| Cost | Free (no API cost) | Paid per API call |
| Setup | Complex (requires infra) | Easy (ready to use) |
| Scalability | Limited | Highly scalable |
Setting Up the Project
Now, let’s implement this project using Local LLMs first.
Install Ollama (an LLM Manager for Local Models)
Ollama is like Docker, but for LLMs: it helps download, run, and manage AI models locally.
Download and install Ollama: just as we use the Docker CLI for containers, we use the Ollama CLI to install and run LLM models such as Meta's Llama or DeepSeek.
# For Linux
curl -fsSL https://ollama.com/install.sh | sh
Start Ollama Service
ollama serve
(If you installed via the Linux install script, Ollama is usually already running as a systemd service.)
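To confirm the server is up before moving on, you can hit Ollama's default local endpoint (port 11434):
curl http://localhost:11434
# Expected response: Ollama is running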
Pull the Llama 3 model:
ollama pull llama3.2:1b
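To verify the pull succeeded, list the models available locally:
ollama list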
Let's try a prompt to generate a Dockerfile for a Java-based app. Prompt given:
Create a dockerfile for java based app
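As a quick shell-level test (a one-off, non-interactive sketch using the model we pulled above), the same prompt can be passed directly to ollama run:
ollama run llama3.2:1b "Create a dockerfile for java based app"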
We get the output, but this is still manual, ChatGPT-style usage; next we will automate it with a Python script that talks to Ollama.
Project Setup for Local LLMs
Create a virtual environment: when two developers work on the same machine on two different projects that need different dependencies, the "Virtual Env" concept keeps each project's packages isolated.
python3 -m venv venv
source venv/bin/activate   # On Linux/macOS
# or
.\venv\Scripts\activate    # On Windows
Install Dependencies
echo "ollama" > requirements.txt pip3 install -r requirements.txt
The ollama package dependency inside requirements.txt is what lets our project talk to the Ollama API installed on our machine, which in turn talks to the LLM model.
- Write a Python script, generate_dockerfile.py:
import ollama

# Prompt template; we can add more constraints here to control what kind of Dockerfile we get.
PROMPT = """
ONLY Generate an ideal Dockerfile for {language} with best practices. Do not provide any description
Include:
- Base image
- Installing dependencies
- Setting working directory
- Adding source code
- Running the application
"""

def generate_dockerfile(language):
    # Use the model we pulled earlier (llama3.2:1b)
    response = ollama.chat(
        model='llama3.2:1b',
        messages=[{'role': 'user', 'content': PROMPT.format(language=language)}],
    )
    return response['message']['content']

if __name__ == '__main__':
    language = input("Enter the programming language: ")
    dockerfile = generate_dockerfile(language)
    print("\nGenerated Dockerfile:\n")
    print(dockerfile)
- Run the Application
python3 generate_dockerfile.py
We get the Dockerfile successfully: the Python script uses the ollama package, which calls the Ollama API, which triggers the LLM.
- Let's modify the prompt in the script to generate a multi-stage Dockerfile (see the sketch below).
Edit the script: vim generate_dockerfile.py
We add a multi-stage build constraint to the prompt.
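A minimal sketch of the edit, reusing the multi-stage constraint wording that the Gemini script later in this post also uses:
PROMPT = """
ONLY Generate an ideal Dockerfile for {language} with best practices. Do not provide any description
Include:
- Base image
- Installing dependencies
- Setting working directory
- Adding source code
- Running the application
- Add a multi stage build
"""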
Run the script again:
python3 generate_dockerfile.py
We get a Python Dockerfile with a multi-stage build; adjust the prompt constraints to suit your needs.
How It Works
The script takes a programming language as input (e.g., Python, Node.js, Java)
Connects to the Ollama API running locally
Generates an optimized Dockerfile with best practices for the specified language
Returns the Dockerfile content with explanatory comments
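As an optional extension (hypothetical, not part of the original script), the result can be written straight to disk so docker build can pick it up directly:
# Inside the __main__ block, after dockerfile = generate_dockerfile(language)
with open("Dockerfile", "w") as f:
    f.write(dockerfile)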
That completes the local LLM setup on our machine; the same approach can be rolled out across an organisation to ease the work. Security is high compared to hosted LLMs, and it is free of cost too.
But if your system doesn't have enough resources, try working with hosted LLMs instead.
Hosted LLMs
Hosted LLMs charge you for API calls according to the tokens used. If your machine lacks the hardware resources for local LLMs, you can use Google Gemini's hosted API.
Google AI Studio has its own model, Gemini 1.5 Pro, and Google is currently offering a free API key.
Go to Google AI Studio > Get API Key > Create API Key.
Write a Python script, generate_dockerfile_gemini.py:
import google.generativeai as genai
import os

# Set your API key here
os.environ["GOOGLE_API_KEY"] = "xxxxxxxxxxxxxxxxxxxxxxxx"

# Configure the Gemini model
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))
model = genai.GenerativeModel('gemini-1.5-pro')

PROMPT = """
Generate an ideal Dockerfile for {language} with best practices. Just share the dockerfile without any explanation between two lines to make copying dockerfile easy.
Include:
- Base image
- Installing dependencies
- Setting working directory
- Adding source code
- Running the application
- Add a multi stage build
"""

def generate_dockerfile(language):
    response = model.generate_content(PROMPT.format(language=language))
    return response.text

if __name__ == '__main__':
    language = input("Enter the programming language: ")
    dockerfile = generate_dockerfile(language)
    print("\nGenerated Dockerfile:\n")
    print(dockerfile)
This script generates a multi-stage Dockerfile.
Be careful when using these external models: if an API key leaks, someone can misuse your tokens or even mount a DoS attack against your quota.
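A safer pattern, sketched under the assumption that the key is exported in your shell (export GOOGLE_API_KEY=...) rather than hardcoded in the script:
import os
import google.generativeai as genai

# Read the key from the environment and fail fast if it is missing
api_key = os.getenv("GOOGLE_API_KEY")
if not api_key:
    raise SystemExit("GOOGLE_API_KEY is not set")
genai.configure(api_key=api_key)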
Run the script
echo "google.generativeai" > requirements.txt pip3 install -r requirements.txt python3 <nameofscript>
Install the requirements before running the python script
Use LLM models wisely depending on your task. Dockerfile generation is a very common task, so it doesn't matter much which LLM you choose.
Example Outputs:
If the user enters Node.js, the generated Dockerfile might use node:alpine as the base image.
If the user enters Ruby on Rails, it can generate a Dockerfile optimized for Rails applications.
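For illustration only, a generated Node.js Dockerfile might look roughly like this (base image tags, ports, and npm scripts are assumptions, not actual script output):
# Build stage
FROM node:alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Runtime stage
FROM node:alpine
WORKDIR /app
COPY --from=build /app ./
EXPOSE 3000
CMD ["npm", "start"]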
Important Considerations
Token Costs: Hosted LLMs (like OpenAI or Gemini) charge based on token consumption. For example:
OpenAI's GPT-4 has charged around $15 per million tokens (pricing varies by model and changes over time).
Gemini 1.5 Pro may offer free API usage, but this depends on region availability.
Security: Always keep API keys private and revoke them after use.
Rate Limiting: Prevent DoS attacks by implementing request limits on API keys.
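A minimal client-side sketch of such a limit (the interval value is an arbitrary assumption; real protection belongs in provider quotas and gateway limits), wrapping the generate_dockerfile() function from the scripts above:
import time

MIN_INTERVAL = 2.0  # minimum seconds between API calls (illustrative value)
_last_call = 0.0

def throttled_generate(language):
    # Space out calls so the API key cannot be hammered in a tight loop
    global _last_call
    wait = MIN_INTERVAL - (time.time() - _last_call)
    if wait > 0:
        time.sleep(wait)
    _last_call = time.time()
    return generate_dockerfile(language)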
Conclusion
This project automates Dockerfile generation using LLMs, improving efficiency for DevOps engineers. Whether using Llama locally or Gemini’s API, this approach simplifies containerization for various programming languages.