Automating Dockerfile Generation Using Python & Large Language Models (LLMs)

Project Overview

Imagine a Python script that automatically generates a Dockerfile based on the developer's input.
For example:

Input: Java → Output: Java-based Dockerfile
Input: Rust → Output: Rust-based Dockerfile
Input: Ruby on Rails → Output: Ruby on Rails Dockerfile

This automation is powered by Generative AI and Large Language Models (LLMs) to support any programming language dynamically!

Why Not Just Use ChatGPT?

Yes, ChatGPT can generate Dockerfiles, but:

It requires manual input each time.
Many MNCs block access to ChatGPT and similar tools.
It cannot be easily automated for DevOps workflows.

Solution? We will integrate LLMs into a Python script for a smooth and automated developer experience!

Understanding Large Language Models (LLMs)

LLMs are AI models designed for text-based tasks. They fall into two categories:

A. Local LLMs

Run on your local machine or your company's private servers.
More secure and private; examples include Meta's Llama, DeepSeek, and IBM Granite.
Require setup, infrastructure, and maintenance.

B. Hosted LLMs (Cloud-based)

Hosted by providers like OpenAI (ChatGPT), Google (Gemini), or DeepSeek.
Pay per API call (cost depends on the number of tokens used).
No infrastructure maintenance required, but less privacy and security.

Feature     | Local LLMs               | Hosted LLMs
----------- | ------------------------ | ----------------------
Security    | High (runs in-house)     | Riskier (third-party)
Cost        | Free (no API cost)       | Paid per API call
Setup       | Complex (requires infra) | Easy (ready to use)
Scalability | Limited                  | Highly scalable

Setting Up the Project

Now, let’s implement this project using Local LLMs first.

Install Ollama (LLM Manager for Local Models)

Ollama is like Docker, but for LLMs. It helps you download, run, and manage AI models locally.

  1. Download and install Ollama: just as we use the Docker CLI to manage containers, we use the Ollama CLI to install and run LLM models such as Meta's Llama or DeepSeek.

     # For Linux
     curl -fsSL https://ollama.com/install.sh | sh
    
  2. Start Ollama Service

     ollama serve
    
  3. Pull the Llama 3.2 model (here, the small 1B-parameter variant):

     ollama pull llama3.2:1b
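
To confirm the model downloaded, list the models available locally:

     ollama list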
    

Let's try a prompt to generate a Dockerfile for a Java-based app. Prompt given:

Create a dockerfile for java based app

We get a Dockerfile back, but this is still a manual, ChatGPT-style workflow. Next, we will automate it with a Python script that talks to Ollama (a quick shell test first, below).
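
For a quick one-off test straight from the shell, ollama run also accepts the prompt as an argument:

     ollama run llama3.2:1b "Create a dockerfile for java based app"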

Project Setup for Local LLMs

  1. Create a virtual environment: when two developers share a machine and work on different projects that need different dependencies, the "virtual env" concept keeps each project isolated.

     python3 -m venv venv
     source venv/bin/activate  # On Linux/MacOS
     # or
     .\venv\Scripts\activate  # On Windows
    
  2. Install Dependencies

     echo "ollama" > requirements.txt
     pip3 install -r requirements.txt
    

We need to install these dependencies to run the project. The ollama package listed in requirements.txt lets our script talk to the Ollama API installed on our machine, which in turn talks to the LLM model.

  3. Write a generate_dockerfile.py Python script
import ollama

# We can keep adding constraints to the prompt to control exactly what kind of Dockerfile we get.
PROMPT = """
ONLY generate an ideal Dockerfile for {language} with best practices. Do not provide any description.
Include:
- Base image
- Installing dependencies
- Setting working directory
- Adding source code
- Running the application
"""

def generate_dockerfile(language):
    # Ask the locally pulled model via the Ollama API
    response = ollama.chat(
        model='llama3.2:1b',
        messages=[{'role': 'user', 'content': PROMPT.format(language=language)}],
    )
    return response['message']['content']

if __name__ == '__main__':
    language = input("Enter the programming language: ")
    dockerfile = generate_dockerfile(language)
    print("\nGenerated Dockerfile:\n")
    print(dockerfile)

  4. Run the Application
python3 generate_dockerfile.py

We got the Dockerfile successfully with the help of the Python script: the script uses the ollama package > the package calls the local Ollama API > the API triggers the LLM.
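
Under the hood, the ollama package is just a wrapper around Ollama's local REST API (served on port 11434 by default). You can reproduce the same call with curl to see exactly what the package does:

     curl http://localhost:11434/api/chat -d '{
       "model": "llama3.2:1b",
       "messages": [{"role": "user", "content": "Create a Dockerfile for Java"}],
       "stream": false
     }'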

  5. Let's try modifying the prompt in the script to generate a multi-stage Dockerfile

Edit > vim generate_dockerfile.py

We added a multi-stage build requirement to the prompt.

Run the script again:

python3 generate_dockerfile.py

We got the Python Dockerfile with a multi-stage build; adjust the prompt further as needed (a sketch of the edit follows).
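
The change is just one extra constraint line at the end of the PROMPT string in generate_dockerfile.py:

PROMPT = """
ONLY generate an ideal Dockerfile for {language} with best practices. Do not provide any description.
Include:
- Base image
- Installing dependencies
- Setting working directory
- Adding source code
- Running the application
- Use a multi-stage build to keep the final image small
"""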

How It Works

  1. The script takes a programming language as input (e.g., Python, Node.js, Java)

  2. Connects to the Ollama API running locally

  3. Generates an optimized Dockerfile with best practices for the specified language

  4. Returns the generated Dockerfile content, ready to copy into your project

We are done with the local LLM on our machine; the same setup can be rolled out across an organisation to ease the work. Security is high compared to hosted LLMs, and it is free of cost too.

But if your system doesn't have enough resources, try the hosted LLMs instead.

Hosted LLMs

Hosted LLMs charge you for API calls according to tokens. If your machine lacks hardware resources for local LLMs, you can use Google Gemini's hosted API.

Google AI Studio offers Google's own model, Gemini 1.5 Pro, and Google is currently offering a free API key.

  1. Go to Google AI Studio > Get API Key > Create API Key

  2. Write a python script generate_dockerfile_gemini.py

     import google.generativeai as genai
     import os
    
     # Set your API key here (better: export GOOGLE_API_KEY in your shell and delete this line)
     os.environ["GOOGLE_API_KEY"] = "xxxxxxxxxxxxxxxxxxxxxxxx"
    
     # Configure the Gemini model
     genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))
     model = genai.GenerativeModel('gemini-1.5-pro')
    
     PROMPT = """
     Generate an ideal Dockerfile for {language} with best practices. Just share the dockerfile without any explanation between two lines to make copying dockerfile easy.
     Include:
     - Base image
     - Installing dependencies
     - Setting working directory
     - Adding source code
     - Running the application
     - Add a multi stage build
     """
    
     def generate_dockerfile(language):
         response = model.generate_content(PROMPT.format(language=language))
         return response.text
    
     if __name__ == '__main__':
         language = input("Enter the programming language: ")
         dockerfile = generate_dockerfile(language)
         print("\nGenerated Dockerfile:\n")
         print(dockerfile)
    

    Script for a multi-stage Dockerfile.

    Be careful while using these external models: if your API key leaks, someone can misuse your tokens or even mount a DoS attack against your quota.
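
    A safer pattern (a sketch, assuming a Linux/macOS shell) is to export the key before running the script and delete the hardcoded line, so the key never lives in your source code:

     export GOOGLE_API_KEY="your-key-here"
     python3 generate_dockerfile_gemini.py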

  3. Run the script

     echo "google.generativeai" > requirements.txt
     pip3 install -r requirements.txt
     python3 <nameofscript>
    

    Install the requirements before running the python script

Use LLM models wisely depending on your task. Generating a Dockerfile is a very common task, so it doesn't matter much which LLM you choose.

Example Outputs:

  • If the user enters Node.js, the generated Dockerfile might use node:alpine as the base image (see the sketch after this list).

  • If the user enters Ruby on Rails, it can generate a Dockerfile optimized for Rails applications.
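
For instance, a Node.js run could produce something along these lines (a representative sketch only; actual model output will vary, and index.js is a placeholder entry point):

     # Representative sketch of a generated Dockerfile -- actual model output will vary
     FROM node:alpine
     WORKDIR /app
     COPY package*.json ./
     RUN npm install --production
     COPY . .
     EXPOSE 3000
     CMD ["node", "index.js"]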

Important Considerations

  • Token Costs: Hosted LLMs (like OpenAI or Gemini) charge based on token consumption. For example:

    • OpenAI's GPT-4 charges around $15 per million tokens; at that rate, a typical ~500-token Dockerfile response costs well under a cent.

    • Gemini 1.5 Pro may offer free API usage, but this depends on region availability.

  • Security: Always keep API keys private and delete them after use.

  • Rate Limiting: Prevent abuse and runaway costs by implementing request limits on your API keys (see the sketch below).
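
Hosted providers enforce server-side quotas, but you can also throttle from the client side so a runaway loop can't burn through your tokens. A minimal sliding-window sketch (the 10-calls-per-minute limit is an arbitrary assumption, and generate_dockerfile is the function from the script above):

import time

class RateLimiter:
    """Allow at most max_calls within a sliding window of period seconds."""
    def __init__(self, max_calls=10, period=60.0):
        self.max_calls = max_calls
        self.period = period
        self.calls = []  # timestamps of recent calls

    def wait(self):
        now = time.monotonic()
        # Keep only timestamps still inside the window
        self.calls = [t for t in self.calls if now - t < self.period]
        if len(self.calls) >= self.max_calls:
            # Sleep until the oldest call ages out of the window
            time.sleep(self.period - (now - self.calls[0]))
        self.calls.append(time.monotonic())

limiter = RateLimiter(max_calls=10, period=60.0)

def generate_dockerfile_throttled(language):
    limiter.wait()  # blocks if we have hit the per-minute limit
    return generate_dockerfile(language)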

Conclusion

This project automates Dockerfile generation using LLMs, improving efficiency for DevOps engineers. Whether using Llama locally or Gemini’s API, this approach simplifies containerization for various programming languages.
