ML Model Deployment using Docker and FastAPI

Why ML engineers and researchers need to consider Docker and FastAPI
Usually, data scientists and ML engineers build prototypes of AI models in Jupyter notebooks or Colab. This approach is quick for demonstrating model predictions and an initial proof of concept. Things change, however, when the aim is to build production-ready models or when multiple engineers/researchers work on the same model. We then need a proper development environment and an IDE like PyCharm (there are many; I love PyCharm for ML tasks): https://www.jetbrains.com/pycharm/#. If multiple collaborators work on the same model in their own environments, there is a chance that dependent libraries are not compatible across machines (for example, one engineer uses Python 3.x and another uses Python 3.y).
To resolve this, we use the concept of containerization. Docker is a tool that offers containerization: the model can be built as a Docker image that runs on any machine with a Docker environment.
Once an ML model is created, it runs on a local machine (or GPU). Unless an external service integrates with your model or users consume it, it is just magic software lying in your local VM. Imagine what the world would have missed if ChatGPT had never been exposed as an HTTP API.
To expose your model over HTTP, we need to run it as a REST API, and FastAPI https://fastapi.tiangolo.com/ can be used for this. It is a Python framework and follows the OpenAPI standard.
Let's take the example of a simple ML model that converts a temperature given in Celsius to Fahrenheit. The objective of this demonstration is to show how an ML model can run independently in a Docker container and be exposed as a REST API.
Containerize the model and expose it as a REST API
Open PyCharm and create a new project ModelTemp. All files mentioned below should be created under this project root.
Create a Python file tempPredict.py.
from fastapi import FastAPI
from pydantic import BaseModel
import numpy as np
from sklearn.linear_model import LinearRegression

tempPredict = FastAPI(title="Celsius to Fahrenheit")

# Train the model on a Celsius-Fahrenheit mapping
Celsius = np.array([0, 20, 40, 60, 80, 100]).reshape(-1, 1)
Fahrenheit = np.array([32, 68, 104, 140, 176, 212])
model = LinearRegression()
model.fit(Celsius, Fahrenheit)

# Request body schema
class TempConvertor(BaseModel):
    celsius: float

@tempPredict.post("/predict")
def predict_temp(req: TempConvertor):
    # Convert the numpy prediction to a plain float so it serializes cleanly to JSON
    prediction = float(model.predict(np.array([[req.celsius]]))[0])
    return {"celsius": req.celsius, "fahrenheit": round(prediction, 2)}
- The steps above are simple and any data scientist can follow them. The model is trained on a few Celsius-Fahrenheit pairs using linear regression. Next, we create a requirements.txt file that lists all the dependencies required to run the model.
fastapi
uvicorn
scikit-learn
numpy
- Here comes the main part. We now create a Dockerfile, which contains the commands needed to install the dependencies from the requirements file, along with other details such as the exposed port and the location where the files are copied inside the Docker environment.
FROM python:3.10-slim
WORKDIR /tempPredict
COPY tempPredict.py /tempPredict
COPY requirements.txt /tempPredict
# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt
EXPOSE 8000
CMD ["uvicorn", "tempPredict:tempPredict", "--host", "0.0.0.0", "--port", "8000"]
This is how the project structure looks in PyCharm, with the three files created above under the project root:
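ModelTemp/
├── tempPredict.py
├── requirements.txt
└── Dockerfile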
Now let's run the Docker commands to build and run the Docker image for the model. Docker Desktop should be installed on your local machine: https://www.docker.com/products/docker-desktop/. Docker commands can be run from the PyCharm terminal or the Docker Desktop terminal.
docker build -t temperature-model .
docker run -p 8000:8000 temperature-model
You should see the Uvicorn startup output in the console once the model runs on the FastAPI server.
- Once the container is running, open http://localhost:8000/docs in the browser; the FastAPI server serves its interactive API docs there. FastAPI turns your ML model into a REST API exposed over an HTTP URL, so once the model is exposed as an HTTP API, it can be used by different consumers and users, as in the sketch below.
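For instance, here is a minimal sketch of calling the running endpoint from a Python client (assuming the requests library is installed on the caller's machine):

import requests

# Call the /predict endpoint exposed by the container on port 8000
resp = requests.post("http://localhost:8000/predict", json={"celsius": 37.0})
print(resp.json())  # expected: {"celsius": 37.0, "fahrenheit": 98.6}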
Push the model image to Docker Hub
Once an engineer creates a Docker image of the model on their machine, it is ready to be pushed to Docker Hub, where other collaborators can pull it and start working with it.
- We need to create a repository on Docker Hub to which different versions of the image will be pushed.
- The next step is to log in to Docker Hub from the terminal and create a tag for your local image.
docker login -u <username>
Enter password:
docker tag temperature-model <username>/<repo_name>:latest
# ex : docker tag temperature-model mridul12/temperature-model:latest
# To version the model image, the tag can be 1.0, then 1.1 for the next push, and so on
- Once the tag is ready, push your image to Docker Hub. After the image is pushed, other engineers can pull it.
docker push <image_tag>
docker pull <image_tag>
# ex : docker push mridul12/temperature-model:1.0
- Image and container details can be viewed in the console using the commands below:
docker ps
docker images
Conclusion
Containerizing an ML model and exposing it as a REST API opens up new opportunities for exploring the advantages of ML within microservice and container architectures. I will appreciate your comments and any feedback. You can reach me at mridul.deka09@gmail.com. My LinkedIn profile: https://www.linkedin.com/in/mriduldeka09/