I developed an API and a machine learning model for tea grade classification
Tea is a beloved beverage enjoyed worldwide, with countless varieties boasting unique flavors and aromas. However, grading tea leaves plays a crucial role in determining their quality and market value. Traditionally, this process relies on the expertise of human tea tasters. But what if we could leverage the power of machine learning (ML) to automate and potentially even enhance tea grade classification?
In this article, we'll delve into the development of an API and an ML model for classifying tea grades. We'll explore the technical aspects of building this system, highlighting the challenges and the exciting possibilities it presents.
Brewing the Data (creating the custom dataset)
The first step involved acquiring a robust dataset of tea images. This was the first and biggest challenge I encountered while developing this machine learning model. Below are the details of the dataset.
Number of images: 164

Number of tea labels (grades): 16

| Tea grade | No. of images |
| --- | --- |
| BOP 1 | 14 |
| BOP 1A | 11 |
| BOPF | 6 |
| BP 1 PF 1 | 6 |
| CTC | 21 |
| DUST 1 | 8 |
| FBOP | 15 |
| FBOPF 1 | 14 |
| Golden Tips | 8 |
| Green Tea | 10 |
| OP | 15 |
| OP 1 | 17 |
| OPA | 8 |
| PD | 7 |
| PEKOE 1 | 14 |
| Silver Tips | 7 |
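For context, YOLOv8's classification trainer expects a folder-per-class layout, typically with train/ and val/ splits. Here is a minimal sketch for sanity-checking the per-class counts; the split names and image extensions are assumptions on my part:

```python
from pathlib import Path

# Assumed layout for YOLOv8 classification:
# tea_grades_dataset/
#   train/<grade name>/<images>
#   val/<grade name>/<images>
dataset = Path('./tea_grades_dataset')
extensions = {'.jpg', '.jpeg', '.png'}  # assumed image extensions

for split in ('train', 'val'):
    for class_dir in sorted((dataset / split).iterdir()):
        if class_dir.is_dir():
            count = sum(1 for p in class_dir.iterdir() if p.suffix.lower() in extensions)
            print(f'{split}/{class_dir.name}: {count} images')
```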
Developing the machine learning model
First, I started developing it with TensorFlow and the Keras API because they were easier and more straightforward. Then I considered the PyTorch framework, since it is a much better fit for research-level projects. Finally, I decided to go with YOLOv8 because of the flexibility and straightforwardness it provides, and especially because I wanted the dataset to carry the primary weight when it comes to the accuracy of the model.
So, I then trained the model using the custom dataset I had created.
```python
from ultralytics import YOLO

# load a pretrained classification model (recommended for training)
model = YOLO('yolov8n-cls.pt')

# train the model on the custom dataset
model.train(data='./tea_grades_dataset', epochs=20, imgsz=64)
```
YOLOv8 saves two model checkpoints during training:

- best.pt: the model with the best fitness score
- last.pt: the model from the final training epoch

Here I took the best.pt model.
```python
from ultralytics import YOLO

def model_train():
    # load the trained weights with the best fitness score
    model = YOLO('/home/warrior/Development/tea-grading-api-and-model/runs/classify/train/weights/best.pt')
    return model
```
Now I am ready to classify tea grades with my model.
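Before wiring the model into an API, a quick local check shows what a prediction result looks like (sample_tea.jpg is just a placeholder path):

```python
from predict import model_train

model = model_train()

# classify a single image (placeholder path)
result = model('./predict/sample_tea.jpg')

names = result[0].names        # index-to-grade-name mapping
probs = result[0].probs        # classification probabilities
print(names[probs.top1])       # top-1 predicted tea grade
print(float(probs.top1conf))   # confidence of the top-1 prediction
```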
Building the API: Serving Predictions
Here I used the FastAPI framework rather than Flask or Django.
```python
from fastapi import FastAPI, File, UploadFile
from predict import model_train
import subprocess
import os

app = FastAPI()
model_data = None

@app.post("/predict/")
async def predict_image(file: UploadFile = File(...)):
    global model_data
    # lazily load the model on the first request
    if model_data is None:
        model_data = model_train()

    # save the uploaded image into the predict/ folder
    os.makedirs("./predict", exist_ok=True)
    with open(f"./predict/{file.filename}", "wb+") as f:
        f.write(file.file.read())

    # run classification on the saved image
    result = model_data(f"./predict/{file.filename}")
    names_dict = result[0].names
    probs = result[0].probs
    index = probs.top1
    return {"result": names_dict[index]}

def run_uvicorn():
    uvicorn_command = [
        "uvicorn",
        "api:app",
        "--host", "127.0.0.1",
        "--port", "8000",
        "--reload",
    ]
    subprocess.run(uvicorn_command, check=True)

if __name__ == "__main__":
    run_uvicorn()
```
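With the server running, the upload endpoint can be tested with a small client script. This is just a sketch using the requests library, with a placeholder filename:

```python
import requests

# POST a local image file to the /predict/ endpoint
with open('sample_tea.jpg', 'rb') as f:
    response = requests.post(
        'http://127.0.0.1:8000/predict/',
        files={'file': ('sample_tea.jpg', f, 'image/jpeg')},
    )

print(response.json())  # e.g. {"result": "OP 1"}
```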
So, first I created the API using FastAPI to accept an uploaded image. Then I realized this is not how it is practically done when integrating the API into another application, so I modified the code to accept a URL of the image instead. Here is the code for that.
```python
from fastapi import FastAPI
from predict import model_train
import subprocess
import urllib.request
import os

app = FastAPI()
model_data = None

@app.post("/predict/")
async def predict_image(url: str):
    global model_data
    # lazily load the model on the first request
    if model_data is None:
        model_data = model_train()

    def download_image(url, save_as):
        urllib.request.urlretrieve(url, save_as)

    # download the image from the given URL into the predict/ folder
    os.makedirs("./predict", exist_ok=True)
    save_as = './predict/predict.jpg'
    download_image(url, save_as)

    # run classification on the downloaded image
    result = model_data('./predict/predict.jpg')
    names_dict = result[0].names
    probs = result[0].probs
    index = probs.top1
    return {"result": names_dict[index]}

def run_uvicorn():
    uvicorn_command = [
        "uvicorn",
        "api:app",
        "--host", "127.0.0.1",
        "--port", "8000",
        "--reload",
    ]
    subprocess.run(uvicorn_command, check=True)

if __name__ == "__main__":
    run_uvicorn()
```
When using a URL as the input to the API, I first download the image into the predict/ folder and then run the classification. The API returns the name of the tea grade predicted by the model.
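Since url is declared as a plain function parameter, FastAPI reads it from the query string. Here is a sketch of calling this version with the requests library; the image URL is a placeholder:

```python
import requests

# the image URL is passed as a query parameter
response = requests.post(
    'http://127.0.0.1:8000/predict/',
    params={'url': 'https://example.com/tea_sample.jpg'},
)

print(response.json())  # e.g. {"result": "CTC"}
```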
A Cup of Potential
This project demonstrates the exciting potential of ML in the tea industry. An API for tea grade classification can offer several benefits:
- Efficiency: Automating tea grade classification can significantly reduce the time and resources required compared to manual methods.
- Scalability: The API can handle large volumes of tea images, making it suitable for high-throughput operations.
- Objectivity: ML models can provide a more objective and consistent grading process compared to human tasters, who can be susceptible to fatigue or bias.
Looking Ahead: The Future of Tea Grading
This project paves the way for further innovation in tea grading. Integrating the API with mobile apps could enable real-time tea grade assessment at farms or processing facilities. Additionally, the model can be continuously improved by incorporating new data and exploring more advanced ML techniques.
By harnessing the power of ML, we can not only streamline tea grading but also potentially gain deeper insights into the complex world of tea quality. So, the next time you savor a cup of tea, remember the potential role of intelligent systems in ensuring its quality and delivering a consistent, delightful experience.
Thank you for reading....