CNN, YOLO & the Unpluckable Problem

Let me explain.

So when we’re talking AI, we need data. Good, bad, incomplete, it doesn’t matter. What matters is there is something to train the models on.

So I scoured the depths of the internet and found a few datasets. Some had already been used to train disease-detection models, and one covered “Year, Month and District-wise Estimated Production of Tea in India” from 2013 to 2024. I thought it would make for a great Power BI project. I love a good data visualisation project, but it would mean re-installing Power BI. I could also have worked on a forecasting model, but I had mentally moved on to the next dataset in the Google search results.

I picked the “Age-Stratified Tea Leaf Quality Classification Dataset.”
It had annotated tea leaves—complete with bounding boxes and everything.

Naturally, I thought: YOLO.
As in, “You Only Look Once,” the object detection algorithm—not the mantra of a reckless teenager. Though in hindsight, both might apply.

I also considered CNNs, because surely classifying leaves by their plucking standard (‘two leaves and a bud’ and so on) might be helpful…?

Let’s Spiral for a Second

So the more I pondered what to do with this dataset, the more I realised I was either trying to solve problems that did not exist, or proposing solutions that wouldn’t make things easier for anyone.

First Idea: Assist New Pluckers

Maybe the model could help seasonal workers identify the right plucking standard. Is that a bud or a tiny leaf?

But in reality—do we really want them to stop every few seconds to check their phone? Also, let’s be honest: pluckers are usually very experienced.

The result? If a plucker constantly checks their phone before plucking, efficiency drops. Leaves go unplucked. Everyone’s annoyed.

An alternative: maybe, just maybe, the manager or supervisor could use it to train pluckers before they head into the field. The trained model would test the pluckers; if they fail, they repeat the training until they learn to pluck the right, good-quality leaves.

Okay. That’s not bad.

But is it really a requirement?

I don’t know yet.

Second Idea: Sorting After Plucking

Alright. So field use was a no-go. What about after the leaves are plucked?

Imagine a tray full of freshly picked leaves. Surely, a trained YOLO model could analyse an image and spit out numbers: X buds, Y mature leaves, Z stray twigs and one unfortunate beetle.
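For what it's worth, the counting step itself would be the easy part. Here is a minimal sketch, assuming a detector has already returned a list of (label, confidence) pairs for one tray image. The class names, the 0.5 confidence cut-off and the sample detections are all invented for illustration; none of them come from the dataset or from any particular YOLO library.

```python
from collections import Counter

# Hypothetical detector output for one tray image: (class label, confidence).
# Labels and scores are made up for illustration, not real model output.
detections = [
    ("bud", 0.91),
    ("mature_leaf", 0.84),
    ("bud", 0.77),
    ("twig", 0.62),
    ("mature_leaf", 0.41),  # below the cut-off, so it is dropped
    ("beetle", 0.58),
]


def tally(detections, min_confidence=0.5):
    """Count detections per class, ignoring low-confidence ones."""
    return Counter(
        label for label, confidence in detections if confidence >= min_confidence
    )


counts = tally(detections)
print(dict(counts))
```

The hard part is everything before this: getting a detector that produces trustworthy labels on a real tray in the first place.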

Useful? Possibly.

But there are hiccups.

For starters, the data I currently have, the very material I’m training this grand vision on, is a series of individual tea leaves against clear-ish backgrounds. Some of them also feature fingers, human ones.

I felt the need to clarify that.

But even if we were to politely ignore that detail, this is not a glorious heap of overlapping foliage. So while the tray is more realistic, the model might just cry quietly and return a bounding box for the tray itself.

And even if the detection did work (say I got the right kind of dataset and trained the model on images of trays with leaves), what exactly would we do with the results?

A weighing scale already tells us how much has been plucked. We do, after all, sell per kilogram! Unless we’re assigning price points based on leaf maturity on a per-basket basis, what’s the value here?

Worse, by the time we’re sorting trays, the leaves are already plucked, and that can’t be undone. So at best, we’ve invented a slightly judgmental camera.

Third Idea: Quality Estimation at Factory Intake

This one felt promising. Imagine capturing an image of the batch before processing—just as it arrives at the factory gate.

Could the model provide an automated quality grade?

Maybe it sees too many older leaves and throws up a red flag. Or maybe it spots the golden standard and awards five stars and a tiny tea emoji.
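To make the "red flag" idea concrete, here is a rough sketch of a hand-written rule (not anything learned from data) that grades a batch from per-class leaf counts. The class names and the 30% threshold are assumptions picked purely for illustration, not an industry standard.

```python
def grade_batch(counts, max_mature_share=0.3):
    """Flag a batch when mature leaves exceed an assumed share of all leaves.

    `counts` maps hypothetical class names to detection counts.
    Returns "flag", "ok", or "unknown" when no leaves were counted.
    """
    buds = counts.get("bud", 0)
    young = counts.get("young_leaf", 0)
    mature = counts.get("mature_leaf", 0)
    total = buds + young + mature
    if total == 0:
        return "unknown"
    mature_share = mature / total
    return "flag" if mature_share > max_mature_share else "ok"


good_batch = grade_batch({"bud": 10, "young_leaf": 20, "mature_leaf": 5})
coarse_batch = grade_batch({"bud": 2, "young_leaf": 3, "mature_leaf": 5})
print(good_batch, coarse_batch)
```

A rule like this only moves the problem: someone who knows tea still has to decide what the threshold should be.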

But again… no.

Our dataset contains isolated leaf photos. Not bags or batches or anything remotely resembling what factory intake looks like.

And there’s another minor snag: we don’t have labels for “good batch” vs “bad batch.” Which means the model can’t learn what to reward or warn about. No labels, no learning.

At this point, I’m in over my head, but a spark of an idea ignites.

Fourth and Final Idea: Data Logging (The Audacity of Hope)

Now here’s where I thought I was being clever.

What if, what if, this whole image-taking thing wasn’t for real-time action at all?

What if it was just for keeping records?

Snap a photo at plucking time. Another at intake. A third at processing. Over days and weeks, we’d build a timeline. An archive of tea quality, picking patterns, pest issues, and that week when half the estate was underwater.

Over time, this could inform trends. Help trace quality back to picking standards. Maybe even train future models.
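As a sketch of what such a log might look like, assuming each photo is reduced to a single number (here a "fine leaf share" between 0 and 1), the records could be grouped by week to show a trend. The field names, file paths and values below are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass
class PhotoLog:
    taken_on: date
    stage: str              # "plucking", "intake" or "processing"
    image_path: str         # hypothetical file location
    fine_leaf_share: float  # assumed metric: share of buds and young leaves


def weekly_average(logs, stage):
    """Average fine_leaf_share per ISO (year, week) for one stage."""
    buckets = defaultdict(list)
    for log in logs:
        if log.stage == stage:
            iso = log.taken_on.isocalendar()
            buckets[(iso[0], iso[1])].append(log.fine_leaf_share)
    return {week: sum(v) / len(v) for week, v in sorted(buckets.items())}


# Invented sample records, for illustration only.
logs = [
    PhotoLog(date(2024, 6, 3), "intake", "photos/0603_intake.jpg", 0.70),
    PhotoLog(date(2024, 6, 5), "intake", "photos/0605_intake.jpg", 0.60),
    PhotoLog(date(2024, 6, 10), "intake", "photos/0610_intake.jpg", 0.50),
    PhotoLog(date(2024, 6, 10), "plucking", "photos/0610_pluck.jpg", 0.80),
]
trend = weekly_average(logs, "intake")
print(trend)
```

Keeping the stage on every record is what would let quality at intake be traced back to what was photographed at plucking.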

But then reality knocked.

Are we really going to ask temporary, seasonal pluckers to take photos of every basket? And with what phones? On whose time?

Even if the estate manager did it once a day, are they really going to maintain photographic discipline for an entire season?

I’ll admit—this one still has some charm. But it depends on infrastructure, buy-in, and a deeply dedicated workforce. Not impossible. Just…unlikely.

At this point, I stared into my tea and whispered, “What am I doing?”

What Tea Was I Drinking?

Okay. Let’s take a break and discuss what is, truly, the best beverage after water chilled in an earthen pot.

Tea.

That’s what this is all about, isn’t it?

Righto! I was drinking vanilla-flavoured oolong tea.

After that quick restorative break, we’re back to greying hair, one at a time.

What Would Actually Be Useful?

The problem isn’t the idea. The problem is the data.

To build genuinely helpful AI tools for estates, we’d need a different kind of dataset. Something less “images of leaves with mystery boxes” and more like this:

Training and supervision data: Images taken at the moment of plucking, labelled by experts. So we know exactly what was intended, and what was actually picked.
Factory intake quality estimation: Batch-level images with detailed grading. Ideally, something a trained eye would annotate (“a bit too mature, poor oxidation, might ferment oddly”), so machines can learn what “subtle but important” looks like.
Estate-level data logs: Regular snapshots over time—daily or weekly photos taken from consistent angles—to track trends, spot diseases, note nutrient deficiencies, or capture seasonal quirks.

Currently, this type of dataset does not exist. Not publicly. And especially not for Indian estates.

So… What Now?

To the people working in tea—researchers, estate managers, technologists—would tools like this actually help?

Would a computer vision model that can track plucking trends, estimate intake quality, or even flag early-stage plant health issues make your job easier?

Or am I simply caffeinated and grasping at straws here?

Because here’s the thing: I want to help. I just don’t want to make tech for the sake of saying “Look! I used AI on tea leaves!” when that model is never going to be used by anyone outside this notebook.

Also, let’s face it, I’m two classes into a six-month tea course; I know not the T of tea. But I am coming to realise that there is a massive hole in India-specific data.

P.S. I Still Think There’s Hope

While this particular model may be shelved for now, I’m not going to delete the dataset yet. It is 4.5GB worth of data and it took me an hour to download. So I’ll keep it for when inspiration strikes.

I’m still intrigued by plant health management. Image-based disease and nutrient deficiency detection has already been explored in other crops, and in tea too, with real success. I will work on that dataset as well, and learn what affects our precious leaves.

Then there’s the Power BI / data visualisation project that I am really looking forward to; hopefully it has the right kind of data to try a few forecasting models on. And who am I kidding, I LOVE working in Power BI. It’s almost meditative!

I know this is a long road ahead and I don’t really mind it. But if anyone’s willing to point me in the right direction—or send me on a detour to Tocklai—I’d be very grateful.

Until then, this blog remains my lab notebook.
Equal parts journal, research log, and a mild existential crisis.

Thank you for tuning in, and bring your own teacup next time!
