📚 From Blurry Pages to Clear Learning: My ML Internship at Suvidha Foundation

At Suvidha Foundation, my internship wasn’t about model accuracy or leaderboard scores — it was about impact.
I worked on a problem that many don’t think of as technical at first glance: books that were too blurry to read. These were scanned using mobile phones — some folded at the corners, some under bad lighting — and they were the only source of study material for underprivileged girls who didn’t have access to printed textbooks.
I knew if I could clean up these pages — make the text sharp and OCR-ready — I could help make education just a bit more accessible. So I used everything I knew about image preprocessing and computer vision to build a tool that could turn blurry scans into usable learning material.
🧑‍💻 Internship Role
Machine Learning Intern
Suvidha Foundation (NGO)
June 2023 – July 2023
🛠️ What I Worked On
🔹 Text Recovery from Noisy Scans
Processed 150+ pages of mobile-scanned textbooks
Used OpenCV to apply:
Grayscale conversion to strip color casts before thresholding
Otsu's thresholding for binarization
Erosion and dilation to remove speckle noise and separate touching characters
Contour analysis to isolate text from shadows or folds
Saved each cleaned page as a .png file for OCR use
Verified OCR accuracy by running Tesseract on the output and manually checking readability
💡 What Made It Tricky
Scans weren’t consistent — different phones, lighting, resolution
Pages were often skewed or had shadow lines from folds
Sometimes contrast was so low that the preprocessing had to be tuned page by page
But once I found the right combination of contrast stretching, adaptive thresholding, and image dilation, the text popped out. It became readable, printable, and usable.
❤️ Why This Mattered
These weren’t just experiments. These pages were used to build booklets and handouts for students in remote schools who couldn’t afford full textbooks.
My work helped support Suvidha Foundation’s mission of empowering 100+ underprivileged girls by giving them access to educational material that would otherwise have been unreadable or incomplete.
🧠 What I Learned
That computer vision has the power to support basic human needs — like education
That “data preprocessing” isn’t just a boring step — it can make or break the outcome
How to build tools that work on real-world messiness, not just sanitized academic data
And most importantly, how ML can be quietly powerful in the background — just cleaning up words so someone else can read them
✨ Bonus: Fundraising Support
Alongside the tech work, I also helped Suvidha’s outreach team run a small fundraising campaign. We used email automation and social media to raise awareness — and ended up helping fund learning kits and notebooks for 100+ girls.
✅ Tech Stack
Python, OpenCV, NumPy
Otsu Thresholding, Morphological Ops (Dilation, Erosion)
Tesseract OCR (for validation)
Manual validation using side-by-side visual comparison
✉️ Let’s Connect
If you’re building AI for education or NGOs, or just want to chat about how to use ML for social good, reach out anytime.
Written by Khushal Jhaveri
