Building "Resume Analyzer" – An AI-Powered Resume Matching Tool Using Stream lit & AWS


Hey everyone!
I’m excited to share a project I’ve built recently called Resume Analyzer — an AI-powered tool that helps match resumes with job descriptions using natural language processing. The idea was born out of the struggle many freshers face in getting noticed by recruiters, especially when their resumes don’t reflect job-relevant keywords.
This post outlines what it does, how I built it, and how it can help students and job seekers optimize their resumes for better chances of selection.
💡 The Idea
Most candidates submit a generic resume everywhere. But many companies use keyword-matching algorithms or applicant tracking systems (ATS) to filter resumes before a human even sees them. So, I decided to build a smart tool that:
Uploads your resume as a PDF
Accepts a job description
Highlights which keywords are missing
Calculates a matching success score
All of this — in seconds — via a simple browser app!
⚙️ Tech Stack
| Frontend UI | Backend Logic | Storage & Deployment |
| --- | --- | --- |
| Streamlit | Python (text matching) | AWS S3 (resume storage) |
| HTML + CSS (via Streamlit) | Text tokenization & NLP | Deployed on Render |
✨ Features
✅ Upload your resume in PDF format
📄 Paste any job description text
🔍 Compare the resume against the JD using token-based matching
✅ Upload the resume securely to an AWS S3 bucket
📊 See a success rate based on keyword matching
💡 Get suggestions for missing keywords to improve the resume
🧠 Simple, clean, and fast user experience
📦 How It Works (End-to-End)
1. Upload & Parse Resume
Users upload their resume as a PDF. Instead of using OCR (which is often inaccurate for digital PDFs and needs system-level dependencies), we extract the text directly with a library like PyMuPDF (fitz) or PDFMiner. This makes extraction faster and more reliable.
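Here's a minimal sketch of what that extraction step can look like with PyMuPDF; the function name `extract_text_from_pdf` is illustrative rather than the exact code in the repo:

```python
# resumeparser.py -- illustrative sketch, not the repo's exact file
import fitz  # PyMuPDF


def extract_text_from_pdf(file_bytes: bytes) -> str:
    """Extract plain text from an uploaded PDF (passed as raw bytes)."""
    text_parts = []
    # Open the PDF directly from the in-memory bytes Streamlit hands us
    with fitz.open(stream=file_bytes, filetype="pdf") as doc:
        for page in doc:
            text_parts.append(page.get_text())
    return "\n".join(text_parts)
```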
2. Match Against Job Description
We take the pasted JD text, clean and tokenize both the JD and the resume, remove stopwords, and perform set-intersection matching (sketched after this list) to calculate:
Matched keywords
Success rate (percentage match)
Missing terms to improve resume
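A simplified version of that matching logic might look like the following; the stopword list here is a tiny illustrative set (the real app could use NLTK's full list), and the function names are placeholders:

```python
# jd_matcher.py -- simplified sketch of the token-based matching idea
import re

# Tiny illustrative stopword list; a real app could use NLTK's full list
STOPWORDS = {"a", "an", "the", "and", "or", "to", "of", "in", "for", "with", "on"}


def tokenize(text: str) -> set[str]:
    """Lowercase, keep word-like tokens (incl. c++, c#, node.js), drop stopwords."""
    words = re.findall(r"[a-z0-9+#.]+", text.lower())
    cleaned = {w.strip(".") for w in words}  # drop sentence-ending periods
    return {w for w in cleaned if w and w not in STOPWORDS}


def match_resume_to_jd(resume_text: str, jd_text: str) -> dict:
    """Compare resume tokens with JD tokens and compute a match score."""
    resume_tokens = tokenize(resume_text)
    jd_tokens = tokenize(jd_text)
    matched = jd_tokens & resume_tokens   # keywords present in both
    missing = jd_tokens - resume_tokens   # JD keywords absent from the resume
    score = round(100 * len(matched) / len(jd_tokens), 1) if jd_tokens else 0.0
    return {"matched": matched, "missing": missing, "success_rate": score}
```

Because both texts are reduced to sets, the intersection and difference operations stay cheap even for long resumes and job descriptions.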
3. Store Resume in AWS S3
We also upload the file to an AWS S3 bucket to simulate real-world enterprise workflows where resumes are stored for future reference or ML training.
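A rough sketch of that upload step with boto3 is below; the bucket name is a placeholder, and credentials are assumed to come from environment variables or an IAM role rather than being hard-coded:

```python
# S3 upload sketch -- bucket name is hypothetical; boto3 picks up credentials
# from environment variables or an IAM role, never from the source code
import boto3

BUCKET_NAME = "resume-analyzer-uploads"  # placeholder bucket name


def upload_resume_to_s3(file_bytes: bytes, filename: str) -> str:
    """Upload the raw PDF bytes to S3 and return the object key."""
    s3 = boto3.client("s3")
    key = f"resumes/{filename}"
    s3.put_object(Bucket=BUCKET_NAME, Key=key, Body=file_bytes)
    return key
```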
📂 File Structure
```
resume-analyzer/
│
├── app.py            # Main Streamlit app
├── jd_matcher.py     # Resume-JD comparison logic
├── resumeparser.py   # PDF text extractor
├── requirements.txt  # Python dependencies
├── packages.txt      # System packages for deployment
├── setup.sh          # Installs poppler/tesseract (optional)
└── render.yaml       # Render deployment config
```
🧠 Key Concepts Used
Set-based keyword matching
Stopword removal
AWS S3 file upload with boto3
PDF text parsing with PyMuPDF
Dynamic UI with Streamlit (see the wiring sketch after this list)
Render deployment pipeline
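To give an idea of how these pieces fit together, here is a rough wiring sketch of the Streamlit app. It reuses the helper functions sketched above, so the imported names (`extract_text_from_pdf`, `match_resume_to_jd`) are assumptions, not the repo's exact API:

```python
# app.py -- minimal wiring sketch using the helpers sketched earlier
import streamlit as st

from resumeparser import extract_text_from_pdf  # hypothetical helper names
from jd_matcher import match_resume_to_jd

st.title("Resume Analyzer")

uploaded = st.file_uploader("Upload your resume (PDF)", type="pdf")
jd_text = st.text_area("Paste the job description")

# The button only appears once both inputs are provided
if uploaded and jd_text and st.button("Analyze"):
    resume_text = extract_text_from_pdf(uploaded.read())
    result = match_resume_to_jd(resume_text, jd_text)
    st.metric("Success rate", f"{result['success_rate']}%")
    st.write("Missing keywords:", ", ".join(sorted(result["missing"])) or "None")
```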
🧪 Challenges Faced
⚠️ Avoiding system-level dependencies like Tesseract OCR for simplicity
🧹 Cleaning the text for better token matching
🔒 Securely integrating AWS S3 for file uploads
🚀 Ensuring the app works smoothly on cloud (Render)
✅ What I Learned
Building cloud-hosted tools with a simple UI using Streamlit
Working with AWS S3 via Python's boto3
Designing resume matching logic without complex ML models
Deploying full apps with environment configs on Render
🔗 Live Demo
📁 GitHub Repo
🎯 Future Improvements
Add basic ML-based semantic matching
Save and download improved resumes
User authentication
History tracking of uploaded resumes
Final Thoughts
This project is a great example of how simple technologies — used right — can create powerful tools. If you're a fresher or student looking to improve your resume, try it out! And if you're a developer, feel free to fork and build on top of it.
Thanks for reading! 🚀
— Mokka Madan Mohan