Building an LLM-Powered Resume Screener Using Mistral and Gradio (Fully Offline)

Table of contents
- Introduction
- Tech Stack
- Prerequisites
- Setting Up Mistral via Ollama
- Key Features
- Project Flow
- Sample Prompt to LLM (Parsing Resume)
- Sample Prompt to LLM (Matching with JD)
- Example Output
- Why This Project Matters
- Bonus: Gradio UI Snapshot
- What's Next?
- GitHub
- Final Thoughts

Introduction
In today's hiring landscape, HR teams often rely on Applicant Tracking Systems (ATS) to quickly filter resumes against job descriptions. While these systems work, they often fall short in understanding nuanced skill sets, context, and potential. Enter Large Language Models (LLMs). In this blog, I'll walk you through how I built a smart resume screening app that mimics ATS behavior but leverages the power of an open-source LLM (Mistral), all running locally on your machine: no OpenAI keys, no API costs, no data leakage.
Tech Stack
- LLM: Mistral (7B model via Ollama)
- Interface: Gradio
- PDF Parsing: PyMuPDF (fitz)
- Language: Python
- Execution: Fully offline
Prerequisites
- Python 3.8+
- Basic knowledge of how LLMs and APIs work (no prior ML experience needed!)
- 8GB+ RAM recommended (Ollama suggests at least 8GB for 7B models)
- OS: Windows, Linux, or macOS
Setting Up Mistral via Ollama
To run the Mistral model locally for LLM inference, follow these steps:
1. Install Ollama
Go to https://ollama.com/download and download the installer for your OS (Windows, macOS, or Linux).
2. Pull the Mistral model
```
ollama pull mistral
```
3. Run the Mistral model locally
```
ollama run mistral
```
This starts a local server at http://localhost:11434, which your app can query.
Note: You can replace mistral with other models supported by Ollama, such as phi or gemma.
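Once the server is running, you can sanity-check it from Python before wiring up the app. Here is a minimal sketch using the requests library against Ollama's /api/generate endpoint (the model name assumes you pulled mistral):

```python
# Quick sanity check that the local Ollama server is reachable.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "mistral", "prompt": "Reply with one word: ready?", "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])  # the model's full (non-streamed) reply
```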
Key Features
- Upload any resume in PDF format
- Extracts structured details using the LLM: name, experience, skills, work history, education
- Compares the resume with a job description and gives:
  - Match Score (0–10)
  - Strengths
  - Weaknesses
  - Suggested Interview Questions
- Displays results in a beautiful, responsive Gradio interface
- Shows processing time (performance benchmark)
Project Flow
1. Resume Upload
- The user uploads a PDF resume
- fitz (PyMuPDF) extracts the raw text from the PDF
2. Resume Parsing
- A prompt asks the LLM to extract structured JSON data from the resume
- The prompt is sent to the locally running Mistral model via Ollama (localhost)
3. Job Description Matching
- The extracted resume JSON and the pasted job description are passed to a second LLM prompt
- The model analyzes the match, assigns a score, and lists strengths, weaknesses, and interview questions
4. Result Display
- Everything is rendered in the Gradio app with markdown formatting
- A timer shows how long the LLM took to process everything
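Here is a minimal sketch of the extraction and inference plumbing behind steps 1–3. The helper names (extract_text, ask_mistral) are mine for illustration, not the exact ones from the repo:

```python
# Minimal plumbing for the flow above; names are illustrative.
import fitz  # PyMuPDF
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"

def extract_text(pdf_path: str) -> str:
    """Step 1: pull the raw text out of every page of the uploaded PDF."""
    with fitz.open(pdf_path) as doc:
        return "\n".join(page.get_text() for page in doc)

def ask_mistral(prompt: str) -> str:
    """Steps 2-3: send a single prompt to the local Mistral model."""
    resp = requests.post(
        OLLAMA_URL,
        json={"model": "mistral", "prompt": prompt, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]
```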
Sample Prompt to LLM (Parsing Resume)
```
You are an expert resume parser. Extract the following fields:
- Name
- Total years of professional experience
- Skills
- Work History
- Education
Return in JSON format.
```
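In practice, the extracted resume text gets appended to this prompt before it goes to the model. A rough sketch, reusing the hypothetical ask_mistral helper from the flow sketch above:

```python
# Illustrative only: wrap the parsing prompt around the resume text.
import json

PARSE_PROMPT = """You are an expert resume parser. Extract the following fields:
- Name
- Total years of professional experience
- Skills
- Work History
- Education
Return in JSON format.

Resume:
{resume_text}"""

resume_text = extract_text("resume.pdf")  # helper from the flow sketch
raw = ask_mistral(PARSE_PROMPT.format(resume_text=resume_text))
parsed = json.loads(raw)  # may need cleanup if the model adds extra prose
```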
Sample Prompt to LLM (Matching with JD)
```
You are an expert HR analyst. Given the candidate's resume (structured JSON) and a job description, analyze:
1. Fit Score (0–10)
2. Strengths
3. Weaknesses
4. Interview Questions
```
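The second call pairs the structured JSON from the first step with the job description pasted by the user. Again a hedged sketch with illustrative names, building on the helpers above:

```python
# Illustrative only: combine resume JSON and JD in the matching prompt.
import json

MATCH_PROMPT = """You are an expert HR analyst. Given the candidate's resume \
(structured JSON) and a job description, analyze:
1. Fit Score (0-10)
2. Strengths
3. Weaknesses
4. Interview Questions

Resume JSON:
{resume_json}

Job Description:
{jd}"""

job_description = "..."  # pasted by the user in the UI
evaluation = ask_mistral(
    MATCH_PROMPT.format(resume_json=json.dumps(parsed, indent=2), jd=job_description)
)
```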
Example Output
Parsed Resume:
```json
{
  "name": "James Bond",
  "experience_years": 1,
  "skills": ["Python", "SQL", "Machine Learning", "HTML", "CSS", "Flask"]
}
```
Evaluation:
- Fit Score: 6/10
- Strengths: Strong programming base, fast learner, good visual tools
- Weaknesses: No production-level experience, limited exposure to CI/CD
- Interview Questions: Describe a project using SQL; How would you build a scalable web app?
Why This Project Matters
- Replaces expensive black-box resume filters
- A practical showcase of local LLM inference
- Emphasizes data privacy
- Easy for recruiters or HR tech tools to adopt
Bonus: Gradio UI Snapshot
The Gradio interface includes:
- File upload
- Textbox for the job description
- Real-time results (markdown + timer)
- Mobile-responsive layout
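For reference, here is a stripped-down Blocks layout that matches this description; screen_resume is a placeholder standing in for the pipeline sketched earlier:

```python
# Rough Gradio layout matching the description above; illustrative only.
import time
import gradio as gr

def screen_resume(pdf_file, job_description):
    start = time.time()
    # ... run extraction, parsing, and matching here ...
    result = "**Fit Score:** 6/10"  # placeholder output
    return f"{result}\n\n_Processed in {time.time() - start:.1f}s_"

with gr.Blocks() as demo:
    pdf = gr.File(label="Upload Resume (PDF)", file_types=[".pdf"])
    jd = gr.Textbox(label="Job Description", lines=8)
    out = gr.Markdown()
    gr.Button("Screen Resume").click(screen_resume, inputs=[pdf, jd], outputs=out)

demo.launch()
```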
What's Next?
- Add bulk resume upload (batch evaluation)
- Export results to CSV/JSON
- Add support for multiple LLMs via config
- Integrate with ATS APIs or job boards
GitHub
Final Thoughts
This project was a deep dive into how powerful open-source LLMs can be when used smartly. With just Python, a local model, and Gradio, you can build ATS-level functionality that respects privacy, costs nothing, and gives users total control.
Feel free to fork the repo, suggest features, or adapt it to your own use case!