When Gemini Ghosted Me: The Chaos & Jugaad Behind Building Quiztelify

Karan Dugar
7 min read

Or how I hacked my way to reliable quiz parsing with two Gemini models and one stubborn brain.🤖

So I was working on this idea, simple in theory: a website that helps students practice NPTEL weekly quizzes. I called it Quiztelify, and it lets users either pick from a list of pre-uploaded courses or upload their own NPTEL quiz PDF. The goal? Parse the PDF, extract all MCQs week-by-week, and present them in a clean, test-like interface for practice.

To handle the parsing, I decided early on to use the Gemini 2.0 Flash API, mainly because it consistently gave me the best results during my manual prompt testing compared to other AI models. It’s optimized for speed and does a solid job of transforming unstructured inputs, like quiz-heavy PDFs, into structured formats like JSON. But as with all LLMs, the real magic lies in the prompt, and that’s where ChatGPT came in. Think of it as my prompt whisperer: I kept bouncing ideas off it, tweaking wording and restructuring instructions until we landed on the Goldilocks prompt, just right. By crafting a precise prompt that specified the extraction schema (weeks, questions, four options, and bolded correct answers), I could guide Gemini into returning clean, usable JSON with minimal manual intervention. Boom: one reliable pipeline, raw PDF in, structured quiz data out.
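
In code, that single-call dream looks something like this. A minimal sketch using Google’s @google/generative-ai SDK in TypeScript; the helper name and wiring here are my illustration, not necessarily how the actual repo does it:

  import { GoogleGenerativeAI } from "@google/generative-ai";

  const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);

  type Quiz = { question: string; options: string[]; answer: string };

  // Send the PDF plus the extraction prompt, get back a week -> questions map.
  async function extractQuizzes(
    pdfBase64: string,
    prompt: string,
    modelName = "gemini-2.0-flash",
  ): Promise<Record<string, Quiz[]>> {
    const model = genAI.getGenerativeModel({ model: modelName });
    const result = await model.generateContent([
      { inlineData: { data: pdfBase64, mimeType: "application/pdf" } }, // the uploaded quiz PDF
      { text: prompt },                                                 // the schema-specifying prompt
    ]);
    // Gemini often wraps its JSON in ```json fences; strip them before parsing.
    const raw = result.response.text().replace(/```json\n?|```/g, "").trim();
    return JSON.parse(raw);
  }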

Except… boom never happened.

When I parsed a full NPTEL PDF with Gemini 2.0 Flash, I noticed something odd. The response would cut off: no hard error, it would just kind of fizzle out. The JSON output would include the first few weeks and their questions, and then… it’d just stop. Turns out, Gemini 2.0 Flash caps how much text it can send back in a single response: the maximum output token limit (8,192 tokens for 2.0 Flash) is separate from, and far smaller than, its huge input context window. So, long PDF in + long JSON out = truncated response.
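
If you want to catch that failure instead of silently serving half a quiz, the tell is simple: a cut-off response almost always dies mid-object, so JSON.parse throws. A tiny guard (again a sketch of mine, layered on the helper above):

  // A truncated response usually ends mid-object, so JSON.parse throws.
  // The API also reports a finishReason of "MAX_TOKENS" on the response
  // candidates when output was capped, which you can check as a second signal.
  function parseOrFlagTruncation(raw: string) {
    try {
      return { ok: true as const, data: JSON.parse(raw) };
    } catch {
      return { ok: false as const, error: "Response looks truncated or malformed" };
    }
  }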

Jugaad Attempt #1: Split the Prompt in Half

Of course, my first instinct was the classic developer fix: split the job in two. I created two separate prompts, one for Weeks 1–6 and one for Weeks 7–12.

First Half Prompt:

You are given a course quiz PDF document that contains multiple-choice quiz questions organized week-wise. Each section starts with a heading such as “Week 1”, “Week 2”, and so on.

  Your task is to:
  - Detect the total number of weeks present in the document.
  - Extract only the **first half** of the weeks (e.g., if there are 12 weeks, extract Week 1 to Week 6).
  - Under each detected week, extract all quiz questions that meet the following criteria:

  Each question must have:
  - A question string
  - Exactly **4 options**
  - One option clearly marked as the correct answer using **bold formatting** in the PDF

  Return your output strictly in the following JSON format:

  {
    "Week 1": [
      {
        "question": "Example question?",
        "options": ["Option A", "Option B", "Option C", "Option D"],
        "answer": "Option B"
      }
    ],
    "Week 2": [...],
    ...
  }

  Instructions:
  - Only include valid weeks (do not fabricate or assume week labels).
  - Only include questions with a clearly bolded correct answer and exactly four options.
  - The "answer" value must exactly match the bolded option.
  - Do not include any explanation, extra text, or notes outside of the JSON.
  - Only return valid JSON as described above.
  - If you feel the PDF content is not correct, return a response with the message "Please upload a valid document".

Second Half Prompt:

Same as the first, but: extract only the **second half** of the weeks (e.g., if there are 12 weeks, extract Week 7 to Week 12).
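
Stitching the halves back together on the backend is the easy part, since each call returns its own “Week N” map. Roughly, reusing the hypothetical extractQuizzes sketch from earlier:

  // FIRST_HALF_PROMPT / SECOND_HALF_PROMPT hold the two prompt texts above.
  // Each half returns its own { "Week N": [...] } map; merging is a spread.
  // If both halves ever returned the same week, the second call would win.
  const firstHalf = await extractQuizzes(pdfBase64, FIRST_HALF_PROMPT);
  const secondHalf = await extractQuizzes(pdfBase64, SECOND_HALF_PROMPT);
  const fullQuiz = { ...firstHalf, ...secondHalf };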

I figured this would reduce the output size and stay within the response limit. Makes sense, right?

Wrong.

The first prompt worked beautifully. Gemini 2.0 Flash returned full, correct JSON with all questions intact. But the second prompt? It gave me all the correct week headers (“Week 7”, “Week 8”, and so on), but most of the questions were missing. Sometimes I’d get two questions. Sometimes five. Sometimes none. It was like Gemini was feeling lazy after lunch.

I thought it was just a weird fluke. Tried again. And again. But every time, the same issue: weeks without questions. No errors, just… silent failure. Lovely.

And that’s when I reached that classic developer moment. I leaned back and said:

“Okay, you know what? At least this two-prompt thing is almost working.
I’ll park this for now. Let me finish the other features.
This… is a problem for *Future Me*.”

Every dev knows this moment. You get tired of debugging ghosts and decide to work on the UI instead. That’s what I did. I moved on to building out the rest of Quiztelify — the quiz view, the results modal, the course cards, everything.

But of course, all good things must come to an end, including my illusion that I could ignore this bug forever. Once I wrapped up the rest of the site, I was left staring at the one unfinished piece. The thing I had shoved aside like dirty laundry. Yep, the “I’ll fix this later” moment had arrived. Future Me was now Present Me… and Past Me was a jerk.

Desperate Times Call GPT

So I came crawling back to ChatGPT, hoping for a miracle. And to its credit (hi again 👋), it delivered, suggesting a more structured approach:

  • First prompt: Ask Gemini to detect all the weeks in the document

  • Second prompt: Explicitly request only Week 1–6

  • Third prompt: Explicitly request only Week 7–12

This was like a surgical prompt strategy — split, isolate, extract.
GPT basically said: “Treat Gemini like a forgetful intern. Be specific. Don’t assume.”

Technically, this made sense. It would’ve solved the issue cleanly.
But it came at a cost: speed. This meant three API calls per PDF upload, not exactly snappy when a user just wants to revise for a test. The idea of users staring at a loading spinner while my backend lovingly coaxed out every week’s questions felt… bad.
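
To make the latency cost concrete, here’s roughly what that three-call flow looks like. The helpers (detectWeeks, BASE_RULES) are hypothetical names of mine, and note the calls have to run in order, since the ranges depend on what call one finds:

  // BASE_RULES holds the shared JSON-format instructions from the prompt above;
  // detectWeeks is a hypothetical wrapper sending a "list the week headings" prompt.
  const rangePrompt = (weeks: string[]) =>
    `Extract only these weeks, exactly as named: ${weeks.join(", ")}.\n` + BASE_RULES;

  // Three sequential round-trips; total latency is the sum of all three.
  async function extractInThreeCalls(pdfBase64: string) {
    // Call 1: ask Gemini only for the week headings, e.g. ["Week 1", ..., "Week 12"].
    const weekList: string[] = await detectWeeks(pdfBase64);
    const mid = Math.ceil(weekList.length / 2);
    // Calls 2 and 3: explicitly request each half by name, then merge.
    const firstHalf = await extractQuizzes(pdfBase64, rangePrompt(weekList.slice(0, mid)));
    const secondHalf = await extractQuizzes(pdfBase64, rangePrompt(weekList.slice(mid)));
    return { ...firstHalf, ...secondHalf };
  }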

So, again, I parked it. The dream of a single, reliable Gemini call still glimmered in the distance.

Gemini 2.5 Flash to the Rescue (Kind Of)

After a bit more digging, I stumbled upon Gemini 2.5 Flash, and it felt like hope. Unlike 2.0, it didn’t suffer from the same output-cutoff issue. When I ran my prompt through 2.5, it actually returned all the weeks and all the questions in one go. No ghosted MCQs, no mysteriously empty arrays. Just clean, full JSON, like I’d always dreamed of.

Of course, life being life — Gemini 2.5 Flash was sloooooow.
If Gemini 2.0 was the overconfident intern who sprints through tasks but forgets half the deliverables, 2.5 was the old-school guy who double-checks everything but takes three coffee breaks while doing it.

That’s when it hit me: a totally unhinged idea that somehow made total sense:

“What if I use both?”

Jugaad Attempt #2: The Frankenstein Fix

So I built a hybrid pipeline:

  • Use Gemini 2.0 Flash to process the first half (Week 1–6).
    It’s fast, and the JSON is usually accurate here. (Why it worked in the first half but failed in the second is still a mystery to me).

  • Use Gemini 2.5 Flash to handle Week 7–12, where the data tends to get long and 2.0 starts dropping things.
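
Wired up, the fix is just two models on two halves. And since the two calls are independent, they can run concurrently, so the wall-clock cost is the slower call rather than the sum. A sketch, reusing the earlier hypothetical helper with its model parameter:

  // Fast 2.0 takes Weeks 1-6; thorough 2.5 takes Weeks 7-12, in parallel.
  // Total wait is max(callA, callB), not their sum.
  async function hybridExtract(pdfBase64: string) {
    const [firstHalf, secondHalf] = await Promise.all([
      extractQuizzes(pdfBase64, FIRST_HALF_PROMPT, "gemini-2.0-flash"),
      extractQuizzes(pdfBase64, SECOND_HALF_PROMPT, "gemini-2.5-flash"),
    ]);
    return { ...firstHalf, ...secondHalf };
  }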

And finallyyy, this gave me the best of both worlds. I had a working system: the PDFs parsed beautifully, the JSON was consistent, and the quiz data appeared instantly in the UI. I almost cried.

Quiztelify was built in just two days, but those two days were filled with pure chaos, caffeinated debugging, stubborn AI models, and the kind of duct-tape fixes that only make sense at 2 AM.

What started as a “simple NPTEL quiz parser” turned into a wild ride of prompt engineering, model limitations, and plenty of jugaad. But hey, that’s the fun part, right?

I wanted to share this story not just because it was ridiculous (and kind of hilarious in hindsight), but because building with AI is still very much uncharted territory. There were weird bugs, partial responses, random hallucinations, and moments where I just sat back and thought, “Yeah okay, future me can deal with this.” Honestly, I still don’t fully understand why some things happened the way they did, but I’d love to dig deeper and figure it out someday.

Quiztelify is now live at quiztelify.karnx.dev, and I’m genuinely proud of how it turned out. If you’re prepping for NPTEL, definitely check it out — you can upload your own PDFs or pick from a list of pre-uploaded courses and get instant quiz practice, week by week. The full source code is open-source and available on GitHub: github.com/karannfr/nptel-quiz-app.

And if you’re just here for the chaos story? I hope you smiled, cringed, or at least related to the “how is this my life right now?” moments.

Because if there’s one thing I’ve learned in 48 hours of pure build mode, it’s this:

Two AI models are better than one.
But one stubborn dev is better than both. 💪
