Building a Smart Clinic Chatbot for WhatsApp in Kenya: A Journey with Google Gemini API, Node.js, and Twilio.

I. Introduction
In Kenya, navigating health care can sometimes be a real challenge. Patients face exhausting phone calls to make appointments or ask about services, and long queues just to ask a simple question or get a consultation. Imagine having a seamless, automated solution right on your phone, a tool that reaches millions of people where they are most active: WhatsApp. That's precisely the problem I set out to solve by building a WhatsApp chatbot specifically designed for clinics here in Kenya.
In this post, I’ll take you through my journey of creating an AI-powered clinic chatbot using Google Gemini API, Node.js, Twilio, and MongoDB. I'll show the core features, the tech stack, and the exciting challenges I faced and, more importantly, overcame, while building this solution for our local context. This project has been an invaluable learning experience, deepening my understanding of API integration, conversational state management, and the practical application of AI in real-world scenarios.
II. The Foundation of My Chatbot
Choosing the right tools was crucial for this project, especially ensuring they could handle the demands of a growing digital population like Kenya's. Here’s a breakdown of the technologies I used and why they were the perfect fit:
Google Gemini API: The brain of the chatbot. I utilized Google's powerful Gemini 1.5 Flash model for understanding natural language queries, generating human-like responses for general medical FAQs, and enhancing the overall conversational experience. Its advanced NLP capabilities are key to making the bot truly smart.
Node.js & Express.js: An event-driven JavaScript runtime ideal for building efficient backend APIs. Express.js provided the framework for handling incoming Twilio webhooks and managing application routes, crucial for a responsive chatbot.
MongoDB & Mongoose: A flexible database that is well suited to storing conversational sessions and structured appointment details. Mongoose, an ODM (Object Data Modeling) library, makes interacting with MongoDB straightforward and object-oriented, which speeds up development.
Twilio WhatsApp API: This connects our bot directly to WhatsApp. In a country like Kenya, where WhatsApp penetration is incredibly high, using Twilio was an easy choice. It reliably handles incoming user messages via webhooks and allows us to send outbound responses, forming the backbone of the WhatsApp integration.
date-fns: A JavaScript date utility library. This was integrated to provide date parsing, allowing users to schedule appointments using natural language like "tomorrow", "next Wednesday," or "June 3rd," significantly improving the user experience over rigid date formats.
III. How it Works
My chatbot offers two primary functionalities, designed to bring efficiency to clinic operations and ease to patients: AI-powered general inquiries and an appointment booking system.
1. AI-Powered General Inquiries
The chatbot can answer a wide range of general medical questions, leveraging the intelligence of Google Gemini. This is crucial for providing quick, reliable information without human intervention.
How it works: When a user sends a message, the bot first checks if they are in a specific booking flow. If not, the message, along with the user's previous conversation history stored in our MongoDB ConversationSession model, is sent to the Gemini API. Gemini processes this context and generates a relevant response.
Maintaining Context: The ConversationSession model is central here. It stores an array of messages, each with a role (user for the patient's messages, model for the bot's replies, following Gemini's chat format) and parts (containing the text). This allows Gemini to understand the flow of conversation and provide contextually aware replies, even in multi-turn conversations about symptoms or conditions.
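A minimal sketch of how such a history could be kept, assuming each stored message uses the { role, parts: [{ text }] } shape. The in-memory Map stands in for the MongoDB ConversationSession collection, and getSession, appendMessage, and MAX_HISTORY are hypothetical names used for illustration, not taken from the project's code.

```javascript
// In-memory stand-in for the ConversationSession collection (assumption:
// the real project persists this in MongoDB through Mongoose).
const sessions = new Map();

const MAX_HISTORY = 20; // hypothetical cap so prompts stay small

// Fetch the session for a WhatsApp number, creating it on first contact.
function getSession(whatsappNumber) {
  if (!sessions.has(whatsappNumber)) {
    sessions.set(whatsappNumber, { whatsappNumber, messages: [] });
  }
  return sessions.get(whatsappNumber);
}

// Append one turn. role is "user" for the patient and "model" for the bot.
function appendMessage(session, role, text) {
  session.messages.push({ role, parts: [{ text }] });
  // Drop the oldest turns once the cap is exceeded.
  if (session.messages.length > MAX_HISTORY) {
    session.messages.splice(0, session.messages.length - MAX_HISTORY);
  }
  // Keep the history starting on a user turn (assumption: the chat API
  // expects the first entry to come from the user).
  while (session.messages.length > 0 && session.messages[0].role !== "user") {
    session.messages.shift();
  }
  return session.messages;
}

const session = getSession("whatsapp:+254700000000"); // placeholder number
appendMessage(session, "user", "What are the common symptoms of malaria?");
appendMessage(session, "model", "Common symptoms include fever and chills.");
console.log(JSON.stringify(session.messages, null, 2));
```

The array in session.messages is what would be passed along as history with the next user message, so the model sees earlier turns as well as the latest one.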
Example Conversation (AI Inquiry):
The screenshot above shows the question "What are the common symptoms of influenza?" and the bot's detailed reply.
The screenshot above shows the question "What are the common symptoms of malaria?" and the bot's detailed reply.
2. Multi-Turn Appointment Booking
This is the most interactive feature, meticulously guiding the user step-by-step through the appointment scheduling process, simulating a human conversation.
State Management: The core of this feature is the bookingState field in the ConversationSession model. This field tracks exactly where the user is in the booking process (e.g., awaiting_service, awaiting_patient_name, awaiting_date, awaiting_time, awaiting_confirmation). This ensures the conversation flows logically.
Progressive Data Collection: As the user provides information (service, name, date, time), it's temporarily stored in the currentAppointmentData field within their ConversationSession until all necessary details are gathered and confirmed.
Confirmation & Saving: Once all details are collected, the bot asks for a final confirmation. Upon a "yes" response, a new Appointment document is created and saved to MongoDB, formalizing the booking.
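The three points above can be sketched as a small state machine. This is an illustration under stated assumptions, not the project's actual handleBookingFlow: the STEPS table, the field names inside currentAppointmentData, the cancellation message, and the saveAppointment callback (standing in for saving an Appointment document to MongoDB) are all hypothetical. Date validation is left out here to keep the flow visible.

```javascript
// Each state names the field it collects, the state that follows, and the next prompt.
const STEPS = {
  awaiting_service: {
    field: "service",
    next: "awaiting_patient_name",
    prompt: (d) => `Got it. You want a ${d.service}. What is the patient's full name?`,
  },
  awaiting_patient_name: {
    field: "patientName",
    next: "awaiting_date",
    prompt: () => "When would you like to book the appointment?",
  },
  awaiting_date: {
    field: "date",
    next: "awaiting_time",
    prompt: (d) => `Okay, ${d.date}. What time would you prefer?`,
  },
  awaiting_time: {
    field: "time",
    next: "awaiting_confirmation",
    prompt: (d) =>
      `Please confirm: ${d.service} for ${d.patientName} on ${d.date} at ${d.time}. Reply yes or no.`,
  },
};

// Returns the bot's reply, or null when the message is not part of a booking.
function handleBookingFlow(session, text, saveAppointment) {
  const input = text.trim();

  // No booking in progress: start one only if the user asks for it.
  if (!session.bookingState) {
    if (!/book appointment/i.test(input)) return null;
    session.bookingState = "awaiting_service";
    session.currentAppointmentData = {};
    return "Okay, let's book an appointment. What type of service are you looking for?";
  }

  // Final step: save on "yes", discard on anything else, then reset the state.
  if (session.bookingState === "awaiting_confirmation") {
    const data = session.currentAppointmentData;
    const confirmed = /^y(es)?$/i.test(input);
    session.bookingState = null;
    session.currentAppointmentData = {};
    if (!confirmed) return "Booking cancelled.";
    saveAppointment(data);
    return `Appointment for ${data.patientName} has been booked.`;
  }

  // Intermediate steps: store the answer, advance, and ask the next question.
  const step = STEPS[session.bookingState];
  session.currentAppointmentData[step.field] = input;
  session.bookingState = step.next;
  return step.prompt(session.currentAppointmentData);
}

const saved = [];
const session = { bookingState: null, currentAppointmentData: {} };
const userMessages = ["book appointment", "Dental Check-up", "Duncan Maina", "2025-06-15", "10:00 AM", "yes"];
for (const msg of userMessages) {
  console.log("User:", msg);
  console.log("Bot:", handleBookingFlow(session, msg, (a) => saved.push(a)));
}
```

Because the state lives on the session rather than in server memory, each incoming webhook can pick the conversation up exactly where the previous message left it.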
The Booking Flow in Action:
User: "book appointment"
Bot: "Okay, let's book an appointment. What type of service are you looking for?"
User: "Dental Check-up"
Bot: "Got it. You want a Dental Check-up. What is the patient's full name?"
User: "Duncan Maina"
The screenshot above shows this flow.
3. Date & Time Parsing (Powered by date-fns)
A crucial addition to the booking flow was making date input intuitive and forgiving. Instead of forcing users into specific, rigid formats, the bot now understands common date phrases, a vital feature for handling diverse user input.
The Challenge: Initially, the bot only supported strict date formats like YYYY-MM-DD, which is not very user-friendly, especially for users who prefer natural language.
The Solution: I integrated the date-fns library. In the awaiting_date state, the handleBookingFlow function now intelligently uses parse, addDays, nextMonday, isValid, and format functions from date-fns to interpret various inputs.
How it works:
It first checks for keywords like "today", "tomorrow", "next Monday", "next Wednesday", etc.
If no keyword is matched, it robustly attempts to parse the input against common date formats (e.g., YYYY-MM-DD, MM/DD/YYYY, DD-MM-YYYY, MM/DD/YY).
A critical validation step ensures the parsed date is valid and not in the past, guiding the user if the input is unclear.
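The three steps above can be sketched as follows. To keep the example runnable with no dependencies, it uses plain Date arithmetic and regular expressions in place of date-fns, so it is a stand-in for the approach rather than the project's code. parseBookingDate and its helpers are hypothetical names, only two of the listed formats are covered, and "next <weekday>" is taken to mean the first such weekday strictly after today.

```javascript
const WEEKDAYS = ["sunday", "monday", "tuesday", "wednesday", "thursday", "friday", "saturday"];

// Local midnight of the given date, so comparisons ignore the time of day.
function startOfDay(date) {
  return new Date(date.getFullYear(), date.getMonth(), date.getDate());
}

function addDays(date, days) {
  return new Date(date.getFullYear(), date.getMonth(), date.getDate() + days);
}

// Build a date from parts and reject overflow such as 31 February.
function fromParts(year, month, day) {
  const d = new Date(year, month - 1, day);
  const ok = d.getFullYear() === year && d.getMonth() === month - 1 && d.getDate() === day;
  return ok ? d : null;
}

// Returns a Date at local midnight, or null if the input is unclear or in the past.
function parseBookingDate(input, now = new Date()) {
  const text = input.trim().toLowerCase();
  const today = startOfDay(now);
  let result = null;

  // Step 1: keywords.
  const nextDay = text.match(/^next (\w+)$/);
  if (text === "today") {
    result = today;
  } else if (text === "tomorrow") {
    result = addDays(today, 1);
  } else if (nextDay && WEEKDAYS.includes(nextDay[1])) {
    // 1 to 7 days ahead, never today itself.
    const diff = (WEEKDAYS.indexOf(nextDay[1]) - today.getDay() + 7) % 7 || 7;
    result = addDays(today, diff);
  } else {
    // Step 2: fixed formats.
    const iso = text.match(/^(\d{4})-(\d{1,2})-(\d{1,2})$/); // YYYY-MM-DD
    const us = text.match(/^(\d{1,2})\/(\d{1,2})\/(\d{4})$/); // MM/DD/YYYY
    if (iso) result = fromParts(+iso[1], +iso[2], +iso[3]);
    else if (us) result = fromParts(+us[3], +us[1], +us[2]);
  }

  // Step 3: reject unparsed input and dates before today.
  if (!result || result < today) return null;
  return result;
}

const now = new Date(2025, 4, 27, 9, 30); // Tuesday 27 May 2025, fixed for the demo
console.log(parseBookingDate("Next Wednesday", now).toDateString()); // Wed May 28 2025
console.log(parseBookingDate("2025-06-15", now).toDateString()); // Sun Jun 15 2025
console.log(parseBookingDate("2020-01-01", now)); // null (in the past)
console.log(parseBookingDate("sometime soon", now)); // null (not understood)
```

A null result is the signal for the bot to stay in the awaiting_date state and ask again with an example of an accepted format.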
Example (Natural Language Date):
Bot: "When would you like to book the appointment? Please provide a date (e.g., 2025-06-15, today, tomorrow, next Monday)."
User: "Next Wednesday"
Bot: "Okay, 28 May 2025. What time would you prefer? (e.g., 10:00 AM, 2:30 PM)"
The screenshot above shows "Next Wednesday" being parsed to "28 May 2025".
Confirmation and Completion:
User: "10:00 AM"
Bot: "Please confirm your appointment details..."
User: "Yes"
Bot: "Appointment for Duncan Maina for General Consultation on Tue, Jul 15, 2025 at 10:00 AM has been successfully booked! Your appointment ID is: 68356de3b127a773a7a2bace."
The screenshot above shows the confirmation prompt and the "yes" reply.
IV. Challenges Faced, Solutions Implemented, and Lessons Learnt
Building this chatbot presented its share of hurdles, particularly when it came to API integrations and maintaining a seamless user experience. Overcoming these challenges was a crucial part of my learning process.
Maintaining Conversation Context:
Challenge: For any useful chatbot, remembering past messages is key. How do you ensure the bot understands the context in a step-by-step process like booking an appointment via WhatsApp?
Solution: I implemented the ConversationSession model in MongoDB. Each user's WhatsApp number is linked to a unique session where their entire message history is stored. This enabled the bot to comprehend the ongoing conversation and convey relevant context to the Gemini API, which was crucial for providing accurate replies.
Twilio Webhooks & Asynchronous Responses:
Challenge: Twilio's webhook expects an immediate HTTP 200 OK response (within 15 seconds) to confirm message receipt. However, operations like querying MongoDB or calling the Gemini API take time. If the response isn't immediate, Twilio might retry the webhook, leading to duplicate messages to the user.
Solution: I configured the handleIncomingWhatsApp controller to send an immediate empty MessagingResponse back to Twilio. The actual AI processing and sending of the chatbot's reply then happen in a separate background function, processAIResponse, which uses twilioClient.messages.create to send an outbound message to the user. This meets Twilio's timing requirement, and the user still receives the reply as soon as it is ready.
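A minimal sketch of this acknowledge-first pattern follows. The handler has the (req, res) shape Express expects, but generateReply and sendWhatsApp are hypothetical injected functions standing in for the Gemini call and twilioClient.messages.create, and the empty TwiML is written as a literal string, so the example runs without either SDK. The fake request and response objects exist only for the demo.

```javascript
// Builds an Express-style handler from two injected dependencies.
function makeWebhookHandler({ generateReply, sendWhatsApp }) {
  // Slow path: runs after Twilio has already been answered.
  async function processAIResponse(from, body) {
    const reply = await generateReply(from, body); // database lookups + AI call
    await sendWhatsApp(from, reply); // outbound message to the user
  }

  return function handleIncomingWhatsApp(req, res) {
    // Twilio posts the sender and message text as From and Body.
    const { From: from, Body: body } = req.body;

    // 1. Acknowledge right away with empty TwiML so the webhook does not time out.
    res.type("text/xml").send("<Response></Response>");

    // 2. Do the slow work afterwards. Errors are logged, since the HTTP
    //    response has already been sent and cannot carry them.
    return processAIResponse(from, body).catch((err) => {
      console.error("Background processing failed:", err.message);
    });
  };
}

// Demo with fake request/response objects instead of a live server.
const events = [];
const handler = makeWebhookHandler({
  generateReply: async (from, body) => `You said: ${body}`,
  sendWhatsApp: async (to, text) => events.push(`sent to ${to}: ${text}`),
});
const fakeRes = {
  type() { return this; },
  send(xml) { events.push(`ack: ${xml}`); return this; },
};
const done = handler({ body: { From: "whatsapp:+254700000000", Body: "hello" } }, fakeRes);
done.then(() => console.log(events));
```

The events array shows the ordering that matters: the acknowledgement is recorded before the outbound message is sent.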
Twilio Daily Message Limits / Sandbox Disconnections:
Challenge: During intensive testing, I frequently encountered RestException [Error]: Account exceeded the 9 daily messages limit, or found my WhatsApp number disconnected from the Twilio Sandbox after the 72-hour expiry.
Solution: Learning to recognize these specific Twilio errors was key. For the message limit, it meant patiently waiting for the reset or considering an account upgrade for more testing. For sandbox disconnections, the fix was simple: just send join <sandbox name> from my WhatsApp number to the Twilio sandbox number again. This is a common aspect of developing with trial accounts.
MongoDB Connection Issues (IP Whitelisting):
Challenge: Initially, my Node.js server struggled to connect to the MongoDB Atlas cluster, throwing an error about "IP not whitelisted". This was a head-scratcher at first!
Solution: This was a security feature on MongoDB Atlas. I learned to identify my public IP address and then add it to the IP Whitelist in the MongoDB Atlas Network Access settings. This crucial step allowed my server to establish a secure connection to the database.
Intermittent Gemini API Errors:
Challenge: While Gemini generally worked very well, at times the bot would fall back to its generic "I'm sorry, I encountered an error trying to process your request with the AI. Please try again later" message.
Solution: Through observation and checking the nodemon logs for specific error codes (like RESOURCE_EXHAUSTED), I determined these were likely temporary API rate limits or brief service outages on Google's end, as the functionality would return after a short period. This experience highlighted the importance of robust error handling and providing clear feedback to the user when external services are unavailable.
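One possible way to turn that lesson into code is to classify the failure before replying to the user. This is a sketch rather than the project's implementation: classifyGeminiError and backoffDelayMs are hypothetical helpers, and it assumes that rate-limit errors expose an HTTP status of 429 or mention RESOURCE_EXHAUSTED in their message, and that outages surface as status 500 or 503. The actual error shape should be checked against the logs.

```javascript
const FALLBACK_MESSAGE =
  "I'm sorry, I encountered an error trying to process your request with the AI. Please try again later.";

// Decide whether an error is worth retrying and what to tell the user.
function classifyGeminiError(err) {
  const status = err ? err.status : undefined;
  const message = String((err && err.message) || "");

  // Assumption: quota and rate-limit failures look like this.
  if (status === 429 || message.includes("RESOURCE_EXHAUSTED")) {
    return {
      retryable: true,
      userMessage: "The assistant is handling a lot of requests right now. Please try again in a minute.",
    };
  }
  // Assumption: brief outages on the provider's side look like this.
  if (status === 500 || status === 503) {
    return {
      retryable: true,
      userMessage: "The AI service is temporarily unavailable. Please try again shortly.",
    };
  }
  return { retryable: false, userMessage: FALLBACK_MESSAGE };
}

// Delay before retry number `attempt` (0-based): 1s, 2s, 4s, ... capped at 30s.
function backoffDelayMs(attempt, baseMs = 1000, capMs = 30000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

console.log(classifyGeminiError(new Error("429 RESOURCE_EXHAUSTED: quota exceeded")));
console.log(classifyGeminiError(new Error("Unexpected token in JSON")));
console.log([0, 1, 2, 3].map((n) => backoffDelayMs(n)));
```

Separating retryable from non-retryable errors lets the bot retry quietly in the background for short outages and only show the generic fallback when nothing else applies.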
V. Future Enhancements: Looking Ahead
This project is a strong foundation, and I have several exciting ideas for future enhancements to make this clinic chatbot even more powerful and useful for the Kenyan healthcare landscape:
Appointment Management: Implementing features to allow users to easily view their existing appointments or cancel them directly through the chatbot. This is a crucial next step for a complete booking system.
Clinic-Specific FAQs / Knowledge Base: Creating a dedicated system to store and retrieve answers to clinic-specific questions (e.g., unique services, specific opening hours, insurance accepted, particular doctor specializations) to provide more accurate and immediate responses than general AI, tailored to a local clinic's needs.
Doctor Specificity & Availability: Enhancing the booking flow to allow users to select a specific doctor for a service, and integrating a more complex availability calendar to show real-time slots.
Admin Dashboard: Building a user-friendly web interface for clinic staff to efficiently view, manage, and confirm appointments, and potentially update the clinic's FAQs directly.
Full Deployment: Moving the chatbot from the Twilio Sandbox to a production-ready setup with a dedicated Twilio number and deploying the backend to a public server for wider access and real-world impact.
VI. Conclusion
Building this clinic chatbot has been an incredibly rewarding experience. It's the first project I have successfully completed, and I'm proud of it. It allowed me to combine my Node.js and MongoDB skills with exciting new technologies like Twilio for seamless WhatsApp integration and Google Gemini for powerful AI-driven conversations. The journey was filled with challenges, from state management to API rate limits, but each one provided a valuable learning opportunity, significantly strengthening my problem-solving abilities and practical understanding of full-stack development.
I'm excited about the transformative potential of AI-powered chatbots to streamline everyday processes, especially in sectors as vital as healthcare in Kenya. This project is a testament to how technology can bridge gaps and improve efficiency.
Feel free to check out the code on my GitHub repository: https://github.com/Maina-Duncan/whatsapp-clinic-chatbot
Asante sana for reading!
Written by

DUNCAN MAINA
Developer passionate about building practical solutions with AI. Excited to delve deeper into AI, full-stack development (Angular, Node.js, JavaScript), and solve more real-world problems.