Create an AI Chatbot for Your Website: A Step-by-Step Guide to RAG Implementation


In today's digital landscape, there's a lot of excitement surrounding AI, AI agents, and AI chatbots. In this blog, we'll explore how to create your own AI chatbot for your website. We'll also delve into the fundamentals of RAG (Retrieval-Augmented Generation), showing you how to gather information from your site and leverage AI to build a chatbot that truly understands your content.
Understanding the Flow: How RAG Works
Before jumping into code, let's understand what we're going to build and what tools we'll need. We'll use Node.js, Express.js, and TypeScript for our backend, Qdrant as our vector database, and Google's Gemini for our AI (why Gemini? Because it has a super generous free plan!).
But hey, don't worry too much about the specific technologies; there are plenty of options out there. What's most important is understanding how the system works, since this forms the foundation for all RAG applications. This knowledge will be super helpful for building AI agents or chatbots in the future, so don't skip this part!
The flow is actually pretty simple:
First, we scrape our website to gather context about it for our AI. You could also use other formats like PDFs to provide context.
We need a pool of information from which we can search for content related to what users ask our chatbot.
But how do we retrieve only the relevant information? That's where our hero Qdrant comes in! Qdrant is an open-source vector database and similarity search engine designed to handle high-dimensional vectors. There are other vector databases available too, so feel free to use whatever works for you.
Once we have both the user's question and the related information, we can use AI to create a good response using the information from our vector database and send it back to the user.
Let's summarize the flow:
1. Retrieve information from our website.
2. Save that information in our vector database.
3. The user asks our AI chatbot a question.
4. Based on the question, retrieve relevant information from our vector database.
5. Use AI (Gemini) to create a great response using the retrieved information.
You might wonder: how does the vector database fetch information based on the user's question? This is where vector embeddings come in. In simple terms, a vector embedding is just a long array of floating-point numbers that represents the meaning of a piece of text, so that similar texts end up with similar numbers that AI can compare.
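To make that concrete, here's a tiny sketch of cosine similarity, the comparison that vector databases like Qdrant use under the hood. The three-dimensional vectors here are made up purely for illustration; real embeddings have thousands of dimensions:

```typescript
// Cosine similarity: 1 means the vectors point in the same direction
// (very similar texts), values near 0 mean unrelated texts.
export function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 3-dimensional "embeddings" (real ones have thousands of dimensions):
const catVsDog = cosineSimilarity([0.9, 0.1, 0.2], [0.8, 0.2, 0.1]); // close to 1
const catVsCar = cosineSimilarity([0.9, 0.1, 0.2], [0.1, 0.9, 0.8]); // much lower
```

This is exactly why embeddings work for search: texts with similar meaning produce vectors with a high cosine similarity, so "nearest vectors" means "most related content".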
Let's Start Coding!
Now that you understand the theory, let's dive into the code.
First, let's set up our backend. Feeling bored about setting up a basic backend? No worries! You can directly run this command in your terminal:
npm create basic-express-server
This is my own npm package 😎 that sets up a basic Express server in JavaScript or TypeScript.
Next, let's install all our dependencies. Run the following command:
npm i @google/genai @qdrant/js-client-rest axios cheerio uuid
Setting Up Our Vector Database
As outlined in our flow, we first need to collect information from our website. But before that, let's create two functions: one to create vector embeddings from text, and another to save these to our vector database.
Let's start with the vector database. You can run Qdrant locally with Docker or create a cluster in the cloud. I'm using a cluster that I created in Qdrant Cloud. Make sure you have your Qdrant URL and API key:
import { QdrantClient } from "@qdrant/js-client-rest";
import { QDRANT_KEY, QDRANT_URL } from "./config.js";

const qdrantClient = new QdrantClient({
  url: QDRANT_URL,
  apiKey: QDRANT_KEY,
  checkCompatibility: false,
});
Now we can use this function to insert information into the database:
import { v4 as uuidv4 } from "uuid";

const WEB_COLLECTION_NAME = "my_portfolio_web";
const VECTOR_SIZE = 3072;

export const insertInDB = async (
  embedding: number[],
  url: string,
  body: string = "",
  head: string = ""
): Promise<void> => {
  try {
    const { exists } = await qdrantClient.collectionExists(WEB_COLLECTION_NAME);
    if (!exists) {
      await qdrantClient.createCollection(WEB_COLLECTION_NAME, {
        vectors: {
          size: VECTOR_SIZE,
          distance: "Cosine",
        },
      });
    }
    await qdrantClient.upsert(WEB_COLLECTION_NAME, {
      wait: true,
      points: [
        {
          id: uuidv4(),
          vector: embedding,
          payload: { url, head, body },
        },
      ],
    });
  } catch (error) {
    throw new ErrorHandler("Failed to insert data into database", 500);
  }
};
Note: ErrorHandler is a custom error class from my Express setup that wraps a message and an HTTP status code. Feel free to replace it with a plain Error.
The size parameter is the number of dimensions in the vector embedding. Gemini's gemini-embedding-exp-03-07 model produces 3072-dimensional vectors; this may differ for other models.
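Because the collection is created with a fixed vector size, an embedding of the wrong length will fail at insert time with a fairly cryptic database error. A small guard like this (my own addition, not part of the Qdrant API) makes the failure mode obvious:

```typescript
const VECTOR_SIZE = 3072; // dimensions of gemini-embedding-exp-03-07

// Fail fast with a clear message instead of a confusing database error.
export function assertEmbeddingSize(
  embedding: number[],
  expected: number = VECTOR_SIZE
): void {
  if (embedding.length !== expected) {
    throw new Error(
      `Embedding has ${embedding.length} dimensions, expected ${expected}. ` +
        `Did you switch embedding models without updating VECTOR_SIZE?`
    );
  }
}
```

Calling this right before insertInDB is especially handy if you ever swap embedding models, since each model has its own dimensionality.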
Creating Vector Embeddings
Now that we have a function to insert information into our database, let's create a function that generates vector embeddings. For this, we need the API key for our AI to use the embedding model. Make sure you have your API key ready.
To create embeddings using Gemini's gemini-embedding-exp-03-07 model, first initialize the Gemini client:

import { GoogleGenAI } from "@google/genai";
import { AI_API_KEY } from "./config.js";

export const ai = new GoogleGenAI({
  apiKey: AI_API_KEY!,
});
And now let's create a function that generates vector embeddings:
import { GenerateContentResponse } from "@google/genai";
export const generateVectorEmbedding = async (
  text: string
): Promise<number[]> => {
  try {
    const response = await ai.models.embedContent({
      model: "gemini-embedding-exp-03-07",
      contents: text,
    });
    const values = response.embeddings?.[0]?.values;
    if (!values || values.length === 0) {
      throw new ErrorHandler("Failed to generate embedding", 500);
    }
    return values;
  } catch (error) {
    throw new ErrorHandler("Failed to generate vector embedding", 500);
  }
};
Collecting Website Information
Now let's get the information we'll save in our vector database. I'm scraping my portfolio website to get this information; you can scrape your own website. Here's a function to scrape a page:
import axios, { AxiosResponse } from "axios";
import * as cheerio from "cheerio";

interface ScrapeResult {
  head: string;
  sections: string[];
}

export async function scrape(url: string): Promise<ScrapeResult> {
  try {
    const response: AxiosResponse<string> = await axios.get<string>(url);
    const $ = cheerio.load(response.data);
    const head = $("head").html() ?? "";
    const sections: string[] = [];
    $("section").each((_, el) => {
      const html = $(el).html();
      if (html) sections.push(html);
    });
    return { head, sections };
  } catch (error) {
    throw new ErrorHandler("Failed to scrape the URL", 500);
  }
}
Here we're using axios to get the website content and cheerio to help us extract specific tags.
Note: You'll need to adapt two things according to your needs:
1. I'm specifically targeting section tags because that's how my website is structured; the resulting array of sections is my way of chunking the information. You'll need to find a chunking strategy that works for your site.
2. You could make this a recursive function that extracts all a tags and scrapes those URLs too. Just make sure there's a stopping condition so it doesn't run infinitely!
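If your pages don't split neatly into section tags, a generic fallback is fixed-size chunks with a small overlap, so a sentence that straddles a boundary still appears whole in at least one chunk. Here's a minimal sketch in plain TypeScript (the size and overlap values are arbitrary; tune them for your content):

```typescript
// Split text into fixed-size chunks with overlap. The overlap means
// content near a chunk boundary is repeated in the next chunk, so it
// can still be matched as a whole during retrieval.
export function chunkText(
  text: string,
  chunkSize: number = 1000,
  overlap: number = 100
): string[] {
  if (overlap >= chunkSize) {
    throw new Error("overlap must be smaller than chunkSize");
  }
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    start += chunkSize - overlap;
  }
  return chunks;
}
```

Each chunk then goes through the same generateVectorEmbedding and insertInDB pipeline as a section would.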
Ingesting Information to the Database
Now that we have the information, let's create a function that ingests it into the database using the three functions we've created:
export const ingest = async (url: string): Promise<void> => {
  try {
    const { head, sections } = await scrape(url);
    if (!head || sections.length === 0) {
      throw new ErrorHandler("Failed to ingest the URL", 400);
    }
    for (const section of sections) {
      const embedding = await generateVectorEmbedding(section);
      await insertInDB(embedding, url, section, head);
    }
  } catch (error) {
    throw new ErrorHandler("Failed to ingest URL", 500);
  }
};
If you want to save information from PDFs instead, you can use this function. Just make sure you install unpdf:
import { extractText, getDocumentProxy } from "unpdf";

export const pdfToPageChunks = async (pdfBuffer: Buffer): Promise<string[]> => {
  try {
    const pdf = await getDocumentProxy(new Uint8Array(pdfBuffer));
    // mergePages: false keeps pages separate, so we get the text back
    // as an array with one entry per page
    const { text } = await extractText(pdf, { mergePages: false });
    if (Array.isArray(text)) {
      return text;
    } else {
      return [text];
    }
  } catch (error) {
    throw new ErrorHandler("Failed to parse PDF", 500);
  }
};
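Wiring those PDF pages into the pipeline is then just a loop. Here's a sketch with the embedding and insert steps injected as parameters; in the real app you'd pass generateVectorEmbedding and insertInDB, but injecting them keeps the loop independent of any particular AI provider or database:

```typescript
// Ingest an array of text chunks (e.g. PDF pages) one at a time.
// embed and insert are passed in so this loop stays easy to test
// and isn't tied to Gemini or Qdrant specifically.
export async function ingestChunks(
  chunks: string[],
  source: string,
  embed: (text: string) => Promise<number[]>,
  insert: (embedding: number[], source: string, body: string) => Promise<void>
): Promise<number> {
  let ingested = 0;
  for (const chunk of chunks) {
    if (!chunk.trim()) continue; // skip blank pages
    const embedding = await embed(chunk);
    await insert(embedding, source, chunk);
    ingested++;
  }
  return ingested;
}
```

Usage would look like: const pages = await pdfToPageChunks(buffer); await ingestChunks(pages, "my-doc.pdf", generateVectorEmbedding, (e, s, b) => insertInDB(e, s, b)).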
Retrieving Relevant Information
Now that we've saved the information, we can retrieve it based on user questions. Let's create a function that searches the database based on embeddings of users' questions:
export const retrieveByEmbedding = async (
  queryEmbedding: number[],
  k: number = 3
): Promise<string[]> => {
  try {
    const searchResult = await qdrantClient.search(WEB_COLLECTION_NAME, {
      vector: queryEmbedding,
      limit: k,
    });
    return searchResult.map((hit) => (hit.payload?.body as string) ?? "");
  } catch (error) {
    throw new ErrorHandler("Failed to retrieve data from database", 500);
  }
};
We can generate embeddings for the user's question using the function we created earlier.
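One refinement worth considering: the search above returns the top k hits even when nothing in the database is actually similar to the question. Qdrant also accepts a score_threshold search parameter, or you can filter the results yourself. Here's a sketch of the latter; the hit shape mirrors Qdrant's { score, payload } results, and the 0.5 cutoff is an arbitrary starting point you should tune:

```typescript
interface SearchHit {
  score: number; // similarity score of the hit (higher is more similar)
  payload?: { body?: string } | null;
}

// Keep only hits above a minimum similarity, so the chatbot can say
// "I don't know" instead of answering from unrelated context.
export function filterByScore(hits: SearchHit[], minScore: number = 0.5): string[] {
  return hits
    .filter((hit) => hit.score >= minScore && typeof hit.payload?.body === "string")
    .map((hit) => hit.payload!.body as string);
}
```

If filterByScore returns an empty array, you can short-circuit and reply that the question is outside the site's content rather than calling the model at all.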
Creating the Chat Function
Now that we have everything set up, let's create a function that returns a good response:
export const chat = async (question: string = ""): Promise<string> => {
  try {
    const questionEmbedding = await generateVectorEmbedding(question);
    const contextBodies: string[] = await retrieveByEmbedding(questionEmbedding);
    const prompt = `
Context:
${contextBodies.join("\n\n")}

User Question: ${question}

Answer:
`;
    const response: GenerateContentResponse = await ai.models.generateContent({
      model: "gemini-2.0-flash",
      contents: prompt,
      config: {
        systemInstruction: `
You are a helpful assistant for a website that answers user questions based on provided context.
`,
      },
    });
    return response.text ?? "";
  } catch (error) {
    throw new ErrorHandler("Failed to generate response", 500);
  }
};
Note: Make sure to provide more detailed system instructions for better results!
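For example, a more detailed system instruction might look like this (the wording is just a starting point I'd suggest; adapt it to your own site and tone):

```typescript
// A more specific system instruction keeps answers grounded in the
// retrieved context instead of the model's general knowledge.
export const SYSTEM_INSTRUCTION = `
You are a helpful assistant for my portfolio website.
Answer the user's question using ONLY the provided context.
If the context does not contain the answer, say you don't know
and suggest contacting me directly instead of guessing.
Keep answers short, friendly, and in plain language.
Never invent links, prices, or dates.
`;
```

Explicit rules like "say you don't know" and "never invent links" noticeably reduce hallucinated answers in RAG setups, because the model otherwise tends to fill gaps from its training data.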
Handling Conversation History
If you want the AI to remember previous conversations, you can use Gemini's multi-turn conversation feature like this:
const chat = ai.chats.create({
  model: "gemini-2.0-flash",
  history: [
    {
      role: "user",
      parts: [{ text: "Hello" }],
    },
    {
      role: "model",
      parts: [{ text: "Great to meet you. What would you like to know?" }],
    },
  ],
});

const response = await chat.sendMessage({
  message: "Can you tell me about new features?",
});
Conclusion
Hooray! You now have a chatbot that can answer questions about your business or website. The bot uses information directly from your site to provide relevant, accurate responses to users.
If you want to see the complete code, you can check out my GitHub repository.
While you could use frameworks like LangChain if you prefer, the basic flow of a RAG application will be very similar to what we've covered here. Understanding these fundamentals will help you build more sophisticated AI solutions in the future!
Written by

Samir Tiwari
Full Stack Developer | MERN Stack Developer | Graphic Designer | Freelancer