Enhancing Our Chatbot with Document Retrieval: Exploring LLMs — 6
Hey there, fellow tech enthusiasts! Remember our last adventure where we taught a chatbot to answer questions using a PostgreSQL database? Well, hold onto your keyboards, because we're about to kick things up a notch. Ever wondered if a chatbot could flip through a document before tapping into a database? Spoiler alert: it can!
In this installment of our Exploring LLMs series, we're supercharging our chatbot with the power of Document Retrieval. If you're new here or need a refresher, you can catch up on the previous posts in this series.
Curious? Let's unravel how we pulled this off, step by step, without the fluff. And yes, the code for this chapter is right here if you're itching to get your hands dirty.
Overview of the New Features
Document Retrieval: Our chatbot can now peruse a text document to find answers before bothering the database.
LLM-Based Classification: We're using a language model to decide if the document's answer hits the nail on the head.
Fallback Mechanism: If the document leaves us hanging, we fall back to querying the database, just like old times.
Let's dive into the code modifications and see how this magic happens.
Before We Dive In
For our mission, we'll create a new document file named text_doc.md. Think of it as the chatbot's little black book, containing hobbies and ages of employees from our database, but told in a narrative style. Here's a sneak peek:
John Doe, a 34-year-old from New York, spends most of his weekends hiking and exploring nature, with a personal goal to visit every national park in the U.S. before turning 50. Jane Smith, 29, is passionate about painting and often spends her evenings working on abstract art pieces; she dreams of one day opening her own gallery. Emily Johnson, at 41, is an avid cyclist who participates in local races and has a life goal to complete a triathlon. Michael Brown, 37, loves photography and hopes to publish a book of his work, capturing unique moments from his travels around the world. Lastly, Sarah Davis, a 25-year-old yoga enthusiast, is focused on personal growth and mindfulness, with the goal of becoming a certified instructor and opening her own wellness retreat.
Our goal? To have the chatbot tap into this juicy info when answering questions.
Step 1: Importing the Magic Ingredients
First things first, we need to bring in some extra tools to handle document loading, embeddings, and vector storage. Here's what we're adding to the mix:
from langchain.document_loaders import TextLoader
from langchain.embeddings import HuggingFaceEmbeddings # Make sure this fits your setup
from langchain.vectorstores import FAISS
from langchain.chains import RetrievalQA
Why these?
TextLoader: So our chatbot can read the document—because literacy is important!
HuggingFaceEmbeddings: To turn text into numerical vectors that capture meaning.
FAISS: Think of it as a super-fast librarian who can find similar texts in a jiffy.
RetrievalQA: The chain that combines retrieval and question-answering.
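One heads-up before we continue: the exact import paths depend on your LangChain version. On newer releases (0.1 and later), these classes moved to the langchain_community package, so if the imports above fail, try this instead:

from langchain_community.document_loaders import TextLoader
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS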
Embeddings and Vector Databases? Sounds Fancy!
Imagine trying to find a book in a library without any cataloging system—nightmare, right? Embeddings are like assigning a unique ID to every book based on its content, making similar books have similar IDs. A vector database stores all these IDs (vectors) so we can quickly find what we're looking for.
Embeddings: Convert text into high-dimensional vectors. Similar texts = similar vectors.
FAISS: An efficient way to store and search through these vectors.
By using embeddings and FAISS, our chatbot can swiftly find relevant pieces of text from our document.
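To make this less abstract, here's a tiny self-contained sketch (the sentences are made up for illustration) showing that semantically related sentences end up with similar vectors:

import numpy as np
from langchain.embeddings import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings()  # defaults to a sentence-transformers model

# Embed three sentences into vectors
v1 = np.array(embeddings.embed_query("Jane is passionate about painting."))
v2 = np.array(embeddings.embed_query("Jane loves creating abstract art."))
v3 = np.array(embeddings.embed_query("The server runs PostgreSQL."))

def cosine(a, b):
    # Cosine similarity: close to 1 for similar meanings, near 0 for unrelated ones
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(v1, v2))  # higher: both sentences are about Jane's art
print(cosine(v1, v3))  # lower: unrelated topics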
Step 2: Modifying the __init__ Method
Time to teach our chatbot some new tricks!
Loading the Text Document
We need to load text_doc.md so our chatbot can access it:
if document_path:
    # Load the document
    loader = TextLoader(document_path)
    documents = loader.load()
TextLoader reads the content of the file.
documents is a list containing the loaded text.
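If you're curious what loader.load() actually hands back, a quick sanity check (purely illustrative) looks like this:

print(len(documents))                  # 1: TextLoader returns the whole file as a single Document
print(documents[0].page_content[:80])  # the first 80 characters of our narrative
print(documents[0].metadata)           # e.g. {'source': 'text_doc.md'}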
Creating Embeddings and the Vector Store
Next, we turn the text into embeddings and store them:
# Create embeddings and vectorstore
embeddings = HuggingFaceEmbeddings()
vectorstore = FAISS.from_documents(documents, embeddings)
How does this work?
Embedding Generation: Each piece of text is converted into a numerical vector.
Vector Storage: We store these vectors in FAISS for quick retrieval.
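Once the vector store is built, we can already poke at it directly. Here's an illustrative query (the exact results will vary with your embedding model):

# Find the two chunks whose vectors are closest to the question's vector
docs = vectorstore.similarity_search("Who likes painting?", k=2)
for doc in docs:
    print(doc.page_content[:60])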
Setting Up the Retriever and the Retrieval QA Chain
Now, let's set up the system that fetches relevant text based on a query:
# Create a retriever
retriever = vectorstore.as_retriever()
# Create a RetrievalQA chain
self.retrieval_qa_chain = RetrievalQA.from_chain_type(
    llm=self.llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True
)
Retriever: Finds relevant text snippets by comparing vectors.
RetrievalQA Chain: Uses the retriever and the LLM to generate answers.
chain_type="stuff": The simplest combination strategy, where every retrieved snippet is stuffed into a single prompt for the LLM to answer from.
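Although we'll wire this into get_response in Step 4, you can already test the chain in isolation. A quick illustrative call:

result = self.retrieval_qa_chain({"query": "What are Jane Smith's hobbies?"})
print(result["result"])            # the LLM's answer
print(result["source_documents"])  # the snippets it used, thanks to return_source_documents=True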
Handling the Absence of a Document
In case we don't have a document, we ensure our chatbot doesn't throw a tantrum:
else:
    self.retrieval_qa_chain = None
Step 3: Adding a Classification Step
But wait, how do we know if the document's answer is good enough? Let's add a step where the LLM acts as a quality checker.
Creating a Classification Prompt
We craft a prompt to ask the LLM if the answer contains the needed information:
self.classification_prompt = PromptTemplate.from_template(
    """
    Given the user's question and the assistant's answer, determine whether the answer addresses the question. A partially correct answer is acceptable, as long as it contains some relevant information. If it does, start your response with "yes"; otherwise, start it with "no".
    Question: {question}
    Answer: {answer}
    """
)
Purpose: Have the LLM judge the answer.
Constraint: Keep it simple.
Creating the Classification Chain
Now, we tie it all together:
self.classification_chain = LLMChain(
    llm=self.llm,
    prompt=self.classification_prompt
)
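To see the classifier in action, here's what an illustrative round-trip might look like (the answer string is made up for the example):

verdict = self.classification_chain.run({
    "question": "What is Jane Smith's salary?",
    "answer": "I don't have any information about salaries.",
}).strip().lower()
print(verdict)  # expected to start with "no", so we'd fall back to the SQL chain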
Step 4: Modifying the get_response Method
Time to update how our chatbot responds to questions.
Attempting to Answer Using the Document
First, we try to answer using the document:
if self.retrieval_qa_chain:
    try:
        # Get answer from RetrievalQA chain
        response = self.retrieval_qa_chain({"query": question})
        answer = response["result"]
retrieval_qa_chain attempts to find an answer in the document.
answer is what the LLM comes up with.
Classifying the Answer
Now, we check if the answer is satisfactory:
# Use the LLM to classify whether the answer contains the information
classification_input = {
    "question": question,
    "answer": answer
}
classification_result = self.classification_chain.run(classification_input).strip().lower()
classification_result holds the LLM's verdict, normalized to lowercase, which should start with "yes" or "no".
Deciding Whether to Use the Retrieved Answer
Based on the classification, we decide our next move:
if classification_result.startswith("yes"):
    # Update memory
    self.memory.save_context({"question": question}, {"answer": answer})
    return answer
else:
    # Proceed to SQL chain
    pass
If "Yes": We return the answer.
If "No": Time to query the database.
Exception Handling
We make sure to handle any hiccups gracefully:
except Exception as e:
    # If there is any error, proceed to SQL chain
    print(f"Error in RetrievalQA chain: {e}")
Proceeding to Query the Database
If the document didn't help, we fall back to our trusty SQL chain:
# Prepare the inputs
inputs = {
    "question": question,
}
# Call the chain
response = self.chain.invoke(inputs)
# Update memory
self.memory.save_context({"question": question}, {"answer": response})
return response
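Putting all of these fragments together, the updated get_response method looks roughly like this (a sketch that mirrors the snippets above; check the repo for the exact code):

def get_response(self, question):
    # 1. Try the document first, if one was provided
    if self.retrieval_qa_chain:
        try:
            response = self.retrieval_qa_chain({"query": question})
            answer = response["result"]

            # 2. Ask the LLM whether that answer actually helps
            classification_result = self.classification_chain.run({
                "question": question,
                "answer": answer,
            }).strip().lower()

            if classification_result.startswith("yes"):
                self.memory.save_context({"question": question}, {"answer": answer})
                return answer
        except Exception as e:
            # Any retrieval hiccup just means we fall through to SQL
            print(f"Error in RetrievalQA chain: {e}")

    # 3. Fall back to the SQL chain from the previous post
    response = self.chain.invoke({"question": question})
    self.memory.save_context({"question": question}, {"answer": response})
    return response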
Step 5: How It All Comes Together
User Interaction Flow
User asks a question: "What are Jane Smith's hobbies?"
Document Retrieval Attempt:
The chatbot searches text_doc.md and finds information about Jane's passion for painting.
Answer Classification:
The LLM confirms the answer is relevant.
The chatbot returns: "Jane Smith is passionate about painting and dreams of opening her own gallery."
Fallback to Database Query:
If the user asks: "What is Jane Smith's salary?"
Document lacks this info.
Classification returns "No."
Chatbot queries the database and provides the salary.
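In code, that whole flow boils down to two calls (Chatbot and its constructor argument are hypothetical stand-ins for however your implementation names them):

bot = Chatbot(document_path="text_doc.md")  # hypothetical class name
print(bot.get_response("What are Jane Smith's hobbies?"))  # answered from the document
print(bot.get_response("What is Jane Smith's salary?"))    # document lacks this, so we hit the database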
Acknowledging Limitations and Future Improvements
Now, let's address the elephant in the room. This isn't a production-ready solution—yet.
Gaps and Areas for Improvement:
Scalability: Handling larger documents or multiple files efficiently.
Enhanced Retrieval: Smarter ways to chunk and retrieve documents.
Error Handling: More robust mechanisms to catch and log errors.
Security: Sanitizing inputs to prevent nasty surprises like SQL injections.
Our Intent: This series is all about giving you a taste of how LLMs can be harnessed. Moving forward, we'll dive deeper into each of these components—embeddings, vector databases, retrieval mechanisms, and more—to truly understand the nuts and bolts.
Conclusion
By adding a document retrieval mechanism and an LLM-based classification step, we've made our chatbot smarter and more resourceful. It's like giving it a library card before sending it to the database. This approach, known as Retrieval-Augmented Generation (RAG), makes the chatbot more efficient and user-friendly.
We've introduced key concepts like embeddings and vector databases, essential tools in the AI toolkit. While there's room for improvement, we've laid a solid foundation.
Key Takeaways
Embeddings: Turning text into numbers to capture meaning.
Vector Databases: Storing and searching these numerical representations efficiently.
Retrieval-Augmented Generation (RAG): Combining retrieval with generation for better answers.
LLM-Based Classification: Letting the language model judge answer relevance.
Continuous Improvement: Recognizing limitations and planning for enhancements is crucial in developing effective AI systems.
Stay tuned as we dive deeper into these components in upcoming posts. Until then, happy coding!
P.S. Got stuck or have questions? Drop a comment below or reach out—I'm all ears!