Building a Real-Time Financial News RAG Chatbot with Gemini and Qdrant
Introduction
Most of us have invested time and effort in real estate, mutual funds, bonds, and more. Before investing, we search for a suitable banking partner: for long-term investments, we research which banks are performing well, are free of scams and frauds, and offer reasonable interest rates and returns. That means staying up to date with financial news.
What if you had a real-time financial news chatbot at your disposal that could surface all the news related to finance and economics? Does that sound interesting? Retrieval Augmented Generation (RAG) has made this possible: we can combine large language models with vector databases to get our queries answered.
Let's build a real-time financial news RAG chatbot and see whether it answers questions accurately from the available data. It stays real-time as long as we keep feeding the vector database the latest news.
Real-Time Financial News RAG Chatbot Using Gemini
As our starting point, we took the Indian Financial News dataset, which consists of financial news related to Indian banks and covers stories up to 26th May 2020.
Before we get started, let’s install the required dependencies.
%pip install -q llama-index 'google-generativeai>=0.3.0' qdrant_client llama-index-embeddings-fastembed fastembed llama-index-llms-gemini
Preparing the Nodes
As the dataset is a CSV file, let’s load the data.
from llama_index.core import SimpleDirectoryReader
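With the reader imported, loading might look like the sketch below; the filename IndianFinancialNews.csv is an assumption, so point it at wherever you saved the dataset CSV.

# Hypothetical path to the downloaded dataset CSV
docs = SimpleDirectoryReader(input_files=["IndianFinancialNews.csv"]).load_data()
print(f"Loaded {len(docs)} documents")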
All the documents are ready; now we'll split their text into chunks of a defined size.
from llama_index.core.node_parser.text import SentenceSplitter
text_parser = SentenceSplitter(chunk_size=1024)
text_chunks = []  # This will hold all the chunks of text from all documents
doc_idxs = []  # This will keep track of the document each chunk came from

for doc_idx, doc in enumerate(docs):
    # Split the current document's text into chunks
    cur_text_chunks = text_parser.split_text(doc.text)
    # Extend the list of all text chunks with the chunks from the current document
    text_chunks.extend(cur_text_chunks)
    # Extend the document index list with the index of the current document, repeated for each chunk
    doc_idxs.extend([doc_idx] * len(cur_text_chunks))
Then we'll create a TextNode object for each chunk, assign the source document's metadata to it, and collect all the nodes in a single list, as sketched below.
from llama_index.core.schema import TextNode
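Using the TextNode class imported above, a minimal sketch of this step could be:

nodes = []
for idx, text_chunk in enumerate(text_chunks):
    node = TextNode(text=text_chunk)
    # Carry the source document's metadata over to the chunk
    src_doc = docs[doc_idxs[idx]]
    node.metadata = src_doc.metadata
    nodes.append(node)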
Initializing the Qdrant Vector Store
To store the nodes, we need a vector store. Here, we have chosen Qdrant. Qdrant is a high-performance vector database: it is fast and accurate, using the HNSW algorithm for approximate nearest-neighbor search, and it lets you attach additional payload to vectors and filter results by payload values through an easy-to-use API. It also supports Docker installation, offers in-memory storage of vectors, is cloud-native, and scales horizontally. Written in Rust, Qdrant implements dynamic query planning and payload data indexing.
First, we’ll create a Qdrant client and a collection to back our vector store.
from llama_index.core import VectorStoreIndex, StorageContext
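A minimal setup might look like the following sketch; it relies on the llama-index-vector-stores-qdrant integration package, and the collection name financial_news is an assumption.

import qdrant_client
from llama_index.vector_stores.qdrant import QdrantVectorStore

# ":memory:" spins up an ephemeral local instance; point the client at a
# running Qdrant server instead for persistent, real-time use
client = qdrant_client.QdrantClient(location=":memory:")
vector_store = QdrantVectorStore(client=client, collection_name="financial_news")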
Embeddings and the Gemini Text Model
The vector store and nodes are ready, but the vector store won't accept the nodes directly: they need embeddings first. For embeddings, we use the FastEmbed embedding model, and as the LLM we leverage Gemini, a very capable family of multimodal models. Built on the transformer architecture and trained on TPUs, Gemini excels at summarization, reading comprehension tasks with per-task fine-tuning, multilinguality, long context, coding, complex reasoning, mathematics, and, of course, multimodality.
We’ll set the Google API key, which you can obtain from Google AI Studio.
%env GOOGLE_API_KEY=your-api-key
Now we'll configure LlamaIndex's Settings with the FastEmbed embedding model and the Gemini LLM, which picks up the API key we just set.
from llama_index.embeddings.fastembed import FastEmbedEmbedding
from llama_index.llms.gemini import Gemini
from llama_index.core import Settings

embed_model = FastEmbedEmbedding()  # defaults to the BAAI/bge-small-en-v1.5 model
Settings.embed_model = embed_model
Settings.llm = Gemini(model="models/gemini-pro")
Settings.transformations = [SentenceSplitter(chunk_size=1024)]

storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex(
    nodes=nodes,
    storage_context=storage_context,
    transformations=Settings.transformations,
)
The vector store is wrapped in a storage context, and the index is then built from the nodes using that storage context.
HyDE Query Transformation
Now, we’ll build the vector query engine from a vector retriever and a response synthesizer. The vector retriever is a VectorIndexRetriever initialized with our index, while the response synthesizer generates a response from the LLM given the user query and the retrieved text chunks; its output is a Response object.
from llama_index.core import get_response_synthesizer
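Continuing from the import above, one way to wire the retriever and synthesizer together is sketched below; the similarity_top_k value is an assumption, so tune it to your needs.

from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine

# Retrieve the most similar chunks from the vector index
vector_retriever = VectorIndexRetriever(index=index, similarity_top_k=2)
response_synthesizer = get_response_synthesizer()

vector_query_engine = RetrieverQueryEngine(
    retriever=vector_retriever,
    response_synthesizer=response_synthesizer,
)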
We will employ the HyDE query transform for advanced retrieval. HyDE (Hypothetical Document Embeddings) enables zero-shot, prompt-based dense retrieval: given a query, an instruction-following LLM first generates a hypothetical document that captures the text patterns a relevant answer would contain. The hypothetical document is converted into an embedding vector (when several are generated, their vectors are averaged into one), and that embedding is used to find the closest real documents in the document embedding space. Retrieval thus works through document-to-document similarity, with no need for relevance-labeled training data.
In practice, the HyDE query transformation helps the engine deliver direct, concise responses.
from llama_index.core.indices.query.query_transform import HyDEQueryTransform
from llama_index.core.query_engine import TransformQueryEngine

hyde = HyDEQueryTransform(include_original=True)
hyde_query_engine = TransformQueryEngine(vector_query_engine, hyde)
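Before wiring up a UI, we can sanity-check the engine with a direct query:

response = hyde_query_engine.query("What is the latest news about RBI?")
print(response)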
Leveraging Gradio UI for Chatbot Implementation
To give the chatbot a user interface, we will use Gradio.
def queries(query_str):
    # Route the user's question through the HyDE-transformed query engine
    return str(hyde_query_engine.query(query_str))
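With the helper above in place, a minimal Gradio wiring might look like this; Gradio must be installed separately (pip install gradio), and the labels and title are just placeholders.

import gradio as gr

demo = gr.Interface(
    fn=queries,
    inputs=gr.Textbox(label="Ask a question about financial news"),
    outputs=gr.Textbox(label="Answer"),
    title="Real-Time Financial News RAG Chatbot",
)
demo.launch()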
Query Time!
Let’s query the chatbot.
Question 1:
Tell me all the news about scams.
PNB scam fallout: Trade finance hit as caution prevails, premium soars
Question 2:
What is the latest news about RBI?
The RBI has deferred the launch of IndAs again, awaiting amendments to the banking laws.
Question 3:
Can you tell me all the news related to home loans?
Lending rates cut leads to balance transfers in home loan market: ICRA
Question 4:
Tell me about the news of the Yes Bank scam.
The RBI discovered more than $450 million in extra bad loans at Yes Bank. The gross non-performing assets assessed by the Reserve Bank of India were $457 million higher than Yes Bank had disclosed as of November 20, 2019.
Question 5:
What is the latest news about NBFC?
The latest news about NBFC is that RBI eases norms for banks to lend more to NBFCs, housing finance companies. RBI enhances single-borrower exposure limit to 15% of bank's capital.
Conclusion
Building a real-time financial news RAG chatbot using the Indian Financial News dataset proved to be a rewarding journey. Now you know that you can easily ask questions about banks and get instant answers! It’s now time for you to make your own chatbot. Hope you enjoyed reading this blog.
This blog was earlier posted here: https://medium.com/p/64c0a3fbe45b