Exploring RAGs 3

So far we have used mostly simple, naive systems to implement a RAG. This will perhaps be the last post dealing with one such model.
I had been toying with the idea of a program that would let me chat with a set of documents as and when needed. This is also the closest I have come to an actual real-life use case.
Understanding why
Firstly, when a product or tool ships with a massive set of documents, it can be difficult for newcomers to guess the right keywords and find what they are looking for within the first couple of searches.
Notwithstanding what that says about how such documentation is written in the first place, I hope we can both agree that there needs to be a solution short of a complete rewrite.
One such solution could use a really fast, small-scale LLM whose job is to parse the meaning of the user's search, figure out which parts of the documentation are actually relevant (the trickiest part of this whole thing), and then summarize the content of those documents into the LLM's context, so that the user can chat with the bot about it and clear their doubts. Perhaps it could even generate some examples to help them get started.
Production-ready implementations of this idea already exist on the documentation sites of several widely used tools, most visibly in the docs of AI tools themselves, which showcase their functionality on their own documentation pages. Here, though, we will build only a very simple, local, terminal-based implementation of it, running on dummy data.
A lot of optimization is still left to be done in the code, along with all the frontend and backend plumbing needed to display the results in a graphically pleasing and easy-to-use way.
Regardless, this is a simple, solid foundation on which one can build larger, better-optimized, and more user-friendly implementations.
So let’s take each part of the problem one by one and deal with it.
Jsonify
Our data will be stored mainly in JSON (or similar formats) so that it can be kept compact and transferred efficiently back and forth between the LLM and the web servers, should we choose to deploy this somewhere.
The first step, then, is to convert all our documentation into JSON. That's where this program comes in.
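Here is a rough sketch of what that conversion could look like. The docs/ folder, the docs.json output file, and the title/path/content schema are assumptions I'm making for illustration, not necessarily the exact shape of the original program.

```python
# jsonify.py - a minimal sketch: walk a docs folder and dump every file into one JSON list.
# The "docs" directory, output filename, and record schema are illustrative assumptions.
import json
from pathlib import Path

DOCS_DIR = Path("docs")          # folder containing the raw documentation files
OUTPUT_FILE = Path("docs.json")  # single JSON file the rest of the pipeline reads

def build_index(docs_dir: Path) -> list[dict]:
    """Read every .md/.txt file and wrap it in a small JSON-friendly record."""
    records = []
    for path in sorted(docs_dir.rglob("*")):
        if path.suffix.lower() not in {".md", ".txt"}:
            continue
        records.append({
            "title": path.stem.replace("_", " "),
            "path": str(path),
            "content": path.read_text(encoding="utf-8"),
        })
    return records

if __name__ == "__main__":
    records = build_index(DOCS_DIR)
    OUTPUT_FILE.write_text(json.dumps(records, indent=2), encoding="utf-8")
    print(f"Wrote {len(records)} documents to {OUTPUT_FILE}")
```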
LLM
This is a general piece of code that will work with any program that needs an LLM interface.
The actual connection to an LLM, local or online, is implemented here, along with whatever frequently needed helper functions we might want.
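A minimal sketch of such an interface, assuming a locally running Ollama server; the endpoint, the model name, and the helper functions here are my assumptions, and any other local or hosted API could be swapped in.

```python
# llm.py - a minimal sketch of the LLM interface, assuming a local Ollama server.
# The endpoint URL, model name, and helper names are illustrative assumptions.
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "llama3.2"  # any small local model will do

def ask(prompt: str, system: str = "") -> str:
    """Send a single prompt to the local model and return the full response text."""
    payload = {
        "model": MODEL,
        "prompt": prompt,
        "system": system,
        "stream": False,  # get the whole answer in one JSON object
    }
    resp = requests.post(OLLAMA_URL, json=payload, timeout=300)
    resp.raise_for_status()
    return resp.json()["response"]

def summarize(text: str, max_words: int = 200) -> str:
    """Frequently needed helper: compress a document so it fits in the chat context."""
    return ask(f"Summarize the following documentation in under {max_words} words:\n\n{text}")
```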
Chatbot
This is the actual implementation of the chatbot that chats with the user and retrieves relevant information.
- todo: the relevant files are re-checked with every single query; make sure they stay in memory unless the next query is wildly different from the previous ones.
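Below is a minimal sketch of that chat loop, tying together the JSON index and the LLM interface sketched above. The relevance-selection prompt and the file and function names are illustrative assumptions rather than the exact original code.

```python
# chatbot.py - a minimal sketch of the chat loop built on the pieces above.
# The selection prompt, file names, and helpers are illustrative assumptions.
import json
from pathlib import Path

from llm import ask, summarize  # the interface sketched above

def pick_relevant(query: str, docs: list[dict]) -> list[dict]:
    """Ask the model which document titles look relevant to the query (the tricky part)."""
    titles = "\n".join(d["title"] for d in docs)
    answer = ask(
        f"Here is a list of documentation titles:\n{titles}\n\n"
        f"List only the titles relevant to this question, one per line:\n{query}"
    )
    chosen = {line.strip().lower() for line in answer.splitlines() if line.strip()}
    return [d for d in docs if d["title"].lower() in chosen]

def main() -> None:
    docs = json.loads(Path("docs.json").read_text(encoding="utf-8"))
    while True:
        query = input("you> ").strip()
        if query in {"quit", "exit"}:
            break
        # NOTE: relevant files are re-selected on every query (see the todo above).
        relevant = pick_relevant(query, docs)
        context = "\n\n".join(summarize(d["content"]) for d in relevant)
        reply = ask(f"Using only this documentation:\n{context}\n\nAnswer: {query}")
        print(f"bot> {reply}")

if __name__ == "__main__":
    main()
```

Asking the model to pick relevant titles on every turn is the slow, naive part; it is exactly what a proper embedding search would replace later.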
Potential related use
I also tried tweaking it into a college-notes summarizer using a completely different approach, one that still doesn't use sentence transformers or vector embeddings, but it is unoptimized, slow, and impractical for most use cases until improved.
Still, done right, it could be a great help to students. So if there is enough response from you all, I will go in, clean up and optimize the code, and release a post on the usable version here sometime later.
Conclusion
So that was my (perhaps) last naive implementation of a RAG. At this point, the limiting factor has become just how slow it is to push raw walls of text into the LLM and let it decide which ones are relevant.
Later on, once the metadata is put into an embedding, it will become much faster to let vector logic do its work and let the LLM do what it's actually meant to do: process the text and generate insights.
See you soon in the next post.
(Note: I actually wrote this nearly a year ago and forgot to post it lol)