Personal RAG: Beer Recommendation Assistants (But Really, So Much More!)

Between Sora, Gemini's impressive new real-time capabilities, and everything else going on in the world of AI, you might be forgiven for thinking that we've passed the point of saturation.
To quote many a shady late-night TV sales pitch: "but wait, there's more."
The purpose of this post is to demonstrate a workflow for creating a personal AI agent using the excellent Open WebUI, which, in my opinion, competes with LibreChat for the title of best self-hostable LLM front end. Although this is the platform I've used for this experiment, the workflow should translate fairly well regardless of which front end you're using.
Small Context, Big Results
The beautiful thing about using context in AI workflows is that you don't need an awful lot of it to achieve great things. Okay, getting personalized beer recommendations from a bot might be stretching the definition of "big results" just a little, but the use case here is one of many that can be easily achieved by taking a slightly deliberate approach to generating context for personal use.
In my opinion, a huge opportunity is being missed in leveraging more modest implementations of RAG workflows for personal inference.
When most people think about RAG, they think about vectorizing enormous enterprise document stores. But it can be used very effectively for personal purposes, too. Consider somebody building a job-search assistant: they can curate a personal knowledge store with their resume and career aspirations, and even update it with their interview statuses as they go along.
Generating Knowledge / Data Stores
The vector-store implementation that OpenAI uses in its Assistants has inspired a lot of other tools.
Context data can be very lightweight - it doesn't need to be anything more complicated than markdown files or JSON containing snippets of information about a specific topic. I've personally adopted the nickname “context snippets” to describe these documents, for want of a better term.
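To make "context snippet" concrete, here is a minimal sketch of one such note, rendered as both markdown and JSON. The file names and field names are my own invention for illustration, not any required schema:

```python
import json
from pathlib import Path

# A hypothetical "context snippet": a small, self-contained note on one topic.
# The fields and file names below are illustrative, not a required schema.
snippet = {
    "topic": "beer-preferences",
    "updated": "2024-06-01",
    "notes": [
        "Strongly prefers dry ciders and crisp, hoppy beers.",
        "Dislikes very sweet stouts.",
    ],
}

def snippet_to_markdown(data: dict) -> str:
    """Render a snippet dict as a small markdown document."""
    lines = [f"# {data['topic']}", "", f"_Updated: {data['updated']}_", ""]
    lines += [f"- {note}" for note in data["notes"]]
    return "\n".join(lines)

markdown = snippet_to_markdown(snippet)
Path("beer-preferences.md").write_text(markdown)                          # markdown copy
Path("beer-preferences.json").write_text(json.dumps(snippet, indent=2))   # JSON copy
```

Either format embeds fine; the point is simply that each file covers one narrow topic.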
A popular implementation, which Open WebUI handles very well, is the ability to create data stores consisting of collections of documents around a similar theme. So to get going with this project, I created a knowledge collection called "Daniel's Food and Drink Preferences" and then began generating the requisite files to fill it up.
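If you prefer a scriptable route, Open WebUI also exposes a REST API. The sketch below only builds the request; the base URL, endpoint path, and payload shape are assumptions drawn from recent releases, so verify them against the interactive `/docs` page on your own instance before sending anything:

```python
import json
import urllib.request

BASE_URL = "http://localhost:3000"    # your Open WebUI instance (assumption)
API_KEY = "YOUR_OPEN_WEBUI_API_KEY"   # generated under Settings > Account

def knowledge_request(name: str, description: str) -> urllib.request.Request:
    """Build the POST that creates a knowledge collection.

    The endpoint path is an assumption based on recent Open WebUI releases;
    check <BASE_URL>/docs on your own instance, as the API has changed
    between versions.
    """
    payload = json.dumps({"name": name, "description": description}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/knowledge/create",
        data=payload,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = knowledge_request(
    "Daniel's Food and Drink Preferences",
    "Personal taste notes for the beer assistant",
)
# urllib.request.urlopen(req)  # uncomment to actually send the request
```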
I Love Dry Cider. Do You?
I like to take the approach of thinking strategically about what I want to include in each vector store.
The example chosen for this demonstration is clearly a flippant one, but for more serious uses (consider perhaps a vector store with your medical and health data) you'll want to think carefully about what type of information would be useful for an agent working with it as context.
In that example, it might be things like your medication list, your health history, and your wellness objectives; in the case of my food and drink knowledge store, it was lighter topics, such as my cider preferences.
Lately, I've become a huge fan of voice typing, and speech-to-text is another of the many technologies that have developed tremendously in recent years thanks to advances in AI. I use a dictation setup in order to jot down these context notes as naturally as possible, and try to imagine that I'm speaking to a friend, giving them all the mundane details about whatever note I'm capturing.
The question of how to get your context data from a convenient format into vector storage needs a little polish in today's tools. You can use platform front ends like the one in the OpenAI Playground (Open WebUI's implementation is a bit better), or, if you're feeling up to the project, you can create your own front end and build a data pipeline for sending your context snippets off for embedding.
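If you do go the DIY-pipeline route, the core of it is just chunking and embedding. The chunker below is a runnable sketch of one simple strategy (greedy, paragraph-based); the embedding call is left commented out because it requires an OpenAI API key, though `text-embedding-3-small` is a real OpenAI embedding model:

```python
# A minimal sketch of a DIY embedding pipeline: split snippets into chunks,
# then send each chunk to an embedding endpoint.

def chunk_text(text: str, max_chars: int = 500) -> list[str]:
    """Greedy paragraph-based chunking: pack whole paragraphs into chunks
    of roughly max_chars characters each."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks

notes = """I love dry cider above almost anything else.

For beer, I lean toward crisp pilsners and hoppy IPAs.

Very sweet stouts are not for me."""

chunks = chunk_text(notes, max_chars=80)

# from openai import OpenAI
# client = OpenAI()  # reads OPENAI_API_KEY from the environment
# vectors = [
#     client.embeddings.create(model="text-embedding-3-small", input=c).data[0].embedding
#     for c in chunks
# ]
```

From there, the vectors go into whatever store your front end queries at inference time.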
Context Data Is In Flux, Like Life
I didn't start this blog on any kind of mission to promote Open WebUI, but having gotten into it, I feel the need to commend their implementation of this feature. I believe that a very important facet of context for personal uses is the fact that it is a living body of data.
This is where context implementations that are really only designed for one-time writing are, in my opinion, rather flawed. Some pieces of context data, like the city we were born in, remain constant, but others, like whether we're looking for a job or how many rooms are in our apartment, might be in a periodic state of flux as our life circumstances evolve.
The most powerful mechanisms for managing personal context are those that allow the context store to be edited, deleted and added to just like any other pool of textual data.
Now, The System Prompt
It may seem like this is an awful lot of work, but once you get the hang of configuring assistants with personal context data, you'll discover that the time invested is well worth it.
To achieve agent-like behavior from a standard large language model API endpoint, the next requirement is to configure a system prompt that modifies the default behavior of the model and homes it in on the objective of assisting with whatever we configured the agent for.
These don't need to be pieces of poetry or works of art. They just need to be instructive and determinative enough to guide the model toward the expected and desired behavior.
For the purpose of this example:
“You are the food and drink advisory assistant to the user Daniel Rosehill. Your name is Dave. Daniel will ask you to provide recommendations from a menu, which he may supply as an image upload. Quickly parse and analyse the contents of the menu and provide Daniel with a recommendation based upon your knowledge of his tastes, which is in your context.”
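Under the hood, a front end assembles something akin to the following OpenAI-style chat request: the system prompt plus the retrieved context snippets go in ahead of the user's question. This is a simplified sketch, not Open WebUI's actual internals, and the model name and snippet text are placeholders:

```python
# Simplified sketch of how a system prompt, retrieved context, and the user's
# question combine into one chat request. Not any front end's real internals.

SYSTEM_PROMPT = (
    "You are the food and drink advisory assistant to the user Daniel Rosehill. "
    "Your name is Dave. Provide recommendations based on Daniel's tastes, "
    "which appear in your context."
)

def build_messages(context_snippets: list[str], question: str) -> list[dict]:
    """Assemble the message list: system prompt, retrieved context, question."""
    context_block = "\n\n".join(context_snippets)
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "system", "content": f"Context about Daniel:\n{context_block}"},
        {"role": "user", "content": question},
    ]

messages = build_messages(
    ["Daniel loves dry cider and crisp pilsners."],
    "What should I order from this menu?",
)
# from openai import OpenAI
# reply = OpenAI().chat.completions.create(model="gpt-4o", messages=messages)
```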
Which Underlying LLM To Use?
Deciding on the model is a matter of preference and sometimes also of budget.
If you're configuring a high-volume personal context agent, perhaps something like generating personalized cover letters, then you might go for something like GPT-3.5 for the relatively easy task of text editing.
But you might want a model with stronger reasoning abilities for more involved and complicated usages.
Finally, Connect Agent To Knowledge
Where connecting curated personal context stores to agents becomes positively transformative in my opinion is when you begin attaching multiple storage tranches to individual agents.
This allows you fine-grained control over the type of information you wish to ground your agent on, and it provides a compelling reason to carefully gather and segregate your context data, thinking about what kinds of use cases it can support.
But today, I’m just connecting the humble beer advisory tool to the new “food and drink preferences” context pool:
Hey Dave, What Beer Should I Order?
If you've made it this far, then the good news is that we're finally at the eagerly awaited end of this journey, when we can ask our newly minted bot to provide us with some personal beer recommendations.
AI to the rescue, scenario one: you're in an airport bar somewhere in Germany, you've no idea what any of these beers are, and you need your trusty AI sidekick to guide you toward something you'll have a better-than-even chance of enjoying:
Good thing we have our new on-call personal beer assistant on hand!
“Dave” has reviewed its context and provided you with recommendations based upon what you've told it about the type of beer you enjoy.
But wait, there's more!
Assuming the underlying model you chose is vision-capable, you can provide a screenshot, or take a photo of a beer menu or of the taps visible at the bar, and then ask your personal assistant to give you its sage advice.
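At the API level, attaching that photo to the question looks roughly like the OpenAI-style request below: images travel as base64 data URLs inside the message content. The exact content schema varies by provider, and the model name is a placeholder:

```python
import base64

# Sketch of a vision request: raw image bytes become a base64 data URL
# inside an "image_url" content part, alongside the text of the question.

def image_to_data_url(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data URL for an image_url content part."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"

def vision_message(question: str, image_bytes: bytes) -> dict:
    """A single user message combining text and an image."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {"type": "image_url",
             "image_url": {"url": image_to_data_url(image_bytes)}},
        ],
    }

msg = vision_message("Which of these beers would I enjoy?", b"\xff\xd8fake-jpeg-bytes")
# from openai import OpenAI
# reply = OpenAI().chat.completions.create(model="gpt-4o", messages=[msg])
```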
To model that experience, I picked a random craft beer menu from the internet.
And prompted like this:
Dave comes to the rescue again, successfully deciphering the text out of the menu and then providing recommendations based on context.
Ways You Can Use Context In Personal RAG Workflows
Generate a context repository for holding little pieces of information that you commonly need to retrieve like your ZIP code or other IDs. Needless to say, this idea involves some privacy risk, and you should make sure that all the tools in your stack are trustworthy before committing personal details into digital systems. This context data can be provided to a general assistant or multiple assistants to be on hand with this information whenever you require it.
Generate a household management repository containing all the little bits and pieces of information that keep your house in working order. This might include recurrent grocery lists, lists of chores, a maintenance calendar, and other things that keep your house from imploding into disaster. By populating and maintaining this one pool of data, you can keep multiple assistants in good working order. You might, for example, have a grocery buying assistant, another for chores, etc.
If you're tired of arguing over what movie to watch, or you have very particular tastes in movies, why not develop your own entertainment context repository containing not only your preferences but also a running log of what you've viewed? Okay, this probably requires way more organization and willpower than even the most ardent movie buff is likely to muster. But it demonstrates how some small context curation can replicate the kind of experiences offered by many personalized LLM tools.
Written by

Daniel Rosehill
I believe that open source is a way of life. If I figure something out, I try to pass on what I know, even if it's the tiniest unit of contribution to the vast sum of human knowledge. And... that's what led me to set up this page!