Build a Context-Aware Site Search with AI integration

It's 2023 and business search engines are innovating alongside top consumer search engines. In this post, we will be using ReactiveSearch's versatile stack for integrating AI and bringing context awareness to site search.

ReactiveSearch platform

Understanding ReactiveSearch

With native connectors for Elasticsearch, OpenSearch, Solr and OpenAI, ReactiveSearch enables businesses to choose an incremental search improvement and adoption strategy over betting on a completely new search solution.

How it works

Connect a search engine index, an LLM such as OpenAI's ChatGPT, or an HTTP endpoint as a search data source.

Then, author a pipeline that takes an HTTP request as input, modifies the request using either ReactiveSearch's pre-built stages or custom JavaScript, queries search sources via connectors, and then transforms, re-orders and optionally enriches the response before it is served to the client. This orchestration happens in milliseconds: in our benchmarks, a typical pipeline doing the above adds about 10ms of overhead and scales up to 300 QPS (roughly 1MM requests in an hour). Executing JavaScript in V8 engine isolates contributes significantly to keeping this ⚡️ fast.
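
To make the flow above concrete, here is a minimal sketch of a pipeline modeled as a chain of middleware functions. It is a simplified, synchronous mental model and not ReactiveSearch's implementation: `runPipeline`, the three stage functions and the hard-coded catalog are hypothetical names introduced only for illustration.

```javascript
// Each stage is a function from context to context (simplified, synchronous model).
function runPipeline(stages, context) {
    return stages.reduce((ctx, stage) => stage(ctx), context);
}

// Stage 1: modify the incoming request (here: normalize the query text).
function normalizeQuery(ctx) {
    const body = JSON.parse(ctx.request.body);
    const value = String(body.value || '').trim().toLowerCase();
    return { ...ctx, request: { ...ctx.request, body: JSON.stringify({ ...body, value }) } };
}

// Stage 2: query a search source (a hard-coded list stands in for a real connector).
function queryCatalog(ctx) {
    const { value } = JSON.parse(ctx.request.body);
    const catalog = ['laptop', 'laptop bag', 'phone'];
    const hits = catalog.filter((name) => name.includes(value));
    return { ...ctx, response: { body: JSON.stringify({ hits }) } };
}

// Stage 3: transform the response before it is served (here: add a total count).
function addTotal(ctx) {
    const body = JSON.parse(ctx.response.body);
    return { ...ctx, response: { body: JSON.stringify({ ...body, total: body.hits.length }) } };
}

const result = runPipeline([normalizeQuery, queryCatalog, addTotal], {
    request: { body: JSON.stringify({ value: '  Laptop ' }) },
});
console.log(result.response.body); // {"hits":["laptop","laptop bag"],"total":2}
```

Every stage receives the context produced by the previous one, which is why a stage can change the request before the query runs or enrich the response after it.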

Pipeline Illustration

As the above image illustrates, the orchestration of the search pipeline is written in YAML (or JSON), and each stage represents a middleware. Here, authorization, reactivesearchQuery and elasticsearchQuery are pre-built stages, while the user writes JavaScript for the google_knowledge_graph and merge_response stages: the first makes a fetch request to Google's Knowledge Graph API, and the second merges the search engine and KG responses before the result is served to the client. Stages that are I/O heavy and independent can be executed asynchronously, as is the case here for the KG API call and the Elasticsearch query stages.
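
The idea of running independent I/O stages side by side and merging their results can be sketched in plain JavaScript with `Promise.all`. This is an illustration under assumptions, not the platform's scheduler: `searchEngineStage`, `knowledgeGraphStage`, `mergeResponses` and `runParallel` are hypothetical names, and both calls return canned data instead of contacting Elasticsearch or the Knowledge Graph API.

```javascript
// Stand-ins for the two independent I/O calls; real stages would query
// Elasticsearch and the Knowledge Graph API instead of returning canned data.
async function searchEngineStage(query) {
    return { hits: [{ name: `${query} result` }] };
}

async function knowledgeGraphStage(query) {
    return { entity: { name: query, description: 'canned description' } };
}

// Merge step: combine both responses into the payload served to the client.
function mergeResponses(searchResponse, kgResponse) {
    return { ...searchResponse, knowledgeGraph: kgResponse.entity };
}

async function runParallel(query) {
    // Both calls start right away; the wait is roughly that of the slower one.
    const [searchResponse, kgResponse] = await Promise.all([
        searchEngineStage(query),
        knowledgeGraphStage(query),
    ]);
    return mergeResponses(searchResponse, kgResponse);
}

runParallel('oled tv').then((merged) => console.log(JSON.stringify(merged)));
```

Because neither call depends on the other's output, the merge step is the only place where the two results meet.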

Pipelines come with starter templates for over 10 use cases today, and the authoring interface uses the Monaco editor (which powers VS Code) with autocompletion and usage tooltips to help you with the syntax. The JavaScript syntax supported by V8 is very close to that of browser JavaScript runtimes. We will illustrate all of this with the use case of building an E-Commerce search pipeline.

Context-Aware Search

A context-aware E-Commerce search should:

  • Display Recent and Popular suggestions when no query is entered.
  • Provide relevant suggestions during active user input.
  • Offer AI-generated answers to user questions.
  • Handle typos during product searches.

To visualize the end result, refer to the embedded demo below.

Search Index

Before we build the search, we will need to index the data. For this use case, we've ingested 100,000 products from different categories through the Best Buy developer API. You likely already have a search index configured in Elasticsearch, OpenSearch or Solr, so we will skip over this part.

We will start with the ReactiveSearch dashboard and take a closer look at both the search index and the Browse Data functionality.

At this point, you probably want to get a free 14-day evaluation ↗️ of the ReactiveSearch platform. Choose Search Cluster for a green field project, or choose Serverless Search if you're connecting to an existing search index.

Building the Search Pipeline

Let's take a closer look at the process of building a search pipeline. We will start out with a pre-built pipeline template, orchestrate the pipeline stages with JSON and then write some JavaScript for building the context-aware search query.

Since we intend to use the no-code UI builder for building the search UI in the following step, we will assume the request payload uses the ReactiveSearch API, a declarative API for capturing search intent. Pipelines themselves make no assumptions about the use of the ReactiveSearch API, though using it makes it easier to work with the pre-built stages as well as the no-code UI builder.

The orchestration for the pipeline that we defined is as follows:

{
    "enabled": true,
    "description": "Best Buy Search Pipeline",
    "routes": [
        {
            "path": "/best-buy-set-pipeline/_reactivesearch",
            "method": "POST",
            "classify": {
                "category": "reactivesearch"
            }
        }
    ],
    "envs": {
        "index": [
            "best-buy-set-2023"
        ]
    },
    "stages": [
        {
            "id": "auth",
            "use": "authorization"
        },
        {
            "id": "generateRequest",
            "scriptRef": "generateRequest",
            "continueOnError": true
        },
        {
            "id": "query",
            "use": "reactivesearchQuery",
            "continueOnError": false
        },
        {
            "id": "es_query",
            "use": "elasticsearchQuery",
            "continueOnError": false
        },
        {
            "id": "typo check",
            "scriptRef": "checkTypo",
            "continueOnError": false
        },
        {
            "id": "researchQuery",
            "use": "reactivesearchQuery",
            "continueOnError": false,
            "trigger": {
                "expression": "context.envs.research == true"
            }
        },
        {
            "id": "research_es_query",
            "use": "elasticsearchQuery",
            "continueOnError": false,
            "trigger": {
                "expression": "context.envs.research == true"
            }
        },
        {
            "id": "answerAI",
            "use": "AIAnswer",
            "inputs": {
                "topDocsForContext": 3,
                "docTemplate": "${source.name}",
                "queryTemplate": "Can you tell me about: ${value}",
                "apiKey": "{{ OPENAI_API_KEY }}"
            }
        }
    ]
}
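
One way to read the `trigger` blocks in this config: a stage without a trigger always runs, while a stage with one runs only when its expression holds for the current context. The sketch below is a mental model under that assumption, not how the platform evaluates expressions. It replaces the expression strings with plain predicate functions, and `researchRequested`, `conditionalStages` and `stagesToRun` are hypothetical names.

```javascript
// Mental model only: triggers are plain predicate functions here, whereas the
// pipeline config expresses them as expression strings.
const researchRequested = (context) => context.envs.research == true;

// A subset of the stages from the config, reduced to id and trigger.
const conditionalStages = [
    { id: 'typo check' },
    { id: 'researchQuery', trigger: researchRequested },
    { id: 'research_es_query', trigger: researchRequested },
    { id: 'answerAI' },
];

// Returns the ids of the stages that would run for a given context.
function stagesToRun(stages, context) {
    return stages
        .filter((stage) => !stage.trigger || stage.trigger(context))
        .map((stage) => stage.id);
}

// With research unset or false, the two re-search stages are skipped.
console.log(stagesToRun(conditionalStages, { envs: { research: false } }));
// With research set to true, all four stages run.
console.log(stagesToRun(conditionalStages, { envs: { research: true } }));
```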

Our first custom script, generateRequest, takes the query sent by the browser and sets the search and suggestion queries to run based on the search intent. The intent is detected from the presence or absence of user input and from how specific that input is.
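
The script itself is not reproduced in this post, so the following is only a sketch of what such a request-rewriting stage could look like. It assumes the request body carries a `query` array whose entries have `type` and `value` fields and accept `enableRecentSuggestions` and `enablePopularSuggestions` flags; `detectIntent`, the question-word list and the `intent` context variable are hypothetical.

```javascript
// Hypothetical helper: classify the input as 'empty', 'question' or 'product'.
function detectIntent(value) {
    const text = typeof value === 'string' ? value.trim().toLowerCase() : '';
    if (text === '') return 'empty';
    const questionWords = ['what', 'which', 'how', 'why', 'who', 'when', 'where', 'does', 'is'];
    const firstWord = text.split(/\s+/)[0];
    if (text.endsWith('?') || questionWords.includes(firstWord)) return 'question';
    return 'product';
}

function handleRequest() {
    const body = JSON.parse(context.request.body);
    const queries = body.query || [];
    // Assumption: the first query object carries the user's input in `value`.
    const intent = detectIntent((queries[0] || {}).value);
    const query = queries.map((q) => {
        if (q.type !== 'suggestion') return q;
        // Recent and popular suggestions are only requested when nothing is typed.
        const showDefaults = intent === 'empty';
        return { ...q, enableRecentSuggestions: showDefaults, enablePopularSuggestions: showDefaults };
    });
    return {
        ...context,
        intent, // exposed at the top level of the context for later stages
        request: { ...context.request, body: JSON.stringify({ ...body, query }) },
    };
}
```

Keeping the intent detection in its own small function makes it easy to adjust the heuristics without touching the request rewriting.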

The second custom script, checkTypo, applies typo tolerance. It takes effect when the original query returns no hits.
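
As a rough sketch of how a typo check stage might work, the handler below inspects the response, and if nothing matched, it turns on fuzzy matching and sets the `research` env that the re-search stages are triggered on. Several things here are assumptions rather than the original script: that results are keyed by query id with an Elasticsearch-style `hits.hits` array, that each query object accepts a `fuzziness` field, and that envs can be updated by returning them as part of the context.

```javascript
function handleRequest() {
    const response = JSON.parse(context.response.body);
    // Assumption: results are keyed by query id, each with a hits.hits array.
    const hasHits = Object.values(response).some(
        (result) => result && result.hits && Array.isArray(result.hits.hits) && result.hits.hits.length > 0
    );
    if (hasHits) {
        // The original query found something: no second search is needed.
        return { ...context, envs: { ...context.envs, research: false } };
    }
    const body = JSON.parse(context.request.body);
    // Assumption: a `fuzziness` field on the query object enables typo tolerance.
    const query = (body.query || []).map((q) =>
        typeof q.value === 'string' && q.value.trim() !== '' ? { ...q, fuzziness: 'AUTO' } : q
    );
    return {
        ...context,
        envs: { ...context.envs, research: true },
        request: { ...context.request, body: JSON.stringify({ ...body, query }) },
    };
}
```

With this shape, a misspelled query pays the cost of a second search only when the exact query comes back empty.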

Below is a template for how a JavaScript stage should be defined.

// your function handler should always be named handleRequest()
function handleRequest() {
    // Accessible variable within the function: context
    // e.g. 1. context.envs contains the envs set by the pipeline and dynamically at runtime
    //      2. JSON.parse(context.request.body) provides the JSON of the request body - useful for changes to the request body
    //      3. JSON.parse(context.response.body) provides the JSON of the response body - useful for enriching the response body
    // The function expects a return value of the context if you're making changes to the request or response body or setting a variable
    // e.g. return { ...context, myVar: myVar }
    console.log('request body is: ', context.request.body);
    var myVar = 'test';
    // sets myVar at the top level of the context
    return { ...context, myVar };
}

As this template shows, a custom JavaScript stage defines a handleRequest() function that may read the request context, modify it, set a new variable in the context or add console logs.

Read the pipeline docs for concepts and how-to guides.

Building the search UI

Now that we have the search pipeline configured, let's consume it with a search UI. For this use case, we will choose to do this with the no-code UI builder. ReactiveSearch UI kit is also a good choice for building a React, Vue or Flutter based search UI.

Summary

ReactiveSearch offers a versatile stack for incrementally adding AI to an existing business search, as an alternative to betting on a completely new solution. It does this by providing connectors to Elasticsearch, OpenSearch, Solr and OpenAI. In this post, we went over setting up the search index, browsing the data, building the search pipeline for context-aware search, and finally building the search UI.

Browse other interactive use-cases of ReactiveSearch over here ↗️

Get a free 14-day evaluation ↗️ of the ReactiveSearch platform

Written by

Siddharth Kothari

CEO @reactivesearch, search engine dx