Building Hybrid Semantic Search with MongoDB Atlas, FastAPI (Sentence Transformers) & JavaScript

Varun Kumawat
11 min read

In this article, we'll implement hybrid semantic search using MongoDB Atlas Full-Text Search and Vector Search. This approach powers a fast, relevant search experience by combining keyword and semantic understanding. We'll use a FastAPI backend for creating vector embeddings and a Next.js 15.2.2 frontend.

Note: This guide assumes you're already familiar with MongoDB, REST APIs, JWT Authentication, Next.js, FastAPI, and basic frontend/backend integration. It’s best suited for intermediate to advanced developers looking to implement hybrid search in production-ready apps. Please follow the official MongoDB guide here.

Three search modes are involved:

  • Full-Text Search matches exact keywords or similar phrases using fuzzy matching.

  • Vector Search captures semantic meaning.

  • Hybrid Search uses both, ranking documents based on a blend of lexical and semantic relevance.

Semantic search goes beyond keywords by understanding the meaning behind the text. Instead of relying solely on exact string matches, it leverages machine learning to match based on context. Read more.

How It Works

  1. Create Embeddings: We use models like all-MiniLM-L6-v2 from sentence-transformers to convert text into fixed-length numerical vectors.

  2. Store Embeddings: These embeddings are stored in the embedding field of each document.

  3. Compare with Cosine Similarity: During a search, we embed the query and compare it against stored embeddings using cosine similarity, which measures angular closeness between vectors.

  4. Fetch Closest Matches: Documents with vectors closest to the query vector are considered semantically relevant.

This gives semantic search several advantages:

  • Finds relevant results even if keywords are not exact.

  • Understands synonyms, paraphrases, and context.

  • Works great for user-generated content, Q&A, or long-form text.
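Step 3 above, cosine similarity, is simple enough to sketch directly. The snippet below is illustrative; real embeddings from all-MiniLM-L6-v2 are 384-dimensional, but the math is the same:

```typescript
// cos(a, b) = (a · b) / (||a|| · ||b||) — 1 means same direction, 0 means orthogonal.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 2-D vectors: parallel vectors score 1, orthogonal vectors score 0.
console.log(cosineSimilarity([1, 0], [2, 0])); // 1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```

Because cosine similarity only measures the angle between vectors, two tweets with similar meaning but different lengths still score close to 1.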

Architecture

This section explains the overall architecture and flow of implementing hybrid semantic search using MongoDB Atlas, FastAPI, and Next.js 15.2.2. We'll also cover key considerations like embedding management, security, and communication across services.

System Components

  • Frontend: Built with Next.js, responsible for rendering UI, capturing user queries, and displaying results.

  • Backend: A FastAPI service generates vector embeddings for documents and queries.

  • Database: MongoDB Atlas, with Full-Text and Vector Search indexes for hybrid querying.

Secure Backend Communication with JWT

We use HTTP-only cookies to store JWT tokens for authentication. Because these cookies cannot be accessed via JavaScript (for security reasons), the frontend cannot directly attach them to API requests sent to the FastAPI backend.

Solution: Proxy Requests via Next.js API Routes

To work around this limitation:

  1. The frontend sends requests to a Next.js API route.

  2. This API route forwards the request to the FastAPI backend.

  3. Since the request comes from the same origin, the HTTP-only cookie is sent automatically.

This approach avoids CORS issues and keeps tokens secure.
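A minimal sketch of such a proxy route (App Router style). The route path, the cookie name `token`, and the `FASTAPI_URL` environment variable are assumptions here, so adapt them to your app:

```typescript
// app/api/embed-query/route.ts — illustrative sketch, not the app's actual code.
import { NextRequest, NextResponse } from "next/server";

export async function POST(req: NextRequest) {
  // The HTTP-only cookie arrives automatically on this same-origin request.
  const token = req.cookies.get("token")?.value;
  if (!token) {
    return NextResponse.json({ error: "Unauthorized" }, { status: 401 });
  }

  // Forward the request to FastAPI, translating the cookie into a Bearer token.
  const body = await req.json();
  const res = await fetch(`${process.env.FASTAPI_URL}/embed-query`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify(body),
  });

  return NextResponse.json(await res.json(), { status: res.status });
}
```

Because the browser only ever talks to the Next.js origin, no CORS configuration is needed and the token never touches client-side JavaScript.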

Alternative Without Proxy

If you prefer to avoid a proxy, here are your options:

1. Use Bearer Tokens

Store the JWT in localStorage or sessionStorage, and manually attach it in the Authorization header. This is easier but less secure and not recommended, since the token is accessible via JavaScript.

2. Configure CORS with Cookies

If the frontend and backend are on different domains:

  • Configure CORS on the FastAPI backend to allow credentials.

  • Set cookies with SameSite=None; Secure=true.

  • Use credentials: "include" on fetch calls from the frontend (or withCredentials: true if you use axios).
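For example, a cross-origin call that forwards the cookie might look like this (the backend URL is a placeholder):

```typescript
// Assumes the FastAPI backend sets an HTTP-only cookie with SameSite=None; Secure
// and its CORS config allows this origin with credentials.
const res = await fetch("https://api.example.com/embed-query", {
  method: "POST",
  credentials: "include", // fetch's equivalent of axios's withCredentials: true
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ query: "hello world" }),
});
```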

Sample Data

We will focus on the tweets collection and implement the Hybrid Semantic Search on it.

Tweet Document Schema

import mongoose from "mongoose";
import moment from "moment";

const tweetSchema = new mongoose.Schema(
  {
    content: { type: String },
    tag: { type: String },
    image: { type: String },
    postedBy: { type: mongoose.Schema.Types.ObjectId, ref: "User" },
    postedTweetTime: {
      type: String,
      // Use a function so the default is computed per document, not once at schema load.
      default: () => moment().format("MMMM Do YYYY, h:mm:ss a"),
    },
    likes: { type: Array },
    likeTweetBtn: { type: String, default: "black" },
    retweetBtn: { type: String, default: "black" },
    retweetedByUser: { type: String },
    isRetweeted: { type: Boolean, default: false },
    retweets: { type: Array, default: [] },
    isEdited: { type: Boolean, default: false },
    shares: { type: Number, default: 0 },
    comments: [{ type: mongoose.Schema.Types.ObjectId, ref: "Comment" }],
    retweetedFrom: { type: mongoose.Schema.Types.ObjectId, ref: "Tweet" },
    embedding: {
      type: [Number],
    },
    embeddingUpdatedAt: {
      type: Date,
      default: null,
    },
  },
  { timestamps: true }
);

Backend: FastAPI Embedding Service

We use sentence-transformers to embed text content. Read more

Installation

pip install fastapi uvicorn sentence-transformers pydantic python-dotenv pymongo PyJWT

# app.py
import os
import logging
import traceback
from datetime import datetime

import jwt
from dotenv import load_dotenv
from fastapi import FastAPI, HTTPException, Request, status
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse
from jwt.exceptions import ExpiredSignatureError, InvalidTokenError
from pydantic import BaseModel
from pymongo import MongoClient
from sentence_transformers import SentenceTransformer

load_dotenv()
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

app = FastAPI()

app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:3000"],  # Add your frontend origin Or "*" to allow all
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

model = SentenceTransformer("all-MiniLM-L6-v2")
client = MongoClient(os.environ["MONGODB_URI"]) # Create this variable in a .env file
db = client["your_db_name"]
tweets = db["your_collection_name"]

class QueryInput(BaseModel):
    query: str

JWT_SECRET = os.environ["JWT_SECRET"]
ALGORITHM = "HS256"

def verify_jwt_from_request(request: Request):
    auth_header = request.headers.get("Authorization")
    if not auth_header or not auth_header.startswith("Bearer "):
        raise HTTPException(status_code=401, detail="Unauthorized: Missing or invalid token")
    token = auth_header.split(" ")[1]
    try:
        payload = jwt.decode(token, JWT_SECRET, algorithms=[ALGORITHM])
        return payload  # you can access payload["id"] if you included user id
    except ExpiredSignatureError:
        raise HTTPException(status_code=401, detail="Token expired")
    except InvalidTokenError:
        raise HTTPException(status_code=401, detail="Invalid token")

@app.post("/embed-doc")
async def embed_doc(request: Request):
    try:
        verify_jwt_from_request(request)
        cursor = tweets.find({
            "$or": [
                {"embedding": {"$exists": False}},
                {"embeddingUpdatedAt": {"$exists": False}},
                # Also pick up documents edited after their embedding was last generated.
                {"$expr": {"$gt": ["$updatedAt", "$embeddingUpdatedAt"]}}
            ]
        })
        updated_count = 0
        for doc in cursor:
            updated_at = doc.get("updatedAt", datetime.min)
            embedding_updated_at = doc.get("embeddingUpdatedAt", datetime.min)
            if embedding_updated_at >= updated_at:
                continue
            content = f"{doc.get('content', '')} {doc.get('tag', '')}".strip()
            if not content:
                continue
            try:
                embedding = model.encode(content).tolist()
                tweets.update_one(
                    {"_id": doc["_id"]},
                    {"$set": {"embedding": embedding, "embeddingUpdatedAt": datetime.utcnow()}}
                )
                updated_count += 1
            except Exception as embed_err:
                logger.error(f"Failed to embed tweet {doc.get('_id')}: {embed_err}")
                continue
        return JSONResponse(
            status_code=status.HTTP_200_OK,
            content={"status": "success", "message": f"Embeddings created for {updated_count} tweet(s)"}
        )
    except HTTPException:
        raise  # Let FastAPI return the proper 401 from verify_jwt_from_request
    except Exception as e:
        logger.error(f"Unexpected error in /embed-doc: {traceback.format_exc()}")
        return JSONResponse(
            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            content={"status": "error", "message": str(e)}
        )

@app.post("/embed-query")
async def embed_query(input: QueryInput, request: Request):
    try:
        verify_jwt_from_request(request)
        embedding = model.encode(input.query).tolist()
        logger.info("Embedding successfully generated")
        return {"embedding": embedding}
    except HTTPException:
        raise  # Let FastAPI return the proper 401 from verify_jwt_from_request
    except Exception as e:
        logger.error(f"Error in /embed-query: {str(e)}")
        return JSONResponse(
            status_code=500,
            content={"status": "error", "message": str(e)}
        )

Overview of app.py

This FastAPI app has two main roles:

  1. /embed-doc
    Scans your MongoDB tweets collection, generates embeddings for new or updated documents using the all-MiniLM-L6-v2 model, and stores these vectors in the database.

  2. /embed-query
    Receives a search query, generates its embedding, and returns it for use in semantic search.

  3. JWT Token Validation
    Includes a verify_jwt_from_request() helper that extracts and verifies a JWT from the Authorization header. It ensures only authenticated users can access protected endpoints.

The app uses CORS middleware to allow secure communication from your frontend and includes error handling to ensure smooth operation.

When to Call /embed-doc

Whenever a new tweet is created or updated, call /embed-doc from your frontend to create or update embeddings:

// Replace `url` with the full origin of your FastAPI backend, e.g., "http://localhost:8000".
// If you call FastAPI directly (not through a Next.js proxy route), attach the JWT yourself:
const response = await fetch(`${url}/embed-doc`, {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${token}`, // token from your auth flow
  },
});

Hybrid Search Aggregation Pipeline

Here’s the complete aggregation logic used to combine results:

const vectorWeight = 0.3;
const fullTextWeight = 0.7;

export const getAggregatePipeline = (
  query: string,
  queryEmbedding: number[]
) => {
  const pipeline: any[] = [];

  const vectorPipeline = [
    {
      $vectorSearch: {
        index: "tweets-vector-index",
        path: "embedding",
        queryVector: queryEmbedding,
        numCandidates: 450,
        limit: 50,
      },
    },
    { $group: { _id: null, docs: { $push: "$$ROOT" } } },
    { $unwind: { path: "$docs", includeArrayIndex: "rank" } },
    {
      $addFields: {
        vs_score: {
          $multiply: [
            vectorWeight,
            { $divide: [1.0, { $add: ["$rank", 60] }] },
          ],
        },
      },
    },
    {
      $project: {
        vs_score: 1,
        _id: "$docs._id",
        content: "$docs.content",
        postedBy: "$docs.postedBy",
        postedTweetTime: "$docs.postedTweetTime",
        retweetBtn: "$docs.retweetBtn",
        likeTweetBtn: "$docs.likeTweetBtn",
        tag: "$docs.tag",
        isRetweeted: "$docs.isRetweeted",
        isEdited: "$docs.isEdited",
        image: "$docs.image",
        createdAt: "$docs.createdAt",
        updatedAt: "$docs.updatedAt",
        likes: "$docs.likes",
        retweets: "$docs.retweets",
        comments: "$docs.comments",
      },
    },
  ];

  const fullTextPipeline = [
      {
        $search: {
          index: "tweets-text-index",
          text: {
            query,
            path: ["content", "tag"],
            fuzzy: {
              maxEdits: 2,
              prefixLength: 1,
              maxExpansions: 50,
            },
          },
        },
      },
      {
        $addFields: {
          // Apply the declared full-text weight so both branches are blended.
          fts_score: { $multiply: [fullTextWeight, { $meta: "searchScore" }] },
        },
      },
      { $sort: { fts_score: -1 } },
      { $limit: 50 },
      // Lookup postedBy user
      {
        $lookup: {
          from: "users",
          localField: "postedBy",
          foreignField: "_id",
          as: "postedByUser",
        },
      },
      {
        $unwind: {
          path: "$postedByUser",
          preserveNullAndEmptyArrays: true,
        },
      },
      {
        $addFields: {
          postedBy: {
            _id: "$postedByUser._id",
            username: "$postedByUser.username",
            avatar: "$postedByUser.avatar",
            bio: "$postedByUser.bio",
          },
        },
      },
      // Lookup comments
      {
        $lookup: {
          from: "comments",
          localField: "comments",
          foreignField: "_id",
          as: "comments",
        },
      },
      {
        $unwind: {
          path: "$comments",
          preserveNullAndEmptyArrays: true,
        },
      },
      {
        $lookup: {
          from: "users",
          localField: "comments.postedBy",
          foreignField: "_id",
          as: "comments.postedByUser",
        },
      },
      {
        $unwind: {
          path: "$comments.postedByUser",
          preserveNullAndEmptyArrays: true,
        },
      },
      {
        $addFields: {
          "comments.postedBy": {
            _id: "$comments.postedByUser._id",
            username: "$comments.postedByUser.username",
            avatar: "$comments.postedByUser.avatar",
            bio: "$comments.postedByUser.bio",
          },
        },
      },
      {
        $project: {
          "comments.postedByUser": 0,
        },
      },
      // Regroup comments
      {
        $group: {
          _id: "$_id",
          doc: { $first: "$$ROOT" },
          comments: { $push: "$comments" },
        },
      },
      {
        $addFields: {
          "doc.comments": "$comments",
        },
      },
      {
        $replaceRoot: {
          newRoot: "$doc",
        },
      },
      // Final projection
      {
        $project: {
          fts_score: 1,
          _id: 1,
          content: 1,
          postedBy: 1,
          postedTweetTime: 1,
          retweetBtn: 1,
          likeTweetBtn: 1,
          tag: 1,
          isRetweeted: 1,
          retweetedByUser: 1,
          isEdited: 1,
          image: 1,
          createdAt: 1,
          updatedAt: 1,
          likes: 1,
          retweets: 1,
          comments: 1,
          followers: 1,
        },
      },
  ];

  pipeline.push(
    ...vectorPipeline,
    {
      $unionWith: {
        coll: "tweets", // the same collection, searched again with the full-text pipeline
        pipeline: fullTextPipeline,
      },
    },
      // Add combined_score before grouping
      {
        $addFields: {
          combined_score: {
            $add: [
              { $ifNull: ["$fts_score", 0] },
              { $ifNull: ["$vs_score", 0] },
            ],
          },
        },
      },
      {
        $group: {
          _id: "$_id",
          doc: { $first: "$$ROOT" },
        },
      },
      {
        $replaceRoot: {
          newRoot: "$doc",
        },
      },
      {
        $sort: { combined_score: -1 },
      },
      {
        $limit: 50,
      }
  )

  return pipeline;
};

Overview of getAggregatePipeline

This function generates a MongoDB aggregation pipeline to perform hybrid search combining vector similarity and full-text search on a "tweets" collection.

Key parts:

  1. Vector Search Pipeline (vectorPipeline):

    • Uses MongoDB’s $vectorSearch to find documents whose embedding vectors are closest to the query vector (queryEmbedding).

    • Limits to 50 results and calculates a vector search score (vs_score) weighted by vectorWeight.

    • Projects necessary tweet fields for the final output.

  2. Full-Text Search Pipeline (fullTextPipeline):

    • Uses $search to perform a fuzzy full-text search on content and tag fields.

    • Assigns a full-text score (fts_score) based on relevance and sorts by it.

    • Performs lookups to join user info (postedByUser) and comments with commenter info for richer results.

  3. Combining Results:

    • Uses $unionWith to merge full-text search results into the vector search pipeline.

    • Adds a new field combined_score which sums the weighted vector score and full-text score.

    • Groups by document _id to remove duplicates, then sorts by the combined score descending, returning the top 50.
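The scoring arithmetic above can be checked in isolation. This sketch mirrors the pipeline's constants; the function names are illustrative:

```typescript
const vectorWeight = 0.3;

// Vector branch: reciprocal-rank score, weight * 1 / (rank + 60).
// The constant 60 damps the gap between adjacent ranks.
const vsScore = (rank: number): number => vectorWeight * (1 / (rank + 60));

// combined_score adds whichever scores a document earned;
// $ifNull treats a missing branch as 0.
const combinedScore = (vs: number | null, fts: number | null): number =>
  (vs ?? 0) + (fts ?? 0);

console.log(vsScore(0));               // ≈ 0.005 for the top-ranked vector hit
console.log(combinedScore(null, 1.2)); // 1.2 — document found only by full-text search
```

A document surfaced by both branches earns both scores, which is what pushes it above documents found by only one.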

Key Stages Explained

  • $search: Full-text search with fuzzy matching (maxEdits, prefixLength).

  • $vectorSearch: Finds semantically close vectors.

  • $group + $unwind: Used to rank and unpack results.

  • $addFields: Injects a score modifier for text/vector weighting.

  • $project: Selects final shape of each document.

  • $lookup: Joins with users and comments to populate user data.

  • $replaceRoot: Restructures document after transformations.

  • combined_score: Used to merge both result types (vector + text).

To understand in detail, please follow the official documentation here.
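Tying it together, a server-side search handler might use the pipeline like this. This is a sketch: `Tweet` (a mongoose model), `FASTAPI_URL`, and `token` are illustrative names, not the app's actual code:

```typescript
// Sketch of the end-to-end search flow, assuming getAggregatePipeline from above.
async function hybridSearch(query: string, token: string) {
  // 1. Ask the FastAPI service to embed the query.
  const res = await fetch(`${process.env.FASTAPI_URL}/embed-query`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify({ query }),
  });
  const { embedding } = await res.json();

  // 2. Run the hybrid aggregation against the tweets collection.
  return Tweet.aggregate(getAggregatePipeline(query, embedding));
}
```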

MongoDB Atlas: Creating Indexes

To enable hybrid search, you’ll need to create two indexes in MongoDB Atlas: a full-text search index and a vector search index.

Full-Text Index (tweets-text-index)

Steps to create it:

  1. Go to MongoDB Atlas and open your cluster.

  2. Navigate to the Search tab in the left-hand sidebar.

  3. Click “Create Search Index”.

  4. Choose:

    • Database: your database name (e.g., myapp)

    • Collection: tweets

    • Index Name: tweets-text-index

  5. In the Definition tab, switch to JSON mode and paste:

     {
       "mappings": {
         "dynamic": false,
         "fields": {
           "content": [{ "type": "string" }],
           "tag": [{ "type": "string" }]
         }
       }
     }
    

Vector Index (tweets-vector-index)

Steps to create it:

  1. While still in the Search tab, click “Create Search Index” again.

  2. Choose:

    • Database: your database name

    • Collection: tweets

    • Index Name: tweets-vector-index

  3. Switch to the JSON mode and paste:

     {
       "fields": [
         {
           "type": "vector",
           "path": "embedding",
           "numDimensions": 384,
           "similarity": "cosine"
         }
       ]
     }
    

Index Sync Behavior

  • MongoDB Atlas automatically keeps these indexes in sync when documents are inserted or updated.

  • Important: Ensure that the embedding field exists in documents — otherwise, the vector index will skip those documents.
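If you prefer code over the Atlas UI, recent MongoDB drivers and mongosh expose createSearchIndex. This is a sketch only — verify support and the exact signature for your driver version; the definitions match the two indexes above:

```typescript
// Assumes a connected MongoDB Node driver (≥ 6.x) `db` handle on an Atlas cluster.
await db.collection("tweets").createSearchIndex({
  name: "tweets-text-index",
  type: "search",
  definition: {
    mappings: {
      dynamic: false,
      fields: {
        content: [{ type: "string" }],
        tag: [{ type: "string" }],
      },
    },
  },
});

await db.collection("tweets").createSearchIndex({
  name: "tweets-vector-index",
  type: "vectorSearch",
  definition: {
    fields: [
      { type: "vector", path: "embedding", numDimensions: 384, similarity: "cosine" },
    ],
  },
});
```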

Final Showcase: Hybrid Search in Action

Here's how it all comes together:

  1. User enters a search query in the Next.js frontend.

  2. The query is sent to an API route (either directly or via proxy).

  3. FastAPI backend embeds the query into a vector.

  4. A MongoDB Atlas aggregation pipeline runs:

    • A $vectorSearch stage finds documents with similar embeddings.

    • A $search stage performs full-text search on relevant fields.

    • Scores from both are combined using weighted scoring logic.

    • Documents are sorted by the combined score and returned.

  5. Next.js renders the results on the frontend.

This hybrid technique enables:

  • Contextual matching even with vague or paraphrased queries.

  • Keyword precision when specific terms are used.

  • Fast and scalable search across your documents.

Together, this architecture provides a modern, production-ready search experience with real semantic understanding. You now have a complete foundation for building semantic + keyword search in real-world applications using MongoDB Atlas, FastAPI, and Next.js.

Check out the working version here. (I built a Twitter clone that uses this search module to intelligently search users and tweets; get the full code here.)


Written by

Varun Kumawat

Developer. Founder, DevHub. Mentor.