Building Hybrid Semantic Search with MongoDB Atlas, FastAPI (Sentence Transformers) & JavaScript


In this article, we'll implement hybrid semantic search using MongoDB Atlas Full-Text Search and Vector Search. This approach powers a fast, relevant search experience by combining keyword and semantic understanding. We'll use a FastAPI backend for creating vector embeddings and a Next.js 15.2.2 frontend.
Note: This guide assumes you're already familiar with MongoDB, REST APIs, JWT Authentication, Next.js, FastAPI, and basic frontend/backend integration. It’s best suited for intermediate to advanced developers looking to implement hybrid search in production-ready apps. Please follow the official MongoDB guide here.
Why Hybrid Search?
Full-Text Search matches exact keywords or similar phrases using fuzzy logic.
Vector Search captures semantic meaning.
Hybrid Search uses both, ranking documents based on a blend of lexical and semantic relevance.
What is Semantic Search?
Semantic search goes beyond keywords by understanding the meaning behind the text. Instead of relying solely on exact string matches, it leverages machine learning to match based on context. Read more.
How It Works
Create Embeddings: We use models like all-MiniLM-L6-v2 from sentence-transformers to convert text into fixed-length numerical vectors.
Store Embeddings: These embeddings are stored in the embedding field of each document.
Compare with Cosine Similarity: During a search, we embed the query and compare it against stored embeddings using cosine similarity, which measures angular closeness between vectors.
Fetch Closest Matches: Documents with vectors closest to the query vector are considered semantically relevant.
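To make the comparison step concrete, here is a minimal sketch of cosine similarity and ranking in plain Python. The function names and the toy two-dimensional vectors are illustrative assumptions; real all-MiniLM-L6-v2 embeddings have 384 dimensions, and in this article Atlas Vector Search performs the comparison for you.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, docs):
    """docs is a list of (doc_id, vector) pairs; returns (doc_id, score) best-first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in docs]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    toy_docs = [("a", [0.0, 1.0]), ("b", [1.0, 0.1])]  # hypothetical embeddings
    print(rank_by_similarity([1.0, 0.0], toy_docs))
```

Because cosine similarity depends only on direction, scaling a vector does not change its score, which is why it suits embeddings of texts with different lengths.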
Benefits of Semantic Search
Finds relevant results even if keywords are not exact.
Understands synonyms, paraphrases, and context.
Works great for user-generated content, Q&A, or long-form text.
Architecture
This section explains the overall architecture and flow of implementing hybrid semantic search using MongoDB Atlas, FastAPI, and Next.js 15.2.2. We'll also cover key considerations like embedding management, security, and communication across services.
System Components
Frontend: Built with Next.js, responsible for rendering UI, capturing user queries, and displaying results.
Backend: A FastAPI service generates vector embeddings for documents and queries.
Database: MongoDB Atlas, with Full-Text and Vector Search indexes for hybrid querying.
Secure Backend Communication with JWT
We use HTTP-only cookies to store JWT tokens for authentication. Because these cookies cannot be accessed via JavaScript (for security reasons), the frontend cannot directly attach them to API requests sent to the FastAPI backend.
Solution: Proxy Requests via Next.js API Routes
To work around this limitation:
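The frontend sends requests to a Next.js API route.
Since that request goes to the same origin, the browser attaches the HTTP-only cookie automatically.
The API route reads the token from the cookie on the server and forwards the request to the FastAPI backend, passing the token in the Authorization header that the backend checks.
This approach avoids CORS issues and keeps the token out of reach of client-side JavaScript.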
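The forwarding step boils down to reading the token from the incoming Cookie header and turning it into a Bearer header. In this article that logic would live in the Next.js API route; the sketch below shows the same idea in Python only to keep the examples in one language. The cookie name token and the function name are assumptions, not names taken from the original project.

```python
from http.cookies import SimpleCookie


def bearer_header_from_cookie(cookie_header, cookie_name="token"):
    """Build an Authorization header from a raw Cookie header, or return None.

    cookie_name is a hypothetical name; use whatever your app sets at login.
    """
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    morsel = jar.get(cookie_name)
    if morsel is None or not morsel.value:
        return None  # no token cookie: the proxy should answer 401 itself
    return {"Authorization": f"Bearer {morsel.value}"}


if __name__ == "__main__":
    print(bearer_header_from_cookie("token=abc.def.ghi; theme=dark"))
```

The proxy would attach the returned header to its outgoing request to FastAPI and reject the call early when the function returns None.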
Alternative Without Proxy
If you prefer to avoid a proxy, here are your options:
1. Use Bearer Tokens
Store the JWT in localStorage or sessionStorage, and manually attach it in the Authorization header. This is easier but less secure and not recommended, since the token is accessible via JavaScript.
2. Configure CORS with Cookies
If the frontend and backend are on different domains:
Configure CORS on the FastAPI backend to allow credentials.
Set cookies with SameSite=None; Secure.
Set credentials: "include" on fetch calls from the frontend (or withCredentials: true if you use XMLHttpRequest or axios).
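As a rough illustration of the cookie attributes involved in the cross-site option, the following sketch builds a Set-Cookie header value with Python's standard library. The cookie name token and the helper name are assumptions for illustration; in a real app your auth service or framework would set the cookie at login.

```python
from http.cookies import SimpleCookie


def build_auth_cookie(token):
    """Return a Set-Cookie header value suitable for cross-site requests."""
    cookie = SimpleCookie()
    cookie["token"] = token                  # hypothetical cookie name
    cookie["token"]["httponly"] = True       # not readable from JavaScript
    cookie["token"]["secure"] = True         # HTTPS only; required with SameSite=None
    cookie["token"]["samesite"] = "None"     # allow sending on cross-site requests
    cookie["token"]["path"] = "/"
    return cookie["token"].OutputString()


if __name__ == "__main__":
    print(build_auth_cookie("abc"))
```

Browsers only accept SameSite=None cookies that also carry the Secure flag, so this option requires HTTPS on both sides.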
Sample Data
We will focus on the tweets collection and implement the Hybrid Semantic Search on it.
Tweet Document Schema
import mongoose from "mongoose";
import moment from "moment";

const tweetSchema = new mongoose.Schema(
  {
    content: { type: String },
    tag: { type: String },
    image: { type: String },
    postedBy: { type: mongoose.Schema.Types.ObjectId, ref: "User" },
    postedTweetTime: {
      type: String,
      // Use a function so the timestamp is computed per document, not once when the schema loads
      default: () => moment().format("MMMM Do YYYY, h:mm:ss a"),
    },
    likes: { type: Array },
    likeTweetBtn: { type: String, default: "black" },
    retweetBtn: { type: String, default: "black" },
    retweetedByUser: { type: String },
    isRetweeted: { type: Boolean, default: false },
    retweets: { type: Array, default: [] },
    isEdited: { type: Boolean, default: false },
    shares: { type: Number, default: 0 },
    comments: [{ type: mongoose.Schema.Types.ObjectId, ref: "Comment" }],
    retweetedFrom: { type: mongoose.Schema.Types.ObjectId, ref: "Tweet" },
    // Vector produced by the FastAPI embedding service
    embedding: {
      type: [Number],
    },
    // When the embedding was last generated; compared with updatedAt to detect stale vectors
    embeddingUpdatedAt: {
      type: Date,
      default: null,
    },
  },
  { timestamps: true }
);
Backend: FastAPI Embedding Service
We use sentence-transformers to embed text content. Read more
Installation
pip install fastapi uvicorn sentence-transformers pydantic python-dotenv pymongo PyJWT
# app.py
import logging
import os
from datetime import datetime

import jwt  # provided by the PyJWT package
from dotenv import load_dotenv
from fastapi import FastAPI, HTTPException, Request, status
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse
from jwt.exceptions import ExpiredSignatureError, InvalidTokenError
from pydantic import BaseModel
from pymongo import MongoClient
from sentence_transformers import SentenceTransformer

load_dotenv()
logger = logging.getLogger(__name__)

app = FastAPI()
app.add_middleware(
    CORSMiddleware,
    # List your frontend origin(s) explicitly; browsers reject "*" when credentials are allowed
    allow_origins=["http://localhost:3000"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

model = SentenceTransformer("all-MiniLM-L6-v2")
client = MongoClient(os.environ["MONGODB_URI"])  # Create this variable in a .env file
db = client["your_db_name"]
tweets = db["your_collection_name"]

JWT_SECRET = os.environ["JWT_SECRET"]
ALGORITHM = "HS256"


class QueryInput(BaseModel):
    query: str


def verify_jwt_from_request(request: Request):
    """Extract and verify the Bearer token; raise 401 if it is missing, expired, or invalid."""
    auth_header = request.headers.get("Authorization")
    if not auth_header or not auth_header.startswith("Bearer "):
        raise HTTPException(status_code=401, detail="Unauthorized: Missing or invalid token")
    token = auth_header.split(" ")[1]
    try:
        payload = jwt.decode(token, JWT_SECRET, algorithms=[ALGORITHM])
        return payload  # you can access payload["id"] if you included user id
    except ExpiredSignatureError:
        raise HTTPException(status_code=401, detail="Token expired")
    except InvalidTokenError:
        raise HTTPException(status_code=401, detail="Invalid token")


@app.post("/embed-doc")
async def embed_doc(request: Request):
    # Verify outside the try block so auth failures return 401 instead of being swallowed
    verify_jwt_from_request(request)
    try:
        # Match tweets that were never embedded or that changed after their last embedding
        cursor = tweets.find({
            "$or": [
                {"embedding": {"$exists": False}},
                {"embeddingUpdatedAt": None},  # matches a missing field and an explicit null
                {"$expr": {"$lt": ["$embeddingUpdatedAt", "$updatedAt"]}},
            ]
        })
        updated_count = 0
        for doc in cursor:
            # "or datetime.min" also covers fields that are present but null
            updated_at = doc.get("updatedAt") or datetime.min
            embedding_updated_at = doc.get("embeddingUpdatedAt") or datetime.min
            if "embedding" in doc and embedding_updated_at >= updated_at:
                continue
            content = f"{doc.get('content', '')} {doc.get('tag', '')}".strip()
            if not content:
                continue
            try:
                embedding = model.encode(content).tolist()
                tweets.update_one(
                    {"_id": doc["_id"]},
                    {"$set": {"embedding": embedding, "embeddingUpdatedAt": datetime.utcnow()}}
                )
                updated_count += 1
            except Exception as embed_err:
                logger.error(f"Failed to embed tweet {doc.get('_id')}: {embed_err}")
                continue
        return JSONResponse(
            status_code=status.HTTP_200_OK,
            content={"status": "success", "message": f"Embeddings created for {updated_count} tweet(s)"}
        )
    except Exception as e:
        logger.exception("Unexpected error in /embed-doc")
        return JSONResponse(
            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            content={"status": "error", "message": str(e)}
        )


@app.post("/embed-query")
async def embed_query(input: QueryInput, request: Request):
    verify_jwt_from_request(request)
    try:
        embedding = model.encode(input.query).tolist()
        logger.info("Embedding successfully generated")
        return {"embedding": embedding}
    except Exception as e:
        logger.error(f"Error in /embed-query: {str(e)}")
        return JSONResponse(
            status_code=500,
            content={"status": "error", "message": str(e)}
        )
Overview of app.py
This FastAPI app has two endpoints and a shared authentication helper:
/embed-doc: Scans your MongoDB tweets collection, generates embeddings for new or updated documents using the all-MiniLM-L6-v2 model, and stores these vectors in the database.
/embed-query: Receives a search query, generates its embedding, and returns it for use in semantic search.
JWT Token Validation: Includes a verify_jwt_from_request() helper that extracts and verifies a JWT from the Authorization header. It ensures only authenticated users can access protected endpoints.
The CORS middleware limits browser access to the configured frontend origin, and each endpoint catches exceptions and returns a JSON error response instead of crashing.
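The "new or updated" decision that /embed-doc depends on can be isolated as a small pure function, which makes it easy to test without a database. This is a sketch under the assumption that documents carry the embedding, embeddingUpdatedAt, and updatedAt fields from the schema above; the function name needs_embedding is hypothetical.

```python
from datetime import datetime


def needs_embedding(doc):
    """Return True when a tweet has no embedding or was edited after it was embedded."""
    if not doc.get("embedding"):
        return True  # never embedded (field missing or empty)
    embedded_at = doc.get("embeddingUpdatedAt")
    if embedded_at is None:
        return True  # embedding exists but we do not know how fresh it is
    updated_at = doc.get("updatedAt") or datetime.min
    return embedded_at < updated_at  # stale if the tweet changed afterwards


if __name__ == "__main__":
    fresh = {
        "embedding": [0.1, 0.2],
        "embeddingUpdatedAt": datetime(2025, 1, 2),
        "updatedAt": datetime(2025, 1, 1),
    }
    print(needs_embedding(fresh))
```

Treating a null embeddingUpdatedAt as stale matters here because the Mongoose schema defaults that field to null rather than leaving it out.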
When to Call /embed-doc
Whenever a tweet is created or updated, call /embed-doc to create or update embeddings. The endpoint checks the Authorization header, so include the JWT in the request:
const response = await fetch(`${url}/embed-doc`, {
  // Replace `url` with the full origin (protocol + domain + port) of your FastAPI backend, e.g., "http://localhost:8000"
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    // `token` is a placeholder for the user's JWT, however your app obtains it
    Authorization: `Bearer ${token}`,
  },
});
Hybrid Search Aggregation Pipeline
Here’s the complete aggregation logic used to combine results:
const vectorWeight = 0.3;
const fullTextWeight = 0.7;

export const getAggregatePipeline = (
  query: string,
  queryEmbedding: number[]
) => {
  const pipeline: any[] = [];
  const vectorPipeline = [
    {
      $vectorSearch: {
        index: "tweets-vector-index",
        path: "embedding",
        queryVector: queryEmbedding,
        numCandidates: 450,
        limit: 50,
      },
    },
    // Collect the results into one array so each document gets a 0-based rank
    { $group: { _id: null, docs: { $push: "$$ROOT" } } },
    { $unwind: { path: "$docs", includeArrayIndex: "rank" } },
    {
      $addFields: {
        // Weighted reciprocal-rank score: vectorWeight * 1 / (rank + 60)
        vs_score: {
          $multiply: [
            vectorWeight,
            { $divide: [1.0, { $add: ["$rank", 60] }] },
          ],
        },
      },
    },
    {
      $project: {
        vs_score: 1,
        _id: "$docs._id",
        content: "$docs.content",
        postedBy: "$docs.postedBy",
        postedTweetTime: "$docs.postedTweetTime",
        retweetBtn: "$docs.retweetBtn",
        likeTweetBtn: "$docs.likeTweetBtn",
        tag: "$docs.tag",
        isRetweeted: "$docs.isRetweeted",
        isEdited: "$docs.isEdited",
        image: "$docs.image",
        createdAt: "$docs.createdAt",
        updatedAt: "$docs.updatedAt",
        likes: "$docs.likes",
        retweets: "$docs.retweets",
        comments: "$docs.comments",
      },
    },
  ];
  const fullTextPipeline = [
    {
      $search: {
        index: "tweets-text-index",
        text: {
          query,
          path: ["content", "tag"],
          fuzzy: {
            maxEdits: 2,
            prefixLength: 1,
            maxExpansions: 50,
          },
        },
      },
    },
    {
      $addFields: {
        fts_score: { $meta: "searchScore" },
      },
    },
    { $sort: { fts_score: -1 } },
    { $limit: 50 },
    // Lookup postedBy user
    {
      $lookup: {
        from: "users",
        localField: "postedBy",
        foreignField: "_id",
        as: "postedByUser",
      },
    },
    {
      $unwind: {
        path: "$postedByUser",
        preserveNullAndEmptyArrays: true,
      },
    },
    {
      $addFields: {
        postedBy: {
          _id: "$postedByUser._id",
          username: "$postedByUser.username",
          avatar: "$postedByUser.avatar",
          bio: "$postedByUser.bio",
        },
      },
    },
    // Lookup comments
    {
      $lookup: {
        from: "comments",
        localField: "comments",
        foreignField: "_id",
        as: "comments",
      },
    },
    {
      $unwind: {
        path: "$comments",
        preserveNullAndEmptyArrays: true,
      },
    },
    {
      $lookup: {
        from: "users",
        localField: "comments.postedBy",
        foreignField: "_id",
        as: "comments.postedByUser",
      },
    },
    {
      $unwind: {
        path: "$comments.postedByUser",
        preserveNullAndEmptyArrays: true,
      },
    },
    {
      $addFields: {
        "comments.postedBy": {
          _id: "$comments.postedByUser._id",
          username: "$comments.postedByUser.username",
          avatar: "$comments.postedByUser.avatar",
          bio: "$comments.postedByUser.bio",
        },
      },
    },
    {
      $project: {
        "comments.postedByUser": 0,
      },
    },
    // Regroup comments
    {
      $group: {
        _id: "$_id",
        doc: { $first: "$$ROOT" },
        comments: { $push: "$comments" },
      },
    },
    {
      $addFields: {
        "doc.comments": "$comments",
      },
    },
    {
      $replaceRoot: {
        newRoot: "$doc",
      },
    },
    // Final projection
    {
      $project: {
        fts_score: 1,
        _id: 1,
        content: 1,
        postedBy: 1,
        postedTweetTime: 1,
        retweetBtn: 1,
        likeTweetBtn: 1,
        tag: 1,
        isRetweeted: 1,
        retweetedByUser: 1,
        isEdited: 1,
        image: 1,
        createdAt: 1,
        updatedAt: 1,
        likes: 1,
        retweets: 1,
        comments: 1,
        followers: 1,
      },
    },
  ];
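  pipeline.push(
    ...vectorPipeline,
    {
      $unionWith: {
        coll: "tweets", // collection that holds the tweet documents
        pipeline: fullTextPipeline,
      },
    },
    // A tweet can appear in both result sets. Merge the two copies and keep
    // the score from each side so that both contribute to the final ranking.
    {
      $group: {
        _id: "$_id",
        doc: { $mergeObjects: "$$ROOT" },
        vs_score: { $max: "$vs_score" },
        fts_score: { $max: "$fts_score" },
      },
    },
    {
      $replaceRoot: {
        newRoot: {
          $mergeObjects: [
            "$doc",
            {
              // A tweet found by only one search gets 0 for the other score
              vs_score: { $ifNull: ["$vs_score", 0] },
              fts_score: { $ifNull: ["$fts_score", 0] },
            },
          ],
        },
      },
    },
    {
      $addFields: {
        combined_score: { $add: ["$fts_score", "$vs_score"] },
      },
    },
    { $sort: { combined_score: -1 } },
    { $limit: 50 }
  );
  return pipeline;
};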
Overview of getAggregatePipeline
This function generates a MongoDB aggregation pipeline to perform hybrid search combining vector similarity and full-text search on a "tweets" collection.
Key parts:
Vector Search Pipeline (vectorPipeline):
Uses MongoDB’s $vectorSearch to find documents whose embedding vectors are closest to the query vector (queryEmbedding).
Limits to 50 results and calculates a vector search score (vs_score) from each document's rank, weighted by vectorWeight: vectorWeight * 1 / (rank + 60).
Projects necessary tweet fields for the final output.
Full-Text Search Pipeline (fullTextPipeline):
Uses $search to perform a fuzzy full-text search on the content and tag fields.
Assigns a full-text score (fts_score) based on relevance and sorts by it.
Performs lookups to join user info (postedByUser) and comments with commenter info for richer results.
Combining Results:
Uses $unionWith to merge full-text search results into the vector search pipeline.
Adds a new field combined_score, the sum of vs_score and fts_score, where a missing score counts as 0. Note that fts_score is the raw Atlas Search relevance score; the fullTextWeight constant is declared but not applied in the code shown.
Groups by document _id to remove duplicates, sorts by the combined score descending, and returns the top 50.
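The scoring arithmetic is easier to follow outside the aggregation syntax. The sketch below reproduces the vs_score formula and the summing of scores in plain Python, assuming that a tweet found by both searches should receive the sum of its two scores. The function names and sample IDs are illustrative, not part of the original code.

```python
def reciprocal_rank_score(rank, weight, k=60):
    """Same formula as vs_score in the pipeline: weight * 1 / (rank + k), rank is 0-based."""
    return weight * (1.0 / (rank + k))


def combine(vector_ids, text_scores, vector_weight=0.3, limit=50):
    """Merge both result sets into (doc_id, combined_score) pairs, best first.

    vector_ids: document ids from the vector search, best match first.
    text_scores: mapping of document id to its full-text score (fts_score).
    """
    combined = {}
    for rank, doc_id in enumerate(vector_ids):
        combined[doc_id] = reciprocal_rank_score(rank, vector_weight)
    for doc_id, fts_score in text_scores.items():
        # A missing vector score counts as 0, like $ifNull in the pipeline
        combined[doc_id] = combined.get(doc_id, 0.0) + fts_score
    ranked = sorted(combined.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


if __name__ == "__main__":
    print(combine(["a", "b"], {"b": 1.5, "c": 0.2}))
```

With vectorWeight at 0.3, the best possible vs_score is 0.3 / 60 = 0.005, so an unscaled full-text score will usually dominate the ranking. If you want the two signals to be comparable, one option is to rank the full-text results the same way and apply fullTextWeight to that reciprocal-rank score.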
Key Stages Explained
$search: Full-text search with fuzzy logic (maxEdits, prefixLength).
$vectorSearch: Finds semantically close vectors.
$group + $unwind: Used to rank and unpack results.
$addFields: Injects a score modifier for text/vector weighting.
$project: Selects final shape of each document.
$lookup: Joins with users and comments to populate user data.
$replaceRoot: Restructures document after transformations.
combined_score: Used to merge both result types (vector + text).
To understand in detail, please follow the official documentation here.
MongoDB Atlas: Creating Indexes
To enable hybrid search, you’ll need to create two indexes in MongoDB Atlas: a full-text search index and a vector search index.
Full-Text Index (tweets-text-index)
Steps to create it:
Go to MongoDB Atlas and open your cluster.
Navigate to the Search tab in the left-hand sidebar.
Click “Create Search Index”.
Choose:
Database: your database name (e.g., myapp)
Collection: tweets
Index Name: tweets-text-index
In the Definition tab, switch to JSON mode and paste:
{ "mappings": { "dynamic": false, "fields": { "content": [{ "type": "string" }], "tag": [{ "type": "string" }] } } }
Vector Index (tweets-vector-index)
Steps to create it:
While still in the Search tab, click “Create Search Index” again and choose the Vector Search index type.
Choose:
Database: your database name
Collection: tweets
Index Name: tweets-vector-index
Switch to the JSON mode and paste:
{ "fields": [ { "type": "vector", "path": "embedding", "numDimensions": 384, "similarity": "cosine" } ] }
Index Sync Behavior
MongoDB Atlas automatically keeps these indexes in sync when documents are inserted or updated.
Important: Ensure that the embedding field exists in documents; otherwise, the vector index will skip those documents.
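A quick sanity check before relying on the index is to confirm that each stored embedding is a list of numbers with the length declared in numDimensions (384 for all-MiniLM-L6-v2). The helper below is a hypothetical sketch for that check; whether Atlas skips or rejects vectors of another length is an assumption worth verifying in the Atlas documentation.

```python
EXPECTED_DIMENSIONS = 384  # must match numDimensions in the vector index definition


def is_indexable_embedding(doc, expected=EXPECTED_DIMENSIONS):
    """Return True if doc["embedding"] is a numeric list of the expected length."""
    embedding = doc.get("embedding")
    if not isinstance(embedding, list) or len(embedding) != expected:
        return False
    # bool is a subclass of int in Python, so exclude it explicitly
    return all(
        isinstance(value, (int, float)) and not isinstance(value, bool)
        for value in embedding
    )


if __name__ == "__main__":
    print(is_indexable_embedding({"embedding": [0.0] * 384}))
```

Running such a check over a sample of documents after calling /embed-doc can catch a model or index mismatch early, for example after switching to an embedding model with a different output size.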
Final Showcase: Hybrid Search in Action
Here's how it all comes together:
User enters a search query in the Next.js frontend.
The query is sent to an API route (either directly or via proxy).
FastAPI backend embeds the query into a vector.
A MongoDB Atlas aggregation pipeline runs:
A $vectorSearch stage finds documents with similar embeddings.
A $search stage performs full-text search on relevant fields.
Scores from both are combined using weighted scoring logic.
Documents are sorted by the combined score and returned.
Next.js renders the results on the frontend.
This hybrid technique enables:
Contextual matching even with vague or paraphrased queries.
Keyword precision when specific terms are used.
Fast and scalable search across your documents.
Together, this architecture provides a modern, production-ready search experience with real semantic understanding. You now have a complete foundation for building semantic + keyword search in real-world applications using MongoDB Atlas, FastAPI, and Next.js.
Check out the working version here (Built a Twitter clone that uses this search module to intelligently search users and tweets. Get full code here.)
Written by

Varun Kumawat
Developer. Founder, DevHub. Mentor.