Lessons from My Weekend Building an AI Agent in Next.js

Benard Ogutu
8 min read

Introduction

This weekend, I had a bit of free time, and, naturally, my brain used it to remind me of everything I haven’t done yet.

Between two online courses, my full-time job, a solo project, and regular life stuff, things have been feeling… a little crowded. It’s easy to feel overwhelmed when your to-do list looks like it’s growing faster than you can check things off.

So I took a step back and opened ClickUp to figure out what actually needed my attention. Top of the list? A project I’m leading at work. We're building a tool to make some of our internal workflows smoother, especially when it comes to handling user feedback.

My task was to implement an AI agent and integrate it into our existing Next.js app. I had tried something similar before using LangChain and Pinecone with a basic RAG setup. But honestly, the results weren’t great. There were hallucinations, the solution was way more complex than it needed to be, and there were too many weird edge cases.

But with all the recent buzz around agentic AI, I figured it was time to give it another shot.

Long story short: I spent the weekend building a working AI agent. I learned a lot, avoided some old mistakes, and came out the other side with something actually useful. In this post, I’ll walk you through how I did it, step by step, so you don’t have to stumble through it like I did.

Wait, What Even Is an AI Agent?

Let's get our buzzwords straight before diving deeper.

Generative AI refers to models that can create new content (text, images, video, and more) based on the data they've been trained on. You're probably familiar with ChatGPT, Gemini, and Claude. These are generative models trained on billions of data points; they recognize statistical patterns to produce the most suitable response to your input. Large Language Models (LLMs) are the type of generative AI focused specifically on text generation.

How LLMs work: the process is conceptually simple. The model takes your prompt as input, calculates probabilities for the next possible token, and builds text piece by piece through repeated prediction until it reaches a stopping point, such as a length limit or a special end-of-sequence token.
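
To make that loop concrete, here's a toy sketch in TypeScript. The predictNextToken function just replays canned tokens; it's a made-up stand-in for a real model's forward pass, which samples from a probability distribution over its vocabulary:

// Toy stand-in for a real model's next-token prediction.
const cannedTokens = ["AI", " agents", " are", " useful", ".", "<end>"];
let cursor = 0;
function predictNextToken(_context: string): string {
  return cannedTokens[Math.min(cursor++, cannedTokens.length - 1)];
}

// Autoregressive generation: append one predicted token at a time
// until a stop condition is reached.
function generate(prompt: string, maxTokens = 50): string {
  let text = prompt;
  for (let i = 0; i < maxTokens; i++) {
    const next = predictNextToken(text);
    if (next === "<end>") break; // special end-of-sequence token
    text += next;
  }
  return text;
}

console.log(generate("Q: what are agents? A:")); // "Q: what are agents? A:AI agents are useful."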

To understand why I found RAG pipelines so frustrating, you should know a bit about embedding models. These convert text, images, or audio into meaningful numerical vectors and compare them to find patterns or connections. For a RAG application, you need a combination of:

  • Embedding models to convert data into useful numerical formats.

  • A large language model (LLM) for generating text.

  • A vector database to store and retrieve that data efficiently.

Quite the complex setup!
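
For intuition on what those vectors buy you: retrieval in a RAG pipeline mostly boils down to comparing embeddings with a similarity measure, commonly cosine similarity. A toy sketch (the three-dimensional vectors are made up for illustration; real embeddings have hundreds or thousands of dimensions):

// Cosine similarity: ~1 means pointing in the same direction (related),
// ~0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, v, i) => sum + v * b[i], 0);
  const magA = Math.sqrt(a.reduce((sum, v) => sum + v * v, 0));
  const magB = Math.sqrt(b.reduce((sum, v) => sum + v * v, 0));
  return dot / (magA * magB);
}

// Tiny made-up embeddings; real ones come from an embedding model.
const queryVec = [0.9, 0.1, 0.3];
const docVec = [0.8, 0.2, 0.4];
console.log(cosineSimilarity(queryVec, docVec)); // ≈0.98 -> very similar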

But for an agentic system? You only need an LLM plus well-defined tools (actions). Much simpler, don't you think?

So what is an AI agent? Google defines them as "software systems that use AI to pursue goals and complete tasks on behalf of users." Think of them as personal assistants who can handle tasks that would normally eat up hours of your time. They can understand what you're asking for, break it down into steps, and execute those steps using the tools they have access to.

So How Do I Build Something Useful With This Knowledge?

Now let's take this knowledge and build something practical. We'll create an agent that monitors user feedback trends on Reddit for our new app "XYZ," a Tinder for developers. Our agent will track user complaints, bug reports, positive feedback, and any other information related to XYZ.

The typical workflow for monitoring user feedback is manual: checking social platforms like Reddit or Twitter, searching for reviews, then categorizing feedback by sentiment to identify positive and negative trends. You'd extract useful information like error codes, sentiment, and how many people are reporting the same issue. With a new product and low feedback volume this might be manageable, but for a popular product with thousands of daily comments, you'd need a large customer support team.

Here's how we can implement an agent to handle this task automatically:

In this tutorial, we'll use Next.js, the Vercel AI SDK, and Azure OpenAI. You could substitute these technologies (Nuxt instead of Next.js, LangChain instead of the AI SDK, or Gemini instead of Azure OpenAI); the core concepts remain the same.

Step 1: Set up your Next.js project

First, initialize your Next.js project by running these commands:

npx create-next-app@latest xyz-social-agent
cd xyz-social-agent

Step 2: Install the dependencies

Run the following command to install what we need:

npm install ai @ai-sdk/react @ai-sdk/azure zod

Vercel's AI SDK provides a unified interface for interacting with large language models, offering a consistent way to work with different providers like OpenAI, Google, and Anthropic. We'll use Zod for schema definitions and validation when we define our agent's tools.
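
If you haven't used Zod before, the gist is that you declare a schema once and get both runtime validation and a TypeScript type out of it. A quick illustration (the schema here is just an example shape):

import { z } from "zod";

// Declare the expected shape once...
const feedbackQuerySchema = z.object({
  searchTerm: z.string(),
  sortBy: z.enum(["createdAt", "comments", "score"]),
});

// ...then validate untrusted input at runtime. Invalid input throws.
const query = feedbackQuerySchema.parse({ searchTerm: "XYZ", sortBy: "score" });
// query is typed as { searchTerm: string; sortBy: "createdAt" | "comments" | "score" }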

Step 3: Configure your LLM's API key

Create a .env.local file at the root of your application with the following variables:

AZURE_OPENAI_API_INSTANCE_NAME=**********************
AZURE_OPENAI_API_KEY=**********************
AZURE_OPENAI_API_VERSION=**********************

Step 4: Create AI configuration

Create a lib folder at the root directory, and inside it create a file called ai-client.ts with:

import { createAzure } from "@ai-sdk/azure"

export const azure = createAzure({
  resourceName: process.env.AZURE_OPENAI_API_INSTANCE_NAME!,
  apiKey: process.env.AZURE_OPENAI_API_KEY!,
  apiVersion: process.env.AZURE_OPENAI_API_VERSION!,
})

Note: I've added non-null assertions (!) to the environment variables since TypeScript would otherwise complain about potentially undefined values.
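
If you'd rather fail fast at startup than sprinkle non-null assertions, one alternative (just a sketch, not required for this tutorial) is to validate the environment with the Zod dependency we already installed:

import { z } from "zod";

// Describe the environment variables the app needs...
const envSchema = z.object({
  AZURE_OPENAI_API_INSTANCE_NAME: z.string().min(1),
  AZURE_OPENAI_API_KEY: z.string().min(1),
  AZURE_OPENAI_API_VERSION: z.string().min(1),
});

// ...and validate once at boot. Missing variables throw a descriptive
// error instead of surfacing as undefined somewhere downstream.
export const env = envSchema.parse(process.env);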

Step 5: Create a route handler

Create a route handler at app/api/chat/route.ts with the following code:

import { streamText } from 'ai';
import { azure } from "@/lib/ai-client";

// Allow streaming responses up to 30 seconds
export const maxDuration = 30;

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: azure('gpt-4o'),
    messages,
  });

  return result.toDataStreamResponse();
}

This code uses our Azure model configuration and the streamText function from the AI SDK to stream the response back to the user rather than sending it as a single chunk. On the client side, the useChat hook (which we'll wire up in Step 7) keeps track of the conversation history ('user' and 'assistant' messages) and sends it with each request.

Step 6: Integrate actions for the model

To create an agentic system, we need to provide tools for the AI to use. Let's enhance our route handler with a Reddit search tool:

import { streamText, tool } from 'ai';
import { azure } from "@/lib/ai-client";
import { z } from "zod";

// Allow streaming responses up to 30 seconds
export const maxDuration = 30;

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: azure('gpt-4o'),
    messages,
    // Without maxSteps, generation stops after the tool call; raising it
    // lets the model feed tool results back into a final answer.
    maxSteps: 5,
    tools: {
      searchReddit: tool({
        description: "Fetches feedback from Reddit for analysis",
        parameters: z.object({
          // .default() already makes the field optional for the model
          searchTerm: z.string().default("XYZ app"),
          sortBy: z.enum(["createdAt", "comments", "score"]).default("createdAt"),
        }),
        execute: async ({ searchTerm, sortBy }) => {
          // Here you would implement the actual Reddit API call
          // For example using Reddit's API or a scraping solution

          // Mock implementation for demonstration
          console.log(`Searching Reddit for "${searchTerm}" sorted by ${sortBy}`);

          // In a real implementation, you would return actual Reddit data
          return {
            posts: [
              {
                title: "XYZ app keeps crashing when I try to match",
                content: "Every time I swipe right on a developer profile, the app crashes. Anyone else experiencing this?",
                comments: 15,
                score: 45,
                createdAt: "2023-06-15T10:30:00Z"
              },
              // More posts would be returned in a real implementation
            ]
          };
        }
      })
    }
  });

  return result.toDataStreamResponse();
}

The tool function allows us to define actions for the AI. The description tells the AI what the tool does, while the parameters define the expected inputs. The model will infer these parameters from the user's prompt (e.g., "What is the latest on XYZ today?" would set searchTerm to "XYZ" and use the default sort criteria).

The execute function contains the actual logic for the tool. In a real application, you would implement API calls to fetch Reddit data here. The tool's results are fed back to the model, which can then call another tool or use them to compose its final response (this is what the maxSteps option above enables).
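
As one way to replace the mock, Reddit exposes search results as public JSON. Here's a rough sketch of what execute could call (error handling and pagination omitted; production use generally requires OAuth and respect for Reddit's rate limits):

// Sketch: fetch real posts from Reddit's public JSON search endpoint.
async function searchRedditPosts(searchTerm: string, sortBy: string) {
  // Map our tool's sort values onto Reddit's sort parameter.
  const sort = sortBy === "createdAt" ? "new" : "top";
  const url = `https://www.reddit.com/search.json?q=${encodeURIComponent(searchTerm)}&sort=${sort}&limit=25`;

  const res = await fetch(url, { headers: { "User-Agent": "xyz-social-agent/1.0" } });
  if (!res.ok) throw new Error(`Reddit request failed with status ${res.status}`);

  const json = await res.json();
  return {
    posts: json.data.children.map((child: any) => ({
      title: child.data.title,
      content: child.data.selftext,
      comments: child.data.num_comments,
      score: child.data.score,
      createdAt: new Date(child.data.created_utc * 1000).toISOString(),
    })),
  };
}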

Step 7: Set up the UI

Add this code to your app/page.tsx:

'use client';

import { useChat } from '@ai-sdk/react';

export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit } = useChat();
  return (
    <div className="flex flex-col w-full max-w-md py-24 mx-auto stretch">
      {messages.map(message => (
        <div key={message.id} className="whitespace-pre-wrap mb-4 p-2 rounded bg-gray-50 dark:bg-zinc-800">
          <strong>{message.role === 'user' ? 'You: ' : 'AI: '}</strong>
          {message.parts.map((part, i) => {
            switch (part.type) {
              case 'text':
                return <div key={`${message.id}-${i}`}>{part.text}</div>;
              default:
                return null; 
            }
          })}
        </div>
      ))}

      <form onSubmit={handleSubmit} className="fixed bottom-0 w-full max-w-md mb-8">
        <input
          className="w-full p-2 border border-zinc-300 dark:border-zinc-700 dark:bg-zinc-800 rounded shadow-xl"
          value={input}
          placeholder="Ask about XYZ app feedback..."
          onChange={handleInputChange}
        />
      </form>
    </div>
  );
}

The useChat hook provides:

  • messages: The current chat messages (array of objects with id, role, and parts)

  • input: The current value of the user's input field

  • handleInputChange and handleSubmit: Functions to handle user interactions
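
Since our agent calls a tool, it's worth surfacing tool activity in the UI too. With the AI SDK's message parts, you can do this by adding another case to the switch in page.tsx. A minimal sketch (styling is up to you):

case 'tool-invocation':
  // Show which tool the agent is calling and with what arguments.
  return (
    <pre key={`${message.id}-${i}`} className="text-xs text-zinc-500">
      {`Calling ${part.toolInvocation.toolName}(${JSON.stringify(part.toolInvocation.args)})`}
    </pre>
  );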

Step 8: Run the application

In your terminal, run:

npm run dev

Test the chat interface by asking questions like "What are the top user complaints for product XYZ today?" The model should understand your prompt's context and provide relevant feedback from Reddit.

To make this agent truly useful, you could:

  1. Implement actual Reddit API integration or use a scraping library

  2. Add more tools for other platforms like Twitter, Google reviews, or app stores

  3. Create visualization components to track sentiment trends over time

  4. Add features to automatically categorize and prioritize issues (see the sketch after this list)

  5. Set up notifications for critical issues
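
For point 4, structured output makes categorization straightforward. Here's a rough sketch using the AI SDK's generateObject with a Zod schema; the specific category and priority values are made up for illustration:

import { generateObject } from "ai";
import { z } from "zod";
import { azure } from "@/lib/ai-client";

// Turn a raw feedback post into a structured, typed record.
export async function categorizeFeedback(post: string) {
  const { object } = await generateObject({
    model: azure("gpt-4o"),
    schema: z.object({
      sentiment: z.enum(["positive", "negative", "neutral"]),
      category: z.enum(["bug", "feature-request", "praise", "other"]),
      priority: z.enum(["low", "medium", "high"]),
      summary: z.string().describe("One-sentence summary of the feedback"),
    }),
    prompt: `Categorize this user feedback about the XYZ app:\n\n${post}`,
  });
  return object; // fully typed: { sentiment, category, priority, summary }
}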

The beauty of this agentic approach is how easily you can extend it with new capabilities! Keep your system modular and flexible so it's easy to add new tools as your needs evolve, and test and iterate on your setup to refine the agent's capabilities and improve its accuracy.

I encourage you to try implementing these steps in your own projects. Experiment with different tools and platforms, and don't hesitate to share your results and feedback.

👋👋👋👋
