Building My First AI Tool: Technical Deep Dive and Launch Guide

Lakshay Gupta
10 min read

Do you plan to take your weekend project out of localhost and let people around the world use it? Then this blog is all you need to understand how to make your project launch-ready. I will cover how I built JurnAI - an AI-powered self-help tool - exploring everything from project setup to infrastructure and deployment. Finally, I will share some critical security features you should add before making your project public.

Introduction

Before beginning, I want to take a moment to tell you about JurnAI. It is an AI assistant that behaves like a virtual friend you talk to through your personal diaries. You can vent your feelings or rant about your day securely in your Notion diary. Based on what you wrote last night, it then sends you a personalised, heartwarming message via email. It is not just another AI assistant but a friend who understands you, someone you can talk to. Interested? Try it from here.

With more than 20k impressions across social media and notable mentions on Product Hunt and Hacker News, JurnAI received a significant amount of traffic very quickly. Here's an overview of the tech that powers it behind the scenes.

Technical Overview

At a very high level, it is an AI agent integrated within your Notion workspace. As we go deeper, however, there is a client interface to onboard users, a backend server to handle the daily email pipeline and, most importantly, a scalable setup to cater to increasing load.

Technical high level overview of how JurnAI works

This is a high-level overview of how JurnAI works behind the scenes. Multiple parts come together to deliver a smooth end-to-end user experience. Before we dive into some critical user flows, here is an overview of the tools and platforms used while building this project. If you are planning to build a public-ready full-stack project, you can refer to this list.

  1. Frontend - Next.js and ShadCN

  2. Backend:

    1. Server - A FastAPI application

    2. Database - PostgreSQL for storing user info. Redis for implementing a rate-limiting strategy.

  3. Infrastructure:

    1. Frontend - Vercel.

    2. Server - Azure Web App and Railway.

    3. Database - Neon for PostgreSQL and Railway for Redis.

    4. Security - Cloudflare, for protection from DDoS.

  4. AI Tooling:

    1. Google GenAI Python SDK

    2. Model - Gemini 2.5 Pro

  5. Automation - Cron Job

  6. Emailing Client - Mailgun

  7. Domain Provider - GoDaddy

  8. Analytics - Google Analytics

  9. Other Tools - Notion Python SDK

The sections below focus on the technical aspects of some very interesting problems I ran into while building JurnAI and how I solved them. Keep reading to understand how things work under the hood for one of the most loved self-help assistants.

A Secure User Onboarding Flow

For the project to work, I needed temporary read-only access to users' diary entries from the previous night. Notion makes this process a breeze, thanks to their public integrations. These allow users to give access to their workspace via a one-time authorization, and that access can be revoked by the user anytime, making it a go-to option.

Flowchart illustrating the user onboarding process for JurnAI. It shows the user accessing the JurnAI website, which interfaces with Notion Integration for authorization, a backend server for authentication and data fetching, and a database for integration token retrieval.

While onboarding a new user, the process is as follows:

  1. User clicks on the Connect with Notion button on our website.

  2. The Notion integration site opens in a new tab. Once the user completes the authorization, they are redirected to the main site with a temporary auth code as a query param.

  3. The server receives this code and makes a request to Notion via their SDK to generate an access token. Upon successful authentication, basic user details like the email address are returned from Notion along with the access token. This token is then encrypted and stored securely in our database (a sketch of this exchange follows the list below).

  4. We can then query the user's diary entries by passing this access token along with their Notion database ID. This ensures that no content other than what the user explicitly shared can be accessed by us.
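For reference, here is a minimal sketch of the token exchange in step 3, using httpx against Notion's documented OAuth token endpoint. The environment variable names, redirect URI, and the persistence notes are placeholders, not JurnAI's actual code.

```python
import base64
import os

import httpx

NOTION_CLIENT_ID = os.environ["NOTION_CLIENT_ID"]        # placeholder names
NOTION_CLIENT_SECRET = os.environ["NOTION_CLIENT_SECRET"]
REDIRECT_URI = "https://your-site.example/auth/callback"  # hypothetical


async def exchange_code_for_token(code: str) -> dict:
    # Notion's OAuth token endpoint expects HTTP Basic auth built from the
    # integration's client ID and secret.
    basic = base64.b64encode(
        f"{NOTION_CLIENT_ID}:{NOTION_CLIENT_SECRET}".encode()
    ).decode()
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            "https://api.notion.com/v1/oauth/token",
            headers={"Authorization": f"Basic {basic}"},
            json={
                "grant_type": "authorization_code",
                "code": code,  # the temporary code from the redirect
                "redirect_uri": REDIRECT_URI,
            },
        )
    resp.raise_for_status()
    data = resp.json()
    # data["access_token"] should be encrypted before persisting it, and
    # data["owner"] carries the basic user details Notion returns.
    return data
```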

A critical aspect of the above interaction was keeping the client-side experience clean. The whole process of authenticating with Notion and then creating database entries at our end could take some time. To address this, I added a loader animation on the client side, keeping the user updated about the progress.

Additionally, I found it important to ensure that system-level errors raised during failed authentication are not exposed to the user. For example, using an invalid code caused the Notion authentication to fail. In such cases, a generic server error asking the user to retry was propagated instead, abstracting the server logic away from the user.
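Here is a sketch of that error-sanitization idea: log the real failure server-side, but hand the client only a generic, retryable message. TokenExchangeError is a hypothetical wrapper exception, not part of JurnAI's published code.

```python
import logging

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()
logger = logging.getLogger("jurnai")


class TokenExchangeError(Exception):
    """Raised when the Notion token exchange fails for any reason."""


@app.exception_handler(TokenExchangeError)
async def handle_token_exchange_error(request: Request, exc: TokenExchangeError):
    # Keep the real cause in server logs for debugging...
    logger.exception("Notion authentication failed")
    # ...but return only a generic, retryable message to the client.
    return JSONResponse(
        status_code=502,
        content={"detail": "Something went wrong while connecting to Notion. Please try again."},
    )
```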

Automating a Daily Content Generation Pipeline

At night, you write your diary entry. In the morning, at exactly 8 AM, you receive a personalized AI-generated mail based on what you wrote last night. Now imagine doing this at scale for thousands of users.

Diagram of a feedback generation pipeline showing a sequence of processes involving a Cronjob, backend server, database, Notion integration, LLM, mail client, and user. Steps include scheduling tasks, fetching tokens, retrieving content, generating feedback, drafting mail, and sending mail, culminating in the user receiving the mail.

This was a hefty task, both in terms of compute and resource utilization, because of the multiple steps involved. Here is how I implemented it.

  1. Scan the database for active users. For each user, the following steps need to be executed.

    1. Fetch the latest diary entry generated in the last 24 hours.

    2. Generate AI reflections based on this entry via integrated GenAI service.

    3. Draft a mail template using the above response.

    4. Trigger Mail via the mailing client.

  2. AI content generation is time-consuming, taking around 15s on average to produce a single message. Due to Python's Global Interpreter Lock (GIL), achieving true parallelism for CPU-bound tasks is complex. However, since our pipeline is I/O-bound (waiting for Notion, AI models, and email services), we can use concurrency to great effect.

  3. To make the above flow work at scale, I divided the list of active users into smaller batches. For each batch, an asynchronous task is scheduled for every user, and these tasks are executed concurrently. Once a batch is completed, execution moves to the next batch (see the sketch after this list).
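Here is a minimal sketch of that batching strategy, not JurnAI's exact code. fetch_entry, generate_reflection, and send_mail are hypothetical async stubs standing in for the real Notion, GenAI, and Mailgun calls, and the batch size is illustrative.

```python
import asyncio

BATCH_SIZE = 10  # illustrative; tune against memory and external rate limits


# Hypothetical async wrappers around the real Notion, GenAI, and Mailgun calls.
async def fetch_entry(user: dict):
    ...


async def generate_reflection(entry) -> str:
    ...


async def send_mail(email: str, body: str) -> None:
    ...


async def process_user(user: dict) -> None:
    entry = await fetch_entry(user)                # last 24h diary entry
    if entry is None:
        return                                     # nothing written last night
    reflection = await generate_reflection(entry)  # LLM call, ~15s on average
    await send_mail(user["email"], reflection)     # trigger the mail client


async def run_pipeline(users: list[dict]) -> None:
    for i in range(0, len(users), BATCH_SIZE):
        batch = users[i : i + BATCH_SIZE]
        # Run one batch concurrently; gather waits for the whole batch to
        # finish before the next one starts, keeping load predictable.
        results = await asyncio.gather(
            *(process_user(u) for u in batch), return_exceptions=True
        )
        for user, result in zip(batch, results):
            if isinstance(result, Exception):
                print(f"pipeline failed for {user['email']}: {result!r}")
```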

This approach struck the perfect balance between efficiency and resource optimization. I could process a large number of users together without stressing my servers or getting rate-limited by the external services. Since we are dealing with multiple I/O calls, it was important to use asynchronous calls everywhere to prevent thread blocking. Do you have a better approach for handling this tricky situation? I am all ears. Feel free to comment or reach out to me.

Creating a Cron Job for True Automation

The final piece of the jigsaw puzzle was automatically running the above pipeline at a scheduled time. For this, I created an endpoint which, when hit, triggers the above workflow in the background, and a cron job that runs at a fixed time and hits that endpoint. Think of a cron job as a bot that performs the action you specify at the time you specify.
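As a rough sketch, the trigger endpoint can hand the work to FastAPI's BackgroundTasks so the cron caller gets an immediate response. It reuses run_pipeline from the earlier sketch; get_active_users and the route path are hypothetical, and a real deployment should also authenticate this route.

```python
from fastapi import BackgroundTasks, FastAPI

app = FastAPI()


async def get_active_users() -> list[dict]:
    return []  # hypothetical DB query for active users


async def run_daily_job() -> None:
    users = await get_active_users()
    await run_pipeline(users)  # batched pipeline from the earlier sketch


@app.post("/internal/run-daily-pipeline")
async def trigger_daily_pipeline(background_tasks: BackgroundTasks):
    # Respond immediately; the pipeline runs after the response is sent.
    background_tasks.add_task(run_daily_job)
    return {"status": "scheduled"}


# The cron job then reduces to one scheduled HTTP call, e.g. in crontab
# syntax for 8:00 AM daily:
#   0 8 * * * curl -s -X POST https://api.your-site.example/internal/run-daily-pipeline
```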

A cartoon robot is sitting at a desk in front of a computer. In the first panel, it's relaxed, playing solitaire and holding a coffee cup, with the clock showing 9:00. In the second panel, the robot looks panicked as the clock shows 3:10, and a "send mail" prompt appears on the screen over the solitaire game.

Alternatively, an Azure Function App can also perform the above action. If you want to know how these work, you can check out this blog.

Multi-Server Setup to Increase Resilience

I deployed my server-side code on two different service providers. I know it sounds a bit unconventional, but it was a strategic decision to balance cost, performance, and resilience.

My server handled two critical operations - user onboarding and the daily content generation pipeline - each with different SLA requirements. While the user onboarding part was not resource-intensive, it needed to be up all the time. The daily content generation pipeline was a compute-heavy job that ran at a particular time in the background. To meet these varying demands, I decided to maintain two replicas of my server - one deployed on Azure and the other on Railway.

This allowed me to configure compute for each need. For example, it made sense to spin down my background server when inactive, leading at most to some cold-start delays, which helped me save on compute credits. At the same time, I could keep my foreground server active all the time but scale it down to lower computational requirements. Since both servers run from the same source code, this did not complicate my development experience. The dual-server setup also provided a backup option in case either of them suffered a production outage.

Key Things Before Launching Your Project

Imagine you make your project public and it goes viral. This is your moment to shine, and suddenly your server usage spikes. Your credits start running out exponentially. Unfortunately, your potential users are left with a product that does not work. Don't let your momentum fade away because you were too lazy to implement basic security features. Continue reading to avoid making the mistakes I did.

  1. CORS (Cross-Origin Resource Sharing)

    This prevents browser requests from unrecognized origins from reaching your server. It can be configured easily in your server code. Having this in place ensures only your website can access the server directly, reducing the chances of server spamming (a minimal configuration sketch follows this list).

  2. Cloudflare Edge-Level Protection

    When I made my project public, the last thing I expected was a bot attack. My server was hit with thousands of requests in a few seconds. Though nothing of value was lost, it did increase compute usage to a minor extent. This could have been easily prevented by moving the domain behind Cloudflare protection. Once behind it, all requests are routed via Cloudflare's secure servers. Cloudflare also has inbuilt support for limiting the maximum number of requests that can be made in a time frame, and it allows blocking traffic from suspicious IPs. All of this is handled directly by Cloudflare, making the process seamless for you. I cannot stress this enough: please audit your app before making it public, to ensure no "AI vibe" code makes it into the hands of bad actors.

    Graph showing threat data over the past 30 days. Total threats: 296. Top country: France. Top threat type: Bad browser. Peaks on August 31 and September 5 indicate higher threat activity.

  3. Application-Level Rate Limiting

    Cloudflare handles rate limiting at the edge, that is, before requests even make it to your server. In case you are not able to add that, consider adding application-level rate limiting to critical routes. FastAPI has well-supported third-party libraries for this (see the second sketch below). To consistently track and limit incoming traffic, we need an in-memory database like Redis, which keeps a count of the number of requests made by a particular user in a given timeframe. Storing this temporarily in Redis ensures fast data retrieval, essential to avoid any notable increase in the overall latency of requests.
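For illustration, here is a minimal CORS setup in FastAPI (point 1 above); the allowed origin is a placeholder for your own frontend domain.

```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Only browser requests originating from your own frontend are allowed through.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://your-site.example"],  # placeholder domain
    allow_methods=["GET", "POST"],
    allow_headers=["*"],
)
```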
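And here is a sketch of Redis-backed rate limiting (point 3) using the third-party fastapi-limiter package, one of several options; the Redis URL, route, and limits are placeholders.

```python
from contextlib import asynccontextmanager

import redis.asyncio as redis
from fastapi import Depends, FastAPI
from fastapi_limiter import FastAPILimiter
from fastapi_limiter.depends import RateLimiter


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Request counters live in Redis, so lookups stay fast and shared
    # across server replicas.
    conn = redis.from_url("redis://localhost:6379", encoding="utf8")
    await FastAPILimiter.init(conn)
    yield
    await FastAPILimiter.close()


app = FastAPI(lifespan=lifespan)


# At most 5 calls per client per 60 seconds on this critical route.
@app.post("/auth/notion", dependencies=[Depends(RateLimiter(times=5, seconds=60))])
async def connect_notion():
    return {"status": "ok"}
```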

One Small Step, Many Benefits - Integrating Analytics

When taking the leap of faith from localhost to making your project public, consider integrating analytics into your website. This helps you monitor user behaviour and incoming traffic, and better understand what works and what does not. I personally prefer Google Analytics because of the ease of setup and the support for creating custom events.

If you are looking for a simpler solution to monitor incoming traffic, you can also use Vercel's built-in analytics. It requires a one-time setup and charges based on your usage.

Conclusion

That's it for now. We've covered the core technologies behind JurnAI, navigated some challenging engineering hurdles, and outlined the essential security checks that can make or break a public launch. I hope this article helps you before your next public launch. Till then, feel free to check out JurnAI using this link.

If you have any thoughts or questions, please connect with me on X. And if you found this article useful, don't forget to leave a like! Finally, let's make this a two-way conversation. What's your #1 tip for developers preparing to go public with a project?


Written by

Lakshay Gupta

My friends used to poke fun at me for writing long messages while texting. I thought, why not benefit from it? I have led communities, run startups, built viral apps, and made YouTube vlogs. Yes, I am an engineer.