I Built an AI Video Generator as a Solo Dev — It Works, But I Need Your Advice

rank info

Hi everyone! I'm an indie developer and I recently launched a new tool I’ve been working on:
👉 https://veo3.im/

The idea behind this project is simple: help people create visually engaging videos with minimal effort. It’s mainly geared toward content creators, e-commerce product demos, and short-form video marketing. The goal is for users to go from script to video in just a few clicks, with subtitles, background music, and automatic editing all included.


🚀 Why I Built This

I've always been intrigued by "text-to-video" and AI-assisted content creation. But most tools I’ve tried are either too complex for beginners or too inflexible.

So I decided to build something more lightweight and beginner-friendly, while still giving users enough creative control to make something useful and shareable.

The platform just went live. The core functions are working — but as you’ll see below, I’m running into several challenges that I’d love the community’s input on.


✨ Current Features

  • Automated Video Creation: Paste in a script, and the system finds relevant visuals and edits them together.

  • AI Voiceover & Background Music: Choose a voice and tone, and the system narrates your text and adds background music (BGM).

  • Subtitles: Auto-generated and synced with the voiceover.

  • Smart Editing: Pacing, transitions, and clip durations are handled automatically.

  • In-browser Preview & Export: No software to install. All editing is browser-based and fast.


🤔 Open Questions – Would Love Your Feedback!

Even with a working MVP, I’m facing a few technical and product questions that I hope some of you can help with:

1. Personalization vs Simplicity

How can I offer more user control (custom covers, pacing, branding) without overwhelming users? Would a smart template + memory system be a good approach?
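
One possible reading of "smart template + memory" is layered settings: explicit choices for the current video win, then whatever the user picked before, then the template, then global defaults. Here is a minimal Python sketch of that idea. The setting names, template names, and the `resolve_settings` helper are all hypothetical and only illustrate the layering, not how the platform stores anything.

```python
from collections import ChainMap

# Hypothetical setting names; the real product's options may differ.
DEFAULTS = {"pacing": "medium", "cover": "auto", "brand_color": "#000000", "music": "upbeat"}

# Hypothetical templates: each one only overrides what it cares about.
TEMPLATES = {
    "product_demo": {"pacing": "slow", "music": "ambient"},
    "short_form": {"pacing": "fast"},
}


def resolve_settings(template_name, remembered=None, overrides=None):
    """Resolve settings by priority: overrides > remembered > template > defaults."""
    layers = [overrides or {}, remembered or {}, TEMPLATES.get(template_name, {}), DEFAULTS]
    # ChainMap looks keys up in the first mapping that contains them.
    return dict(ChainMap(*layers))


# A returning user who once picked a brand color only has to choose a template.
remembered = {"brand_color": "#ff6600"}
settings = resolve_settings("product_demo", remembered, {"music": "none"})
print(settings)
```

With this shape, a beginner only ever sees the template picker, while advanced controls write into the overrides layer when someone opens them.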

2. Subtitle & Voice Sync Accuracy

I’m currently using a simple speech-to-text system for subtitle timing. It struggles with fast speech.
Would forced alignment tools like Montreal Forced Aligner (MFA) improve this? Or should I consider a more end-to-end speech model?
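
To make the question more concrete: whichever aligner is used, the subtitle step mostly needs word-level start and end times (MFA, for example, writes word intervals to TextGrid files). The Python sketch below assumes those times are already parsed into a list and shows one way to group them into SRT cues. The `Word` class, the function names, and the `max_chars` / `max_gap` thresholds are assumptions for illustration, not part of any aligner's API.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


def group_into_cues(words, max_chars=42, max_gap=0.6):
    """Group timed words into (start, end, text) subtitle cues.

    A new cue starts when adding a word would exceed max_chars, or when the
    silence before the word is longer than max_gap seconds.
    """
    cues, current = [], []
    for word in words:
        if current:
            text_len = len(" ".join(w.text for w in current)) + 1 + len(word.text)
            gap = word.start - current[-1].end
            if text_len > max_chars or gap > max_gap:
                cues.append(current)
                current = []
        current.append(word)
    if current:
        cues.append(current)
    return [(c[0].start, c[-1].end, " ".join(w.text for w in c)) for c in cues]


def format_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Render cues as SRT text: index, time range, then the cue text."""
    blocks = [
        f"{i}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}"
        for i, (start, end, text) in enumerate(cues, start=1)
    ]
    return "\n\n".join(blocks) + "\n"


words = [Word("Hello", 0.0, 0.4), Word("world", 0.45, 0.9), Word("again", 2.0, 2.5)]
print(to_srt(group_into_cues(words)))
```

Because cue boundaries come from the word timings themselves, fast speech produces shorter cues rather than drifting ones, provided the word times are accurate in the first place.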

3. Responsive Video Editing UI

It works fine on desktop, but tablet/mobile editing is cramped. Any advice or recommended frameworks for video preview + timeline editing that scale well on small screens?

4. Scaling Video Processing

Right now, the server queues uploaded videos and compresses them. But when many users upload at once, processing slows down.
Any tips for efficient queue systems or serverless video processing setups?
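
For discussion, here is a minimal Python sketch of the pattern I have in mind: a bounded queue feeding a fixed number of workers, so a burst of uploads waits in line instead of starting every compression at once. It uses only `asyncio` from the standard library. `process_video` is a placeholder for the real compression step, and the `concurrency` and `max_pending` values are assumptions, not measurements from my server.

```python
import asyncio


async def process_video(job_id, log):
    """Placeholder for the real compression step (hypothetical)."""
    await asyncio.sleep(0.01)
    log.append(job_id)


async def worker(queue, log, state):
    """Pull jobs forever; track how many jobs run at the same time."""
    while True:
        job_id = await queue.get()
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        try:
            await process_video(job_id, log)
        finally:
            state["active"] -= 1
            queue.task_done()


async def run_jobs(job_ids, concurrency=2, max_pending=100):
    queue = asyncio.Queue(maxsize=max_pending)
    log, state = [], {"active": 0, "peak": 0}
    workers = [asyncio.create_task(worker(queue, log, state)) for _ in range(concurrency)]
    for job_id in job_ids:
        await queue.put(job_id)  # waits while the queue is full (backpressure)
    await queue.join()           # wait until every queued job is done
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return log, state["peak"]


done, peak = asyncio.run(run_jobs(range(6), concurrency=2))
print(sorted(done), "peak concurrency:", peak)
```

This only shows the shape of the idea inside one process. Real compression is CPU-bound, so I assume the workers would have to be separate processes or machines, which is exactly the part I'd like advice on.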

5. SEO for Tool-Based Platforms

Since the platform generates preview pages dynamically, SEO is tough. Should I generate static meta tags per video? Or create search-friendly aggregation pages?
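
By "static meta tags per video" I mean something like the Python sketch below: render the `<head>` tags once per video on the server, so crawlers see them without running any JavaScript. The `render_meta_tags` function and the example.com URLs are placeholders I made up for illustration. The `og:` properties come from the Open Graph protocol.

```python
from html import escape


def render_meta_tags(title, description, page_url, thumbnail_url):
    """Build <head> tags for one video page; every value is HTML-escaped."""
    t, d, u, img = (
        escape(value, quote=True)
        for value in (title, description, page_url, thumbnail_url)
    )
    return "\n".join([
        f"<title>{t}</title>",
        f'<meta name="description" content="{d}">',
        f'<link rel="canonical" href="{u}">',
        '<meta property="og:type" content="video.other">',
        f'<meta property="og:title" content="{t}">',
        f'<meta property="og:description" content="{d}">',
        f'<meta property="og:url" content="{u}">',
        f'<meta property="og:image" content="{img}">',
    ])


# Placeholder values; a real page would use the video's own data.
head = render_meta_tags(
    '5 "quick" tips <now>',
    "A short demo video.",
    "https://example.com/v/1?a=1&b=2",
    "https://example.com/thumbs/1.jpg",
)
print(head)
```

Escaping matters here because titles come from user scripts, and an unescaped quote or angle bracket would break the tag.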


🙏 I’d Love to Hear From You

  • Is the product experience smooth enough so far?

  • If you were a content creator, would this be useful to you?

  • What’s missing or needs improvement?

Feel free to try it out here: 👉 https://veo3.im/
I’d really appreciate your feedback, ideas, or just general thoughts. Building in public has been a great learning experience, and I’m excited to keep improving this project.

Thanks for reading and supporting indie projects! 🚀
