How I Built a Robots.txt Generator & Tester with Zero Frameworks for My SEO Work

Riad al Ashekin

Hey devs! 👋

We've all been there. You're launching a new site or working on SEO, and you need to deal with robots.txt. It's a simple file, but it's deceptively easy to make a mistake that could hide your entire site from Google.

If you’re not fully familiar with what this file does, I’ve already written the Ultimate Guide to Robots.txt — it’s worth reading before diving into building your own tool.

After manually typing User-agent: * and Disallow: /admin/ one too many times, I decided to build a better way. I ended up creating two powerful, single-page tools to solve this problem for good:

🤖 An Advanced Robots.txt Generator
🔬 A Live Robots.txt Tester

In this post, I'll walk you through how I built them using just HTML, Tailwind CSS, and vanilla JavaScript, and how you can build and share your own practical micro-tools.


The Philosophy: Keep It Simple

My main goal was to create tools that were fast, reliable, and had zero dependencies. No React, no Vue, no build steps. Just clean, modern, vanilla JavaScript. This approach keeps the tools lightweight and easy to maintain.

The stack is straightforward:

  • HTML: For the structure and content.

  • Tailwind CSS: For rapid, responsive, and clean UI design directly from a CDN.

  • Vanilla JavaScript: For all the logic, from state management to DOM manipulation.


Part 1: Building the Robots.txt Generator

The Generator needed to be intuitive for beginners but powerful enough for pros. The solution was a dual-mode interface.

Key Features:

  • Simple Mode: A form-based wizard with templates for common platforms like WordPress, Shopify, and Laravel. Users can dynamically add Allow and Disallow rules without knowing the syntax (see the template sketch after this list).

  • Advanced Mode: A simple <textarea> for power users to write or paste their rules directly.

  • Live Preview: A preview pane that updates in real-time with every change.

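To make the Simple Mode templates concrete, here's a minimal sketch of how a preset could be applied. The templates map and the applyTemplate() helper are illustrative names for this post, not the tool's exact code, and the rules are just the paths these platforms commonly block; the idea is that a preset simply writes into the central state object described in the next section.

// Hypothetical preset map for Simple Mode (names and rules are illustrative)
const templates = {
  wordpress: {
    disallowPaths: ['/wp-admin/'],
    allowPaths: ['/wp-admin/admin-ajax.php']
  },
  shopify: {
    disallowPaths: ['/admin/', '/cart', '/checkout'],
    allowPaths: []
  }
};

// Selecting a template overwrites the relevant parts of the state,
// then re-renders the live preview
function applyTemplate(name) {
  const preset = templates[name];
  if (!preset) return;
  state.disallowPaths = [...preset.disallowPaths];
  state.allowPaths = [...preset.allowPaths];
  generateRobotsTxt();
}
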
If you’d like to use a ready-made tool while following along, try my Free Robots.txt Generator — it works exactly as described here.


The Core Logic: State-Driven Design

The most important architectural decision was to use a central state object. Instead of constantly reading from input fields (which gets messy), every user action updates this single object.

// A simplified look at the state object
let state = {
  userAgents: ['*'],
  crawlDelay: '',
  disallowPaths: ['/admin/', '/private/'],
  allowPaths: [],
  sitemap: 'https://example.com/sitemap.xml'
};

A single function, generateRobotsTxt(), is responsible for reading this state object and rendering the final output into the preview pane.

const generateRobotsTxt = () => {
  let content = '';

  state.userAgents.forEach(agent => {
    content += `User-agent: ${agent}\n`;
    state.disallowPaths.forEach(path => {
      content += `Disallow: ${path}\n`;
    });
    // ...and so on for allowPaths, crawlDelay, and the sitemap
  });

  // previewCode is the <code> element inside the live preview pane
  previewCode.textContent = content.trim();
};

This makes the tool's logic clean and predictable. Any change (like clicking "Add Path" or selecting a template) simply updates the state and calls generateRobotsTxt() to refresh the view.
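
For example, an "Add Path" button handler might look roughly like this (the element IDs here are placeholders for illustration, not the tool's actual markup):

// Hypothetical wiring: the button pushes a new path into the state,
// then the preview is re-rendered from that state
document.querySelector('#add-disallow').addEventListener('click', () => {
  const input = document.querySelector('#disallow-input');
  if (input.value.trim()) {
    state.disallowPaths.push(input.value.trim());
    input.value = '';
    generateRobotsTxt();
  }
});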


Part 2: Building the Robots.txt Tester

A generator is great, but how do you know if an existing file works? For that, I built the Robots.txt Tester. The real challenge here wasn't the UI, but correctly implementing the official Robots Exclusion Protocol logic.


The Core Logic: The "Longest-Match" Rule

According to Google, when multiple rules match a URL, the one with the most specific path (longest match) wins.

For example, given these rules:

Disallow: /folder/
Allow: /folder/page.html

The URL /folder/page.html is allowed because /folder/page.html (17 characters) is longer and more specific than /folder/ (8 characters).

My testUrl() function implements this by checking all matching rules and keeping track of the one with the highest specificity (path length).

// Assumes robotsData maps user agents to their parsed rule groups,
// e.g. { '*': [{ type: 'disallow', path: '/admin/' }] }
function testUrl(url, userAgent, robotsData) {
  // Fall back to the wildcard group when there is no group for this agent
  const rules = robotsData[userAgent] || robotsData['*'] || [];
  let bestMatch = { allowed: true, specificity: -1, rule: 'None' };

  for (const rule of rules) {
    if (url.startsWith(rule.path)) {
      const specificity = rule.path.length;

      if (specificity > bestMatch.specificity) {
        bestMatch = {
          allowed: rule.type === 'allow',
          specificity: specificity,
          rule: `${rule.type}: ${rule.path}`
        };
      }
    }
  }
  return bestMatch;
}

This small piece of logic is the brain of the entire tool and ensures its results are accurate.
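
To make that concrete, here's a hypothetical call. I'm assuming robotsData maps user agents to parsed rule groups (matching the fallback logic in the function above); the parsing step itself isn't shown here.

// Hypothetical parsed structure: rule groups keyed by user agent
const robotsData = {
  '*': [
    { type: 'disallow', path: '/folder/' },
    { type: 'allow', path: '/folder/page.html' }
  ]
};

console.log(testUrl('/folder/page.html', 'Googlebot', robotsData));
// -> { allowed: true, specificity: 17, rule: 'allow: /folder/page.html' }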


Building is only half the battle; sharing is the other half.

Since these tools are just static HTML, CSS, and JS files, deployment was a breeze. Any free static host, such as Netlify, Vercel, or GitHub Pages, can serve projects like this at no cost.

If you’d like a deeper understanding of how robots.txt impacts SEO, the Ultimate Guide to Robots.txt covers everything — from syntax to advanced SEO best practices.


Final Thoughts

This was a fun project that solved a real-world problem for me and, hopefully, for others. It’s proof that you don't always need a heavy framework to build something powerful and useful.

What are some other simple dev tasks you think could be turned into a handy web tool? Let me know in the comments!

Thanks for reading, and happy coding! 🚀
