Your site's robots.txt file tells search engine crawlers which URLs they may and may not fetch. It sounds simple, but a misconfigured robots.txt can accidentally block Google from crawling your entire site, or leave sensitive admin pages open to crawling. Keep in mind that robots.txt controls crawling, not indexing: a disallowed URL can still appear in search results if other sites link to it, so it is not a security mechanism on its own.
Common robots.txt Mistakes
The most common mistakes we see with Astro + Cloudflare Workers sites:
Blocking CSS and JS files — Google needs these to render pages properly. A single "Disallow: /" can hide your entire site.
Forgetting to reference the sitemap — Without a Sitemap directive, crawlers may not discover all your pages.
Leaving admin paths open — If your admin panel is at /_emdash/admin, you want a Disallow rule for it so crawlers don't fetch it. For anything truly sensitive, pair that rule with authentication or a noindex header, since Disallow alone doesn't guarantee the URL never surfaces in results.
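To make the first mistake concrete, this is the shortest possible way to hide a site: under the wildcard user-agent, a single Disallow: / applies to every URL, stylesheets and scripts included.

```txt
# Blocks ALL compliant crawlers from ALL URLs. Don't ship this.
User-agent: *
Disallow: /
```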
What a Good robots.txt Looks Like
A well-configured robots.txt for an EmDash site should allow crawlers full access to your content while blocking admin and API paths. The Auto Blog/SEO plugin generates this automatically from your site configuration.
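A minimal hand-written equivalent might look like the following. The /_emdash/ and /api/ prefixes and the sitemap URL here are illustrative placeholders; the plugin derives the real values from your site configuration.

```txt
User-agent: *
Allow: /
Disallow: /_emdash/
Disallow: /api/

Sitemap: https://your-site.com/sitemap-index.xml
```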
The Free tier includes a dynamic robots.txt endpoint that:
Allows all search engine bots to crawl your content
Explicitly blocks admin and API routes
References your sitemap URL so crawlers find everything
Updates automatically when you add new sections or pages
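The behavior above can be sketched as a small pure function that turns site configuration into a robots.txt body. This is a hypothetical illustration, not the plugin's actual code: the RobotsConfig shape, the buildRobotsTxt name, and the blocked path prefixes are all assumptions.

```typescript
// Hypothetical shape of the site configuration; the real plugin's
// config type may differ.
interface RobotsConfig {
  siteUrl: string;        // canonical origin, e.g. "https://your-site.com"
  blockedPaths: string[]; // path prefixes to keep crawlers out of
}

// Build the robots.txt body from the configuration. Pure and
// framework-agnostic, so the same function could back an Astro
// endpoint or a Cloudflare Worker route.
function buildRobotsTxt(config: RobotsConfig): string {
  const lines = [
    "User-agent: *",
    "Allow: /",
    ...config.blockedPaths.map((p) => `Disallow: ${p}`),
    "",
    // Resolve the sitemap path against the canonical origin.
    `Sitemap: ${new URL("/sitemap-index.xml", config.siteUrl).href}`,
  ];
  return lines.join("\n") + "\n";
}

// Wiring it into an Astro endpoint at src/pages/robots.txt.ts could
// look like this (sketch; Astro's APIRoute typing omitted):
//
// export const GET = () =>
//   new Response(
//     buildRobotsTxt({
//       siteUrl: "https://your-site.com",
//       blockedPaths: ["/_emdash/", "/api/"],
//     }),
//     { headers: { "Content-Type": "text/plain; charset=utf-8" } }
//   );
```

Because the function is pure, regenerating the file when sections or pages change is just a matter of calling it again with the updated configuration.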
How to Verify Your robots.txt
Visit your-site.com/robots.txt in a browser. Then use the robots.txt report in Google Search Console (which replaced the retired robots.txt Tester) to verify that your important pages are crawlable and your blocked paths are actually blocked. With EmDash Auto Blog/SEO, this is set up correctly from the start.
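You can also spot-check from the command line. The helper below reads a robots.txt body on stdin and reports whether the two directives discussed above are present; the function name and the /_emdash/ prefix are illustrative, and your-site.com is a placeholder.

```shell
# check_robots: read a robots.txt body on stdin and report whether
# the sitemap reference and the admin Disallow rule are present.
check_robots() {
  body=$(cat)
  echo "$body" | grep -qi '^Sitemap:' \
    && echo "sitemap: ok" || echo "sitemap: MISSING"
  echo "$body" | grep -q '^Disallow: /_emdash/' \
    && echo "admin block: ok" || echo "admin block: MISSING"
}

# Run it against the live file (replace the placeholder host):
# curl -s https://your-site.com/robots.txt | check_robots
```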