Your site's robots.txt file tells search engine crawlers which URLs they may and may not fetch. It sounds simple, but a misconfigured robots.txt can accidentally block Google from crawling your entire site, or leave sensitive admin pages open to crawling. Keep in mind that robots.txt controls crawling, not indexing: a disallowed URL can still appear in search results if other sites link to it, so it is not a security mechanism on its own.
Common robots.txt Mistakes
The most common mistakes we see with Astro + Cloudflare Workers sites:
Blocking CSS and JS files — Google needs these to render pages properly. A single "Disallow: /" can hide your entire site.
Forgetting to reference the sitemap — Without a Sitemap directive, crawlers may not discover all your pages.
Leaving admin paths open — If your admin panel is at /_emdash/admin, you want a Disallow rule for it so crawlers don't fetch it. For anything truly sensitive, pair that rule with authentication or a noindex header, since Disallow alone doesn't guarantee the URL never surfaces in results.
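To make the first mistake concrete, this is the shortest possible way to hide a site: under the wildcard user-agent, a single Disallow: / applies to every URL, stylesheets and scripts included.

```txt
# Blocks ALL compliant crawlers from ALL URLs. Don't ship this.
User-agent: *
Disallow: /
```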
What a Good robots.txt Looks Like
A well-configured robots.txt for an EmDash site should allow crawlers full access to your content while blocking admin and API paths. The Auto Blog/SEO plugin generates this automatically from your site configuration.
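A minimal hand-written equivalent might look like the following. The /_emdash/ and /api/ prefixes and the sitemap URL here are illustrative placeholders; the plugin derives the real values from your site configuration.

```txt
User-agent: *
Allow: /
Disallow: /_emdash/
Disallow: /api/

Sitemap: https://your-site.com/sitemap-index.xml
```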
The Free tier includes a dynamic robots.txt endpoint that:
Allows all search engine bots to crawl your content
Explicitly blocks admin and API routes
References your sitemap URL so crawlers find everything
Updates automatically when you add new sections or pages
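The behavior above can be sketched as a small pure function that turns site configuration into a robots.txt body. This is a hypothetical illustration, not the plugin's actual code: the RobotsConfig shape, the buildRobotsTxt name, and the blocked path prefixes are all assumptions.

```typescript
// Hypothetical shape of the site configuration; the real plugin's
// config type may differ.
interface RobotsConfig {
  siteUrl: string;        // canonical origin, e.g. "https://your-site.com"
  blockedPaths: string[]; // path prefixes to keep crawlers out of
}

// Build the robots.txt body from the configuration. Pure and
// framework-agnostic, so the same function could back an Astro
// endpoint or a Cloudflare Worker route.
function buildRobotsTxt(config: RobotsConfig): string {
  const lines = [
    "User-agent: *",
    "Allow: /",
    ...config.blockedPaths.map((p) => `Disallow: ${p}`),
    "",
    // Resolve the sitemap path against the canonical origin.
    `Sitemap: ${new URL("/sitemap-index.xml", config.siteUrl).href}`,
  ];
  return lines.join("\n") + "\n";
}

// Wiring it into an Astro endpoint at src/pages/robots.txt.ts could
// look like this (sketch; Astro's APIRoute typing omitted):
//
// export const GET = () =>
//   new Response(
//     buildRobotsTxt({
//       siteUrl: "https://your-site.com",
//       blockedPaths: ["/_emdash/", "/api/"],
//     }),
//     { headers: { "Content-Type": "text/plain; charset=utf-8" } }
//   );
```

Because the function is pure, regenerating the file when sections or pages change is just a matter of calling it again with the updated configuration.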
How to Verify Your robots.txt
Visit your-site.com/robots.txt in a browser. Then use the robots.txt report in Google Search Console (which replaced the retired robots.txt Tester) to verify that your important pages are crawlable and your blocked paths are actually blocked. With EmDash Auto Blog/SEO, this is set up correctly from the start.
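You can also spot-check from the command line. The helper below reads a robots.txt body on stdin and reports whether the two directives discussed above are present; the function name and the /_emdash/ prefix are illustrative, and your-site.com is a placeholder.

```shell
# check_robots: read a robots.txt body on stdin and report whether
# the sitemap reference and the admin Disallow rule are present.
check_robots() {
  body=$(cat)
  echo "$body" | grep -qi '^Sitemap:' \
    && echo "sitemap: ok" || echo "sitemap: MISSING"
  echo "$body" | grep -q '^Disallow: /_emdash/' \
    && echo "admin block: ok" || echo "admin block: MISSING"
}

# Run it against the live file (replace the placeholder host):
# curl -s https://your-site.com/robots.txt | check_robots
```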