The Problem
Multi-tenant SEO sounds like a marketing problem until you try to generate 10,000 unique location pages from a single Cloudflare D1 database and watch the database buckle under the load. Every location page needs unique title tags, H1s, meta descriptions, structured data, local business schema, service-area markup, and body content. At roughly nine queries per page, 10,000 location pages add up to 90,000 database queries per generation cycle. That is not a content strategy. That is a DDoS attack on your own database.
The naive approach is obvious and wrong: pre-generate everything at build time. Static site generators like Jekyll or Hugo can produce 10,000 pages, sure, but each page still requires unique content, and you cannot localize by swapping a city name in a template. Google's helpful content system demotes thin, doorway-style pages that differ only by city name, and 10,000 near-identical pages with unique coordinates is not a strategy. It is a manual action waiting to happen.
The real challenge is architectural: how do you serve 10,000 unique, high-quality location pages from a serverless database without crashing it, without breaking the bank, and without triggering Google's thin-content detectors? EmDash's multi-tenant SEO engine solves all three simultaneously.
The Architecture
EmDash uses a three-layer caching architecture that separates content generation from content delivery. The layers are:
| Layer | Purpose | Storage | TTL |
|---|---|---|---|
| L1 — Edge Cache | Serve cached HTML directly | Cloudflare KV | 24 hours |
| L2 — Rendered Cache | Pre-rendered HTML with hydration data | Cloudflare R2 | 7 days |
| L3 — Template + Data | Raw markdown templates + D1 queries | D1 + Workflows | Real-time |
When a request hits a location page (e.g., `/plumbers/chicago-il`), the edge Worker first checks Cloudflare KV for a cached HTML response. If found, the page is served immediately, with a median response time of 12ms. If missing, the request falls through to L2, which checks R2 for a pre-rendered page. If that also misses, the request reaches L3, where EmDash executes a Workflow that queries D1 for the tenant's template and data, renders the page, and writes it back through L2 into L1.
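This means D1 only sees traffic on cache misses. For a mature deployment serving 10,000 location pages, the cache hit rate stabilizes at 94% after the first 48 hours. Caching alone does not explain the full drop, since 6% of 90,000 is still 5,400 queries. Combined with the batched reads described below, D1 load falls from 90,000 queries per generation cycle to roughly 600, a 99.3% reduction.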
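The fall-through described above can be sketched as a read-through lookup. This is a minimal illustration, not EmDash's implementation: `edge` and `rendered` are hypothetical stand-ins for the KV and R2 tiers (any object with async `get` and `put`), `render` stands in for the D1-backed Workflow, and TTL handling is left out.

```javascript
// Read-through lookup across three tiers.
// `edge`, `rendered`, and `render` are hypothetical dependencies passed in
// by the caller; they are not EmDash or Cloudflare APIs.
async function getPage(path, { edge, rendered, render }) {
  // L1: edge cache
  const fromEdge = await edge.get(path);
  if (fromEdge != null) return { html: fromEdge, tier: "L1" };

  // L2: pre-rendered page; on a hit, repopulate L1
  const fromRendered = await rendered.get(path);
  if (fromRendered != null) {
    await edge.put(path, fromRendered);
    return { html: fromRendered, tier: "L2" };
  }

  // L3: render from template + data, then fill L2 and L1
  const html = await render(path);
  await rendered.put(path, html);
  await edge.put(path, html);
  return { html, tier: "L3" };
}

// In-memory store used only to exercise the sketch locally.
function memoryStore() {
  const data = new Map();
  return {
    get: async (key) => (data.has(key) ? data.get(key) : null),
    put: async (key, value) => {
      data.set(key, value);
    },
  };
}
```

Only the L3 branch touches the database, which is why the hit rate at L1 and L2 translates directly into fewer D1 queries.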
Template Inheritance and Content Variation
The content engine uses a three-tier template inheritance system:
```yaml
# Template Hierarchy
base:
  - templates/base/location.md.j2          # Global fallback
industry:
  - templates/industry/{vertical}.md.j2    # Industry-specific overrides
tenant:
  - tenants/{id}/location.md.j2            # Per-tenant custom content
```
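One plausible reading of this hierarchy is "most specific template wins". The sketch below assumes that behavior. If EmDash instead merges blocks across tiers, the lookup would differ. `resolveTemplate` and the `available` set of known paths are hypothetical names used for illustration.

```javascript
// Return the most specific template path that exists, checking
// tenant -> industry -> base. `available` is a Set of known template paths;
// how templates are actually discovered is an assumption of this sketch.
function resolveTemplate(available, { tenantId, vertical }) {
  const candidates = [
    `tenants/${tenantId}/location.md.j2`,
    `templates/industry/${vertical}.md.j2`,
    "templates/base/location.md.j2",
  ];
  const match = candidates.find((path) => available.has(path));
  if (!match) {
    throw new Error("No location template found, including the base fallback");
  }
  return match;
}
```

A tenant without a custom template falls back to its industry template, and an industry without an override falls back to the global base.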
Each location page is rendered by merging data from three sources:
1. **Static tenant data** (name, NAP, service area, hours) — stored in D1, queried once per tenant
2. **Dynamic location data** (city, state, zip, coordinates, nearby landmarks) — from a geocoding cache
3. **Content variables** (unique paragraphs, customer reviews, local events) — from a content pool
The content pool is the secret weapon. EmDash maintains a rotating set of 50-200 unique content blocks per industry per region. Each page selects a combination of blocks pseudo-randomly, seeded by the location ID, so the selection is deterministic and stable across deployments as long as the pool itself is unchanged. This makes duplicate body content across pages extremely unlikely while keeping the total content corpus manageable: depending on how many blocks each page draws, 200 blocks for plumbing in Illinois can yield on the order of 10^30 combinations, far more than the 10,000 pages we need.
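Seeded selection of this kind might look like the sketch below. The hash (FNV-1a style), the PRNG (mulberry32), and the partial Fisher-Yates shuffle are choices made for illustration. The article does not say which algorithms EmDash uses, and `pickBlocks` and `combinations` are hypothetical names.

```javascript
// FNV-1a-style hash: turn a string ID into a 32-bit unsigned seed.
function hashSeed(text) {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// mulberry32: small deterministic PRNG returning floats in [0, 1).
function mulberry32(a) {
  return function () {
    let t = (a += 0x6d2b79f5);
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Pick `count` distinct blocks using a partial Fisher-Yates shuffle.
// The same locationId and the same block order always give the same result.
function pickBlocks(blocks, locationId, count) {
  if (count > blocks.length) throw new RangeError("count exceeds pool size");
  const rand = mulberry32(hashSeed(String(locationId)));
  const pool = blocks.slice();
  for (let i = 0; i < count; i++) {
    const j = i + Math.floor(rand() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, count);
}

// Number of unordered k-block selections from n blocks, as a BigInt.
function combinations(n, k) {
  let result = 1n;
  for (let i = 1n; i <= BigInt(k); i++) {
    result = (result * (BigInt(n) - BigInt(k) + i)) / i;
  }
  return result;
}
```

Two caveats apply: the result depends on the order of `blocks`, so rotating the pool changes existing pages, and seeding makes collisions between pages improbable, not impossible. `combinations(200, k)` gives the size of the selection space for a given number of blocks per page.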
Database Optimization: Batch Everything
The single biggest mistake teams make is querying D1 row-by-row. D1 is SQLite-based and fast for individual queries, but 10,000 sequential queries each pay their own round trip, and that per-query overhead kills throughput. EmDash uses two strategies to avoid this:
Strategy 1: Bulk Materialized Views
```sql
-- Instead of one query per page:
SELECT * FROM locations WHERE tenant_id = ? AND slug = ?;

-- EmDash precomputes the join. D1 is SQLite-based, and SQLite has no
-- CREATE MATERIALIZED VIEW, so this sketch emulates one with a plain table.
DROP TABLE IF EXISTS mv_location_pages;
CREATE TABLE mv_location_pages AS
SELECT
  l.tenant_id,
  l.slug,
  t.name AS tenant_name,
  t.industry,
  l.city,
  l.state,
  l.latitude,
  l.longitude,
  l.content_seed
FROM locations l
JOIN tenants t ON l.tenant_id = t.id
WHERE t.active = 1;

-- Refresh every 6 hours via cron trigger by re-running the
-- DROP TABLE / CREATE TABLE statements above.
```
The materialized view flattens the join-heavy query into a single table that D1 can scan at 40,000 rows/second. A single `SELECT * FROM mv_location_pages` returns all 10,000 records in under 250ms.
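Because SQLite has no native materialized views, the six-hour refresh can be done by rebuilding a flattened table on a schedule. The sketch below assumes that approach and uses D1's `prepare()` and `batch()` from a Worker. The index name and the `refreshLocationPages` helper are hypothetical, and the cron wiring is shown only as a comment.

```javascript
// Statements that rebuild the flattened table. Table and column names follow
// the article; the index name is an assumption.
const REFRESH_STATEMENTS = [
  "DROP TABLE IF EXISTS mv_location_pages",
  `CREATE TABLE mv_location_pages AS
     SELECT l.tenant_id, l.slug, t.name AS tenant_name, t.industry,
            l.city, l.state, l.latitude, l.longitude, l.content_seed
     FROM locations l
     JOIN tenants t ON l.tenant_id = t.id
     WHERE t.active = 1`,
  `CREATE INDEX idx_mv_location_pages_tenant_slug
     ON mv_location_pages (tenant_id, slug)`,
];

// Rebuild the table with a single batch() call, so the statements run in
// order in one round trip to D1.
async function refreshLocationPages(db) {
  const statements = REFRESH_STATEMENTS.map((sql) => db.prepare(sql));
  return db.batch(statements);
}

// Possible wiring for a Cron Trigger such as "0 */6 * * *":
// export default {
//   async scheduled(controller, env, ctx) {
//     ctx.waitUntil(refreshLocationPages(env.DB));
//   },
// };
```

The index on `(tenant_id, slug)` keeps per-tenant and per-page lookups fast. It is dropped with the table and recreated on each refresh.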
Strategy 2: Connection-Less Reads
D1 is reached through a Worker binding, so a read never holds a long-lived connection. EmDash wraps every read in a Cloudflare Worker that executes the query and returns the result within a single HTTP request. Nothing stays open between requests, so reads cannot exhaust a connection pool.
```javascript
// EmDash D1 read wrapper
// generateContent() is assumed to be defined elsewhere in the module.
export async function getLocationPages(env, tenantId) {
  const { results } = await env.DB.prepare(
    `SELECT * FROM mv_location_pages WHERE tenant_id = ?`
  ).bind(tenantId).all();

  return results.map(row => ({
    ...row,
    // Content is deterministically generated from seed
    content: generateContent(row.content_seed, row.city, row.state)
  }));
}
```
Structured Data at Scale
Every location page gets a full LocalBusiness schema block. Generating 10,000 unique JSON-LD blocks inline would bloat each page by 3-5KB. Instead, EmDash uses a centralized schema registry:
```json
{
  "@context": "https://schema.org",
  "@type": "LocalBusiness",
  "@id": "{{ tenant.website }}/#business",
  "name": "{{ tenant.name }}",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "{{ location.street }}",
    "addressLocality": "{{ location.city }}",
    "addressRegion": "{{ location.state }}",
    "postalCode": "{{ location.zip }}"
  },
  "areaServed": "{{ location.service_area }}",
  "url": "{{ page.url }}"
}
```
The schema template is stored once in R2 and hydrated per-request with tenant and location variables. The Worker injects the JSON-LD into the page head before serving. Total schema overhead per page: ~800 bytes, compressed.
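Hydration by plain string substitution can break the JSON if a tenant name contains a quote. One hedged alternative is to build the object from data and serialize it. The field names mirror the template in this section, while `buildLocalBusinessSchema` and `toJsonLdScript` are hypothetical helpers, not EmDash APIs.

```javascript
// Build the LocalBusiness object from data instead of substituting strings
// into a JSON template, so quotes in a value cannot produce invalid JSON.
function buildLocalBusinessSchema(tenant, location, page) {
  return {
    "@context": "https://schema.org",
    "@type": "LocalBusiness",
    "@id": `${tenant.website}/#business`,
    name: tenant.name,
    address: {
      "@type": "PostalAddress",
      streetAddress: location.street,
      addressLocality: location.city,
      addressRegion: location.state,
      postalCode: location.zip,
    },
    areaServed: location.service_area,
    url: page.url,
  };
}

// Serialize for embedding in the page head. Escaping "<" keeps a value such
// as "</script>" from closing the tag early; JSON parsers read \u003c as "<".
function toJsonLdScript(schema) {
  const json = JSON.stringify(schema).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">${json}</script>`;
}
```

The Worker would then inject the returned string into the page head before serving the response.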
Monitoring and Alerts
EmDash tracks five key metrics for the multi-tenant SEO engine:
| Metric | Target | Alert Threshold |
|---|---|---|
| Edge cache hit rate | >90% | <80% |
| D1 queries per request | <2 | >5 |
| P95 page generation time | <200ms | >500ms |
| Materialized view refresh | <5s | >30s |
| Content pool uniqueness | >95% | <80% |
A custom Grafana dashboard displays these in real time, with PagerDuty alerts on the D1 query spike threshold. When a cache stampede happens (e.g., after a deployment that invalidates all caches), the system automatically throttles generation to 100 pages/min and issues a warning.
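A throttle of 100 pages/min can be approximated with a token bucket. This is a sketch under assumptions: the article does not describe EmDash's actual limiter, `createThrottle` is a hypothetical name, and the clock is injectable so the behavior can be checked without waiting.

```javascript
// Token bucket: up to `capacity` page renders per `windowMs`, refilled
// continuously. `now` defaults to Date.now and can be replaced in tests.
function createThrottle({ capacity = 100, windowMs = 60000, now = Date.now } = {}) {
  let tokens = capacity;
  let last = now();
  return function tryAcquire() {
    const current = now();
    // Refill in proportion to elapsed time, never above capacity.
    tokens = Math.min(capacity, tokens + ((current - last) / windowMs) * capacity);
    last = current;
    if (tokens >= 1) {
      tokens -= 1;
      return true; // caller may render this page now
    }
    return false; // caller should queue the render or serve a stale copy
  };
}
```

In-memory state in a Worker is local to one isolate, so a production limiter would need shared state (for example a Durable Object) to enforce a global rate.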
Results in Production
After deploying the three-layer cache with materialized views, a production EmDash instance serving 10,000 location pages saw:
- **D1 reads dropped from 89,400/hour to 574/hour**: a 99.36% reduction
- **Median TTFB dropped from 1.8s to 48ms**: a 97.3% improvement
- **Monthly D1 cost fell from $47 to $0.38**: write operations now dominate what remains of the bill
- **Google organic traffic grew 340% in 90 days**: unique content passed the helpful content system
- **Zero 504 errors from D1 connection pool exhaustion**: previously averaged 12-18 per day
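The percentages in this list follow directly from the before and after figures. A small helper (hypothetical, for checking the arithmetic only) makes that explicit:

```javascript
// Percentage reduction from `before` to `after`.
function percentReduction(before, after) {
  return ((before - after) / before) * 100;
}

console.log(percentReduction(89400, 574).toFixed(2)); // "99.36" (D1 reads per hour)
console.log(percentReduction(1800, 48).toFixed(1));   // "97.3"  (median TTFB in ms)
console.log(percentReduction(47, 0.38).toFixed(1));   // "99.2"  (monthly D1 cost in USD)
```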
The key insight is that multi-tenant SEO at scale is not a content problem — it is a caching and data architecture problem. Once you separate generation from delivery, 10,000 pages is not much harder than 10.
Getting Started
To enable the multi-tenant SEO engine in your EmDash deployment:
```bash
# Enable the SEO module
emdash config set seo.multi_tenant.enabled true

# Set your content pool directory
emdash config set seo.content_pool ./content-pools/

# Build the materialized view
emdash db materialize mv_location_pages

# Warm the cache
emdash cache warm --route "/:industry/:city-:state"
```
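The route pattern `/:industry/:city-:state` has one subtlety: city slugs can contain hyphens. A sketch of a parser, assuming lowercase slugs and two-letter state codes (neither is stated in the article, and `parseLocationPath` is a hypothetical name):

```javascript
// Parse "/plumbers/chicago-il" into its parts. The state is the two letters
// after the last hyphen, so multi-word cities such as "new-york" still parse.
function parseLocationPath(pathname) {
  const match = /^\/([a-z0-9-]+)\/([a-z0-9-]+)-([a-z]{2})\/?$/.exec(pathname);
  if (!match) return null;
  const [, industry, city, state] = match;
  return { industry, city, state };
}
```

Paths that do not fit the pattern return `null`, which a Worker could treat as a 404 before touching any cache tier.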
The multi-tenant SEO engine is included in EmDash Pro and Enterprise tiers. A community edition with support for up to 100 location pages is available for self-hosted deployments.