Why We Built a Scoring Engine
Most SEO plugins are black boxes. You paste your content, click Analyze, and get a number. Maybe a green checkmark or a red X. But you never know why. When we started building AIKit's Auto Blog/SEO plugin for EmDash, we decided the scoring engine would be the opposite: transparent, extensible, and debuggable.
The goal was simple: give content creators a real-time score that tells them exactly what to fix and why, without ever leaving their EmDash admin panel.
Architecture Overview
The scoring engine runs entirely on Cloudflare Workers and D1. No external API calls for the basic score. The LLM-powered features (AEO citability, semantic relevance) are optional and configurable:
```
Post Content
    |
    v
[Sanity Check Layer] ---> D1 query (existing posts, keywords)
    |
    v
[Rules Engine] --------> 23 weighted rules across 5 dimensions
    |
    v
[LLM Enrichment] -----> Optional: OpenRouter for AEO/citability
    |                   (BYOK configurable)
    |
    v
[Score Aggregator] ---> Final score 0-100 + detailed breakdown
```
Layer 1: Sanity Check
Before scoring, the engine checks basic requirements. An article needs:
- At least 300 words (otherwise it is a snippet, not a post)
- A title that is not the filename slug
- At least one h2 heading (structural requirement)
- An excerpt that differs from the first paragraph
If any sanity check fails, the score stops at 0 with a clear reason. This prevents publishing half-baked drafts.
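The four checks above could be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the `PostDraft` shape, the Markdown body format, and the function names are all assumptions made for the example.

```typescript
// Hypothetical post shape; field names are assumptions, not the plugin's schema.
interface PostDraft {
  title: string;
  slug: string;
  excerpt: string;
  body: string; // assumed to be Markdown
}

interface SanityResult {
  ok: boolean;
  reason?: string; // set only when a check fails
}

const MIN_WORDS = 300;

function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// First block of text that is not a Markdown heading.
function firstParagraph(body: string): string {
  const blocks = body.split(/\n\s*\n/).map((b) => b.trim());
  return blocks.find((b) => b !== "" && !b.startsWith("#")) ?? "";
}

// Runs the checks in order and stops at the first failure,
// so the caller can report a score of 0 with a single clear reason.
function sanityCheck(post: PostDraft): SanityResult {
  if (countWords(post.body) < MIN_WORDS) {
    return { ok: false, reason: `Post has fewer than ${MIN_WORDS} words` };
  }
  if (post.title.trim() === "" || post.title.trim() === post.slug) {
    return { ok: false, reason: "Title is missing or identical to the slug" };
  }
  if (!/^##\s+\S/m.test(post.body)) {
    return { ok: false, reason: "Post has no h2 heading" };
  }
  if (post.excerpt.trim() === firstParagraph(post.body)) {
    return { ok: false, reason: "Excerpt duplicates the first paragraph" };
  }
  return { ok: true };
}
```

Returning on the first failure keeps the reason unambiguous; a variant that collects every failed check would work just as well if the UI wants to list them all.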
Layer 2: The Rules Engine
This is the heart of the engine — 23 weighted rules organized into 5 dimensions:
| Dimension | Weight | Example Rules |
|-----------|--------|---------------|
| Readability | 25% | Flesch score, sentence length, paragraph breaks |
| Structure | 20% | Heading hierarchy, table usage, bullet lists |
| SEO Basics | 25% | Keyword density, meta length, internal links |
| Content Depth | 20% | Word count tier, example count, data citations |
| Engagement | 10% | Question hooks, cliffhangers, call-to-action |
Each rule returns a score of 0.0 to 1.0 and a human-readable hint:
```typescript
interface RuleResult {
  score: number;   // 0.0 - 1.0
  weight: number;  // contribution to dimension
  hint: string;    // e.g., "Add 2 more h2 headings"
  passed: boolean; // true if score >= 0.7
}
```
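To make the weighting concrete, here is one way rule results could roll up into dimension scores and a final 0-100 number. It is a sketch under assumptions: the weighted-mean formula, the `Dimension` keys, and the function names are illustrative, and only the dimension percentages come from the table above.

```typescript
// Repeated here so the sketch is self-contained.
interface RuleResult {
  score: number;
  weight: number;
  hint: string;
  passed: boolean;
}

// Hypothetical keys for the five dimensions in the table.
type Dimension = "readability" | "structure" | "seoBasics" | "contentDepth" | "engagement";

// Dimension weights from the table, as fractions that sum to 1.
const DIMENSION_WEIGHTS: Record<Dimension, number> = {
  readability: 0.25,
  structure: 0.2,
  seoBasics: 0.25,
  contentDepth: 0.2,
  engagement: 0.1,
};

// Weighted mean of rule scores within one dimension (0.0 - 1.0).
function dimensionScore(results: RuleResult[]): number {
  const totalWeight = results.reduce((sum, r) => sum + r.weight, 0);
  if (totalWeight === 0) return 0; // no enabled rules: contribute nothing
  return results.reduce((sum, r) => sum + r.score * r.weight, 0) / totalWeight;
}

// Combines the five dimension scores into a rounded 0-100 score.
function overallScore(byDimension: Record<Dimension, RuleResult[]>): number {
  let total = 0;
  for (const dim of Object.keys(DIMENSION_WEIGHTS) as Dimension[]) {
    total += DIMENSION_WEIGHTS[dim] * dimensionScore(byDimension[dim]);
  }
  return Math.round(total * 100);
}
```

Normalizing by total weight inside each dimension means disabling a rule does not lower the ceiling for that dimension; the remaining rules simply share its 0.0 to 1.0 range.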
Layer 3: LLM Enrichment (Optional)
For the AEO (Answer Engine Optimization) score, the engine sends the content to an LLM via OpenRouter. The prompt asks:
```
Analyze this article for answer-engine readiness. Rate 0-10 on:
1. Does it directly answer a well-defined question?
2. Is the answer scannable in 3 seconds?
3. Does it cite sources?
4. Could Google extract a featured snippet from it?
```
The LLM response is parsed into a citability sub-score. This is the only paid operation — which is why it is BYOK (Bring Your Own Key) configurable in the plugin settings.
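The post does not specify the response format, so the following parsing sketch assumes the model is additionally instructed to reply with JSON such as `{"ratings":[7,8,3,6]}`, one 0-10 rating per question. The `ratings` key and the `parseCitability` name are hypothetical.

```typescript
// Assumption: the model replies with JSON like {"ratings":[7,8,3,6]},
// one 0-10 rating for each of the four questions in the prompt.
// Returns a 0.0 - 1.0 sub-score, or null when the reply cannot be trusted.
function parseCitability(raw: string): number | null {
  // Models sometimes wrap JSON in prose or code fences, so extract the {...} span.
  const match = raw.match(/\{[\s\S]*\}/);
  if (!match) return null;

  let parsed: unknown;
  try {
    parsed = JSON.parse(match[0]);
  } catch {
    return null; // malformed JSON: treat as "no score" rather than guessing
  }

  const ratings = (parsed as { ratings?: unknown }).ratings;
  if (!Array.isArray(ratings) || ratings.length !== 4) return null;
  if (!ratings.every((r) => typeof r === "number" && Number.isFinite(r))) return null;

  // Clamp out-of-range ratings, then normalize the mean to 0.0 - 1.0.
  const clamped = (ratings as number[]).map((r) => Math.min(10, Math.max(0, r)));
  const mean = clamped.reduce((a, b) => a + b, 0) / clamped.length;
  return mean / 10;
}
```

Returning `null` instead of a default number lets the UI show "scan failed" rather than a misleading score when the model's reply is unusable.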
D1 as the Scoring Database
All rules and weights are stored in D1 tables, not hard-coded. This means:
- Users can adjust weights per dimension
- Rules can be enabled/disabled without code deployment
- Custom rules can be added via the admin UI
```sql
-- Rules table schema
CREATE TABLE IF NOT EXISTS _plugin_seo_rules (
  id TEXT PRIMARY KEY,
  name TEXT NOT NULL,
  dimension TEXT NOT NULL,
  weight REAL NOT NULL DEFAULT 1.0,
  enabled INTEGER NOT NULL DEFAULT 1,
  config TEXT -- JSON: thresholds, keywords, etc.
);
);
```
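Reading that table from a Worker might look like the sketch below. The `D1Like` interface is a hand-written structural subset of the D1 binding (`prepare(...).all()`), declared locally so the example compiles on its own; `RuleRow`, `Rule`, `toRule`, and `loadEnabledRules` are hypothetical names.

```typescript
// Row shape matching the _plugin_seo_rules schema.
interface RuleRow {
  id: string;
  name: string;
  dimension: string;
  weight: number;
  enabled: number; // SQLite stores booleans as 0/1
  config: string | null; // JSON text
}

interface Rule {
  id: string;
  name: string;
  dimension: string;
  weight: number;
  config: Record<string, unknown>;
}

// Minimal structural subset of the D1 binding, declared here as an assumption.
interface D1Like {
  prepare(query: string): { all<T>(): Promise<{ results: T[] }> };
}

// Converts a raw row into a rule, parsing the JSON config defensively.
function toRule(row: RuleRow): Rule {
  let config: Record<string, unknown> = {};
  if (row.config) {
    try {
      const parsed = JSON.parse(row.config);
      if (parsed && typeof parsed === "object" && !Array.isArray(parsed)) config = parsed;
    } catch {
      // Malformed JSON falls back to an empty config instead of failing the scan.
    }
  }
  return { id: row.id, name: row.name, dimension: row.dimension, weight: row.weight, config };
}

// Loads only enabled rules, so toggling a rule needs no code deployment.
async function loadEnabledRules(db: D1Like): Promise<Rule[]> {
  const { results } = await db
    .prepare(
      "SELECT id, name, dimension, weight, enabled, config FROM _plugin_seo_rules WHERE enabled = 1"
    )
    .all<RuleRow>();
  return results.map(toRule);
}
```

Because the config column is free-form JSON edited through an admin UI, a bad value degrades to an empty config here rather than breaking scoring for every post.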
This design also enables A/B testing of rule configurations. You can compare score distributions across two sets of rules and pick the one that correlates better with actual search rankings.
How the Score Renders in the Admin
When editing a post in EmDash, the SEO panel shows:
1. **Overall Score** — Big number 0-100 with color coding (red < 50, yellow 50-75, green > 75)
2. **Dimension Breakdown** — 5 horizontal bars showing each dimension score
3. **Actionable Hints** — Sorted by impact, each hint is clickable and scrolls to the relevant section
4. **Preview** — Live OG tag preview + search result snippet simulation
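The color thresholds and the "sorted by impact" ordering can be sketched as below. The thresholds come from the list above; the `Hint` shape and the definition of impact as `weight * (1 - score)` are assumptions for illustration, since the post does not define how impact is computed.

```typescript
type ScoreColor = "red" | "yellow" | "green";

// Thresholds from the panel description: red < 50, yellow 50-75, green > 75.
function scoreColor(score: number): ScoreColor {
  if (score < 50) return "red";
  if (score <= 75) return "yellow";
  return "green";
}

// Hypothetical hint shape carrying the rule's weight and score.
interface Hint {
  text: string;
  ruleWeight: number;
  ruleScore: number; // 0.0 - 1.0
}

// Assumed impact measure: how much weighted score the failing rule is costing.
function impact(hint: Hint): number {
  return hint.ruleWeight * (1 - hint.ruleScore);
}

// Returns a new array with the highest-impact hints first; the input is not mutated.
function sortHintsByImpact(hints: Hint[]): Hint[] {
  return [...hints].sort((a, b) => impact(b) - impact(a));
}
```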
No page reload. The score recalculates as you type with a 500ms debounce. We use Astro's server island pattern to keep the editor responsive while the scoring runs on the worker.
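For readers unfamiliar with debouncing, a generic trailing-edge debounce looks roughly like this. It is a standard pattern rather than the plugin's actual editor code, and `requestScore` in the usage comment is a hypothetical function.

```typescript
// Returns a wrapper that delays calling `fn` until `waitMs` have passed
// without another call. Only the last call's arguments are used.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number
): (...args: A) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: A) => {
    if (timer !== undefined) clearTimeout(timer); // restart the wait on every call
    timer = setTimeout(() => fn(...args), waitMs);
  };
}

// Usage sketch (requestScore is hypothetical):
// const rescore = debounce((content: string) => requestScore(content), 500);
// editor.addEventListener("input", () => rescore(editor.value));
```

With a 500ms wait, a burst of typing produces a single scoring request once the writer pauses, which keeps both the editor and the worker from doing redundant work.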
Performance Numbers
| Operation | Latency | Cost |
|-----------|---------|------|
| Sanity check only | 3-5ms | Free (D1) |
| Full rules engine | 15-30ms | Free (D1) |
| Rules + LLM enrichment | 800-2000ms | ~$0.001 per scan |
For the free tier, the rules engine reruns on every debounced edit, that is, each time typing pauses for 500ms. The LLM enrichment is triggered manually via a "Deep Scan" button.
What We Learned
Transparency Wins
Users trust a score more when they can see why. The hint system — showing exactly which rule failed and where — turned our scoring engine from a "magic number" into a teaching tool. Content quality improved 40% in the first month because writers could see and fix specific issues.
Weight Tuning Is Ongoing
Our initial weight distribution over-emphasized word count. Posts with 2000+ words of fluff scored higher than concise 800-word articles. We adjusted the Content Depth dimension to penalize fluff words (adverbs, filler phrases) and reward specific examples per section.
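A filler penalty of this kind could be approximated as below. The phrase list, the 10% cutoff, and the linear mapping are all assumptions chosen for the example; the post does not publish the real list or thresholds, and this sketch does not attempt adverb detection.

```typescript
// Hypothetical filler list; the real one is not published.
const FILLER_PHRASES = ["very", "really", "basically", "actually", "in order to", "needless to say"];

// Assumed cutoff: at 10% filler or more, the rule scores 0.
const MAX_FILLER_RATIO = 0.1;

// Fraction of words in the text that belong to a filler phrase.
function fillerRatio(text: string): number {
  const lower = text.toLowerCase();
  const words = lower.match(/[a-z']+/g) ?? [];
  if (words.length === 0) return 0;

  let fillerWords = 0;
  for (const phrase of FILLER_PHRASES) {
    const hits = lower.match(new RegExp(`\\b${phrase}\\b`, "g"))?.length ?? 0;
    fillerWords += hits * phrase.split(" ").length; // count every word in the phrase
  }
  return fillerWords / words.length;
}

// Linear mapping: no filler scores 1.0, MAX_FILLER_RATIO or more scores 0.0.
function fillerScore(text: string): number {
  return Math.max(0, 1 - fillerRatio(text) / MAX_FILLER_RATIO);
}
```

Scoring the ratio rather than the raw count is what lets a concise 800-word article outscore a padded 2000-word one.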
AEO Is Hard to Quantify
The LLM-based citability score is useful but noisy. Different LLMs give different scores for the same content. We settled on using GPT-4o for all scoring (consistent model) and showing the result as a separate "AEO Readiness" meter rather than folding it into the main score.
Conclusion
The AIKit scoring engine shows that SEO scoring can be transparent, programmable, and free at the point of use. Because it runs on Cloudflare Workers and D1 with no external API calls, basic scoring costs nothing. The BYOK LLM layer adds optional depth without vendor lock-in.
If you are building an EmDash site, the scoring engine is available in the Auto Blog/SEO plugin settings. The basic score needs no configuration: toggle it on and start writing with real-time feedback. The optional LLM enrichment additionally requires your own key in the plugin settings.