In January 2026, we made a decision: every piece of content on the AIKit blog would be generated by our own auto-blog-seo EmDash plugin. No human-written blog posts. No manual drafting. No outsourcing. The plugin we built for our users would be the only author on this site.
Three months and 12 posts later, here's the unvarnished truth — what worked, what broke, and what we learned about eating our own dog food.
## Why Dog-Food?
Dog-fooding — using your own product internally — is the oldest quality assurance trick in the book. If your plugin breaks your own site, you fix it fast. If your workflow is painful, you improve it. If the output is low quality, you feel the shame directly.
| Benefit | Description |
|---------|-------------|
| Real-world testing | Users find edge cases you never imagined |
| Feature prioritization | You feel the pain of missing features first |
| Marketing credibility | "We use what we sell" builds trust |
| Rapid iteration | Bug reports come from your own team at 2AM |
| Dog-food culture | Sets expectation that QA is everyone's job |
But there's a catch: if your product genuinely isn't ready, dog-fooding backfires. You're publishing low-quality output under your own brand name.
## The Setup
Our auto-blog-seo plugin generates blog posts from a content calendar, researches topics via web search, and produces structured JSON that EmDash inserts directly into Cloudflare D1. The pipeline:
```
Content Calendar (md) → Hermes Agent (AI write) → JSON queue → D1 insert → Publish
```
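In code, that flow is just a chain of stages feeding a queue. Here is a minimal, self-contained sketch — the stage functions and the `Post` shape are stand-ins, not the plugin's real API:

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class Post:
    title: str
    body_md: str
    excerpt: str

# Stub stages; in the real pipeline these call the Hermes agent and Cloudflare D1.
def draft_with_agent(topic: str) -> str:
    return f"# {topic}\n\nDraft body..."

def to_post(topic: str, draft: str) -> Post:
    # Structured JSON record: title, markdown body, short excerpt.
    return Post(title=topic, body_md=draft, excerpt=draft.splitlines()[-1][:160])

def enqueue(post: Post, queue: list) -> None:
    queue.append(json.dumps(asdict(post)))  # JSON queue; D1 insert happens downstream

def run_pipeline(topic: str, queue: list) -> Post:
    draft = draft_with_agent(topic)
    post = to_post(topic, draft)
    enqueue(post, queue)
    return post
```

The useful property of this shape is that each stage is independently testable — which mattered a lot once the fact-checking step was bolted in.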
Key constraints:
- Posts must be 800-1500 words with proper headings, code blocks, and tables
- SEO-optimized with natural keyword placement
- Each post reviewed only for factual accuracy (not rewritten)
- No human copyediting: what the AI writes is what gets published
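The structural constraints above are mechanical enough to check automatically before a post enters the queue. A minimal validator sketch — the exact rules the plugin enforces are ours, so treat these checks as assumptions:

```python
import re

def validate_post(body_md: str) -> list[str]:
    """Check a drafted post against the publishing constraints.

    Hypothetical checks; the real plugin's rules may differ.
    """
    problems = []
    words = len(body_md.split())
    if not 800 <= words <= 1500:
        problems.append(f"word count {words} outside 800-1500")
    if not re.search(r"^#{1,6} ", body_md, re.M):
        problems.append("no headings")
    if "```" not in body_md:
        problems.append("no code block")
    if not re.search(r"^\|.+\|$", body_md, re.M):
        problems.append("no table")
    return problems
```

An empty list means the draft clears the structural bar; anything else sends it back for regeneration.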
## Month 1: The Honeymoon
**Posts published:** 4
**Issues:** 2
**Score:** B+
The first month was surprisingly smooth. The plugin handled straightforward tutorial content well — posts about SEO mistakes, WordPress migration, and Open Graph tags came out clean. The code blocks were syntactically correct, the tables were well-formed, and the excerpts were punchy.
**What went wrong:**
1. **Over-optimized keywords.** The first post had the phrase "SEO mistakes" 14 times in 900 words. Google probably wouldn't penalize it outright (thin-content signals, maybe), but it read like a keyword-stuffed blog post from 2012. We tuned the prompt to prioritize natural language.
2. **Generic advice.** The second post about auto-generating blog posts was technically accurate but read like every other "how AI writes content" article. It lacked the specific, opinionated details that make dog-fooding content valuable.
**Lesson:** Direct the AI to include specific numbers from our own experience. Vague advice is worse than no advice.
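Catching the over-optimization in point 1 is simple arithmetic: 14 uses in 900 words is about 1.6 occurrences per 100 words. A sketch of a density check — the 1.0 threshold is our guess at "natural", not a published rule:

```python
def keyword_density(text: str, phrase: str) -> float:
    """Occurrences of `phrase` per 100 words of `text` (case-insensitive)."""
    words = len(text.split())
    hits = text.lower().count(phrase.lower())
    return 100.0 * hits / max(words, 1)

def is_stuffed(text: str, phrase: str, max_density: float = 1.0) -> bool:
    # 14 uses in 900 words is ~1.56 per 100 words -- above a 1.0 threshold.
    return keyword_density(text, phrase) > max_density
```

A check like this runs on the draft before it reaches the queue; a failing draft gets regenerated with a "reduce keyword repetition" instruction appended to the prompt.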
## Month 2: The Reality Check
**Posts published:** 4
**Issues:** 5
**Score:** B-
Month 2 was where dog-fooding earned its keep. Two posts needed significant factual corrections before publishing — one about Cloudflare Workers pricing (we cited wrong plan tiers) and one about schema markup (deprecated recommendation).
**Biggest failure:** The plugin generated a post about SEO score vs AEO citability. The concept was sound, but the AI hallucinated a nonexistent "AEO scoring tool" and attributed it to a real company. If we hadn't caught it during review, that would've been a credibility disaster.
| Issue | Root Cause | Fix Applied |
|-------|-----------|-------------|
| Wrong Cloudflare pricing | AI used outdated web data | Added web search freshness constraint |
| Hallucinated AEO tool | AI conflated concepts | Added fact-check step to pipeline |
| Deprecated schema rec | AI trained on pre-2025 data | Added schema.org version filter |
| Flat headline | AI defaults to safe titles | Added headline scoring rubric |
**Lesson:** AI-generated content about technical topics needs a fact-checking layer. We added a verification step that cross-references claims against official documentation URLs before inserting into D1.
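One cheap slice of that verification step is a source allowlist: flag any cited URL whose host isn't official documentation. The allowlist below is hypothetical, and real fact-checking needs more than URL filtering, but it's the first gate:

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist; the real pipeline's trusted sources may differ.
OFFICIAL_DOCS = {"developers.cloudflare.com", "schema.org"}

def extract_urls(text: str) -> list[str]:
    return re.findall(r"https?://[^\s)\"']+", text)

def unverified_claims(post_md: str) -> list[str]:
    """Return cited URLs whose host is not on the official-docs allowlist."""
    bad = []
    for url in extract_urls(post_md):
        host = urlparse(url).hostname or ""
        exact = host in OFFICIAL_DOCS
        subdomain = host.endswith(tuple("." + d for d in OFFICIAL_DOCS))
        if not (exact or subdomain):
            bad.append(url)
    return bad
```

Any URL this returns blocks the D1 insert until a human confirms or replaces the source.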
## Month 3: The System Matures
**Posts published:** 4
**Issues:** 1 (minor formatting glitch in a code block)
**Score:** A-
By month 3, the pipeline had stabilized. The fact-checking step caught two potential hallucinations before they reached the queue. The code blocks rendered correctly. The keyword density settled into a natural rhythm.
**What improved:**
1. **Voice consistency.** The posts started sounding like they were written by the same person. Earlier posts had wild swings in tone — one was casual, the next was academic. Prompt tuning with explicit voice guidelines fixed this.
2. **Content depth.** Early posts were 800-900 words with surface-level coverage. By month 3, posts were averaging 1200 words with deeper technical dives, real code examples, and actionable checklists.
3. **Reader engagement.** Time on page increased from 1:45 (month 1 average) to 3:12 (month 3 average). The longer, more specific posts kept readers reading.
## Key Metrics: 3 Months of Dog-Fooding
Here's what the numbers looked like:
| Metric | Month 1 | Month 2 | Month 3 |
|--------|---------|---------|---------|
| Posts published | 4 | 4 | 4 |
| Average word count | 856 | 1,047 | 1,212 |
| Factual errors caught | 1 | 3 | 2 |
| Hallucinations caught | 0 | 1 | 2 |
| Posts requiring rewrite | 1 | 2 | 0 |
| Avg time on page | 1:45 | 2:18 | 3:12 |
| Organic impressions | 840 | 2,100 | 4,700 |
## 5 Lessons for Anyone Dog-Fooding an AI Content Tool
### 1. You Can't Skip Human Review (Yet)
Dog-fooding doesn't mean fully autonomous. We still review every post for factual accuracy. The difference is that review now takes 5 minutes instead of 2 hours — we're checking, not writing.
### 2. Hallucinations Decrease With Prompt Engineering
Most hallucinations happen when the prompt is too vague. Specific constraints ("only cite sources from the official docs", "use version X or newer") dramatically reduce fabrication.
### 3. The Second Month Is the Hardest
The first month has novelty energy. By month two, you've exhausted the easy topics and the AI starts generating iffy content on edge-case subjects. Push through — month three is where the compounding improvements show.
### 4. Specificity Is the Only Differentiator
Generic AI content is everywhere. The only reason to read *your* post is if it contains specific numbers, specific architecture decisions, and specific failures. Embrace the ugly details.
### 5. Build Feedback Into the Pipeline
Every time you catch an error, log it. Every time you rewrite a section, log the change. Every time a reader bounces in under 30 seconds, log the post. That feedback loop is the actual product you're building — the blog posts are just the output.
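The logging in lesson 5 doesn't need infrastructure to start — an append-only event record is enough. A sketch with hypothetical event names (in production this would likely be a D1 table rather than an in-memory list):

```python
import time
from collections import Counter

FEEDBACK_LOG = []  # in production: a D1 table or similar

def log_feedback(post_slug: str, event: str, detail: str = "") -> dict:
    """Record one feedback event, e.g. 'error_caught', 'rewrite', 'fast_bounce'."""
    entry = {"ts": time.time(), "post": post_slug, "event": event, "detail": detail}
    FEEDBACK_LOG.append(entry)
    return entry

def posts_needing_attention(min_events: int = 2) -> list[str]:
    """Slugs with repeated negative events -- candidates for prompt tuning."""
    counts = Counter(e["post"] for e in FEEDBACK_LOG)
    return [slug for slug, n in counts.items() if n >= min_events]
```

The query side is the point: once events accumulate, "which posts keep failing, and why" becomes a one-liner instead of a guess.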
## What's Next
We're continuing the dog-fooding indefinitely. Month 4 adds two new features:
1. **Automated fact-checking** via web search cross-referencing for every factual claim
2. **Style scoring** — the AI scores each post against our editorial guidelines before queueing
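To make "style scoring" concrete, here is a toy version of a headline rubric. The criteria and weights are illustrative assumptions — our actual editorial guidelines are more involved:

```python
def headline_score(title: str) -> int:
    """Toy rubric (criteria are assumptions): reward concrete numbers, an
    opinionated power word, and a scannable length; penalize flat openers."""
    words = title.split()
    score = 0
    if any(w.isdigit() for w in words):
        score += 2                      # concrete numbers
    if any(w.lower() in {"mistakes", "lessons", "truth", "failures"} for w in words):
        score += 2                      # opinionated power word
    if 6 <= len(words) <= 12:
        score += 1                      # scannable length
    if title.lower().startswith(("an introduction", "a guide")):
        score -= 2                      # flat default opener
    return score
```

A draft whose title scores below some threshold would be sent back for another pass before queueing.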
Dog-fooding isn't a one-month experiment. It's a permanent feedback loop. If the auto-blog-seo plugin can sustain a real blog with real readers for 12 months, it will be ready for anyone.