## The Publishing Bottleneck

When AIKit launched, publishing a single blog post required: writing 1000+ words, formatting Portable Text JSON, running the publisher script, verifying the live URL, and updating the content calendar. At 3 posts per week, this consumed 6-8 hours of human time. Scaling to 12 posts per week would have required a full-time content manager.

Instead, we built an autonomous pipeline that handles the entire lifecycle -- from topic generation to live publication -- without human intervention. Here is how the architecture works and what it took to get there.

## Pipeline Architecture

The pipeline spans four stages, each automated by a dedicated Python script and cron trigger:

```
Calendar --> Generator --> Queue --> Publisher --> Live
   ^                                     |
   +---------- Stats Feedback -----------+
```

1. **Calendar**: A markdown file listing topics, themes, and rotation rules. Updated automatically after each publishing round.

2. **Generator**: A cron-driven Python script that reads the calendar, generates body text using LLM prompts, and writes validated JSON queue files.

3. **Queue**: A directory of `NN-slug.json` files sorted alphabetically. Each file contains title, body_text, excerpt, category, and tags (see the sample file after this list).

4. **Publisher**: A wrapper script that takes the first queue file, calls `blog-publisher.py` to insert into D1, archives the file, and updates the calendar.
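
As a concrete illustration, here is what writing one queue file might look like. The field values and filename are invented, but the five fields match the schema above, and `json.dump()` is the same escaping-safe write the generator relies on (see Handling Failures below):

```python
import json

# Illustrative queue entry -- values are made up, but the five fields
# match the queue schema described above.
post = {
    "title": "Automating a Content Pipeline with Cron and D1",
    "body_text": "When AIKit launched...",  # full 800-1500 word markdown body
    "excerpt": "How a solo developer ships 12 posts a week.",
    "category": "Marketing Automation",
    "tags": ["automation", "python", "d1"],
}

# json.dump() handles quoting and escaping, which is how the generator
# avoids the string-template quoting bug described under Handling Failures.
with open("queue/01-automating-a-content-pipeline.json", "w") as f:
    json.dump(post, f, indent=2)
```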

## The Generator: Where Content Happens

The generator is the most complex component. It uses a theme rotation to ensure diverse content: Content/Growth, Marketing Automation, Sales Channel, and Hybrid Dev+Marketing. Each theme rotates through five project focuses: AIKit, CCFish, AiSalonHub, PlayableAdStudio, and DeFiKit.
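
A minimal sketch of the rotation logic, assuming a simple round-robin over both lists (the real generator's pairing logic may differ):

```python
from itertools import cycle

THEMES = ["Content/Growth", "Marketing Automation", "Sales Channel", "Hybrid Dev+Marketing"]
PROJECTS = ["AIKit", "CCFish", "AiSalonHub", "PlayableAdStudio", "DeFiKit"]

# 4 themes x 5 projects: round-robin pairing covers all 20 combinations
# before repeating, since gcd(4, 5) == 1.
_theme_iter, _project_iter = cycle(THEMES), cycle(PROJECTS)

def next_topic():
    """Return the next (theme, project) pair in rotation."""
    return next(_theme_iter), next(_project_iter)
```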

Generation follows a multi-pass pattern. The first pass produces around 400-600 words on a topic. A second pass adds technical depth sections. A third pass covers implementation details and edge cases. Each pass is validated for word count and content type before the next begins.

```python
def generate_post(theme, project):
    # First pass: core content (~400-600 words)
    body = generate_first_pass(theme, project)

    if len(body.split()) < 800:
        # Second pass: technical deep-dive
        body += expand_technical_section(theme, project)

    if len(body.split()) < 800:
        # Third pass: edge cases and extensions
        body += expand_use_cases()

    return validate(body)
```

## Queue Management

The queue directory maintains 2-5 pending posts at all times. When the publisher successfully processes a file, it checks the queue count. If only 1 or fewer posts remain, the generator creates 2 new ones immediately. This just-in-time refill ensures the pipeline never stalls while keeping the queue small enough to avoid wasted content that becomes stale.
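
A sketch of the refill check, where `next_topic()` is the rotation helper sketched earlier and `generate_and_enqueue()` is a hypothetical stand-in for the generator's entry point:

```python
from pathlib import Path

QUEUE_DIR = Path("queue")
MIN_PENDING = 2   # refill once 1 or fewer posts remain
REFILL_COUNT = 2  # generate two new posts per refill

def refill_queue_if_needed():
    pending = sorted(QUEUE_DIR.glob("*.json"))  # NN- prefixes sort alphabetically
    if len(pending) < MIN_PENDING:
        for _ in range(REFILL_COUNT):
            theme, project = next_topic()          # rotation shown earlier
            generate_and_enqueue(theme, project)   # hypothetical generator entry point
```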

Each queue file goes through validation before acceptance:

```bash
# Validate JSON structure
python3 -c "import json; json.load(open('queue/NN-slug.json'))"

# Verify body_text is a string, not an array
python3 -c "import json; d=json.load(open('queue/NN-slug.json')); assert isinstance(d['body_text'], str)"

# Check word count is 800-1500
python3 -c "import json; d=json.load(open('queue/NN-slug.json')); w=len(d['body_text'].split()); assert 800<=w<=1500, f'{w} words'"
```

## The Publisher: Zero-Config D1 Insertion

The publisher script converts the markdown body text to Sanity Portable Text JSON -- the rich-content format EmDash uses. It generates a ULID-style ID, computes a URL-safe slug, and executes a four-step D1 insert sequence (sketched after the list):

1. Insert the post into `ec_posts` with null revision references

2. Insert a content revision into `revisions` referencing the post ID

3. Update the post with the revision IDs to resolve the circular foreign key

4. Insert SEO metadata into `_emdash_seo` for search engine visibility
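
A Python-flavored sketch of the sequence follows. The table names come from the steps above, but the column names and the `d1_execute()` helper are assumptions about how `blog-publisher.py` actually reaches D1; `to_portable_text()` is sketched below.

```python
import re

def slugify(title: str) -> str:
    """Compute a URL-safe slug: lowercase, alphanumerics and hyphens only."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def publish(post: dict, post_id: str, rev_id: str) -> str:
    slug = slugify(post["title"])
    # 1. Insert the post with null revision references
    d1_execute(
        "INSERT INTO ec_posts (id, slug, title, revision_id) VALUES (?, ?, ?, NULL)",
        [post_id, slug, post["title"]],
    )
    # 2. Insert the content revision referencing the post ID
    d1_execute(
        "INSERT INTO revisions (id, post_id, body) VALUES (?, ?, ?)",
        [rev_id, post_id, to_portable_text(post["body_text"])],
    )
    # 3. Update the post with the revision ID to resolve the circular foreign key
    d1_execute("UPDATE ec_posts SET revision_id = ? WHERE id = ?", [rev_id, post_id])
    # 4. SEO metadata for search engine visibility
    d1_execute(
        "INSERT INTO _emdash_seo (post_id, title, description) VALUES (?, ?, ?)",
        [post_id, post["title"], post["excerpt"]],
    )
    return slug
```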

The entire sequence takes under 2 seconds. The post is immediately live at `ai-kit.net/blog/<slug>` -- no build step, no redeploy, no cache purge needed.
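
The Portable Text conversion is worth sketching too. In Sanity's Portable Text model, rich content is an array of block objects containing span children; a minimal converter that handles plain paragraphs only (real markdown also needs headings, lists, and inline marks) might look like this:

```python
import json
import uuid

def to_portable_text(markdown_body: str) -> str:
    """Minimal markdown -> Portable Text sketch: each paragraph becomes
    one block of plain spans. Headings, lists, and marks are ignored."""
    blocks = []
    for para in markdown_body.split("\n\n"):
        if not para.strip():
            continue
        blocks.append({
            "_type": "block",
            "_key": uuid.uuid4().hex[:12],
            "style": "normal",
            "markDefs": [],
            "children": [{
                "_type": "span",
                "_key": uuid.uuid4().hex[:12],
                "text": para.strip(),
                "marks": [],
            }],
        })
    return json.dumps(blocks)
```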

## Handling Failures

No pipeline runs perfectly every time. Common failure modes and their fixes:

- **Slug collision**: The D1 table has a `UNIQUE(slug, locale)` constraint. The publisher checks D1 before inserting and archives duplicates instead of crashing; see the sketch after this list.

- **Auth errors**: Cloudflare multi-account setups need explicit `CLOUDFLARE_ACCOUNT_ID`. The publisher propagates the env var to subprocesses.

- **JSON quoting**: LLM-generated body text often contains unescaped double quotes. The generator uses `json.dump()` instead of string templates to avoid this.

- **Content calendar corruption**: The calendar file accumulates line-number prefixes when `read_file()` output is written back. The pipeline checks for this before every edit.
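
The collision guard from the first bullet, as a sketch built on the same hypothetical `d1_execute()` helper; `archive()` and `new_ulid()` are likewise stand-ins:

```python
def slug_exists(slug: str, locale: str = "en") -> bool:
    """Query D1 for an existing (slug, locale) pair before inserting."""
    rows = d1_execute(
        "SELECT 1 FROM ec_posts WHERE slug = ? AND locale = ?", [slug, locale]
    )
    return bool(rows)

def publish_next(queue_file, post):
    slug = slugify(post["title"])
    if slug_exists(slug):
        archive(queue_file)   # hypothetical: move the file aside instead of crashing
        return None
    return publish(post, new_ulid(), new_ulid())  # new_ulid(): hypothetical ID helper
```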

## Results at 12 Posts Per Week

After running the pipeline for 4 weeks:

| Metric | Manual (3/week) | Automated (12/week) | Improvement |
|--------|-----------------|---------------------|-------------|
| Posts published | 12 per month | 48 per month | 4x volume |
| Human hours | 6-8 hours/week | 0 hours | Fully automated |
| Time to publish | 15 min manual | <2 seconds | 450x faster |
| Error rate | ~5% (typos, formatting) | <1% (schema edge cases) | 5x improvement |

## Key Takeaways

A fully automated content pipeline is achievable without a massive engineering budget. AIKit uses tools already available to any indie developer: cron, Python, D1, and markdown files. The key is designing each stage to fail safely, validating aggressively, and never trusting LLM output without type and length checks. With this pipeline, a solo developer can maintain the content output of a five-person marketing team.