What We Built
This is a step-by-step guide to replicating AIKit's blog automation pipeline. By the end, you will have a system that generates blog posts from a content calendar, publishes them to a D1-backed EmDash site, and cross-posts to Telegram -- all triggered by a cron job. The total setup time for an existing EmDash site is about 2 hours, including script tweaks and auth configuration.
Prerequisites
Before starting, you need:
- An EmDash CMS site deployed to Cloudflare Pages (or any Astro site using D1)
- wrangler CLI authenticated with D1 write permissions
- A queue directory: `~/cmo/content/queue/` and `~/cmo/content/queue/published/`
- Python 3.9+ (the scripts rely only on the json, os, and re standard-library modules)
- A Cloudflare zone with D1 database created and bound to your Pages project
Step 1: Set Up the D1 Database
EmDash uses ec_posts as its primary content table. The key columns are id (ULID), slug (unique per locale), title, content (Portable Text JSON), status, and the revision references live_revision_id and draft_revision_id. The author must already exist in the users table -- AIKit uses author ID 01KNB83V4HRH6VFG6W38QFTQS5 (Tony Nguyen).
To insert a post programmatically, you run four sequential D1 queries. The circular foreign key between ec_posts and revisions forces a specific order: insert the post with NULL revision refs, insert the revision, update the post with the revision IDs, and finally insert the SEO metadata row.
```sql
-- Step 1: Insert post with NULL revision refs
INSERT INTO ec_posts (id, slug, title, content, status, author_id, locale)
VALUES ('01...', 'my-post-slug', 'My Post', '{}', 'published', '01...', 'en');
-- Step 2: Insert revision referencing the post
INSERT INTO revisions (id, collection, entry_id, data, author_id)
VALUES ('01...', 'posts', '01...', '{}', '01...');
-- Step 3: Update post with revision refs
UPDATE ec_posts SET live_revision_id='01...', draft_revision_id='01...' WHERE id='01...';
-- Step 4: Insert SEO metadata
INSERT INTO _emdash_seo (collection, content_id, seo_title, seo_description)
VALUES ('posts', '01...', 'My Post', 'My excerpt');
```
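If you are scripting these inserts from the shell, the same statements can be run through the wrangler CLI. The database name below (emdash-db) is a placeholder for whatever your wrangler.toml defines, and the --remote flag behavior varies slightly by wrangler version:

```bash
# Placeholder database name; use the D1 database from your wrangler.toml
wrangler d1 execute emdash-db --remote \
  --command "INSERT INTO ec_posts (id, slug, title, content, status, author_id, locale) VALUES ('01...', 'my-post-slug', 'My Post', '{}', 'published', '01...', 'en');"
```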
Step 2: Build the Queue-Based Publisher
The queue system is surprisingly low-tech: JSON files in a directory. Each file contains title, body_text (markdown), excerpt, category, and tags. The blog-publisher.py script reads these files and converts the body_text to Portable Text format for D1 storage.
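A queue file is a single flat JSON object. The field values below are illustrative:

```json
{
  "title": "Automating Blog Publishing with D1",
  "excerpt": "How a directory of JSON files and a cron job replace a CMS workflow.",
  "category": "Marketing Automation",
  "tags": ["automation", "cloudflare", "d1"],
  "body_text": "## Why a file queue\n\nFirst paragraph...\n\n### Implementation notes\n\nSecond paragraph..."
}
```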
Portable Text conversion is straightforward: lines starting with ## become h2 blocks, lines starting with ### become h3 blocks, and everything else becomes a normal paragraph block. Each block gets a unique _key (p1, h1, p2, h2, etc.). Empty lines are skipped. The resulting JSON array is serialized into the ec_posts.content column.
```python
def text_to_portable_text(body):
    """Convert markdown-style body text into a list of Portable Text blocks."""
    blocks = []
    key_counter = {"p": 1, "h": 1}
    for line in body.strip().splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip empty lines
        if stripped.startswith("## "):
            style, prefix, text = "h2", "h", stripped[3:]
        elif stripped.startswith("### "):
            style, prefix, text = "h3", "h", stripped[4:]
        else:
            style, prefix, text = "normal", "p", stripped
        key = f"{prefix}{key_counter[prefix]}"
        key_counter[prefix] += 1
        # Minimal block shape; adjust to whatever your EmDash schema expects
        blocks.append({
            "_type": "block",
            "_key": key,
            "style": style,
            "markDefs": [],
            "children": [{"_type": "span", "_key": f"{key}s", "text": text, "marks": []}],
        })
    return blocks
```
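From there, the publisher serializes the block list straight into the content column. Roughly (the queue file name is hypothetical):

```python
import json

# Load a queue file and produce the value stored in ec_posts.content
with open("my-post.json") as f:
    post = json.load(f)
content_json = json.dumps(text_to_portable_text(post["body_text"]))
```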
Step 3: Set Up the Content Calendar
The content calendar is a markdown file with weekly planning sections and per-run Post-Calendar sections. Each section is a table with columns: Number, Title, Slug, Category, and Status. The calendar serves as both the editorial plan and the publishing record -- updated programmatically after every publishing run.
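A Post-Calendar section is an ordinary markdown table. The rows below are illustrative placeholders:

```markdown
| Number | Title                         | Slug                          | Category             | Status    |
|--------|-------------------------------|-------------------------------|----------------------|-----------|
| 41     | Automating Blog Publishing    | automating-blog-publishing    | Marketing Automation | Published |
| 42     | Queue-Based Content Pipelines | queue-based-content-pipelines | Content/Growth       | Queued    |
```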
A theme rotation system ensures content variety. On each run, the pipeline computes DAY_OF_YEAR modulo 4 to pick a theme (Content/Growth, Marketing Automation, Sales Channel, or Hybrid Dev+Marketing) and HOUR modulo 5 to pick a project focus. This prevents the pipeline from generating repetitive content even after hundreds of posts.
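A minimal sketch of the rotation logic: the theme list comes from the rotation described above, while the project-focus list is a placeholder for your own projects.

```python
from datetime import datetime, timezone

THEMES = ["Content/Growth", "Marketing Automation", "Sales Channel", "Hybrid Dev+Marketing"]
PROJECTS = ["project-a", "project-b", "project-c", "project-d", "project-e"]  # placeholders

def pick_focus(now=None):
    # DAY_OF_YEAR % 4 picks the theme, HOUR % 5 picks the project focus
    now = now or datetime.now(timezone.utc)
    theme = THEMES[now.timetuple().tm_yday % 4]
    project = PROJECTS[now.hour % 5]
    return theme, project
```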
Step 4: Automate Content Generation
When the queue runs low (1 or fewer posts remaining), the pipeline generates 2 new posts. Generation happens inside execute_code, building the body as a Python list of string parts (joined before writing) to avoid quoting issues during JSON serialization. Body text targets 800-1500 words with real technical depth.
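The refill check itself is a one-liner over the queue directory. A sketch, assuming published posts are moved into the published/ subdirectory as described in the prerequisites:

```python
import os

QUEUE_DIR = os.path.expanduser("~/cmo/content/queue")

def queue_needs_refill(threshold=1):
    # Count pending queue files; the published/ subdirectory is excluded by the .json filter
    pending = [f for f in os.listdir(QUEUE_DIR) if f.endswith(".json")]
    return len(pending) <= threshold
```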
Key generation rules:
- Each section needs 2-3 substantive paragraphs from the start -- writing minimal content up front only forces expansion passes later
- Code blocks use real syntax highlighting markers (Python, SQL, bash, TypeScript)
- Tables are used for structured data comparisons (cron schedules, configuration options)
- Quotes use Unicode smart quotes to avoid JSON escaping issues in queue files
- Files are validated after generation: JSON parse check, body_text type check (must be str, not list), word count check (800-1500)
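The validation in the last rule reduces to a few checks. A sketch (the helper name is my own):

```python
import json

def validate_queue_file(path):
    with open(path) as f:
        post = json.load(f)                      # JSON parse check
    body = post["body_text"]
    if not isinstance(body, str):                # type check: str, not list
        raise TypeError("body_text must be a string")
    words = len(body.split())
    if not 800 <= words <= 1500:                 # word count check
        raise ValueError(f"word count {words} outside 800-1500")
    return post
```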
Step 5: Set Up the Cron Pipeline
The final step is scheduling. A cron job runs Mon/Wed/Fri at 6AM CT with three responsibilities: publish the next queue post, cross-post to Telegram, and refill the queue if needed. Delivery is set to local to avoid spamming the user's main chat.
```bash
# Publish from queue (must run from the EmDash project dir)
cd ~/Projects/AIKitLLC/EmDash
export CLOUDFLARE_ACCOUNT_ID=your_account_id
python3.9 ~/cmo/scripts/queue-publisher.py

# Announce on Telegram (agent tool call, not a shell command)
send_message(target='telegram:-1003988624465', message='...')
```
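If you drive this with plain crontab rather than an agent scheduler, the Mon/Wed/Fri 6 AM Central schedule looks like the entry below (CRON_TZ support varies by cron implementation):

```bash
# Schedule expression only; adapt the command to your environment
CRON_TZ=America/Chicago
0 6 * * 1,3,5 cd $HOME/Projects/AIKitLLC/EmDash && python3.9 $HOME/cmo/scripts/queue-publisher.py
```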
The entire pipeline costs near-zero in Cloudflare compute and D1 storage. No servers, no CI/CD, no deployment pipeline. Content goes from topic to live URL in under 2 minutes.
Key Takeaways
- A queue of JSON files is simpler and more debuggable than a database queue for content publishing -- you can inspect, edit, and reorder files with a text editor
- D1's eventual consistency is fast enough for blog publishing -- posts appear live within seconds of INSERT
- The markdown calendar + cron + LLM generation triad replaces a traditional editorial workflow with near-zero marginal cost
- Serverless multi-channel distribution costs effectively zero at AIKit's scale -- each additional channel is just a few lines of Python
- Start simple, add channels incrementally: blog first, then Telegram, then Dev.to, then email
- The theme rotation is critical for long-term content quality -- without it, the pipeline generates shallow, repetitive content