Summary
A command-line YouTube content production pipeline that takes a video transcript (or a topic prompt) and fans it out to 18 generators producing every adjacent asset a creator needs to ship: titles, descriptions, SEO tags, hooks, thumbnail briefs, AI-generated thumbnails and banners, content briefs, section illustrations, and downstream products. One transcript in, a publish-ready folder out. Built as a plug-in architecture so adding the 19th generator is a 7-step recipe.
The Problem
Shipping a YouTube video isn't just shooting the video. Every upload needs a title that survives the algorithm, a description that hits a known SEO pattern, hashtags, a thumbnail brief, the actual thumbnail image, a section-illustration set for the body of the video, sometimes a content brief for repurposing, and product descriptions if you're selling alongside the video. Doing all of that by hand for each upload is the reason most creators ship sporadically.
I wanted a command — `ytcc generate transcript.md` — that produced every one of those assets, on-brand, in under a minute. With the freedom to skip generators, run only specific ones, or regenerate from a saved analysis without paying for the analysis step twice.
The Approach
The pipeline is built around a `ContentContext` dataclass and a registry of generators. The transcript first goes through a single frontier-model analysis pass that produces structured JSON with the topic, outline, hook candidates, and audience signal. That analysis becomes part of the context every downstream generator consumes: no generator re-analyzes the video; they all read the cached JSON.
```
transcript.md → Frontier-model analysis (one call)
                        ↓
      ContentContext (transcript + analysis + brand)
                        ↓
      ┌─────────────────┼─────────────────┐
      ↓                 ↓                 ↓
Text generators   Image generators   Product generators
(titles, SEO,     (thumbnails,       (descriptions,
 descriptions,     banners,           briefs,
 hooks, briefs)    illustrations)     splits)
```
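A minimal sketch of what that shared context might look like. The field names here are assumptions; the source only says the dataclass carries the transcript, the cached analysis, and brand info:

```python
from dataclasses import dataclass, field
from pathlib import Path

@dataclass
class ContentContext:
    """Shared state every generator reads; built once per run."""
    transcript: str                                 # raw transcript (empty in topic mode)
    analysis: dict = field(default_factory=dict)    # cached one-pass frontier-model JSON
    brand: dict = field(default_factory=dict)       # channel name, tone, tag pool, ...
    output_dir: Path = Path("output")

    @property
    def topic(self) -> str:
        # Downstream generators read the cached analysis, never the model.
        return self.analysis.get("topic", "")
```

Because the analysis is plain data on the context, regenerating assets later only requires reloading this object from disk, not repeating the expensive model call.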
Generators register themselves via a decorator and inherit from a `BaseGenerator` class with a single `generate(context) → Path` method. Each one declares its prompt file, output extension, and output directory in config. Adding a new generator is 7 mechanical steps with no framework code to touch. That plug-in shape is the reason the pipeline grew to 18 generators without becoming unmaintainable.
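The decorator-plus-base-class pattern described above could be sketched like this (names and structure are illustrative, not the project's actual code):

```python
from abc import ABC, abstractmethod
from pathlib import Path

REGISTRY = {}  # name -> generator class

def register(name):
    """Class decorator: a new generator is one file plus one decorator."""
    def wrap(cls):
        REGISTRY[name] = cls
        return cls
    return wrap

class BaseGenerator(ABC):
    prompt_file = ""     # prompt template, declared per generator
    extension = ".md"    # output extension, declared per generator

    @abstractmethod
    def generate(self, context) -> Path:
        """Produce one asset and return the path it was written to."""

@register("titles")
class TitleGenerator(BaseGenerator):
    prompt_file = "titles.txt"

    def generate(self, context) -> Path:
        # A real generator would render the prompt template and call the text model.
        return Path("titles" + self.extension)
```

The pipeline runner never imports concrete generators by name; it just iterates `REGISTRY`, which is what keeps the framework untouched when generator number 19 arrives.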
What I Built
- One-pass analysis layer — single frontier-model call produces a structured JSON consumed by every downstream generator; saves cost and keeps every asset internally consistent
- 18 plug-in generators — text, image, and product generators each with their own prompt template and output schema
- Image generation pipeline — multimodal model produces thumbnails and banners; ImageMagick handles composition and text overlay
- Brand context module — channel name, audience, tone, default hashtags, SEO tag pool centralized so every generator stays on-brand
- `--only`/`--skip`/regenerate flags — run specific generators, skip expensive ones, or re-run from cached analysis without paying for the analysis again
- Topic mode — `ytcc concept "<topic>"` skips the transcript and generates from a prompt for fast pre-production
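One way the `--only`/`--skip` selection could work is by filtering the registry before the run. This is a sketch, and the precedence rule (`only` wins, then `skip` subtracts) is an assumption; the source only names the flags:

```python
def select_generators(registry, only=None, skip=None):
    """Return the generator names to run, honoring --only and --skip.

    Precedence here is assumed: --only narrows the set first,
    then --skip removes from whatever remains.
    """
    names = list(registry)
    if only:
        names = [n for n in names if n in set(only)]
    if skip:
        names = [n for n in names if n not in set(skip)]
    return names

registry = {"titles": None, "seo_tags": None, "thumbnail": None}
select_generators(registry, only=["titles", "thumbnail"])  # ['titles', 'thumbnail']
select_generators(registry, skip=["thumbnail"])            # ['titles', 'seo_tags']
```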
Engineering Highlights
- One analysis, many generators. The most expensive step is the structured analysis pass. Every other generator reads the cached JSON and never re-analyzes. That drops cost and runtime and, more importantly, keeps the title, description, thumbnail brief, and hooks all working off the same understanding of the video.
- Plug-in architecture as a productivity multiplier. A new generator is one Python file, one decorator, one prompt template, one config entry. The shape lets me add capability faster than I add maintenance burden.
- Dual-vendor AI by design. Text generation runs on a frontier reasoning model; image generation runs on a separate multimodal vendor. The vendor choice is configured per generator, abstracted behind environment variables. Switching either vendor is a config change, not a refactor.
- Title format codified, not "creative." The channel has a tested title formula that performs. The titles generator enforces the formula via the prompt instead of asking the model to be clever. Boring + tested beats clever + unproven on YouTube algorithms.
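The per-generator vendor abstraction behind environment variables could look something like the following. Every variable and key name here is hypothetical, invented for illustration; the source only says vendor choice is configured per generator via environment variables:

```python
import os

# Fallback chain: a per-generator override, else a per-modality default.
# TEXT_MODEL / IMAGE_MODEL / <NAME>_MODEL are assumed names, not the real keys.
MODALITY_DEFAULTS = {"text": "TEXT_MODEL", "image": "IMAGE_MODEL"}

def model_for(generator_name, modality):
    specific = os.environ.get(f"{generator_name.upper()}_MODEL")
    if specific:
        return specific
    return os.environ.get(MODALITY_DEFAULTS[modality], "default-model")

os.environ["TEXT_MODEL"] = "frontier-reasoner"
os.environ["THUMBNAIL_MODEL"] = "multimodal-image-v1"

model_for("titles", "text")       # 'frontier-reasoner'
model_for("thumbnail", "image")   # 'multimodal-image-v1'
```

With this shape, swapping the image vendor for all 18 generators is one environment-variable change, which matches the "config change, not a refactor" claim above.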
Outcome
Daily content output for a sports betting channel goes from "an afternoon of asset prep" to a single command, and the channel ships every weekday on schedule. Adding a new asset type — a new generator — takes under an hour. The pipeline has grown from a few generators to 18 without rewrites.
Tech footprint
- Frontend — Click CLI with Rich console output; no web UI
- Backend — Python 3.8+ pipeline with a generator registry pattern
- AI — frontier reasoning model for text and analysis, separate multimodal model for thumbnails and banners (vendor-abstracted)
- Image processing — ImageMagick (Wand) for composition and text overlay
- Config — `python-dotenv`, centralized brand context module
- Output — markdown, JSON, PNG into per-generator directories