Claude Code Plugin

Claude Code is powerful.
Your results are inconsistent.

structured-workflows is the missing system layer between your intent and a shipped PR. It scans your project, generates context-aware specialist agents, and runs a pipeline that turns a one-sentence intent into a reviewed, tested pull request.

-1,319 net LOC · CI passed · dead code removed · 37 min to PR

Same codebase. Same issue. Same model. Different system.


The Problem

AI tools are powerful. AI workflows are amateur.

Every team has access to the same AI tools. The difference between 10% utilisation and 10× productivity isn't the model — it's whether there's a system between the human and the tool.

Most AI-assisted work today looks like this: a person types a long prompt, gets output, manually reviews it, finds problems, types another prompt, gets more output, manually reviews again. Each interaction starts from zero. There's no accumulated context, no quality gates, no specialist review. The human is the entire workflow.

Without a system
Each task starts from a blank slate
Human is the only quality gate
No specialist review of output
Same mistakes happen repeatedly
Output quality depends on prompt quality
With structured execution
System knows your domain, standards, and context
Multiple automated quality gates
Specialist agents review at each stage
Knowledge compounds across sessions
Output quality depends on system quality

Get Started

Four commands to your first specialist-reviewed plan

# Install the plugin
$ /plugin install structured-workflows@ai-workflow

# Scan your project and generate specialists
$ /structured-workflows:setup

# Start your first task
$ /structured-workflows:define "your task description"

# Specialists design, plan, and review before code is written
$ /structured-workflows:prepare 1

Requires Claude Code. Currently works best with coding projects. Marketing and legal domains are experimental.


Phase Zero — The Setup

Before the first command, the system learns your world

The framework doesn't start with a task. It starts with deep context acquisition. A setup process scans your environment — codebase, content library, document corpus, project history — and generates the specialist agents, skills, rules, and quality gates tailored to your domain. This isn't a generic workflow. It's an expert system built from your actual work.

01 — SCAN

Environment Analysis

The system reads your codebase, content library, or document corpus. It identifies patterns, conventions, tools, and standards already in use.

02 — GENERATE

Specialist Agents

Domain-specific agents are created automatically: security reviewer, brand compliance, legal risk assessor, architecture critic — whatever your domain requires.

03 — ENCODE

Skills & Standards

Your conventions, quality criteria, and institutional knowledge become encoded rules that every future task is evaluated against. The system knows what "good" looks like here.

Key Insight

This is what separates a system from a tool. After setup, the AI isn't a general-purpose assistant — it's a domain expert that understands your specific codebase, your brand voice, your legal framework, your architectural patterns. Every subsequent task benefits from this accumulated context.


The Universal Pattern

Five phases. Two human touchpoints. Unlimited domains.

The framework decomposes all structured work into the same skeleton. The human provides intent and approval. The system handles decomposition, specialist review, execution, verification, and delivery.

Human → Define
Express intent in natural language
One sentence. The system expands it into a fully structured brief with requirements, constraints, dependencies, and success criteria. This is the only free-form input in the entire process.
System → Prepare
Design → Plan → Adversarial Review → Final Plan
Specialist agents run a design session, produce a detailed plan, then a separate adversarial reviewer attacks the plan for gaps, risks, and assumptions. The plan is revised before any execution begins. Errors caught here cost nothing.
System → Execute
Implement the approved plan step by step
The system implements the approved plan, running verification checks at each stage. Sub-agents handle parallel workstreams. Automated compliance checks catch issues during implementation.
System → Review
Verify deliverables against the original brief
Reviewer agents trace every requirement back to a deliverable. Success criteria are checked. Domain-specific quality gates run. The system separates "did we build it?" from "should we ship it?" — catching gaps before they reach the human.
Human → Ship
Approve the reviewed output and deliver
By this point, the deliverable has been through design review, adversarial plan review, step-by-step execution verification, and requirements traceability. The human approves a polished, pre-verified result — then the system handles delivery and cleanup.
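End to end, the five phases correspond to one plugin command each. A sketch of the full sequence, assuming the same /structured-workflows: prefix as in Get Started and using the placeholder task number 1:

```
# Define: human expresses intent
$ /structured-workflows:define "your task description"

# Prepare: specialists design, plan, and adversarially review
$ /structured-workflows:prepare 1

# Execute: implement the approved plan with quality gates
$ /structured-workflows:execute 1

# Review: trace deliverables back to the brief
$ /structured-workflows:review 1

# Ship: human approves, system delivers and cleans up
$ /structured-workflows:ship 1
```

The software engineering walkthrough below shows the same sequence in short-form commands against a real issue.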
Why This Works

Each stage is cheaper to fail at than the next one. Catching a bad assumption in the design session costs zero. Catching it during execution costs rework. Catching it after delivery costs reputation. The framework front-loads review and pushes errors left — the same principle behind TDD, design reviews, and every mature engineering process.


Domain Implementations

Same skeleton. Different specialists.

The pattern is universal — the implementation is specific. The domain implementations below show how the same five-phase structure adapts to completely different types of work. The verbs change, the agents change, the outputs change — but the architecture is identical.

Software Engineering
Marketing (experimental)
Legal (experimental)

Software Engineering Implementation

The primary domain — battle-tested across real projects. In one A/B comparison, structured-workflows produced a PR with -1,319 net lines and CI passing, where ad-hoc prompting added +2,452 net lines and failed CI.

SETUP: /setup scans codebase → generates specialist agents (security reviewer, architecture critic, test strategist), project-specific skills synthesised from your actual conventions, quality gates, and an @import hub wiring everything into Claude Code's context.
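As a rough mental model, the generated artifacts might land in a layout like this (hypothetical: actual paths and file names depend on the plugin version and your project; only the agent names are taken from the description above):

```
.claude/
├── agents/
│   ├── security-reviewer.md
│   ├── architecture-critic.md
│   └── test-strategist.md
├── skills/
│   └── project-conventions.md   # synthesised from your actual conventions
└── hub.md                       # @import hub wiring everything into context
```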

Human
/define "replace the Ollama proxy layer with LiteLLM"
→ GitHub issue with requirements, acceptance criteria, constraints, dependencies, and non-goals
Human
/prepare 728
System
Design session → Implementation plan → Adversarial plan review → Revised plan
→ Architecture-critic validates provider boundaries. Security-reviewer flags telemetry lockdown. Test-strategist designs coverage strategy. Plan posted to issue for human approval.
Human
/execute 728
System
Creates branch → Implements plan steps → Quality gates at each boundary → Opens PR
→ Feature branch with new LiteLLM provider, old Ollama code removed, 65 files changed, -1,319 net lines, CI green. PR links to source issue with execution summary.
System → /review 728
Requirements traceability → Acceptance criteria scoring → Pattern-completeness scan → Disposition framework
→ Every acceptance criterion scored (MET / PARTIALLY MET / NOT MET). Findings classified: MUST FIX, FIX NOW, CREATE ISSUE. Pre-existing issues separated from new ones. Review verdict posted to PR.
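As a mental model for the scoring and disposition framework, here is a minimal Python sketch (illustrative only: the type names and classification rules are assumptions, not the plugin's actual implementation):

```python
from dataclasses import dataclass
from enum import Enum

class Score(Enum):
    MET = "MET"
    PARTIALLY_MET = "PARTIALLY MET"
    NOT_MET = "NOT MET"

class Disposition(Enum):
    MUST_FIX = "MUST FIX"          # blocks the merge
    FIX_NOW = "FIX NOW"            # fix in this PR, but not blocking the verdict
    CREATE_ISSUE = "CREATE ISSUE"  # defer as tracked follow-up work

@dataclass
class Finding:
    description: str
    pre_existing: bool  # was this issue already in the codebase before the PR?
    blocking: bool      # does it break an acceptance criterion?

def disposition(f: Finding) -> Disposition:
    # Illustrative rules: pre-existing issues are separated from new ones
    # and split off into their own issue; new blocking issues must be
    # fixed before merge; new non-blocking issues are fixed in the PR.
    if f.pre_existing:
        return Disposition.CREATE_ISSUE
    return Disposition.MUST_FIX if f.blocking else Disposition.FIX_NOW

def verdict(scores: dict[str, Score]) -> str:
    # A PR is shippable only when every acceptance criterion is MET.
    return "SHIP" if all(s is Score.MET for s in scores.values()) else "REVISE"
```

For example, verdict({"proxy removed": Score.MET, "CI green": Score.MET}) returns "SHIP", while any PARTIALLY MET or NOT MET criterion yields "REVISE".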
Human
/ship 728
→ Final gates, PR merged, branch deleted, issue closed. Done.

Marketing Campaign Implementation — experimental

Same pipeline, different specialists. The template generates domain-specific agents but hasn't been battle-tested on production marketing projects. Feedback welcome.

SETUP: /setup scans brand assets folder → generates brand voice guide, specialist agents (audience analyst, copy strategist, compliance reviewer, channel expert), content templates derived from past campaigns, tone rules, and competitor positioning context.

Human
/define "launch campaign for our new API product targeting enterprise devops teams"
→ Campaign brief with audience segments, channel strategy, messaging pillars, budget constraints, success metrics, competitive positioning
Human
/prepare 12
System
Audience research → Creative strategy → Messaging review → Content plan
→ Audience analyst validates segments with market data. Copy strategist develops messaging angles. Brand compliance agent checks tone, terminology, and positioning against guidelines. Channel expert assigns content types to platforms. Final plan posted for approval.
Human
/execute 12
System
Drafts all assets → Brand compliance check → A/B variant generation → Stages for review
→ Landing page copy, email sequences, social posts per platform, blog post, ad copy variants. Each asset checked against brand voice rules and compliance requirements. Assets staged in output folder with review notes.
System → /review 12
Requirements traceability → Brand compliance audit → Channel format check → Stakeholder summary
→ Every brief objective traced to deliverable assets. Brand voice consistency verified across all channels. Format requirements validated per platform. Review report with verdict and any flagged issues.
Human
/ship 12
→ Assets delivered, campaign summary generated, cleanup complete. Done.

Roadmap — The Recursive Advantage

The system improves the system

This is the direction, not the current state. The foundation is built — setup, execution, review, and shipping all work today. What comes next is the feedback loop:

Which plans survived adversarial review unchanged? Those patterns should become defaults. Which execution steps needed human correction? Those should trigger new quality rules. Which specialist agents were most useful? Those should get refined. Which were never invoked? Those should get pruned.

Today, the setup process can be re-run to incorporate new codebase changes and reconcile with user customisations. The longer-term vision is fully autonomous improvement — the system observing its own outcomes and tuning itself. That's the compound advantage that ad-hoc prompting can never match.

The Bigger Picture

This framework isn't about making AI do tasks faster. It's about encoding how expert work actually happens — the decomposition, the specialist review, the quality gates, the institutional knowledge — into a repeatable system. The AI is the execution engine. The framework is the engineering discipline. Together, they turn a single sentence of human intent into a production-ready deliverable, regardless of domain.