Claude Code Plugin

Claude Code is powerful.
Your results are inconsistent.

structured-workflows is the missing system layer between your intent and a shipped PR. It scans your project, generates context-aware specialist agents, and runs a pipeline that turns a one-sentence intent into a reviewed, tested pull request.

-1,319 net LOC · CI passed · dead code removed · 37 min to PR

Same codebase. Same issue. Same model. Different system.


The Problem

AI tools are powerful. AI workflows are amateur.

Every team has access to the same AI tools. The difference between 10% utilisation and 10× productivity isn't the model — it's whether there's a system between the human and the tool.

Most AI-assisted work today looks like this: a person types a long prompt, gets output, manually reviews it, finds problems, types another prompt, gets more output, manually reviews again. Each interaction starts from zero. There's no accumulated context, no quality gates, no specialist review. The human is the entire workflow.

Without a system
Each task starts from a blank slate
Human is the only quality gate
No specialist review of output
Same mistakes happen repeatedly
Output quality depends on prompt quality
With structured execution
System knows your domain, standards, and context
Multiple automated quality gates
Specialist agents review at each stage
Knowledge compounds across sessions
Output quality depends on system quality

Get Started

Four commands to your first specialist-reviewed plan

# Install the plugin
$ /plugin install structured-workflows@ai-workflow

# Scan your project and generate specialists
$ /structured-workflows:setup

# Start your first task
$ /structured-workflows:define "your task description"

# Specialists design, plan, and review before code is written
$ /structured-workflows:prepare 1

Requires Claude Code. Currently works best with coding projects. Marketing and legal domains are experimental.


Phase Zero — The Setup

Before the first command, the system learns your world

The framework doesn't start with a task. It starts with deep context acquisition. A setup process scans your environment — codebase, content library, document corpus, project history — and generates the specialist agents, skills, rules, and quality gates tailored to your domain. This isn't a generic workflow. It's an expert system built from your actual work.

01 — SCAN

Environment Analysis

The system reads your codebase, content library, or document corpus. It identifies patterns, conventions, tools, and standards already in use.

02 — GENERATE

Specialist Agents

Domain-specific agents are created automatically: security reviewer, brand compliance, legal risk assessor, architecture critic — whatever your domain requires.

03 — ENCODE

Skills & Standards

Your conventions, quality criteria, and institutional knowledge become encoded rules that every future task is evaluated against. The system knows what "good" looks like here.

Key Insight

This is what separates a system from a tool. After setup, the AI isn't a general-purpose assistant — it's a domain expert that understands your specific codebase, your brand voice, your legal framework, your architectural patterns. Every subsequent task benefits from this accumulated context.


The Universal Pattern

Five phases. Two human touchpoints. Unlimited domains.

The framework decomposes all structured work into the same skeleton. The human provides intent and approval. The system handles decomposition, specialist review, execution, verification, and delivery.

Human → Define
Express intent in natural language
One sentence. The system expands it into a fully structured brief with requirements, constraints, dependencies, and success criteria. This is the only free-form input in the entire process.
System → Prepare
Design → Plan → Adversarial Review → Final Plan
Specialist agents run a design session, produce a detailed plan, then a separate adversarial reviewer attacks the plan for gaps, risks, and assumptions. The plan is revised before any execution begins. Errors caught here cost nothing.
System → Execute
Implement the approved plan step by step
The system implements the approved plan, running verification checks at each stage. Sub-agents handle parallel workstreams. Automated compliance checks catch issues during implementation.
System → Review
Verify deliverables against the original brief
Reviewer agents trace every requirement back to a deliverable. Success criteria are checked. Domain-specific quality gates run. The system separates "did we build it?" from "should we ship it?" — catching gaps before they reach the human.
Human → Ship
Approve the reviewed output and deliver
By this point, the deliverable has been through design review, adversarial plan review, step-by-step execution verification, and requirements traceability. The human approves a polished, pre-verified result — then the system handles delivery and cleanup.
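End to end, the five phases correspond to one plugin command each. A sketch of the full sequence, assuming the same /structured-workflows: prefix as in Get Started and using the placeholder task number 1:

```
# Define: human expresses intent
$ /structured-workflows:define "your task description"

# Prepare: specialists design, plan, and adversarially review
$ /structured-workflows:prepare 1

# Execute: implement the approved plan with quality gates
$ /structured-workflows:execute 1

# Review: trace deliverables back to the brief
$ /structured-workflows:review 1

# Ship: human approves, system delivers and cleans up
$ /structured-workflows:ship 1
```

The software engineering walkthrough below shows the same sequence in short-form commands against a real issue.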
Why This Works

Each stage is cheaper to fail at than the next one. Catching a bad assumption in the design session costs zero. Catching it during execution costs rework. Catching it after delivery costs reputation. The framework front-loads review and pushes errors left — the same principle behind TDD, design reviews, and every mature engineering process.


Domain Implementations

Same skeleton. Different specialists.

The pattern is universal — the implementation is specific. The domain implementations below show how the same five-phase structure adapts to completely different types of work. The verbs change, the agents change, the outputs change — but the architecture is identical.

Software Engineering
Marketing (experimental)
Legal (experimental)

Software Engineering Implementation

The primary domain — battle-tested across real projects. In one A/B comparison, structured-workflows produced a PR with -1,319 net lines and CI passing, where ad-hoc prompting added +2,452 net lines and failed CI.

SETUP: /setup scans codebase → generates specialist agents (security reviewer, architecture critic, test strategist), project-specific skills synthesised from your actual conventions, quality gates, and an @import hub wiring everything into Claude Code's context.
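As a rough mental model, the generated artifacts might land in a layout like this (hypothetical: actual paths and file names depend on the plugin version and your project; only the agent names are taken from the description above):

```
.claude/
├── agents/
│   ├── security-reviewer.md
│   ├── architecture-critic.md
│   └── test-strategist.md
├── skills/
│   └── project-conventions.md   # synthesised from your actual conventions
└── hub.md                       # @import hub wiring everything into context
```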

Human
/define "replace the Ollama proxy layer with LiteLLM"
→ GitHub issue with requirements, acceptance criteria, constraints, dependencies, and non-goals
Human
/prepare 728
System
Design session → Implementation plan → Adversarial plan review → Revised plan
→ Architecture-critic validates provider boundaries. Security-reviewer flags telemetry lockdown. Test-strategist designs coverage strategy. Plan posted to issue for human approval.
Human
/execute 728
System
Creates branch → Implements plan steps → Quality gates at each boundary → Opens PR
→ Feature branch with new LiteLLM provider, old Ollama code removed, 65 files changed, -1,319 net lines, CI green. PR links to source issue with execution summary.
System → /review 728
Requirements traceability → Acceptance criteria scoring → Pattern-completeness scan → Disposition framework
→ Every acceptance criterion scored (MET / PARTIALLY MET / NOT MET). Findings classified: MUST FIX, FIX NOW, CREATE ISSUE. Pre-existing issues separated from new ones. Review verdict posted to PR.
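As a mental model for the scoring and disposition framework, here is a minimal Python sketch (illustrative only: the type names and classification rules are assumptions, not the plugin's actual implementation):

```python
from dataclasses import dataclass
from enum import Enum

class Score(Enum):
    MET = "MET"
    PARTIALLY_MET = "PARTIALLY MET"
    NOT_MET = "NOT MET"

class Disposition(Enum):
    MUST_FIX = "MUST FIX"          # blocks the merge
    FIX_NOW = "FIX NOW"            # fix in this PR, but not blocking the verdict
    CREATE_ISSUE = "CREATE ISSUE"  # defer as tracked follow-up work

@dataclass
class Finding:
    description: str
    pre_existing: bool  # was this issue already in the codebase before the PR?
    blocking: bool      # does it break an acceptance criterion?

def disposition(f: Finding) -> Disposition:
    # Illustrative rules: pre-existing issues are separated from new ones
    # and split off into their own issue; new blocking issues must be
    # fixed before merge; new non-blocking issues are fixed in the PR.
    if f.pre_existing:
        return Disposition.CREATE_ISSUE
    return Disposition.MUST_FIX if f.blocking else Disposition.FIX_NOW

def verdict(scores: dict[str, Score]) -> str:
    # A PR is shippable only when every acceptance criterion is MET.
    return "SHIP" if all(s is Score.MET for s in scores.values()) else "REVISE"
```

For example, verdict({"proxy removed": Score.MET, "CI green": Score.MET}) returns "SHIP", while any PARTIALLY MET or NOT MET criterion yields "REVISE".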
Human
/ship 728
→ Final gates, PR merged, branch deleted, issue closed. Done.

Marketing Campaign Implementation — experimental

Same pipeline, different specialists. The template generates domain-specific agents but hasn't been battle-tested on production marketing projects. Feedback welcome.

SETUP: /setup scans brand assets folder → generates brand voice guide, specialist agents (audience analyst, copy strategist, compliance reviewer, channel expert), content templates derived from past campaigns, tone rules, and competitor positioning context.

Human
/define "launch campaign for our new API product targeting enterprise devops teams"
→ Campaign brief with audience segments, channel strategy, messaging pillars, budget constraints, success metrics, competitive positioning
Human
/prepare 12
System
Audience research → Creative strategy → Messaging review → Content plan
→ Audience analyst validates segments with market data. Copy strategist develops messaging angles. Brand compliance agent checks tone, terminology, and positioning against guidelines. Channel expert assigns content types to platforms. Final plan posted for approval.
Human
/execute 12
System
Drafts all assets → Brand compliance check → A/B variant generation → Stages for review
→ Landing page copy, email sequences, social posts per platform, blog post, ad copy variants. Each asset checked against brand voice rules and compliance requirements. Assets staged in output folder with review notes.
System → /review 12
Requirements traceability → Brand compliance audit → Channel format check → Stakeholder summary
→ Every brief objective traced to deliverable assets. Brand voice consistency verified across all channels. Format requirements validated per platform. Review report with verdict and any flagged issues.
Human
/ship 12
→ Assets delivered, campaign summary generated, cleanup complete. Done.

Roadmap — The Recursive Advantage

The system improves the system

This is the direction, not the current state. The foundation is built — setup, execution, review, and shipping all work today. What comes next is the feedback loop:

Which plans survived adversarial review unchanged? Those patterns should become defaults. Which execution steps needed human correction? Those should trigger new quality rules. Which specialist agents were most useful? Those should get refined. Which were never invoked? Those should get pruned.

Today, the setup process can be re-run to incorporate new codebase changes and reconcile with user customisations. The longer-term vision is fully autonomous improvement — the system observing its own outcomes and tuning itself. That's the compound advantage that ad-hoc prompting can never match.

The Bigger Picture

This framework isn't about making AI do tasks faster. It's about encoding how expert work actually happens — the decomposition, the specialist review, the quality gates, the institutional knowledge — into a repeatable system. The AI is the execution engine. The framework is the engineering discipline. Together, they turn a single sentence of human intent into a production-ready deliverable, regardless of domain.