Spec-Driven AI Agent: Why Explicit Contracts Change What Your Agent Produces
Give an AI coding agent a prompt and it optimizes for completing the task. Give it a spec and it optimizes for satisfying a contract. The difference is visible in the first review.
Prompt vs Spec: A Concrete Example
Prompt: "Build a REST API for a todo list."
Spec: "Build a REST API for a todo list. Use FastAPI. Endpoints: GET /todos, POST /todos, DELETE /todos/:id. Return 404 for missing IDs. On POST, validate that title is a non-empty string. Run pytest and confirm all tests pass. Return a diff bounded to these items only."
The prompt leaves everything to interpretation. The spec leaves almost nothing to interpretation — and that is the point.
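One way to make a spec like the one above checkable is to split it into discrete, identifiable items. The following is a minimal sketch in Python. The `SpecItem` class, the item IDs, and `render_spec` are hypothetical names chosen for illustration, not part of Ralph Workflow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecItem:
    """One acceptance criterion with a stable ID (hypothetical structure)."""
    item_id: str
    requirement: str


# The todo-list spec from the example, one item per sentence or endpoint.
TODO_SPEC = [
    SpecItem("framework", "Use FastAPI"),
    SpecItem("get-todos", "Endpoint: GET /todos"),
    SpecItem("post-todos", "Endpoint: POST /todos"),
    SpecItem("delete-todo", "Endpoint: DELETE /todos/:id"),
    SpecItem("missing-404", "Return 404 for missing IDs"),
    SpecItem("title-valid", "On POST, validate that title is a non-empty string"),
    SpecItem("tests-pass", "Run pytest and confirm all tests pass"),
    SpecItem("bounded-diff", "Return a diff bounded to these items only"),
]


def render_spec(items):
    """Render the items as a numbered list suitable for pasting into a prompt."""
    return "\n".join(
        f"{n}. [{item.item_id}] {item.requirement}"
        for n, item in enumerate(items, start=1)
    )


if __name__ == "__main__":
    print(render_spec(TODO_SPEC))
```

Giving each item an ID is a design choice: it lets a later verification report name exactly which requirement failed.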
What a Spec-Driven Agent Does Differently
A spec-first agent:
- Builds against acceptance criteria instead of implied intent
- Catches its own deviations before the human reviewer does
- Leaves a diff that traces directly to spec items
- Can be evaluated mechanically: did the diff satisfy the spec?
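As an illustration of evaluating a diff mechanically, the sketch below checks whether a diff stays within an agreed set of files. It assumes the diff is `git diff` output and that "bounded to these items" can be approximated by a list of allowed paths. That mapping is an assumption made here, not something the spec defines. `changed_files` and `out_of_scope` are hypothetical helper names.

```python
def changed_files(diff_text):
    """Collect file paths named in the headers of a git-style unified diff."""
    files = set()
    in_header = False
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            in_header = True          # file header begins
        elif line.startswith("@@"):
            in_header = False         # hunk content begins
        elif in_header and line.startswith(("--- ", "+++ ")):
            path = line[4:].strip()
            if path != "/dev/null":   # created or deleted side of the file
                if path[:2] in ("a/", "b/"):
                    path = path[2:]   # strip git's a/ and b/ prefixes
                files.add(path)
    return files


def out_of_scope(diff_text, allowed_paths):
    """Return changed files that are not in the allowed set, sorted."""
    return sorted(changed_files(diff_text) - set(allowed_paths))
```

An empty result from `out_of_scope` means the diff touched only the files it was allowed to touch. It says nothing about whether the changes inside those files are correct.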
The Verify Step Catches What the Build Step Misses
The verify pass is not "review the code." It is:
1. Run the spec items against the actual diff
2. Run the tests
3. Report what is satisfied and what is not
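The three steps above can be sketched as one function. This is an illustration under stated assumptions, not Ralph Workflow's implementation: `Check`, `VerifyReport`, `verify`, and `pytest_runner` are hypothetical names, each check is a plain predicate over the diff text, and the test runner assumes pytest is installed and treats exit code 0 as "all tests passed".

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Callable


@dataclass
class Check:
    """A spec item paired with a predicate over the diff text (hypothetical)."""
    item_id: str
    description: str
    predicate: Callable[[str], bool]


@dataclass
class VerifyReport:
    satisfied: list
    unsatisfied: list
    tests_passed: bool

    @property
    def ok(self):
        # The pass succeeds only if every item holds and the tests are green.
        return not self.unsatisfied and self.tests_passed


def pytest_runner():
    """Run pytest in the current directory; True when its exit code is 0."""
    result = subprocess.run(
        [sys.executable, "-m", "pytest", "-q"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def verify(diff_text, checks, run_tests=pytest_runner):
    satisfied, unsatisfied = [], []
    # Step 1: run each spec item against the actual diff.
    for check in checks:
        target = satisfied if check.predicate(diff_text) else unsatisfied
        target.append(check.item_id)
    # Step 2: run the tests.
    tests_passed = run_tests()
    # Step 3: report what is satisfied and what is not.
    return VerifyReport(satisfied, unsatisfied, tests_passed)
```

Passing the test runner as a parameter keeps the function easy to exercise without invoking pytest. Real checks would likely need more than substring matching on a diff.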
If the verify step fails, the loop goes back to the specific spec item that was not met — not to a generic retry.
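A targeted retry of this kind might look like the sketch below. The `build` and `verify` callables and the `max_rounds` limit are hypothetical placeholders: `build` stands in for whatever invokes the agent, and `verify` returns the IDs of spec items that are still unmet.

```python
from typing import Callable, List, Optional, Tuple


def run_loop(
    build: Callable[[Optional[List[str]]], str],
    verify: Callable[[str], List[str]],
    max_rounds: int = 5,
) -> Tuple[bool, List[str]]:
    """Build, verify, and retry only the spec items that failed.

    build(None) means "build the whole spec"; build([...]) means
    "fix just these items". Returns (success, remaining_failures).
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")

    failing: Optional[List[str]] = None  # first round covers the whole spec
    for _ in range(max_rounds):
        diff = build(failing)
        failing = verify(diff)
        if not failing:
            return True, []
    # Rounds exhausted: hand the unmet items back to the human reviewer.
    return False, list(failing or [])
```

The round limit is a safety choice made for this sketch, so an item the agent cannot satisfy ends the run with a clear list of failures instead of looping indefinitely.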
Spec-Driven Is Not New. The Loop Structure Is.
Spec-driven development has been a best practice for decades. The new part is applying it to AI coding agents: a CLI that enforces spec-first phases, runs the verify step automatically, and loops until the diff satisfies the spec.
Ralph Workflow runs your existing AI coding agents through spec-first phases on your own machine, with automated verification after each phase, so you wake up to a result you can actually review.
Try it on Codeberg: RalphWorkflow/Ralph-Workflow — star, fork, and open issues there. GitHub mirror: Ralph-Workflow/Ralph-Workflow.
Where Ralph Workflow Fits
Ralph Workflow is the operating system for autonomous coding: a free and open-source composable loop framework and AI orchestrator. It keeps the core loop simple, ships with a strong default workflow for writing software, and lets you use that default as-is or build your own workflow on top.