Spec-Driven AI Agent: Why Explicit Contracts Change What Your Agent Produces

Give an AI coding agent a prompt and it optimizes for completing the task. Give it a spec and it optimizes for satisfying a contract. The difference is visible in the first review.

Codeberg-first

Ralph Workflow is free and open source. Inspect the primary repo on Codeberg before you install — or jump to the GitHub mirror.

Prompt vs Spec: A Concrete Example

Prompt: "Build a REST API for a todo list."

Spec: "Build a REST API for a todo list. Use FastAPI. Endpoints: GET /todos, POST /todos, DELETE /todos/:id. Return 404 for missing IDs. On POST, validate that title is a non-empty string. Run pytest and confirm all tests pass. Return a diff bounded to these items only."

The prompt leaves everything to interpretation. The spec leaves almost nothing to interpretation — and that is the point.
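One way to see the gap is to write both versions down as data. The sketch below is illustrative only: `SpecItem`, `TODO_SPEC`, `PROMPT_ONLY`, and the item ids are hypothetical names chosen for this example, not part of Ralph Workflow or FastAPI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecItem:
    """One individually checkable requirement (hypothetical structure)."""
    id: str
    requirement: str


# The todo-list spec from the example, split into separate checkable items.
TODO_SPEC = [
    SpecItem("framework", "Use FastAPI"),
    SpecItem("get-todos", "GET /todos"),
    SpecItem("post-todos", "POST /todos"),
    SpecItem("delete-todo", "DELETE /todos/:id"),
    SpecItem("missing-id", "Return 404 for missing IDs"),
    SpecItem("title-validation", "POST validates that title is a non-empty string"),
    SpecItem("tests", "Run pytest and confirm all tests pass"),
    SpecItem("bounded-diff", "Return a diff bounded to these items only"),
]

# The bare prompt yields a single item with nothing concrete to check against.
PROMPT_ONLY = [SpecItem("task", "Build a REST API for a todo list")]

print(f"{len(PROMPT_ONLY)} item from the prompt, {len(TODO_SPEC)} from the spec")
```

Each entry in `TODO_SPEC` can be marked satisfied or unsatisfied on its own, which is what makes a later mechanical check possible. The single prompt item can only be judged as a whole.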

What a Spec-Driven Agent Does Differently

A spec-first agent:
- Builds against acceptance criteria instead of implied intent
- Catches its own deviations before the human reviewer does
- Leaves a diff that traces directly to spec items
- Can be evaluated mechanically: did the diff satisfy the spec?

The Verify Step Catches What the Build Step Misses

The verify pass is not "review the code." It is:
1. Run the spec items against the actual diff
2. Run the tests
3. Report what is satisfied and what is not

If the verify step fails, the loop goes back to the specific spec item that was not met — not to a generic retry.
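A minimal sketch of such a verify pass follows, under stated assumptions. `VerifyReport`, `verify`, `run_tests`, and the substring checks are hypothetical and deliberately naive; they are not how Ralph Workflow implements verification. The only external behavior relied on is that pytest exits with code 0 when all tests pass.

```python
import subprocess
import sys
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class VerifyReport:
    satisfied: list[str] = field(default_factory=list)
    unsatisfied: list[str] = field(default_factory=list)


def run_tests() -> bool:
    """Run pytest in the current directory; exit code 0 means all tests passed."""
    return subprocess.run([sys.executable, "-m", "pytest", "-q"]).returncode == 0


def verify(diff: str, checks: dict[str, Callable[[str], bool]], tests_pass: bool) -> VerifyReport:
    """Check each spec item against the diff, then record the test result."""
    report = VerifyReport()
    for item_id, check in checks.items():
        (report.satisfied if check(diff) else report.unsatisfied).append(item_id)
    (report.satisfied if tests_pass else report.unsatisfied).append("tests")
    return report


# Naive substring checks, for illustration only.
checks = {
    "get-todos": lambda d: '@app.get("/todos")' in d,
    "delete-todo": lambda d: '@app.delete("/todos/{todo_id}")' in d,
}
diff = '+@app.get("/todos")\n+def list_todos(): ...\n'

# tests_pass is hard-coded here; in a real run it would come from run_tests().
report = verify(diff, checks, tests_pass=True)
print(report.unsatisfied)  # ['delete-todo']
```

The report names the exact item that failed, so the next build pass can be pointed at `delete-todo` alone rather than at the whole task.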

Spec-Driven Is Not New. The Loop Structure Is.

Spec-driven development has been a best practice for decades. The new part is applying it to AI coding agents: a CLI that enforces spec-first phases, runs the verify step automatically, and loops until the diff satisfies the spec.
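The loop structure can be sketched in a few lines. This is an assumption-laden illustration, not Ralph Workflow's code: `run_until_satisfied`, `fake_build`, `fake_verify`, and the `max_rounds` cap are all hypothetical, and the "agent" here is a stub that handles one spec item per round.

```python
from typing import Callable


def run_until_satisfied(
    build: Callable[[list[str]], str],
    verify: Callable[[str], list[str]],
    all_items: list[str],
    max_rounds: int = 5,  # safety cap; an assumption, not taken from the article
) -> tuple[str, list[str]]:
    """Build, verify, and retry only the unmet spec items until none remain."""
    diff = ""
    pending = list(all_items)
    for _ in range(max_rounds):
        diff += build(pending)   # ask the agent for the pending items only
        pending = verify(diff)   # ids of spec items that are still unmet
        if not pending:
            break
    return diff, pending


ITEMS = ["get-todos", "post-todos", "delete-todo"]


def fake_build(items: list[str]) -> str:
    # Stand-in agent that only manages the first pending item each round.
    return f"+implements {items[0]}\n"


def fake_verify(diff: str) -> list[str]:
    # Stand-in verify step: an item counts as met once the diff mentions it.
    return [item for item in ITEMS if f"implements {item}" not in diff]


final_diff, unmet = run_until_satisfied(fake_build, fake_verify, ITEMS)
print(unmet)  # []
```

With the stub agent, the loop takes three rounds and ends with no unmet items. If the cap is reached first, the remaining ids come back to the caller instead of being silently dropped.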

Ralph Workflow runs your existing AI coding agents through spec-first phases on your own machine, with automated verification after each phase, so you wake up to a result you can actually review.


Try it on Codeberg: RalphWorkflow/Ralph-Workflow — star, fork, and open issues there. GitHub mirror: Ralph-Workflow/Ralph-Workflow.

Where Ralph Workflow Fits

Ralph Workflow is the operating system for autonomous coding: a free and open-source composable loop framework and AI orchestrator. It keeps the core loop simple, ships with a strong default workflow for writing software, and lets you use that default as-is or build your own workflow on top.

Best evaluator path

Turn the idea into a real overnight test, not another saved tab.

Codeberg-first: open the primary repo, choose one bounded backlog task, run it tonight, and ask one question tomorrow morning — would I merge this? GitHub stays available as the mirror.

Open the primary Codeberg repo

Read the public source before you install anything.

Pick a first task

Use the guide to choose a bounded backlog item that is honest to review.

Install and run Ralph Workflow

Keep the machine awake, then decide in the morning whether the diff is good enough to merge.