TOML Workflow Configuration for AI Agents: A Complete Guide to pipeline.toml
Every serious AI agent workflow needs a config file you can version, diff, and reason about. Here's how Ralph Workflow's TOML pipeline config turns multi-agent orchestration into plain text you can edit in vim.
Codeberg-first
Ralph Workflow is free and open source. Inspect the primary repo on Codeberg before you install — or jump to the GitHub mirror.
The hardest part of multi-agent coding isn't writing the code. It's expressing what each agent should do, in what order, with which handoff artifacts — without this knowledge living entirely in your head or in a tangle of shell scripts.
Ralph Workflow solves this with TOML. Not YAML with its indentation voodoo. Not JSON with its unreadable nesting. Plain, version-control-friendly TOML that anyone on your team can read and edit.
This is a complete walkthrough of the TOML configuration surface — what each file does, how they compose, and how to customize the pipeline without breaking the default safety rails.
The eight-file config set
Ralph Workflow starts you with a clean layering of TOML files. Run ralph --init and review what lands:
| File | Location | Purpose |
|---|---|---|
| ralph-workflow.toml | ~/.config/ | User-global main config |
| ralph-workflow-mcp.toml | ~/.config/ | MCP server definitions |
| ralph-workflow-pipeline.toml | ~/.config/ | Pipeline phase defaults |
| ralph-workflow-artifacts.toml | ~/.config/ | Artifact contract defaults |
| ralph-workflow.toml | .agent/ | Project-local override |
| mcp.toml | .agent/ | Project-local MCP override |
| pipeline.toml | .agent/ | Project-local pipeline phases |
| artifacts.toml | .agent/ | Project-local artifact contracts |
Override precedence is straightforward: CLI flags > project-local (.agent/) > user-global (~/.config/) > bundled defaults. Want to change a phase for one project? Edit .agent/pipeline.toml. Want a change across all projects? Edit ~/.config/. The defaults in the installed package are always there as a reference.
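The precedence chain above can be sketched as a layered merge. This is a minimal illustration, not Ralph Workflow's actual loader: the helper names `deep_merge` and `resolve_config` are hypothetical, and merging table by table (rather than replacing whole files) is an assumption.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict in which values from `override` win, table by table."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are tables: merge their keys recursively.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and arrays are replaced wholesale by the higher layer.
            merged[key] = deepcopy(value)
    return merged


def resolve_config(*layers: dict) -> dict:
    """Merge layers given from lowest to highest precedence."""
    result: dict = {}
    for layer in layers:
        result = deep_merge(result, layer)
    return result


# Hypothetical layer contents, ordered lowest to highest precedence.
bundled = {"general": {"verbosity": 2, "max_retries": 3}}
user_global = {"general": {"max_retries": 5}}   # ~/.config/
project_local = {"general": {"verbosity": 3}}   # .agent/
cli_flags: dict = {}

effective = resolve_config(bundled, user_global, project_local, cli_flags)
print(effective)  # {'general': {'verbosity': 3, 'max_retries': 5}}
```

Under this assumption an array such as child_blocks is replaced as a whole by the higher layer, so a project-local override has to restate the full list.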
Run ralph --regenerate-config to refresh all configs from bundled defaults — existing files get backed up to .bak before overwrite. Safe and reversible.
pipeline.toml: the heart of the workflow
The pipeline file defines your workflow as blocks with phases, transitions, and routing rules. Here's the default structure, annotated:
entry_block = "developer_iteration"
terminal_phase = "complete"
[loop_counters.development_analysis_iteration]
default_max = 10
[budget_counters.iteration]
default_max = 5
The pipeline knows two kinds of blocks:
- Group blocks (kind = "group"): compose child blocks into a sequence with completion rules. The default developer_iteration block orchestrates planning → analysis → development → commit → analysis → final-commit → complete.
- Individual blocks (kind = "individual"): a single phase with one agent assignment, one transition map, and one role.
A phase block, explained
[blocks.development]
kind = "individual"
phase_name = "development"
[blocks.development.phase]
drain = "development"
role = "execution"
prompt_template = "developer_iteration.jinja"
[blocks.development.phase.transitions]
on_success = "development_commit_cleanup"
on_loopback = "development"
[blocks.development.phase.parallelization]
mode = "same_workspace"
max_parallel_workers = 8
max_work_units = 50
require_allowed_directories = true
Each phase block answers four questions:
- What agent runs it? The drain field maps to a drain name, which maps to an agent chain in agents.toml.
- What prompt does it use? prompt_template points to a Jinja2 template.
- Where does it go next? transitions defines on_success, on_loopback, and on_failure targets.
- Can it fan out? parallelization configures same-workspace parallel execution with worker and work-unit caps.
Loop policy and decision routing
Phases don't just run once and move on. Analysis phases in particular have loop policies that let the agent decide whether work is done:
[blocks.development_analysis.phase.loop_policy]
iteration_state_field = "development_analysis_iteration"
[blocks.development_analysis.phase.decisions.completed]
target = "development_final_commit_cleanup"
reset_loop = true
[blocks.development_analysis.phase.decisions.request_changes]
target = "development"
reset_loop = false
The analysis agent can respond with completed, request_changes, or failed. The pipeline routes accordingly — back to development for more work, forward to final commit if done, or to a failure terminal if unrecoverable. The loop counter (development_analysis_iteration, capped at 10 by default) prevents infinite cycling.
Recovery policy
When things go wrong, the recovery section defines what happens:
[recovery]
cycle_cap = 200
failed_route = "failed_terminal"
terminal_failure_phase = "failed_terminal"
preserve_session_on_categories = ["agent"]
A hard cap of 200 cycles prevents runaway loops. Agent failures preserve the session so you can resume from a checkpoint rather than starting over.
Post-commit budget routing
After a development commit, the pipeline checks the budget counter:
[[post_commit_routes]]
target = "planning"
[post_commit_routes.when]
phase = "development_final_commit"
budget_state = "remaining"
[[post_commit_routes]]
target = "complete"
[post_commit_routes.when]
phase = "development_final_commit"
budget_state = "exhausted"
If the iteration budget has remaining capacity, the pipeline loops back to planning for another development pass. If the budget is exhausted, it terminates with complete. This is how multi-iteration feature development works without manual intervention.
agents.toml: who runs what
The agents file defines chains (ordered fallback lists of agents) and drain bindings (which chain each pipeline drain uses):
[agent_chains.development]
agents = ["claude", "opencode"]
max_retries = 3
retry_delay_ms = 1000
[agent_drains.development]
chain = "development"
drain_class = "development"
The development drain first tries claude. If it fails after 3 retries with 1-second backoff, it falls back to opencode. This is cost-aware routing: you can put a cheap model first with an expensive model as the fallback, or vice versa depending on the phase's requirements.
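A rough sketch of that fallback behavior, under stated assumptions: `run_chain` and the `attempt` callback are hypothetical stand-ins for launching a real agent, and max_retries is read here as the total number of attempts per agent, which the post leaves ambiguous.

```python
import time
from typing import Callable, Optional


def run_chain(
    agents: list[str],
    attempt: Callable[[str], bool],
    max_retries: int = 3,
    retry_delay_ms: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Try each agent in order; return the first one that succeeds, else None."""
    for agent in agents:
        for try_number in range(1, max_retries + 1):
            if attempt(agent):
                return agent
            if try_number < max_retries:
                # Fixed delay between attempts on the same agent.
                sleep(retry_delay_ms / 1000)
        # All attempts failed: fall through to the next agent in the chain.
    return None


calls: list[str] = []


def fake_attempt(agent: str) -> bool:
    """Simulated agent run: claude always fails, anything else succeeds."""
    calls.append(agent)
    return agent != "claude"


winner = run_chain(["claude", "opencode"], fake_attempt, sleep=lambda seconds: None)
print(winner)  # opencode
print(calls)   # ['claude', 'claude', 'claude', 'opencode']
```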
The ralph-workflow.toml main config adds another layer — agent chain definitions with explicit model families:
[agent_chains]
planning = ["claude/opus"]
development = [
"opencode/minimax/MiniMax-M2.7-highspeed",
"codex",
"claude/sonnet",
]
analysis = ["opencode/openai/gpt-5.4"]
commit = ["claude/haiku"]
Planning uses Claude Opus for reasoning quality. Development starts with MiniMax highspeed for cost efficiency, falls back to Codex CLI, then Claude Sonnet. Analysis uses GPT-5.4 for a second-opinion review. Commit messages get Haiku because they're short and mechanical.
ralph-workflow.toml: the main config
The main config controls global behavior. Every field is commented with its default:
[general]
# verbosity = 2
# max_retries = 3
# retry_delay_ms = 1000
# backoff_multiplier = 2.0
# max_backoff_ms = 60000
# agent_idle_timeout_seconds = 300.0
[general.workflow]
# checkpoint_enabled = true
Uncomment and edit only what you need to change. Everything else keeps its documented default.
Key settings worth knowing:
- agent_idle_timeout_seconds (300) — how long before a stalled agent is killed
- agent_idle_max_waiting_on_child_seconds (1800) — hard ceiling on cumulative child-process waiting
- execution_history_limit (1000) — caps memory usage during long runs
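The retry fields in [general] suggest an exponential backoff schedule. The formula below is an assumption about how retry_delay_ms, backoff_multiplier, and max_backoff_ms combine, not a confirmed description of the runner; `backoff_delays` is a hypothetical helper.

```python
def backoff_delays(
    attempts: int,
    retry_delay_ms: int = 1000,
    backoff_multiplier: float = 2.0,
    max_backoff_ms: int = 60000,
) -> list[int]:
    """Delay before each retry: base * multiplier**n, capped at max_backoff_ms."""
    return [
        int(min(retry_delay_ms * backoff_multiplier ** n, max_backoff_ms))
        for n in range(attempts)
    ]


print(backoff_delays(3))  # [1000, 2000, 4000]
print(backoff_delays(8))  # [1000, 2000, 4000, 8000, 16000, 32000, 60000, 60000]
```

With the documented default of max_retries = 3, only the first few delays would ever apply; the cap matters once you raise the retry count.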
mcp.toml: tool servers
The MCP config defines external tool servers that agents can call:
[[servers]]
name = "brave"
command = "npx"
args = ["-y", "@anthropic/mcp-server-brave-search"]
env = { BRAVE_API_KEY = "..." }
[[servers]]
name = "github"
command = "npx"
args = ["-y", "@anthropic/mcp-server-github"]
env = { GITHUB_PERSONAL_ACCESS_TOKEN = "..." }
Each server gets a name, a launch command with arguments, and environment variables. Agents reference servers by name, and the runner manages lifecycle — start, health check, request routing, and cleanup.
The mcp_config_as_upstreams() function in the runtime merges .agent/mcp.toml with environment-variable-defined servers, so you can layer team-wide servers (env vars) with project-specific ones (TOML).
Customizing your pipeline
Here's a realistic example: you want to add a lint phase that runs after the development commit and before the development analysis.
# In .agent/pipeline.toml — project-local override
[blocks.developer_iteration]
kind = "group"
child_blocks = [
"planning",
"planning_analysis",
"development",
"development_commit_cleanup",
"development_commit",
"lint",
"development_analysis",
"development_final_commit_cleanup",
"development_final_commit",
"complete",
"failed_terminal",
]
[blocks.lint]
kind = "individual"
phase_name = "lint"
[blocks.lint.phase]
drain = "analysis"
role = "analysis"
prompt_template = "lint_check.jinja"
[blocks.lint.phase.transitions]
on_success = "development_analysis"
on_failure = "development"
[blocks.lint.phase.loop_policy]
iteration_state_field = "development_analysis_iteration"
The key constraint: every phase name in child_blocks must have a corresponding [blocks.<name>] definition with a phase.drain that maps to a defined chain in agents.toml.
Parallel fan-out
The development phase supports parallel execution out of the box:
[blocks.development.phase.parallelization]
mode = "same_workspace"
max_parallel_workers = 8
max_work_units = 50
require_allowed_directories = true
post_fanout_verification = false
When parallelization is enabled:
- The planning phase produces a work-units artifact
- The development phase fans out — each work unit gets an isolated namespace within the workspace, up to 8 concurrent workers and 50 total units
- require_allowed_directories = true prevents workers from writing outside their designated namespace
- post_fanout_verification = false skips post-fan-out serial verification (enable it for safety-critical projects)
The parallel coordinator manages structured concurrency with run_fan_out() — it spawns workers, tracks completions and failures, and emits worker_started, worker_completed, and worker_failed events for the display layer. If a worker fails, its work unit gets reported in the summary, and other workers continue.
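The behavior described above, where one failing worker does not stop the others, can be approximated with the standard library. This is a simplified stand-in for illustration, not the real run_fan_out(): `fan_out_sketch` and its `worker` callback are hypothetical, rejecting runs over max_work_units is an assumption, and worker_started is emitted at submission time rather than when a thread actually picks the unit up.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable


def fan_out_sketch(
    work_units: list[str],
    worker: Callable[[str], None],
    max_parallel_workers: int = 8,
    max_work_units: int = 50,
) -> dict:
    """Run `worker` on every unit concurrently and summarize the outcome."""
    if len(work_units) > max_work_units:
        # Assumption: oversized plans are rejected rather than truncated.
        raise ValueError(f"{len(work_units)} work units exceeds cap of {max_work_units}")
    events: list[tuple[str, str]] = []
    completed: list[str] = []
    failed: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_parallel_workers) as pool:
        futures = {}
        for unit in work_units:
            events.append(("worker_started", unit))
            futures[pool.submit(worker, unit)] = unit
        for future in as_completed(futures):
            unit = futures[future]
            try:
                future.result()
            except Exception as exc:
                # A failure is recorded; the remaining workers keep running.
                failed[unit] = str(exc)
                events.append(("worker_failed", unit))
            else:
                completed.append(unit)
                events.append(("worker_completed", unit))
    return {"completed": sorted(completed), "failed": failed, "events": events}


def demo_worker(unit: str) -> None:
    """Simulated work unit: the 'ui' unit always fails."""
    if unit == "ui":
        raise RuntimeError("lint error")


summary = fan_out_sketch(["api", "ui", "docs"], demo_worker)
print(summary["completed"])  # ['api', 'docs']
print(summary["failed"])     # {'ui': 'lint error'}
```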
Checkpoint and resume
Checkpointing is built into the pipeline runner, not bolted on:
[general.workflow]
checkpoint_enabled = true
When enabled, the pipeline saves checkpoints at key boundaries. If a run is interrupted, you resume from the last checkpoint instead of starting over. The checkpoint path is derived from the workspace scope, and the save logic handles race conditions gracefully.
Checkpoint events (CHECKPOINT_SAVED) are emitted into the pipeline event stream, so the display layer can surface when a checkpoint was saved and what state it preserves.
Why TOML
TOML is the right configuration language for AI workflows for the same reason it works for Rust's Cargo: it is explicit, greppable, and diff-friendly. Every team member can read a pipeline.toml and understand the workflow without learning a DSL. Version control diffs are clean because TOML is line-oriented. And the structured table syntax ([blocks.development.phase.parallelization]) matches the hierarchical nature of pipeline configuration naturally.
Compare this to JSON config, where nested objects become hard to read past three levels, or YAML, where a stray indentation change can silently alter the workflow semantics. A TOML parser rejects malformed input with an error that points at the offending line.
Getting started
pip install ralph-workflow
ralph --init
# Edit .agent/pipeline.toml to customize phases
# Edit .agent/agents.toml to customize agent chains
# Run the workflow
ralph --task "Implement the feature described in MY_TASK.md"
The config files live in plain text next to your code. Commit them. Review them in PRs. Treat your workflow configuration the same way you treat your infrastructure config — because that's what it is.
Primary repo (Codeberg): codeberg.org/RalphWorkflow/Ralph-Workflow
Mirror (GitHub): github.com/Ralph-Workflow/Ralph-Workflow
Docs: ralphworkflow.com/docs