TOML Workflow Configuration for AI Agents: A Complete Guide to pipeline.toml

Every serious AI agent workflow needs a config file you can version, diff, and reason about. Here's how Ralph Workflow's TOML pipeline config turns multi-agent orchestration into plain text you can edit in vim.

Codeberg-first

Ralph Workflow is free and open source. Inspect the primary repo on Codeberg before you install — or jump to the GitHub mirror.

The hardest part of multi-agent coding isn't writing the code. It's expressing what each agent should do, in what order, with which handoff artifacts — without this knowledge living entirely in your head or in a tangle of shell scripts.

Ralph Workflow solves this with TOML. Not YAML with its indentation voodoo. Not JSON with its unreadable nesting. Plain, version-control-friendly TOML that anyone on your team can read and edit.

This is a complete walkthrough of the TOML configuration surface — what each file does, how they compose, and how to customize the pipeline without breaking the default safety rails.

The eight-file config set

Ralph Workflow starts you with a clean layering of TOML files. Run ralph --init and review what lands:

| File | Location | Purpose |
| --- | --- | --- |
| ralph-workflow.toml | ~/.config/ | User-global main config |
| ralph-workflow-mcp.toml | ~/.config/ | MCP server definitions |
| ralph-workflow-pipeline.toml | ~/.config/ | Pipeline phase defaults |
| ralph-workflow-artifacts.toml | ~/.config/ | Artifact contract defaults |
| ralph-workflow.toml | .agent/ | Project-local override |
| mcp.toml | .agent/ | Project-local MCP override |
| pipeline.toml | .agent/ | Project-local pipeline phases |
| artifacts.toml | .agent/ | Project-local artifact contracts |

Override precedence is straightforward: CLI flags > project-local (.agent/) > user-global (~/.config/) > bundled defaults. Want to change a phase for one project? Edit .agent/pipeline.toml. Want a change across all projects? Edit ~/.config/. The defaults in the installed package are always there as a reference.

Run ralph --regenerate-config to refresh all configs from bundled defaults — existing files get backed up to .bak before overwrite. Safe and reversible.

pipeline.toml: the heart of the workflow

The pipeline file defines your workflow as blocks with phases, transitions, and routing rules. Here's the default structure, annotated:

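# Block where a run starts
entry_block = "developer_iteration"
# Phase that marks the run as finished
terminal_phase = "complete"

# Cap on development <-> analysis loops, so the cycle cannot run forever
[loop_counters.development_analysis_iteration]
default_max = 10

# Budget of planning-to-final-commit passes per run
[budget_counters.iteration]
default_max = 5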

The pipeline knows two kinds of blocks:

  • Group blocks (kind = "group") — compose child blocks into a sequence with completion rules. The default developer_iteration block orchestrates planning → analysis → development → commit → analysis → final-commit → complete.
  • Individual blocks (kind = "individual") — a single phase with one agent assignment, one transition map, and one role.

A phase block, explained

[blocks.development]
kind = "individual"
phase_name = "development"
[blocks.development.phase]
drain = "development"
role = "execution"
prompt_template = "developer_iteration.jinja"
[blocks.development.phase.transitions]
on_success = "development_commit_cleanup"
on_loopback = "development"
[blocks.development.phase.parallelization]
mode = "same_workspace"
max_parallel_workers = 8
max_work_units = 50
require_allowed_directories = true

Each phase block answers four questions:

  1. What agent runs it? The drain field maps to a drain name, which maps to an agent chain in agents.toml.
  2. What prompt does it use? The prompt_template field points to a Jinja2 template.
  3. Where does it go next? The transitions table defines on_success, on_loopback, and on_failure targets.
  4. Can it fan out? The parallelization table configures same-workspace parallel execution with worker and work-unit caps.

Loop policy and decision routing

Phases don't just run once and move on. Analysis phases in particular have loop policies that let the agent decide whether work is done:

[blocks.development_analysis.phase.loop_policy]
iteration_state_field = "development_analysis_iteration"

[blocks.development_analysis.phase.decisions.completed]
target = "development_final_commit_cleanup"
reset_loop = true

[blocks.development_analysis.phase.decisions.request_changes]
target = "development"
reset_loop = false

The analysis agent can respond with completed, request_changes, or failed. The pipeline routes accordingly — back to development for more work, forward to final commit if done, or to a failure terminal if unrecoverable. The loop counter (development_analysis_iteration, capped at 10 by default) prevents infinite cycling.
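The routing described above can be sketched in a few lines of Python. This is not Ralph Workflow's runner: the route function is hypothetical, and two behaviors are assumptions, namely that a failed verdict goes to failed_terminal and that hitting the loop cap also routes to failure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    target: str
    reset_loop: bool


# Mirrors the two decision tables shown in the TOML above.
DECISIONS = {
    "completed": Decision("development_final_commit_cleanup", True),
    "request_changes": Decision("development", False),
}
FAILED_TARGET = "failed_terminal"  # assumption: where "failed" verdicts go
LOOP_MAX = 10  # loop_counters.development_analysis_iteration.default_max


def route(verdict: str, iteration: int) -> tuple[str, int]:
    """Return (next_phase, new_loop_count) for an analysis verdict."""
    decision = DECISIONS.get(verdict)
    if decision is None:
        return FAILED_TARGET, iteration
    if decision.reset_loop:
        return decision.target, 0
    next_iteration = iteration + 1
    if next_iteration >= LOOP_MAX:
        # Assumption: an exhausted loop counter is treated as a failure.
        return FAILED_TARGET, next_iteration
    return decision.target, next_iteration


print(route("request_changes", 0))  # ('development', 1)
print(route("completed", 4))        # ('development_final_commit_cleanup', 0)
```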

Recovery policy

When things go wrong, the recovery section defines what happens:

[recovery]
cycle_cap = 200
failed_route = "failed_terminal"
terminal_failure_phase = "failed_terminal"
preserve_session_on_categories = ["agent"]

A hard cap of 200 cycles prevents runaway loops. Agent failures preserve the session so you can resume from a checkpoint rather than starting over.

Post-commit budget routing

After a development commit, the pipeline checks the budget counter:

[[post_commit_routes]]
target = "planning"
[post_commit_routes.when]
phase = "development_final_commit"
budget_state = "remaining"

[[post_commit_routes]]
target = "complete"
[post_commit_routes.when]
phase = "development_final_commit"
budget_state = "exhausted"

If the iteration budget has remaining capacity, the pipeline loops back to planning for another development pass. If the budget is exhausted, it terminates with complete. This is how multi-iteration feature development works without manual intervention.
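As a rough illustration, the two routes can be modeled as a first-match lookup. The function names below are hypothetical, and the sketch assumes the budget is "remaining" while the number of completed passes is below default_max; the real runner may count differently.

```python
# The two [[post_commit_routes]] entries above, as plain Python data.
POST_COMMIT_ROUTES = [
    {"target": "planning",
     "when": {"phase": "development_final_commit", "budget_state": "remaining"}},
    {"target": "complete",
     "when": {"phase": "development_final_commit", "budget_state": "exhausted"}},
]


def budget_state(used: int, default_max: int) -> str:
    """Assumption: the budget is 'remaining' while used < default_max."""
    return "remaining" if used < default_max else "exhausted"


def next_after_commit(phase: str, used: int, default_max: int = 5):
    """Return the target of the first route whose `when` clause matches, else None."""
    state = {"phase": phase, "budget_state": budget_state(used, default_max)}
    for candidate in POST_COMMIT_ROUTES:
        if all(state.get(key) == value for key, value in candidate["when"].items()):
            return candidate["target"]
    return None


print(next_after_commit("development_final_commit", used=2))  # planning
print(next_after_commit("development_final_commit", used=5))  # complete
```

A commit phase with no matching route returns None here; what the real pipeline does in that case is not covered above.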

agents.toml: who runs what

The agents file defines chains (ordered fallback lists of agents) and drain bindings (which chain each pipeline drain uses):

[agent_chains.development]
agents = ["claude", "opencode"]
max_retries = 3
retry_delay_ms = 1000

[agent_drains.development]
chain = "development"
drain_class = "development"

The development drain first tries claude. If it fails after 3 retries with 1-second backoff, it falls back to opencode. This is cost-aware routing: you can put a cheap model first with an expensive model as the fallback, or vice versa depending on the phase's requirements.
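Here is a small sketch of that fallback behavior. It assumes max_retries means three attempts in total per agent and that the delay between attempts is fixed; run_chain and fake_agent are made-up names for illustration, not part of Ralph Workflow.

```python
import time


def run_chain(agents, run_agent, max_retries=3, retry_delay_ms=1000, sleep=time.sleep):
    """Try each agent in order; return (result, attempt_log) from the first success."""
    attempt_log = []
    for agent in agents:
        for attempt in range(1, max_retries + 1):
            attempt_log.append((agent, attempt))
            try:
                return run_agent(agent), attempt_log
            except RuntimeError:
                # Wait before retrying the same agent; no wait before falling back.
                if attempt < max_retries:
                    sleep(retry_delay_ms / 1000)
    raise RuntimeError(f"all agents failed: {agents}")


def fake_agent(name):
    """Stand-in for a real agent call: claude always fails, others succeed."""
    if name == "claude":
        raise RuntimeError("simulated failure")
    return f"{name} ok"


# Pass a no-op sleep so the demo runs instantly.
result, attempt_log = run_chain(["claude", "opencode"], fake_agent, sleep=lambda _s: None)
print(result)       # opencode ok
print(attempt_log)  # [('claude', 1), ('claude', 2), ('claude', 3), ('opencode', 1)]
```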

The ralph-workflow.toml main config adds another layer — agent chain definitions with explicit model families:

[agent_chains]
planning = ["claude/opus"]
development = [
  "opencode/minimax/MiniMax-M2.7-highspeed",
  "codex",
  "claude/sonnet",
]
analysis = ["opencode/openai/gpt-5.4"]
commit = ["claude/haiku"]

Planning uses Claude Opus for reasoning quality. Development starts with MiniMax highspeed for cost efficiency, falls back to Codex CLI, then Claude Sonnet. Analysis uses GPT-5.4 for a second-opinion review. Commit messages get Haiku because they're short and mechanical.

ralph-workflow.toml: the main config

The main config controls global behavior. Every field is commented with its default:

[general]
# verbosity = 2
# max_retries = 3
# retry_delay_ms = 1000
# backoff_multiplier = 2.0
# max_backoff_ms = 60000
# agent_idle_timeout_seconds = 300.0

[general.workflow]
# checkpoint_enabled = true

Uncomment and edit only what you need to change. Everything else keeps its documented default.

Key settings worth knowing:

  • agent_idle_timeout_seconds (300): how long before a stalled agent is killed
  • agent_idle_max_waiting_on_child_seconds (1800): hard ceiling on cumulative child-process waiting
  • execution_history_limit (1000): caps memory usage during long runs
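The retry fields in [general] read like a standard exponential backoff. Assuming that is how they combine (delay = retry_delay_ms × backoff_multiplier^n, capped at max_backoff_ms; the article does not spell out the formula), the default schedule would look like this:

```python
def backoff_schedule(attempts, retry_delay_ms=1000, backoff_multiplier=2.0,
                     max_backoff_ms=60000):
    """Delay in ms before each retry, assuming capped exponential backoff."""
    return [
        min(int(retry_delay_ms * backoff_multiplier ** n), max_backoff_ms)
        for n in range(attempts)
    ]


print(backoff_schedule(8))
# [1000, 2000, 4000, 8000, 16000, 32000, 60000, 60000]
```

With the documented defaults, the cap kicks in at the seventh retry, where the uncapped delay would be 64 seconds.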

mcp.toml: tool servers

The MCP config defines external tool servers that agents can call:

[[servers]]
name = "brave"
command = "npx"
args = ["-y", "@anthropic/mcp-server-brave-search"]
env = { BRAVE_API_KEY = "..." }

[[servers]]
name = "github"
command = "npx"
args = ["-y", "@anthropic/mcp-server-github"]
env = { GITHUB_PERSONAL_ACCESS_TOKEN = "..." }

Each server gets a name, a launch command with arguments, and environment variables. Agents reference servers by name, and the runner manages lifecycle — start, health check, request routing, and cleanup.

The mcp_config_as_upstreams() function in the runtime merges .agent/mcp.toml with environment-variable-defined servers, so you can layer team-wide servers (env vars) with project-specific ones (TOML).

Customizing your pipeline

Here's a realistic example: you want to add a linter phase between the development commit and the development analysis.

# In .agent/pipeline.toml — project-local override

[blocks.developer_iteration]
kind = "group"
child_blocks = [
  "planning",
  "planning_analysis",
  "development",
  "development_commit_cleanup",
  "development_commit",
  "lint",
  "development_analysis",
  "development_final_commit_cleanup",
  "development_final_commit",
  "complete",
  "failed_terminal",
]

[blocks.lint]
kind = "individual"
phase_name = "lint"
[blocks.lint.phase]
drain = "analysis"
role = "analysis"
prompt_template = "lint_check.jinja"
[blocks.lint.phase.transitions]
on_success = "development_analysis"
on_failure = "development"
[blocks.lint.phase.loop_policy]
iteration_state_field = "development_analysis_iteration"

The key constraint: every phase name in child_blocks must have a corresponding [blocks.<name>] definition with a phase.drain that maps to a defined chain in agents.toml.

Parallel fan-out

The development phase supports parallel execution out of the box:

[blocks.development.phase.parallelization]
mode = "same_workspace"
max_parallel_workers = 8
max_work_units = 50
require_allowed_directories = true
post_fanout_verification = false

When parallelization is enabled:

  • The planning phase produces a work-units artifact.
  • The development phase fans out: each work unit gets an isolated namespace within the workspace, up to 8 concurrent workers and 50 total units.
  • require_allowed_directories = true prevents workers from writing outside their designated namespace.
  • post_fanout_verification = false skips post-fan-out serial verification (enable it for safety-critical projects).

The parallel coordinator manages structured concurrency with run_fan_out() — it spawns workers, tracks completions and failures, and emits worker_started, worker_completed, and worker_failed events for the display layer. If a worker fails, its work unit gets reported in the summary, and other workers continue.
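The coordinator's behavior can be approximated with the standard library. The fan_out function below is a hypothetical stand-in, not the real run_fan_out(): it uses a thread pool, rejects oversized batches (an assumption about how max_work_units is enforced), and keeps going when a single worker fails.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fan_out(work_units, worker, max_parallel_workers=8, max_work_units=50):
    """Run `worker` over every unit in parallel and summarize the outcome."""
    if len(work_units) > max_work_units:
        raise ValueError(f"{len(work_units)} work units exceeds cap of {max_work_units}")
    events, completed, failed = [], [], []
    with ThreadPoolExecutor(max_workers=max_parallel_workers) as pool:
        futures = {}
        for unit in work_units:
            # In this sketch the event is emitted at submission time.
            events.append(("worker_started", unit))
            futures[pool.submit(worker, unit)] = unit
        for future in as_completed(futures):
            unit = futures[future]
            try:
                future.result()
            except Exception:
                # One failure is recorded; the remaining workers keep running.
                failed.append(unit)
                events.append(("worker_failed", unit))
            else:
                completed.append(unit)
                events.append(("worker_completed", unit))
    return {"completed": sorted(completed), "failed": sorted(failed), "events": events}


def demo_worker(unit):
    """Stand-in worker: unit-2 fails, everything else succeeds."""
    if unit == "unit-2":
        raise RuntimeError("simulated worker failure")


summary = fan_out(["unit-1", "unit-2", "unit-3"], demo_worker)
print(summary["completed"])  # ['unit-1', 'unit-3']
print(summary["failed"])     # ['unit-2']
```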

Checkpoint and resume

Checkpointing is built into the pipeline runner, not bolted on:

[general.workflow]
checkpoint_enabled = true

When enabled, the pipeline saves checkpoints at key boundaries. If a run is interrupted, you resume from the last checkpoint instead of starting over. The checkpoint path is derived from the workspace scope, and the save logic handles race conditions gracefully.

Checkpoint events (CHECKPOINT_SAVED) are emitted into the pipeline event stream, so the display layer can surface when a checkpoint was saved and what state it preserves.
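The article does not describe how the save logic avoids races. One common technique, shown here purely as an illustration, is to write the checkpoint to a temporary file and atomically rename it into place, so a reader never sees a half-written file. The JSON format and the function names are assumptions, not Ralph Workflow's implementation.

```python
import json
import os
import tempfile
from pathlib import Path


def save_checkpoint(path: Path, state: dict) -> None:
    """Write `state` to `path` via a temp file and an atomic rename."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle)
        # os.replace swaps the file in one step, overwriting any older checkpoint.
        os.replace(tmp_name, path)
    except BaseException:
        # Remove the partial temp file so a failed save leaves no debris.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def load_checkpoint(path: Path):
    """Return the saved state, or None if no checkpoint exists yet."""
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as workspace:
    checkpoint = Path(workspace) / "checkpoints" / "run.json"
    print(load_checkpoint(checkpoint))  # None
    save_checkpoint(checkpoint, {"phase": "development", "iteration": 2})
    print(load_checkpoint(checkpoint))  # {'phase': 'development', 'iteration': 2}
```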

Why TOML

TOML is the right configuration language for AI workflows for the same reason it works for Rust's Cargo: it is explicit, greppable, and diff-friendly. Every team member can read a pipeline.toml and understand the workflow without learning a DSL. Version control diffs are clean because TOML is line-oriented. And the structured table syntax ([blocks.development.phase.parallelization]) matches the hierarchical nature of pipeline configuration naturally.

Compare this to JSON config, where nested objects become unreadable past three levels, or YAML, where a stray indentation error silently changes the workflow semantics. TOML tells you exactly what went wrong at the line level.

Getting started

pip install ralph-workflow
ralph --init
# Edit .agent/pipeline.toml to customize phases
# Edit .agent/agents.toml to customize agent chains
# Run the workflow
ralph --task "Implement the feature described in MY_TASK.md"

The config files live in plain text next to your code. Commit them. Review them in PRs. Treat your workflow configuration the same way you treat your infrastructure config — because that's what it is.


Primary repo (Codeberg): codeberg.org/RalphWorkflow/Ralph-Workflow
Mirror (GitHub): github.com/Ralph-Workflow/Ralph-Workflow
Docs: ralphworkflow.com/docs
