Orchestrate Claude Code, Codex, and Gemini alongside your agents
Dispatch a task to a coding agent — Claude Code, Codex, Gemini CLI, or OpenCode — and get back schema-validated output with cost and turn caps.
Use a coding agent as a step inside a regular reasoner. The harness spawns Claude Code (or Codex, Gemini, OpenCode), gives it tool access, enforces a cost cap, and returns a Pydantic-validated object — not free-form text.
```python
from pydantic import BaseModel

from agentfield import Agent, HarnessConfig

app = Agent(
    node_id="migrator",
    harness_config=HarnessConfig(provider="claude-code", model="sonnet"),
)

class MigrationPlan(BaseModel):
    sql_statements: list[str]
    rollback_steps: list[str]
    risk_assessment: str

@app.reasoner()
async def plan_migration(description: str) -> dict:
    # Coding agent reads the schema, writes SQL, validates against the Pydantic model
    result = await app.harness(
        f"Analyze the database schema and produce a migration plan for: {description}",
        schema=MigrationPlan,
        max_budget_usd=1.00,  # hard cost cap
        max_turns=20,
    )
    if result.is_error:
        return {"error": result.failure_type, "message": result.error_message}
    plan: MigrationPlan = result.parsed
    return {
        "sql": plan.sql_statements,
        "rollback": plan.rollback_steps,
        "cost_usd": result.cost_usd,
        "turns": result.num_turns,
    }

# Swap providers per-call — Codex for test generation, Gemini for big refactors
@app.reasoner()
async def write_tests(module: str) -> dict:
    result = await app.harness(
        f"Write a comprehensive test suite for {module}.",
        provider="codex",
        model="o4-mini",
        max_turns=40,
    )
    return {"output": result.text, "cost_usd": result.cost_usd}

app.run()
```

What this gives you
- A multi-turn coding agent looks like a regular function call — input goes in, validated object comes out.
- Cost and turn caps prevent runaway spending. The agent stops cleanly when either is hit.
- `failure_type` distinguishes timeouts, crashes, schema-validation failures, and API errors.
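The error branch in `plan_migration` above returns `failure_type` as-is; in practice each failure class usually maps to a different recovery path. A minimal sketch of that dispatch, using a stand-in result object (its fields mirror the harness result used above) and hypothetical `failure_type` strings — the real values come from the harness, not from this example:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class HarnessResult:
    # Stand-in for the object returned by app.harness(); field names
    # mirror the ones used in the example above.
    is_error: bool
    failure_type: Optional[str] = None
    error_message: Optional[str] = None

def recovery_action(result: HarnessResult) -> str:
    """Map a failed harness call to a recovery strategy.

    The failure_type keys below are illustrative placeholders, not
    the library's actual values.
    """
    if not result.is_error:
        return "none"
    return {
        "budget_exceeded": "retry_with_higher_cap",   # hypothetical value
        "max_turns_exceeded": "split_task",           # hypothetical value
        "schema_validation": "retry_with_feedback",   # hypothetical value
    }.get(result.failure_type or "", "escalate_to_human")
```

Keeping the mapping in one place means reasoners can stay thin: they return the failure, and a shared policy decides whether to retry, split the task, or escalate.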
Next
- Harness reference
- Pair with: Multi-step human approval
Trigger agents from GitHub, Stripe, Slack, or any custom webhook
Bind a reasoner to an external event source. The control plane verifies the signature, drops replays, and dispatches the event to your reasoner.
A/B test agent deployments
Run two versions of an agent side by side, route a percentage of traffic to each, and compare results from the workflow DAG.