Sessions
Realtime and multimodal ingress for agent workflows, routed through the AgentField control plane.
Sessions are long-lived entrypoints for realtime or multimodal interactions. Use them when the caller is not making a single request-response call, but opening an interaction that may include audio, text, tool calls, and multiple turns.
The important boundary is the control plane:
- The browser or external client starts an AgentField session.
- The control plane owns the provider boundary and realtime transport.
- Agent work still happens through reasoners and app.call.
- Tool calls from the session are routed through execute/async with the session ID attached.
That means a voice call can still produce normal AgentField workflows, DAGs, replay surfaces, and provenance records.
Explicit provider and transport
AgentField validates provider and transport, but does not infer one from the other or switch providers for you. If a provider does not support a transport, the control plane returns a clear validation error.
from agentfield import Agent

app = Agent(node_id="voice-support-af")

@app.session(
    "voice",
    provider="openai",
    model="gpt-realtime-2",
    transport="webrtc",
    modalities=["audio", "text"],
    voice="marin",
    tools=["support.resolve_voice_turn"],
    tags=["support:voice", "pii:limited"],
)
async def voice(session):
    turn = await session.input()
    result = await session.call("support.resolve_voice_turn", turn=turn)
    await session.say(result["spoken_response"])

import { Agent } from "@agentfield/sdk";
const agent = new Agent({ nodeId: "voice-support-af" });

agent.session("voice", {
  provider: "openai",
  model: "gpt-realtime-2",
  transport: "webrtc",
  modalities: ["audio", "text"],
  voice: "marin",
  tools: ["support.resolveVoiceTurn"],
  tags: ["support:voice", "pii:limited"],
}, async (session) => {
  const turn = await session.input();
  const result = await session.call("support.resolveVoiceTurn", { turn });
  await session.say(result.spokenResponse);
});

app.RegisterSession("voice", "openai", "webrtc",
	agent.WithSessionModel("gpt-realtime-2"),
	agent.WithSessionModalities("audio", "text"),
	agent.WithSessionVoice("marin"),
	agent.WithSessionTools("support.resolve_voice_turn"),
	agent.WithSessionTags("support:voice", "pii:limited"),
)

Sessions vs Reasoners
Reasoners are callable units of work. Sessions are ingress surfaces that can call reasoners.
| Primitive | Use it for | Lifecycle |
|---|---|---|
| Reasoner | One typed decision, workflow step, or API capability | Request-response or async execution |
| Session | Realtime voice, live text, multimodal turns, browser calls | Long-lived interaction with provider setup |
Most applications use both: the session handles realtime input and output, while reasoners do the structured business work.
What tools Means
The tools option is not how your session handler gets access to reasoners. Your handler can call reasoners directly with session.call(...) or app.call(...).
@app.session("voice", provider="openai", transport="webrtc")
async def voice(session):
    turn = await session.input()
    result = await session.call("support.resolve_voice_turn", turn=turn)
    await session.say(result["spoken_response"])

Use tools=[...] when the realtime loop itself needs a provider-visible allowlist of AgentField capabilities. For example, during a live audio call, the realtime model may decide it needs to look up an order or request an approval before answering. The tool allowlist tells AgentField which targets may be invoked through the session tool endpoint.
@app.session(
    "voice",
    provider="openai",
    transport="webrtc",
    tools=[
        "orders.lookup_order",
        "refunds.request_approval",
    ],
)
async def voice(session):
    ...

Each entry is an AgentField target. It is exposed to the live session as an allowed tool, then routed back through the control plane:
provider/client tool call
  -> POST /api/v1/session-instances/:session_id/tools/:tool
  -> POST /api/v1/execute/async/:target

So the distinction is:
| API | Who decides to call it? | What it does |
|---|---|---|
| session.call(...) | Your session handler code | Calls a reasoner or skill directly from the handler |
| tools=[...] | The realtime provider/client tool loop | Allows selected AgentField targets to be invoked autonomously during the live session |
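The allowlist behaviour described above can be sketched in a few lines of Python. This is only an illustration of the routing idea, not AgentField's implementation: `route_tool_call` is a hypothetical helper, and rejecting unlisted tools with `PermissionError` is an assumption about how a denial might surface. The two paths are the ones shown in the routing flow.

```python
from urllib.parse import quote


# Hypothetical helper for illustration; it is not part of the AgentField SDK.
def route_tool_call(session_id: str, tool: str, allowed_tools: list[str]) -> dict[str, str]:
    """Map a provider/client tool call onto the two control-plane paths."""
    if tool not in allowed_tools:
        # Assumption: targets outside tools=[...] are rejected outright.
        raise PermissionError(f"Tool '{tool}' is not in the session allowlist")
    safe_session = quote(session_id, safe="")
    safe_tool = quote(tool, safe="")
    return {
        # Endpoint the provider/client tool loop posts to.
        "session_tool_path": f"/api/v1/session-instances/{safe_session}/tools/{safe_tool}",
        # Endpoint the call is then routed to; each allowlist entry is itself a target.
        "execute_path": f"/api/v1/execute/async/{safe_tool}",
        # The session ID travels with the execution.
        "session_id": session_id,
    }
```

A handler that calls `session.call(...)` directly never passes through this check; the allowlist only governs calls that the realtime loop initiates on its own.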
Tags and Access Control
Sessions can declare tags just like reasoners and skills. Those tags are proposed at registration time, approved through the same access-management flow, and included in the target tag set used when a caller starts the session.
@app.session(
    "voice",
    provider="openai",
    transport="webrtc",
    tags=["support:voice", "pii:limited"],
)
async def voice(session):
    ...

Use session tags for ingress-level policy: who can start the live session, which data class the session may touch, or which team owns the interaction. Use reasoner and skill tags for the work the session calls after it starts.
Control-Plane Flow
browser/client
  -> POST /api/v1/session-targets/:target/start
  -> POST /api/v1/session-instances/:session_id/realtime-offer
  -> POST /api/v1/session-instances/:session_id/tools/:tool
  -> POST /api/v1/execute/async/:target

The session itself is not a shortcut around AgentField. It is a control-plane entrypoint that keeps the realtime provider boundary separate from the reasoner workflow boundary.
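To make the placeholders in the flow concrete, here is a small sketch that fills them in for one call. It only builds path strings and sends nothing, because request and response payloads are not described here. `fill` and `expand_flow` are hypothetical names, and the example values (`voice` as the session target, `sess_123` as the session ID) are assumptions for illustration.

```python
import re
from urllib.parse import quote

# Path templates copied from the control-plane flow.
START = "/api/v1/session-targets/:target/start"
OFFER = "/api/v1/session-instances/:session_id/realtime-offer"
TOOL = "/api/v1/session-instances/:session_id/tools/:tool"
EXECUTE = "/api/v1/execute/async/:target"


def fill(template: str, **params: str) -> str:
    """Replace each :name placeholder with its URL-quoted value."""
    return re.sub(r":([a-z_]+)", lambda m: quote(params[m.group(1)], safe=""), template)


def expand_flow(session_target: str, session_id: str, tool: str) -> list[str]:
    """Return the four POST paths in the order they appear in the flow."""
    return [
        fill(START, target=session_target),
        fill(OFFER, session_id=session_id),
        fill(TOOL, session_id=session_id, tool=tool),
        # Each tool allowlist entry is an AgentField target, so it is reused here.
        fill(EXECUTE, target=tool),
    ]
```

Note that `:target` means two different things: the session target in the first step and the reasoner or skill target in the last.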
Provider and Transport Matrix
| Provider | Transport | Use it for |
|---|---|---|
| openai | webrtc | Browser realtime voice and audio sessions |
| openai | websocket | Server-side realtime sessions |
| openrouter | audio_turns | Turn-based audio input/output calls |
Unsupported combinations fail early with an error like:
Unsupported session transport 'webrtc' for provider 'openrouter'. Supported transports: audio_turns. AgentField does not infer or switch providers; set provider and transport explicitly.
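The matrix and the error above suggest a simple lookup-and-fail check. The following is a minimal sketch of that idea, not the control plane's actual code: `SUPPORTED_TRANSPORTS` and `validate_session_config` are hypothetical names, the wording for an unknown provider is an assumption, and listing several transports comma-separated is a guess based on the single-transport example.

```python
# Provider -> supported transports, taken from the matrix above.
SUPPORTED_TRANSPORTS = {
    "openai": ("webrtc", "websocket"),
    "openrouter": ("audio_turns",),
}


def validate_session_config(provider: str, transport: str) -> None:
    """Raise ValueError for unsupported combinations; never substitute a default."""
    supported = SUPPORTED_TRANSPORTS.get(provider)
    if supported is None:
        # Assumption: the wording for an unknown provider is not documented.
        raise ValueError(f"Unsupported session provider '{provider}'.")
    if transport not in supported:
        raise ValueError(
            f"Unsupported session transport '{transport}' for provider '{provider}'. "
            f"Supported transports: {', '.join(supported)}. "
            "AgentField does not infer or switch providers; "
            "set provider and transport explicitly."
        )
```

Failing at registration or start time, rather than silently picking a compatible provider, keeps the provider boundary explicit in the session declaration.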