Versioning & Lifecycle

Agent version tracking, lifecycle states, heartbeat monitoring, and lease-based presence

Canary deployment — one version string change

A full lifecycle state machine, lease-based presence detection, and fleet-wide health monitoring -- built into every agent.

Every agent moves through a defined lifecycle (starting -> ready -> degraded -> offline) and proves it is alive via lease-based heartbeats. The control plane automatically detects unresponsive agents, transitions them through degraded states, and evicts them after configurable TTLs. Run two versions side by side, monitor your entire fleet in one API call, and drain gracefully on shutdown -- no external health checker needed.

from agentfield import Agent

# Blue-green: run two versions side by side, each with its own lifecycle
app_v2 = Agent(
    node_id="processor-blue",
    version="2.0.0",
    tags=["processor", "stable"],
)

app_v3 = Agent(
    node_id="processor-green",
    version="3.0.0",
    tags=["processor", "canary"],
)

# Lifecycle is automatic:
#   startup  → starting → ready  (heartbeats begin)
#   trouble  → ready → degraded            (missed health checks)
#   recovery → degraded → ready            (health checks pass again)
#   shutdown → drains active executions → releases lease → offline

# The SDK handles graceful shutdown automatically (SIGTERM/SIGINT)

import { Agent } from '@agentfield/sdk';

// Version + tags reported at startup and every heartbeat automatically
const agent = new Agent({
  nodeId: 'processor-green',
  version: '3.0.0',
  tags: ['processor', 'canary'],
});

// Lifecycle: starting → ready → degraded → offline
// Heartbeat TTL: 5min | Sweep: 30s | Hard evict: 30min
// All configurable. No external health checker needed.

# Bulk fleet health — version, state, and last heartbeat for every node
curl -X POST http://localhost:8080/api/v1/nodes/status/bulk \
  -H "Content-Type: application/json" \
  -d '{"node_ids": ["processor-blue", "processor-green", "analyzer", "router"]}'
# → {
#   "processor-blue":  {"status":"ready",    "version":"2.0.0", "last_heartbeat":"...05Z"},
#   "processor-green": {"status":"ready",    "version":"3.0.0", "last_heartbeat":"...05Z"},
#   "analyzer":        {"status":"degraded", "version":"1.2.0", "last_heartbeat":"...01Z"},
#   "router":          {"status":"offline",  "version":"1.0.0", "last_heartbeat":"...00Z"}
# }

# Single node deep-dive
curl http://localhost:8080/api/v1/nodes/processor-green/status

# Force immediate reconciliation (bypass 5min cache)
curl -X POST http://localhost:8080/api/v1/nodes/processor-green/status/refresh

# Graceful shutdown — drain active executions, release lease, deregister
curl -X POST http://localhost:8080/api/v1/nodes/processor-green/shutdown

What just happened

Version and lifecycle state were reported to the control plane automatically
Presence was tracked through leases and heartbeats instead of custom health plumbing
Fleet status could be queried centrally without logging into each node

Example fleet snapshot:

{
  "processor-blue": { "status": "ready", "version": "2.0.0" },
  "processor-green": { "status": "ready", "version": "3.0.0" },
  "analyzer": { "status": "degraded", "version": "1.2.0" }
}

Versioning & Lifecycle

What you get

Setting the version

Lifecycle states

Heartbeat protocol

Health monitor and configuration

Patterns

API reference

On this page