ctx.ai()

LLM interface with structured output, streaming, and automatic rate limiting

The ctx.ai() method provides a unified interface to LLMs with support for structured output via Zod schemas, streaming responses, and automatic rate limit handling with exponential backoff.

Basic Usage

agent.reasoner('analyze', async (ctx) => {
  // Simple text generation
  const response = await ctx.ai('Explain quantum computing in simple terms');
  return { explanation: response };
});

With System Prompt

agent.reasoner('translate', async (ctx) => {
  const { text, targetLanguage } = ctx.input;

  const translation = await ctx.ai(
    `Translate to ${targetLanguage}: ${text}`,
    { system: 'You are a professional translator.' }
  );

  return { translation };
});

Structured Output with Zod

Get type-safe structured responses using Zod schemas:

import { z } from 'zod';

const SentimentSchema = z.object({
  sentiment: z.enum(['positive', 'negative', 'neutral']),
  confidence: z.number().min(0).max(1),
  keywords: z.array(z.string()),
  reasoning: z.string()
});

agent.reasoner('analyze_sentiment', async (ctx) => {
  const result = await ctx.ai(
    `Analyze the sentiment: ${ctx.input.text}`,
    { schema: SentimentSchema }
  );

  // result is fully typed as:
  // { sentiment: 'positive'|'negative'|'neutral', confidence: number, keywords: string[], reasoning: string }
  return result;
});

When using a schema, the response is automatically parsed and validated. Invalid responses trigger automatic retries.
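
If the model still fails validation after the retry budget is exhausted, the call rejects. A defensive sketch (the thrown error's exact shape is not documented here, so the catch is deliberately broad):

agent.reasoner('analyze_sentiment_safe', async (ctx) => {
  try {
    return await ctx.ai(
      `Analyze the sentiment: ${ctx.input.text}`,
      { schema: SentimentSchema }
    );
  } catch (err) {
    // All retries exhausted and the response still failed schema validation;
    // fall back to a neutral default instead of failing the whole flow.
    return {
      sentiment: 'neutral' as const,
      confidence: 0,
      keywords: [],
      reasoning: `AI call failed: ${err instanceof Error ? err.message : String(err)}`
    };
  }
});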

Options

Prop          Type        Description
system        string      System prompt sent alongside the user prompt
schema        ZodSchema   Zod schema used to parse and validate the response
model         string      Override the agent's configured model (e.g. 'gpt-4o-mini')
temperature   number      Sampling temperature (0 = deterministic, higher = more varied)
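
These options compose; each appears individually elsewhere on this page, and a single call can combine them. A sketch (the 'triage' reasoner and its schema are hypothetical):

import { z } from 'zod';

const TriageSchema = z.object({
  label: z.enum(['bug', 'feature', 'question']),
  confidence: z.number().min(0).max(1)
});

agent.reasoner('triage', async (ctx) => {
  // system + schema + model + temperature in one call
  const result = await ctx.ai(
    `Classify this support ticket: ${ctx.input.text}`,
    {
      system: 'You are a support-ticket triage assistant.',
      schema: TriageSchema,
      model: 'gpt-4o-mini',
      temperature: 0
    }
  );
  return result;
});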

Streaming

Use ctx.aiStream() for real-time streaming responses:

agent.reasoner('generate_story', async (ctx) => {
  const { prompt } = ctx.input;

  const stream = await ctx.aiStream(
    `Write a short story about: ${prompt}`,
    { system: 'You are a creative storyteller.' }
  );

  let fullResponse = '';
  for await (const chunk of stream) {
    fullResponse += chunk;
    // Each chunk is a string fragment
  }

  return { story: fullResponse };
});

Streaming to HTTP Response

agent.reasoner('stream_response', async (ctx) => {
  const stream = await ctx.aiStream(ctx.input.prompt);

  ctx.res.setHeader('Content-Type', 'text/event-stream');
  ctx.res.setHeader('Cache-Control', 'no-cache');

  for await (const chunk of stream) {
    ctx.res.write(`data: ${JSON.stringify({ chunk })}\n\n`);
  }

  ctx.res.end();
  return null; // Response already sent
});
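
On the consuming side, these events can be read with any standard SSE client. A browser-side sketch (the endpoint path is hypothetical and depends on how the agent is mounted):

// Hypothetical URL; adjust to wherever 'stream_response' is exposed.
const source = new EventSource('/agents/my-agent/stream_response');
let text = '';
source.onmessage = (event) => {
  const { chunk } = JSON.parse(event.data);
  text += chunk; // accumulate fragments as they arrive
};
source.onerror = () => source.close();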

Rate Limiting

The SDK includes automatic rate limit handling with exponential backoff:

const agent = new Agent({
  nodeId: 'my-agent',
  aiConfig: {
    model: 'gpt-4o',
    enableRateLimitRetry: true,      // Enable automatic retries
    rateLimitMaxRetries: 20,          // Maximum retry attempts
    rateLimitBaseDelay: 1.0,          // Initial delay (seconds)
    rateLimitMaxDelay: 300.0,         // Maximum delay (seconds)
    rateLimitJitterFactor: 0.25,      // ±25% jitter
    rateLimitCircuitBreakerThreshold: 10,  // Open circuit after 10 failures
    rateLimitCircuitBreakerTimeout: 300    // Reset after 5 minutes
  }
});

The rate limiter automatically detects 429 responses, honors Retry-After headers when present, and applies exponential backoff with jitter so that many clients retrying at once do not create a thundering herd.
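
As an illustration of the schedule those settings produce (a sketch of textbook exponential backoff with jitter, not the SDK's actual internals):

// Illustrative only: the SDK's real implementation may differ.
function retryDelaySeconds(attempt: number): number {
  const baseDelay = 1.0;      // rateLimitBaseDelay
  const maxDelay = 300.0;     // rateLimitMaxDelay
  const jitterFactor = 0.25;  // rateLimitJitterFactor (±25%)

  const exponential = Math.min(maxDelay, baseDelay * 2 ** attempt);
  const jitter = exponential * jitterFactor * (Math.random() * 2 - 1);
  return exponential + jitter;
}

// attempt 0 → ~1s, attempt 1 → ~2s, attempt 2 → ~4s, ..., capped near 300s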

Direct AIClient Access

Access the AIClient directly for advanced usage:

agent.reasoner('advanced', async (ctx) => {
  const aiClient = ctx.aiClient;

  // Generate with full control
  const response = await aiClient.generate('Hello', {
    model: 'gpt-4o-mini',
    temperature: 0.5
  });

  return response;
});

AIClient Methods

Method                       Description
generate(prompt, options?)   Generate text with per-call control over model, temperature, etc.
embed(text)                  Create an embedding vector for a single string
embedMany(texts)             Create embedding vectors for an array of strings

Embeddings

Generate embeddings for semantic search:

agent.reasoner('semantic_search', async (ctx) => {
  const { query, documents } = ctx.input;

  // Embed query
  const queryEmbedding = await ctx.aiClient.embed(query);

  // Embed documents
  const docEmbeddings = await ctx.aiClient.embedMany(documents);

  // Find most similar (cosine similarity; helper defined below)
  const similarities = docEmbeddings.map((emb, i) => ({
    index: i,
    score: cosineSimilarity(queryEmbedding, emb)
  }));

  similarities.sort((a, b) => b.score - a.score);

  return {
    query,
    topMatches: similarities.slice(0, 5).map(s => ({
      document: documents[s.index],
      score: s.score
    }))
  };
});
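
The cosineSimilarity helper used above is plain vector math rather than part of the SDK; a minimal implementation:

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}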

Examples

Multi-step Analysis

import { z } from 'zod';

const ExtractSchema = z.object({
  entities: z.array(z.object({
    name: z.string(),
    type: z.string()
  })),
  topics: z.array(z.string())
});

const SummarySchema = z.object({
  summary: z.string(),
  keyPoints: z.array(z.string())
});

agent.reasoner('deep_analyze', async (ctx) => {
  const { document } = ctx.input;

  // Step 1: Extract entities
  const extracted = await ctx.ai(
    `Extract entities and topics from:\n${document}`,
    { schema: ExtractSchema }
  );

  // Step 2: Generate summary
  const summary = await ctx.ai(
    `Summarize focusing on: ${extracted.topics.join(', ')}\n\nDocument:\n${document}`,
    { schema: SummarySchema }
  );

  return {
    ...extracted,
    ...summary
  };
});

Different Models for Different Tasks

agent.reasoner('smart_routing', async (ctx) => {
  const { task, content } = ctx.input;

  // Simple tasks use cheaper model
  if (task === 'classify') {
    return await ctx.ai(
      `Classify: ${content}`,
      { model: 'gpt-4o-mini', temperature: 0 }
    );
  }

  // Complex tasks use powerful model
  return await ctx.ai(
    `Analyze deeply: ${content}`,
    { model: 'gpt-4o', temperature: 0.7 }
  );
});