
Building an AI Agent in 2026: Architecture Patterns and Tools

APIScout Team
Tags: ai-agents · tool-use · vercel-ai-sdk · mastra · multi-agent · agentic-ai · 2026

TL;DR

An AI agent is just an LLM in a loop that can call tools. The complexity comes from: deciding when to stop, handling tool errors gracefully, managing memory across steps, and preventing infinite loops. In 2026, you have three options: build your own loop (most control), use the Vercel AI SDK's maxSteps feature (easiest for simple agents), or use Mastra/OpenAI Agents SDK for multi-agent orchestration. The choice depends on complexity — single-task agents are easy, multi-step research agents are hard, multi-agent systems with coordination are very hard.

Key Takeaways

  • Agent = LLM + tools + loop — the fundamentals are simple; production is hard
  • Vercel AI SDK: maxSteps enables multi-turn tool use loops in 10 lines of code
  • OpenAI Agents SDK: handoffs between agents, guardrails, tracing — best for OpenAI-centric stacks
  • Mastra: TypeScript-first, provider-agnostic, built-in memory and workflow support
  • Memory patterns: in-context (short-term), vector store (semantic recall), structured DB (facts)
  • The real challenge: error recovery, loop detection, cost caps, graceful degradation

The Core Agent Loop

Every agent starts here:

// The minimal agent loop:
import { generateText, tool } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';
import fs from 'node:fs/promises';

const result = await generateText({
  model: openai('gpt-4o'),
  maxSteps: 10,  // ← This is what makes it an agent (Vercel AI SDK)
  tools: {
    searchWeb: tool({
      description: 'Search the web for information',
      parameters: z.object({ query: z.string() }),
      execute: async ({ query }) => {
        // webSearch is your own helper (e.g. a wrapper around a search API)
        return await webSearch(query);
      },
    }),
    writeFile: tool({
      description: 'Write content to a file',
      parameters: z.object({
        filename: z.string(),
        content: z.string(),
      }),
      execute: async ({ filename, content }) => {
        await fs.writeFile(filename, content);
        return { success: true, filename };
      },
    }),
  },
  prompt: 'Research the top 5 AI companies in 2026 and save a report to ai-report.md',
});

console.log('Steps taken:', result.steps.length);
console.log('Final output:', result.text);

That's it. maxSteps: 10 tells the SDK to keep calling the LLM until it finishes (no more tool calls) or hits the step limit. Everything else is complexity management.
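If you want the "build your own loop" option instead, the same behavior fits in a short while loop. The sketch below is deliberately SDK-free: `callModel` stands in for one LLM request and `tools` maps names to implementations, so every name here is illustrative rather than a real API.

```typescript
type ToolCall = { name: string; args: Record<string, unknown> };
type ModelTurn = { text: string; toolCalls: ToolCall[] };

// A hand-rolled agent loop: call the model, run any requested tools,
// feed results back, and stop when the model stops asking for tools.
async function agentLoop(
  callModel: (transcript: string[]) => Promise<ModelTurn>,
  tools: Record<string, (args: Record<string, unknown>) => Promise<unknown>>,
  prompt: string,
  maxSteps = 10,
): Promise<{ text: string; steps: number }> {
  const transcript = [`user: ${prompt}`];
  for (let step = 1; step <= maxSteps; step++) {
    const turn = await callModel(transcript);
    transcript.push(`assistant: ${turn.text}`);
    // No tool calls means the model considers itself done:
    if (turn.toolCalls.length === 0) return { text: turn.text, steps: step };
    // Execute each requested tool and append the result for the next turn:
    for (const call of turn.toolCalls) {
      const result = await tools[call.name](call.args);
      transcript.push(`tool(${call.name}): ${JSON.stringify(result)}`);
    }
  }
  return { text: 'Step limit reached', steps: maxSteps };
}
```

Everything the SDKs add (streaming, typed tool schemas, retries) layers on top of this shape.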


Memory: The Core Architecture Challenge

Agents need memory across steps, conversations, and sessions:

// Three memory types:

// 1. In-context memory (short-term, current conversation):
const conversationHistory: CoreMessage[] = [];

// 2. Vector store memory (semantic recall across sessions):
//    embed() and vectorStore are stand-ins for your embedding model and vector DB.
async function rememberFact(content: string, userId: string) {
  const embedding = await embed(content);
  await vectorStore.upsert([{
    id: crypto.randomUUID(),
    values: embedding,
    metadata: { content, userId, timestamp: Date.now() },
  }]);
}

async function recall(query: string, userId: string, limit = 5) {
  const embedding = await embed(query);
  const results = await vectorStore.query({
    vector: embedding,
    topK: limit,
    filter: { userId },
  });
  return results.matches.map((m) => m.metadata?.content as string);
}

// 3. Structured memory (facts, preferences, state):
interface AgentMemory {
  userId: string;
  preferences: Record<string, string>;
  completedTasks: string[];
  workingMemory: Record<string, unknown>;
}

// Full agent with memory:
async function runAgentWithMemory(userMessage: string, userId: string) {
  // Load relevant memories:
  const memories = await recall(userMessage, userId);
  const userPrefs = await db.agentMemory.findUnique({ where: { userId } });

  const systemPrompt = `You are a helpful assistant.
${memories.length > 0 ? `\nRelevant context from previous conversations:\n${memories.join('\n')}` : ''}
${userPrefs ? `\nUser preferences: ${JSON.stringify(userPrefs.preferences)}` : ''}`;

  const result = await generateText({
    model: openai('gpt-4o'),
    maxSteps: 5,
    system: systemPrompt,
    messages: [...conversationHistory, { role: 'user', content: userMessage }],
    tools: { /* ... */ },
    onStepFinish: async ({ text, toolResults }) => {
      // Optionally save important facts to memory during execution:
      if (text.includes('REMEMBER:')) {
        const factMatch = text.match(/REMEMBER: (.+)/);
        if (factMatch) await rememberFact(factMatch[1], userId);
      }
    },
  });

  // Save to conversation history:
  conversationHistory.push(
    { role: 'user', content: userMessage },
    { role: 'assistant', content: result.text }
  );

  return result.text;
}
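One gap in the code above: `conversationHistory` grows without bound, and long histories blow the context window. A common guard is a sliding window that keeps only the most recent turns. This is a minimal sketch with a generic message type; the alignment rule (never start the window on a dangling assistant reply) is the part worth copying.

```typescript
type Message = { role: 'user' | 'assistant'; content: string };

// Keep at most maxMessages recent messages, aligned so the window
// never opens on an assistant reply whose user turn was dropped.
function trimHistory(history: Message[], maxMessages: number): Message[] {
  if (history.length <= maxMessages) return history;
  let start = history.length - maxMessages;
  // Advance past any leading assistant message so we start on a user turn:
  while (start < history.length && history[start].role !== 'user') start++;
  return history.slice(start);
}
```

For longer-lived agents you would summarize the dropped prefix into the vector store rather than discarding it.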

Error Recovery

Production agents must handle tool failures gracefully:

// Resilient tool execution with retry + fallback:
function createResilientTool<TParams, TResult>(config: {
  name: string;
  description: string;
  parameters: z.ZodType<TParams>;
  execute: (params: TParams) => Promise<TResult>;
  fallback?: (params: TParams, error: Error) => TResult;
  maxRetries?: number;
}) {
  return tool({
    description: config.description,
    parameters: config.parameters,
    execute: async (params) => {
      const maxRetries = config.maxRetries ?? 2;
      let lastError: Error | undefined;

      for (let attempt = 0; attempt <= maxRetries; attempt++) {
        try {
          return await config.execute(params);
        } catch (err) {
          lastError = err instanceof Error ? err : new Error(String(err));
          if (attempt < maxRetries) {
            // Exponential backoff:
            await new Promise((r) => setTimeout(r, 1000 * Math.pow(2, attempt)));
          }
        }
      }

      // All retries exhausted — use fallback or return structured error:
      if (config.fallback) {
        return config.fallback(params as TParams, lastError!);
      }

      // Return error as data so LLM can handle it:
      return {
        error: true,
        message: lastError!.message,
        tool: config.name,
      } as TResult;
    },
  });
}

// Usage:
const searchWeb = createResilientTool({
  name: 'searchWeb',
  description: 'Search for information',
  parameters: z.object({ query: z.string() }),
  execute: async ({ query }) => externalSearchAPI(query),
  fallback: ({ query }) => ({ results: [], error: `Search unavailable for "${query}"` }),
  maxRetries: 2,
});

Multi-Agent Patterns

Coordinator + Specialists

// Pattern: one coordinator, multiple specialist agents

const specialists = {
  researcher: async (task: string) => {
    const result = await generateText({
      model: openai('gpt-4o'),
      system: 'You are a research specialist. Find information and cite sources.',
      prompt: task,
      maxSteps: 8,
      tools: { searchWeb, scrapeUrl },
    });
    return result.text;
  },

  writer: async (brief: string) => {
    const result = await generateText({
      model: openai('gpt-4o'),
      system: 'You are a technical writer. Write clear, structured content.',
      prompt: brief,
      maxSteps: 3,
      tools: { formatMarkdown },
    });
    return result.text;
  },

  reviewer: async (content: string) => {
    const { object } = await generateObject({
      model: openai('gpt-4o'),
      system: 'You are a quality reviewer. Check content for accuracy and clarity.',
      prompt: `Review this content:\n\n${content}`,
      schema: z.object({
        approved: z.boolean(),
        issues: z.array(z.string()),
        suggestions: z.array(z.string()),
      }),
    });
    return object;
  },
};

// Coordinator orchestrates the flow:
async function coordinateResearch(topic: string) {
  console.log('Step 1: Research');
  const research = await specialists.researcher(`Research: ${topic}`);

  console.log('Step 2: Write');
  const draft = await specialists.writer(`Write an article based on:\n${research}`);

  console.log('Step 3: Review');
  const review = await specialists.reviewer(draft);

  if (!review.approved) {
    console.log('Step 4: Revise');
    const revised = await specialists.writer(
      `Revise this article:\n${draft}\n\nIssues to fix:\n${review.issues.join('\n')}`
    );
    return revised;
  }

  return draft;
}

OpenAI Agents SDK Handoffs

// OpenAI Agents SDK (official, TypeScript) — a sketch of the handoff pattern;
// check the @openai/agents docs for the exact current API surface:
import { Agent, run } from '@openai/agents';

const supportAgent = new Agent({
  name: 'Support Agent',
  handoffDescription: 'For general support questions',
  instructions: 'You handle customer support questions. Escalate billing issues.',
  tools: [lookupOrder, updateTicket],
});

const billingAgent = new Agent({
  name: 'Billing Agent',
  handoffDescription: 'For billing, payment, or refund questions',
  instructions: 'You handle billing disputes and refunds. You have authority to issue credits.',
  tools: [lookupInvoice, issueCreditNote, processRefund],
});

const triageAgent = new Agent({
  name: 'Triage Agent',
  instructions: 'Route customer requests to the right specialist.',
  handoffs: [supportAgent, billingAgent],
  outputGuardrails: [
    {
      name: 'No PII in output',
      execute: async ({ agentOutput }) => ({
        outputInfo: {},
        tripwireTriggered: containsPII(String(agentOutput)),
      }),
    },
  ],
});

// Run:
const result = await run(triageAgent, 'I was charged twice last month');
console.log(result.finalOutput);
console.log('Handled by:', result.lastAgent?.name);

Cost and Safety Controls

// Agent with cost cap and timeout:
class SafeAgent {
  private totalTokensUsed = 0;
  private maxTokens: number;
  private timeoutMs: number;

  constructor({ maxTokens = 50000, timeoutMs = 60000 } = {}) {
    this.maxTokens = maxTokens;
    this.timeoutMs = timeoutMs;
  }

  async run(task: string): Promise<string> {
    const result = await generateText({
      model: openai('gpt-4o'),
      maxSteps: 20,
      prompt: task,
      tools: { /* ... */ },
      // Hard timeout: abort the underlying request instead of racing a dangling promise
      abortSignal: AbortSignal.timeout(this.timeoutMs),
      onStepFinish: ({ usage }) => {
        this.totalTokensUsed += usage?.totalTokens ?? 0;

        // Throwing here rejects the generateText call and stops the loop:
        if (this.totalTokensUsed > this.maxTokens) {
          throw new Error(`Token budget exceeded: ${this.totalTokensUsed}/${this.maxTokens}`);
        }
      },
    });

    console.log(`Task complete. Tokens used: ${this.totalTokensUsed}`);
    return result.text;
  }
}
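Token caps are easier to reason about as dollars. A small tracker can convert per-step usage into estimated spend; the per-million-token rates below are placeholders, not real pricing, so substitute your provider's current numbers.

```typescript
// Tracks estimated spend across agent steps against a dollar budget.
// Rates are illustrative placeholders, expressed per 1M tokens.
class CostTracker {
  private spentUsd = 0;

  constructor(
    private budgetUsd: number,
    private ratePerMInput = 2.5,  // $ per 1M prompt tokens (placeholder)
    private ratePerMOutput = 10,  // $ per 1M completion tokens (placeholder)
  ) {}

  // Record one step's usage; returns true while still under budget.
  addUsage(promptTokens: number, completionTokens: number): boolean {
    this.spentUsd +=
      (promptTokens / 1_000_000) * this.ratePerMInput +
      (completionTokens / 1_000_000) * this.ratePerMOutput;
    return this.spentUsd <= this.budgetUsd;
  }

  get spent(): number {
    return this.spentUsd;
  }
}
```

Wire `addUsage` into `onStepFinish` and treat a false return the same way SafeAgent treats a blown token budget.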

Observability: What's Your Agent Doing?

// Log every step for debugging (the SDK exposes onStepFinish; there is no onStepStart hook):
const result = await generateText({
  model: openai('gpt-4o'),
  maxSteps: 10,
  prompt: task,
  tools: { /* ... */ },
  onStepFinish: ({ stepType, text, toolResults, usage }) => {
    console.log(`[Agent] Step: ${stepType}`);
    toolResults?.forEach((tr) => {
      console.log(`[Tool] ${tr.toolName}(${JSON.stringify(tr.args)}) → ${JSON.stringify(tr.result)}`);
    });
    if (text) console.log(`[LLM] ${text.slice(0, 200)}...`);
    if (usage) console.log(`[Usage] ${usage.promptTokens}+${usage.completionTokens} tokens`);
  },
});
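Console logs are fine in development, but for anything longer-lived you want steps captured as data you can query or export. A minimal in-memory tracer (my own sketch, not part of any SDK) whose `record` method you would call from `onStepFinish`:

```typescript
type StepRecord = { step: number; toolNames: string[]; tokens: number };

// Collects per-step data so a run can be inspected or exported afterwards.
class AgentTracer {
  readonly steps: StepRecord[] = [];

  record(toolNames: string[], tokens: number): void {
    this.steps.push({ step: this.steps.length + 1, toolNames, tokens });
  }

  summary(): { totalSteps: number; totalTokens: number; toolCalls: number } {
    return {
      totalSteps: this.steps.length,
      totalTokens: this.steps.reduce((sum, r) => sum + r.tokens, 0),
      toolCalls: this.steps.reduce((sum, r) => sum + r.toolNames.length, 0),
    };
  }
}
```

From here it is a short step to shipping the trace array to whatever observability backend you already use.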

Discover AI APIs and agent frameworks at APIScout.
