
How to Build a Multi-Provider AI App (OpenAI + Anthropic + Gemini)

APIScout Team

Tags: ai-api, openai, anthropic, gemini, tutorial


Relying on a single AI provider creates a single point of failure: when OpenAI goes down, your app goes down with it. A multi-provider architecture gives you fallback options, cost optimization, and the ability to route each task to whichever model handles it best.

What You'll Build

  • Unified interface for OpenAI, Anthropic, and Google Gemini
  • Automatic fallback when a provider is down
  • Task-based routing (use the best model for each task)
  • Cost optimization (route to cheapest provider that meets quality needs)
  • Streaming support across all providers

Prerequisites: Node.js 18+ and API keys from at least two of the providers above.

1. Setup

Install SDKs

npm install openai @anthropic-ai/sdk @google/generative-ai

Environment Variables

OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_AI_API_KEY=AIza...
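
The fallback chain only degrades gracefully if at least two keys are actually present, so it's worth failing fast at startup. Here is a minimal sketch of such a check — `detectProviders` and `assertMultiProvider` are hypothetical helpers for this tutorial, not part of any SDK:

```typescript
// lib/check-env.ts (hypothetical helper, not part of any SDK)
type ProviderName = 'openai' | 'anthropic' | 'google';

const keyVars: Record<ProviderName, string> = {
  openai: 'OPENAI_API_KEY',
  anthropic: 'ANTHROPIC_API_KEY',
  google: 'GOOGLE_AI_API_KEY',
};

// Returns the providers that have an API key set in the given env.
export function detectProviders(env: Record<string, string | undefined>): ProviderName[] {
  return (Object.keys(keyVars) as ProviderName[]).filter(p => !!env[keyVars[p]]);
}

// Throws at startup if fewer than two providers are configured,
// since a single provider defeats the point of the fallback chain.
export function assertMultiProvider(env: Record<string, string | undefined>): void {
  const configured = detectProviders(env);
  if (configured.length < 2) {
    throw new Error(
      `Need API keys for at least 2 providers, found ${configured.length}: ${configured.join(', ')}`
    );
  }
}
```

Call `assertMultiProvider(process.env)` once when the app boots, before serving traffic.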

Initialize Clients

// lib/ai-providers.ts
import OpenAI from 'openai';
import Anthropic from '@anthropic-ai/sdk';
import { GoogleGenerativeAI } from '@google/generative-ai';

export const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
export const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
export const google = new GoogleGenerativeAI(process.env.GOOGLE_AI_API_KEY!);

2. Unified Interface

Define Common Types

// lib/ai-types.ts
export interface AIMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

export interface AIResponse {
  content: string;
  provider: 'openai' | 'anthropic' | 'google';
  model: string;
  usage: {
    inputTokens: number;
    outputTokens: number;
  };
  latencyMs: number;
}

export interface AIOptions {
  messages: AIMessage[];
  maxTokens?: number;
  temperature?: number;
  stream?: boolean;
}

Provider Adapters

// lib/adapters/openai-adapter.ts
import { openai } from '../ai-providers';
import { AIMessage, AIResponse, AIOptions } from '../ai-types';

export async function callOpenAI(options: AIOptions): Promise<AIResponse> {
  const start = Date.now();

  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: options.messages,
    max_tokens: options.maxTokens ?? 1024,
    temperature: options.temperature ?? 0.7,
  });

  return {
    content: response.choices[0].message.content ?? '',
    provider: 'openai',
    model: 'gpt-4o',
    usage: {
      inputTokens: response.usage?.prompt_tokens ?? 0,
      outputTokens: response.usage?.completion_tokens ?? 0,
    },
    latencyMs: Date.now() - start,
  };
}
// lib/adapters/anthropic-adapter.ts
import { anthropic } from '../ai-providers';
import { AIMessage, AIResponse, AIOptions } from '../ai-types';

export async function callAnthropic(options: AIOptions): Promise<AIResponse> {
  const start = Date.now();
  const systemMessage = options.messages.find(m => m.role === 'system');
  const chatMessages = options.messages.filter(m => m.role !== 'system');

  const response = await anthropic.messages.create({
    model: 'claude-sonnet-4-20250514',
    system: systemMessage?.content,
    messages: chatMessages.map(m => ({
      role: m.role as 'user' | 'assistant',
      content: m.content,
    })),
    max_tokens: options.maxTokens ?? 1024,
    temperature: options.temperature ?? 0.7,
  });

  // find() doesn't narrow the content-block union, so re-check the
  // discriminant before reading .text to keep TypeScript happy.
  const textBlock = response.content.find(b => b.type === 'text');

  return {
    content: textBlock?.type === 'text' ? textBlock.text : '',
    provider: 'anthropic',
    model: 'claude-sonnet-4-20250514',
    usage: {
      inputTokens: response.usage.input_tokens,
      outputTokens: response.usage.output_tokens,
    },
    latencyMs: Date.now() - start,
  };
}
// lib/adapters/google-adapter.ts
import { google } from '../ai-providers';
import { AIMessage, AIResponse, AIOptions } from '../ai-types';

export async function callGoogle(options: AIOptions): Promise<AIResponse> {
  const start = Date.now();
  const model = google.getGenerativeModel({ model: 'gemini-2.0-flash' });

  const systemMessage = options.messages.find(m => m.role === 'system');
  const chatMessages = options.messages.filter(m => m.role !== 'system');

  const chat = model.startChat({
    systemInstruction: systemMessage?.content,
    // Honor maxTokens/temperature like the other adapters do.
    generationConfig: {
      maxOutputTokens: options.maxTokens ?? 1024,
      temperature: options.temperature ?? 0.7,
    },
    history: chatMessages.slice(0, -1).map(m => ({
      role: m.role === 'assistant' ? 'model' : 'user',
      parts: [{ text: m.content }],
    })),
  });

  const lastMessage = chatMessages[chatMessages.length - 1];
  const result = await chat.sendMessage(lastMessage.content);

  return {
    content: result.response.text(),
    provider: 'google',
    model: 'gemini-2.0-flash',
    usage: {
      inputTokens: result.response.usageMetadata?.promptTokenCount ?? 0,
      outputTokens: result.response.usageMetadata?.candidatesTokenCount ?? 0,
    },
    latencyMs: Date.now() - start,
  };
}

3. Router

Fallback Chain

// lib/ai-router.ts
import { callOpenAI } from './adapters/openai-adapter';
import { callAnthropic } from './adapters/anthropic-adapter';
import { callGoogle } from './adapters/google-adapter';
import { AIOptions, AIResponse } from './ai-types';

type Provider = 'openai' | 'anthropic' | 'google';

const providerMap = {
  openai: callOpenAI,
  anthropic: callAnthropic,
  google: callGoogle,
};

export async function callAI(
  options: AIOptions,
  providers: Provider[] = ['anthropic', 'openai', 'google']
): Promise<AIResponse> {
  let lastError: Error | null = null;

  for (const provider of providers) {
    try {
      const result = await providerMap[provider](options);
      return result;
    } catch (error: any) {
      console.error(`${provider} failed:`, error.message);
      lastError = error;
      // Continue to next provider
    }
  }

  throw new Error(`All providers failed. Last error: ${lastError?.message}`);
}

Task-Based Routing

// lib/ai-router.ts
type TaskType = 'code' | 'analysis' | 'creative' | 'simple' | 'long-context';

const taskRouting: Record<TaskType, Provider[]> = {
  code: ['anthropic', 'openai', 'google'],      // Claude excels at code
  analysis: ['anthropic', 'openai', 'google'],   // Claude for careful analysis
  creative: ['openai', 'anthropic', 'google'],   // GPT-4o for creative tasks
  simple: ['google', 'openai', 'anthropic'],     // Gemini Flash for simple/cheap
  'long-context': ['google', 'anthropic', 'openai'], // Gemini for long context
};

export async function callAIForTask(
  options: AIOptions,
  task: TaskType
): Promise<AIResponse> {
  return callAI(options, taskRouting[task]);
}
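
Callers still need to choose a TaskType somehow. One option — purely an illustration, not a robust classifier — is a keyword heuristic over the prompt; a real system might use an explicit UI toggle or a cheap classifier model instead:

```typescript
type TaskType = 'code' | 'analysis' | 'creative' | 'simple' | 'long-context';

// Very rough heuristic for inferring a TaskType from a prompt.
// The keyword lists and the 20k-character threshold are arbitrary choices.
export function guessTask(prompt: string): TaskType {
  if (prompt.length > 20_000) return 'long-context';
  if (/\b(code|function|bug|refactor|typescript|python)\b/i.test(prompt)) return 'code';
  if (/\b(analyze|analysis|compare|evaluate|pros and cons)\b/i.test(prompt)) return 'analysis';
  if (/\b(story|poem|slogan|brainstorm|creative)\b/i.test(prompt)) return 'creative';
  return 'simple';
}
```

Then: `callAIForTask({ messages }, guessTask(lastUserMessage))`.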

Cost-Optimized Routing

// Approximate costs per 1M tokens (input/output)
const providerCosts = {
  openai: { input: 2.50, output: 10.00 },     // GPT-4o
  anthropic: { input: 3.00, output: 15.00 },  // Claude Sonnet
  google: { input: 0.075, output: 0.30 },     // Gemini Flash
};

export async function callAICheap(options: AIOptions): Promise<AIResponse> {
  // Try cheapest first, fall back to more expensive
  return callAI(options, ['google', 'openai', 'anthropic']);
}

export async function callAIBest(options: AIOptions): Promise<AIResponse> {
  // Try highest quality first
  return callAI(options, ['anthropic', 'openai', 'google']);
}

4. API Route

// app/api/ai/route.ts
import { NextResponse } from 'next/server';
import { callAIForTask } from '@/lib/ai-router';

export async function POST(req: Request) {
  const { messages, task = 'simple' } = await req.json();

  // Reject malformed requests before burning provider tokens.
  if (!Array.isArray(messages) || messages.length === 0) {
    return NextResponse.json(
      { error: 'messages must be a non-empty array' },
      { status: 400 }
    );
  }

  try {
    const response = await callAIForTask({ messages }, task);

    return NextResponse.json({
      content: response.content,
      metadata: {
        provider: response.provider,
        model: response.model,
        latencyMs: response.latencyMs,
        usage: response.usage,
      },
    });
  } catch (error: any) {
    return NextResponse.json(
      { error: 'All AI providers failed', details: error.message },
      { status: 503 }
    );
  }
}

5. Cost Comparison

| Model | Input (per 1M tokens) | Output (per 1M tokens) | Best For |
|---|---|---|---|
| GPT-4o | $2.50 | $10.00 | General purpose, creative |
| Claude Sonnet | $3.00 | $15.00 | Code, analysis, instruction-following |
| Gemini Flash | $0.075 | $0.30 | High-volume, cost-sensitive |
| GPT-4o mini | $0.15 | $0.60 | Budget alternative to GPT-4o |
| Claude Haiku | $0.25 | $1.25 | Budget alternative to Sonnet |

Example: 1M input + 200K output tokens/month:

  • Gemini Flash: $0.14
  • GPT-4o mini: $0.27
  • Claude Haiku: $0.50
  • GPT-4o: $4.50
  • Claude Sonnet: $6.00
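
These figures fall straight out of the per-million-token rates in the table. A small helper (hypothetical, hard-coding the rates above) makes the arithmetic explicit:

```typescript
// Rates per 1M tokens, taken from the comparison table above.
const rates = {
  'gemini-flash': { input: 0.075, output: 0.30 },
  'gpt-4o-mini': { input: 0.15, output: 0.60 },
  'claude-haiku': { input: 0.25, output: 1.25 },
  'gpt-4o': { input: 2.50, output: 10.00 },
  'claude-sonnet': { input: 3.00, output: 15.00 },
} as const;

// Monthly cost in USD for a given token volume.
export function monthlyCost(
  model: keyof typeof rates,
  inputTokens: number,
  outputTokens: number
): number {
  const r = rates[model];
  return (inputTokens / 1_000_000) * r.input + (outputTokens / 1_000_000) * r.output;
}
```

For example, `monthlyCost('gemini-flash', 1_000_000, 200_000)` gives $0.135, the ~$0.14 quoted above.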

6. Monitoring

Track provider performance:

// lib/ai-monitor.ts
interface ProviderMetrics {
  totalCalls: number;
  failures: number;
  avgLatencyMs: number;
  totalCost: number;
}

const metrics: Record<string, ProviderMetrics> = {};

export function recordCall(provider: string, latencyMs: number, costUsd: number, failed: boolean) {
  if (!metrics[provider]) {
    metrics[provider] = { totalCalls: 0, failures: 0, avgLatencyMs: 0, totalCost: 0 };
  }
  const m = metrics[provider];
  m.totalCalls++;
  if (failed) m.failures++;
  m.totalCost += costUsd;
  // Running average: fold the new latency into the previous mean.
  m.avgLatencyMs = (m.avgLatencyMs * (m.totalCalls - 1) + latencyMs) / m.totalCalls;
}
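
These metrics can feed back into routing: instead of a fixed fallback order, rank providers by observed health. A sketch (the `rankProviders` helper and its tie-breaking rules are our own choices, not a standard algorithm):

```typescript
interface ProviderMetrics {
  totalCalls: number;
  failures: number;
  avgLatencyMs: number;
  totalCost: number;
}

// Orders providers healthiest-first: lowest failure rate, then lowest
// average latency. The result can be passed to callAI() as the fallback order.
export function rankProviders(metrics: Record<string, ProviderMetrics>): string[] {
  return Object.entries(metrics)
    .sort(([, a], [, b]) => {
      const failA = a.totalCalls ? a.failures / a.totalCalls : 0;
      const failB = b.totalCalls ? b.failures / b.totalCalls : 0;
      if (failA !== failB) return failA - failB;
      return a.avgLatencyMs - b.avgLatencyMs;
    })
    .map(([name]) => name);
}
```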

Common Mistakes

| Mistake | Impact | Fix |
|---|---|---|
| No fallback chain | App breaks when one provider is down | Always have 2+ providers configured |
| Same model for every task | Overpaying for simple tasks | Route by task complexity |
| Not tracking costs per provider | Budget surprises | Log tokens + cost per request |
| Not handling streaming differences | Inconsistent UX | Unified streaming adapter |
| Ignoring rate limits | 429 errors cascade | Per-provider rate limiting |

Choosing an AI API? Compare OpenAI vs Anthropic vs Google Gemini on APIScout — pricing, quality, and performance benchmarks.
