How to Build a Multi-Provider AI App (OpenAI + Anthropic + Gemini)
By the APIScout Team
Relying on a single AI provider creates a single point of failure: when OpenAI goes down, your app goes down. A multi-provider architecture gives you fallback options, cost optimization, and the ability to route each task to whichever model handles it best.
What You'll Build
- Unified interface for OpenAI, Anthropic, and Google Gemini
- Automatic fallback when a provider is down
- Task-based routing (use the best model for each task)
- Cost optimization (route to cheapest provider that meets quality needs)
- Streaming support across all providers
Prerequisites: Node.js 18+, API keys from at least 2 providers.
1. Setup
Install SDKs
```bash
npm install openai @anthropic-ai/sdk @google/generative-ai
```
Environment Variables
```bash
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_AI_API_KEY=AIza...
```
Initialize Clients
```typescript
// lib/ai-providers.ts
import OpenAI from 'openai';
import Anthropic from '@anthropic-ai/sdk';
import { GoogleGenerativeAI } from '@google/generative-ai';

export const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
export const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
export const google = new GoogleGenerativeAI(process.env.GOOGLE_AI_API_KEY!);
```
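Since the prerequisites only require keys for at least two providers, it's worth checking at startup which providers are actually configured. A minimal sketch — `configuredProviders` and `keyFor` are our own helper names, not part of any SDK:

```typescript
// Hypothetical helper (not part of any SDK): report which providers
// have an API key set, so the router can skip unconfigured ones.
type ProviderName = 'openai' | 'anthropic' | 'google';

const keyFor: Record<ProviderName, string> = {
  openai: 'OPENAI_API_KEY',
  anthropic: 'ANTHROPIC_API_KEY',
  google: 'GOOGLE_AI_API_KEY',
};

export function configuredProviders(
  env: Record<string, string | undefined> = process.env
): ProviderName[] {
  // Keep only providers whose key variable is present and non-empty
  return (Object.keys(keyFor) as ProviderName[]).filter(p => !!env[keyFor[p]]);
}
```

Calling this once at boot (and logging the result) makes a misconfigured deployment obvious before the first request fails.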
2. Unified Interface
Define Common Types
```typescript
// lib/ai-types.ts
export interface AIMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

export interface AIResponse {
  content: string;
  provider: 'openai' | 'anthropic' | 'google';
  model: string;
  usage: {
    inputTokens: number;
    outputTokens: number;
  };
  latencyMs: number;
}

export interface AIOptions {
  messages: AIMessage[];
  maxTokens?: number;
  temperature?: number;
  stream?: boolean;
}
```
Provider Adapters
```typescript
// lib/adapters/openai-adapter.ts
import { openai } from '../ai-providers';
import { AIResponse, AIOptions } from '../ai-types';

export async function callOpenAI(options: AIOptions): Promise<AIResponse> {
  const start = Date.now();
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: options.messages,
    max_tokens: options.maxTokens ?? 1024,
    temperature: options.temperature ?? 0.7,
  });
  return {
    content: response.choices[0].message.content ?? '',
    provider: 'openai',
    model: 'gpt-4o',
    usage: {
      inputTokens: response.usage?.prompt_tokens ?? 0,
      outputTokens: response.usage?.completion_tokens ?? 0,
    },
    latencyMs: Date.now() - start,
  };
}
```
```typescript
// lib/adapters/anthropic-adapter.ts
import { anthropic } from '../ai-providers';
import { AIResponse, AIOptions } from '../ai-types';

export async function callAnthropic(options: AIOptions): Promise<AIResponse> {
  const start = Date.now();
  // Anthropic takes the system prompt as a top-level parameter, not as a message
  const systemMessage = options.messages.find(m => m.role === 'system');
  const chatMessages = options.messages.filter(m => m.role !== 'system');
  const response = await anthropic.messages.create({
    model: 'claude-sonnet-4-20250514',
    system: systemMessage?.content,
    messages: chatMessages.map(m => ({
      role: m.role as 'user' | 'assistant',
      content: m.content,
    })),
    max_tokens: options.maxTokens ?? 1024,
    temperature: options.temperature ?? 0.7,
  });
  // Narrow to the text block with a type guard; content can also hold tool-use blocks
  const textBlock = response.content.find(
    (b): b is Extract<typeof response.content[number], { type: 'text' }> => b.type === 'text'
  );
  return {
    content: textBlock?.text ?? '',
    provider: 'anthropic',
    model: 'claude-sonnet-4-20250514',
    usage: {
      inputTokens: response.usage.input_tokens,
      outputTokens: response.usage.output_tokens,
    },
    latencyMs: Date.now() - start,
  };
}
```
```typescript
// lib/adapters/google-adapter.ts
import { google } from '../ai-providers';
import { AIResponse, AIOptions } from '../ai-types';

export async function callGoogle(options: AIOptions): Promise<AIResponse> {
  const start = Date.now();
  const systemMessage = options.messages.find(m => m.role === 'system');
  const chatMessages = options.messages.filter(m => m.role !== 'system');
  const lastMessage = chatMessages[chatMessages.length - 1];
  if (!lastMessage) throw new Error('No user message to send');
  // The system instruction is set on the model, not on the chat session
  const model = google.getGenerativeModel({
    model: 'gemini-2.0-flash',
    systemInstruction: systemMessage?.content,
  });
  const chat = model.startChat({
    // Gemini calls the assistant role 'model'
    history: chatMessages.slice(0, -1).map(m => ({
      role: m.role === 'assistant' ? 'model' : 'user',
      parts: [{ text: m.content }],
    })),
  });
  const result = await chat.sendMessage(lastMessage.content);
  return {
    content: result.response.text(),
    provider: 'google',
    model: 'gemini-2.0-flash',
    usage: {
      inputTokens: result.response.usageMetadata?.promptTokenCount ?? 0,
      outputTokens: result.response.usageMetadata?.candidatesTokenCount ?? 0,
    },
    latencyMs: Date.now() - start,
  };
}
```
3. Router
Fallback Chain
```typescript
// lib/ai-router.ts
import { callOpenAI } from './adapters/openai-adapter';
import { callAnthropic } from './adapters/anthropic-adapter';
import { callGoogle } from './adapters/google-adapter';
import { AIOptions, AIResponse } from './ai-types';

type Provider = 'openai' | 'anthropic' | 'google';

const providerMap = {
  openai: callOpenAI,
  anthropic: callAnthropic,
  google: callGoogle,
};

export async function callAI(
  options: AIOptions,
  providers: Provider[] = ['anthropic', 'openai', 'google']
): Promise<AIResponse> {
  let lastError: Error | null = null;
  for (const provider of providers) {
    try {
      return await providerMap[provider](options);
    } catch (error: any) {
      console.error(`${provider} failed:`, error.message);
      lastError = error;
      // Fall through to the next provider in the chain
    }
  }
  throw new Error(`All providers failed. Last error: ${lastError?.message}`);
}
```
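The chain above fails over on the first error, but transient failures (rate limits, brief network blips) often resolve within a second. A retry-with-backoff wrapper can sit in front of each provider call before the chain moves on — a sketch under our own naming (`withRetry` is not part of any SDK):

```typescript
// Hypothetical wrapper (our own name): retry an async call a few times
// with exponential backoff before letting the fallback chain take over.
export async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 250
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      // Backoff doubles each attempt: 250ms, 500ms, 1000ms, ...
      if (i < attempts - 1) {
        await new Promise(resolve => setTimeout(resolve, baseDelayMs * 2 ** i));
      }
    }
  }
  throw lastError;
}
```

Inside the loop you would call `withRetry(() => providerMap[provider](options))` instead of calling the adapter directly; tune `attempts` low so a hard outage still fails over quickly.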
Task-Based Routing
```typescript
// lib/ai-router.ts (continued)
type TaskType = 'code' | 'analysis' | 'creative' | 'simple' | 'long-context';

const taskRouting: Record<TaskType, Provider[]> = {
  code: ['anthropic', 'openai', 'google'],           // Claude excels at code
  analysis: ['anthropic', 'openai', 'google'],       // Claude for careful analysis
  creative: ['openai', 'anthropic', 'google'],       // GPT-4o for creative tasks
  simple: ['google', 'openai', 'anthropic'],         // Gemini Flash for simple/cheap
  'long-context': ['google', 'anthropic', 'openai'], // Gemini for long context
};

export async function callAIForTask(
  options: AIOptions,
  task: TaskType
): Promise<AIResponse> {
  return callAI(options, taskRouting[task]);
}
```
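If only some providers have keys configured, the task's preferred order should be filtered down before it reaches `callAI`. A small sketch — `resolveOrder` is our own helper name, not part of the router above:

```typescript
// Sketch (our own helper): narrow a task's preferred provider order to
// the providers that are actually available, preserving preference order.
type ProviderName = 'openai' | 'anthropic' | 'google';

export function resolveOrder(
  preferred: ProviderName[],
  available: ProviderName[]
): ProviderName[] {
  return preferred.filter(p => available.includes(p));
}
```

Because `filter` keeps the original ordering, the routing table's preferences still decide who goes first among the providers you can actually reach.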
Cost-Optimized Routing
```typescript
// lib/ai-router.ts (continued)
// Approximate costs per 1M tokens (input/output)
const providerCosts = {
  openai: { input: 2.50, output: 10.00 },    // GPT-4o
  anthropic: { input: 3.00, output: 15.00 }, // Claude Sonnet
  google: { input: 0.075, output: 0.30 },    // Gemini Flash
};

export async function callAICheap(options: AIOptions): Promise<AIResponse> {
  // Try the cheapest provider first, fall back to more expensive ones
  return callAI(options, ['google', 'openai', 'anthropic']);
}

export async function callAIBest(options: AIOptions): Promise<AIResponse> {
  // Try the highest-quality provider first
  return callAI(options, ['anthropic', 'openai', 'google']);
}
```
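The rate table also lets you price each individual response from its token usage. A minimal sketch using the same approximate per-1M-token rates — `estimateCostUsd` is our own name:

```typescript
// Sketch: per-request cost estimate from approximate per-1M-token rates
// (same figures as the providerCosts table; estimateCostUsd is our own helper).
const rates = {
  openai: { input: 2.5, output: 10.0 },
  anthropic: { input: 3.0, output: 15.0 },
  google: { input: 0.075, output: 0.3 },
} as const;

export function estimateCostUsd(
  provider: keyof typeof rates,
  inputTokens: number,
  outputTokens: number
): number {
  const r = rates[provider];
  // Rates are quoted per 1M tokens, so scale token counts down accordingly
  return (inputTokens / 1_000_000) * r.input + (outputTokens / 1_000_000) * r.output;
}
```

Feeding an `AIResponse`'s `usage.inputTokens` and `usage.outputTokens` through this gives you a per-request dollar figure to log alongside latency.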
4. API Route
```typescript
// app/api/ai/route.ts
import { NextResponse } from 'next/server';
import { callAIForTask } from '@/lib/ai-router';

export async function POST(req: Request) {
  const { messages, task = 'simple' } = await req.json();
  try {
    const response = await callAIForTask({ messages }, task);
    return NextResponse.json({
      content: response.content,
      metadata: {
        provider: response.provider,
        model: response.model,
        latencyMs: response.latencyMs,
        usage: response.usage,
      },
    });
  } catch (error: any) {
    return NextResponse.json(
      { error: 'All AI providers failed', details: error.message },
      { status: 503 }
    );
  }
}
```
5. Cost Comparison
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Best For |
|---|---|---|---|
| GPT-4o | $2.50 | $10.00 | General purpose, creative |
| Claude Sonnet | $3.00 | $15.00 | Code, analysis, instruction-following |
| Gemini Flash | $0.075 | $0.30 | High-volume, cost-sensitive |
| GPT-4o mini | $0.15 | $0.60 | Budget alternative to GPT-4o |
| Claude Haiku | $0.25 | $1.25 | Budget alternative to Sonnet |
Example: 1M input + 200K output tokens/month:
- Gemini Flash: $0.14
- GPT-4o mini: $0.27
- Claude Haiku: $0.50
- GPT-4o: $4.50
- Claude Sonnet: $6.00
6. Monitoring
Track provider performance:
```typescript
// lib/ai-monitor.ts
interface ProviderMetrics {
  totalCalls: number;
  failures: number;
  avgLatencyMs: number;
  totalCost: number;
}

const metrics: Record<string, ProviderMetrics> = {};

export function recordCall(provider: string, latencyMs: number, costUsd: number, failed: boolean) {
  if (!metrics[provider]) {
    metrics[provider] = { totalCalls: 0, failures: 0, avgLatencyMs: 0, totalCost: 0 };
  }
  const m = metrics[provider];
  m.totalCalls++;
  if (failed) m.failures++;
  m.totalCost += costUsd;
  // Running average keeps memory constant regardless of call volume
  m.avgLatencyMs = (m.avgLatencyMs * (m.totalCalls - 1) + latencyMs) / m.totalCalls;
}
```
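Metrics are only useful if something reads them; one practical use is demoting a provider that is failing too often. A companion sketch — `failureRate`, `isHealthy`, and the 20% threshold are our own additions, not part of the monitor above:

```typescript
// Sketch: derive a failure rate from recorded metrics, e.g. to demote a
// flaky provider in the fallback order. Names and threshold are our own.
interface ProviderMetrics {
  totalCalls: number;
  failures: number;
  avgLatencyMs: number;
  totalCost: number;
}

export function failureRate(m: ProviderMetrics): number {
  // Avoid dividing by zero before the first call is recorded
  return m.totalCalls === 0 ? 0 : m.failures / m.totalCalls;
}

export function isHealthy(m: ProviderMetrics, maxFailureRate = 0.2): boolean {
  return failureRate(m) <= maxFailureRate;
}
```

A scheduled job could sort providers by `failureRate` and feed the result into `callAI`'s provider list.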
Common Mistakes
| Mistake | Impact | Fix |
|---|---|---|
| No fallback chain | App breaks when one provider is down | Always have 2+ providers configured |
| Same model for every task | Overpaying for simple tasks | Route by task complexity |
| Not tracking costs per provider | Budget surprises | Log tokens + cost per request |
| Not handling streaming differences | Inconsistent UX | Unified streaming adapter |
| Ignoring rate limits | 429 errors cascade | Per-provider rate limiting |
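The streaming support promised at the top needs the unified adapter mentioned in the last row: each SDK emits differently shaped chunks, so normalize them into a single stream of text deltas. A sketch under our own naming — `normalizeStream` and the per-provider extractors mentioned below are assumptions, not SDK APIs:

```typescript
// Sketch: normalize any provider's stream into one async iterable of text
// deltas. Each adapter supplies an extractor that knows its SDK's chunk shape.
export async function* normalizeStream<TChunk>(
  stream: AsyncIterable<TChunk>,
  extractText: (chunk: TChunk) => string | undefined
): AsyncGenerator<string> {
  for await (const chunk of stream) {
    const text = extractText(chunk);
    // Skip keep-alive / metadata chunks that carry no text
    if (text) yield text;
  }
}
```

As rough examples of extractors (verify against each SDK's current chunk types): OpenAI's would read `chunk.choices[0]?.delta?.content`, Anthropic's would pick `text_delta` events, and Gemini's would call `chunk.text()`. The API route can then pipe the normalized stream into a `ReadableStream` response regardless of which provider won the fallback.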
Choosing an AI API? Compare OpenAI vs Anthropic vs Google Gemini on APIScout — pricing, quality, and performance benchmarks.