How to Build an AI Chatbot with the Anthropic API
Claude is one of the most capable AI models for building chatbots — strong at reasoning, following instructions, and maintaining coherent conversations. This guide covers everything from basic chat to streaming, tool use, and production deployment.
What You'll Build
- Conversational chatbot with message history
- Streaming responses for real-time output
- System prompts for personality and behavior
- Tool use (function calling) for dynamic actions
- Production-ready error handling and rate limiting
Prerequisites: Node.js 18+, Anthropic API key (from console.anthropic.com).
1. Setup
Install the SDK
npm install @anthropic-ai/sdk
Initialize the Client
// lib/anthropic.ts
import Anthropic from '@anthropic-ai/sdk';

export const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});
Environment Variables
# .env.local
ANTHROPIC_API_KEY=sk-ant-...
2. Basic Chat
Simple Message
const message = await anthropic.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'What is an API?' }
  ],
});

// content is a union of block types; narrow before reading .text
const block = message.content[0];
if (block.type === 'text') {
  console.log(block.text);
}
With System Prompt
System prompts define your chatbot's personality, knowledge, and behavior:
const message = await anthropic.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  system: `You are a helpful API expert assistant. You help developers choose
the right APIs for their projects. Be concise, technical, and always include
code examples when relevant. If you don't know something, say so.`,
  messages: [
    { role: 'user', content: 'Which email API should I use for transactional emails?' }
  ],
});
Conversation with History
Maintain context by sending the full conversation history:
const conversationHistory: Anthropic.MessageParam[] = [];

async function chat(userMessage: string) {
  conversationHistory.push({
    role: 'user',
    content: userMessage,
  });

  const response = await anthropic.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1024,
    system: 'You are a helpful assistant.',
    messages: conversationHistory,
  });

  // Narrow the content block union before reading .text
  const firstBlock = response.content[0];
  const assistantMessage = firstBlock.type === 'text' ? firstBlock.text : '';

  conversationHistory.push({
    role: 'assistant',
    content: assistantMessage,
  });

  return assistantMessage;
}
// Usage
await chat('What is REST?');
await chat('How does it compare to GraphQL?'); // Knows context
await chat('Which should I use for my mobile app?'); // Remembers both
3. Streaming Responses
Streaming shows text as it's generated — essential for a good chatbot UX:
const stream = anthropic.messages.stream({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Explain API rate limiting' }
  ],
});

for await (const event of stream) {
  if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
    process.stdout.write(event.delta.text);
  }
}
Streaming API Route (Next.js)
// app/api/chat/route.ts
import { anthropic } from '@/lib/anthropic';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const stream = await anthropic.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 2048,
    stream: true,
    system: 'You are a helpful API expert.',
    messages,
  });

  const encoder = new TextEncoder();
  const readable = new ReadableStream({
    async start(controller) {
      for await (const event of stream) {
        if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
          controller.enqueue(encoder.encode(event.delta.text));
        }
      }
      controller.close();
    },
  });

  return new Response(readable, {
    headers: { 'Content-Type': 'text/plain; charset=utf-8' },
  });
}
Streaming Client Component
// components/Chat.tsx
'use client';
import { useState } from 'react';

type Message = { role: 'user' | 'assistant'; content: string };

export function Chat() {
  const [messages, setMessages] = useState<Message[]>([]);
  const [input, setInput] = useState('');
  const [isStreaming, setIsStreaming] = useState(false);

  const handleSubmit = async (e: React.FormEvent) => {
    e.preventDefault();
    if (!input.trim() || isStreaming) return;

    const userMessage: Message = { role: 'user', content: input };
    const updatedMessages = [...messages, userMessage];
    setMessages(updatedMessages);
    setInput('');
    setIsStreaming(true);

    const res = await fetch('/api/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ messages: updatedMessages }),
    });

    const reader = res.body!.getReader();
    const decoder = new TextDecoder();
    let assistantContent = '';
    setMessages([...updatedMessages, { role: 'assistant', content: '' }]);

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      // stream: true keeps multi-byte characters intact across chunk boundaries
      assistantContent += decoder.decode(value, { stream: true });
      setMessages([
        ...updatedMessages,
        { role: 'assistant', content: assistantContent },
      ]);
    }
    setIsStreaming(false);
  };

  return (
    <div>
      <div className="messages">
        {messages.map((m, i) => (
          <div key={i} className={m.role}>
            <strong>{m.role}:</strong> {m.content}
          </div>
        ))}
      </div>
      <form onSubmit={handleSubmit}>
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          placeholder="Ask about APIs..."
          disabled={isStreaming}
        />
        <button type="submit" disabled={isStreaming}>Send</button>
      </form>
    </div>
  );
}
4. Tool Use (Function Calling)
Give your chatbot the ability to perform actions — look up data, call APIs, execute functions:
const tools: Anthropic.Tool[] = [
  {
    name: 'search_apis',
    description: 'Search the API directory for APIs matching a query',
    input_schema: {
      type: 'object',
      properties: {
        query: { type: 'string', description: 'Search query' },
        category: {
          type: 'string',
          enum: ['ai', 'payments', 'email', 'auth', 'search'],
          description: 'API category filter',
        },
      },
      required: ['query'],
    },
  },
  {
    name: 'compare_apis',
    description: 'Compare two APIs side by side',
    input_schema: {
      type: 'object',
      properties: {
        api_a: { type: 'string', description: 'First API name' },
        api_b: { type: 'string', description: 'Second API name' },
      },
      required: ['api_a', 'api_b'],
    },
  },
];
// Handle tool use in conversation
async function chatWithTools(userMessage: string) {
  const response = await anthropic.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1024,
    tools,
    messages: [{ role: 'user', content: userMessage }],
  });

  // Check if Claude wants to use a tool
  for (const block of response.content) {
    if (block.type === 'tool_use') {
      const toolResult = await executeTool(block.name, block.input);

      // Send tool result back to Claude
      const followUp = await anthropic.messages.create({
        model: 'claude-sonnet-4-20250514',
        max_tokens: 1024,
        tools,
        messages: [
          { role: 'user', content: userMessage },
          { role: 'assistant', content: response.content },
          {
            role: 'user',
            content: [{
              type: 'tool_result',
              tool_use_id: block.id,
              content: JSON.stringify(toolResult),
            }],
          },
        ],
      });
      return followUp;
    }
  }

  return response;
}
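The executeTool helper used here is application code, not part of the SDK. A minimal dispatcher matching the two tool definitions might look like the sketch below; the returned objects are placeholder shapes standing in for your own data layer.

```typescript
// Hypothetical dispatcher for the search_apis and compare_apis tools.
// Replace the bodies with real lookups against your API directory.
async function executeTool(name: string, input: unknown): Promise<unknown> {
  switch (name) {
    case 'search_apis': {
      const { query, category } = input as { query: string; category?: string };
      // Placeholder: a real implementation would query your directory
      return { results: [], query, category: category ?? null };
    }
    case 'compare_apis': {
      const { api_a, api_b } = input as { api_a: string; api_b: string };
      // Placeholder: a real implementation would fetch and diff both records
      return { comparison: `${api_a} vs ${api_b}` };
    }
    default:
      throw new Error(`Unknown tool: ${name}`);
  }
}
```

Throwing on unknown tool names matters: Claude occasionally emits a tool name you did not define, and silently returning nothing hides the bug.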
5. Extended Thinking
For complex questions, enable extended thinking to let Claude reason before responding:
const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 16000,
  thinking: {
    type: 'enabled',
    budget_tokens: 10000,
  },
  messages: [
    {
      role: 'user',
      content: 'Design an API architecture for a multi-tenant SaaS platform with real-time features',
    },
  ],
});

// Response includes thinking blocks + text blocks
for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log('Thinking:', block.thinking);
  } else if (block.type === 'text') {
    console.log('Response:', block.text);
  }
}
6. Model Selection
| Model | Best For | Speed | Cost |
|---|---|---|---|
| claude-sonnet-4-20250514 | General chatbot, balanced quality/speed | Fast | Medium |
| claude-opus-4-20250514 | Complex reasoning, high-stakes responses | Slower | Higher |
| claude-3-5-haiku-20241022 | High-volume, simple queries | Fastest | Lowest |
For chatbots: Start with Sonnet. Use Haiku for high-volume, simple interactions. Escalate to Opus for complex queries.
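That escalation policy can be sketched as a small router that picks a model per query. The keyword list and length threshold below are illustrative assumptions, not tuned values.

```typescript
// Illustrative routing heuristic: keyword hints escalate to Opus, short
// queries drop to Haiku, everything else defaults to Sonnet.
const COMPLEX_HINTS = ['architecture', 'design', 'trade-off', 'compare'];

function pickModel(query: string): string {
  const lower = query.toLowerCase();
  if (COMPLEX_HINTS.some((hint) => lower.includes(hint))) {
    return 'claude-opus-4-20250514';
  }
  if (query.length < 80) {
    return 'claude-3-5-haiku-20241022';
  }
  return 'claude-sonnet-4-20250514';
}
```

In production you would likely classify queries with a cheap model call or user-tier rules rather than string matching, but the shape of the decision is the same.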
7. Production Best Practices
Rate Limiting
// Implement per-user rate limiting
import { Ratelimit } from '@upstash/ratelimit';
import { Redis } from '@upstash/redis';
import { NextResponse } from 'next/server';

const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(20, '1 m'), // 20 messages per minute
});

// In your API route
const { success } = await ratelimit.limit(userId);
if (!success) {
  return NextResponse.json({ error: 'Rate limited' }, { status: 429 });
}
Error Handling
try {
  const response = await anthropic.messages.create({ ... });
} catch (error) {
  if (error instanceof Anthropic.RateLimitError) {
    // Wait and retry
  } else if (error instanceof Anthropic.APIError) {
    // Log and return user-friendly error
  }
}
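For the "wait and retry" branch, a generic backoff wrapper is usually enough. Note that the SDK already retries some transient failures itself (configurable via the maxRetries client option); this sketch is for app-level control, and the attempt count and base delay are arbitrary choices.

```typescript
// Retry an async operation with exponential backoff: 1s, 2s, 4s, ...
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      // Wait before the next attempt; doubles each time
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
    }
  }
  throw lastError;
}
```

Usage: `const response = await withRetry(() => anthropic.messages.create({ ... }));`. In a real handler you would only retry retryable errors (429, 5xx), not validation failures.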
Context Window Management
Claude has a large context window but costs increase with token count. Manage conversation length:
function trimConversation(messages: Message[], maxTokens: number = 50000) {
  // The system prompt is a separate API parameter, so it survives trimming
  // automatically. Summarize older messages if you need more fidelity
  // than simple truncation.
  if (estimateTokens(messages) > maxTokens) {
    // Keep the first message for context and the 10 most recent messages
    return [messages[0], ...messages.slice(-10)];
  }
  return messages;
}
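trimConversation relies on an estimateTokens helper the snippet leaves undefined. A common rough heuristic is about four characters per token for English text; when you need exact numbers, use the API's token counting endpoint instead. A sketch:

```typescript
type ChatMessage = { role: 'user' | 'assistant'; content: string };

// Rough estimate: ~4 characters per token for English text. Good enough
// for deciding when to trim; not an exact count.
function estimateTokens(messages: ChatMessage[]): number {
  const totalChars = messages.reduce((sum, m) => sum + m.content.length, 0);
  return Math.ceil(totalChars / 4);
}
```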
Common Mistakes
| Mistake | Impact | Fix |
|---|---|---|
| Not streaming | Users wait for full response — feels slow | Always stream in chat UIs |
| Sending full history forever | Costs increase, hits context limits | Trim or summarize old messages |
| No rate limiting | One user exhausts your API budget | Per-user rate limits |
| Exposing API key to client | Account compromise | Server-side only |
| Ignoring stop reasons | Missing tool calls, truncated responses | Check stop_reason in response |
| No error handling | Crashes on 429/500 responses | Try/catch with retry logic |
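The stop_reason row in the table above is worth spelling out: every response carries a stop_reason, and a chat loop should branch on it. The string values below are from the Messages API; the action labels are just names chosen for this sketch.

```typescript
type StopReason = 'end_turn' | 'max_tokens' | 'tool_use' | 'stop_sequence' | null;

// Decide what the chat loop does next based on why generation stopped
function nextAction(stopReason: StopReason): 'done' | 'run_tools' | 'continue' {
  switch (stopReason) {
    case 'tool_use':
      return 'run_tools'; // execute the requested tools, send tool_result back
    case 'max_tokens':
      return 'continue'; // truncated; raise max_tokens or ask Claude to continue
    default:
      return 'done'; // end_turn or stop_sequence: the turn is complete
  }
}
```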
Building with the Anthropic API? Explore AI API comparisons and integration guides on APIScout — Claude vs GPT, Claude vs Gemini, and more.