Building Webhooks That Don't Break: Best Practices
Webhooks are the internet's callback mechanism — when something happens in your system, you POST a JSON payload to a URL your customer provides. Simple in concept. In practice, webhooks fail silently, arrive out of order, get replayed, and expose security vulnerabilities. Here's how to build webhooks that actually work.
The Basics
A webhook system has three components:
- Event source — something happens in your system (order created, payment completed)
- Delivery — you POST a JSON payload to the customer's URL
- Verification — the customer verifies the payload came from you (signature)
1. Sign Every Payload
Unsigned webhooks can be spoofed. Anyone who knows the endpoint URL can send fake events. Always sign payloads with HMAC-SHA256.
How Stripe does it:
Stripe-Signature: t=1710892800,v1=abc123...
The header carries two parts: a timestamp (t), which limits replay attacks, and a signature (v1), which is an HMAC-SHA256 of timestamp.payload computed with a per-endpoint secret.
Your implementation should:
- Generate a unique signing secret per webhook endpoint
- Include a timestamp in the signature to prevent replay attacks
- Use HMAC-SHA256 (not MD5, not SHA1)
- Sign the raw request body (not parsed JSON)
- Document the verification process with code examples in 5+ languages
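The checklist above can be sketched in a few lines. This is a minimal illustration rather than any provider's actual implementation: the function names, the t=...,v1=... header layout (modelled on the Stripe example), and the 300-second tolerance window are assumptions made for the example.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign_payload(secret: bytes, raw_body: bytes, timestamp: int) -> str:
    """Return a header value of the form 't=<timestamp>,v1=<hex digest>'."""
    # Sign "<timestamp>.<raw body>" so the timestamp cannot be swapped independently.
    signed = str(timestamp).encode() + b"." + raw_body
    digest = hmac.new(secret, signed, hashlib.sha256).hexdigest()
    return f"t={timestamp},v1={digest}"


def verify_signature(secret: bytes, raw_body: bytes, header: str,
                     tolerance_seconds: int = 300,  # assumed window, tune as needed
                     now: Optional[float] = None) -> bool:
    """Check the HMAC and reject timestamps outside the tolerance window."""
    try:
        parts = dict(item.split("=", 1) for item in header.split(","))
        timestamp = int(parts["t"])
        received = parts["v1"]
    except (KeyError, ValueError):
        return False  # malformed header
    current = time.time() if now is None else now
    if abs(current - timestamp) > tolerance_seconds:
        return False  # too old (or too far in the future): possible replay
    expected = sign_payload(secret, raw_body, timestamp).split("v1=", 1)[1]
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), received.encode())
```

The consumer must pass the raw bytes it received to verify_signature; re-serializing parsed JSON can change whitespace or key order and break the digest.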
2. Retry Failed Deliveries
Webhook endpoints go down. Networks fail. Servers return 500s. Retry with exponential backoff.
Retry schedule (example):
| Attempt | Delay | Total elapsed |
|---|---|---|
| 1 | Immediate | 0 |
| 2 | 5 minutes | 5 min |
| 3 | 30 minutes | 35 min |
| 4 | 2 hours | 2h 35m |
| 5 | 8 hours | 10h 35m |
| 6 | 24 hours | 34h 35m |
Success criteria: 2xx status code within 30 seconds. Anything else (3xx, 4xx, 5xx, timeout) triggers a retry.
After all retries fail: Mark the endpoint as failing. Notify the customer via email. Pause delivery after N consecutive failures. Provide a manual replay mechanism.
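As a rough sketch, the example schedule and the success rule can be encoded as data plus two small helpers. The names RETRY_DELAYS, is_success, and next_delay are hypothetical, and a production scheduler would likely add jitter and persist the attempt count.

```python
from typing import Optional

# Delay in seconds before each attempt, copied from the example table above.
RETRY_DELAYS = [0, 5 * 60, 30 * 60, 2 * 3600, 8 * 3600, 24 * 3600]


def is_success(status_code: Optional[int]) -> bool:
    """Only a 2xx response counts as delivered; None stands for a timeout."""
    return status_code is not None and 200 <= status_code < 300


def next_delay(attempts_made: int) -> Optional[int]:
    """Seconds to wait before the next attempt, or None once retries are exhausted."""
    if attempts_made < len(RETRY_DELAYS):
        return RETRY_DELAYS[attempts_made]
    return None  # caller should mark the endpoint as failing and notify the customer
```

For example, next_delay(1) gives the 5-minute wait before attempt 2, and next_delay(6) returns None, which is the point to flag the endpoint and offer manual replay.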
3. Make Events Idempotent
Network issues and retries mean endpoints may receive the same event multiple times. Every event should include a unique ID that consumers use for deduplication.
{
  "id": "evt_abc123",
  "type": "order.completed",
  "created_at": "2026-03-08T12:00:00Z",
  "data": { ... }
}
Consumer-side: Store processed event IDs. Before processing, check if evt_abc123 was already handled. Skip if yes.
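One way a consumer might implement this check is to let a primary-key constraint do the deduplication. The sketch below uses SQLite purely for illustration; the table name processed_events and the helper names are assumptions, not part of any particular webhook SDK.

```python
import sqlite3
from typing import Callable


class EventDeduplicator:
    """Remembers processed event IDs in a table keyed by event ID."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS processed_events (event_id TEXT PRIMARY KEY)"
        )

    def mark_if_new(self, event_id: str) -> bool:
        """Record the ID; return True only the first time it is seen."""
        cursor = self.conn.execute(
            "INSERT OR IGNORE INTO processed_events (event_id) VALUES (?)",
            (event_id,),
        )
        self.conn.commit()
        return cursor.rowcount == 1  # 0 rows inserted means a duplicate


def handle_event(dedup: EventDeduplicator, event: dict,
                 process: Callable[[dict], None]) -> bool:
    """Run process(event) once per event ID; return False for duplicates."""
    if not dedup.mark_if_new(event["id"]):
        return False
    process(event)
    return True
```

This sketch records the ID before processing, so a crash in process would leave the event marked but unhandled. A real consumer would probably record the ID and apply the side effects in the same database transaction.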
4. Event Design
Consistent Event Schema
Every event should have the same top-level structure:
{
  "id": "evt_abc123",
  "type": "order.completed",
  "api_version": "2026-03-08",
  "created_at": "2026-03-08T12:00:00Z",
  "data": {
    "object": { ... }
  }
}
Event Types
Use resource.action naming: order.created, order.updated, payment.succeeded, payment.failed.
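To keep the envelope and the naming rule consistent, both can be enforced in a single constructor. The following is a sketch under assumptions: build_event is a hypothetical helper, the regular expression allows only lowercase two-part names, and the evt_ prefix with a random UUID is just one possible ID scheme.

```python
import re
import uuid
from datetime import datetime, timezone

# Assumed rule: lowercase "resource.action", e.g. "order.created".
EVENT_TYPE_PATTERN = re.compile(r"^[a-z][a-z0-9_]*\.[a-z][a-z0-9_]*$")


def build_event(event_type: str, obj: dict, api_version: str) -> dict:
    """Wrap an object in the common top-level event structure."""
    if not EVENT_TYPE_PATTERN.match(event_type):
        raise ValueError(f"event type must look like resource.action: {event_type!r}")
    return {
        "id": f"evt_{uuid.uuid4().hex}",  # unique per event, used for deduplication
        "type": event_type,
        "api_version": api_version,
        "created_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "data": {"object": obj},
    }
```

Calling build_event("order.completed", {"id": "ord_123"}, "2026-03-08") yields the same top-level keys as the schema above, while a name such as "OrderCompleted" raises ValueError.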
Include Full Objects
Include the full current state of the object, not just the changed fields. This way, consumers don't need to make follow-up API calls.
{
  "type": "order.updated",
  "data": {
    "object": {
      "id": "ord_123",
      "status": "shipped",
      "total": 4999,
      "items": [...],
      "customer": { ... }
    }
  }
}
5. Delivery Infrastructure
Async Processing
Never block your application to deliver webhooks. Queue events and process delivery asynchronously.
Application → Event Queue → Webhook Worker → HTTP POST
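The pipeline above can be illustrated with the standard library. This is a toy sketch: an in-process queue.Queue stands in for what would normally be a durable queue, and start_worker and the deliver callback are hypothetical names.

```python
import queue
import threading
from typing import Callable


def start_worker(events: queue.Queue, deliver: Callable[[dict], None]) -> threading.Thread:
    """Start a background thread that delivers queued events one by one."""

    def run() -> None:
        while True:
            event = events.get()
            try:
                if event is None:  # sentinel value: stop the worker
                    return
                deliver(event)
            except Exception:
                pass  # a real worker would record the failure and schedule a retry
            finally:
                events.task_done()

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker
```

The application only calls events.put(event) and returns immediately; the HTTP POST happens on the worker thread, so a slow customer endpoint never blocks request handling.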
Timeout
Set a 30-second timeout for webhook delivery. If the endpoint doesn't respond in 30 seconds, mark as failed and retry.
Don't Follow Redirects
Webhook delivery should not follow redirects (3xx responses). Treat redirects as failures. The configured URL should be the final destination.
IP Allowlisting
Publish the IP addresses your webhooks are sent from. Customers may need to allowlist them in their firewall.
6. Security
Prevent SSRF
Customers provide webhook URLs — don't let them point to internal services. Validate URLs:
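- Block private IP ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) and loopback (127.0.0.0/8)
- Block link-local addresses (169.254.0.0/16), which include cloud metadata endpoints such as 169.254.169.254
- Block localhost and the IPv6 equivalents (::1, fc00::/7, fe80::/10)
- Resolve DNS before connecting, check every resolved IP, and connect to the address you checked so that a second lookup cannot return a different one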
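These checks map fairly directly onto Python's ipaddress module. The sketch below is one possible shape, with hypothetical function names; it rejects a URL if any resolved address is non-public, which is a deliberately strict assumption.

```python
import ipaddress
import socket
from typing import List
from urllib.parse import urlsplit


def is_public_ip(address: str) -> bool:
    """True only for addresses outside private, loopback, and link-local ranges."""
    ip = ipaddress.ip_address(address.split("%", 1)[0])  # drop any IPv6 zone ID
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def resolve_public_addresses(url: str) -> List[str]:
    """Resolve the webhook host and return its addresses, or raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError("webhook URL must be http(s) with a hostname")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    infos = socket.getaddrinfo(parts.hostname, port, type=socket.SOCK_STREAM)
    addresses = [info[4][0] for info in infos]
    # Reject the URL if any resolved address points at internal infrastructure.
    if not addresses or not all(is_public_ip(a) for a in addresses):
        raise ValueError("webhook URL resolves to a non-public address")
    return addresses
```

To guard against DNS rebinding, the delivery code should connect to one of the returned addresses rather than resolving the hostname a second time.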
Rate Limit Deliveries
If a customer configures multiple endpoints, limit the total delivery rate per customer. One endpoint failure shouldn't trigger thousands of retry requests.
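A per-customer token bucket is one common way to cap the combined delivery rate across endpoints. The sketch below is illustrative only: the class names and the rate and capacity parameters are assumptions, and a multi-worker deployment would need shared state (for example in a database or cache) rather than an in-memory dict.

```python
import time
from typing import Dict, Optional


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `rate_per_second`."""

    def __init__(self, rate_per_second: float, capacity: float,
                 now: Optional[float] = None) -> None:
        self.rate = rate_per_second
        self.capacity = capacity
        self.tokens = capacity
        self.updated = time.monotonic() if now is None else now

    def try_acquire(self, now: Optional[float] = None) -> bool:
        current = time.monotonic() if now is None else now
        elapsed = max(0.0, current - self.updated)
        # Refill according to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = current
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class CustomerRateLimiter:
    """One bucket per customer, shared by all of that customer's endpoints."""

    def __init__(self, rate_per_second: float, capacity: float) -> None:
        self.rate = rate_per_second
        self.capacity = capacity
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, customer_id: str, now: Optional[float] = None) -> bool:
        bucket = self.buckets.get(customer_id)
        if bucket is None:
            bucket = self.buckets[customer_id] = TokenBucket(self.rate, self.capacity, now)
        return bucket.try_acquire(now)
```

A worker would call allow(customer_id) before each delivery or retry and requeue the event with a short delay when it returns False.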
Payload Size
Limit webhook payload size (e.g., 256KB). Large payloads can overwhelm consumers. For large data, include a reference URL instead.
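As a sketch of the reference-URL fallback, the serializer below swaps the data section for a link when the encoded event exceeds the limit. The object_url field name and the serialize_event helper are hypothetical; the 256KB figure is the example limit from the text.

```python
import json

MAX_PAYLOAD_BYTES = 256 * 1024  # example limit from the text above


def serialize_event(event: dict, reference_url: str) -> bytes:
    """Encode the event; replace oversized data with a reference URL."""
    body = json.dumps(event, separators=(",", ":")).encode("utf-8")
    if len(body) <= MAX_PAYLOAD_BYTES:
        return body
    # Keep the envelope (id, type, ...) but point at the API for the full object.
    slim = {key: value for key, value in event.items() if key != "data"}
    slim["data"] = {"object_url": reference_url}  # hypothetical field name
    return json.dumps(slim, separators=(",", ":")).encode("utf-8")
```

Consumers that receive the slim form fetch the object from the reference URL, so the event catalog should document both shapes.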
7. Developer Experience
Webhook Dashboard
Provide a UI where customers can:
- View delivery attempts (success/failure/pending)
- See request and response bodies
- Manually replay failed events
- Test with sample events
- Manage endpoint URLs and signing secrets
CLI Testing
Provide a CLI tool for local webhook testing:
your-cli webhooks listen --port 3000
This creates a tunnel so developers can receive webhooks on localhost during development.
Event Catalog
Document every event type with example payloads, when they fire, and what data they include.
8. Common Mistakes
| Mistake | Impact | Fix |
|---|---|---|
| No signatures | Endpoint spoofing | HMAC-SHA256 every payload |
| No retries | Silent data loss | Exponential backoff, 5+ attempts |
| Synchronous delivery | Application slowdown | Queue-based async delivery |
| No event IDs | Duplicate processing | Unique ID per event |
| Following redirects | SSRF vulnerability | Treat 3xx as failure |
| No timeout | Worker threads stuck | 30-second timeout |
| Partial object in payload | Consumer needs follow-up API call | Include full object state |