Building Webhooks That Don't Break: Best Practices

APIScout Team
Tags: webhooks, api design, event-driven, best practices, api architecture

Webhooks are the internet's callback mechanism — when something happens in your system, you POST a JSON payload to a URL your customer provides. Simple in concept. In practice, webhooks fail silently, arrive out of order, get replayed, and expose security vulnerabilities. Here's how to build webhooks that actually work.

The Basics

A webhook system has three components:

  1. Event source — something happens in your system (order created, payment completed)
  2. Delivery — you POST a JSON payload to the customer's URL
  3. Verification — the customer verifies the payload came from you (signature)

1. Sign Every Payload

Unsigned webhooks can be spoofed. Anyone who knows the endpoint URL can send fake events. Always sign payloads with HMAC-SHA256.

How Stripe does it:

Stripe-Signature: t=1710892800,v1=abc123...

The header carries two things: a timestamp, which lets the receiver reject stale requests and so limits replay attacks, and an HMAC-SHA256 of the string timestamp.payload computed with a per-endpoint secret.

Your implementation should:

  • Generate a unique signing secret per webhook endpoint
  • Include a timestamp in the signature to prevent replay attacks
  • Use HMAC-SHA256 (not MD5, not SHA1)
  • Sign the raw request body (not parsed JSON)
  • Document the verification process with code examples in 5+ languages
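The checklist above can be sketched in a few lines of Python. This is a minimal illustration, not Stripe's implementation: the `whsec_` prefix, the 300-second tolerance, and the function names are assumptions chosen for the example.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional

TOLERANCE_SECONDS = 300  # assumed replay window; tune to your needs


def new_signing_secret() -> str:
    """Generate a random per-endpoint secret (the 'whsec_' prefix is hypothetical)."""
    return "whsec_" + secrets.token_hex(32)


def sign(secret: str, raw_body: bytes, timestamp: int) -> str:
    """Build the header value: HMAC-SHA256 over b'<timestamp>.<raw body>'."""
    signed = str(timestamp).encode() + b"." + raw_body
    digest = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    return f"t={timestamp},v1={digest}"


def verify(secret: str, raw_body: bytes, header: str, now: Optional[int] = None) -> bool:
    """Check the signature and reject timestamps outside the tolerance window."""
    now = int(time.time()) if now is None else now
    try:
        parts = dict(item.split("=", 1) for item in header.split(","))
        timestamp = int(parts["t"])
        received = parts["v1"]
    except (KeyError, ValueError):
        return False  # malformed header
    if abs(now - timestamp) > TOLERANCE_SECONDS:
        return False  # too old (or too far in the future): possible replay
    expected = sign(secret, raw_body, timestamp).split("v1=", 1)[1]
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), received.encode())
```

Note that `verify` takes the raw bytes of the request body. Re-serializing parsed JSON can change whitespace or key order and break the signature.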

2. Retry Failed Deliveries

Webhook endpoints go down. Networks fail. Servers return 500s. Retry with exponential backoff.

Retry schedule (example):

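| Attempt | Delay      | Total elapsed |
|---------|------------|---------------|
| 1       | Immediate  | 0             |
| 2       | 5 minutes  | 5 min         |
| 3       | 30 minutes | 35 min        |
| 4       | 2 hours    | 2h 35m        |
| 5       | 8 hours    | 10h 35m       |
| 6       | 24 hours   | 34h 35m       |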

Success criteria: 2xx status code within 30 seconds. Anything else (3xx, 4xx, 5xx, timeout) triggers a retry.

After all retries fail: Mark the endpoint as failing. Notify the customer via email. Pause delivery after N consecutive failures. Provide a manual replay mechanism.
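The example schedule and the success rule translate directly into code. A minimal sketch, where `RETRY_DELAYS`, `is_success`, and `next_delay` are illustrative names rather than part of any library:

```python
from datetime import timedelta
from typing import Optional

# Delay before each attempt, mirroring the example schedule above.
RETRY_DELAYS = [
    timedelta(0),
    timedelta(minutes=5),
    timedelta(minutes=30),
    timedelta(hours=2),
    timedelta(hours=8),
    timedelta(hours=24),
]


def is_success(status: Optional[int]) -> bool:
    """Only a 2xx response counts; None stands for a timeout or network error."""
    return status is not None and 200 <= status < 300


def next_delay(attempts_made: int) -> Optional[timedelta]:
    """Delay before the next attempt, or None once the schedule is exhausted."""
    if attempts_made < len(RETRY_DELAYS):
        return RETRY_DELAYS[attempts_made]
    return None
```

When `next_delay` returns None, the worker would mark the endpoint as failing and hand off to the notification and manual replay steps described above.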

3. Make Events Idempotent

Network issues and retries mean endpoints may receive the same event multiple times. Every event should include a unique ID that consumers use for deduplication.

{
  "id": "evt_abc123",
  "type": "order.completed",
  "created_at": "2026-03-08T12:00:00Z",
  "data": { ... }
}

Consumer-side: Store processed event IDs. Before processing, check if evt_abc123 was already handled. Skip if yes.
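On the consumer side, deduplication might look like the sketch below. It assumes a SQLite table as the store of processed IDs; the table name and the `EventDeduplicator` and `handle` helpers are hypothetical. The primary key makes the check and the insert a single atomic step.

```python
import sqlite3


class EventDeduplicator:
    """Remember processed event IDs in a table keyed by event_id."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS processed_events (event_id TEXT PRIMARY KEY)"
        )

    def claim(self, event_id: str) -> bool:
        """Return True the first time an ID is seen, False for duplicates."""
        with self.conn:  # commits on success
            cursor = self.conn.execute(
                "INSERT OR IGNORE INTO processed_events (event_id) VALUES (?)",
                (event_id,),
            )
        return cursor.rowcount == 1  # 0 rows inserted means the ID already existed


def handle(event: dict, dedup: EventDeduplicator, process) -> bool:
    """Run `process` once per event ID; return False when the event was skipped."""
    if not dedup.claim(event["id"]):
        return False
    process(event)
    return True
```

This version claims the ID before processing, so a crash inside `process` leaves the event marked as handled. If that is not acceptable, record the ID in the same transaction as the side effects instead.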

4. Event Design

Consistent Event Schema

Every event should have the same top-level structure:

{
  "id": "evt_abc123",
  "type": "order.completed",
  "api_version": "2026-03-08",
  "created_at": "2026-03-08T12:00:00Z",
  "data": {
    "object": { ... }
  }
}

Event Types

Use resource.action naming: order.created, order.updated, payment.succeeded, payment.failed.
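One way to keep the envelope and the naming rule consistent is to build every event through a single helper. A sketch, assuming lowercase names with underscores; the regular expression and `build_event` are illustrative, and the `api_version` value is copied from the example envelope:

```python
import re
import uuid
from datetime import datetime, timezone

# resource.action, lowercase with optional underscores (an assumed convention).
EVENT_TYPE_PATTERN = re.compile(r"[a-z][a-z_]*\.[a-z][a-z_]*")
API_VERSION = "2026-03-08"  # taken from the example envelope


def build_event(event_type: str, obj: dict) -> dict:
    """Wrap an object in the shared top-level envelope."""
    if not EVENT_TYPE_PATTERN.fullmatch(event_type):
        raise ValueError(f"event type must look like resource.action: {event_type!r}")
    return {
        "id": f"evt_{uuid.uuid4().hex}",
        "type": event_type,
        "api_version": API_VERSION,
        "created_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "data": {"object": obj},
    }
```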

Include Full Objects

Include the full current state of the object, not just the changed fields. This way, consumers don't need to make follow-up API calls.

{
  "type": "order.updated",
  "data": {
    "object": {
      "id": "ord_123",
      "status": "shipped",
      "total": 4999,
      "items": [...],
      "customer": { ... }
    }
  }
}

5. Delivery Infrastructure

Async Processing

Never block your application to deliver webhooks. Queue events and process delivery asynchronously.

Application → Event Queue → Webhook Worker → HTTP POST
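The pipeline above can be illustrated with an in-process queue and a worker thread. This is only a sketch: a production system would more likely use a durable queue, and `start_worker` and `publish` are hypothetical names.

```python
import queue
import threading
from typing import Callable, Optional


def start_worker(
    events: "queue.Queue[Optional[dict]]",
    deliver: Callable[[dict], None],
) -> threading.Thread:
    """Start a background thread that drains the queue and delivers each event."""

    def run() -> None:
        while True:
            event = events.get()
            try:
                if event is None:  # sentinel: stop the worker
                    return
                deliver(event)
            except Exception as exc:
                # A real worker would record the failure and schedule a retry.
                print(f"delivery failed: {exc}")
            finally:
                events.task_done()

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker


def publish(events: "queue.Queue[Optional[dict]]", event: dict) -> None:
    """Called from request handlers: enqueue and return immediately."""
    events.put(event)
```

The request path only pays for `events.put`, so a slow or unreachable customer endpoint never delays your own API responses.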

Timeout

Set a 30-second timeout for webhook delivery. If the endpoint doesn't respond in 30 seconds, mark as failed and retry.

Don't Follow Redirects

Webhook delivery should not follow redirects (3xx responses). Treat redirects as failures. The configured URL should be the final destination.

IP Allowlisting

Publish the IP addresses your webhooks are sent from. Customers may need to allowlist them in their firewall.

6. Security

Prevent SSRF

Customers provide webhook URLs — don't let them point to internal services. Validate URLs:

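  • Block private IP ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) and loopback (127.0.0.0/8)
  • Block link-local addresses (169.254.0.0/16), which include cloud metadata endpoints
  • Block localhost and the IPv6 equivalents (::1, fc00::/7, fe80::/10)
  • Resolve DNS before connecting, check every resolved IP, and connect to the IP you checked so the hostname cannot re-resolve to an internal address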
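The checks above might be implemented with the standard `ipaddress` and `socket` modules. A minimal sketch, with hypothetical function names; it also rejects multicast, reserved, and unspecified addresses, which goes slightly beyond the list:

```python
import ipaddress
import socket
from typing import List
from urllib.parse import urlsplit


def is_blocked_address(ip: str) -> bool:
    """True for loopback, private, link-local, and other non-public addresses."""
    address = ipaddress.ip_address(ip)
    return (
        address.is_private
        or address.is_loopback
        or address.is_link_local
        or address.is_multicast
        or address.is_reserved
        or address.is_unspecified
    )


def resolve_public_ips(url: str) -> List[str]:
    """Resolve the URL's host; raise ValueError if any resolved IP is blocked."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError("webhook URL must be http(s) with a host")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    # May raise socket.gaierror if the name does not resolve.
    infos = socket.getaddrinfo(parts.hostname, port, type=socket.SOCK_STREAM)
    ips = sorted({info[4][0] for info in infos})
    for ip in ips:
        if is_blocked_address(ip):
            raise ValueError(f"{parts.hostname} resolves to blocked address {ip}")
    return ips
```

Run this check at delivery time as well as at registration, since DNS records can change after the URL is saved.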

Rate Limit Deliveries

If a customer configures multiple endpoints, limit the total delivery rate per customer. One endpoint failure shouldn't trigger thousands of retry requests.
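A per-customer limit is often implemented as a token bucket. The sketch below is one possible approach, not a prescribed design: the class name, the parameters, and the injectable clock are assumptions, and the class is not thread-safe as written.

```python
import time
from typing import Callable, Dict, Tuple


class CustomerRateLimiter:
    """Token bucket per customer, shared across all of that customer's endpoints."""

    def __init__(
        self,
        rate_per_second: float,
        burst: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate_per_second
        self.burst = burst
        self.clock = clock
        # customer_id -> (tokens remaining, time of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, customer_id: str) -> bool:
        """Consume one token if available; otherwise the delivery must wait."""
        now = self.clock()
        tokens, last = self._buckets.get(customer_id, (float(self.burst), now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens < 1:
            self._buckets[customer_id] = (tokens, now)
            return False
        self._buckets[customer_id] = (tokens - 1, now)
        return True
```

Because the bucket is keyed by customer rather than by endpoint, retries for one failing endpoint draw from the same budget as every other delivery to that customer.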

Payload Size

Limit webhook payload size (e.g., 256KB). Large payloads can overwhelm consumers. For large data, include a reference URL instead.
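Enforcing the limit at serialization time could look like this. The `object_url` field and the `serialize_event` helper are hypothetical; the 256KB threshold comes from the example figure above.

```python
import json

MAX_PAYLOAD_BYTES = 256 * 1024


def serialize_event(event: dict, object_url: str) -> bytes:
    """Serialize the event, swapping the object for a URL if it is too large."""
    body = json.dumps(event, separators=(",", ":")).encode("utf-8")
    if len(body) <= MAX_PAYLOAD_BYTES:
        return body
    # Too large: keep the envelope and point the consumer at the full object.
    slim = dict(event, data={"object_url": object_url})
    return json.dumps(slim, separators=(",", ":")).encode("utf-8")
```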

7. Developer Experience

Webhook Dashboard

Provide a UI where customers can:

  • View delivery attempts (success/failure/pending)
  • See request and response bodies
  • Manually replay failed events
  • Test with sample events
  • Manage endpoint URLs and signing secrets

CLI Testing

Provide a CLI tool for local webhook testing:

your-cli webhooks listen --port 3000

This creates a tunnel so developers can receive webhooks on localhost during development.

Event Catalog

Document every event type with example payloads, when they fire, and what data they include.

8. Common Mistakes

| Mistake                   | Impact                           | Fix                              |
|---------------------------|----------------------------------|----------------------------------|
| No signatures             | Endpoint spoofing                | HMAC-SHA256 every payload        |
| No retries                | Silent data loss                 | Exponential backoff, 5+ attempts |
| Synchronous delivery      | Application slowdown             | Queue-based async delivery       |
| No event IDs              | Duplicate processing             | Unique ID per event              |
| Following redirects       | SSRF vulnerability               | Treat 3xx as failure             |
| No timeout                | Worker threads stuck             | 30-second timeout                |
| Partial object in payload | Consumer needs follow-up API call | Include full object state       |

Building webhook infrastructure? Explore Svix, Hookdeck, Convoy, and more webhook tools on APIScout — comparisons, guides, and developer resources.
