Webhook Patterns

A webhook is an HTTP callback. The producer (Stripe, GitHub, your application) sends an HTTP POST to a consumer-provided URL when something happens. The consumer responds with HTTP 2xx to acknowledge.

Webhooks are the standard pattern for server-to-server event notifications. The mechanics are simple; the operational details — delivery guarantees, retries, signatures — are where the complexity lives.

This page is about the patterns from both sides: designing webhooks that work, and consuming webhooks reliably.

Delivery guarantees

Webhooks are at-least-once. The producer sends; if delivery fails (timeout, 5xx), the producer retries. Eventually most events are delivered; sometimes events are delivered multiple times.

Consumers must handle:

- Duplicate deliveries (idempotency)

- Out-of-order arrivals

- Eventual consistency (events may be delayed)

Senders should not assume:

- Delivery order

- Exactly-once delivery

- Synchronous processing

Retry strategy

Standard retry approach for senders:

- Initial delivery: immediate

- 5xx response: retry with exponential backoff

- 4xx response: don't retry (consumer error; not transient)

- Timeout: retry like 5xx

Backoff schedule example: 1m, 5m, 15m, 1h, 6h, 24h, 48h, 72h. After ~3 days of retries, mark as failed.

Document the retry policy. Consumers need to know how long to expect retries to continue.

Signature verification

Webhooks must be signed; consumers must verify. Without signatures, anyone can POST to the webhook URL and trigger consumer logic.

Standard pattern (Stripe, GitHub, Slack):

```

1. Producer creates HMAC-SHA256 of payload using shared secret

2. Producer includes signature in HTTP header

3. Consumer recomputes HMAC and verifies equality

```

```http

POST /webhooks/orders

X-Signature: sha256=<hex-encoded-hmac>

Content-Type: application/json

{ "event": "order.shipped", ... }

```

Consumer code:

```python

expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()

if not hmac.compare_digest(expected, received):

return 401

```

`compare_digest` is constant-time; `==` is timing-attack-vulnerable.

Replay attacks

Even with signatures, an attacker who captures a valid webhook can replay it. Defenses:

- **Timestamp + tolerance**: include timestamp in signature; reject events older than ~5 minutes

- **Nonce tracking**: server stores recent event IDs; rejects duplicates within window

Stripe's pattern: signature includes timestamp; consumer rejects if `|now - timestamp| > 300s`.

Consumer requirements

Reliable webhook consumers need:

Quick acknowledgment

Respond with 2xx as soon as the webhook is received and stored — typically under 5 seconds. Producers often timeout faster than that.

```python

def handle_webhook():

verify_signature()

enqueue_for_processing() # async work

return 200 # ack now

```

Don't do the actual work synchronously; enqueue and process asynchronously. Otherwise slow processing causes producer retries (and duplicate work).

Idempotency

Same event delivered twice should not double-process. Use the event ID:

```python

if not seen_event_ids.exists(event_id):

seen_event_ids.add(event_id, ttl=7days)

process(event)

```

See [IdempotencyPatterns](IdempotencyPatterns).

Persistent queue

Don't process events directly from the HTTP handler. Persist to a queue (database, Kafka, SQS) and process from there. If processing fails, the event is preserved for retry.

Logging and monitoring

- Log every received event with its ID

- Alert on delivery failures (4xx responses, parsing errors)

- Track processing latency

Disaster recovery

If your webhook endpoint is down, events are eventually lost (after producer retry exhaustion). Mitigations:

- High availability (multiple endpoint instances behind load balancer)

- Event log endpoint on the producer side ("get all events since X") for backfill

- Manual replay tools

Producer responsibilities

Idempotency keys

Even with at-least-once delivery, include an event ID:

```json

{

"event_id": "evt_8d4f...",

"event_type": "order.shipped",

"data": {...}

}

```

Consumers use the event ID for deduplication.

Event ordering

Within a single resource (one order's events), events should be sequential. Events to different resources can be parallel.

If strict ordering matters, include a sequence number or rely on retry-with-backoff to maintain order.

Webhook management

Provide a UI or API for:

- Subscribing to events (which event types)

- Configuring endpoint URLs

- Viewing delivery history

- Manually retrying failed deliveries

- Disabling/re-enabling a webhook

The Stripe/GitHub-style management UI sets the bar.

Versioning

Events may evolve. Version them:

- Include a `version` field in payload

- Document what each version means

- Maintain compatibility for old subscribers

Or: namespace event types per major version.

Common patterns to avoid

- **Synchronous processing in the handler.** Causes producer retries.

- **No signature verification.** Anyone can POST.

- **No idempotency.** Duplicate processing.

- **Trusting timestamps from the client without signing them.** Attacker forges them.

- **Single event per webhook.** Sometimes batching events is more efficient.

- **No delivery history.** Consumers can't audit; producers can't debug.

Common failure patterns

- **Slow consumer.** Producer retries; duplicates pile up.

- **No replay protection.** Captured webhooks can be replayed.

- **Drop on signature mismatch silently.** Returns 401 to producer; producer retries.

- **No webhook documentation.** Consumers can't anticipate failure modes.

- **Brittle event schema.** Adding fields breaks consumers; document forward-compatibility.

Further Reading

- [IdempotencyPatterns](IdempotencyPatterns) — Idempotency for retries

- [ApiProtocolComparison](ApiProtocolComparison) — Where webhooks fit

- [ServerSentEventsPatterns](ServerSentEventsPatterns) — Pull-based alternative

- [WebSocketPatterns](WebSocketPatterns) — Bidirectional alternative

- [WebServicesAndApis Hub](WebServicesAndApisHub) — Cluster index