Circuit Breaker
Hook Mesh's circuit breaker automatically detects consistently failing endpoints and temporarily pauses deliveries to prevent wasted retry attempts. When the endpoint recovers, deliveries automatically resume.
How It Works
The circuit breaker monitors endpoint health and takes action when it detects persistent failures. This prevents overwhelming already-failing endpoints and conserves retry budget for transient failures.
Circuit States
- CLOSED (Normal): Endpoint is healthy. All webhook jobs are delivered normally with standard retry logic.
- OPEN (Failing): Endpoint is consistently failing. New webhook deliveries are paused. A test delivery is attempted every 5 minutes.
- HALF-OPEN (Testing): Circuit is testing if the endpoint has recovered. One test delivery is sent to check health.
When the Circuit Opens
The circuit opens (deliveries pause) when Hook Mesh detects consistent failure patterns indicating the endpoint is down or misconfigured.
| Trigger Condition | Threshold | Window |
|---|---|---|
| Consecutive Failures | 5+ failed deliveries | In a row (no successes) |
| Failure Rate | ≥50% failures | Last 10 deliveries |
Failure Definition
A delivery is considered "failed" if it returns a 5xx error, times out, or has a connection error. 4xx errors (except 429) do NOT trigger the circuit breaker since they indicate client-side issues, not endpoint downtime.
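The two trigger conditions and the failure definition above can be sketched in a few lines of Python. This is an illustration, not Hook Mesh's implementation: `is_failure`, `FailureTracker`, and the constants are hypothetical names. Treating 429 as a failure is one reading of "except 429", and applying the failure-rate rule only once 10 deliveries have been recorded is an assumption.

```python
from collections import deque
from typing import Optional

CONSECUTIVE_FAILURE_THRESHOLD = 5  # 5+ failed deliveries in a row
FAILURE_RATE_THRESHOLD = 0.5       # >= 50% failures
WINDOW_SIZE = 10                   # over the last 10 deliveries


def is_failure(status_code: Optional[int]) -> bool:
    """Return True if a delivery counts as failed for the circuit breaker.

    status_code is None when no HTTP response arrived (timeout or
    connection error).
    """
    if status_code is None:
        return True
    if status_code == 429:
        # Assumption: "4xx (except 429)" means 429 does count as a failure.
        return True
    return 500 <= status_code <= 599


class FailureTracker:
    """Hypothetical helper that tracks recent outcomes for one endpoint."""

    def __init__(self) -> None:
        self.consecutive_failures = 0
        self.window = deque(maxlen=WINDOW_SIZE)  # True = failed delivery

    def record(self, status_code: Optional[int]) -> None:
        failed = is_failure(status_code)
        self.window.append(failed)
        self.consecutive_failures = self.consecutive_failures + 1 if failed else 0

    def should_open(self) -> bool:
        if self.consecutive_failures >= CONSECUTIVE_FAILURE_THRESHOLD:
            return True
        # Assumption: the rate rule is evaluated only on a full window.
        if len(self.window) == WINDOW_SIZE:
            return sum(self.window) / WINDOW_SIZE >= FAILURE_RATE_THRESHOLD
        return False


tracker = FailureTracker()
for code in [500, 503, None, None, 500]:  # None = timeout or connection error
    tracker.record(code)
print(tracker.should_open())  # True: five failures in a row
```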
Delivery 1: 500 Internal Server Error → Failed
Delivery 2: 503 Service Unavailable → Failed
Delivery 3: Timeout (30s) → Failed
Delivery 4: Connection Refused → Failed
Delivery 5: 500 Internal Server Error → Failed
→ Circuit Opens: Endpoint paused
→ Status changed to "paused" (circuit_breaker_open)
→ Customer notified via webhook (endpoint.circuit_breaker_opened)
Circuit Lifecycle
When the circuit opens, Hook Mesh automatically tests the endpoint periodically to detect when it recovers.
          ┌──────────────┐
          │              │
┌────────▶│    CLOSED    │◀────────┐
│         │  (Healthy)   │         │
│         │              │         │
│         └───────┬──────┘         │
│                 │                │
│      5+ failures│                │ Test succeeds
│                 ▼                │
│         ┌──────────────┐         │
│         │     OPEN     │         │
│         │   (Paused)   │─────────┤
│         │              │         │
│         └───────┬──────┘         │
│                 │                │
│    After 5 min  │                │
│                 ▼                │
│         ┌──────────────┐         │
│         │  HALF-OPEN   │         │
└─────────│  (Testing)   │─────────┘
          │              │
          └──────────────┘
State Details:
• CLOSED: Normal delivery with retries
• OPEN: All new jobs queued, test every 5 minutes
• HALF-OPEN: Send one test delivery
├─ Success → Return to CLOSED
└─ Failure → Return to OPEN (wait 5 min)
Recovery Process
Circuit Opens
When a trigger condition is met (for example, 5 consecutive failures), the circuit opens and the endpoint status changes to paused.
Jobs Queue Up
New webhook jobs for this endpoint are queued (not delivered) while the circuit is open.
Test After 5 Minutes
Hook Mesh waits 5 minutes, then sends a test delivery to check if the endpoint has recovered.
Test Result
If the test succeeds (2xx response), the circuit closes and queued jobs are delivered. If it fails, the circuit returns to OPEN and Hook Mesh waits another 5 minutes before sending the next test delivery.
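The lifecycle described above is a small three-state machine. The sketch below is illustrative only: `CircuitBreaker`, its method names, and the explicit `now` argument (seconds, passed in so the logic is easy to test) are hypothetical and not part of any Hook Mesh API.

```python
from enum import Enum
from typing import Optional

TEST_INTERVAL_SECONDS = 5 * 60  # wait 5 minutes before each test delivery


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class CircuitBreaker:
    def __init__(self) -> None:
        self.state = State.CLOSED
        self.opened_at: Optional[float] = None  # when the circuit last entered OPEN

    def trip(self, now: float) -> None:
        """CLOSED -> OPEN once a trigger condition is met."""
        self.state = State.OPEN
        self.opened_at = now

    def maybe_start_test(self, now: float) -> bool:
        """OPEN -> HALF_OPEN after the 5-minute wait; True if a test should be sent."""
        if self.state is State.OPEN and now - self.opened_at >= TEST_INTERVAL_SECONDS:
            self.state = State.HALF_OPEN
            return True
        return False

    def record_test_result(self, success: bool, now: float) -> None:
        """HALF_OPEN -> CLOSED on success, back to OPEN on failure."""
        if self.state is not State.HALF_OPEN:
            raise RuntimeError("no test delivery in progress")
        if success:
            self.state = State.CLOSED
            self.opened_at = None
        else:
            self.trip(now)  # restart the 5-minute wait
```

A failed test restarts the wait from the time of the failure, which matches the "wait another 5 minutes" step; whether the real timer is measured from the test's start or its completion is not stated here.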
What Happens to Webhook Jobs
When the circuit is open, webhook jobs are not dropped immediately. They are queued and delivered once the endpoint recovers, subject to the 48-hour delivery window described below.
| Scenario | Behavior |
|---|---|
| Jobs created before circuit opened | Continue retry schedule (will eventually be discarded after 48h if endpoint doesn't recover) |
| Jobs created while circuit is open | Queued with status pending. Delivered when circuit closes. |
| Circuit closes (endpoint recovers) | All queued jobs are delivered immediately (with rate limiting to avoid overwhelming the endpoint) |
48-Hour Delivery Window
Jobs are still subject to the 48-hour delivery window. If an endpoint is down for more than 48 hours, jobs created during that time will be discarded after 48 hours even if still queued.
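To make the window concrete, here is a minimal sketch of how queued jobs could be split into deliverable and expired sets when the circuit closes. It assumes the 48 hours are measured from each job's creation time. The `created_at` field and the `split_expired` helper are hypothetical.

```python
from datetime import datetime, timedelta, timezone

DELIVERY_WINDOW = timedelta(hours=48)


def split_expired(jobs, now):
    """Partition queued jobs into (deliverable, expired) by age."""
    deliverable, expired = [], []
    for job in jobs:
        # Assumption: a job expires once it is older than 48 hours.
        if now - job["created_at"] > DELIVERY_WINDOW:
            expired.append(job)
        else:
            deliverable.append(job)
    return deliverable, expired


now = datetime(2024, 1, 3, 12, 0, tzinfo=timezone.utc)
queued = [
    {"id": "job_old", "created_at": now - timedelta(hours=49)},
    {"id": "job_recent", "created_at": now - timedelta(hours=2)},
]
deliverable, expired = split_expired(queued, now)
print([job["id"] for job in deliverable])  # ['job_recent']
print([job["id"] for job in expired])      # ['job_old']
```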
Manual Circuit Breaker Reset
If you know an endpoint has recovered (e.g., after a deployment), you can manually close the circuit via the API instead of waiting for the automatic test.
import fetch from 'node-fetch';

const endpointId = 'ep_abc123';

// Ask Hook Mesh to close the circuit immediately
const response = await fetch(
  `https://api.hookmesh.com/v1/endpoints/${endpointId}/circuit-breaker/reset`,
  {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.HOOKMESH_API_KEY}`
    }
  }
);

if (response.ok) {
  console.log('Circuit breaker reset - deliveries will resume');
} else {
  console.error('Failed to reset circuit breaker');
}

import os
import requests

endpoint_id = 'ep_abc123'

# Ask Hook Mesh to close the circuit immediately
response = requests.post(
    f'https://api.hookmesh.com/v1/endpoints/{endpoint_id}/circuit-breaker/reset',
    headers={'Authorization': f'Bearer {os.environ["HOOKMESH_API_KEY"]}'},
    timeout=10,
)

if response.ok:
    print('Circuit breaker reset - deliveries will resume')
else:
    print('Failed to reset circuit breaker')

When to Reset Manually
Manual reset is useful after deployments, configuration fixes, or when you've verified the endpoint is healthy. The circuit will immediately close and queued jobs will be delivered.
Monitoring Circuit State
Track circuit breaker state via the Endpoints API and webhook events.
import fetch from 'node-fetch';

const endpointId = 'ep_abc123';

// Fetch the endpoint, including its circuit breaker fields
const response = await fetch(
  `https://api.hookmesh.com/v1/endpoints/${endpointId}`,
  {
    headers: {
      'Authorization': `Bearer ${process.env.HOOKMESH_API_KEY}`
    }
  }
);
const endpoint = await response.json();

console.log('Status:', endpoint.status);
console.log('Circuit state:', endpoint.circuit_breaker_state);
console.log('Last failure:', endpoint.last_failure_at);
console.log('Consecutive failures:', endpoint.consecutive_failures);

// Alert if circuit is open
if (endpoint.circuit_breaker_state === 'open') {
  console.warn(`⚠ Endpoint ${endpoint.url} has open circuit!`);
  console.warn(`Failing since: ${endpoint.last_failure_at}`);
}
Webhook Events
Hook Mesh can send webhook events to notify you of circuit state changes.
| Event | Description |
|---|---|
| endpoint.circuit_breaker_opened | Circuit opened due to failures. Deliveries paused. |
| endpoint.circuit_breaker_closed | Circuit closed after successful test. Deliveries resumed. |
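A receiver for these events might route them to alerting as sketched below. The event names come from the table above, but the payload shape (a top-level `type` field and a `data.endpoint_id` field) is an assumption made for illustration, so check the actual event schema before relying on it.

```python
def handle_circuit_event(event: dict) -> str:
    """Turn a circuit breaker event into a message for an alerting channel.

    Assumed (hypothetical) payload shape:
    {"type": "...", "data": {"endpoint_id": "..."}}
    """
    event_type = event.get("type")
    endpoint_id = event.get("data", {}).get("endpoint_id", "unknown")

    if event_type == "endpoint.circuit_breaker_opened":
        return f"ALERT: circuit opened for {endpoint_id}; deliveries paused"
    if event_type == "endpoint.circuit_breaker_closed":
        return f"RESOLVED: circuit closed for {endpoint_id}; deliveries resumed"
    return f"ignored event type: {event_type}"


print(handle_circuit_event({
    "type": "endpoint.circuit_breaker_opened",
    "data": {"endpoint_id": "ep_abc123"},
}))
```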
Benefits
- ✓ Prevents wasted retries - Retry attempts are not spent on endpoints that are clearly down
- ✓ Protects failing endpoints - Prevents overwhelming endpoints during outages or deployments
- ✓ Automatic recovery - No manual intervention needed; deliveries resume when the endpoint recovers
- ✓ Preserves delivery guarantees - Jobs are queued rather than discarded while the endpoint is unhealthy (within the 48-hour delivery window)
- ✓ Improves overall reliability - The system adapts to endpoint health without manual intervention
Best Practices
- ✓ Monitor circuit state - Set up alerts for endpoint.circuit_breaker_opened events
- ✓ Fix root causes quickly - Investigate why endpoints are failing to prevent circuit breaker triggers
- ✓ Use manual reset after deployments - Don't wait 5 minutes if you know the endpoint is fixed
- ✓ Return proper status codes - 5xx triggers the circuit breaker; 4xx (except 429) doesn't. Use appropriate codes for errors.
- ✓ Test endpoint health - Use the test webhook endpoint to verify health before deploying changes
- ✓ Implement graceful degradation - Handle webhook delivery failures gracefully in your application
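As an illustration of the status-code advice, the sketch below shows one way a receiving application could map its own error types to response codes so that only genuine outages count toward the circuit breaker. The exception classes and `status_for` are hypothetical names, not part of Hook Mesh.

```python
from typing import Optional


class InvalidPayload(Exception):
    """The webhook body failed validation; retrying will not help."""


class DownstreamUnavailable(Exception):
    """A dependency (database, queue) is temporarily down."""


def status_for(error: Optional[Exception]) -> int:
    """Pick the HTTP status a webhook receiver should return."""
    if error is None:
        return 200  # processed successfully
    if isinstance(error, InvalidPayload):
        return 400  # client-side issue: does not count toward the circuit breaker
    if isinstance(error, DownstreamUnavailable):
        return 503  # temporary outage: counts as a failed delivery
    return 500      # unexpected error: counts as a failed delivery


print(status_for(None), status_for(InvalidPayload()), status_for(DownstreamUnavailable()))
```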
Related Documentation
Retry Strategy →
Learn about exponential backoff and how retries work with the circuit breaker
Endpoints API →
Manage endpoints and reset circuit breakers via the API
Delivery Guarantees →
Understand how circuit breakers maintain at-least-once delivery guarantees
Monitoring →
Set up monitoring and alerts for circuit breaker events