Hook Mesh Engineering

Debugging Webhooks in Production: A Systematic Approach

Learn how to debug webhook issues in production with a systematic approach covering signature failures, timeouts, parsing errors, and more. Includes practical tools, real examples, and step-by-step checklists.

Webhooks fail silently. The issues surface as missing data, delayed notifications, and confused customers. Debugging failures you cannot see requires a systematic approach and the right tools.

This guide provides a practical framework for diagnosing and resolving webhook issues in production, whether you are sending webhooks to customers or receiving them from third parties.

Common Issues

Signature Failures

The most common source of integration failures. They typically show up as 401 or 403 responses.

Causes:

  • Incorrect webhook secret
  • Payload transformation by middleware
  • Encoding mismatches
  • Clock skew in timestamp signatures
  • Secret rotation without updating receiver
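
Clock skew is the least visible cause in the list above. Where a provider signs a timestamp alongside the body, receivers usually reject requests that are too old. The sketch below shows one way such a check might look. The helper name, the Unix-seconds format, and the 300-second tolerance are all assumptions for illustration, not any particular provider's rule.

```javascript
// Hypothetical helper: the Unix-seconds format and the 300-second tolerance
// are assumptions; check your provider's documentation for the real values.
function isTimestampFresh(timestampSeconds, nowMs = Date.now(), toleranceSeconds = 300) {
  const ts = Number(timestampSeconds);
  if (!Number.isFinite(ts)) return false; // missing or malformed timestamp
  const skewSeconds = Math.abs(nowMs / 1000 - ts);
  return skewSeconds <= toleranceSeconds;
}
```

Logging the computed skew when this returns false makes it easy to tell a drifting server clock from a genuinely stale or replayed request.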

Timeouts

Most providers expect a response within a fixed window, typically 5 to 30 seconds.

Causes:

  • Synchronous processing of complex logic
  • Database queries blocking response
  • External API calls in handler
  • Cold starts in serverless
  • Resource exhaustion
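
Most of the causes above share one remedy: acknowledge the request first and do the slow work afterwards. The sketch below illustrates the pattern with an in-memory array standing in for a real queue. The names handleWebhook and drainQueue are hypothetical, and a production system would more likely use a durable queue so events survive a restart.

```javascript
// In-memory queue for illustration only (assumption): a real deployment
// would likely use a durable queue so events survive a restart.
const pending = [];

// Hypothetical handler: store the raw event and acknowledge immediately.
function handleWebhook(rawBody) {
  pending.push(rawBody);
  return { status: 202 };
}

// Drains the queue outside the request/response cycle.
// Returns the number of events processed without error.
async function drainQueue(processEvent) {
  let processed = 0;
  while (pending.length > 0) {
    const rawBody = pending.shift();
    try {
      await processEvent(JSON.parse(rawBody));
      processed += 1;
    } catch (err) {
      console.error('Webhook processing failed:', err.message);
    }
  }
  return processed;
}
```

With this split, the time the provider waits depends only on how long it takes to store the event, not on database queries or external API calls.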

Parsing Errors

The handler fails to parse the JSON payload, typically returning a 400.

Causes:

  • Content-Type mismatches
  • Special character encoding
  • Schema changes
  • Middleware consuming raw body
  • Buffer/string conversion issues
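
When parsing fails, it helps to know which of the causes above applies. The sketch below is a hypothetical helper (parseWebhookBody is not from any library) that checks the Content-Type, decodes the raw body explicitly as UTF-8, and reports why parsing failed instead of throwing. It assumes the provider sends application/json; vendor-specific JSON media types would need a looser check.

```javascript
// Hypothetical helper: parses a raw request body and reports why it failed.
// Assumes the provider sends Content-Type: application/json.
function parseWebhookBody(rawBody, contentType) {
  if (!/^application\/json\b/i.test(contentType || '')) {
    return { ok: false, error: `Unexpected Content-Type: ${contentType}` };
  }
  // Decode explicitly as UTF-8 so the Buffer/string conversion is not implicit.
  const text = Buffer.isBuffer(rawBody) ? rawBody.toString('utf8') : String(rawBody);
  try {
    return { ok: true, event: JSON.parse(text) };
  } catch (err) {
    return { ok: false, error: `Invalid JSON: ${err.message}` };
  }
}
```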

Missing Events

The hardest to debug, because there is no error to investigate.

Causes:

  • Events not enabled in provider dashboard
  • Endpoint URL typos or outdated URLs
  • Network connectivity (firewalls, DNS)
  • Provider-side filtering/rate limiting
  • Events queued but not delivered yet

Duplicates

Receiving the same webhook multiple times.

Causes:

  • Provider retries after a timeout, even though processing succeeded
  • Missing idempotency
  • Multiple endpoints registered
  • Network duplicate transmissions

Debugging Workflow

Step 1: Delivery Logs

Start with the source of truth: the provider's delivery logs. They show exactly what was sent, when, and what response was received.

Look for:

  • HTTP status codes from endpoint
  • Response times per attempt
  • Request headers and payload
  • Error messages in response
  • Retry attempts and outcomes

Filter by endpoint, time range, status, and event type to narrow down the issue.

Step 2: Verify Signatures

If you see 401 or 403 responses, signature verification is the likely culprit. Verify the implementation step by step:

const crypto = require('crypto');

// Debug signature verification by logging intermediate values
function debugSignatureVerification(payload, receivedSignature, secret) {
  console.log('Raw payload length:', payload.length);
  console.log('Raw payload (first 200 chars):', payload.substring(0, 200));
  console.log('Received signature:', receivedSignature);

  const expectedSignature = crypto
    .createHmac('sha256', secret)
    .update(payload, 'utf8')
    .digest('hex');

  console.log('Expected signature:', expectedSignature);
  console.log('Signatures match:', receivedSignature === expectedSignature);

  return receivedSignature === expectedSignature;
}

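Critical: verify against the raw body, not a parsed and re-serialized copy. Re-stringifying JSON can change whitespace, escaping, and in some cases key ordering, so the bytes no longer match what the provider signed.

// secret and verifySignature are assumed to be defined elsewhere

// WRONG
app.post('/webhooks', express.json(), (req, res) => {
  const signature = req.headers['x-webhook-signature'];
  const payload = JSON.stringify(req.body); // Re-serialized, so verification will fail
  verifySignature(payload, signature, secret);
});

// CORRECT
app.post('/webhooks', express.raw({ type: 'application/json' }), (req, res) => {
  const signature = req.headers['x-webhook-signature'];
  const payload = req.body.toString('utf8'); // req.body is a Buffer; original bytes preserved
  verifySignature(payload, signature, secret);
});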
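
The snippets above call verifySignature without defining it. One possible implementation is sketched below, assuming a hex-encoded HMAC-SHA256 of the raw body (the same scheme as the debug function earlier). Signature encoding and the exact signed content vary by provider, so treat this as a starting point. Unlike the debug version, it compares with crypto.timingSafeEqual rather than ===, which avoids leaking information through comparison timing.

```javascript
const crypto = require('crypto');

// Sketch only: assumes a hex-encoded HMAC-SHA256 of the raw body.
// Providers differ in encoding and in what exactly is signed.
function verifySignature(rawBody, receivedSignature, secret) {
  const expectedHex = crypto.createHmac('sha256', secret).update(rawBody).digest('hex');
  const expected = Buffer.from(expectedHex, 'utf8');
  const received = Buffer.from(String(receivedSignature).toLowerCase(), 'utf8');
  // timingSafeEqual throws if lengths differ, so check length first.
  if (received.length !== expected.length) return false;
  return crypto.timingSafeEqual(received, expected);
}
```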

Step 3: Test Locally

Isolate problems by testing captured payloads locally:

curl -X POST http://localhost:3000/webhooks \
  -H "Content-Type: application/json" \
  -H "X-Webhook-Signature: abc123..." \
  -d '{"event":"payment.completed","data":{"id":"pay_123"}}'

This takes the network out of the picture and lets you debug with breakpoints. Add verbose logging to trace the request through the handler.
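
A captured payload is only useful locally if its signature still verifies against your local secret. The sketch below builds a request for a captured payload and signs it with hex HMAC-SHA256, reusing the X-Webhook-Signature header from the curl example above. The function name and the signing scheme are assumptions; match them to whatever your handler actually verifies.

```javascript
const crypto = require('crypto');

// Builds a signed request for replaying a captured payload locally.
// The header name mirrors the curl example; the hex HMAC-SHA256 scheme
// is an assumption and may differ from your provider's.
function buildReplayRequest(rawPayload, secret) {
  const signature = crypto
    .createHmac('sha256', secret)
    .update(rawPayload, 'utf8')
    .digest('hex');
  return {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'X-Webhook-Signature': signature,
    },
    body: rawPayload, // send the captured bytes unchanged
  };
}

// Usage (Node 18+ provides a global fetch; WEBHOOK_SECRET is a hypothetical variable):
// fetch('http://localhost:3000/webhooks', buildReplayRequest(captured, process.env.WEBHOOK_SECRET));
```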

Step 4: Trace End-to-End

For production-only issues, add correlation IDs and structured logging. Log the trace ID, stage, timestamp, and event type at each step. This creates an audit trail showing exactly where processing fails.
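
As a rough illustration, a logger bound to a single delivery could look like the sketch below. The createTraceLogger name and the field names are hypothetical. The only requirement is that every stage emits the same trace ID so the entries can be joined later.

```javascript
const crypto = require('crypto');

// Returns a logger bound to one webhook delivery. Field names are illustrative.
// `write` defaults to console.log but can be any function that accepts a string.
function createTraceLogger(eventType, traceId = crypto.randomUUID(), write = console.log) {
  return function logStage(stage, extra = {}) {
    const entry = {
      traceId,
      stage,
      eventType,
      timestamp: new Date().toISOString(),
      ...extra,
    };
    write(JSON.stringify(entry)); // one JSON object per line
    return entry;
  };
}

// Usage:
// const log = createTraceLogger('payment.completed');
// log('received');
// log('signature_verified');
// log('processed', { durationMs: 42 });
```

Searching the logs for a single trace ID then shows the last stage a delivery reached before it failed.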

Debugging Tools

Testing Services

Webhook.site captures incoming requests, which makes it well suited to verifying that webhooks are sent correctly. RequestBin offers similar functionality plus team collaboration.

Request Inspection

ngrok exposes a local dev server to the internet:

npm run dev
ngrok http 3000
# Use generated URL as webhook endpoint

The web interface at http://localhost:4040 shows all requests and can replay them.

CLI Testing

Use curl to test endpoints under various scenarios. Generate signatures with openssl and send test payloads with the correct headers.

Replay

Replay failed webhooks for debugging and recovery, either individually or in bulk by time range.

Real Examples

Example 1: Intermittent Signature Failures

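Symptom: Roughly 10% of webhooks fail, seemingly at random.

Issue: Failures occurred when the payload contained Unicode characters. The handler used payload.length (string length in UTF-16 code units) instead of Buffer.byteLength(payload) (length in bytes) for Content-Length validation.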

Solution:

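// Before (wrong): string length counts UTF-16 code units, not bytes
if (payload.length !== parseInt(req.headers['content-length'], 10)) { }

// After (correct): compare the byte length with Content-Length
if (Buffer.byteLength(payload, 'utf8') !== parseInt(req.headers['content-length'], 10)) { }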
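
The mismatch is easy to reproduce with a single non-ASCII character. In the small illustration below, the sample payload is made up for the demonstration.

```javascript
// "\u00e9" is the character é, which UTF-8 encodes as two bytes.
const sample = '{"name":"Jos\u00e9"}';

console.log(sample.length);                      // 15 (UTF-16 code units)
console.log(Buffer.byteLength(sample, 'utf8'));  // 16 (bytes on the wire)
```

ASCII-only payloads give the same number for both, which is why the failures looked random.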

Example 2: Timeouts

Symptom: Consistent 30-second timeouts.

Issue: The database query for deduplication was taking 25+ seconds because the event_id column of processed_webhooks had no index.

Solution:

CREATE INDEX idx_webhooks_event_id ON processed_webhooks(event_id);
-- Also: move heavy processing to background queue

Example 3: Missing Events

Symptom: Payments complete, but no webhook is received.

Issue: The load balancer health check used the same path as the webhook endpoint and acknowledged requests before the application received them.

Solution:

location /health { return 200 'OK'; }
location /webhooks { proxy_pass http://app_server; }

Debugging Checklist

Initial Assessment

  • Check provider delivery logs for attempts
  • Identify HTTP status code
  • Note timestamp and failure frequency
  • All webhooks or specific types?

Signature Issues (401/403)

  • Secret matches provider and config
  • Raw body used (not parsed JSON)
  • No middleware payload transformation
  • Timestamp within acceptable window
  • Test known-good payload

Timeouts (5xx or no response)

  • Check app logs for slow operations
  • Database query performance
  • External API calls in handler
  • Adequate resources (memory, CPU, connections)
  • Consider background processing

Parsing Issues (400)

  • Content-Type header handling
  • Character encoding config
  • Recent provider schema changes
  • Isolated parsing test

Missing Events

  • Events enabled in dashboard
  • Endpoint URL correct and accessible
  • Firewall and network connectivity
  • Provider status page for incidents
  • webhook.site to isolate issues

Duplicates

  • Idempotency using event IDs
  • Timeout settings adequate
  • Multiple endpoint registrations
  • Deduplication logic working

Proactive Monitoring

Track total received, successfully processed, failed deliveries, and processing duration. Alert on volume drops, failure rate spikes, processing duration increases, and signature failures. This catches issues before customers notice.
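
To make the alert conditions concrete, the sketch below evaluates one window of counters against thresholds. The function name, the shape of the stats object, and the default thresholds are all hypothetical and should be tuned against your own baseline traffic.

```javascript
// Hypothetical thresholds and stats shape; tune against your own baseline.
function evaluateWebhookHealth(stats, thresholds = { maxFailureRate: 0.05, maxP95Ms: 5000 }) {
  const alerts = [];
  const failureRate = stats.received === 0 ? 0 : stats.failed / stats.received;

  if (stats.received < stats.expectedMinimum) alerts.push('volume_drop');
  if (failureRate > thresholds.maxFailureRate) alerts.push('failure_rate_spike');
  if (stats.p95DurationMs > thresholds.maxP95Ms) alerts.push('slow_processing');
  if (stats.signatureFailures > 0) alerts.push('signature_failures');

  return alerts; // an empty array means no alert condition was met
}
```

Run against a rolling window such as the last five minutes, a check like this flags a silent drop in volume that no error log would show.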

Conclusion

Debugging requires systematic investigation. Start with delivery logs, verify signatures step by step, test locally with captured payloads, and trace requests through the system.

Tools matter: ngrok for local testing, webhook.site for inspection, and replay functionality for recovery. Proper observability from the start makes debugging faster.

Start simple. Observe carefully. Debug systematically.
