Event Processing Architecture

How Doone Flow processes webhook events and workflows at scale using message queues and asynchronous workers

Overview

Doone Flow uses an asynchronous, queue-based architecture to process webhook events and workflows reliably at scale.

When a webhook arrives, the system immediately accepts it and queues it for processing. This means your API responds instantly while workflow execution happens in the background. This design provides several important benefits:

  • High Throughput: Webhooks are accepted instantly without waiting for execution to complete
  • Reliability: Failed executions are automatically retried with exponential backoff
  • Scalability: Multiple worker instances can process events in parallel
  • Resilience: Messages are persisted in the queue, preventing data loss during outages

Architecture Flow

  1. Webhook Request: An incoming event is received, the payload is validated (schema and security checks), and the API returns 200 OK to confirm the event was accepted.
  2. Message Queue: Events are stored until processed, with long polling, retry support, and delay scheduling. Messages that still fail after the maximum number of retries are moved to a Dead Letter Queue.
  3. Worker Process: Background workers poll the queue, normalize the data, prepare the workflow context, and execute the workflow.
  4. Workflow Execution: Actions run sequentially: SMS messages, CRM updates, webhook calls, and AI processing.
  5. Database & Notifications: Execution records are stored in the database, and email and Slack notifications are sent on failure.

Failed actions are automatically retried with exponential backoff.

This flow covers the complete path from webhook ingestion through workflow execution, including retry mechanisms and error handling.

Key Components

1. Webhook Handler

The webhook handler is the entry point for all incoming webhook requests. It performs several critical functions:

  • Authentication: Validates webhook security keys if configured
  • Payload Validation: Validates incoming payloads against the configured JSON schema (if a trigger schema is defined)
  • Organization Status: Checks if the organization is enabled before processing
  • Queue Dispatch: Immediately sends the event to the internal message queue and returns a 200 OK response

The handler does not execute workflows directly. This separation ensures the API remains responsive even when workflows take time to process.
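
A minimal sketch of this flow, assuming a FastAPI endpoint and an SQS-style queue via boto3. The verify_security_key helper, the omitted schema and organization checks, and the queue URL are illustrative placeholders, not the actual implementation:

```python
import json

import boto3
from fastapi import FastAPI, HTTPException, Request

app = FastAPI()
sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/events"  # placeholder

@app.post("/webhook/{path}")
async def receive_webhook(path: str, request: Request):
    payload = await request.json()

    # 1. Validate the security key, if one is configured (hypothetical helper)
    if not verify_security_key(path, request.headers.get("X-Webhook-Key")):
        raise HTTPException(status_code=401, detail="invalid security key")

    # 2. Validate the payload against the trigger schema and check that the
    #    organization is enabled (omitted here; both return a 400 on failure)

    # 3. Enqueue immediately: no workflow execution on the request path
    sqs.send_message(
        QueueUrl=QUEUE_URL,
        MessageBody=json.dumps({"path": path, "payload": payload}),
    )
    return {"status": "accepted"}
```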

2. Message Queue

A durable message queue acts as the broker between the webhook handler and worker processes. Key features:

  • Message Persistence: Events are stored in the queue until processed, preventing data loss
  • Visibility Timeout: Messages are hidden while being processed, preventing duplicate processing
  • Long Polling: Workers use long polling to reduce empty responses and improve efficiency
  • Delay Support: Messages can be delayed for scheduled actions
  • Receive Count Tracking: Tracks how many times a message has been received, enabling retry limits

The queue configuration includes a Dead Letter Queue (DLQ) that receives messages that fail after maximum retry attempts or are identified as permanent errors.
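
The features listed above map naturally onto SQS-style configuration. A hedged example of what such a setup could look like (queue names, timeouts, and the retry limit are illustrative values):

```python
import json

import boto3

sqs = boto3.client("sqs")

# Create the dead letter queue first and look up its ARN
dlq = sqs.create_queue(QueueName="doone-flow-events-dlq")
dlq_arn = sqs.get_queue_attributes(
    QueueUrl=dlq["QueueUrl"], AttributeNames=["QueueArn"]
)["Attributes"]["QueueArn"]

sqs.create_queue(
    QueueName="doone-flow-events",
    Attributes={
        "VisibilityTimeout": "300",             # hide in-flight messages for 5 minutes
        "ReceiveMessageWaitTimeSeconds": "20",  # enable long polling
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": dlq_arn,
            "maxReceiveCount": "5",             # move to the DLQ after 5 deliveries
        }),
    },
)
```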

3. Worker Process

The worker process is a long-running background service that continuously polls the queue and processes events. The worker handles:

  • Message Reception: Polls the message queue with long polling to receive events
  • Retry Management: Checks receive count and handles messages that exceed retry limits
  • Event Normalization: Normalizes trigger data using either schema-based or legacy normalization
  • Workflow Execution: Executes workflow actions sequentially
  • Error Handling: Distinguishes between transient and permanent errors
  • Delayed Events: Processes delayed events from the database for longer delays

Multiple worker instances can run in parallel, allowing horizontal scaling to handle increased load.
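
Putting these responsibilities together, a simplified worker loop might look like the following. The process_event and record_failed_execution helpers are hypothetical, and the error types are defined here purely for illustration:

```python
import json

import boto3

class TransientError(Exception): ...   # may succeed on retry
class PermanentError(Exception): ...   # will never succeed on retry

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/events"  # placeholder
MAX_ATTEMPTS = 5

def run_worker():
    while True:
        resp = sqs.receive_message(
            QueueUrl=QUEUE_URL,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=20,                        # long polling
            AttributeNames=["ApproximateReceiveCount"],
        )
        for msg in resp.get("Messages", []):
            event = json.loads(msg["Body"])
            attempts = int(msg["Attributes"]["ApproximateReceiveCount"])

            if attempts > MAX_ATTEMPTS:
                record_failed_execution(event, reason="max retries exceeded")
            else:
                try:
                    process_event(event)               # normalize + execute workflow
                except TransientError:
                    # Leave the message in the queue; the visibility timeout
                    # expires and the queue redelivers it (a retry).
                    continue
                except PermanentError as exc:
                    record_failed_execution(event, reason=str(exc))

            # Delete on success, permanent failure, or retry exhaustion
            sqs.delete_message(QueueUrl=QUEUE_URL,
                               ReceiptHandle=msg["ReceiptHandle"])
```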

4. Workflow Orchestrator

The workflow orchestrator coordinates the execution of workflow actions:

  • Action Execution: Executes actions in sequence (SMS, CRM updates, webhooks, AI steps, etc.)
  • Delay Handling: For delay actions, re-queues the event with a delay or stores in database for longer delays
  • Retry Logic: Implements exponential backoff for failed actions with automatic retry limits
  • Execution Tracking: Creates and updates workflow execution records in the database
  • Failure Notifications: Sends email and Slack notifications when workflows fail (if configured)
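
In outline, the orchestrator's control flow could be sketched like this. The action-handler names, the workflow/database/queue interfaces, and the delay threshold are assumptions for illustration, not the actual interfaces:

```python
# All handler functions and the db/queue interfaces are hypothetical.
def send_sms(action, ctx): ...        # e.g. via Twilio
def update_crm(action, ctx): ...      # e.g. Salesforce
def call_webhook(action, ctx): ...
def run_ai_step(action, ctx): ...

ACTION_HANDLERS = {
    "sms": send_sms,
    "crm_update": update_crm,
    "webhook": call_webhook,
    "ai": run_ai_step,
}

MAX_QUEUE_DELAY = 900                 # beyond this, store the event in the DB

def execute_workflow(workflow, context, db, queue):
    execution = db.create_execution(workflow_id=workflow.id, status="running")
    for index, action in enumerate(workflow.actions):   # strictly sequential
        if action.type == "delay":
            if action.seconds <= MAX_QUEUE_DELAY:
                queue.requeue(context, delay=action.seconds, resume_at=index + 1)
            else:
                db.store_delayed_event(context, resume_at=index + 1)
            return                    # execution resumes at the next action
        ACTION_HANDLERS[action.type](action, context)
    db.update_execution(execution.id, status="completed")
```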

Detailed Processing Flow

Step 1: Webhook Reception

  1. Webhook request arrives at /webhook/{path}
  2. Handler validates security key (if configured)
  3. Handler validates payload against trigger schema (if configured)
  4. Handler checks organization status (must be enabled)
  5. Event is serialized and sent to the internal message queue
  6. API immediately returns {"status": "accepted"}

Step 2: Queue Processing

  1. Worker polls the queue with long polling
  2. If a message is received, the worker checks how many times it has already been delivered
  3. If message exceeds maximum retry attempts, worker creates a failed execution record and deletes the message
  4. Otherwise, worker processes the event

Step 3: Event Normalization

  1. Worker retrieves workflow and trigger configuration from database
  2. If trigger schema is configured:
    • Normalizes payload using schema field mappings
    • Merges original payload fields as a fallback to ensure all required data is available (see the sketch after this list)
  3. If no schema (legacy):
    • Uses normalization service to extract and structure data fields
  4. If normalization fails, creates failed execution record and deletes message
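
A minimal, runnable sketch of the schema-based normalization in steps 2 and 3, assuming field mappings are dotted source paths keyed by target field name (the mapping format is an assumption):

```python
def normalize(payload: dict, field_mappings: dict[str, str]) -> dict:
    normalized = {}
    for target, source_path in field_mappings.items():
        value = payload
        for key in source_path.split("."):          # walk dotted paths
            value = value.get(key) if isinstance(value, dict) else None
        if value is not None:
            normalized[target] = value
    # Merge original fields as a fallback; mapped fields take precedence.
    return {**payload, **normalized}

# Example: map a provider-specific payload onto the trigger's canonical fields.
event = {"contact": {"phone": "+15550100"}, "source": "form"}
print(normalize(event, {"phone_number": "contact.phone"}))
# -> {'contact': {'phone': '+15550100'}, 'source': 'form', 'phone_number': '+15550100'}
```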

Step 4: Workflow Execution

  1. Worker creates workflow execution record in database
  2. Executes actions sequentially:
    • Delay Actions: Re-queues event with delay or stores in database for longer delays
    • SMS Actions: Sends SMS via Twilio
    • CRM Actions: Updates Salesforce or other CRM systems
    • Webhook Actions: Makes HTTP requests to external endpoints
    • AI Actions: Processes AI enrichment, generation, or decision steps
  3. For each action failure:
    • If within retry limit: Re-queues with exponential backoff
    • If exceeds retry limit: Sends to DLQ and marks action as failed
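
The re-queue-with-backoff step might be implemented along these lines, assuming an SQS-style queue (which caps DelaySeconds at 900) and an attempt counter carried in the message body; the base delay and message format are assumptions:

```python
import json

import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/events"  # placeholder

def requeue_with_backoff(event: dict, attempt: int):
    # 30s, 60s, 120s, ... capped at SQS's 900-second DelaySeconds maximum
    delay = min(30 * 2 ** attempt, 900)
    event["attempt"] = attempt + 1
    sqs.send_message(
        QueueUrl=QUEUE_URL,
        MessageBody=json.dumps(event),
        DelaySeconds=delay,
    )
```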

Step 5: Completion

  1. If workflow completes successfully, updates execution status to "completed"
  2. If workflow fails, updates execution status to "failed" and sends failure notifications (if configured)
  3. Worker deletes message from the queue
  4. Execution record is available in the UI for monitoring and debugging

Error Handling & Retry Logic

Permanent vs. Transient Errors

The system distinguishes between permanent errors (that won't succeed on retry) and transient errors (temporary failures that may succeed on retry):

Permanent Errors

  • Missing required fields in payload
  • Invalid UUID or malformed data
  • Invalid configuration or missing resources

These are immediately deleted from the queue and marked as failed.

Transient Errors

  • Network timeouts
  • Temporary service unavailability
  • Rate limiting
  • Database connection issues

These are retried with exponential backoff until the retry limit is reached.
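
One way to model this distinction in code; the exception taxonomy and classification rules below are illustrative assumptions, not the actual error handling:

```python
import requests

class PermanentError(Exception):
    """Will never succeed on retry: delete the message and mark it failed."""

class TransientError(Exception):
    """May succeed on retry: leave the message for redelivery."""

def classify(exc: Exception) -> Exception:
    if isinstance(exc, (KeyError, ValueError)):        # missing/malformed data
        return PermanentError(str(exc))
    if isinstance(exc, requests.HTTPError):
        status = exc.response.status_code
        if status == 429 or status >= 500:             # rate limit / outage
            return TransientError(str(exc))
        return PermanentError(str(exc))                # other 4xx: bad config
    if isinstance(exc, (requests.Timeout, requests.ConnectionError)):
        return TransientError(str(exc))                # network trouble
    return TransientError(str(exc))                    # default: retry to the limit
```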

Retry Strategy

  1. Action-Level Retries: Individual actions are automatically retried with exponential backoff until the retry limit is reached
  2. Message-Level Retries: Messages that aren't deleted within the visibility timeout become visible again and are automatically redelivered by the queue
  3. Max Receive Count: Messages that exceed the maximum receive count are considered failed and deleted
  4. Dead Letter Queue: Failed actions after max retries are sent to the DLQ for manual inspection

Scalability & Performance

Horizontal Scaling

Multiple worker instances can run simultaneously, each polling the same message queue. The queue automatically distributes messages across workers, enabling:

  • Parallel processing of multiple workflows
  • Increased throughput during peak loads
  • Fault tolerance (if one worker fails, others continue)

Performance Optimizations

  • Long Polling: Reduces empty responses and API calls
  • Immediate API Response: Webhooks return instantly without waiting for execution
  • Connection Pooling: Efficient database and integration connection management
  • Asynchronous Processing: All workflow execution happens off the request path

Monitoring & Observability

All workflow executions are tracked in the database, providing comprehensive observability:

  • Execution Records: Every workflow execution is stored with status, timestamps, and error messages
  • Action Execution Tracking: Individual action executions are tracked with results and retry counts
  • Raw Payload Storage: Original webhook payloads are stored for debugging and replay
  • Trigger Data: Normalized trigger data is stored for reference
  • Failed Execution Visibility: Failed executions appear in the UI for investigation
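
For illustration, an execution record might carry fields like these; the names are assumptions based on the capabilities listed above, not the actual schema:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

@dataclass
class WorkflowExecution:
    id: str
    workflow_id: str
    status: str                          # "running" | "completed" | "failed"
    raw_payload: dict                    # original webhook body, kept for replay
    trigger_data: dict                   # normalized trigger data
    error_message: Optional[str] = None
    started_at: Optional[datetime] = None
    completed_at: Optional[datetime] = None
    action_results: list = field(default_factory=list)  # per-action results + retry counts
```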

Best Practices

For Webhook Senders

  • Always check the HTTP status code (200 = accepted, 400 = validation error, 500 = queue error)
  • Implement retry logic for 500 errors (a retried request may succeed once the queue recovers), as in the sketch below
  • Keep payloads under 1MB (enforced limit)
  • Use webhook security keys for production workflows
  • Validate payload structure matches your trigger schema
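
A sender-side sketch that follows these practices; the URL and header name are placeholders:

```python
import time

import requests

def send_event(payload: dict, max_attempts: int = 3) -> bool:
    url = "https://api.example.com/webhook/my-trigger"   # placeholder
    headers = {"X-Webhook-Key": "..."}                   # if a key is configured
    for attempt in range(max_attempts):
        resp = requests.post(url, json=payload, headers=headers, timeout=10)
        if resp.status_code == 200:      # accepted and queued
            return True
        if resp.status_code == 400:      # validation error: fix it, don't retry
            raise ValueError(resp.text)
        time.sleep(2 ** attempt)         # 500 (queue error): retry with backoff
    return False
```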
