When to Mix SQS FIFO and Standard Queues in an Agent Pipeline

7 min read · 1,400 words

Muhammad Mudassir
Founder & CEO, Cognilium AI

TL;DR

FIFO for chunk ordering, Standard for parallel analysis fan-out. Why a single queue type for the whole pipeline is the wrong default, with the dead-letter and retry settings that make the split work.
Tags: SQS FIFO · SQS Standard · queue topology · message group ID · dead letter queue · agent fan-out · ECS Fargate · Lambda triggers

An agent pipeline with multiple stages tends to default to one queue type for the whole topology. That works in tutorials and breaks at production scale. The pipeline described here has three queue boundaries and makes a different choice at each — and the reasoning is worth writing down, because most teams hit this decision and pick the wrong default.

Stage 1: chunk extraction (FIFO)

A document is split into 50–100 chunks. The assembler downstream stitches them into a structured contract analysis. Chunks must arrive in order — chunk 7 cannot land before chunk 6, otherwise the assembler either reorders (expensive) or skips and waits (slow). FIFO with message group ID = job_id keeps order inside one job while different jobs run in parallel. A FIFO queue handles 300 messages/sec without batching and 3,000 with (high-throughput mode raises the per-queue ceiling further) — enough for a job that emits 100 chunks in a couple of seconds.
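As a minimal producer-side sketch (queue URL, payload fields, and function name are placeholders, not the pipeline's actual code), this is how the group and deduplication IDs for the chunk queue could be built:

```python
import hashlib
import json

def fifo_chunk_message(job_id: str, chunk_index: int, payload: dict) -> dict:
    """Build send_message kwargs for the FIFO chunk queue.

    MessageGroupId = job_id keeps chunks ordered within one job while
    letting different jobs drain in parallel. The deduplication ID is
    derived from (job_id, chunk_index), so a retrying producer cannot
    enqueue the same chunk twice within SQS's 5-minute dedup window.
    """
    return {
        "MessageBody": json.dumps(
            {"job_id": job_id, "chunk_index": chunk_index, **payload}
        ),
        "MessageGroupId": job_id,
        "MessageDeduplicationId": hashlib.sha256(
            f"{job_id}:{chunk_index}".encode()
        ).hexdigest(),
    }

# With boto3, this dict would be splatted into the real call:
#   sqs.send_message(QueueUrl=CHUNK_QUEUE_URL, **fifo_chunk_message(...))
msg = fifo_chunk_message("job-42", 7, {"text": "7.1 Limitation of liability ..."})
```

Deriving the dedup ID from content identity rather than letting SQS hash the body means a retried send with a slightly different serialization still deduplicates.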

Stage 2: analysis fan-out (Standard)

Each chunk fans out to 11 specialist analyst agents. There is no ordering relationship — bias, severity, ambiguity, party, and the rest can finish in any order; the merger just collects them. FIFO here would force per-group serialization and tank parallelism. Standard queue with at-least-once delivery + idempotent worker is the right call.
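A fan-out sketch under the same caveat — the function is hypothetical, and only four of the eleven analyst names appear in this post, so the rest are placeholders. It builds `send_message_batch` entries for the Standard queue, respecting SQS's 10-entries-per-batch limit:

```python
import json

def fan_out_entries(chunk_id: str, chunk_text: str, agents: list[str]) -> list[list[dict]]:
    """Build send_message_batch entries for the Standard analysis queue,
    split into SQS's 10-entries-per-batch limit. No MessageGroupId:
    the analyst messages can be delivered and processed in any order.
    """
    entries = [
        {
            "Id": f"{chunk_id}-{agent}",  # batch-local ID; also a handy trace key
            "MessageBody": json.dumps(
                {"chunk_id": chunk_id, "agent": agent, "text": chunk_text}
            ),
        }
        for agent in agents
    ]
    # Slice into batches of at most 10 entries.
    return [entries[i : i + 10] for i in range(0, len(entries), 10)]

# Four analyst names come from the post; the remaining seven are placeholders.
AGENTS = ["bias", "severity", "ambiguity", "party"] + [f"analyst-{i}" for i in range(5, 12)]
batches = fan_out_entries("chunk-0007", "...", AGENTS)
# Each batch maps to one sqs.send_message_batch(QueueUrl=..., Entries=batch) call.
```

Because delivery is at-least-once, the worker on the other end must be idempotent — which is exactly what the merge stage below relies on.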

Stage 3: result assembly (Standard with deduplication)

Results from 11 agents per chunk merge back. Standard works because the merge step is idempotent — writing the same agent result twice produces the same output. The trick: each merge writes to DynamoDB with a conditional update on (chunk_id, agent_id). Duplicate deliveries hit the conditional and short-circuit.
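One way to express that conditional write, assuming a table keyed on (chunk_id, agent_id) — the table name and the builder function are illustrative, not the pipeline's actual schema:

```python
import json

TABLE = "analysis_results"  # hypothetical table, partition key chunk_id, sort key agent_id

def merge_put_kwargs(chunk_id: str, agent_id: str, result: dict) -> dict:
    """PutItem kwargs whose condition turns duplicate deliveries into no-ops.

    The first write for a (chunk_id, agent_id) pair succeeds; any redelivery
    fails the condition, and the worker treats ConditionalCheckFailedException
    as success and deletes the message.
    """
    return {
        "TableName": TABLE,
        "Item": {
            "chunk_id": {"S": chunk_id},
            "agent_id": {"S": agent_id},
            "result": {"S": json.dumps(result)},
        },
        # If the key attribute is absent, no item with this key exists yet.
        "ConditionExpression": "attribute_not_exists(chunk_id)",
    }

# Worker side (boto3, not executed here):
#   try:
#       dynamodb.put_item(**merge_put_kwargs(cid, aid, result))
#   except dynamodb.exceptions.ConditionalCheckFailedException:
#       pass  # duplicate delivery — already merged, safe to ack the message
kwargs = merge_put_kwargs("chunk-0007", "bias", {"score": 0.2})
```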

Dead-letter settings per queue

  • FIFO chunk queue: maxReceiveCount = 3, DLQ retention 14 days. A wedged chunk blocks its group — fail fast and alert.
  • Standard analysis queue: maxReceiveCount = 10, retention 14 days. Leaf-level retries are cheap, model-side rate limits resolve naturally.
  • Standard merge queue: maxReceiveCount = 5, retention 7 days. Failures here usually mean a chunk_id has been deleted from DynamoDB — short DLQ, fast triage.
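The three settings above can be sketched as `set_queue_attributes` inputs — ARNs and helper names are placeholders; note that SQS encodes the redrive policy as a JSON string on the source queue, while retention is set on the DLQ itself:

```python
import json

DAY = 86_400  # seconds

def source_queue_attrs(dlq_arn: str, max_receive: int) -> dict:
    """Attributes for set_queue_attributes on the source queue. SQS moves a
    message to the DLQ after max_receive failed receives."""
    return {
        "RedrivePolicy": json.dumps(
            {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": str(max_receive)}
        )
    }

def dlq_attrs(retention_days: int) -> dict:
    """Attributes for the DLQ itself; retention is expressed in seconds."""
    return {"MessageRetentionPeriod": str(retention_days * DAY)}

# ARNs are placeholders; the counts and retention mirror the settings above.
chunk_q    = source_queue_attrs("arn:aws:sqs:us-east-1:123456789012:chunk-dlq.fifo", 3)
analysis_q = source_queue_attrs("arn:aws:sqs:us-east-1:123456789012:analysis-dlq", 10)
merge_q    = source_queue_attrs("arn:aws:sqs:us-east-1:123456789012:merge-dlq", 5)
```

A FIFO source queue requires a FIFO DLQ (and vice versa), which is why the chunk queue's DLQ also carries the `.fifo` suffix.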

When to ignore this advice

Pipelines with fewer than ~50 messages per request rarely justify the topology split — the operational overhead of three queue types and three DLQ alerts costs more than reordering at the assembler. The split earns its keep when one job emits 1,000+ messages and a single bad message must not block the rest.

What we measured

  • 22 chunks per contract → 22 FIFO messages → ~5 sec to drain
  • 22 chunks × 11 agents = 242 Standard messages → ~25 sec parallel processing
  • DLQ rate steady-state: <0.1% of messages
  • P95 end-to-end: 154 sec from upload to assembled report



Related Articles

  • Multi-Agent Orchestration on AWS Bedrock AgentCore (9 min, May 4, 2026) — the supervisor + specialist pattern is the most reliable way to ship multi-agent systems on AWS: how to wire it, observe it, and bound its cost.
  • Surviving Partial Failure in a 3,300-Call Agent Pipeline (8 min, May 5, 2026) — two-tier retries, atomic DynamoDB chunk claims, and checkpoint-based cancellation: the failure-recovery layer that lets a multi-agent contract review pipeline finish even when 5% of LLM calls fail.
  • Supervisor-Router on Google ADK with Per-Org Tool Registration (9 min, May 5, 2026) — building a multi-tenant agent platform on Google ADK where the supervisor binds only the tools each org has paid for and integrated, without forking the agent definition per tenant.
