Amazon Simple Queue Service (SQS)
Amazon SQS is a managed messaging service that helps decouple the parts of a distributed system. Think of it like this: you’re running a busy restaurant. Waiters take customer orders, and the kitchen prepares them. If each waiter had to hand an order directly to a chef, they’d have to wait until the chef was ready—which would slow everything down. Instead, imagine a whiteboard where the waiter writes down orders, and the kitchen picks them up when they’re ready. This whiteboard is your message queue—SQS works the same way.
With SQS, applications (the producers) send messages to a queue, and other components (the consumers) pick up those messages and process them independently. The beauty of this approach is that producers and consumers don’t need to be available at the same time. This enables greater scalability and reliability, particularly during traffic spikes or slowdowns in one part of the system.
Standard vs FIFO Queues
SQS offers two types of queues:
- Standard queues are the default option and support high throughput. They guarantee at-least-once delivery, meaning a message might be delivered more than once, but it won’t be lost. Message order is not guaranteed, so if processing order matters, they may not be suitable. In exchange, they support essentially unlimited messages and transactions per second, with very low latency (typically under 10 milliseconds).
- FIFO (First-In, First-Out) queues guarantee that each message is processed exactly once and in the exact order it was sent, which is critical for use cases like financial transactions or order processing. FIFO queues provide strong ordering and deduplication, but their throughput is limited compared to Standard queues:
  - Without batching, FIFO queues support up to 300 messages per second.
  - With batching, you can reach up to 3,000 messages per second (10 messages per batch × 300 batch requests).
  This gives you deterministic behavior with decent throughput, as long as you use batching and multiple message groups when needed.
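FIFO deduplication can be content-based: when enabled, SQS derives a deduplication ID from a SHA-256 hash of the message body and silently drops duplicates seen within a 5-minute window. The following is a toy in-memory model of that behavior (not the actual AWS implementation); the `FifoDedupQueue` class and its method names are invented for illustration.

```python
import hashlib

DEDUP_WINDOW_SECONDS = 300  # SQS FIFO deduplication interval (5 minutes)

class FifoDedupQueue:
    """Toy model of content-based deduplication in an SQS FIFO queue."""
    def __init__(self):
        self.messages = []
        self._seen = {}  # dedup_id -> timestamp of first accepted send

    def send(self, body: str, now: float) -> bool:
        # Content-based deduplication: the dedup ID is a SHA-256 of the body.
        dedup_id = hashlib.sha256(body.encode()).hexdigest()
        first_seen = self._seen.get(dedup_id)
        if first_seen is not None and now - first_seen < DEDUP_WINDOW_SECONDS:
            return False  # duplicate within the window: silently dropped
        self._seen[dedup_id] = now
        self.messages.append(body)
        return True

q = FifoDedupQueue()
assert q.send("order-42", now=0.0) is True    # accepted
assert q.send("order-42", now=10.0) is False  # duplicate, dropped
assert q.send("order-42", now=400.0) is True  # window expired, accepted again
```

A producer can therefore retry a send safely within the window without creating duplicates downstream.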
In both types, each message can be up to 256KB in size, and messages can be kept in the queue for up to 14 days, allowing for delayed or retry-based processing if needed.
Visibility Timeout and Message Lifecycle
A key part of how SQS works is the visibility timeout. When a consumer picks up a message, that message becomes “invisible” to other consumers for a short period—this is to prevent it from being processed more than once at the same time. If the consumer successfully finishes processing, it deletes the message. If it fails to do so within the visibility timeout (which can be configured from 0 seconds to 12 hours), the message becomes visible again and can be retried.
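The lifecycle above can be modeled in a few lines. This is a deliberately simplified in-memory sketch (single-threaded, explicit clock), not how SQS is implemented; the `Queue` and `Message` classes are invented for illustration.

```python
class Message:
    def __init__(self, body):
        self.body = body
        self.invisible_until = 0.0

class Queue:
    """Toy model of the SQS visibility timeout lifecycle."""
    def __init__(self, visibility_timeout=30.0):
        self.visibility_timeout = visibility_timeout
        self.messages = []

    def receive(self, now):
        for msg in self.messages:
            if msg.invisible_until <= now:
                # Hide the message from other consumers for the timeout period.
                msg.invisible_until = now + self.visibility_timeout
                return msg
        return None

    def delete(self, msg):
        self.messages.remove(msg)

q = Queue(visibility_timeout=30.0)
q.messages.append(Message("resize photo 1"))

m1 = q.receive(now=0.0)
assert m1 is not None
assert q.receive(now=10.0) is None   # still invisible to other consumers
m2 = q.receive(now=31.0)             # consumer failed: message reappears
assert m2 is m1
q.delete(m2)                         # successful processing deletes it
assert q.receive(now=32.0) is None
```

The key takeaway: receiving a message only hides it; the message is gone for good only once the consumer explicitly deletes it.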
To handle messages that repeatedly fail to process correctly—perhaps due to a bug or bad input—you can use a Dead Letter Queue (DLQ). Messages that fail to process a certain number of times are moved to this separate queue for inspection, without blocking other messages in the main queue.
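The redrive behavior can be sketched as follows. In real SQS the retry happens implicitly (the message becomes visible again and its receive count increments); here the loop makes that explicit. The function name and `MAX_RECEIVE_COUNT` constant are invented for illustration, mirroring the `maxReceiveCount` setting in a redrive policy.

```python
MAX_RECEIVE_COUNT = 3  # like maxReceiveCount in an SQS redrive policy

def process_with_dlq(message, handler, dlq):
    """Toy redrive loop: retry a message up to MAX_RECEIVE_COUNT times,
    then move it to the dead-letter queue instead of retrying forever."""
    for attempt in range(1, MAX_RECEIVE_COUNT + 1):
        try:
            handler(message)
            return "processed"
        except Exception:
            continue  # message becomes visible again and is retried
    dlq.append(message)  # poison message parked for later inspection
    return "moved_to_dlq"

dlq = []

def always_fails(msg):
    raise ValueError("bad input")

assert process_with_dlq({"id": 1}, always_fails, dlq) == "moved_to_dlq"
assert dlq == [{"id": 1}]
assert process_with_dlq({"id": 2}, lambda m: None, dlq) == "processed"
assert len(dlq) == 1
```

Because the poison message is parked in the DLQ, healthy messages behind it keep flowing.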
Long Polling for Efficiency
Instead of checking constantly to see if new messages are available (which costs more and wastes compute), SQS supports long polling. This allows the consumer to wait for a message to arrive instead of checking repeatedly. Think of it like waiting at a bus stop: instead of asking someone every few seconds if the bus has arrived, you simply sit and wait to be notified when it does.
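The difference between short and long polling is essentially a blocking wait. The sketch below mimics SQS's `WaitTimeSeconds` parameter using Python's standard `queue.Queue`; this is an analogy, not the SQS wire protocol.

```python
import queue
import threading
import time

def long_poll(q, wait_time_seconds):
    """Block up to wait_time_seconds for a message (like WaitTimeSeconds=20)
    instead of returning immediately when the queue is empty."""
    try:
        return q.get(timeout=wait_time_seconds)
    except queue.Empty:
        return None  # empty response only after the full wait

q = queue.Queue()
# A producer delivers a message shortly after the consumer starts waiting.
threading.Timer(0.2, lambda: q.put("photo-uploaded")).start()

start = time.monotonic()
msg = long_poll(q, wait_time_seconds=5.0)
elapsed = time.monotonic() - start

assert msg == "photo-uploaded"
assert elapsed < 5.0  # returns as soon as a message arrives, not after full wait
```

One long-poll request replaces dozens of empty short-poll requests, which reduces both API cost and wasted compute.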
Real-World Architecture Example
Let’s say you run a photo-sharing app. When users upload photos, your front-end service stores the image in S3 and sends a message to an SQS queue. Behind the scenes, an EC2 instance or a fleet of instances (managed by an Auto Scaling Group) are polling the queue. When a message arrives, one of the instances picks it up, processes the image (e.g., resizes or watermarks it), and then deletes the message once complete.
This architecture is powerful because:
- The front-end remains fast and responsive
- You can scale the backend up or down depending on queue length
- Failed jobs can be retried
- Problematic messages can be routed to a DLQ for investigation
This separation of responsibilities—between accepting the upload and processing it—makes your system more resilient and scalable.
Security and Encryption
SQS supports in-transit encryption using HTTPS to ensure messages are securely transmitted. You can also enable server-side encryption with AWS KMS, so messages are encrypted at rest within the queue. IAM policies can be used to control access to sending, receiving, or deleting messages.
Amazon Simple Notification Service (SNS)
Amazon SNS (Simple Notification Service) is a fully managed pub/sub (publish-subscribe) messaging service, ideal for broadcasting messages to multiple systems at once. Think of it like the public address system in an airport—someone speaks into the microphone (publishes a message), and everyone tuned into that channel (subscribers) hears it and reacts accordingly. It’s a powerful tool for building event-driven architectures where many parts of your system need to respond to the same event.
How SNS Works
At the core of SNS are topics—named channels to which you can publish messages. When a message is sent to a topic, all subscribers to that topic receive a copy of the message. These subscribers could be:
- SQS queues (for buffering messages)
- Lambda functions (for immediate processing)
- Email addresses (for human notifications)
- SMS messages (for mobile alerts)
- HTTP/S endpoints (for integration with other systems)
- Mobile push notifications (for iOS, Android, etc.)
This fan-out pattern is incredibly useful. For example, imagine you run an e-commerce site and want to notify different systems whenever an order is placed. You could publish an “order placed” message to an SNS topic, and have:
- A Lambda function updating inventory
- An SQS queue storing orders for batch processing
- An email sent to the customer
- A webhook that notifies your analytics platform
All of these subscribers would get the message instantly and independently.
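The fan-out pattern boils down to "each subscriber gets its own copy." Here is a minimal in-memory sketch; the `SnsTopic` class is invented for illustration, with plain lists standing in for the Lambda, SQS, and email subscribers.

```python
from typing import Callable, List

class SnsTopic:
    """Toy pub/sub topic: every subscriber gets its own copy of each message."""
    def __init__(self):
        self.subscribers: List[Callable[[dict], None]] = []

    def subscribe(self, handler):
        self.subscribers.append(handler)

    def publish(self, message: dict):
        for handler in self.subscribers:
            handler(dict(message))  # independent copy per subscriber

inventory, order_queue, emails = [], [], []

topic = SnsTopic()
topic.subscribe(inventory.append)    # e.g. a Lambda updating inventory
topic.subscribe(order_queue.append)  # e.g. an SQS queue for batch processing
topic.subscribe(emails.append)       # e.g. the customer email sender

topic.publish({"event": "order_placed", "order_id": 1001})

assert len(inventory) == len(order_queue) == len(emails) == 1
assert inventory[0]["order_id"] == 1001
```

Notice that the publisher never knows who is subscribed; adding a fourth consumer requires no change to the producer code.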
Topic-Based vs Direct Publishing
SNS generally operates through topics, but it also allows direct publishing, particularly for SMS and mobile push messages. You can publish directly to an individual phone number or device endpoint if you don’t want to use topics. This is handy for alerting a single user (e.g., sending a one-time passcode via SMS). However, for most architectural designs involving multiple consumers, topic-based publishing is preferred because it scales better and allows easy addition or removal of subscribers without touching the producer code.
Standard vs FIFO Topics
SNS offers two types of topics: Standard topics and FIFO (First-In-First-Out) topics.
Standard topics are the default and support:
- At-least-once delivery
- Best-effort ordering (order not guaranteed)
- Very high throughput (millions of messages per second)
- Broad integration support
Use standard topics when you need high fan-out and can tolerate occasional duplicates or out-of-order messages, like broadcasting events to multiple systems or notifying users across multiple channels.
FIFO topics, on the other hand, are designed for use cases where exact message ordering and deduplication matter. With FIFO:
- Messages are delivered exactly once
- Order is preserved within a message group
- Throughput is limited (300 messages/sec by default, can scale with batching)
- Subscribers must be FIFO-compatible (e.g., FIFO SQS queues)
Use FIFO topics when processing financial transactions, ledgers, or workflows where order and precision matter. Note: Only SQS FIFO queues can subscribe to FIFO topics—other endpoints like Lambda are not supported.
Message Filtering
Sometimes, not all subscribers care about every message. SNS allows you to attach filter policies to each subscription. These are like email filters: they let a subscriber declare, “only send me messages where the type is order_placed or the region is EU.” This reduces unnecessary traffic and processing, and lets you use one topic for multiple message types.
Filtering is done using message attributes, which are key-value pairs included with the message. This gives you flexibility to route messages intelligently.
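The core matching rule is simple: every key in the filter policy must match (AND across keys), and any one of the listed values may match (OR within a key). The sketch below implements just that subset; real SNS policies also support prefix, numeric range, and anything-but operators, which are omitted here.

```python
def matches_filter_policy(policy: dict, attributes: dict) -> bool:
    """Simplified SNS filter-policy check: every policy key must be present
    in the message attributes with one of the allowed values
    (OR within a key, AND across keys)."""
    for key, allowed in policy.items():
        if attributes.get(key) not in allowed:
            return False
    return True

policy = {"type": ["order_placed"], "region": ["EU", "UK"]}

assert matches_filter_policy(policy, {"type": "order_placed", "region": "EU"})
assert not matches_filter_policy(policy, {"type": "order_cancelled", "region": "EU"})
assert not matches_filter_policy(policy, {"type": "order_placed"})  # missing key
```

SNS evaluates the policy against the message attributes before delivery, so filtered-out subscribers never receive (or pay to process) the message.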
Message Durability and Retry Behavior
Unlike SQS, SNS is not a message queue—it doesn’t store messages for long. Once a message is delivered (or a delivery attempt is made), it’s gone. If the subscriber is unavailable, the message may be lost unless it’s being delivered to an endpoint that has built-in retry logic, like Lambda or SQS.
For example:
- SNS will retry delivery to HTTP/S endpoints with exponential backoff for several minutes.
- If you publish to an SQS queue, the message will be stored safely until the consumer picks it up.
- If a Lambda is subscribed and fails, SNS will retry for a short period.
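Exponential backoff with a cap, bounded by an overall delivery window, looks like this. The numbers here (1-second base, doubling, 600-second cap) are purely illustrative assumptions; SNS's actual HTTP/S retry policy has configurable phases and different defaults.

```python
def backoff_schedule(base=1.0, factor=2.0, cap=600.0, max_total=23 * 3600):
    """Illustrative exponential backoff: delays double up to a cap, and
    retries stop once the cumulative wait would exceed the delivery window."""
    delays, total, delay = [], 0.0, base
    while total + delay <= max_total:
        delays.append(delay)
        total += delay
        delay = min(delay * factor, cap)  # capped so waits don't grow forever
    return delays

delays = backoff_schedule()
assert delays[:5] == [1.0, 2.0, 4.0, 8.0, 16.0]  # doubling phase
assert max(delays) == 600.0                      # capped delay
assert sum(delays) <= 23 * 3600                  # within the delivery window
```

The doubling spreads retries out so a struggling endpoint isn't hammered, while the cap keeps individual waits reasonable.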
For systems where you can’t afford to lose messages, it’s common to subscribe an SQS queue to the topic. The queue provides buffering and durability, and other components can consume from it at their own pace.
Security and Access Control
SNS integrates with IAM, so you can control who can publish to or subscribe to a topic. This is critical in large systems to avoid rogue components sending messages to unintended places.
For sensitive use cases (e.g., healthcare or financial data), you can enforce encryption:
- In-transit encryption is provided by default over HTTPS.
- At-rest encryption is available using AWS Key Management Service (KMS). You can associate a KMS key with your SNS topic to encrypt all messages stored temporarily during delivery.
You can also use access policies on topics to limit access by IP address, VPC endpoint, or account ID—similar to S3 bucket policies.
Delivery Limits and Quotas
SNS is built to scale, but it does have some soft limits:
- Message size: Max 256KB
- Delivery attempts for HTTP/S: Up to 23 hours with exponential backoff
- Subscriptions per topic: 12.5 million for standard topics (soft limit)
- Topics per account: 100,000 (soft limit)
These are generally more than enough for most use cases, but if you’re building at massive scale, it’s good to be aware of them.
Common Use Case: Multi-System Alerting
Let’s say you’re running a financial application that processes transactions. When a transaction is successful, you publish an event to an SNS topic. That topic fans out to:
- An SQS queue feeding a fraud detection service
- A Lambda function updating the transaction log
- An email to the user
- An SMS to your internal support team (for large transfers)
This setup ensures the message is handled appropriately by each component, without your original transaction processor needing to care who’s listening. It’s decoupled, flexible, and easy to scale.
Amazon Kinesis Data Streams
Amazon Kinesis Data Streams (KDS) is a fully managed, real-time data streaming service designed for high-throughput, low-latency data ingestion and processing. Unlike services like SQS or SNS, which are optimized for discrete message decoupling, KDS excels at handling continuous streams of data—such as logs, metrics, or sensor readings—enabling real-time analytics and processing.
Imagine a CCTV camera streaming live footage: instead of capturing individual snapshots, you want a continuous feed that can be analyzed on the fly. KDS provides this capability, allowing multiple consumers to read from the same stream in parallel, each maintaining its own position (shard iterator) without interfering with others.
Shards: The Building Blocks of KDS
A Kinesis data stream is composed of shards, which are the fundamental units of capacity:
- Write capacity per shard: Up to 1 MB/sec or 1,000 records/sec.
- Read capacity per shard: Up to 2 MB/sec or 5 read transactions/sec.
Each shard acts as a sequenced log of data records, ensuring ordered data within the shard. To scale your stream’s capacity, you increase the number of shards. As of April 2025, AWS has raised the default quotas to as many as 1,000 shards per stream and 6,000 shards per account, depending on the Region.
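Sizing a provisioned stream follows directly from the per-shard limits above: the stream needs enough shards to satisfy the tightest of the write-volume, write-rate, and read-volume constraints. A small sizing helper (illustrative, not an AWS API):

```python
import math

SHARD_WRITE_MB_S = 1.0       # up to 1 MB/sec in per shard
SHARD_WRITE_RECORDS_S = 1000 # or up to 1,000 records/sec in per shard
SHARD_READ_MB_S = 2.0        # up to 2 MB/sec out per shard

def shards_needed(write_mb_s, write_records_s, read_mb_s):
    """Provisioned-mode sizing sketch: take the maximum across the
    three per-shard constraints and round up."""
    return max(
        math.ceil(write_mb_s / SHARD_WRITE_MB_S),
        math.ceil(write_records_s / SHARD_WRITE_RECORDS_S),
        math.ceil(read_mb_s / SHARD_READ_MB_S),
    )

# Example: 5 MB/sec in, 4,000 records/sec, consumers reading 12 MB/sec total.
assert shards_needed(5, 4000, 12) == 6  # read throughput is the bottleneck
```

Note this models classic shared-throughput reads; with Enhanced Fan-Out each registered consumer gets its own 2 MB/sec per shard, which changes the read-side arithmetic.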
Capacity Modes: Provisioned vs. On-Demand
KDS offers two capacity modes:
- Provisioned Mode: You specify the number of shards, managing capacity manually. This mode is cost-effective for predictable workloads.
- On-Demand Mode: KDS automatically scales the number of shards based on your application’s traffic, accommodating gigabytes of write and read throughput per minute without manual intervention. This mode is ideal for applications with unpredictable or variable traffic patterns.
Data Retention and Replay
By default, KDS retains data for 24 hours. However, you can extend the retention period up to 365 days, allowing late-arriving consumers to catch up or historical data to be reprocessed. This flexibility supports use cases like:
- Replaying data for testing or debugging.
- Handling downstream processing delays.
- Meeting compliance requirements for data retention.
Data Ordering and Partition Keys
Within each shard, KDS maintains strict ordering of records. To control data distribution and ordering, you use partition keys when writing data:
- Records with the same partition key are routed to the same shard, preserving order.
- Effective partition key design ensures balanced shard utilization and prevents hot shards.
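Routing works by hashing the partition key: Kinesis takes an MD5 hash of the key to get a 128-bit number, and each shard owns a contiguous slice of that hash range. The sketch below assumes evenly sized hash ranges (real streams can have uneven ranges after splits and merges):

```python
import hashlib

def shard_for_key(partition_key: str, num_shards: int) -> int:
    """Sketch of Kinesis routing: MD5-hash the partition key to a 128-bit
    number; each shard owns an equal contiguous slice of that range."""
    h = int.from_bytes(hashlib.md5(partition_key.encode()).digest(), "big")
    range_per_shard = (2 ** 128) // num_shards
    return min(h // range_per_shard, num_shards - 1)

# Records with the same key always land on the same shard, preserving order...
assert shard_for_key("sensor-17", 4) == shard_for_key("sensor-17", 4)

# ...while many distinct keys spread across the shards.
shards_hit = {shard_for_key(f"sensor-{i}", 4) for i in range(100)}
assert len(shards_hit) > 1
```

This is why a skewed key choice (say, one customer ID producing 90% of the traffic) creates a hot shard: hashing can only spread load across keys, not within one.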
Producers and Consumers
Producers: Applications or services that send data to KDS. AWS provides several tools:
- AWS SDKs: For direct integration with your applications.
- Kinesis Producer Library (KPL): A high-performance library that aggregates and batches records to optimize throughput.
- Kinesis Agent: A pre-built Java application for collecting and sending data from log files to KDS.
Consumers: Applications or services that process data from KDS. Options include:
- AWS SDKs: For manual data retrieval.
- Kinesis Client Library (KCL): Simplifies consuming data by handling tasks like load balancing and checkpointing.
- Enhanced Fan-Out (EFO): Allows multiple consumers to receive data with dedicated throughput, reducing latency and avoiding shared throughput limits.
Security and Access Control
KDS provides multiple layers of security:
- Data in transit: Encrypted using HTTPS.
- Data at rest: Encrypted using server-side encryption with AWS Key Management Service (KMS).
- Access control: Managed via AWS Identity and Access Management (IAM) policies, allowing fine-grained permissions for producers and consumers.
- VPC Endpoints: Enable secure, private connectivity between your VPC and KDS without traversing the public internet.
Monitoring and Scaling
KDS integrates with Amazon CloudWatch, providing metrics such as:
- IncomingBytes and IncomingRecords: Monitor the volume of data ingested.
- ReadProvisionedThroughputExceeded and WriteProvisionedThroughputExceeded: Indicate when your application exceeds the provisioned throughput, signaling the need to scale your stream.
To adjust capacity, you can:
- Split shards: Increase capacity by dividing a shard into two.
- Merge shards: Decrease capacity by combining two shards into one.
These operations allow you to fine-tune your stream’s capacity based on your application’s needs.
Cost Considerations
KDS pricing is based on:
- Shard hours: The number of shards provisioned and the duration they are active.
- PUT payload units: The volume of data ingested.
- Extended data retention: Additional charges apply for retention periods beyond 24 hours.
- Enhanced Fan-Out: Additional charges apply for each consumer using EFO.
Amazon Data Firehose
Amazon Data Firehose (formerly Amazon Kinesis Data Firehose) is a fully managed service that lets you deliver real-time streaming data to storage and analytics destinations — without needing to write custom code or manage infrastructure.
If Kinesis Data Streams is the raw pipe for streaming data, Firehose is the smart delivery truck that picks up data and drops it off at the right places in near real-time. Imagine a courier service that:
- Picks up data parcels (events/logs/records) as they come in,
- Optionally opens the parcel to transform or compress it,
- Delivers it to destinations like a warehouse (S3), a data warehouse (Redshift), or a search engine (OpenSearch).
You don’t have to worry about the vehicle, the route, or traffic. You just hand it the data, and it gets it there — that’s Firehose.
Common Use Cases
- Sending application logs to Amazon S3 or OpenSearch for analysis
- Streaming clickstream or IoT data into S3 or Redshift for near real-time analytics
- Ingesting security events into SIEM tools via OpenSearch
- Feeding data into a lake house architecture (S3 + Athena/Glue + Redshift)
Key Features
- Fully managed: No need to manage infrastructure, scaling, or throughput.
- Near real-time delivery: Typically 60–90 seconds delay from ingest to delivery.
- Built-in destinations: S3, Redshift, OpenSearch, and custom HTTP endpoints via API Gateway or third-party tools.
- Data transformation: Supports AWS Lambda functions to transform or enrich data before delivery.
- Data format conversion: Can automatically convert JSON to Parquet or ORC before storing in S3.
- Compression and encryption: Supports GZIP, ZIP, SNAPPY, and encryption at rest with KMS.
- Error handling and retry logic: Automatically retries failed data delivery attempts.
Supported Destinations
Out of the box, Kinesis Data Firehose can deliver to:
- Amazon S3
- Amazon Redshift
- Amazon OpenSearch Service (or self-managed Elasticsearch)
- HTTP endpoints
- Datadog, Splunk, New Relic (via HTTP)
Transformation with Lambda
You can attach a Lambda function to Firehose to:
- Mask sensitive fields (e.g., remove PII)
- Convert formats (e.g., XML to JSON)
- Filter unwanted data
This is optional, but powerful when you need to reshape the data mid-stream.
Format Conversion for S3
When sending data to S3, Firehose can convert JSON to Parquet or ORC. These are efficient columnar formats often used for analytics. This is a big benefit when you’re planning to use Athena, Redshift Spectrum, or Glue to query data.
Buffering and Delivery
Firehose buffers incoming data based on size (1–128 MB) or time (60–900 seconds) before sending a batch to the destination. You can configure these settings to trade off between delivery latency and throughput.
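The "size or time, whichever comes first" rule can be modeled directly. This is a toy single-threaded model of the buffering decision, not Firehose's actual implementation; the class name and defaults are illustrative.

```python
class FirehoseBuffer:
    """Toy model of Firehose buffering: flush when the buffer reaches the
    size hint OR the time interval elapses, whichever comes first."""
    def __init__(self, size_mb=5, interval_s=300):
        self.size_bytes = size_mb * 1024 * 1024
        self.interval_s = interval_s
        self.records, self.bytes, self.opened_at = [], 0, None

    def put(self, record: bytes, now: float):
        if self.opened_at is None:
            self.opened_at = now  # clock starts with the first buffered record
        self.records.append(record)
        self.bytes += len(record)
        if self.bytes >= self.size_bytes or now - self.opened_at >= self.interval_s:
            batch = self.records  # delivered to the destination as one object
            self.records, self.bytes, self.opened_at = [], 0, None
            return batch
        return None

buf = FirehoseBuffer(size_mb=1, interval_s=60)
assert buf.put(b"x" * 100, now=0.0) is None          # under both thresholds
batch = buf.put(b"y" * (1024 * 1024), now=1.0)       # size threshold hit
assert batch is not None and len(batch) == 2
assert buf.put(b"z", now=2.0) is None
assert buf.put(b"z", now=70.0) is not None           # time threshold hit
```

Small buffers mean lower latency but more, smaller objects at the destination (bad for S3/Athena query performance); large buffers mean fewer, bigger objects but longer delays.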
Error Handling
If delivery fails (e.g., because Redshift is unreachable), Firehose will retry for up to 24 hours. You can also configure a backup S3 bucket to store failed records for debugging.
Pricing
Firehose pricing depends on:
- Ingestion volume (per GB ingested)
- Data transformation (if using Lambda)
- Format conversion (if converting to Parquet/ORC)
- VPC delivery (if data is delivered into a private subnet)
There are no charges for delivery to S3, but there are small charges for Redshift/OpenSearch delivery.
Security
- Encryption in transit with HTTPS
- Encryption at rest with AWS KMS
- Supports IAM-based access control
- Can deliver into private VPCs using VPC delivery stream
Firehose vs. Kinesis Data Streams
| Feature | Kinesis Data Streams | Kinesis Data Firehose |
| --- | --- | --- |
| Use case | Real-time processing | Near real-time delivery |
| Needs consumer app? | Yes (e.g., EC2, Lambda) | No |
| Fully managed? | No (you manage capacity and consumers) | Yes |
| Delivers to S3/Redshift? | Not directly | Yes |
| Built-in data transformation? | No | Yes (with Lambda) |
| Latency | Milliseconds | ~60 seconds (buffering) |
| Replay capability | Yes | No |
| Data storage | Up to 365 days | None |
Architecture Example
Real-life scenario: An e-commerce site wants to analyze user activity.
Architecture:
- The front-end app sends clickstream events to Firehose.
- Firehose buffers and optionally transforms data (e.g., adds timestamps).
- It then:
- Delivers raw data to S3 (for lakehouse & auditing),
- Converts and stores in Parquet (for Athena),
- Pushes some events to Redshift for dashboards.
This is a simple, low-maintenance, scalable pipeline.
Limitations
- Not for ultra-low latency use cases (like fraud detection in milliseconds)
- No support for replaying data (unlike Kinesis Data Streams)
- Buffering adds delivery delay (60 seconds minimum)
- Transformation with Lambda has size and timeout limits
Amazon MQ
Amazon MQ is a managed message broker service that supports industry-standard messaging protocols like AMQP, MQTT, and STOMP. It’s based on open-source brokers like Apache ActiveMQ and RabbitMQ and is best suited for enterprises with legacy systems that already use these protocols.
Let’s say your company has a Java-based backend that communicates using JMS (Java Message Service). Rewriting it to use SQS or Kinesis would require significant changes. Amazon MQ provides a drop-in replacement with minimal code changes, while still offering the benefits of AWS-managed infrastructure.
Unlike SQS and SNS, Amazon MQ maintains message order, supports transactions, and has advanced message routing features. It’s typically used in more traditional enterprise applications.
However, it’s not designed for serverless or highly scalable cloud-native systems. Think of it as the bridge between your modern microservices and your legacy enterprise backend.
To achieve high availability, you deploy Amazon MQ brokers across multiple Availability Zones; for ActiveMQ, this takes the form of an active/standby broker pair backed by shared storage on Amazon EFS, so the standby can take over with the same message state if the active broker fails.