Your consumer must scale reads from Kinesis Data Streams with low‑latency fan‑out for multiple applications while avoiding hot shards. What configuration is most effective?

Question

Assisting AI · Accepted Answer

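The most effective configuration is to register each application as an Enhanced Fan‑Out (EFO) consumer, use a partition key strategy that spreads writes evenly, and monitor consumer lag to decide when to scale. With EFO, each registered consumer gets its own dedicated read throughput of up to 2 MB/s per shard, and records are pushed to it over HTTP/2 through SubscribeToShard instead of being polled, so applications do not contend with one another and propagation delay is typically about 70 ms. Hot shards are a write‑side problem: Kinesis assigns every record to a shard by hashing its partition key, so a skewed key (for example, one very active user ID) concentrates traffic on a single shard, which accepts at most 1 MB/s or 1,000 records per second. To avoid this, choose a high‑cardinality key that distributes events evenly, such as a composite of user ID and a bounded salt or a random UUID. Note that any scheme that splits one entity across several keys gives up strict per‑entity ordering. Track consumer lag (GetRecords.IteratorAgeMilliseconds for polling consumers, SubscribeToShardEvent.MillisBehindLatest for EFO consumers) alongside WriteProvisionedThroughputExceeded; if lag keeps growing past a threshold, add shards and enough consumer workers to process them. This keeps per‑shard throughput within limits, prevents throttling, and scales horizontally as the number of applications grows. Options that batch to S3 or relay through SQS add unnecessary latency and complexity, while the default shared‑throughput (polling) model makes all consumers share 2 MB/s and five GetRecords calls per second per shard, so latency and read throttling rise with every added application.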
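To make the partition key advice concrete, here is a minimal Python sketch of key salting and its effect on shard load. It relies on the documented fact that Kinesis maps the MD5 hash of the partition key into a 128‑bit hash key space. Everything else is an assumption for illustration: the helper names (shard_index, salted_key, shard_load) are hypothetical, shards are assumed to split the hash space evenly (which may not hold after resharding), and the salt is derived from an event ID that your records may or may not carry.

```python
import hashlib
from collections import Counter

# Kinesis hashes the partition key with MD5 into a 128-bit hash key space.
HASH_SPACE = 2 ** 128


def shard_index(partition_key: str, shard_count: int) -> int:
    """Estimate which shard a key lands on.

    Assumption: the stream's shards split the hash key space evenly.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    return hash_key * shard_count // HASH_SPACE


def salted_key(user_id: str, event_id: str, salt_buckets: int) -> str:
    """Spread one user's events over up to `salt_buckets` partition keys.

    The salt is derived from the event ID, so the same event always gets
    the same key. Ordering is then preserved per (user, salt) pair only,
    not across all of a user's events.
    """
    digest = hashlib.md5(event_id.encode("utf-8")).digest()
    salt = int.from_bytes(digest[:4], "big") % salt_buckets
    return f"{user_id}#{salt}"


def shard_load(partition_keys, shard_count: int) -> list:
    """Count how many records each shard would receive."""
    counts = Counter(shard_index(key, shard_count) for key in partition_keys)
    return [counts.get(i, 0) for i in range(shard_count)]


if __name__ == "__main__":
    # Synthetic traffic: one hot user produces 90% of the events.
    events = [("user-hot", f"evt-{i}") for i in range(9000)]
    events += [(f"user-{i}", f"evt-c{i}") for i in range(1000)]

    plain = shard_load([user for user, _ in events], shard_count=8)
    salted = shard_load(
        [salted_key(user, evt, salt_buckets=64) for user, evt in events],
        shard_count=8,
    )
    print("user ID only:", plain)
    print("salted key:  ", salted)
```

With the plain user ID as the key, all of the hot user's records hash to one shard; with a salted key they are spread across the shards that the salted keys happen to hash to. A bounded salt is used here rather than a fully random key so that a consumer can still regroup one user's records by reading a known, small set of keys.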

