Have you designed any pipeline for processing streaming data from an implementation perspective?

🟡 Medium · Behavioral · Mid level
Times asked: 1
Last seen: Jul 2026
First seen: Jul 2026

💡 Model Answer

S: In my previous role, we needed to process real‑time clickstream data for a marketing dashboard.

T: I was tasked with designing a scalable, fault‑tolerant pipeline that could ingest millions of events per day.

A: I chose Amazon Kinesis Data Streams as the ingestion layer and provisioned 10 shards to handle peak throughput. A Java consumer application used the Kinesis Client Library to read records, batch them, and forward the batches to Apache Flink for real‑time aggregation. Flink joined each window of events with a PostgreSQL reference dataset and wrote the results to Amazon S3 in Parquet format for downstream BI tools. I added CloudWatch metrics and alarms on per‑shard consumer lag (iterator age), and used Kinesis Data Firehose to deliver a copy of the raw stream to Elasticsearch for search.
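
The windowed aggregation and reference-data join described in this step can be illustrated in plain Python. This is not Flink code and not the pipeline's actual job; it is a minimal sketch that assumes hypothetical event fields (`ts`, `campaign_id`), a 60-second tumbling window, and an in-memory dict standing in for the PostgreSQL reference table.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

WINDOW_SECONDS = 60  # assumed tumbling-window size, not stated in the answer

# Hypothetical reference data standing in for the PostgreSQL lookup table.
CAMPAIGNS: Dict[str, str] = {"c1": "Spring Sale", "c2": "Newsletter"}


def window_start(ts: int, size: int = WINDOW_SECONDS) -> int:
    """Return the start of the tumbling window that contains timestamp ts."""
    return ts - (ts % size)


def aggregate_clicks(events: Iterable[dict]) -> List[Tuple[int, str, int]]:
    """Count clicks per (window start, campaign name)."""
    counts: Dict[Tuple[int, str], int] = defaultdict(int)
    for event in events:
        # Enrichment join: unknown campaign ids fall back to a placeholder.
        name = CAMPAIGNS.get(event["campaign_id"], "unknown")
        counts[(window_start(event["ts"]), name)] += 1
    # Sort by window start, then campaign name, for deterministic output.
    return sorted((w, n, c) for (w, n), c in counts.items())


if __name__ == "__main__":
    sample = [
        {"ts": 5, "campaign_id": "c1"},
        {"ts": 42, "campaign_id": "c1"},
        {"ts": 61, "campaign_id": "c2"},
        {"ts": 119, "campaign_id": "zz"},
    ]
    for row in aggregate_clicks(sample):
        print(row)
```

A real Flink job would also deal with event-time watermarks, late records, and state checkpointing, all of which this sketch leaves out. In an interview, being able to explain those gaps is usually worth more than the code itself.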

R: The pipeline sustained 5 million events per hour at under 200 ms latency and scaled automatically during traffic spikes. It cut dashboard refresh time from 10 minutes to 2 minutes and was adopted company‑wide.
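
Interviewers often probe how a shard count was chosen, so it helps to be able to check the 10 shards against the throughput of roughly 5 million events per hour. The sketch below uses the per-shard write limits for provisioned Kinesis Data Streams (1,000 records per second and 1 MB per second). The average event size and the peak multiplier are assumptions chosen for illustration, not figures from the answer.

```python
import math

# Per-shard write limits for Kinesis Data Streams in provisioned mode.
RECORDS_PER_SHARD_PER_SEC = 1_000
BYTES_PER_SHARD_PER_SEC = 1_000_000  # 1 MB/s, rounded conservatively


def min_shards(events_per_hour: int, avg_event_bytes: int,
               peak_factor: float = 1.0) -> int:
    """Estimate the minimum shard count needed to absorb the write load.

    avg_event_bytes and peak_factor are illustrative assumptions.
    """
    events_per_sec = events_per_hour / 3600 * peak_factor
    by_count = events_per_sec / RECORDS_PER_SHARD_PER_SEC
    by_bytes = events_per_sec * avg_event_bytes / BYTES_PER_SHARD_PER_SEC
    # The tighter of the two limits decides; always keep at least one shard.
    return max(1, math.ceil(max(by_count, by_bytes)))


if __name__ == "__main__":
    # Assumed: 1 KB average event, steady load vs. a 3x traffic spike.
    print(min_shards(5_000_000, 1_000))
    print(min_shards(5_000_000, 1_000, 3.0))
```

Under these assumptions the steady load needs 2 shards and a 3x spike needs 5, so 10 shards leaves headroom. Larger events or sharper spikes would change the result, so substitute the numbers from your own project.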

This answer was generated by AI for study purposes. Use it as a starting point — personalize it with your own experience.
