Design a system for a news website that can capture all user details and cookie information for every article view, handling millions of users arriving at any time.
💡 Model Answer
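Front‑end: Each article page includes a tracking script that collects the user ID (if logged in), session ID, first‑party cookies, referrer, and device info. The script sends a lightweight event over HTTPS to a CDN edge endpoint (e.g., CloudFront, Akamai). The client IP is read server‑side from the connection at the edge, since a browser script cannot reliably observe it.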
Edge: The CDN forwards the event to an API Gateway that validates the payload and publishes it to a Kafka topic "article_views". Partition by user ID, falling back to session ID for anonymous visitors, to keep per‑user ordering.
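A minimal sketch of the validation and partition-key step, under stated assumptions: the required field names, the partition count, and the session-ID fallback for anonymous visitors are all hypothetical and not part of the original answer. The hash used here is SHA-256 purely for illustration; Kafka's default partitioner uses murmur2 on the key bytes, so real partition numbers would differ.

```python
import hashlib
import json

# Assumed minimal schema; the real payload would carry more fields.
REQUIRED_FIELDS = ("session_id", "article_id", "timestamp")
NUM_PARTITIONS = 12  # hypothetical partition count


def validate_event(raw: bytes) -> dict:
    """Parse a JSON payload and reject it if required fields are missing."""
    event = json.loads(raw)
    if not isinstance(event, dict):
        raise ValueError("payload must be a JSON object")
    missing = [f for f in REQUIRED_FIELDS if event.get(f) in (None, "")]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return event


def partition_key(event: dict) -> str:
    """Use the user ID when logged in, otherwise fall back to the session ID."""
    return str(event.get("user_id") or event["session_id"])


def choose_partition(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition with a stable hash (illustrative, not murmur2)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions
```

Because the same key always maps to the same partition, all events for one user (or one anonymous session) land in order on a single partition, which is what per-user ordering relies on.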
Ingestion: A Kafka Streams application enriches the event with geo‑location (via IP lookup), user profile (from a cache or microservice), and article metadata (title, category). It then writes enriched events to a second topic "article_views_enriched".
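The enrichment step can be sketched as a pure function applied to each event. This is an illustration in plain Python rather than the Kafka Streams API; the three lookup callables and the field names (`ip`, `user_id`, `article_id`) are hypothetical stand-ins for the IP geo database, the profile cache or microservice, and the article metadata store.

```python
from typing import Callable, Optional


def enrich(
    event: dict,
    geo_lookup: Callable[[str], Optional[dict]],
    profile_lookup: Callable[[str], Optional[dict]],
    article_lookup: Callable[[str], Optional[dict]],
) -> dict:
    """Return a copy of the event with geo, profile, and article data attached."""
    enriched = dict(event)  # never mutate the raw event
    enriched["geo"] = geo_lookup(event.get("ip", ""))
    user_id = event.get("user_id")
    # Anonymous views have no profile to attach.
    enriched["profile"] = profile_lookup(user_id) if user_id else None
    enriched["article"] = article_lookup(event["article_id"])
    return enriched


# In-memory stand-ins for the real lookups (hypothetical data).
GEO = {"203.0.113.7": {"country": "IN"}}
PROFILES = {"u42": {"plan": "premium"}}
ARTICLES = {"a1": {"title": "Budget 2024", "category": "economy"}}

example = enrich(
    {"user_id": "u42", "article_id": "a1", "ip": "203.0.113.7"},
    GEO.get, PROFILES.get, ARTICLES.get,
)
```

Keeping the function free of side effects makes it easy to unit test and to reuse inside whichever stream processor ends up writing to "article_views_enriched".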
Storage: Use a data lake (S3/Blob) for raw events and a data warehouse (Snowflake/BigQuery) for analytics. Kafka Connect sink connectors write the raw topic to the lake and the enriched topic to the warehouse. For real‑time dashboards, a separate stream processor feeds aggregates into a time‑series or real‑time analytics store (e.g., InfluxDB, ClickHouse).
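For the real-time dashboard path, one common aggregation is a tumbling-window count of views per article. The sketch below shows the idea in plain Python over an in-memory list; the 60-second window and the `timestamp` (epoch seconds) and `article_id` field names are assumptions, and a production stream processor would also need to handle late or out-of-order events.

```python
from collections import Counter
from typing import Iterable

WINDOW_SECONDS = 60  # assumed window size


def window_start(ts: int, size: int = WINDOW_SECONDS) -> int:
    """Round an epoch-second timestamp down to the start of its window."""
    return ts - (ts % size)


def count_views(events: Iterable[dict], size: int = WINDOW_SECONDS) -> Counter:
    """Count views per (window start, article ID) pair."""
    counts: Counter = Counter()
    for e in events:
        key = (window_start(int(e["timestamp"]), size), e["article_id"])
        counts[key] += 1
    return counts
```

Each `(window start, article ID)` pair would become one row in the dashboard store, so the write volume there scales with the number of active articles rather than with the number of raw views.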
Privacy: Store only pseudonymized identifiers (for example, keyed hashes of user and session IDs), honor consent and opt‑out choices before recording an event, and comply with GDPR/CCPA. Encrypt data at rest and in transit.
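One way to store hashed identifiers is a keyed hash (HMAC-SHA256) rather than a bare hash, since low-entropy IDs hashed without a secret can be recovered by brute force. The sketch below is an assumption about how this could look, not a prescribed design: the `ID_FIELDS` list, the `scrub` helper, and the choice to drop events from opted-out users are all hypothetical, and the secret key would come from a secrets manager rather than source code.

```python
import hashlib
import hmac
from typing import Optional

ID_FIELDS = ("user_id", "session_id")  # assumed identifier fields


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest of the identifier as a hex string."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def scrub(event: dict, secret_key: bytes, opted_out: bool) -> Optional[dict]:
    """Drop the event for opted-out users; otherwise hash its identifiers."""
    if opted_out:
        return None
    cleaned = dict(event)  # leave the caller's event untouched
    for field in ID_FIELDS:
        if cleaned.get(field):
            cleaned[field] = pseudonymize(str(cleaned[field]), secret_key)
    return cleaned
```

The same identifier always produces the same digest under a given key, so joins and per-user counts still work on the hashed values, while rotating or destroying the key breaks the link back to the original IDs.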
Scalability: A suitably sized Kafka cluster can handle millions of events per second; partitions provide parallelism across brokers and consumers, and replication provides high availability. The API Gateway auto‑scales, and the stream processors run in a Kubernetes cluster with horizontal pod autoscaling. Prometheus and Grafana track latency, error rates, and throughput.
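A back-of-the-envelope way to pick a partition count is to divide peak load, plus headroom, by the throughput one partition sustains. The helper below is a sketch of that arithmetic only; the function name, the default headroom factor, and any numbers passed in are hypothetical, and the per-partition figure has to be measured on the actual cluster.

```python
import math


def partitions_needed(
    peak_events_per_sec: float,
    per_partition_events_per_sec: float,
    headroom: float = 1.5,
) -> int:
    """Estimate the partition count for a peak load with a safety margin."""
    if per_partition_events_per_sec <= 0:
        raise ValueError("per-partition throughput must be positive")
    return math.ceil(peak_events_per_sec * headroom / per_partition_events_per_sec)
```

For example, with a hypothetical peak of 1,000,000 events per second and a measured 50,000 events per second per partition, `partitions_needed(1_000_000, 50_000)` returns 30. The partition count also caps how many consumers in one group can read in parallel, so it bounds how far the autoscaled stream processors can usefully scale out.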