Is there a daily job that performs compliance deletes of specific customer rows across massive historical Iceberg tables?

🟡 Medium | Conceptual | Mid level
Times asked: 1
Last seen: May 2026
First seen: May 2026

💡 Model Answer

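In Apache Iceberg, row-level compliance deletes do not have to rewrite entire partitions. On a format version 2 table with merge-on-read deletes enabled, the engine writes delete files instead: position deletes, which reference a data file path and a row position, or equality deletes, which match rows by the values of chosen columns such as a customer ID. In copy-on-write mode, which is the default, the affected data files are rewritten instead. A daily job run from an engine such as Spark, Flink, or Hive can issue a delete filtered on the customer identifier. If the table is partitioned or sorted so that this filter can prune files, the job touches only the files that contain the affected customers; otherwise it has to scan for them, which is the expensive part on massive historical tables. Each run commits a new snapshot, so readers see the table either before or after the delete, never a partial state, and query engines apply the delete files at read time to exclude the removed rows. For compliance purposes the work does not end there: the deleted rows still exist physically in the original data files and remain readable through older snapshots. A periodic maintenance job should therefore compact data files together with their delete files, which also reduces the number of small files and improves query performance, and then expire old snapshots so that the superseded files are physically removed from storage.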
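The daily job described above can be sketched as a short sequence of Spark SQL statements. The helper below only builds the statements as strings, so it runs without a cluster. The function name, the `customer_id` column, and the catalog and table names are hypothetical; the table properties (`format-version`, `write.delete.mode`) and the `rewrite_data_files` and `expire_snapshots` procedures are the ones documented for Iceberg's Spark integration, but the options and retention cutoff shown here are assumptions to adapt, not recommended values.

```python
from datetime import datetime
from typing import Iterable, List


def build_compliance_delete_sql(
    catalog: str,
    table: str,                      # "db.table", without the catalog prefix
    customer_ids: Iterable[int],
    expire_before: datetime,
    id_column: str = "customer_id",  # hypothetical column name
) -> List[str]:
    """Return the Spark SQL statements for one compliance-delete run."""
    # Coerce to int and de-duplicate so the IN list is safe and stable.
    ids = sorted({int(c) for c in customer_ids})
    if not ids:
        return []  # nothing to delete today

    full_name = f"{catalog}.{table}"
    id_list = ", ".join(str(i) for i in ids)
    cutoff = expire_before.strftime("%Y-%m-%d %H:%M:%S")

    return [
        # One-time setup, harmless to repeat: v2 format with
        # merge-on-read deletes, so DELETE writes delete files.
        f"ALTER TABLE {full_name} SET TBLPROPERTIES ("
        "'format-version'='2', 'write.delete.mode'='merge-on-read')",
        # Logical delete: commits a new snapshot that hides the rows.
        f"DELETE FROM {full_name} WHERE {id_column} IN ({id_list})",
        # Compaction: rewrite any data file that has at least one
        # delete attached, so the rows are gone from the new files.
        f"CALL {catalog}.system.rewrite_data_files("
        f"table => '{table}', options => map('delete-file-threshold', '1'))",
        # Physical removal: drop snapshots older than the cutoff so the
        # superseded files are no longer referenced and can be deleted.
        f"CALL {catalog}.system.expire_snapshots("
        f"table => '{table}', older_than => TIMESTAMP '{cutoff}')",
    ]
```

With a configured SparkSession named `spark` (assumed, not shown), the statements could be executed in order with `for stmt in build_compliance_delete_sql("prod", "sales.orders", [42, 7], datetime(2026, 5, 1)): spark.sql(stmt)`. In practice the compaction and snapshot expiry steps are often scheduled less frequently than the delete itself, as long as the interval still meets the required erasure deadline.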

This answer was generated by AI for study purposes. Use it as a starting point — personalize it with your own experience.
