I have experience building end‑to‑end solutions. For example, suppose there are two Oracle SQL sources with around 60 tables that need to be loaded daily into Azure Synapse SQL. What would be the approach to load these tables daily?
💡 Model Answer
In Azure Synapse, I would use Synapse Pipelines (built on Azure Data Factory) to orchestrate the ETL:

1. Create linked services for each Oracle source and for the dedicated Synapse SQL pool.
2. Build a single parameterized pipeline template that extracts data with the Oracle connector, writes it to a staging area in ADLS Gen2 as Parquet, and loads it into Synapse using PolyBase or the COPY statement. Because the pipeline is parameterized by table name, the same template serves all ~60 tables, typically driven by a metadata/control table.
3. Handle incremental loads by enabling CDC on Oracle or by using a timestamp (watermark) column to fetch only rows changed since the last run.
4. Schedule the pipeline nightly with a schedule or tumbling‑window trigger.
5. Add error handling with retry policies, and log run status to a monitoring table.
6. After loading, run data quality checks and update the metadata (e.g., advance the watermark).

This approach scales, is maintainable, and leverages Synapse's high‑performance ingestion capabilities.
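The parameterized, metadata‑driven pattern above can be sketched in Python. This is only an illustration of how one template covers all tables; the schema, table, and watermark‑column names and the storage path are assumptions, and in a real pipeline these strings would be produced by Data Factory dynamic expressions rather than Python:

```python
from dataclasses import dataclass

@dataclass
class TableConfig:
    """One row of the hypothetical metadata/control table driving the pipeline."""
    schema: str
    table: str
    watermark_column: str  # timestamp column used for incremental loads

def build_extract_query(cfg: TableConfig, last_watermark: str) -> str:
    """Oracle source query that fetches only rows changed since the last run."""
    return (
        f"SELECT * FROM {cfg.schema}.{cfg.table} "
        f"WHERE {cfg.watermark_column} > "
        f"TO_TIMESTAMP('{last_watermark}', 'YYYY-MM-DD HH24:MI:SS')"
    )

def build_copy_statement(cfg: TableConfig, staging_root: str) -> str:
    """Synapse COPY statement that loads the staged Parquet files."""
    return (
        f"COPY INTO dbo.{cfg.table} "
        f"FROM '{staging_root}/{cfg.schema}/{cfg.table}/' "
        f"WITH (FILE_TYPE = 'PARQUET')"
    )

if __name__ == "__main__":
    # Illustrative config; in practice this loop runs over ~60 control-table rows.
    cfg = TableConfig("SALES", "ORDERS", "LAST_UPDATED")
    print(build_extract_query(cfg, "2024-01-01 00:00:00"))
    print(build_copy_statement(cfg, "https://mylake.dfs.core.windows.net/staging"))
```

Keeping the per‑table specifics in a control table means adding a new table to the daily load is a metadata change, not a new pipeline.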
This answer was generated by AI for study purposes. Use it as a starting point — personalize it with your own experience.