Provide code for SCD Type 2.
💡 Model Answer
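SCD (Slowly Changing Dimension) Type 2 preserves history by inserting a new row, with its own surrogate key and validity dates, each time a tracked attribute changes, instead of overwriting the old row. Below is a sample SQL implementation using a staging table (stg_customer) and a dimension table (dim_customer).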
-- Syntax follows PostgreSQL (UPDATE ... FROM, NEXTVAL, IS DISTINCT FROM).
-- 1. Identify new and changed records
--    A temporary table is used because a CTE is visible only to the single
--    statement it is attached to, and steps 2 and 3 both need this result.
CREATE TEMPORARY TABLE changes AS
SELECT
    s.customer_id,
    s.name,
    s.email,
    s.address,
    d.customer_key,
    d.end_date
FROM stg_customer s
LEFT JOIN dim_customer d
    ON s.customer_id = d.customer_id
    AND d.end_date = DATE '9999-12-31'
WHERE d.customer_key IS NULL                  -- no active row: new customer
   OR s.name IS DISTINCT FROM d.name          -- NULL-safe comparisons
   OR s.email IS DISTINCT FROM d.email
   OR s.address IS DISTINCT FROM d.address;

-- 2. Close existing rows
UPDATE dim_customer
SET end_date = CURRENT_DATE - INTERVAL '1' DAY
FROM changes c
WHERE dim_customer.customer_id = c.customer_id
  AND dim_customer.end_date = DATE '9999-12-31';

-- 3. Insert new rows
INSERT INTO dim_customer (customer_key, customer_id, name, email, address, start_date, end_date)
SELECT
    NEXTVAL('dim_customer_seq') AS customer_key,
    customer_id,
    name,
    email,
    address,
    CURRENT_DATE AS start_date,
    DATE '9999-12-31' AS end_date
FROM changes;

DROP TABLE changes;
Explanation:
- The changes query finds records that are new or have changed compared to the current active row (where end_date is the future sentinel, 9999-12-31).
- The UPDATE step sets the end_date of the old row to yesterday, effectively closing it.
- The INSERT step adds a new row with a fresh surrogate key and the current date as start_date.
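To make the three steps concrete, here is a minimal sketch that runs them end to end against an in-memory SQLite database through Python's standard sqlite3 module. SQLite only stands in for the warehouse here: the function names apply_scd2 and demo, the sample customer, and the load dates are all hypothetical, the change set is held in a Python list rather than in the database, and the surrogate key comes from SQLite's INTEGER PRIMARY KEY instead of a sequence.

```python
import sqlite3
from datetime import date, timedelta

SENTINEL = "9999-12-31"  # open-ended end_date that marks the active row


def apply_scd2(conn: sqlite3.Connection, load_date: date) -> int:
    """Run one SCD Type 2 load; return the number of new or changed customers."""
    # Step 1: staging rows with no active dimension row, or with a differing
    # attribute. "IS NOT" is SQLite's NULL-safe inequality operator.
    changes = conn.execute(
        """
        SELECT s.customer_id, s.name, s.email, s.address
        FROM stg_customer s
        LEFT JOIN dim_customer d
          ON s.customer_id = d.customer_id AND d.end_date = ?
        WHERE d.customer_key IS NULL
           OR s.name IS NOT d.name
           OR s.email IS NOT d.email
           OR s.address IS NOT d.address
        """,
        (SENTINEL,),
    ).fetchall()

    # Step 2: close the active row of each changed customer as of yesterday.
    closed_on = (load_date - timedelta(days=1)).isoformat()
    conn.executemany(
        "UPDATE dim_customer SET end_date = ? WHERE customer_id = ? AND end_date = ?",
        [(closed_on, row[0], SENTINEL) for row in changes],
    )

    # Step 3: insert the new versions; SQLite assigns customer_key.
    conn.executemany(
        "INSERT INTO dim_customer"
        " (customer_id, name, email, address, start_date, end_date)"
        " VALUES (?, ?, ?, ?, ?, ?)",
        [(*row, load_date.isoformat(), SENTINEL) for row in changes],
    )
    conn.commit()
    return len(changes)


def demo() -> list:
    """Load a hypothetical customer twice, changing the address in between."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE stg_customer (customer_id INTEGER, name TEXT, email TEXT, address TEXT);
        CREATE TABLE dim_customer (
            customer_key INTEGER PRIMARY KEY,  -- surrogate key
            customer_id INTEGER, name TEXT, email TEXT, address TEXT,
            start_date TEXT, end_date TEXT
        );
        INSERT INTO stg_customer VALUES (1, 'Ada', 'ada@example.com', '1 Main St');
        """
    )
    apply_scd2(conn, date(2024, 1, 1))  # first load: customer 1 is new
    conn.execute("UPDATE stg_customer SET address = '2 Oak Ave' WHERE customer_id = 1")
    apply_scd2(conn, date(2024, 6, 1))  # second load: address changed
    return conn.execute(
        "SELECT customer_key, address, start_date, end_date"
        " FROM dim_customer ORDER BY customer_key"
    ).fetchall()


if __name__ == "__main__":
    for row in demo():
        print(row)
```

After the second load the dimension should hold two rows for the same customer_id: the original address closed on the day before the load, and the new address open-ended. Running a load again with unchanged staging data should find no changes and leave the table as it is.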
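This pattern records every change detected between loads, and queries can filter on start_date/end_date to reconstruct the state at any point in time. Because validity is tracked at day granularity, it assumes at most one load per day; a second load on the same day would close a row with an end_date earlier than its start_date. The cost is dominated by the join between the staging table and the active dimension rows: roughly linear in the number of staging rows n when the join on customer_id is served by an index or a hash join, plus work proportional to the m affected dimension rows for the update and insert.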
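As an illustration of the point-in-time filter, the sketch below looks up the version of a customer that was valid on a given date. It again uses Python's standard sqlite3 module as a stand-in for the warehouse; the helper names build_sample and customer_as_of and the two sample rows are hypothetical. Dates are stored as ISO-8601 strings, which compare correctly as text.

```python
import sqlite3


def build_sample() -> sqlite3.Connection:
    """Create a dim_customer table holding two versions of one customer."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_customer (
            customer_key INTEGER PRIMARY KEY,
            customer_id INTEGER, name TEXT, address TEXT,
            start_date TEXT, end_date TEXT
        );
        INSERT INTO dim_customer VALUES
            (1, 1, 'Ada', '1 Main St', '2024-01-01', '2024-05-31'),
            (2, 1, 'Ada', '2 Oak Ave', '2024-06-01', '9999-12-31');
        """
    )
    return conn


def customer_as_of(conn: sqlite3.Connection, customer_id: int, as_of: str):
    """Return (name, address) valid on the ISO date as_of, or None."""
    return conn.execute(
        """
        SELECT name, address
        FROM dim_customer
        WHERE customer_id = ?
          AND ? BETWEEN start_date AND end_date  -- both bounds inclusive
        """,
        (customer_id, as_of),
    ).fetchone()


if __name__ == "__main__":
    conn = build_sample()
    print(customer_as_of(conn, 1, "2024-03-15"))
    print(customer_as_of(conn, 1, "2024-06-01"))
```

The inclusive BETWEEN works here because each closed row ends the day before its successor starts, so no date falls into two versions. A date before the first start_date matches no row.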