Solution Pattern - SQL Server Family - Joining Objects in the Persistent Staging Area
Purpose
Section titled “Purpose”This Solution Pattern shows how to join two time-variant data objects from the Persistent Staging Area (PSA) into a single, continuous timeline using set-based SQL.
Motivation
Section titled “Motivation”Combining two time-variant objects means reconciling two independent timelines into one. Each object changes on its own cadence — a customer’s details and a category’s details each have their own change history — so the combined result must reflect a change whenever either object changes.
Joining directly on the key alone would lose this temporal detail, and joining on exact timestamps would miss changes that occurred between each other’s change moments. Instead, the combined timeline is assembled from both points of view and merged.
Applicability
Section titled “Applicability”This pattern applies when data from two (or more) PSA objects must be combined at a shared key while preserving the full change history of both — for example, to prepare an integrated set for loading downstream data objects, or to evaluate combined states across objects. It assumes each object carries a from timestamp per change (with the before timestamp derivable), plus the PSA’s standard columns (inscription timestamp, change data indicator, checksum).
Structure
Section titled “Structure”The statement builds the combined timeline from both perspectives:
- For each object, the effective time periods are derived at runtime:
LEADper key produces the timestamp that closes each interval (closed-open, with9999-12-31as the high-end default). - The first perspective takes every change of the first object and joins it to the period of the second object that was in effect at that moment (
from >= start AND from < end). - The second perspective does the same in reverse, so changes of the second object that fall between changes of the first are also represented.
- A
UNION ALLmerges both perspectives, and a finalLEADover the combined set derives the before date that closes each interval — producing a single timeline with no gaps or overlaps.
Three runnable variants are provided. The complete example:
/******************************************************************************* * Data Engine Thinking ******************************************************************************* * * Purpose: * - Orchestration example for joining objects in the PSA (variant 1). * * Disclaimer: * - See disclaimer.md in the repository root. * ******************************************************************************/
/* Drop the table, if it exists */IF OBJECT_ID('tempdb..#Customer') IS NOT NULL DROP TABLE #Customer
/* Create the customer table */CREATE TABLE #Customer( [INSCRIPTION_TIMESTAMP] datetime2(7) ,[INSCRIPTION_RECORD_ID] int ,[SOURCE_TIMESTAMP] datetime2(7) ,[CHANGE_DATA_INDICATOR] char(1) ,[AUDIT_TRAIL_ID] int ,[CHECKSUM] binary(16) ,[CustomerID] integer NOT NULL ,[CategoryID] varchar(100) NOT NULL ,[FavoriteColor] varchar(100) NULL ,[EffectiveDate] date NULL)
INSERT INTO #CustomerSELECT [INSCRIPTION_TIMESTAMP] ,[INSCRIPTION_RECORD_ID] ,[SOURCE_TIMESTAMP] ,[CHANGE_DATA_INDICATOR] ,[AUDIT_TRAIL_ID] ,HASHBYTES('MD5', ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[CustomerID])), 'N/A') + '#~!' + ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[CategoryID])), 'N/A') + '#~!' + ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[FavoriteColor])), 'N/A') + '#~!' + ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[EffectiveDate])), 'N/A') + '#~!' ) AS [CHECKSUM] ,[CustomerID] ,[CategoryID] ,[FavoriteColor] ,[EffectiveDate]FROM(-- Record 1SELECT CONVERT(DATETIME2(7),'2023-01-24 08:40:13') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2023-01-22 18:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],100 AS [CustomerID],'A' AS [CategoryID],'Yellow' AS [FavoriteColor],'2023-01-22' AS [EffectiveDate]UNION-- Record 2SELECT CONVERT(DATETIME2(7),'2023-03-30 01:40:00') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2023-03-29 08:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],100 AS [CustomerID],'A' AS [CategoryID],'Orange' AS [FavoriteColor],'2023-03-29' AS [EffectiveDate]UNION-- Record 3SELECT CONVERT(DATETIME2(7),'2023-06-22 07:00:00') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2023-06-22 05:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],100 AS [CustomerID],'A' AS [CategoryID],'Purple' AS [FavoriteColor],'2023-06-22' AS [EffectiveDate]) sub
SELECT * FROM #Customer
/* Category */
/* Drop the table, if it exists */IF OBJECT_ID('tempdb..#Category') IS NOT NULL DROP TABLE #Category
/* Create the category table */CREATE TABLE #Category( [INSCRIPTION_TIMESTAMP] datetime2(7) ,[INSCRIPTION_RECORD_ID] int ,[SOURCE_TIMESTAMP] datetime2(7) ,[CHANGE_DATA_INDICATOR] char(1) ,[AUDIT_TRAIL_ID] int ,[CHECKSUM] binary(16) ,[CategoryID] varchar(100) NOT NULL ,[CategoryGroupID] varchar(100) NULL ,[CategoryName] varchar(100) NULL ,[EffectiveDate] date NULL)
INSERT INTO #CategorySELECT [INSCRIPTION_TIMESTAMP] ,[INSCRIPTION_RECORD_ID] ,[SOURCE_TIMESTAMP] ,[CHANGE_DATA_INDICATOR] ,[AUDIT_TRAIL_ID] ,HASHBYTES('MD5', ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[CategoryID])), 'N/A') + '#~!' + ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[CategoryGroupID])), 'N/A') + '#~!' + ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[CategoryName])), 'N/A') + '#~!' + ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[EffectiveDate])), 'N/A') + '#~!' ) AS [CHECKSUM] ,[CategoryID] ,[CategoryGroupID] ,[CategoryName] ,[EffectiveDate]FROM(-- Record 1SELECT CONVERT(DATETIME2(7),'2018-01-01 00:00:00') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2018-01-01 00:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],'A' AS [CategoryID],'Group10' AS [CategoryGroupID],'FirstCategoryName' AS [CategoryName],'2018-01-01' AS [EffectiveDate]UNION-- Record 2SELECT CONVERT(DATETIME2(7),'2020-01-01 00:00:00') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2020-01-01 00:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],'A' AS [CategoryID],'Group10' AS [CategoryGroupID],'SecondCategoryName' AS [CategoryName],'2020-01-01' AS [EffectiveDate]UNION-- Record 3SELECT CONVERT(DATETIME2(7),'2023-02-01 00:00:00') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2023-02-01 00:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],'A' AS [CategoryID],'Group10' AS [CategoryGroupID],'ThirdCategoryName' AS [CategoryName],'2023-02-01' AS [EffectiveDate]UNION-- Record 4SELECT CONVERT(DATETIME2(7),'2024-01-01 00:00:00') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2024-01-01 00:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],'A' AS [CategoryID],'Group10' AS [CategoryGroupID],'FourthCategoryName' AS [CategoryName],'2024-01-01' AS [EffectiveDate]) sub
SELECT * FROM #Category
/* Category Group*/
/* Drop the table, if it exists */IF OBJECT_ID('tempdb..#CategoryGroup') IS NOT NULL DROP TABLE #CategoryGroup
/* Create the category group table */CREATE TABLE #CategoryGroup( [INSCRIPTION_TIMESTAMP] datetime2(7) ,[INSCRIPTION_RECORD_ID] int ,[SOURCE_TIMESTAMP] datetime2(7) ,[CHANGE_DATA_INDICATOR] char(1) ,[AUDIT_TRAIL_ID] int ,[CHECKSUM] binary(16) ,[CategoryGroupID] varchar(100) NOT NULL ,[CategoryGroupName] varchar(100) NULL ,[EffectiveDate] date NULL)
INSERT INTO #CategoryGroupSELECT [INSCRIPTION_TIMESTAMP] ,[INSCRIPTION_RECORD_ID] ,[SOURCE_TIMESTAMP] ,[CHANGE_DATA_INDICATOR] ,[AUDIT_TRAIL_ID] ,HASHBYTES('MD5', ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[CategoryGroupID])), 'N/A') + '#~!' + ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[CategoryGroupName])), 'N/A') + '#~!' ) AS [CHECKSUM] ,[CategoryGroupID] ,[CategoryGroupName] ,[EffectiveDate]FROM(-- Record 1SELECT CONVERT(DATETIME2(7),'2017-01-01 00:00:00') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2017-01-01 00:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],'Group10' AS [CategoryGroupID],'TopGroup' AS [CategoryGroupName],'2017-01-01' AS [EffectiveDate]UNION-- Record 2SELECT CONVERT(DATETIME2(7),'2023-02-25 00:00:00') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2023-02-25 00:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],'Group10' AS [CategoryGroupID],'TopGroupUpdated' AS [CategoryGroupName],'2023-02-25' AS [EffectiveDate]) sub
SELECT * FROM #CategoryGroup
/* Join */
/* Customer perspective*/SELECT cust.EffectiveDate ,cust.INSCRIPTION_TIMESTAMP ,cust.SOURCE_TIMESTAMP ,cust.AUDIT_TRAIL_ID ,cust.INSCRIPTION_RECORD_ID ,cust.CHANGE_DATA_INDICATOR ,cust.CustomerID ,cust.FavoriteColor ,cat.CategoryID ,cat.CategoryName ,grp.CategoryGroupNameFROM #Customer custLEFT JOIN( SELECT EffectiveDate AS START_DATE_KEY ,LEAD(EffectiveDate,1,'9999-12-31') OVER (PARTITION BY CategoryId ORDER BY EffectiveDate ASC) AS END_DATE_KEY ,* FROM #Category) catON cust.CategoryID = cat.CategoryIdAND cust.EffectiveDate >= cat.START_DATE_KEY AND cust.EffectiveDate < cat.END_DATE_KEYLEFT JOIN( SELECT EffectiveDate AS START_DATE_KEY ,LEAD(EffectiveDate,1,'9999-12-31') OVER (PARTITION BY CategoryGroupId ORDER BY EffectiveDate ASC) AS END_DATE_KEY ,* FROM #CategoryGroup) grpON cat.CategoryGroupId = grp.CategoryGroupIDAND cust.EffectiveDate >= grp.START_DATE_KEY AND cust.EffectiveDate < grp.END_DATE_KEY
UNION ALL
/* Category perspective*/SELECT cat.EffectiveDate ,cat.INSCRIPTION_TIMESTAMP ,cat.SOURCE_TIMESTAMP ,cat.AUDIT_TRAIL_ID ,cat.INSCRIPTION_RECORD_ID ,cat.CHANGE_DATA_INDICATOR ,cust.CustomerID ,cust.FavoriteColor ,cat.CategoryID ,cat.CategoryName ,grp.CategoryGroupNameFROM #Category catINNER JOIN( SELECT EffectiveDate AS START_DATE_KEY ,LEAD(EffectiveDate,1,'9999-12-31') OVER (PARTITION BY CustomerId ORDER BY EffectiveDate ASC) AS END_DATE_KEY ,* FROM #Customer) custON cat.CategoryID = cust.CategoryIDAND cat.EffectiveDate >= cust.START_DATE_KEY AND cat.EffectiveDate < cust.END_DATE_KEYLEFT JOIN( SELECT EffectiveDate AS START_DATE_KEY ,LEAD(EffectiveDate,1,'9999-12-31') OVER (PARTITION BY CategoryGroupId ORDER BY EffectiveDate ASC) AS END_DATE_KEY ,* FROM #CategoryGroup) grpON cat.CategoryGroupId = grp.CategoryGroupIDAND cat.EffectiveDate >= grp.START_DATE_KEY AND cat.EffectiveDate < grp.END_DATE_KEY
UNION ALL
/* Category Group perspective*/SELECT grp.EffectiveDate ,grp.INSCRIPTION_TIMESTAMP ,grp.SOURCE_TIMESTAMP ,grp.AUDIT_TRAIL_ID ,grp.INSCRIPTION_RECORD_ID ,grp.CHANGE_DATA_INDICATOR ,cust.CustomerID ,cust.FavoriteColor ,cat.CategoryID ,cat.CategoryName ,grp.CategoryGroupNameFROM #CategoryGroup grpINNER JOIN( SELECT EffectiveDate AS START_DATE_KEY ,LEAD(EffectiveDate,1,'9999-12-31') OVER (PARTITION BY CategoryId ORDER BY EffectiveDate ASC) AS END_DATE_KEY ,* FROM #Category) catON grp.CategoryGroupId = cat.CategoryGroupIDAND grp.EffectiveDate >= cat.START_DATE_KEY AND grp.EffectiveDate < cat.END_DATE_KEYINNER JOIN( SELECT EffectiveDate AS START_DATE_KEY ,LEAD(EffectiveDate,1,'9999-12-31') OVER (PARTITION BY CustomerId ORDER BY EffectiveDate ASC) AS END_DATE_KEY ,* FROM #Customer) custON cat.CategoryID = cust.CategoryIDAND grp.EffectiveDate >= cust.START_DATE_KEY AND grp.EffectiveDate < cust.END_DATE_KEY
ORDER BY 1A pared-down version that is easier to follow:
/******************************************************************************* * Data Engine Thinking ******************************************************************************* * * Purpose: * - Simplified orchestration example for joining objects in the PSA. * * Disclaimer: * - See disclaimer.md in the repository root. * ******************************************************************************/
/* Drop the table, if it exists */IF OBJECT_ID('tempdb..#Customer') IS NOT NULL DROP TABLE #Customer
/* Create the customer table */CREATE TABLE #Customer( [INSCRIPTION_TIMESTAMP] datetime2(7) ,[INSCRIPTION_RECORD_ID] int ,[SOURCE_TIMESTAMP] datetime2(7) ,[CHANGE_DATA_INDICATOR] char(1) ,[AUDIT_TRAIL_ID] int ,[CHECKSUM] binary(16) ,[CustomerID] integer NOT NULL ,[CategoryID] varchar(100) NOT NULL ,[FavoriteColor] varchar(100) NULL ,[EffectiveDate] date NULL)
INSERT INTO #CustomerSELECT [INSCRIPTION_TIMESTAMP] ,[INSCRIPTION_RECORD_ID] ,[SOURCE_TIMESTAMP] ,[CHANGE_DATA_INDICATOR] ,[AUDIT_TRAIL_ID] ,HASHBYTES('MD5', ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[CustomerID])), 'N/A') + '#~!' + ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[CategoryID])), 'N/A') + '#~!' + ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[FavoriteColor])), 'N/A') + '#~!' + ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[EffectiveDate])), 'N/A') + '#~!' ) AS [CHECKSUM] ,[CustomerID] ,[CategoryID] ,[FavoriteColor] ,[EffectiveDate]FROM(-- Record 1SELECT CONVERT(DATETIME2(7),'2023-01-24 08:40:13') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2023-01-22 18:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],100 AS [CustomerID],'A' AS [CategoryID],'Yellow' AS [FavoriteColor],'2023-01-22' AS [EffectiveDate]UNION-- Record 2SELECT CONVERT(DATETIME2(7),'2023-03-30 01:40:00') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2023-03-29 08:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],100 AS [CustomerID],'A' AS [CategoryID],'Orange' AS [FavoriteColor],'2023-03-29' AS [EffectiveDate]UNION-- Record 3SELECT CONVERT(DATETIME2(7),'2023-06-22 07:00:00') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2023-06-22 05:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],100 AS [CustomerID],'A' AS [CategoryID],'Purple' AS [FavoriteColor],'2023-06-22' AS [EffectiveDate]) sub
SELECT * FROM #Customer
/* Category */
/* Drop the table, if it exists */IF OBJECT_ID('tempdb..#Category') IS NOT NULL DROP TABLE #Category
/* Create the category table */CREATE TABLE #Category( [INSCRIPTION_TIMESTAMP] datetime2(7) ,[INSCRIPTION_RECORD_ID] int ,[SOURCE_TIMESTAMP] datetime2(7) ,[CHANGE_DATA_INDICATOR] char(1) ,[AUDIT_TRAIL_ID] int ,[CHECKSUM] binary(16) ,[CategoryID] varchar(100) NOT NULL ,[CategoryGroupID] varchar(100) NULL ,[CategoryName] varchar(100) NULL ,[EffectiveDate] date NULL)
INSERT INTO #CategorySELECT [INSCRIPTION_TIMESTAMP] ,[INSCRIPTION_RECORD_ID] ,[SOURCE_TIMESTAMP] ,[CHANGE_DATA_INDICATOR] ,[AUDIT_TRAIL_ID] ,HASHBYTES('MD5', ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[CategoryID])), 'N/A') + '#~!' + ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[CategoryGroupID])), 'N/A') + '#~!' + ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[CategoryName])), 'N/A') + '#~!' + ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[EffectiveDate])), 'N/A') + '#~!' ) AS [CHECKSUM] ,[CategoryID] ,[CategoryGroupID] ,[CategoryName] ,[EffectiveDate]FROM(-- Record 1SELECT CONVERT(DATETIME2(7),'2018-01-01 00:00:00') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2018-01-01 00:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],'A' AS [CategoryID],'Group10' AS [CategoryGroupID],'FirstCategoryName' AS [CategoryName],'2018-01-01' AS [EffectiveDate]UNION-- Record 2SELECT CONVERT(DATETIME2(7),'2020-01-01 00:00:00') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2020-01-01 00:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],'A' AS [CategoryID],'Group10' AS [CategoryGroupID],'SecondCategoryName' AS [CategoryName],'2020-01-01' AS [EffectiveDate]UNION-- Record 3SELECT CONVERT(DATETIME2(7),'2023-02-01 00:00:00') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2023-02-01 00:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],'A' AS [CategoryID],'Group10' AS [CategoryGroupID],'ThirdCategoryName' AS [CategoryName],'2023-02-01' AS [EffectiveDate]UNION-- Record 4SELECT CONVERT(DATETIME2(7),'2024-01-01 00:00:00') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2024-01-01 00:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],'A' AS [CategoryID],'Group10' AS [CategoryGroupID],'FourthCategoryName' AS [CategoryName],'2024-01-01' AS [EffectiveDate]) sub
SELECT * FROM #Category
/* Join Query */
SELECT * ,LEAD(EffectiveDate,1,'9999-12-31') OVER (PARTITION BY CustomerId ORDER BY EffectiveDate ASC) AS EffectiveBeforeDateFROM(/* Customer perspective*/SELECT cust.EffectiveDate ,cust.INSCRIPTION_TIMESTAMP ,cust.SOURCE_TIMESTAMP ,cust.AUDIT_TRAIL_ID ,cust.INSCRIPTION_RECORD_ID ,cust.CHANGE_DATA_INDICATOR ,cust.CustomerID ,cust.FavoriteColor ,cat.CategoryID ,cat.CategoryNameFROM #Customer custLEFT JOIN( SELECT EffectiveDate AS START_DATE_KEY ,LEAD(EffectiveDate,1,'9999-12-31') OVER (PARTITION BY CategoryId ORDER BY EffectiveDate ASC) AS END_DATE_KEY ,* FROM #Category) catON cust.CategoryID = cat.CategoryIdAND cust.EffectiveDate >= cat.START_DATE_KEY AND cust.EffectiveDate < cat.END_DATE_KEY
UNION ALL
/* Category perspective*/SELECT cat.EffectiveDate ,cat.INSCRIPTION_TIMESTAMP ,cat.SOURCE_TIMESTAMP ,cat.AUDIT_TRAIL_ID ,cat.INSCRIPTION_RECORD_ID ,cat.CHANGE_DATA_INDICATOR ,cust.CustomerID ,cust.FavoriteColor ,cat.CategoryID ,cat.CategoryNameFROM #Category catINNER JOIN( SELECT EffectiveDate AS START_DATE_KEY ,LEAD(EffectiveDate,1,'9999-12-31') OVER (PARTITION BY CustomerId ORDER BY EffectiveDate ASC) AS END_DATE_KEY ,* FROM #Customer) custON cat.CategoryID = cust.CategoryIDAND cat.EffectiveDate >= cust.START_DATE_KEY AND cat.EffectiveDate < cust.END_DATE_KEY) finalORDER BY 1An alternative formulation:
/******************************************************************************* * Data Engine Thinking ******************************************************************************* * * Purpose: * - Orchestration example for joining objects in the PSA (variant 2). * * Disclaimer: * - See disclaimer.md in the repository root. * ******************************************************************************/
/* Drop the table, if it exists */IF OBJECT_ID('tempdb..#Customer') IS NOT NULL DROP TABLE #Customer
/* Create the customer table */CREATE TABLE #Customer( [INSCRIPTION_TIMESTAMP] datetime2(7) ,[INSCRIPTION_RECORD_ID] int ,[SOURCE_TIMESTAMP] datetime2(7) ,[CHANGE_DATA_INDICATOR] char(1) ,[AUDIT_TRAIL_ID] int ,[CHECKSUM] binary(16) ,[CustomerID] integer NOT NULL ,[CategoryID] varchar(100) NOT NULL ,[FavoriteColor] varchar(100) NULL ,[EffectiveDate] date NULL)
INSERT INTO #CustomerSELECT [INSCRIPTION_TIMESTAMP] ,[INSCRIPTION_RECORD_ID] ,[SOURCE_TIMESTAMP] ,[CHANGE_DATA_INDICATOR] ,[AUDIT_TRAIL_ID] ,HASHBYTES('MD5', ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[CustomerID])), 'N/A') + '#~!' + ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[CategoryID])), 'N/A') + '#~!' + ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[FavoriteColor])), 'N/A') + '#~!' + ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[EffectiveDate])), 'N/A') + '#~!' ) AS [CHECKSUM] ,[CustomerID] ,[CategoryID] ,[FavoriteColor] ,[EffectiveDate]FROM(-- Record 1SELECT CONVERT(DATETIME2(7),'2023-01-24 08:40:13') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2023-01-22 18:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],100 AS [CustomerID],'A' AS [CategoryID],'Yellow' AS [FavoriteColor],'2023-01-22' AS [EffectiveDate]UNION-- Record 2SELECT CONVERT(DATETIME2(7),'2023-03-30 01:40:00') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2023-03-29 08:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],100 AS [CustomerID],'A' AS [CategoryID],'Orange' AS [FavoriteColor],'2023-03-29' AS [EffectiveDate]UNION-- Record 3SELECT CONVERT(DATETIME2(7),'2023-06-22 07:00:00') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2023-06-22 05:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],100 AS [CustomerID],'A' AS [CategoryID],'Purple' AS [FavoriteColor],'2023-06-22' AS [EffectiveDate]) sub
SELECT * FROM #Customer
/* Category */
/* Drop the table, if it exists */IF OBJECT_ID('tempdb..#Category') IS NOT NULL DROP TABLE #Category
/* Create the category table */CREATE TABLE #Category( [INSCRIPTION_TIMESTAMP] datetime2(7) ,[INSCRIPTION_RECORD_ID] int ,[SOURCE_TIMESTAMP] datetime2(7) ,[CHANGE_DATA_INDICATOR] char(1) ,[AUDIT_TRAIL_ID] int ,[CHECKSUM] binary(16) ,[CategoryID] varchar(100) NOT NULL ,[CategoryGroupID] varchar(100) NULL ,[CategoryName] varchar(100) NULL ,[EffectiveDate] date NULL)
INSERT INTO #CategorySELECT [INSCRIPTION_TIMESTAMP] ,[INSCRIPTION_RECORD_ID] ,[SOURCE_TIMESTAMP] ,[CHANGE_DATA_INDICATOR] ,[AUDIT_TRAIL_ID] ,HASHBYTES('MD5', ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[CategoryID])), 'N/A') + '#~!' + ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[CategoryGroupID])), 'N/A') + '#~!' + ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[CategoryName])), 'N/A') + '#~!' + ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[EffectiveDate])), 'N/A') + '#~!' ) AS [CHECKSUM] ,[CategoryID] ,[CategoryGroupID] ,[CategoryName] ,[EffectiveDate]FROM(-- Record 1SELECT CONVERT(DATETIME2(7),'2018-01-01 00:00:00') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2018-01-01 00:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],'A' AS [CategoryID],'Group10' AS [CategoryGroupID],'FirstCategoryName' AS [CategoryName],'2018-01-01' AS [EffectiveDate]UNION-- Record 2SELECT CONVERT(DATETIME2(7),'2020-01-01 00:00:00') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2020-01-01 00:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],'A' AS [CategoryID],'Group10' AS [CategoryGroupID],'SecondCategoryName' AS [CategoryName],'2020-01-01' AS [EffectiveDate]UNION-- Record 3SELECT CONVERT(DATETIME2(7),'2023-02-01 00:00:00') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2023-02-01 00:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],'A' AS [CategoryID],'Group10' AS [CategoryGroupID],'ThirdCategoryName' AS [CategoryName],'2023-02-01' AS [EffectiveDate]UNION-- Record 4SELECT CONVERT(DATETIME2(7),'2024-01-01 00:00:00') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2024-01-01 00:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],'A' AS [CategoryID],'Group10' AS [CategoryGroupID],'FourthCategoryName' AS [CategoryName],'2024-01-01' AS [EffectiveDate]) sub
SELECT * FROM #Category
/* Category Group*/
/* Drop the table, if it exists */IF OBJECT_ID('tempdb..#CategoryGroup') IS NOT NULL DROP TABLE #CategoryGroup
/* Create the category group table */CREATE TABLE #CategoryGroup( [INSCRIPTION_TIMESTAMP] datetime2(7) ,[INSCRIPTION_RECORD_ID] int ,[SOURCE_TIMESTAMP] datetime2(7) ,[CHANGE_DATA_INDICATOR] char(1) ,[AUDIT_TRAIL_ID] int ,[CHECKSUM] binary(16) ,[CategoryGroupID] varchar(100) NOT NULL ,[CategoryGroupName] varchar(100) NULL ,[EffectiveDate] date NULL)
INSERT INTO #CategoryGroupSELECT [INSCRIPTION_TIMESTAMP] ,[INSCRIPTION_RECORD_ID] ,[SOURCE_TIMESTAMP] ,[CHANGE_DATA_INDICATOR] ,[AUDIT_TRAIL_ID] ,HASHBYTES('MD5', ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[CategoryGroupID])), 'N/A') + '#~!' + ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),[CategoryGroupName])), 'N/A') + '#~!' ) AS [CHECKSUM] ,[CategoryGroupID] ,[CategoryGroupName] ,[EffectiveDate]FROM(-- Record 1SELECT CONVERT(DATETIME2(7),'2017-01-01 00:00:00') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2017-01-01 00:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],'Group10' AS [CategoryGroupID],'TopGroup' AS [CategoryGroupName],'2017-01-01' AS [EffectiveDate]UNION-- Record 2SELECT CONVERT(DATETIME2(7),'2023-02-25 00:00:00') AS [INSCRIPTION_TIMESTAMP],1 AS [INSCRIPTION_RECORD_ID],CONVERT(DATETIME2(7),'2023-02-25 00:00:00') AS [SOURCE_TIMESTAMP],'C' AS [CHANGE_DATA_INDICATOR],0 AS [AUDIT_TRAIL_ID],'Group10' AS [CategoryGroupID],'TopGroupUpdated' AS [CategoryGroupName],'2023-02-25' AS [EffectiveDate]) sub
SELECT * FROM #CategoryGroup
/* Join */SELECT START_DATE_KEY AS EffectiveDate ,CustomerID ,FavoriteColor ,CategoryName ,CategoryGroupNameFROM( SELECT *, LAG(ATTRIBUTE_CHECKSUM, 1, 0x00000000000000000000000000000000) OVER(PARTITION BY CustomerId ORDER BY START_DATE_KEY ASC) AS PREVIOUS_ATTRIBUTE_CHECKSUM FROM ( SELECT *, ISNULL(RTRIM(CONVERT(VARCHAR(100),CustomerID)),'NA')+'|'+ ISNULL(RTRIM(CONVERT(VARCHAR(100),FavoriteColor)),'NA')+'|' + ISNULL(RTRIM(CONVERT(VARCHAR(100),CategoryName)),'NA')+'|' + ISNULL(RTRIM(CONVERT(VARCHAR(100),CategoryGroupName)),'NA')+'|' AS ATTRIBUTE_CHECKSUM FROM ( SELECT cust.CustomerID ,cust.FavoriteColor ,cat.CategoryName ,grp.CategoryGroupName ,TimeRanges.START_DATE_KEY ,TimeRanges.END_DATE_KEY FROM ( SELECT CustomerID ,EffectiveDate AS START_DATE_KEY ,LEAD(EffectiveDate,1,'9999-12-31') OVER (PARTITION BY CustomerID ORDER BY EffectiveDate ASC) AS END_DATE_KEY FROM ( SELECT CustomerID, EffectiveDate FROM #Customer UNION SELECT CustomerId, #Category.EffectiveDate FROM #Category INNER JOIN #Customer ON #Customer.CategoryID = #Category.CategoryID UNION SELECT CustomerId, #CategoryGroup.EffectiveDate FROM #CategoryGroup INNER JOIN #Category ON #CategoryGroup.CategoryGroupID = #Category.CategoryGroupID INNER JOIN #Customer ON #Category.CategoryID = #Customer.CategoryID ) PIT ) TimeRanges INNER JOIN ( SELECT EffectiveDate AS START_DATE_KEY ,LEAD(EffectiveDate,1,'9999-12-31') OVER (PARTITION BY CustomerId ORDER BY EffectiveDate ASC) AS END_DATE_KEY ,* FROM #Customer ) cust ON TimeRanges.CustomerID = cust.CustomerID AND cust.START_DATE_KEY <= TimeRanges.START_DATE_KEY AND cust.END_DATE_KEY >= TimeRanges.END_DATE_KEY LEFT JOIN ( SELECT EffectiveDate AS START_DATE_KEY ,LEAD(EffectiveDate,1,'9999-12-31') OVER (PARTITION BY CategoryId ORDER BY EffectiveDate ASC) AS END_DATE_KEY ,* FROM #Category ) cat ON cust.CategoryID = cat.CategoryID AND cat.START_DATE_KEY <= TimeRanges.START_DATE_KEY AND cat.END_DATE_KEY >= TimeRanges.END_DATE_KEY LEFT JOIN ( SELECT EffectiveDate AS START_DATE_KEY ,LEAD(EffectiveDate,1,'9999-12-31') OVER (PARTITION BY CategoryGroupId ORDER BY EffectiveDate ASC) AS END_DATE_KEY ,* FROM #CategoryGroup ) grp ON cat.CategoryGroupId = grp.CategoryGroupId AND grp.START_DATE_KEY <= TimeRanges.START_DATE_KEY AND grp.END_DATE_KEY >= TimeRanges.END_DATE_KEY ) sub ) comparison) finalWHERE ATTRIBUTE_CHECKSUM <> PREVIOUS_ATTRIBUTE_CHECKSUMFor a detailed walkthrough of this approach, see the blog post Joining tables in the persistent staging area.
Implementation guidelines
Section titled “Implementation guidelines”- Derive each object’s time periods at runtime with
LEAD, rather than relying on persisted end dates — the derivation is then always consistent with the data being joined. - Use closed-open period comparisons (
>= start AND < end) so each change maps to exactly one period of the other object, without double-counting boundary moments. - Use a
LEFT JOINfor the leading perspective when changes may precede any period of the joined object (no match yet), and anINNER JOINfor the reverse perspective to avoid duplicating the unmatched rows. - Carry the PSA’s standard columns (inscription timestamp, inscription record identifier, change data indicator, audit trail identifier) through the join, so the combined set remains traceable.
- Apply the final
LEADover the merged set partitioned by the driving key, ordered by the effective date, to close the combined intervals.
Considerations and consequences
Section titled “Considerations and consequences”- The combined timeline contains a record for every change in either object, so the result is more granular than either input. Identical consecutive states can appear when a change in one object does not affect the selected columns of the other; compacting can be applied afterwards if this redundancy is unwanted.
- The pattern generalises to more than two objects by applying it iteratively, but each additional object multiplies the timeline fragments; combining many objects this way warrants performance testing.
- Window functions over large PSA objects benefit from indexing on the key and timestamp columns used in the
PARTITION BYandORDER BYclauses. - This join resolves timelines at read time in the staging layer. In a bitemporal integration layer the same concerns are addressed structurally when loading Satellites.