Solution Pattern - SQL Server Family - Creating a Fact Object
Purpose
Section titled “Purpose”This Solution Pattern shows how to create a fact object from a Point-In-Time (PIT) table and the Satellite that carries the measurable events, with a single SQL statement.
Motivation
Section titled “Motivation”Facts combine references to dimensions with measures aggregated at a chosen grain. When a PIT table already resolves which Satellite records apply per key and moment, the fact statement reduces to: join the metric Satellite by its record locator, derive the dimension keys, and aggregate the measures.
Applicability
Section titled “Applicability”This pattern applies when delivering measurable events from the integration layer as a fact object — persisted or virtual — and a PIT table exists at (or near) the fact’s intended grain.
Structure
Section titled “Structure”The statement selects from the PIT, attaches the metric Satellite, and aggregates:
- The date dimension key is derived directly from the event timestamp (formatted as
YYYYMMDD). - The dimension keys are generated at runtime with
ROW_NUMBERper dimension’s surrogate key, ordered by the state from timestamp — mirroring how the corresponding dimensions generate their keys. When the keys are persisted in the PIT, they are selected instead. - The metric Satellite is joined on the record locator the PIT carries for it: surrogate key, inscription timestamp, and inscription record identifier.
- The measures are aggregated with a
GROUP BYon the dimension references, setting the fact’s grain.
/******************************************************************************* * Data Engine Thinking ******************************************************************************* * * Purpose: * - SQL example for Creating A Fact Object. * * Disclaimer: * - See disclaimer.md in the repository root. * ******************************************************************************/
SELECT -- Date Dimension Key CONVERT(CHAR(8), sat.Transaction_Timestamp, 112) AS Transaction_Date_Key,
-- Runtime-generated dimension keys (if not persisted in PIT) ROW_NUMBER() OVER ( PARTITION BY PIT.Customer_Surrogate_Key ORDER BY PIT.State_From_Timestamp ) AS Customer_Key,
ROW_NUMBER() OVER ( PARTITION BY PIT.Product_Surrogate_Key ORDER BY PIT.State_From_Timestamp ) AS Product_Key,
-- Aggregated metric COUNT(sat.Quantity_Sold) AS Quantity_Sold
FROM PIT_Customer_Sale_Transactions PIT
-- Join to the Satellite providing the metric and timestampLEFT JOIN Sat_Product_Sale_Transaction sat ON PIT.Sat_Product_Sale_Transaction_Surrogate_Key = sat.Customer_Surrogate_Key AND PIT.Sat_Product_Sale_Transaction_Inscription_Timestamp = sat.Inscription_Timestamp AND PIT.Sat_Product_Sale_Transaction_Inscription_Record_Id = sat.Inscription_Record_Id
GROUP BY PIT.Customer_Surrogate_Key, PIT.Product_Surrogate_Key, CONVERT(CHAR(8), sat.Transaction_Timestamp, 112)Implementation guidelines
Section titled “Implementation guidelines”- Generate the statement from metadata: the PIT, the metric Satellite, the dimension references, and the measures with their aggregation functions are supplied by the fact’s mapping metadata.
- If dimension keys are generated at runtime, the generation must be identical to the corresponding dimension statements — same partition, same ordering — or facts and dimensions will not join correctly. Persisting the keys in the PIT removes this coupling and is the more robust choice.
- Derive calendar keys (such as the date key) from the event timestamp with a deterministic format, so they join the date dimension without lookups.
- Choose the
GROUP BYdeliberately: it defines the fact’s grain. Aggregating at the PIT’s grain is not required — the example aggregates to customer/product/date. - Use a
LEFT JOINfor the metric Satellite if events may be absent for some PIT records, and decide how empty aggregates should be represented.
Considerations and consequences
Section titled “Considerations and consequences”- Runtime key generation makes the fact and its dimensions a matched set: they must be produced from the same data state. A refresh of one without the other breaks the key alignment — the main argument for persisting keys when facts and dimensions are delivered independently.
- The aggregation discards event-level detail; if consumers also need the individual transactions, deliver a separate transactional fact at event grain.
- Virtualising the fact (a view over the PIT and Satellite) is viable when the aggregation is inexpensive; heavy aggregations over large PITs favour persistence.