Historical functions

FeatureQL provides specialized functions for working with temporal data, enabling point in time analysis and historical feature computation.

SCD_AT_TIME()

The function SCD_AT_TIME extracts the valid value of a feature at a specific point in time from a Slowly Changing Dimension (SCD type 2) array structure.

Syntax

SCD_AT_TIME(scd_array, point_in_time)
sql

Parameters

  • scd_array: An ARRAY of ROW structures containing time_update (TIMESTAMP) and value fields
  • point_in_time: TIMESTAMP to query the value at

Behavior

  • Returns NULL if point_in_time is before the first update
  • Returns the most recent value where time_update <= point_in_time
  • Assumes updates are ordered chronologically in the array

ACCUMULATION_AT_TIME()

The function ACCUMULATION_AT_TIME filters accumulated events based on their lifecycle timestamps, returning only events that were active at a specific point in time.

Syntax

ACCUMULATION_AT_TIME(events_array, point_in_time)
sql

Parameters

  • events_array: An ARRAY of ROW structures containing:
    • Primary key field (e.g., order_id)
    • time_create: TIMESTAMP when the event was created
    • time_update: TIMESTAMP of last update
    • time_delete: TIMESTAMP when deleted (NULL if still active)
  • point_in_time: TIMESTAMP to evaluate active events at

Behavior

  • Returns events where time_create <= point_in_time
  • Excludes events where time_delete is not NULL and time_delete <= point_in_time
  • Useful for computing metrics like "active orders at time X" or "valid subscriptions on date Y"

ADD_FIELDS with temporal context

ADD_FIELDS operator enriches records by binding temporal context from related entities. This enables point in time feature engineering across entity relationships.

Syntax

ADD_FIELDS(
    computed_feature AS alias
    TO ROW(field1 AS bind1, field2 AS bind2, ...)
    BINDING
        FIELDS bind1, bind2, ...
        WITH target_field1, target_field2, ...
)
sql

Use case

Computing historical features at the time of an event, such as:

  • Customer's order count at the time they placed each order
  • Account balance at transaction time
  • Inventory levels when orders were placed

Example

The following computes how many orders each customer had at the time they placed each order:

Best practices

  1. Data ordering: Ensure SCD arrays are chronologically ordered by time_update
  2. NULL handling: Always handle NULL returns from SCD_AT_TIME for timestamps before first update
  3. Time precision: Be explicit about timestamp precision (seconds, milliseconds) to avoid boundary issues
  4. Performance: For large accumulation arrays, consider pre filtering or indexing strategies
  5. Consistency: Use consistent timezone handling across all temporal operations

Common patterns

Computing metrics at event time

-- Get customer status when order was placed
SELECT SCD_AT_TIME(customer_status_history, order_timestamp)
sql

Rolling window aggregations

-- Count events in last 30 days from point in time
SELECT
    NUM_EVENTS_LAST_28_DAYS := ACCUMULATION_AT_TIME(events, point_in_time)  -- Only valid events at point in time
        .TRANSFORM(SELECT COUNT(1) WHERE time_create >= DATE_ADD(point_in_time, 'DAY', -28))  -- Count events in last 28 days
        .UNWRAP_ONE()  -- Unwrap the count as a scalar
sql

Historical joins

-- Enrich orders with customer features at order time
SELECT
    order_id,
    order_date,
    tier_at_order := ADD_FIELDS(
        customer_tier AS tier_at_order
        TO ROW(order_customer_id AS ocid, order_date AS odate)
        BINDING
            FIELDS ocid, odate
            WITH customer_id, point_in_time
    )
sql
Last update at: 2025/10/13 10:23:46