Build customer analytics with FeatureQL

FeatureQL replaces multi-CTE SQL with composable feature definitions. This tutorial builds customer analytics for an e-commerce dataset, progressing from simple aggregations to parameterized A/B tests.

Set up the data

The tutorial uses three tables: a customer dimension, an order fact table, and a pre-aggregated "One Big Table" (OBT) that stores each customer's last order and an array of all their order IDs.

FeatureQL
/* SQL */
--
CREATE SCHEMA IF NOT EXISTS home;
DROP TABLE IF EXISTS home.dim_customers;
DROP TABLE IF EXISTS home.fct_orders;
DROP TABLE IF EXISTS home.agg_customers_obt;
--
CREATE TABLE home.dim_customers (
  customer_id BIGINT,
  name VARCHAR,
  created_at DATE
);
INSERT INTO home.dim_customers VALUES
  (100, 'Alice', '2022-03-15'),
  (101, 'Bob', '2023-06-10'),
  (102, 'Charlie', '2024-01-20'),
  (103, 'Diana', '2024-11-01');
--
CREATE TABLE home.fct_orders (
  order_id BIGINT,
  order_customer_id BIGINT,
  price DECIMAL(10, 2),
  created_at TIMESTAMP
);
INSERT INTO home.fct_orders VALUES
  (1001, 100, 450.00, '2025-06-01 08:00:00'),
  (1002, 100, 380.00, '2025-07-15 14:45:00'),
  (1003, 101, 600.00, '2025-08-20 17:30:00'),
  (1004, 101, 550.00, '2025-10-05 16:20:00'),
  (1005, 100, 520.00, '2025-11-12 11:30:00'),
  (1006, 102, 1200.00, '2025-11-15 14:30:00'),
  (1007, 102, 300.00, '2025-12-20 10:00:00'),
  (1008, 101, 400.00, '2025-12-25 09:45:00'),
  (1009, 103, 850.00, '2026-01-10 12:00:00'),
  (1010, 103, 400.00, '2026-01-14 09:30:00');
--
CREATE TABLE home.agg_customers_obt (
  customer_id BIGINT,
  last_order_id BIGINT,
  orders BIGINT[]
);
INSERT INTO home.agg_customers_obt
SELECT
    order_customer_id AS customer_id,
    MAX_BY(order_id, created_at) AS last_order_id,
    ARRAY_AGG(order_id)
FROM home.fct_orders
GROUP BY order_customer_id;
Result
Count BIGINT
4

Define entities and map external data

Every FeatureQL model starts by declaring entities and mapping table columns to features. This single CREATE FEATURES statement sets up two entities (customers and orders), their primary key inputs, and the EXTERNAL_COLUMNS() mappings that connect each table column to a named feature.

FeatureQL
CREATE OR REPLACE FEATURES IN fm.home AS
SELECT
    -- Entities
    customers := ENTITY(),
    orders := ENTITY(),
    -- Primary keys
    customer_id := INPUT(BIGINT#customers),
    order_id := INPUT(BIGINT#orders),
    -- Customer dimensions
    tables.dim_customers := EXTERNAL_COLUMNS(
        customer_id BIGINT#customers BIND TO customer_id,
        name VARCHAR,
        created_at TIMESTAMP
        FROM TABLE(home.dim_customers)
    ),
    customer_name := tables.dim_customers[name],
    customer_created_at := tables.dim_customers[created_at],
    -- Order facts
    tables.fct_orders := EXTERNAL_COLUMNS(
        order_id BIGINT#orders BIND TO order_id,
        order_customer_id BIGINT#customers,
        price DECIMAL,
        created_at TIMESTAMP
        FROM TABLE(home.fct_orders)
    ),
    order_price := tables.fct_orders[price],
    order_customer_id := tables.fct_orders[order_customer_id],
    order_created_at := tables.fct_orders[created_at],
    -- Customer facts aggregations (OBT)
    tables.agg_customers_obt := EXTERNAL_COLUMNS(
        customer_id BIGINT#customers BIND TO customer_id,
        last_order_id BIGINT#orders,
        orders ARRAY(BIGINT#orders),
        FROM TABLE(home.agg_customers_obt)
    ),
    last_order_id := tables.agg_customers_obt[last_order_id],
    customer_orders := tables.agg_customers_obt[orders],
    -- Keysets: Define where to find the keys for each entity.
    dim_customers_keyset := KEYSET(
        'all', customers,
        'SELECT customer_id AS "fm.home.customer_id"
         FROM home.dim_customers'
    ),
    fct_orders_keyset := KEYSET(
        'all', orders,
        'SELECT order_id AS "fm.home.order_id"
         FROM home.fct_orders'
    ),
Result
feature_name VARCHARstatus VARCHARmessage VARCHAR
fm.home.customersCREATEDFeature created as not exists
fm.home.ordersCREATEDFeature created as not exists
fm.home.customer_idCREATEDFeature created as not exists
fm.home.order_idCREATEDFeature created as not exists
fm.home.tables.dim_customersCREATEDFeature created as not exists
fm.home.customer_nameCREATEDFeature created as not exists
fm.home.customer_created_atCREATEDFeature created as not exists
fm.home.tables.fct_ordersCREATEDFeature created as not exists
fm.home.order_priceCREATEDFeature created as not exists
fm.home.order_customer_idCREATEDFeature created as not exists
fm.home.order_created_atCREATEDFeature created as not exists
fm.home.tables.agg_customers_obtCREATEDFeature created as not exists
fm.home.last_order_idCREATEDFeature created as not exists
fm.home.customer_ordersCREATEDFeature created as not exists
fm.home.dim_customers_keysetCREATEDFeature created as not exists
fm.home.fct_orders_keysetCREATEDFeature created as not exists

Notice the BIND TO clauses — they tell FeatureQL which column is the key for each entity. The KEYSET() definitions at the bottom declare how to enumerate all keys, which lets later queries use @BIND_KEYSET() instead of listing IDs manually.

Customer lifetime value with RELATED()

RELATED() aggregates data across entity boundaries in a single expression. Here, it sums order_price grouped by order_customer_id and joins the result back to each customer — replacing the CTE + LEFT JOIN pattern you'd write in SQL.

FeatureQL

FeatureQL
SELECT
    customer_id,
    customer_ltv := customer_id.RELATED(
        SUM(order_price)
        GROUP BY order_customer_id
    ),
FROM fm.home
FOR
    customer_id := @BIND_KEYSET(all, customers),
    order_id := @BIND_KEYSET(all, orders),
;
Result
fm.home.customer_id BIGINTcustomer_ltv VARCHAR
1001350.0
1011550.0
1021500.0
1031250.0

Equivalent SQL

FeatureQL
/* SQL */
WITH orders_agg AS (
    SELECT
        order_customer_id,
        SUM(price) AS customer_ltv
    FROM home.fct_orders
    GROUP BY order_customer_id
)
SELECT
    c.customer_id,
    o.customer_ltv
FROM home.dim_customers c
LEFT JOIN orders_agg o
    ON c.customer_id = o.order_customer_id
;
Result
customer_id BIGINTcustomer_ltv VARCHAR
1001350.0
1011550.0
1021500.0
1031250.0

The SQL version requires a CTE with GROUP BY, then a LEFT JOIN. FeatureQL expresses the same logic in one line: customer_id.RELATED(SUM(order_price) GROUP BY order_customer_id).

Customer lifetime value from arrays

When data is stored in a denormalized format — arrays of foreign keys in an OBT or key-value store — FeatureQL handles it with ZIP() and EXTEND(). ZIP() unpacks a scalar array into an array of rows, EXTEND() enriches each row with data from another entity, and ARRAY_SUM() aggregates back to a scalar.

FeatureQL

FeatureQL
WITH
    customer_orders_details := EXTEND(
        ZIP(customer_orders AS order_id)
        WITH order_price AS order_price
        VIA order_id BIND TO order_id
    )
SELECT
    customer_id,
    customer_ltv_arr := ARRAY_SUM(customer_orders_details[order_price]),
FROM fm.home
FOR
    customer_id := @BIND_KEYSET(all, customers),
;
Result
fm.home.customer_id BIGINTcustomer_ltv_arr VARCHAR
1001350.0
1011550.0
1021500.0
1031250.0

Equivalent SQL

FeatureQL
/* SQL */
WITH orders_unnested AS (
    SELECT
        c.customer_id,
        UNNEST(c.orders) AS order_id
    FROM home.agg_customers_obt c
),
orders_with_price AS (
    SELECT
        u.customer_id,
        o.price
    FROM orders_unnested u
    LEFT JOIN home.fct_orders o
        ON u.order_id = o.order_id
)
SELECT
    customer_id,
    SUM(price) AS customer_ltv_arr
FROM orders_with_price
GROUP BY customer_id
;
Result
customer_id BIGINTcustomer_ltv_arr VARCHAR
1001350.0
1011550.0
1021500.0
1031250.0

The SQL version requires UNNEST, a JOIN, and re-aggregation — three separate steps for what FeatureQL chains in a single expression.

Business rules: send a promo code

Features compose naturally. Here, we persist customer_ltv so it can be reused across queries, then build a business rule: show a promo code to customers who haven't ordered in 30+ days and have LTV above 1000.

FeatureQL
CREATE OR REPLACE FEATURES IN fm.home AS
SELECT
    customer_ltv := customer_id.RELATED(
        SUM(order_price)
        GROUP BY order_customer_id
    )
;
Result
feature_name VARCHARstatus VARCHARmessage VARCHAR
fm.home.customer_ltvCREATEDFeature created as not exists

The RELATED() call on last_order_id follows the foreign key to look up the order's timestamp, and DATE_DIFF() computes recency. The boolean show_promocode combines both conditions.

FeatureQL

FeatureQL
SELECT
    customer_id,
    customer_ltv,
    recency := DATE_DIFF(
        last_order_id.RELATED(order_created_at),
        TIMESTAMP '2026-02-01',
        'day'
    ),
    show_promocode := recency > 30 AND customer_ltv > 1000::DECIMAL,
FROM fm.home
FOR
    customer_id := @BIND_KEYSET(all, customers),
    order_id := @BIND_KEYSET(all, orders),
;
Result
fm.home.customer_id BIGINTfm.home.customer_ltv VARCHARrecency BIGINTshow_promocode VARCHAR
1001350.081true
1011550.038true
1021500.043true
1031250.018false

Equivalent SQL

FeatureQL
/* SQL */
WITH customer_ltv AS (
    SELECT
        order_customer_id,
        SUM(price) AS customer_ltv
    FROM home.fct_orders
    GROUP BY order_customer_id
)
SELECT
    c.customer_id,
    ltv.customer_ltv,
    DATE_DIFF('day', o.created_at, DATE '2026-02-01') AS recency,
    DATE_DIFF('day', o.created_at, DATE '2026-02-01') > 30
        AND ltv.customer_ltv > 1000.00 AS show_promocode
FROM home.dim_customers c
LEFT JOIN home.agg_customers_obt agg
    ON c.customer_id = agg.customer_id
LEFT JOIN home.fct_orders o
    ON agg.last_order_id = o.order_id
LEFT JOIN customer_ltv ltv
    ON c.customer_id = ltv.order_customer_id
;
Result
customer_id BIGINTcustomer_ltv VARCHARrecency BIGINTshow_promocode VARCHAR
1001350.081true
1011550.038true
1021500.043true
1031250.018false

A/B testing with MACRO() and HASH01()

MACRO() turns any expression into a parameterized template. Here, the promo-code rule becomes a macro with a configurable recency threshold. HASH01() provides deterministic bucketing — the same customer always lands in the same experiment arm, with no external randomization service needed.

First, persist the recency feature so both experiment arms can reference it:

FeatureQL
CREATE OR REPLACE FEATURES IN fm.home AS
SELECT
    recency := DATE_DIFF(
        last_order_id.RELATED(order_created_at),
        TIMESTAMP '2026-02-01',
        'day'
    )
;
Result
feature_name VARCHARstatus VARCHARmessage VARCHAR
fm.home.recencyCREATEDFeature created as not exists

The macro show_promocode_macro accepts a threshold input and produces a boolean. It's instantiated twice — once with 30 days (control) and once with 90 days (test). HASH01() assigns each customer to an arm based on a deterministic hash of their ID.

FeatureQL

FeatureQL
WITH
    threshold := INPUT(BIGINT),
    show_promocode_macro := MACRO(
        recency > threshold AND customer_ltv > 1000.
        USING INPUT threshold
    ),
    show_promocode_control := show_promocode_macro(30),
    show_promocode_test := show_promocode_macro(90),
    expose_to_test :=
        HASH01(UNSAFE_CAST(customer_id AS VARCHAR) || 'SALT')
        BETWEEN.5 0e0 AND 0.5e0,
SELECT
    customer_id,
    show_promocode_control,
    expose_to_test,
    show_promocode := CASE
        WHEN expose_to_test THEN show_promocode_test
        ELSE show_promocode_control
    END,
FROM fm.home
FOR
    customer_id := @BIND_KEYSET(all, customers),
    order_id := @BIND_KEYSET(all, orders),
;
Result
fm.home.customer_id BIGINTshow_promocode_control BOOLEANexpose_to_test BOOLEANshow_promocode BOOLEAN
100truefalsetrue
102truetruefalse
101truefalsetrue
103falsetruefalse

Equivalent SQL

FeatureQL
/* SQL */
WITH customer_ltv AS (
    SELECT
        order_customer_id,
        SUM(price) AS customer_ltv
    FROM home.fct_orders
    GROUP BY order_customer_id
),
customer_recency AS (
    SELECT
        c.customer_id,
        DATE_DIFF('day', o.created_at, DATE '2026-02-01') AS recency
    FROM home.dim_customers c
    LEFT JOIN home.agg_customers_obt agg
        ON c.customer_id = agg.customer_id
    LEFT JOIN home.fct_orders o
        ON agg.last_order_id = o.order_id
)
SELECT
    c.customer_id,
    r.recency > 30
        AND ltv.customer_ltv > 1000.00 AS show_promocode_control,
    CAST(
        CAST(
            ('0x' || SUBSTRING(
                MD5(ARRAY_TO_STRING(
                    ARRAY[c.customer_id::VARCHAR, 'SALT'], ''
                )), 1, 15
            )) AS BIGINT
        ) AS DOUBLE
    ) / POW(2, 60) BETWEEN 0 AND 0.50 AS expose_to_test,
    CASE
        WHEN expose_to_test
            THEN (r.recency > 90 AND ltv.customer_ltv > 1000.00)
        ELSE (r.recency > 30 AND ltv.customer_ltv > 1000.00)
    END AS show_promocode
FROM home.dim_customers c
LEFT JOIN customer_recency r
    ON c.customer_id = r.customer_id
LEFT JOIN customer_ltv ltv
    ON c.customer_id = ltv.order_customer_id
;
Result
customer_id BIGINTshow_promocode_control BOOLEANexpose_to_test BOOLEANshow_promocode VARCHAR
100truefalsetrue
101truefalsetrue
102truetruefalse
103falsetruefalse

The SQL version duplicates the business logic for each arm and manually implements hash-based bucketing. The FeatureQL version defines the logic once and parameterizes the difference.

What's next

This tutorial covered the core workflow: map external data with EXTERNAL_COLUMNS(), aggregate across entities with RELATED(), work with arrays using ZIP() and EXTEND(), create reusable logic with MACRO(), and run experiments with HASH01().

For a deeper look at the e-commerce data model with more entity types and relationship patterns, see the E-commerce tutorial . For language fundamentals, start with Hello World .

Last update at: 2026/03/18 16:18:57
Last updated: 2026-03-18 16:19:34